Of course
done.
Did you also pre-calculate (r*rows*magy) before entering the inner loop? It has to be done inside the outer loop, since r changes there. You could pre-calculate rows*magy outside both loops, then calculate r*(pre-calculated rows*magy) in the outer loop, leaving you only c to add when you index into mtmp.
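Something like the following is what I mean. It is only a sketch: I don't know what your loop body actually does, so the summing, the function names and the int element type are all made up for illustration. Only mtmp, rows, cols, magy, r and c come from your code, and I am assuming the index is r*rows*magy + c.

```c
#include <stddef.h>

/* Hypothetical baseline: the whole index is recomputed for every element. */
long sum_naive(const int *mtmp, size_t rows, size_t cols, size_t magy)
{
    long total = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            total += mtmp[r * rows * magy + c];
    return total;
}

/* Same result, with the invariant parts of the index hoisted out. */
long sum_hoisted(const int *mtmp, size_t rows, size_t cols, size_t magy)
{
    const size_t stride = rows * magy;      /* never changes: outside both loops */
    long total = 0;
    for (size_t r = 0; r < rows; r++) {
        const size_t row_base = r * stride; /* depends on r only: outer loop */
        for (size_t c = 0; c < cols; c++)
            total += mtmp[row_base + c];    /* just add c in the inner loop */
    }
    return total;
}
```

An optimizing compiler will often do this hoisting for you, so measure before and after to see whether it makes any difference.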
All of this is probably not going to do much if the loops are small. How big are rows/cols on average?