Four ideas, from easy to hard:
You can simplify your x/y loop to run in one dimension instead of two--
for ( i = 0; i < y * x * c; i += 4 ). Since you're looking at the whole image, you don't need to worry about the stride. Not only will this reduce the number of operations required, but you may get better performance from your CPU due to better pipelining and branch prediction.If you can, use a lower color depth (I don't think you need 24 bits of color depth if you're just computing an average). The smaller storage size will yield a smaller memory area to scan. You will have to shift bits around to do the math, but that sort of thing is faster than memory access.
You could try resizing or scaling the bitmap to something lower rez. The resize operation will interpolate color. In theory you could scale it to a 1x1 image and just read that one pixel. If you use GDI+ to perform the scale it could use hardware acceleration and be very fast.
Keep a copy of the last bitmap and its totals. Use
REPE CMPSD(yes, this is assembly) to compare your new bitmap to the old one and find non-matching cells. Adjust totals and recompute average. This is probably a little harder than it sounds but the scan would be incredibly fast. This option would work better if most pixels are expected to stay the same from frame to frame.