I don’t have an explanation for all the data I’m about to present here, but I thought it would be interesting to share the results of my performance tests using Bitmap objects under the .NET Compact Framework.
My Windows Mobile applications use a custom-made control library, where controls perform their own painting based on templates and views. All the painting is performed in an image, which is then copied back to the display, to avoid flickering.
While testing my applications at higher resolutions, I hit a very frustrating wall: OutOfMemoryException. That’s right. Creating the buffer Bitmap was too demanding for the emulator images I’m testing with. And we’re not talking about huge images. The typical size where it fails is 480x696, for a single-form test application. Assuming a 32 bits per pixel image, that is 480 x 696 x 4 = 1,336,320 bytes, or about 1.3 MB. That’s not that much. What is happening?
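The size estimate above can be reproduced with a tiny helper. This is only a sketch: `BufferSize` and `RawBytes` are names made up for illustration, and the calculation covers raw pixel data only, ignoring any row padding or per-bitmap overhead the platform may add.

```csharp
using System;

static class BufferSize
{
    // Bytes needed for the raw pixel data of a width x height image.
    // Assumption: no row padding and no extra bookkeeping memory.
    public static int RawBytes(int width, int height, int bitsPerPixel)
    {
        return width * height * bitsPerPixel / 8;
    }
}
```

For example, `BufferSize.RawBytes(480, 696, 32)` gives 1336320, and the same image at 16 bits per pixel needs half of that, 668160 bytes.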
First, I decided that my library would fall back to an image with a smaller bpp (bits per pixel) when this happens. By default, I try creating my Bitmap using the constructor that takes only a width and height. When this fails, I fall back to the constructor that also takes a PixelFormat, passing Format16bppRgb565.
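A minimal sketch of that fallback could look like the following. The class and method names (`BackBuffer`, `Create`) are hypothetical and not taken from my library; only the two Bitmap constructors and the PixelFormat value come from the description above.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;

static class BackBuffer
{
    // Try the PixelFormat-less constructor first. If the allocation fails,
    // retry with an explicit 16 bpp format, which needs half the pixel
    // memory of a 32 bpp image.
    public static Bitmap Create(int width, int height)
    {
        try
        {
            return new Bitmap(width, height);
        }
        catch (OutOfMemoryException)
        {
            // This second attempt can still throw; the caller has to deal
            // with that case.
            return new Bitmap(width, height, PixelFormat.Format16bppRgb565);
        }
    }
}
```

The caller owns the returned Bitmap and should dispose it when the buffer is resized or the form closes.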
That’s when the roof fell on me. Performance degraded dramatically. And after some profiling (by hand using QueryPerformanceCounter), the culprit was identified: Copying the buffer image back to the display’s Graphics.
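For reference, hand profiling with QueryPerformanceCounter can be sketched roughly as below. This is an illustration rather than my actual test harness: `PerfTimer`, `Work`, `MeasureTicks` and `ToMilliseconds` are made-up names, and the P/Invoke declarations assume Windows CE / Windows Mobile, where these functions are exported by coredll.dll.

```csharp
using System;
using System.Runtime.InteropServices;

static class PerfTimer
{
    // Assumption: running on Windows CE / Windows Mobile (coredll.dll).
    // The native functions return a BOOL, declared here as int.
    [DllImport("coredll.dll")]
    private static extern int QueryPerformanceCounter(out long count);

    [DllImport("coredll.dll")]
    private static extern int QueryPerformanceFrequency(out long frequency);

    // Hypothetical delegate type for the code being measured.
    public delegate void Work();

    // Returns the number of performance-counter ticks spent in work().
    public static long MeasureTicks(Work work)
    {
        long start, end;
        QueryPerformanceCounter(out start);
        work();
        QueryPerformanceCounter(out end);
        return end - start;
    }

    // Counter frequency in ticks per second.
    public static long Frequency()
    {
        long frequency;
        QueryPerformanceFrequency(out frequency);
        return frequency;
    }

    // Converts a tick count to milliseconds for a given frequency.
    public static double ToMilliseconds(long ticks, long frequency)
    {
        return ticks * 1000.0 / frequency;
    }
}
```

The copy step would then be measured with something like `PerfTimer.MeasureTicks(delegate { displayGraphics.DrawImage(buffer, 0, 0); })`, where `displayGraphics` and `buffer` stand for the display’s Graphics and the off-screen Bitmap.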
So I decided to test and compare the various PixelFormat values, on a real 480x800 device running WM 6.1 (HTC Touch Pro 2), expecting Format32bppRgb to give me the same results as passing no PixelFormat parameter to Bitmap’s ctor. I was wrong!
| PixelFormat | Drawing ticks | Copying ticks |
| None: | 287392 | 215296 |
| Format32bppRgb: | 1233216 (4.3x slower) | 1963200 (9.1x slower) |
| Format24bppRgb: | 1302496 (4.5x slower) | 601088 (2.8x slower) |
| Format16bppRgb565: | 561696 (1.9x slower) | 301344 (1.4x slower) |
| Format16bppRgb555: | 586976 (2.0x slower) | 2163360 (10.0x slower) |
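The "x slower" figures in parentheses are simply each tick count divided by the tick count of the "None" row. A small sketch of that calculation, with hypothetical names (`Slowdown`, `Ratio`, `Describe`):

```csharp
using System;
using System.Globalization;

static class Slowdown
{
    // Measured ticks divided by the baseline ticks (the "None" row).
    public static double Ratio(long ticks, long baselineTicks)
    {
        return (double)ticks / baselineTicks;
    }

    // Formats the ratio the way the tables do, rounded to one decimal.
    public static string Describe(long ticks, long baselineTicks)
    {
        double ratio = Ratio(ticks, baselineTicks);
        if (ratio >= 1.0)
        {
            return string.Format(CultureInfo.InvariantCulture, "{0:0.0}x slower", ratio);
        }
        return string.Format(CultureInfo.InvariantCulture, "{0:0.0}x faster", 1.0 / ratio);
    }
}
```

For the Format32bppRgb row, `Slowdown.Describe(1233216, 287392)` yields "4.3x slower" for drawing and `Slowdown.Describe(1963200, 215296)` yields "9.1x slower" for copying.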
As you can see, the constructor that does not take a PixelFormat parameter is doing something special, because performance is drastically better. In the best case, using Format16bppRgb565, it was still almost 2 times slower. It was time for some Reflector! While the desktop version was calling its overload with Format32bppArgb (0x26200a, which is not available in the PixelFormat enum under .NET CF), the CF version was doing things completely differently than the ctor with a PixelFormat parameter:
.NET Framework:
public Bitmap(int width, int height)
{
    this..ctor(width, height, 0x26200a);
}
.NET Compact Framework:
public Bitmap(int width, int height)
{
    base..ctor();
    this._Init(width, height);
}

public unsafe Bitmap(int width, int height, PixelFormat format)
{
    PAL_ERROR pal_error;
    IntPtr ptr;
    base..ctor();
    pal_error = GL.CreatePixfmt(width, height, format, &ptr);
    if (pal_error < 0)
    {
        goto Label_002A;
    }
    base.m_cx = width;
    base.m_cy = height;
    base.m_how = ptr;
Label_002A:
    MISC.HandleAr(pal_error);
}

private unsafe void _Init(int cx, int cy)
{
    PAL_ERROR pal_error;
    IntPtr ptr;
    pal_error = GL.Create(cx, cy, &ptr);
    if (pal_error < 0)
    {
        goto Label_0023;
    }
    base.m_cx = cx;
    base.m_cy = cy;
    base.m_how = ptr;
Label_0023:
    MISC.HandleAr(pal_error);
}
When a PixelFormat is provided, GL.CreatePixfmt is called; when no format is provided, GL.Create is called instead. What kind of image does that create? I have no clue. To try to identify it, I tried the other PixelFormat values that are available on the regular .NET Framework but absent from the .NET CF. They all failed with an ArgumentException, except one: Format32bppArgb, the default value on the .NET Framework! Here are the results:
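Probing a format that the .NET CF enum does not declare comes down to casting the raw desktop value to PixelFormat and watching for the exception. A rough sketch, where `FormatProbe` and `IsSupported` are hypothetical names and 0x26200A is the Format32bppArgb value quoted earlier:

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;

static class FormatProbe
{
    // Raw value of the desktop framework's Format32bppArgb, which is not
    // declared in the .NET CF PixelFormat enum.
    public const int Format32bppArgbValue = 0x26200A;

    // Returns true if a small bitmap can be created with the raw format value.
    public static bool IsSupported(int rawFormat)
    {
        try
        {
            using (Bitmap probe = new Bitmap(8, 8, (PixelFormat)rawFormat))
            {
                return true;
            }
        }
        catch (ArgumentException)
        {
            // Unsupported formats were rejected with ArgumentException in my tests.
            return false;
        }
    }
}
```

Calling `FormatProbe.IsSupported(FormatProbe.Format32bppArgbValue)` corresponds to the one probe that succeeded.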
| PixelFormat | Drawing ticks | Copying ticks |
| Format32bppArgb: | 1251968 (4.4x slower) | 1945504 (9.0x slower) |
Still not the same results. Now for the best part: unlike the PixelFormat-less constructor, Format32bppArgb did not cause an OutOfMemoryException on the high-res emulators! Huh? Does the image created without a PixelFormat have a greater color depth than 32 bits? But Format48bppRgb, Format64bppArgb and Format64bppPArgb all failed.
You might think I was done with the surprises, right? No.
The above numbers clearly showed that the PixelFormat-less Bitmap constructor was the preferred choice, and that I could fall back to Format16bppRgb565 in case of an OutOfMemoryException. Then I tested on a real 240x320 WM 6.1 device (HTC P4000):
| PixelFormat | Drawing ticks | Copying ticks |
| None: | 425056 | 287520 |
| Format32bppRgb: | 502240 (1.18x slower) | 721280 (2.5x slower) |
| Format24bppRgb: | 569952 (1.34x slower) | 646784 (2.25x slower) |
| Format16bppRgb565: | 373152 (1.14x faster) !!! | 286144 (same speed) !!! |
| Format16bppRgb555: | 371072 (1.15x faster) !!! | 765984 (2.7x slower) |
| Format32bppArgb: | 518496 (1.22x slower) | 722048 (2.5x slower) |
This time, the PixelFormat-less constructor is not the fastest option. If I want to achieve the best performance possible, I have to make different decisions on different platforms. Oh great. That’s just great… To complete the big picture, I tested on many emulator images. Here are the results, always comparing with the default format:
WM 6.0 Classic QVGA (240x320)
| PixelFormat | Drawing diff | Copying diff |
| Format32bppRgb: | 3.7x slower | 32.3x slower |
| Format24bppRgb: | 4.0x slower | 29.5x slower |
| Format16bppRgb565: | 1.41x faster | 1.03x slower |
| Format16bppRgb555: | 1.37x faster | 38.7x slower |
| Format32bppArgb: | 3.7x slower | 32.5x slower |
WM 6.0 Pro QVGA (240x320)
| PixelFormat | Drawing diff | Copying diff |
| Format32bppRgb: | 3.7x slower | 32.0x slower |
| Format24bppRgb: | 4.0x slower | 29.4x slower |
| Format16bppRgb565: | 1.3x faster | 1.07x slower |
| Format16bppRgb555: | 1.43x faster | 36.3x slower |
| Format32bppArgb: | 3.8x slower | 32.3x slower |
WM 6.0 Pro VGA (480x640)
| PixelFormat | Drawing diff | Copying diff |
| Format32bppRgb: | 6.8x slower | 76x slower |
| Format24bppRgb: | 7.7x slower | 72x slower |
| Format16bppRgb565: | 1.45x faster | 1.04x slower |
| Format16bppRgb555: | 1.45x faster | 88x slower |
| Format32bppArgb: | 6.9x slower | 77x slower |
WM 6.1 Classic QVGA (240x320)
| PixelFormat | Drawing diff | Copying diff |
| Format32bppRgb: | 3.6x slower | 27.1x slower |
| Format24bppRgb: | 3.8x slower | 24.7x slower |
| Format16bppRgb565: | 1.47x faster | 1.10x faster |
| Format16bppRgb555: | 1.43x faster | 30.4x slower |
| Format32bppArgb: | 3.5x slower | 26.7x slower |
WM 6.1 Pro QVGA (240x320)
| PixelFormat | Drawing diff | Copying diff |
| Format32bppRgb: | 3.8x slower | 32.1x slower |
| Format24bppRgb: | 4.1x slower | 29.7x slower |
| Format16bppRgb565: | 1.41x faster | 1.02x slower |
| Format16bppRgb555: | 1.41x faster | 36.8x slower |
| Format32bppArgb: | 3.8x slower | 31.7x slower |
WM 6.1 Pro VGA (480x640)
| PixelFormat | Drawing diff | Copying diff |
| Format32bppRgb: | 6.9x slower | 75x slower |
| Format24bppRgb: | 7.5x slower | 68x slower |
| Format16bppRgb565: | 1.52x faster | 1.04x faster |
| Format16bppRgb555: | 1.37x faster | 83x slower |
| Format32bppArgb: | 6.9x slower | 76x slower |
WM 6.1 Pro WVGA (480x800)
| PixelFormat | Drawing diff | Copying diff |
| Format32bppRgb: | 7.2x slower | 87x slower |
| Format24bppRgb: | 7.8x slower | 80x slower |
| Format16bppRgb565: | 1.54x faster | 1.01x faster |
| Format16bppRgb555: | 1.54x faster | 100x slower (10000%!!!) |
| Format32bppArgb: | 7.3x slower | 88x slower |
WM 6.5 Pro QVGA (240x320)
| PixelFormat | Drawing diff | Copying diff |
| Format32bppRgb: | 1.33x faster | 2.2x slower |
| Format24bppRgb: | 2.1x slower | 21.7x slower |
| Format16bppRgb565: | 1.39x faster | 1.02x slower |
| Format16bppRgb555: | 1.33x faster | 26.2x slower |
| Format32bppArgb: | 1.32x faster | 2.1x slower |
WM 6.5 Pro VGA (480x640)
| PixelFormat | Drawing diff | Copying diff |
| Format32bppRgb: | 1.41x faster | 3.6x slower |
| Format24bppRgb: | 3.5x slower | 44.7x slower |
| Format16bppRgb565: | 1.49x faster | 1.01x slower |
| Format16bppRgb555: | 1.45x faster | 55.5x slower |
| Format32bppArgb: | 1.41x faster | 3.4x slower |
WM 6.5 Pro WVGA (480x800)
The default format gives me an OutOfMemoryException, but the forced formats show results similar to VGA.
That is a lot of numbers, which can be summarized as follows:
- Under WM 6.0 and 6.1, using Format16bppRgb565 can improve drawing times without compromising image copying times.
- Under WM 6.5, 32bpp formats have also improved at drawing, beating the default format, but they are still 2x to 3.6x slower at image copying. Format16bppRgb565 is still the best alternative to the default.
- The greater the resolution, the worse image copying is for non-default formats, except with Format16bppRgb565, which stays roughly on par.
- There is no good reason to use Format16bppRgb555 instead of Format16bppRgb565. Although drawing is just as fast, image copying is terrible on all platforms.
At this point, I’m clueless about what is different with a Bitmap created without a PixelFormat parameter, but my approach will be the following:
- Always use the PixelFormat-less constructor by default.
- If I get an OutOfMemoryException, call the other ctor with Format16bppRgb565.
I understand that all this could be different on real devices. But without answers, one must grab the duct tape and move on.