Vector drawing: OpenGL shaders and cairo
06 Aug 2007
The mystery
Some time ago I had a chance to talk to Zack Rusin about the differences between Qt and the GNOME/Gtk drawing stack. Zack was showing some impressive visual toolkit demos that used only a tiny fraction of the CPU's horsepower. One of the subjects we started arguing about was using GPU hardware to perform tessellation, as opposed to cairo, where tessellation always happens in software. The idea seemed tempting, though the practical benefits were unclear to me.
During GUADEC I poked a few people about the possibility of using OpenGL and shader programs with cairo to perform hardware-accelerated tessellation. I got some constructive and interesting (yet discouraging) feedback about the problems and complications related to this approach. A lot of those (highly valid) points are summarized by Tim Janik in his blog. Still, the possible performance gains remained highly unclear.
Investigation
I therefore decided to write a small benchmark to see what kind of speed differences we could possibly be talking about, and whether the game is worth playing at all. As the testbed I chose one of the most fundamental bits of vector graphics: a Bezier curve drawing algorithm. I implemented a 100% hardware-accelerated version as a vertex shader program running on the GPU. I compared it against cairo's software implementation in two scenarios: drawing to an X surface and drawing to an image surface. The following setup was used for the test:
- ThinkPad T43 with an ATI Mobility Radeon X300 card (fglrx proprietary driver with GL support)
- The time to draw 100 random (pre-generated) Bezier curves was measured
- Curves were randomized in a 640x480 space with a random width (1-10 pixels) and a random color (red, green, blue, alpha)
- The same set of random curves was used for both implementations
- Anti-aliasing was turned off
- The best of 3 test runs was taken
- Source code (the Cg Toolkit from NVIDIA is needed to compile the shader program)
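The setup listed above could be reproduced with a harness along the following lines. This is only a sketch, not the benchmark's actual source: the names `make_curves` and `best_of` are hypothetical, the curves are assumed to be cubic (four control points each), and `draw` stands in for whichever backend is being timed.

```python
import random
import time


def make_curves(count=100, width=640, height=480, seed=0):
    """Pre-generate random curves so every backend draws the same set."""
    rng = random.Random(seed)  # fixed seed: identical curves on every call
    curves = []
    for _ in range(count):
        curves.append({
            # Assumption: cubic Bezier, i.e. four control points per curve.
            "points": [(rng.uniform(0, width), rng.uniform(0, height))
                       for _ in range(4)],
            "line_width": rng.uniform(1, 10),               # 1-10 pixels
            "rgba": tuple(rng.random() for _ in range(4)),  # red, green, blue, alpha
        })
    return curves


def best_of(draw, curves, runs=3):
    """Time draw(curves) `runs` times and return the best (smallest) time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        draw(curves)
        timings.append(time.perf_counter() - start)
    return min(timings)
```

For a GPU backend, `draw` would also have to wait until the hardware has actually finished (for instance with `glFinish`) before returning, otherwise only the time to queue the commands is measured.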
The very rough result of the test is that drawing using hardware OpenGL tessellation is 30 times faster than the current software cairo implementation. By overlaying the resulting images in, e.g., GIMP, one can see tiny pixel differences between the two implementations (the cairo implementation probably being more accurate), but essentially it's the same thing. The OpenGL shader implementation is not optimized at all and could be made faster by using a geometry shader instead of a vertex shader.
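To make the per-vertex work concrete, here is a minimal CPU-side sketch of how a curve stroke can be tessellated into a triangle strip. It is not the shader from the benchmark: it assumes a cubic curve, uniform sampling of the parameter `t`, and a stroke built by offsetting each sample along the curve normal. The function names are made up for illustration.

```python
import math


def bezier_point(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t (0..1)."""
    u = 1.0 - t
    b0, b1, b2, b3 = u * u * u, 3 * u * u * t, 3 * u * t * t, t * t * t
    return (b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0],
            b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1])


def bezier_tangent(p0, p1, p2, p3, t):
    """First derivative of the cubic Bezier curve at parameter t."""
    u = 1.0 - t
    a, b, c = 3 * u * u, 6 * u * t, 3 * t * t
    return (a * (p1[0] - p0[0]) + b * (p2[0] - p1[0]) + c * (p3[0] - p2[0]),
            a * (p1[1] - p0[1]) + b * (p2[1] - p1[1]) + c * (p3[1] - p2[1]))


def stroke_vertex(ctrl, t, side, line_width):
    """One output vertex; side is +1 or -1 (left or right edge of the stroke).

    Each vertex depends only on (t, side) and the control points, which is
    what makes this kind of computation a candidate for a vertex shader.
    """
    x, y = bezier_point(*ctrl, t)
    dx, dy = bezier_tangent(*ctrl, t)
    length = math.hypot(dx, dy) or 1.0  # zero tangent: avoid dividing by zero
    nx, ny = -dy / length, dx / length  # unit normal to the curve
    half = line_width / 2.0
    return (x + side * half * nx, y + side * half * ny)


def tessellate_stroke(ctrl, line_width, segments=32):
    """Return the stroke as triangle-strip vertices, two per sample of t."""
    strip = []
    for i in range(segments + 1):
        t = i / segments
        strip.append(stroke_vertex(ctrl, t, +1, line_width))
        strip.append(stroke_vertex(ctrl, t, -1, line_width))
    return strip
```

A fixed segment count keeps the sketch simple. A real implementation would likely pick the number of segments from the curve's size on screen, which may be one source of small pixel differences between two renderers.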
Not only about speed
Using the GPU for work that normally happens on the CPU has one additional advantage: it frees the processor to do other things. In my setup, with the cairo implementation drawing random curves as fast as possible, I can get a ~10fps animation (100 curves per frame). This keeps the CPU busy at 100%, and the framerate will drop as soon as the CPU starts doing anything else.
With my 100% GPU implementation I get around 400fps and the CPU usage never goes above 30% (much of it, I presume, being random-number generation).
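The frame rates above translate into curve throughput as follows. This is simple arithmetic on the quoted numbers, assuming the GPU animation also draws 100 curves per frame (the text states that figure only for the cairo run).

```python
CURVES_PER_FRAME = 100  # assumed to be the same for both animations


def curves_per_second(fps):
    """Curve throughput implied by a given animation frame rate."""
    return fps * CURVES_PER_FRAME


cairo_rate = curves_per_second(10)   # ~10fps with cairo
gpu_rate = curves_per_second(400)    # ~400fps with the GPU shader
speedup = gpu_rate / cairo_rate

print(cairo_rate, gpu_rate, speedup)  # prints: 1000 40000 40.0
```

Note that this animation ratio is a separate measurement from the roughly 30x figure of the timed benchmark.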
Caveats
- OpenGL shader program support is not yet a commodity, even though it’s coming fast even to the mobile space (e.g. the MBX chip has vertex shader support and the SGX chip supports a full set of pixel, vertex, and geometry shaders).
- As Tim pointed out, OpenGL implementations vary and per-implementation fine-tuning might be in order
- To efficiently use shader programs one needs a full OpenGL-based drawing stack
- Shader programs + math + drawing algorithms are hard
- OpenGL drivers suck on Linux
I still think that the results are interesting though. I quickly hacked together another shader implementation to draw solid-filled Bezier shapes (more about that soon). There, the performance difference seems to be even bigger. My gut feeling is that programmable hardware drawing might be the only way to go for resolution-independent, fully vector-drawn UIs. This is especially true on mobile devices, where CPU power is scarce and scales badly, and where everything that keeps load off the processor means longer battery life.