Yesterday, the guys at the faculty technical department and I were playing around with OpenSesame, trying to get a clear picture of how accurate the timing really is. We used a kind of modified button box that simulates a button press when a light sensor is triggered. We attached the sensor to the screen and made an experiment that simply shows a white screen and waits for a button press. Since the sensor should, for all intents and purposes, respond instantaneously to the white display, we can use the "reaction times" as a measure of how accurate the timing is. Low reaction times are good and, even more importantly, low variation is good.
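To make "low reaction times" and "low variation" a bit more concrete, here is a minimal sketch of how the two numbers could be computed from a list of measured latencies. The function name `summarize_latencies` and the example values are made up for illustration; they are not our measurements.

```python
import statistics


def summarize_latencies(latencies_ms):
    """Return (mean, sample standard deviation) of latencies in milliseconds."""
    mean = statistics.mean(latencies_ms)
    # The standard deviation tells us how much the timing jitters around the mean.
    sd = statistics.stdev(latencies_ms)
    return mean, sd


# Hypothetical values for illustration only, NOT measured data.
example_latencies_ms = [5.1, 4.8, 5.3, 4.9, 5.0, 5.2]

mean, sd = summarize_latencies(example_latencies_ms)
print(f"mean = {mean:.2f} ms, sd = {sd:.2f} ms")
```

The mean reflects the constant delay in the system, which is easy to correct for. The standard deviation reflects trial-to-trial jitter, which is why it matters more.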
I'm very pleased to say that the results look excellent, particularly when using the new "psycho" back-end (for now only available as an experimental GitHub code snapshot, but this will be part of 0.24), which uses PsychoPy to handle all display operations. On Windows XP, the reaction times are around 5ms and fairly constant.
For Linux users it might be worth noting that, although the average reaction times are about the same (around 5ms), there is substantially more variation (I tested it on Ubuntu 10.04). (Update 1/6: I ran some more tests, and the problem was the compositing layer. With this turned off, the timing on Linux is excellent.)
E-Prime users might be interested to know how this compares to E-Prime's performance. Well... there is no real difference. Using the same set-up, E-Prime obtained around 5ms, also very reliably, although on one test system E-Prime failed to correctly synchronize to the vertical refresh, whereas OpenSesame had no such issues. In all fairness, though, you can probably also find systems where the reverse is true. (The full methods and results will be made available in a manuscript that we're currently writing up, so stay tuned.)
On a semi-related note, the tech guys also used a high-speed camera (recording at 1000 fps) to record what happens if you show alternating black and white screens on different computer monitors. As you can see in the video above (played back at 3% of the original speed), there is a clear difference between TFT (i.e., flat screen) and CRT (i.e., non-flat screen) monitors. On a CRT monitor (in the centre), there is always one active line on the display, whereas TFT monitors (on the left and right) "flood fill" the display from the top down. I don't think this matters much in practice when running experiments, but it might be something to be aware of (or just a fun fact to know).
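For a rough sense of the numbers behind the video, here is a small back-of-the-envelope sketch. Only the 1000 fps recording rate comes from the text above. The 60 Hz monitor refresh rate and the 30 fps playback rate are assumptions for illustration (30 fps would give the 3% playback speed mentioned), and the function names are made up.

```python
def frames_per_refresh(camera_fps, refresh_hz):
    """Number of camera frames captured during one monitor refresh cycle."""
    return camera_fps / refresh_hz


def playback_fraction(playback_fps, camera_fps):
    """Playback speed as a fraction of real time."""
    return playback_fps / camera_fps


CAMERA_FPS = 1000   # recording rate stated in the post
REFRESH_HZ = 60     # ASSUMPTION: a typical monitor refresh rate, not stated in the post
PLAYBACK_FPS = 30   # ASSUMPTION: a typical video frame rate, not stated in the post

print(f"camera frames per refresh: {frames_per_refresh(CAMERA_FPS, REFRESH_HZ):.1f}")
print(f"playback speed: {playback_fraction(PLAYBACK_FPS, CAMERA_FPS):.0%} of real time")
```

Under these assumptions the camera captures more than a dozen frames per refresh, which is why the top-down fill on the TFT monitors is visible at all.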