If you're trying to evaluate the difference in performance attributable to OS-specific characteristics, most benchmarking tools out there will give you misleading results. Speedometer, for example, measures fairly low-level operations, so the effect of the OS will not be readily apparent in its scores.
I'd say your best bet is to pick a collection of "representative" tasks (whatever that means for you) and put them through their paces on each system. Time them the old-fashioned way, with a stopwatch (or equivalent), and compare the elapsed times.
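
If you'd rather not hold an actual stopwatch, the "or equivalent" can be a small script that measures wall-clock time around each task. Here is a rough sketch in Python, not a definitive harness. The two tasks (`write_and_read_file` and `create_many_small_files`) and the helper names are made up for illustration; swap in whatever counts as representative for you. Taking the median over a few runs is my own assumption for smoothing out one-off hiccups.

```python
import os
import statistics
import tempfile
import time


def time_task(task, repeats=5):
    """Return the wall-clock duration in seconds of each of `repeats` runs."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        durations.append(time.perf_counter() - start)
    return durations


def write_and_read_file(size_bytes=1_000_000):
    """Hypothetical task: write a temp file, force it to disk, read it back."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"x" * size_bytes)
        f.flush()
        os.fsync(f.fileno())
        path = f.name
    try:
        with open(path, "rb") as f:
            data = f.read()
    finally:
        os.remove(path)
    return len(data)


def create_many_small_files(count=200):
    """Hypothetical task: create many small files, then list the directory."""
    with tempfile.TemporaryDirectory() as d:
        for i in range(count):
            with open(os.path.join(d, f"f{i}.txt"), "w") as f:
                f.write("hello")
        return len(os.listdir(d))


def summarize(tasks, repeats=5):
    """Map each task name to the median duration over `repeats` runs."""
    return {
        name: statistics.median(time_task(task, repeats))
        for name, task in tasks.items()
    }


if __name__ == "__main__":
    # Placeholder task list; replace with tasks that matter to you.
    tasks = {
        "write+read 1 MB file": write_and_read_file,
        "create 200 small files": create_many_small_files,
    }
    for name, median in summarize(tasks).items():
        print(f"{name}: {median:.4f} s (median of 5 runs)")
```

Run the same script on each OS, ideally on the same hardware, and compare the numbers side by side. It measures elapsed time only, which is the point: that is what you would see with a stopwatch.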