To accelerate GWorlds, you will have to control where they are allocated and create them in card RAM, assuming there is sufficient space for the sizes being requested. Then, for starters, when there is a copy operation, you would check the source and destination and, if both are on the card, do the copy onboard to eliminate the NuBus transfers that would normally occur between main memory and the frame buffer.
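The source/destination check can be sketched roughly as below. This is only an illustration in plain C, not SuperMac's code: `CardWindow`, `CopyRoute`, `in_card_ram` and `classify_copy` are hypothetical names, and the sketch assumes the card's RAM shows up as one contiguous address range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of the card RAM window in the address map. */
typedef struct {
    uint32_t base;  /* first address of card RAM */
    uint32_t size;  /* size of card RAM in bytes */
} CardWindow;

typedef enum {
    COPY_ONBOARD,     /* both ends in card RAM: blit on the card, no NuBus traffic */
    COPY_ACROSS_BUS,  /* exactly one end in card RAM: pixels must cross NuBus */
    COPY_MAIN_MEMORY  /* neither end on the card: leave it to stock QuickDraw */
} CopyRoute;

/* True if addr falls inside the card RAM window. */
static bool in_card_ram(const CardWindow *card, uint32_t addr)
{
    assert(card->size > 0);
    /* Unsigned subtraction stays correct even if base + size wraps past 2^32. */
    return (uint32_t)(addr - card->base) < card->size;
}

/* Decide how a copy should be routed from the two pixel map base addresses. */
static CopyRoute classify_copy(const CardWindow *card,
                               uint32_t src_base, uint32_t dst_base)
{
    bool src_on_card = in_card_ram(card, src_base);
    bool dst_on_card = in_card_ram(card, dst_base);

    if (src_on_card && dst_on_card)
        return COPY_ONBOARD;
    if (src_on_card || dst_on_card)
        return COPY_ACROSS_BUS;
    return COPY_MAIN_MEMORY;
}
```

Only the `COPY_ONBOARD` case avoids the bus entirely. The other two are where you have to decide whether taking over the operation is worth it at all.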
You can find helpful info (architecture, traps, mechanics, etc.) in the newer Inside Macintosh volume, Imaging With QuickDraw.
In terms of implementation, you will need to head-patch the 32-Bit QuickDraw traps for anything you want to direct to your board. There are some caveats, though. If your head patch overhead is such that you slow down everything for the sake of a few operations, it's not really a benefit to the user. Also, if a GWorld application does extensive processing between main memory and the GWorld, it might not be worth accelerating that application's use unless you also accelerate all the most common QuickDraw operations within the frame buffer and can make it worth the user's while. Otherwise, while it might appear faster to move the GWorld into card RAM, all the other manipulations between the CPU and the card may take longer because of NuBus slow-downs. So, the ultimate payoff may not be there. I seem to recall many of these kinds of discussions in 1992, and it's one of the reasons (besides optimizing for compatibility) that there is a GWorld on/off switch in SuperVideo.
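To make the overhead point concrete, here is a minimal sketch of the head-patch shape in plain C. It is not real Toolbox code: `BlitRequest`, `stock_blit`, `card_blit`, `patched_blit` and `install_patch` are all hypothetical stand-ins, and an ordinary function pointer plays the role of the trap table entry. The idea is simply that the patch does its cheapest test first and falls straight through to the saved original for anything it will not accelerate.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, pre-digested view of one copy request. */
typedef struct {
    bool src_on_card;
    bool dst_on_card;
} BlitRequest;

typedef int (*BlitProc)(const BlitRequest *);

/* Saved address of the routine that was installed before the patch. */
static BlitProc original_blit = NULL;

/* Stand-in for the stock routine; returns 0 so callers can see which path ran. */
static int stock_blit(const BlitRequest *req) { (void)req; return 0; }

/* Stand-in for the accelerated onboard path; returns 1. */
static int card_blit(const BlitRequest *req) { (void)req; return 1; }

/* The head patch: runs before the original and decides who handles the call. */
static int patched_blit(const BlitRequest *req)
{
    /* Early out: anything not entirely on the card goes to the original
       untouched, so unaccelerated calls pay for one test and one jump. */
    if (!(req->src_on_card && req->dst_on_card))
        return original_blit(req);

    return card_blit(req);
}

/* Install the patch into a dispatch slot and remember what was there. */
static BlitProc install_patch(BlitProc *slot)
{
    original_blit = *slot;
    *slot = patched_blit;
    return original_blit;
}
```

On a real system the slot would be a trap table entry rather than a C variable, and the test at the top of the patch is exactly the code that every unaccelerated call ends up paying for.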
For QuickDraw, the low-hanging fruit is to accelerate common, atomic operations where there are clear wins (CopyBits, lines, patterns, modes, polygons, etc.). SuperMac's acceleration was relatively comprehensive across hardware and software, especially with the eventual addition of the custom silicon support of the Squid chip used on GWorld-capable boards like Thunder. Paul Campbell was instrumental in the Squid chip design and (I think) it remained at the -01 level for its life. Peter Barrett did the software QuickDraw acceleration and often communicated with engineers at Apple. We had early access to the 8*24GC board and I'm sure there was interaction with people like Jean-Charles Mourey, Bruce Leak and Konstantin Othmer (but especially Bruce and Jean-Charles).
Not all of the acceleration required hardware. Some of it was purely software-based and capitalized on (fortunate) inefficiencies in QuickDraw. I think one notable example was the initial line acceleration implementation on the Spec/24 III and ColorCard/24; Squid eventually included and improved line drawing, if I recall correctly.
SuperMac was initially a little nervous about the 8*24GC, but with a lot of mental effort and elbow grease, early SuperMac designs mostly outperformed it and had a higher degree of hardware/software compatibility. The 8*24GC was a very cool and visionary product, but being a Bus Master with Block Transfer Mode wasn't necessarily a huge logistical win. The Bus Master/BTM implementation required additional NuBus memory space -- thus disabling access to an extra NuBus slot -- which was expensive in 3-slot machines like the IIcx/IIci. But, at the time, it was a nice (and interesting) hardware/technical achievement that showed developers what was possible, even if somewhat impractical. Apple eventually had the last laugh by killing an entire 3rd-party graphics ecosystem with onboard motherboard video and their own big-screen monitors. So, while Apple initially lost the acceleration competition with its 8*24GC, the card was a dark harbinger of doom.
Anyway -- back on topic -- practically speaking, to see how things work with GWorlds, dump the traps with and without SuperMac code loaded and compare the disassemblies. You must already have a good suite of debugging tools, given what you are doing -- TMON Pro, MacsBug, MacNosy, etc. So, those should help you. If you need MacNosy, look for Steve Jasik's page at jasik.com and send him a note. He still sells it for retro hobbyists.
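If you capture the trap addresses into two tables (one without the SuperMac code loaded, one with), a small comparison like the following tells you which entries to disassemble first. This is a rough sketch in plain C that assumes you have already dumped the addresses into arrays by some means of your own; `find_patched` is a hypothetical helper, not part of any debugger.

```c
#include <stddef.h>
#include <stdint.h>

/* Compare two snapshots of a trap address table and record which entries
   differ. Stores at most max_changed indices in changed[], and returns the
   total number of differing entries, which may exceed max_changed. */
static size_t find_patched(const uint32_t *before, const uint32_t *after,
                           size_t count, size_t *changed, size_t max_changed)
{
    size_t found = 0;

    for (size_t i = 0; i < count; i++) {
        if (before[i] != after[i]) {
            if (found < max_changed)
                changed[found] = i;  /* entry i now points somewhere else */
            found++;
        }
    }
    return found;
}
```

Every index it reports is a trap whose address changed between the two dumps, so those are the ones worth comparing in the disassembly.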
And...I haven't thought of the 8*24GC in a long time. I just nostalgically poked around to see if there was any 8*24GC info and (surprisingly) found this article -- should be very helpful for you and answers certain conceptual questions:
preserve.mactech.com