FWIW, the old Gamba instructions do work just fine for 8.1, though I recommend saving copies of your images at each step in case something goes awry. I think preparing it in Basilisk pretending to be an IIci should also work just fine.
On performance, though, I concur with @Mk.558. On my accelerated SE/30 (45 MHz 68040) with accelerated graphics, it's still slow in 8.1. It is usable, but it's not pleasant. With the stock SE/30 CPU, 8.1 may as well be unusable. Even on a Quadra 650 at 40 MHz, one of the fastest 68k Macs, it's not great. It's clear that Apple's priorities were shifting rapidly after the introduction of the PPC machines, rather understandably, as Moore's Law was ruthless back then.
On the topic of combo cards, I have flirted with the idea of building a combination Carrera+SEthernet+30Video GS card; I'd dubbed the concept the Trifecta card. However, as @Mk.558 pointed out, grayscale cards and 68040 accelerators, regardless of design, are already expensive to source, build, and test, to the point where they are not broadly attractive products. While there's an undeniable appeal in a combo card, it would 100% be a halo product, easily costing over $1K if I were to take the approach of simply jamming together 3 independent designs. Fun idea, but inefficient due to lack of integration, and interest is going to be very, very limited. It's also a little dull in that it leaves no further room to customize your system, which in my opinion accounts for much of the continued appeal and popularity of the SE/30. I'd rather drive costs down on my existing designs to make them more accessible, or build more new things.
The existing vintage 040 upgrades were already at the point of diminishing returns with the SE/30; it has a slow, rudimentary DRAM controller and a slow bus, and all the cache in the world can't save you when the problem is pushing pixels as fast as possible to a framebuffer or reading from a disk, all of which live on a 16 MHz bus. To really improve performance over vintage designs, you'd be looking at making a scratch-designed accelerator with onboard synchronous DRAM, ROM, and framebuffer at a minimum. At that point you're basically halfway to having built a Quadra, having rendered two-thirds of the original logic board obsolete. However, all your IO is still married to the slow parts that make it an SE/30: a slow SCSI controller without DMA, plus sound and floppy hardware that forces you to slow down massively any time you talk to it if you want it to work... It's a bit like sticking a V8 into a lawnmower.
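To put the diminishing-returns point in rough numbers, here's a back-of-the-envelope sketch using Amdahl's law. The function names are just mine, and the 50% bus-bound share and 10x CPU speedup in the example are made-up illustrative figures, not measurements from any real SE/30 or accelerator.

```python
def overall_speedup(bus_bound_fraction, cpu_speedup):
    """Amdahl's law: only the time NOT spent waiting on the slow bus gets faster.

    bus_bound_fraction: share of run time stuck on the slow bus (0.0 to 1.0).
    cpu_speedup: how much faster the accelerator makes everything else.
    Both inputs are hypothetical; nothing here is measured.
    """
    if not 0.0 <= bus_bound_fraction <= 1.0:
        raise ValueError("bus_bound_fraction must be between 0 and 1")
    if cpu_speedup <= 0:
        raise ValueError("cpu_speedup must be positive")
    cpu_fraction = 1.0 - bus_bound_fraction
    return 1.0 / (bus_bound_fraction + cpu_fraction / cpu_speedup)


def speedup_ceiling(bus_bound_fraction):
    """Best possible speedup even with an infinitely fast CPU and cache."""
    if not 0.0 <= bus_bound_fraction <= 1.0:
        raise ValueError("bus_bound_fraction must be between 0 and 1")
    if bus_bound_fraction == 0.0:
        return float("inf")
    return 1.0 / bus_bound_fraction


if __name__ == "__main__":
    # Made-up example: half the time is spent on framebuffer/disk traffic.
    print(round(overall_speedup(0.5, 10.0), 2))
    print(speedup_ceiling(0.5))
```

With those made-up inputs, a CPU that is 10x faster gets you less than 2x overall, and no amount of CPU or cache gets you past 2x. The only way to move the ceiling is to shrink the bus-bound share, which is why the RAM and framebuffer would have to move onto the accelerator.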
All that said, I personally think there is some appeal in making a "modern" Mac logic board architecture, leveraging cost-effective modern components in a way that remains indelibly both vintage computing and a Mac, without invalidating the original design in favor of a 100% FPGA or a PiStorm approach.