Regarding the "speed" question (along with the "uses a hard to find SCSI chip" objection): any design which requires the microcontroller to bit-bang the SCSI bus directly is inevitably going to suffer some performance issues. SCSI chips like the 53c80 and friends exist because SCSI has very specific and rigorous requirements for the generation and handling of handshaking and arbitration signals. If you're not going to invest in a fair amount of logic for a state machine which provides buffers and triggers to offload some of that handshaking overhead, the CPU is going to be spending a *lot* of its time either polling for activity or running precise timing loops to hold X wire high or low for however long the protocol calls for.
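To put rough numbers on the bit-banging cost, here's a back-of-envelope sketch in Python. The 16 MHz clock and the cycles-per-byte figures are made-up illustrative values, not measurements from any real firmware, and the function name is just something I picked. It only gives a ceiling: it assumes the MCU does nothing *but* the handshake.

```python
def bitbang_throughput(clock_hz, cycles_per_byte):
    """Upper bound in bytes/sec if every CPU cycle goes to the handshake."""
    return clock_hz / cycles_per_byte

# Hypothetical figures for illustration only: a 16 MHz MCU at roughly one
# instruction per cycle, and guessed cycle counts per REQ/ACK handshake.
CLOCK_HZ = 16_000_000

for cycles in (20, 40, 80):
    rate = bitbang_throughput(CLOCK_HZ, cycles)
    print(f"{cycles:3d} cycles/byte -> {rate / 1000:.0f} kB/s")
```

Even with the optimistic guess the ceiling stays under 1 MB/s, and that's before spending a single cycle on command translation or the storage side.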
Here's the original draft specification for the SCSI interface. It *is* complicated, more complicated than most 8 or 16 bit CPU buses. This is why IDE seemed like a good idea when it came along: it's actually a simplified subset of an already simple 16 bit bus.
An MCU, even one that *almost* manages one instruction per cycle like AVR (or even ARM), is going to be working pretty hard just watching the bus pins and keeping track of what phase of the conversation it's in at any given time. If you want good performance (without going ridiculously overboard on the CPU) it would help a *lot* to have some of the bus handler in hardware. (That design that Techknight was playing with uses a second AVR as the bus handler, which is a clever idea that moves hardware complexity into software, but it still leaves you with one core that has to spend most of its time listening to and triggering pins, with just enough left over to shove semi-decoded data out the back door to the other AVR. That's not a recipe for "fast". Even *with* a SCSI ASIC, to get speeds in the 1Mbyte/sec-plus territory you'll probably need a pretty darn fast CPU and/or DMA. And this is ignoring the overhead of having to talk out the other side to the IDE/SD/USB/whatever drive and do the necessary command translation.)
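For a sense of what "keeping track of the phase" means: as I understand the spec, the information-transfer phases are encoded on three control lines (MSG, C/D, I/O), which the target drives. A minimal Python lookup of that encoding is below. The names `PHASES` and `decode_phase` are mine, and this deliberately leaves out bus free, arbitration, and selection, which are distinguished by BSY and SEL rather than by these three lines.

```python
# Information-transfer phase as a function of (MSG, C/D, I/O).
# Each value is True when the signal is asserted.
PHASES = {
    (False, False, False): "DATA OUT",
    (False, False, True):  "DATA IN",
    (False, True,  False): "COMMAND",
    (False, True,  True):  "STATUS",
    (True,  True,  False): "MESSAGE OUT",
    (True,  True,  True):  "MESSAGE IN",
}

def decode_phase(msg, cd, io):
    """Return the phase name; MSG without C/D is reserved."""
    return PHASES.get((msg, cd, io), "RESERVED")

print(decode_phase(False, True, False))  # COMMAND
```

The table itself is trivial; the work is in sequencing through it correctly while also servicing every byte's handshake.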
Anyway, having a bus handler in hardware which isn't a hard to find EOL IC is what leads us back to CPLDs/FPGAs. But FPGAs/CPLDs aren't magic; you'll still need a hardware design for the state machine to program into the devices. In casual Googling I have yet to find *anyone* who's "open-sourced" code for making a SCSI MAC, let alone a complete target, out of CPLDs/FPGAs. (The closest I came is a confusing thread about making an ACSI-SCSI adapter for Atari ST machines out of CPLDs, which may well be a promising start.) One approach would be to try to find documentation/schematics for *very old* SCSI target devices (i.e., early SCSI hard drives, especially the ones utilizing SCSI-to-MFM adapter boards) and use an example which does SCSI with discrete TTL as a starting point. Quick googling turned up schematics for some primitive SCSI (and SASI) *host adapters* that fit into a dozen or two TTL devices; presumably the hard drive side isn't much harder. (And the manual/schematic for one may well be out there. Based on the few board photos I've seen, it looks like a lot of the early formatter boards lacking ASICs used Zilog Z8 CPUs as the go-between for the TTL-decoded SCSI bus and the hard drive controller, which is convenient given the Z8 is vaguely comparable to an 8-bit PIC or AVR in capabilities.) From this point the project branches depending on whether you're using a CPLD or an FPGA:
CPLD: You figure out what triggers and buffers would make your life easier, pick the optimal bus design for presenting the simplified bus to the MCU of your choice (you'll still need one), and see how compressed you can make things. If you *really* want fast you still might need to consider working, say, an SRAM chip into the design so both SCSI and IDE transfers (if that's what you go with) can use DMA, in which case your CPLD(s) will need to arbitrate that too.
FPGA: With a roomy enough device you could put your bus handler and the softcore of your choice on one chip. (And presumably you'll choose a softcore that has high-level language support and lets you turn the clock speed up to eleven. Heck, if you want to keep the "hardware design" super-simple maybe you cram two softcores together and duplicate the "dual-CPU" nature of that AVR design. To speed things up massively you could have them share memory instead of having to communicate over a narrow bus, for instance. Heck, use three. One driving the SCSI MAC, one doing command translation, and a third driving the storage bus...)
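Whichever branch you take, the core of the bus handler is a small state machine around the REQ/ACK handshake. Here's a toy Python model of the target side receiving bytes asynchronously: assert REQ, wait for ACK, latch the byte, drop REQ, wait for ACK to drop. The class and function names are hypothetical, the "initiator" in `run` is a stand-in I wrote just to exercise it, and a real design would be HDL with the spec's timing and deskew requirements, which this ignores entirely.

```python
from enum import Enum, auto

class State(Enum):
    ASSERT_REQ = auto()
    WAIT_ACK = auto()
    WAIT_ACK_RELEASE = auto()

class TargetReceiver:
    """Toy model of a target receiving bytes via asynchronous REQ/ACK."""

    def __init__(self):
        self.state = State.ASSERT_REQ
        self.req = False   # signal driven by the target
        self.fifo = []     # bytes latched for the MCU to collect later

    def step(self, ack, data):
        """Advance one tick given the initiator's ACK and data lines."""
        if self.state is State.ASSERT_REQ:
            self.req = True
            self.state = State.WAIT_ACK
        elif self.state is State.WAIT_ACK:
            if ack:
                self.fifo.append(data)  # latch the byte while ACK is asserted
                self.req = False
                self.state = State.WAIT_ACK_RELEASE
        elif self.state is State.WAIT_ACK_RELEASE:
            if not ack:
                self.state = State.ASSERT_REQ
        return self.req

def run(payload):
    """Drive the receiver with a stand-in initiator; return latched bytes."""
    rx = TargetReceiver()
    ack, data, pending = False, 0, list(payload)
    while len(rx.fifo) < len(payload):
        req = rx.step(ack, data)
        if req and not ack and pending:
            data, ack = pending.pop(0), True   # present a byte, assert ACK
        elif not req and ack:
            ack = False                        # REQ dropped, release ACK
    return rx.fifo

print(run([0x12, 0x00, 0xFF]))  # [18, 0, 255]
```

The point of doing this in logic is the `fifo`: the MCU only has to show up when bytes are waiting, instead of babysitting every edge.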
So... did I have a point? Oh, yeah. Talking about CPLDs and FPGAs is all great, but what seriously needs to be kept in mind is that they don't significantly reduce the number of "moving parts" in the design. (And massively optimizing for performance will increase it that much more.) All they do is compress a bunch of discrete logic into a single package: you still need to design that logic. I've read the thread on the VCF discussing the board that's the subject of this thread, and while I roll my eyes where it devolves into a self-righteous flaming match, I do sort of understand the frustration it stemmed from, and I shake my head myself when people throw out "use an FPGA!" as if it's a magic bullet that makes design problems go away with a wave of a wand. I agree that the Z-80 board here isn't an optimal solution for either performance or future-proof-y-ness, but I have to give the folks some credit for actually pasting together something that at least partially works and is undoubtedly "educational". Two options for a next step, if someone wanted to take it slow, would be:
A: Interface a chip like the 53c80 to a *fast* MCU and code a driver in a high-level language. Or.
B: Take this design (since it does have at least semi-working firmware that exists in source form) and concentrate on building a SCSI ASIC-in-a-CPLD/FPGA that works in place of the 53c80. (This of course could start with a few dozen TTL chips on a breadboard, later translated into a CPLD.)
Success with both A+B results in your dream board that's faster, easier to understand, and doesn't rely on EOL hardware. You don't *have* to do everything at once.