Well, ish. =D It is backward compatible with the Disk II, but some features are entirely new. Some of these are relatively trivial (slow/fast mode just halves/doubles the clock, for example), but the asynchronous mode that the Mac uses is a bit more complex...
This is certainly an interesting project, especially if it's done on a CPLD. If people are willing, it might function well as a CPLD and IWM tutorial for software-oriented people like me. As a preamble, I have actually designed a synthesizable, practical, 16-bit CPU in Verilog. I think this means I'm not entirely clueless, but my lack of overall experience means I'm likely to ask a lot of dumb questions. If that can be tolerated, it'd be wonderful. Any insights from @bigmessowires would also be educational for me. So, if I can kick off with a couple of points:
The Shift And Buffer Regs
One of the most significant parts of the IWM is the shift register and buffer. Since the IWM can't both be reading and writing at the same time and can't be reading from one disk while writing to another, it makes me think that the IWM only has a single 8-bit shift reg and a single 8-bit buffer.
So, most shift reg bits can come either from the previous shift reg bit (for both reading and writing) or from the corresponding buffer reg bit. Conversely, buffer reg bits can come from the corresponding shift reg bit (for reading) or from the corresponding I/O pin (for writing). Likewise, most of the data buffer I/O pins are either outputting the corresponding buffer reg bit or inputting from the data bus D<7:0>.
There are some exceptions to this. ShiftReg<7> can output to WRDATA in output mode; ShiftReg<0> can input from RDDATA. D<7> can be outputting a bit from a status reg or the RDSense.
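To make the data paths above concrete, here is a rough behavioural sketch in Python. It is only a model of the single-shift-reg, single-buffer guess, not the real IWM logic: the class and method names are made up for illustration, and it ignores how a byte boundary is detected and how the WRDATA pin is actually driven.

```python
class ShiftAndBuffer:
    """Hypothetical model: one 8-bit shift reg plus one 8-bit buffer reg."""

    def __init__(self):
        self.shift = 0   # ShiftReg<7:0>
        self.buffer = 0  # BufferReg<7:0>

    # Read path: RDDATA enters at ShiftReg<0> and moves towards ShiftReg<7>.
    def shift_in(self, rddata_bit):
        self.shift = ((self.shift << 1) | (rddata_bit & 1)) & 0xFF

    # Read path: buffer reg bits are copied from the matching shift reg bits.
    def latch_to_buffer(self):
        self.buffer = self.shift

    # Write path: buffer reg bits are loaded from the data bus D<7:0>.
    def load_from_bus(self, d):
        self.buffer = d & 0xFF

    # Write path: shift reg bits are loaded from the matching buffer reg bits.
    def load_shift_from_buffer(self):
        self.shift = self.buffer

    # Write path: ShiftReg<7> is the bit presented towards WRDATA.
    def shift_out(self):
        wrdata_bit = (self.shift >> 7) & 1
        self.shift = (self.shift << 1) & 0xFF
        return wrdata_bit


regs = ShiftAndBuffer()
for bit in [1, 1, 0, 1, 0, 1, 0, 1]:   # the byte 0xD5, first bit first
    regs.shift_in(bit)
regs.latch_to_buffer()
print(hex(regs.buffer))                # 0xd5
```

The same two registers serve both directions here, which is the whole point of the guess: only the source of each bit changes between reading and writing.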
I find FPGA architectures easier to understand than CPLD architectures, because they tend to consist of a LUT and flip-flop, whereas a CPLD macrocell seems to be quite convoluted. Nevertheless, it looks to me like a single 16-macrocell logic block can implement both the 8-bit shift and buffer regs.

The individual product terms should be able to select the right source for a shift reg bit: (shifting & ShiftReg<n-1>) | (loading & BufferReg<n>). The product term for the clock input should be obtainable from the asynchronous clock or synchronous clock (which would be coming from a different macroblock). So, that's 3 product terms for most of the shift regs (two for the data input, one for the clock), and something similar would apply for the BufferReg.
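The per-bit expression can be checked with a few lines of Python. This is just a sketch of the two data product terms; the function names are hypothetical, and feeding RDDATA into bit 0 follows the exception noted earlier. When neither `shifting` nor `loading` is asserted the result is 0, so it only models clock edges on which the register is meant to change, which fits the idea of a separate product term gating the clock.

```python
def next_shift_bit(n, shift_reg, buffer_reg, shifting, loading, rddata=0):
    """Next value of ShiftReg<n> as a two-term sum of products."""
    # ShiftReg<0> has no previous bit, so it takes RDDATA instead.
    prev = rddata if n == 0 else (shift_reg >> (n - 1)) & 1
    buf = (buffer_reg >> n) & 1
    return (shifting & prev) | (loading & buf)


def next_shift_reg(shift_reg, buffer_reg, shifting, loading, rddata=0):
    """Apply the per-bit equation to all eight bits."""
    value = 0
    for n in range(8):
        bit = next_shift_bit(n, shift_reg, buffer_reg, shifting, loading, rddata)
        value |= bit << n
    return value


# Shifting: every bit moves up one place and RDDATA enters at bit 0.
print(bin(next_shift_reg(0b01101010, 0x00, shifting=1, loading=0, rddata=1)))
# Loading: the shift reg takes the buffer reg contents.
print(hex(next_shift_reg(0x12, 0xD5, shifting=0, loading=1)))
```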
So, obviously I know a CPLD isn't normally programmed directly against its specific underlying architecture (unless that's what CUPL does), but I would have thought a decent HDL compiler should be able to achieve something like that efficiency?
Clock Rate Puzzle
I understand that the clock rate can be synchronous (at 7MHz or 8MHz) or asynchronous. However, the bit cell rate is 0.5MHz or 0.25MHz in synchronous mode (and I'm presuming FCLK/16 for asynchronous mode). Why does the IWM need so many cycles per bit? Is it averaging them, or clocking each bit cell in the middle of a cycle? (The latter is my guess.)
But even that's a bit of a puzzle, because surely disk data is asynchronous w.r.t. the IWM clock cycles?
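For what it's worth, the arithmetic and the "clock each bit cell in the middle" guess can be sketched in Python. This is a generic oversampling model under my own assumptions, not documented IWM behaviour: the clock and bit rates are the nominal figures quoted above, the function names are made up, and restarting the cell window at every read pulse is simply one common way for a clocked circuit to follow data that isn't synchronised to it.

```python
def cycles_per_cell(fclk_hz, bit_rate_hz):
    """How many FCLK cycles fit in one bit cell."""
    return fclk_hz / bit_rate_hz


def recover_bits(pulse_cycles, cell):
    """Turn the FCLK cycle numbers at which read pulses arrive into bits.

    The gap since the previous pulse is rounded to the nearest whole number
    of cells: the final cell holds the pulse (a 1) and any earlier cells are
    0s. Timing restarts at each pulse, so drift does not accumulate.
    """
    bits = []
    last = 0
    for t in pulse_cycles:
        cells = max(1, (t - last + cell // 2) // cell)
        bits.extend([0] * (cells - 1) + [1])
        last = t
    return bits


print(cycles_per_cell(7_000_000, 500_000))   # 14.0
print(cycles_per_cell(8_000_000, 250_000))   # 32.0

# Pulses that have drifted a cycle or so either side of the ideal 14-cycle grid.
print(recover_bits([14, 29, 55], cell=14))   # [1, 1, 0, 1]
```

With only one or two clock cycles per cell there would be no way to tell a slightly early pulse from a slightly late one, so a dozen or more cycles per cell gives that rounding step something to work with.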