During the vacations, as a new 'crazy project', I decided to have a shot at designing a co-processor in my FPGA for the 68030. The '020 and '030 have this nifty (if slow) co-processor interface. Not a traditional memory-mapped device, but a real co-processor defining new instructions that can be added to your code to do extra stuff.
So far it only does AES encryption in a not-very-efficient way, but to the best of my knowledge it's the first non-Motorola co-processor for the '030 (I currently use synchronous cycles, which should be easy to downgrade to asynchronous for the '020). Motorola did the 68851 MMU for the '020 (a simpler MMU is built into the '030), and the 68881/68882 FPU for the '020 and '030. I don't know of any others; does anyone know better? Sun did their own MMU, but as far as I can make out from the Sun 3 schematics and the NetBSD kernel source, that was memory-mapped. Weitek did FPUs that were occasionally used with 68k (Sun's FPA comes to mind), but as far as I can tell they were also memory-mapped. No-one back then seems to have bothered doing an '020/'030-only device using the co-processor interface. Memory-mapped with 32-bit addressing is a lot more generic and made a lot more sense, to be fair. The interface was dropped in the '040. Shame.
The nice thing about the co-processor interface is that you can use e.g. the AES instructions inline as if they were part of the CPU core (same as the FPU instructions). So you can define a set of macros such as:
Code:
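/* F-line opcode 0xFC10, then the command word carrying the data register number;
   the EA is hardwired to (%a0), so the address of y is loaded there first. */
#define KAES32E0(x, y) asm("lea %1, %%a0\n" \
    ".word 0xFC10\n.word 0x0000 + _num_%0\n" : "+d" (x) : "m" (y) : "a0")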
Where the F-line instruction (0xFC10) indicates a co-processor instruction and the next word is sent to the co-processor. The co-processor can then request the CPU to send the memory content pointed to by the Effective Address (EA) in the co-processor instruction (here it's hardwired to %a0 as I wasn't sure how to produce the right opcodes for arbitrary effective addresses, the LEA takes care of that for me). It can also request the register (the number is passed as part of the co-processor instruction word), and then send back the result to the same register.
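To make the 0xFC10 opcode less magic, here is a small sketch of how the first word of a general co-processor instruction is put together, following the field layout documented for the 68020/68030 co-processor interface. The helper name cp_gen_opword is made up for illustration; it is not part of any toolchain.

```c
#include <stdint.h>

/* Build the first (F-line) word of a general co-processor instruction.
 * Field layout on the 68020/68030:
 *   bits 15-12: 1111 (F-line)
 *   bits 11-9 : co-processor ID (CpID)
 *   bits  8-6 : instruction type (000 = general)
 *   bits  5-3 : effective-address mode
 *   bits  2-0 : effective-address register
 * cp_gen_opword is a hypothetical helper name, used here for illustration only. */
static uint16_t cp_gen_opword(unsigned cpid, unsigned ea_mode, unsigned ea_reg)
{
    return (uint16_t)(0xF000u
                      | ((cpid & 7u) << 9)     /* which co-processor */
                      | (0u << 6)              /* type 000: general instruction */
                      | ((ea_mode & 7u) << 3)  /* e.g. 2 = (An) indirect */
                      | (ea_reg & 7u));        /* e.g. 0 = a0 */
}
```

With CpID 6, mode 2 (address register indirect) and register 0, this gives 0xFC10, i.e. the opcode used in the macro with its EA set to (%a0). The same layout with CpID 1 gives the 0xF2xx opcodes of the FPU.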
Then you can use it to implement an AES round in C easily:
Code:
#define AES_ROUND1T(TAB,I,X0,X1,X2,X3,Y) \
{ \
    X0 = TAB[I++]; \
    KAES32E0(X0, Y); \
    X1 = TAB[I++]; \
    KAES32E1(X1, Y); \
    X2 = TAB[I++]; \
    KAES32E2(X2, Y); \
    X3 = TAB[I++]; \
    KAES32E3(X3, Y); \
}
Each instruction handles the full update of the relevant word from the four words in the array Y; they are basically merged versions of the RV32K AES instructions. For instance the first instruction KAES32E0 is equivalent to the R5K sequence:
Code:
X0 = aes32esmi0(TAB[I++],Y0); \
X0 = aes32esmi1(X0,Y1); \
X0 = aes32esmi2(X0,Y2); \
X0 = aes32esmi3(X0,Y3); \
(The numbering means different things in the two ISAs: in R5K it's the byte offset in the word, in my code it's which word in the array to start with. Original R5K code here.)
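For anyone who wants to check the equivalence in software, below is a rough C model of aes32esmi as I understand it from the RISC-V scalar crypto spec, plus a merged version in the spirit of the KAES32Ex instructions. The function kaes32e and its start parameter are hypothetical names. The wrap-around order of the Y words for E1..E3 is an assumption based on the description above, and the model uses the RISC-V little-endian word layout, which may not match what the FPGA does on a big-endian 68030. The S-box is computed by brute force to keep the listing short, so this is slow.

```c
#include <stdint.h>

/* Multiply two elements of GF(2^8) modulo the AES polynomial (0x11B). */
static uint8_t gf_mul(uint8_t a, uint8_t b)
{
    uint8_t r = 0;
    while (b) {
        if (b & 1) r ^= a;
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1B : 0));
        b >>= 1;
    }
    return r;
}

static uint8_t rotl8(uint8_t v, unsigned n)
{
    return (uint8_t)((v << n) | (v >> (8 - n)));
}

/* AES forward S-box: multiplicative inverse (brute force) then affine map. */
static uint8_t aes_sbox(uint8_t x)
{
    uint8_t inv = 0;
    for (unsigned c = 1; x != 0 && c < 256; c++)
        if (gf_mul(x, (uint8_t)c) == 1) { inv = (uint8_t)c; break; }
    return (uint8_t)(inv ^ rotl8(inv, 1) ^ rotl8(inv, 2)
                         ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63);
}

static uint32_t rotl32(uint32_t v, unsigned n)
{
    return n ? (v << n) | (v >> (32 - n)) : v;
}

/* Model of RISC-V aes32esmi: take byte `bs` of rs2, apply the S-box,
 * expand it to its MixColumns contribution, rotate into place, XOR into rs1. */
static uint32_t aes32esmi(uint32_t rs1, uint32_t rs2, unsigned bs)
{
    uint8_t so = aes_sbox((uint8_t)(rs2 >> (8 * bs)));
    uint32_t mixed = ((uint32_t)gf_mul(so, 3) << 24) | ((uint32_t)so << 16)
                   | ((uint32_t)so << 8) | gf_mul(so, 2);
    return rs1 ^ rotl32(mixed, 8 * bs);
}

/* Hypothetical model of a merged instruction: four aes32esmi steps in one go.
 * ASSUMPTION: `start` selects the first word of y and the index wraps mod 4. */
static uint32_t kaes32e(uint32_t x, const uint32_t y[4], unsigned start)
{
    for (unsigned bs = 0; bs < 4; bs++)
        x = aes32esmi(x, y[(start + bs) & 3], bs);
    return x;
}
```

With start = 0 this is the four-step R5K sequence shown above, applied to a value x that already holds the round-key word.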
You can do a lot of powerful stuff with that interface - though the software side might quickly become a problem. For instance in the example code above, the array Y is sent four times (once per instruction) despite the fact it doesn't change. It could be sent just once, but then it creates an extra state in the "CPU"... and that extra state needs to be saved/restored when context switching, which MacOS obviously won't do.
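The hidden-state problem can be shown with a toy model. Everything here is hypothetical: copro_state, copro_load_y and friends are invented names, and the "round" is a plain XOR placeholder rather than real AES. The point is only that a cached Y belongs to whoever loaded it last, so something has to save and restore it around a task switch (the 68k interface has cpSAVE/cpRESTORE instructions for that, but the OS has to actually issue them).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical hidden co-processor state: a cached copy of Y. */
typedef struct { uint32_t y[4]; } copro_state;

static copro_state copro;   /* the one physical co-processor */

/* Hypothetical "send Y once" instruction. */
static void copro_load_y(const uint32_t y[4])
{
    memcpy(copro.y, y, sizeof copro.y);
}

/* Placeholder for a round instruction that reads the cached Y.
 * This is a plain XOR, NOT the real AES round. */
static uint32_t copro_round(uint32_t x)
{
    return x ^ copro.y[0] ^ copro.y[1] ^ copro.y[2] ^ copro.y[3];
}

/* What the OS would have to do on every context switch. */
static void copro_save(copro_state *out)      { *out = copro; }
static void copro_restore(const copro_state *in) { copro = *in; }
```

If task A calls copro_load_y, gets preempted by a task B that loads its own Y, and then resumes without a copro_restore, A's next copro_round silently uses B's data. Resending Y with every instruction, as the current design does, avoids that at the cost of bus traffic.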
Kinda useless (I guess ssheven could theoretically benefit, but I don't have a NIC in my IIsi to test that theory), but I'm happy it seems to work.