Along similar lines…the pi pico w has WiFi. It would be so cool to incorporate a WiFi modem on the second core and pipe it to umac on the main core. Would that work?
The logical or simplest thing to do, surely, is to make it compatible with AirTalk? It then pipes AppleTalk packets over UDP, and Macs or emulated Macs can talk to it. It'd be quite a challenge to use full AppleTalk though, because the AppleTalk stack AFAIK won't fit within the RAM of a Pico-Mac.
However, you don't need the stack to fit within a Pico-Mac's RAM if you can add functionality to Pico-Mac. This could take three forms:
- A standard means of mapping Pico I/O (and Pico interrupts) to the Pico-Mac 68K side. Pico I/O occupies a wide address space, but it's fairly sparse. A suitable scheme could compress it to e.g. 64kB of the 68K address space. Then you can write Mac code that directly interfaces with Pico I/O. For example, an app to access GPIOs or even PIOs!
- A standard means of extending the Pico-Mac emulator to provide access to additional Cortex M0+ APIs from the Mac side via traps. An obvious way to do that would be to literally define a TRAP #n interface, one that isn't used by debuggers, Atari TOS or Sinclair QL QDOS (in case one wants to run these within the Mac emulator). Most of the actual TRAPs #0 to #15 are available, so AFAIK it wouldn't clash with any of these, and you only need one. Executing a PicoTrap would dispatch to some emulator host code.
- Universal binaries! A standard means of running Cortex M0+ code from Mac applications, INITs or control panels. Cdex resources define host code. The Mac side loads a reference to the resource and executes a TRAP #PicoTrap API call. Pico code takes over and runs the code in the resource pointed to by the reference. The important thing is that it's XIP code which can run from flash (and thus doesn't need Mac RAM), and that it's contiguous in flash too.
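The TRAP-based scheme in the second bullet might look roughly like this on the emulator side. It's only a sketch: the trap number (`PICO_TRAP`), the use of D0 as a service selector and D1 as an argument, and the two host services are all made up for illustration, and none of it is something Pico-Mac defines today. The one fixed fact it relies on is the 68K encoding of TRAP #n, which is 0x4E40 + n.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical choice of trap number; any free TRAP #0..#15 would do. */
#define PICO_TRAP 12u

/* Minimal view of the emulated 68K registers (illustrative only). */
typedef struct {
    uint32_t d[8];
    uint32_t a[8];
} M68kRegs;

/* Host-side service: gets the emulated registers, returns a value for D0. */
typedef uint32_t (*PicoService)(M68kRegs *regs);

/* Two made-up services standing in for real Cortex M0+ host code. */
static uint32_t svc_version(M68kRegs *regs) { (void)regs; return 0x0100u; }
static uint32_t svc_add_one(M68kRegs *regs) { return regs->d[1] + 1u; }

static const PicoService pico_services[] = { svc_version, svc_add_one };
#define PICO_SERVICE_COUNT (sizeof pico_services / sizeof pico_services[0])

/*
 * Called by the emulator core for each fetched opcode.
 * Returns 1 if the opcode was a PicoTrap and was handled on the host,
 * 0 if the core should execute it as a normal 68K instruction.
 */
static int pico_trap_dispatch(uint16_t opcode, M68kRegs *regs)
{
    if ((opcode & 0xFFF0u) != 0x4E40u)      /* not TRAP #n */
        return 0;
    if ((opcode & 0x000Fu) != PICO_TRAP)    /* some other trap vector */
        return 0;

    uint32_t selector = regs->d[0];
    if (selector < PICO_SERVICE_COUNT)
        regs->d[0] = pico_services[selector](regs);
    else
        regs->d[0] = 0xFFFFFFFFu;           /* unknown selector: error code */
    return 1;
}
```

On the Mac side, the caller would load the selector into D0 and execute the trap; everything else stays on the host, so none of it costs Mac RAM.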
Hey Snial, weren't you working on an emulator of your own? How far along is that?
Thanks for asking. I was a fair way along. It's all in Cortex M0+ assembly.
I just tried pico-mac on rp2040 and it's not bad. But it would be nice to have more than a mac 128k running. The newer rp2350 has additional instructions as well as being a little faster. Had you considered targeting the rp2350 at all? If your emulator was faster, and maybe more importantly, lower ram, what kind of machine might be possible?
MØBius would be able to support 256kB Macs on a standard RP2040 and 512kB Macs on an RP2350.
I wish pico-mac supported sound - it isn’t quite the same without the beep!
68kmla.org
If you build it, they will come!
Thanks!
<snip> If the mac uses those 640x432 pixels as an output only (ie never reads it back) <snip>
QuickDraw expects to be able to write to a memory-mapped display. Well, this is not quite true. When an app starts up, it sets up a QuickDraw variable that can point to replacement graphics primitives. It might be possible to replace those with functions that stream graphics commands and data to a second CPU.

View attachment 87403
For this system, you don't really even need much code to sit on the Mac or RP2040 host side. You probably need an INIT so that by default the pointers to the replacement routines are defined. These pointers could point to Cortex M0+/M33 code on the RP2040's flash using some kind of primitive mixed-mode manager. The actual routines grab any needed data and send the command and data to the target RP2040.
Many routines don't need much of an implementation on the host side: line, rect, rRect, oval, arc, poly, just need to send the command code and data. I believe txMeasProc and textProc can just point to the standard routines. The target would need to implement them properly, but it might be possible to hack that too, by having a Mac ROM on the target, running a rudimentary app which just picks up the data stream and translates it into equivalent commands.
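To give a feel for how little the host-side replacement routines need to do, here is a sketch of packing one rectangle command for the link to the second RP2040. The wire format (a command byte, a verb byte, then the four Rect coordinates as big-endian 16-bit values) and the command numbers are invented for this example; only the Rect field order (top, left, bottom, right) follows QuickDraw.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical command codes for the host-to-display link. */
enum { CMD_LINE = 1, CMD_RECT = 2, CMD_RRECT = 3, CMD_OVAL = 4, CMD_ARC = 5, CMD_POLY = 6 };

/* QuickDraw Rect layout: top, left, bottom, right, each a 16-bit integer. */
typedef struct { int16_t top, left, bottom, right; } Rect;

/* Append a 16-bit value in big-endian (68K) byte order; returns the new position. */
static size_t put_u16(uint8_t *buf, size_t pos, uint16_t v)
{
    buf[pos]     = (uint8_t)(v >> 8);
    buf[pos + 1] = (uint8_t)(v & 0xFFu);
    return pos + 2;
}

/*
 * Pack a rectangle command into buf, which must hold at least 10 bytes.
 * 'verb' is the drawing verb handed to the replaced rect routine.
 * Returns the number of bytes written.
 */
static size_t pack_rect_cmd(uint8_t *buf, uint8_t verb, const Rect *r)
{
    size_t pos = 0;
    buf[pos++] = CMD_RECT;
    buf[pos++] = verb;
    pos = put_u16(buf, pos, (uint16_t)r->top);
    pos = put_u16(buf, pos, (uint16_t)r->left);
    pos = put_u16(buf, pos, (uint16_t)r->bottom);
    pos = put_u16(buf, pos, (uint16_t)r->right);
    return pos;
}
```

The other shape routines would follow the same pattern with different payloads; the packed bytes then go out over whatever link joins the two chips.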
An RP2040 dedicated to a Mac display can cope with far more than 640x430. 1440 x 900 could be done easily (162kB), as could classic Portrait monitor sizes. Having said that, it might be better to limit it to half the RAM so that when the data is picked up, it can be transferred offscreen and then target QuickDraw routines used to complete the operation. Even so, 1024x768 would still be possible on an RP2040.
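The sizes above can be checked with a few lines of arithmetic. This sketch assumes a 1-bit-per-pixel frame buffer (which is what the 162kB figure implies) and the RP2040's 264kB of on-chip SRAM; the function names are just for illustration.

```c
#include <stdint.h>

/* RP2040 on-chip SRAM: 264kB. */
#define RP2040_SRAM_BYTES (264u * 1024u)

/* Bytes needed for a 1-bit-per-pixel frame buffer, rows padded to whole bytes. */
static uint32_t fb_bytes_1bpp(uint32_t width, uint32_t height)
{
    return ((width + 7u) / 8u) * height;
}

/* Does the frame buffer fit in half of SRAM, leaving the rest for offscreen work? */
static int fits_half_sram(uint32_t width, uint32_t height)
{
    return fb_bytes_1bpp(width, height) <= RP2040_SRAM_BYTES / 2u;
}
```

With these numbers 1440x900 needs 162,000 bytes, which fits in SRAM but not in half of it, while 1024x768 needs 98,304 bytes and does fit in half.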
Yea, it sounds kinda complicated.
Oh, I was trying to say it was actually feasible. At first I was thinking that video memory had to be memory-mapped on a Mac Plus, and that since you can't memory-map the SRAM of a second CPU, you wouldn't be able to have a larger display. In reality, you can substitute different QuickDraw functions. However, not all software would work: software that reads and writes the frame buffer directly would fail; only software that goes through QuickDraw would work.
<snip> rp2350 <snip> external PSRAM <snip> throughput info here: https://forums.raspberrypi.com/viewtopic.php?t=386630
As a general software design principle it's better to start with something simpler to get it to work and then enhance it.
OK, I've done a bit of work on NBCD timing. NBCD Dn takes 6 cycles (2 internal) and NBCD <ea> takes at least 8c more for a 4 cycle fetch. This means the RP2040 version needs to be: (4/6.5+2/7.5)*125=110 RP2040 cycles at 125MHz. On MØBius, InstructionFetch+ExtraInsDecode+SrcDn+Dst.Dn(BL) will be 40 cycles, leaving 70 cycles for the actual instruction execution. This means I don't need an efficient routine to be faster than a real Mac. I estimate my routine might take 20 cycles overall, so it'll run at the equivalent of an 11.9MHz Mac.
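The budget arithmetic above can be written out as a small helper. The 6.5MHz and 7.5MHz rates and the 40 + 20 cycle estimate come from the post; scaling a 6.5MHz base by budget/used is just one reading of how the 11.9MHz figure arises, so treat that part as an assumption.

```c
/* Convert 68K cycles at a given clock (MHz) into host cycles at host_mhz. */
static double host_cycles(double m68k_cycles, double m68k_mhz, double host_mhz)
{
    return m68k_cycles / m68k_mhz * host_mhz;
}

/* Budget for NBCD Dn as given in the post: 4 cycles at 6.5MHz plus 2 at 7.5MHz. */
static double nbcd_dn_budget(double host_mhz)
{
    return host_cycles(4.0, 6.5, host_mhz) + host_cycles(2.0, 7.5, host_mhz);
}

/*
 * Equivalent Mac clock when the emulated instruction uses 'used' host cycles
 * out of a budget of 'budget' cycles (assumed interpretation, see above).
 */
static double equivalent_mhz(double budget, double used, double base_mhz)
{
    return base_mhz * budget / used;
}
```

Here `nbcd_dn_budget(125.0)` comes to a little over 110 cycles, and `equivalent_mhz(110.0, 40.0 + 20.0, 6.5)` to roughly 11.9.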
- Stupidly I got stuck on Nbcd despite the fact it's hardly used, because I started to get obsessed about the quickest implementation, given that Cortex Mx doesn't support it directly. I was trying to be too clever by handling both nybbles at once. I should have figured out how many clock cycles I have free and just implemented nybble ops at least as fast as that.
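For reference, the plain nybble-at-a-time approach looks like this in C. This is a behavioural model only, not the Cortex M0+ assembly used in MØBius: it computes 0 - dst - X in packed BCD, sets C and X to the decimal borrow, and clears Z only when the result is non-zero (N and V are undefined for NBCD and are not modelled). It assumes the input is valid BCD; the type and function names are made up for the example.

```c
#include <stdint.h>

/* Only the flags NBCD defines. */
typedef struct {
    int x;   /* extend */
    int z;   /* zero (sticky: only ever cleared here) */
    int c;   /* carry / decimal borrow */
} BcdFlags;

/* NBCD on one byte: result = 0 - dst - X, in packed BCD. */
static uint8_t nbcd8(uint8_t dst, BcdFlags *f)
{
    /* Low digit first, borrowing from the high digit if it goes negative. */
    int lo = 0 - (dst & 0x0F) - f->x;
    int borrow = 0;
    if (lo < 0) { lo += 10; borrow = 1; }

    /* High digit, including any borrow from the low digit. */
    int hi = 0 - (dst >> 4) - borrow;
    borrow = 0;
    if (hi < 0) { hi += 10; borrow = 1; }

    uint8_t result = (uint8_t)((hi << 4) | lo);
    f->c = borrow;
    f->x = borrow;
    if (result != 0)
        f->z = 0;       /* Z is left unchanged when the result is zero */
    return result;
}
```

For example, negating 0x25 with X clear gives 0x75 (100 - 25) with C and X set.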
Thirdly, I thought of a way to simplify the target side without having to rewrite QuickDraw! You make the target RP2040 also a Mac emulator, except this time it's running a dedicated program, one that simply picks up commands from the host and gets QuickDraw to draw them on a larger screen.
That's really a brilliant idea - it was the rewriting QuickDraw that was sounding like a lot of work.
It would be! There's something called the Macintosh Application Environment, which has a compiled version of QuickDraw though.
Aside from enabling a larger screen, you'd eventually be able to use the rp2350 hstx to do DVI
I only know about RP2040 doing VGA, but PIOs are clever.
But I agree with your sentiment of getting the simplest thing working first.
Well, I think I've done NBCD now. I've estimated the timings for the Dn case rather than the memory-accessing case, though my implementation covers both. Ironically, although you'd think the register case is the simplest and fastest, relatively speaking it's the slowest and most critical. That's because instruction decode and effective address decode (including the Dn mode) are slow, leaving relatively little time for the execution phase, whereas reading and writing memory locations is much faster than the 4 cycles at 6.5MHz of a real M68000, even with address translation. So, if the Dn mode is fine, then the memory-accessing modes will be fine.
Nice! So going back to the testing/validation question: how do you structure a test of the emulator core? Do you emulate the Cortex code so you don't have to run it on pico hw?
I'd do the most basic testing bottom-up, so I'd test vectoring to all the different primary opcodes (which for me, means the 256 entries from the top byte). All I have to do there is check I can execute each M68000 instruction vector. I can replace each of them with a dummy routine that just goes back to the vectoring routine. The vectoring routine is small, but critical.
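In C terms, the bottom-up vectoring test could be modelled like this. MØBius itself is Cortex M0+ assembly, so this is only an illustration of the idea: a 256-entry table indexed by the opcode's top byte, every entry pointing at a dummy routine, and a check that each one is reached exactly once. All names here are hypothetical.

```c
#include <stdint.h>

#define PRIMARY_VECTORS 256

typedef void (*OpHandler)(uint16_t opcode);

static unsigned visited[PRIMARY_VECTORS];       /* hits per top byte */
static OpHandler vector_table[PRIMARY_VECTORS];

/* Dummy routine: records the visit and returns straight to the dispatcher. */
static void dummy_handler(uint16_t opcode)
{
    visited[opcode >> 8]++;
}

/* Point every primary vector at the dummy routine and clear the counters. */
static void install_dummies(void)
{
    for (int i = 0; i < PRIMARY_VECTORS; i++) {
        vector_table[i] = dummy_handler;
        visited[i] = 0;
    }
}

/* The small but critical part: pick the handler from the opcode's top byte. */
static void dispatch(uint16_t opcode)
{
    vector_table[opcode >> 8](opcode);
}

/* Returns how many of the 256 vectors were reached exactly once. */
static int test_all_vectors(void)
{
    install_dummies();
    for (unsigned top = 0; top < PRIMARY_VECTORS; top++)
        dispatch((uint16_t)(top << 8));

    int ok = 0;
    for (int i = 0; i < PRIMARY_VECTORS; i++)
        if (visited[i] == 1)
            ok++;
    return ok;
}
```

A pass is `test_all_vectors()` returning 256; real handlers can then replace the dummies one at a time.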
Perhaps obtaining that expected end state from an existing 68k emulator?
So, when things are starting to stabilise, I think I'd probably grab tests from an existing 68K emulator and run those against MØBius.
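Comparing end states against an existing emulator could be organised along these lines. This is a generic differential-test harness, not anything from MØBius or a particular 68K emulator: the `RunFn` interface, the state layout and the two toy cores are placeholders. A real generator would also restrict itself to valid instructions and compare memory, which this sketch leaves out.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t d[8], a[8], pc;
    uint16_t sr;
} CpuState;

/* A core under test: runs 'count' opcodes from 'code', updating *state. */
typedef void (*RunFn)(CpuState *state, const uint16_t *code, int count);

/* Small deterministic generator so a failing program can be reproduced from its seed. */
static uint32_t next_random(uint32_t *seed)
{
    *seed = *seed * 1664525u + 1013904223u;
    return *seed;
}

static int states_equal(const CpuState *x, const CpuState *y)
{
    return memcmp(x->d, y->d, sizeof x->d) == 0 &&
           memcmp(x->a, y->a, sizeof x->a) == 0 &&
           x->pc == y->pc && x->sr == y->sr;
}

/* Runs random programs on both cores; returns the first mismatching program, or -1. */
static int differential_test(RunFn reference, RunFn candidate,
                             uint32_t seed, int programs, int length)
{
    uint16_t code[64];
    if (length > 64)
        length = 64;
    for (int p = 0; p < programs; p++) {
        for (int i = 0; i < length; i++)
            code[i] = (uint16_t)(next_random(&seed) >> 16);

        CpuState ref_state, cand_state;
        memset(&ref_state, 0, sizeof ref_state);
        memset(&cand_state, 0, sizeof cand_state);
        reference(&ref_state, code, length);
        candidate(&cand_state, code, length);
        if (!states_equal(&ref_state, &cand_state))
            return p;
    }
    return -1;
}

/* Toy stand-ins for the reference emulator and the emulator under test. */
static void toy_core(CpuState *s, const uint16_t *code, int count)
{
    for (int i = 0; i < count; i++) { s->d[0] += code[i]; s->pc += 2; }
}

static void toy_core_buggy(CpuState *s, const uint16_t *code, int count)
{
    toy_core(s, code, count);
    s->sr ^= 1u;            /* deliberately wrong carry flag */
}
```

In practice `reference` would wrap the existing emulator and `candidate` would wrap MØBius, each behind a thin adapter.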
Minor MØBius update: I've now done the next instruction: PEA (Push Effective Address). As usual for this stage of development, even single instructions can lead to a relatively large amount of development. In this case, PEA needs an effective address, but unlike my previous EA routines, I don't want to fetch the contents (nor store them); I just want to calculate the address. This led to another set of routines. However, when writing them I realised that my 'MMU' code (which isn't a real 68K MMU, it just translates from logical 68K addresses to physical addresses on the RP2040 or calls an I/O routine if it's an I/O address) had a few errors and could have trampled over some registers used by routines that call it. So, I had to go through my code to make sure that didn't happen and needed to make a few corrections.
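The difference between an address-only EA routine and the fetching ones can be shown with a small C model of PEA. It is a sketch, not the MØBius code: only the (An) and (d16,An) modes are handled, the emulated RAM is a tiny array, and the "MMU" is reduced to an address mask. The PEA encoding (0x4840 plus the EA field) is the standard 68K one.

```c
#include <stdint.h>

#define RAM_SIZE 0x1000u    /* 4kB of emulated RAM, purely for the example */

typedef struct {
    uint32_t d[8], a[8];    /* a[7] is the stack pointer */
    uint8_t  ram[RAM_SIZE];
} Cpu;

/* Stand-in for address translation: wrap logical addresses into the RAM array. */
static void write8(Cpu *cpu, uint32_t addr, uint8_t v)
{
    cpu->ram[addr & (RAM_SIZE - 1u)] = v;
}

static uint8_t read8(const Cpu *cpu, uint32_t addr)
{
    return cpu->ram[addr & (RAM_SIZE - 1u)];
}

/* 68K memory is big-endian. */
static void write32(Cpu *cpu, uint32_t addr, uint32_t v)
{
    write8(cpu, addr,     (uint8_t)(v >> 24));
    write8(cpu, addr + 1, (uint8_t)(v >> 16));
    write8(cpu, addr + 2, (uint8_t)(v >> 8));
    write8(cpu, addr + 3, (uint8_t)v);
}

static uint32_t read32(const Cpu *cpu, uint32_t addr)
{
    return ((uint32_t)read8(cpu, addr) << 24) | ((uint32_t)read8(cpu, addr + 1) << 16) |
           ((uint32_t)read8(cpu, addr + 2) << 8) | read8(cpu, addr + 3);
}

/* Compute an effective address without reading or writing the operand. */
static int ea_address(const Cpu *cpu, unsigned mode, unsigned reg,
                      int16_t ext, uint32_t *addr)
{
    switch (mode) {
    case 2:  *addr = cpu->a[reg];                          return 1;  /* (An)     */
    case 5:  *addr = cpu->a[reg] + (uint32_t)(int32_t)ext; return 1;  /* (d16,An) */
    default: return 0;                                     /* not modelled here */
    }
}

/* PEA <ea>: push the computed address. Returns 1 if executed, 0 otherwise. */
static int pea(Cpu *cpu, uint16_t opcode, int16_t ext)
{
    uint32_t addr;
    if ((opcode & 0xFFC0u) != 0x4840u)
        return 0;
    if (!ea_address(cpu, (opcode >> 3) & 7u, opcode & 7u, ext, &addr))
        return 0;           /* includes mode 0, which is SWAP Dn, not PEA */
    cpu->a[7] -= 4;
    write32(cpu, cpu->a[7], addr);
    return 1;
}
```

The address is computed before the stack pointer moves, so `PEA (A7)` pushes the old stack pointer value.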
I’d imagine there are specific tests to run, but after the emulator is passing those, would you do something like execute a randomly generated code listing and then compare the final state (flags, memory, etc) against the expected state? Perhaps obtaining that expected end state from an existing 68k emulator?
<snip> the same conventions/interface functions as e.g. musashi? So a pico-umac build could swap in MØBius as the engine?
I would if it doesn't compromise the performance of MØBius. There's not much point if compatibility leaves MØBius without a compelling edge over other 'C'-based emulators. I've looked a bit at the interfaces for Mini vMac, PCE-Mac, Cyclone and Musashi; they're all somewhat different and incompatible with each other. Another obvious interface would be MAME (as you said later, and correctly dismissed).
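#define FRAME_MS 16.7   /* 1/60th of a second = one video frame */

void Emulator(void)
{
    Init();
    while (!Quit) {
        double start = millis();    /* frame start time in ms */
        CpuExe(FRAME_MS);           /* run one frame's worth of 68K CPU time */
        VideoUpdate();
        IOUpdate();
        while (millis() < start + FRAME_MS)
            ;                       /* idle until the frame period has elapsed */
    }
}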
I wonder what other platforms it might also work with? I guess mame in its full form is out of scope, but I wonder if a pico could emulate 68k based arcade systems? Or pico-genesis? Just thinking out loud about how broadly MØBius might be used
You're right, MAME would be too hefty. However, MØBius is a good candidate for Sega Megadrive emulation (what the Genesis is called in the rest of the world).
OK, next update: EXT.W Dn, EXT.L Dn and SWAP Dn are implemented. These were all simple to do. EXT.W/L need the Cortex M0+ flags to emulate N and Z, with V=0 and C=0. This takes 3 Cortex M0+ instructions, which illustrates the power of writing an emulator in assembly.
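For comparison, here is what those three instructions do, written as a plain C model (the struct and function names are made up, and this is not how MØBius does it, since the assembly version gets N and Z more or less for free from the host flags). N and Z follow the result, V and C are cleared, and X is left alone so it isn't modelled.

```c
#include <stdint.h>

typedef struct { int n, z, v, c; } Ccr;

/* Common flag update: N and Z from the result, V and C always cleared. */
static void set_nz(Ccr *ccr, int negative, int zero)
{
    ccr->n = negative;
    ccr->z = zero;
    ccr->v = 0;
    ccr->c = 0;
}

/* EXT.W Dn: sign-extend the low byte into the low word; upper word unchanged. */
static uint32_t ext_w(uint32_t dn, Ccr *ccr)
{
    uint32_t w = dn & 0xFFu;
    if (w & 0x80u)
        w |= 0xFF00u;
    set_nz(ccr, (w & 0x8000u) != 0, w == 0);
    return (dn & 0xFFFF0000u) | w;
}

/* EXT.L Dn: sign-extend the low word into the whole register. */
static uint32_t ext_l(uint32_t dn, Ccr *ccr)
{
    uint32_t r = dn & 0xFFFFu;
    if (r & 0x8000u)
        r |= 0xFFFF0000u;
    set_nz(ccr, (r & 0x80000000u) != 0, r == 0);
    return r;
}

/* SWAP Dn: exchange the upper and lower words; flags from the 32-bit result. */
static uint32_t swap_dn(uint32_t dn, Ccr *ccr)
{
    uint32_t r = (dn << 16) | (dn >> 16);
    set_nz(ccr, (r & 0x80000000u) != 0, r == 0);
    return r;
}
```

A compiled version of this costs noticeably more than three instructions per operation once the flag bookkeeping is included, which is the point being made above.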