
The SUPER m68k is shipping!

Gorgonops

Moderator
Staff member
Why any EC model? How do they implement the MMU?
So far as I'm aware, the shipping version of the Apollo core doesn't have a full MMU. AmigaOS doesn't need one; my vague understanding is that Amiga software is structured as relocatable blobs of code with runtime linking, and it neither strictly cares about memory being contiguous nor needs a virtual address space for each process à la Unix. That does of course mean the Vampire won't run NetBSD, and it would also make it complicated to run MacOS with more than 24-bit addressing on it.

(The one approach I could see, assuming the FPGA could be set up to rearrange the memory map statically, would be to do some serious ROM patching à la BasiliskII to remap all the address space on a Classic's or Portable's motherboard up into the NuBus slot space and use the 128MB of RAM on the card, mapped from zero, for all of user RAM. It would require EXTENSIVE patching of a 32-bit clean ROM to accomplish, but it might not be impossible. You might even be able to set up the card's onboard video support to emulate a NuBus framebuffer.)

 

johnklos

Well-known member
Honestly, the patches and code from Shapeshifter / Basilisk could make patching a Mac's ROM really easy, since most of the work is already done. Heck, the A-MAX IV product patched Color QuickDraw, 32 bit memory and m68020 and greater support on to 128K ROMs from a Mac Plus!

 

Gorgonops

Moderator
Staff member
Honestly, the patches and code from Shapeshifter / Basilisk could make patching a Mac's ROM really easy, since most of the work is already done.
*snicker* Yeah, I just didn't want to sound too handwave-y.

Anyway, it does seem like it should be doable in theory, since the only thing a 68k Mac normally does with its MMU is statically juggle some chunks of memory around to switch between 24- and 32-bit address maps and, in some cases, to collate discontiguous blocks of SIMM socket RAM into a linear chunk at #0000 0000. (And of course the original Mac II and LC didn't even have a full MMU to do it with, just a gimped substitute.) Presumably the FPGA could handle that much in the way of address translation and, as noted, the MacOS ROM is so device-independent it should be totally possible to come up with some sort of synthetic "VampMac" memory map. The one downside is that I can imagine compatibility might be something of an issue since, just like the aforementioned emulators, the resulting machine wouldn't bear a particularly close resemblance to any real Mac. (One of the frustrating things about BasiliskII sometimes is watching it crash and burn at amazing speeds.)
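The static remapping described above can be sketched as a fixed lookup table. This is only a minimal Python illustration, not how the Apollo core or any real Mac MMU is programmed; the bank addresses and sizes are invented values, chosen just to show discontiguous physical banks being presented as one linear block starting at address 0.

```python
# Hypothetical physical RAM banks: (physical_base, size_in_bytes).
# These addresses are invented for illustration only.
PHYSICAL_BANKS = [
    (0x0000_0000, 4 * 1024 * 1024),   # bank A: 4 MB
    (0x0400_0000, 16 * 1024 * 1024),  # bank B: 16 MB, not adjacent to bank A
]


def build_linear_map(banks):
    """Return (logical_base, physical_base, size) entries that stack the
    banks back to back, starting at logical address 0."""
    table = []
    logical = 0
    for physical_base, size in banks:
        table.append((logical, physical_base, size))
        logical += size
    return table


def translate(table, logical_addr):
    """Translate a logical address to a physical one using the static table."""
    for logical_base, physical_base, size in table:
        if logical_base <= logical_addr < logical_base + size:
            return physical_base + (logical_addr - logical_base)
    raise ValueError(f"unmapped logical address {logical_addr:#010x}")


TABLE = build_linear_map(PHYSICAL_BANKS)
```

Because the table never changes after setup, this is far less than a full paged MMU has to do, which is the point being made about what the FPGA would need to handle.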

Still, it's arguably almost worth doing out of sheer ridiculousness alone. Imagine having that stuffed in a Mac Classic; you could have a CPU faster than most of the early-middle era PowerPCs for most things, an HDMI video port, and over a hundred MB of user RAM, but at the same time you'd be stuck using the Classic's wimpy little 8 bit SCSI controller for mass storage and its 512x342 1-bit display if you didn't want to use an external monitor. I... guess that's only slightly more ridiculous than using an Amiga 600 as a base, but still. ;)

 

olePigeon

Well-known member
If it has an HDMI port, you could connect an iPad LCD to simulate a color CRT.  The DPI on it is dense enough you wouldn't notice any interpolation at whatever resolution you run it at.

I just don't understand how the HDMI port would work.  Does it show up as a video card somehow?

 

Gorgonops

Moderator
Staff member
I just don't understand how the HDMI port would work.  Does it show up as a video card somehow?
Yes. There's some vaporware-ish sounding talk on the website about it maybe gaining the ability to run AGA chipset graphics in the future, but at the moment it appears to implement a "Picasso 96" framebuffer. (That being a driver standard for using VGA chipset-equipped expansion cards in an Amiga for programs able to use the "ReTargetable Graphics" support in later versions of the Amiga OS.) Presumably normal ECS software still goes out through the chipset hardware in an Amiga 600 equipped with it.

Were you to patch this thing up to fit in a Mac Classic (or Portable), presumably you'd set up the HDMI port to present a block of RAM as a framebuffer similar to a non-accelerated NuBus card or the built-in video of a Quadra. It would actually be substantially easier to run that than to try to patch up a mapping and driver for a Mac's motherboard video. (The Portable would probably be easier than the Classic in that regard, because the Portable's video is provided by dedicated RAM and already looks sort of like a NuBus card, while the Classic shares onboard memory like a 128k/Plus/whatever.)

Of course, if you're talking about ripping out the stock screen and replacing it with an iPad panel or something I start questioning why you'd bother with this accelerator. Might as well just stuff a Mac Mini or NUC or whatever in there and call it a day, BasiliskII or SheepShaver on that will still outrun it.

 

olePigeon

Well-known member
Why hotrod a vintage car when you can buy a modern sports car? There's something to be said for souping up an original rather than getting something new, even if the original is only barely an original by the time you're done. I like upgrading the crap out of my vintage machines, sticking as much as possible inside them without going modern. It's challenging and fun. :)

Probably the same reason why I like original arcade machines rather than simply having a MAME cabinet. It's just not the same.

 

johnklos

Well-known member
There's nothing vaporware-ish about running AGA in an FPGA. It already exists. Also, it's pretty dismissive to talk about something as vaporware-ish from a group of people who are shipping real products. Finally, an Intel NUC is not going to be as fast as or faster than a fast m68060 or the Vampire. That's possible when using JIT emulation, but straight emulation, which is necessary for proper compatibility, is not that fast yet.

 

Gorgonops

Moderator
Staff member
Why hotrod a vintage car when you can buy a modern sports car? There's something to be said for souping up an original rather than getting something new, even if the original is only barely an original by the time you're done. I like upgrading the crap out of my vintage machines, sticking as much as possible inside them without going modern. It's challenging and fun. :)
My point about cramming an LCD into a Classic case was very specific: if you did that you wouldn't be able to use the original motherboard's native video anymore. (Short of designing a video converter circuit that could take said output and upscale it.) *Aaaand* you'd have to seriously bust up the case to do it. There were plenty of external video solutions for toaster Macs to let them run with a second monitor; it just sort of seems to me that would be the better way to use something like a Vampire if you *were* to get it running in a Mac.

There's nothing vaporware-ish about running AGA in an FPGA. It already exists. Also, it's pretty dismissive to talk about something as vaporware-ish from a group of people who are shipping real products.
I didn't mean offense by the language choice there; I was just saying it's not there yet. And, yes, AGA exists in those full FPGA re-creations. I guess I'm just curious how they're going to add AGA to a machine that already has ECS without having to essentially dumb the host machine down to an input device and port the "whole enchilada" into the FPGA. But those guys know way more about the Amiga than I do, so presumably they'll be able to pull it off in some form or they wouldn't be announcing it.

Finally, an Intel NUC is not going to be as fast as or faster than a fast m68060 or the Vampire. That's possible when using JIT emulation, but straight emulation, which is necessary for proper compatibility, is not that fast yet.
It seems to me that "compatibility" is a pretty slippery concept here. Are you talking about "cycle compatibility" with the original system? Because the Vampire isn't that, not even remotely, nor is it exactly opcode compatible with any real 680x0 CPU. Further, if we imagine this thing stuffed into a Macintosh in some form that lets it transcend the original hardware limitations the whole machine isn't going to particularly closely resemble *any* real machine that Apple tested any version of MacOS on so... let's just say I don't think this argument holds as much water as it might appear.

Also, well, this is totally unscientific because it's not the software in question, but according to the "Kronos" benchmark, ARAnyM without JIT runs the CPU tests almost exactly as fast as a CT60-100 on my 2012-vintage MacBook, and completely slaughters it at FPU. (I found a page with a fully configured EmuTOS dist with Kronos in a .zip file.) NUCs come in various speed grades; I don't know off the top of my head how the best one might compare to this laptop, but I'm guessing it's pretty close.

 

Gorgonops

Moderator
Staff member
benchmark, JIT, NoJIT, whatever
Because I'm annoyingly stubborn I dug up a Basilisk disk image and tried it with the 2014 build of BasiliskII that E-Maculation links to. According to Speedometer 4 the integer performance of the same laptop as above is 19.46 times as fast as their baseline, which is a 25 MHz 68040. Coincidentally that's the same clock speed as the Amiga 4000/40. I'm having a blazes of a time finding solid benchmark numbers for the Vampire but, well, one test on its page shows it achieving "31.2FPS!!!" versus 6.8 for an Amiga 4000/40. That's roughly 4.6x, not 19x. Is a 68060 *ever* really 19 times faster than a 68040@25MHz? Honestly curious.

For reference the same laptop scores 63.68 with JIT enabled.

 

johnklos

Well-known member
What I mean is that self-modifying code won't run properly on JIT. The last time I looked, a Core i7 would barely do about m68060 speeds for non-JIT, and was much slower with certain things and much faster with other things. Maybe I need to retest, but the last time I tested, a 3 GHz Core i7 did not clearly best an m68060 at mixed code.

Nobody wants or cares about cycle-accurate emulation of a CPU except for people who want to play some old, poorly written games. Most systems, real or fake, can be made to run as if they were running on a real m68000. When I talk about compatibility, I mean that pretty much all code runs as-is, excepting only code that attempts to do very CPU-specific things. Amigas have all sorts of software that loads m68040- or m68060-optimized routines, for instance, and those are both unnecessary and not the kinds of things that require compatibility.

The FPGA implementation is interesting because there's no good reason why the CPU would need to, for instance, trap instructions which are available on the m68020 and m68030 but not on the m68040 or m68060. Since the trap code would simply return to the same spot after running code which would perform the same function as the actual instruction, implementation of the instruction itself won't do anything unexpected unless someone intentionally didn't follow the instruction definition. This means there's no good reason why you couldn't have a superset of the m68k instruction set which would have instructions from all processor models.
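The trap-versus-native argument above can be illustrated with a toy interpreter. This is a hedged Python sketch only: the three-register machine, the opcode names, and the "mul" instruction are all hypothetical stand-ins, not real m68k encodings or anything taken from the Apollo core. It shows that a program sees the same result whether a missing instruction is handled by a trap routine or implemented directly.

```python
def movei(regs, reg, value):
    """Load an immediate value into a register."""
    regs[reg] = value


def mul_native(regs, a, b):
    """'Hardware' implementation of the hypothetical mul: one step."""
    regs["d0"] = regs[a] * regs[b]


def mul_software(regs, a, b):
    """Trap-handler implementation: same result via repeated addition."""
    total = 0
    for _ in range(regs[b]):
        total += regs[a]
    regs["d0"] = total


def run(program, native_ops, trap_handlers):
    """Execute (opcode, operands...) tuples. Opcodes missing from native_ops
    go to a trap handler, then execution resumes at the next instruction."""
    regs = {"d0": 0, "d1": 0, "d2": 0}
    traps_taken = 0
    for opcode, *operands in program:
        if opcode in native_ops:
            native_ops[opcode](regs, *operands)
        else:
            traps_taken += 1
            trap_handlers[opcode](regs, *operands)
    return regs, traps_taken


PROGRAM = [("movei", "d1", 6), ("movei", "d2", 7), ("mul", "d1", "d2")]
FULL_CPU = {"movei": movei, "mul": mul_native}   # superset core
REDUCED_CPU = {"movei": movei}                   # core that traps on mul
TRAP_HANDLERS = {"mul": mul_software}

regs_full, traps_full = run(PROGRAM, FULL_CPU, TRAP_HANDLERS)
regs_reduced, traps_reduced = run(PROGRAM, REDUCED_CPU, TRAP_HANDLERS)
```

Both runs end with identical register contents; only the number of traps taken differs, which is why a superset core should be invisible to well-behaved software.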

For FPU, the reason that the emulators are so fast is that they use the underlying processor's FPU to perform the math. This has caused issues in some real world cases in ARAnyM and had to be fixed, so I'd be interested to see where that is now. Emulating the FPU to give the same precision as the m68k FPUs wouldn't be as fast, but I bet it's still a lot faster than real m68k CPUs.
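To make the precision point concrete: the 68881/68882 work in 80-bit extended precision with a 64-bit significand, while an IEEE double carries 53 bits. The Python sketch below rounds 1/3 to each width using exact rational arithmetic. It is only an illustration of the size of the gap; it assumes nothing about how ARAnyM or any other emulator implements its FPU, and the helper names are made up for this example.

```python
from fractions import Fraction


def floor_log2(value: Fraction) -> int:
    """Exact floor(log2(value)) for a positive rational."""
    k = value.numerator.bit_length() - value.denominator.bit_length()
    return k if value >= Fraction(2) ** k else k - 1


def round_to_significand(value: Fraction, bits: int) -> Fraction:
    """Round a positive rational to the nearest number that fits in `bits`
    significant binary digits (ties go to even, as in IEEE arithmetic)."""
    if value <= 0:
        raise ValueError("only positive values are handled in this sketch")
    exponent = floor_log2(value) - (bits - 1)
    scale = Fraction(2) ** exponent
    mantissa = round(value / scale)  # round() on a Fraction is half-to-even
    return mantissa * scale


THIRD = Fraction(1, 3)
AS_DOUBLE = round_to_significand(THIRD, 53)    # host double precision
AS_EXTENDED = round_to_significand(THIRD, 64)  # m68k extended precision

ERROR_DOUBLE = abs(AS_DOUBLE - THIRD)
ERROR_EXTENDED = abs(AS_EXTENDED - THIRD)
```

The eleven extra significand bits make the extended-precision error 2048 times smaller for this value, which is the kind of difference software written against a real m68k FPU could notice.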

AGA emulation on an ECS machine is interesting, and can be done because the FPGA board controls all memory accesses. If on boot an AGA (or AGA superset) chipset were emulated, the FPGA CPU could easily be directed to access it instead of the motherboard from the very start.

I'm excited to see what these folks come up with, especially considering how powerful and affordable FPGAs are becoming.

 

johnklos

Well-known member
Gorgonops: BTW - I'd be very interested in trying out newer emulation software if it's really getting much faster, because currently I only have one 60 MHz m68060 and one 50 MHz m68040 to compile pkgsrc packages for m68k. What's the fastest m68k emulation software you've run that can also emulate the MMU?

Now that I think about it, I suppose my tests have all been with MMU emulation on, so I bet while I'm not seeing a Core i7 beat an m68060 and you're seeing it clearly beat one, the MMU emulation is why...

 

Gorgonops

Moderator
Staff member
Now that I think about it, I suppose my tests have all been with MMU emulation on, so I bet while I'm not seeing a Core i7 beat an m68060 and you're seeing it clearly beat one, the MMU emulation is why...
I actually tried to see what ARAnyM would do with MMU enabled, but the OS X port of it seems to be sort of broken. I was really curious how it would compare with the no-JIT-no-MMU score, but in the time I had to fiddle with it I couldn't get it to launch. Might try it on a Linux box later, although the fastest thing I have at hand for that is *really old*. (2008 vintage Core Quad.)

... actually, wait, I just got it to launch; turns out the GUI is broken but it will go if you launch the binary inside the .app from the command line manually. And what's actually strange is the MMU binary runs faster than the no-MMU one. It's claiming a CPU score of "70.2" vs 49.5 for the CT60-100 (And about 50 for the non-MMU binary) and 23.8 for a Hades 060. I have no explanation as to why the MMU binary is faster but at least here it seems to believe it is. (If I could find a disk image with NetBSD already installed on it I'd be tempted to try to load that just to make sure it's working as advertised.)

It's been a long time since I ran any 68k emulation and cared about performance, so it might be interesting to see where the limits really are these days. Basilisk even without JIT has been Quadra-ish speed since about the Pentium III era, so being "fast enough" was a bar passed quite a long time ago. (First machine I ran it on way back in 1999-ish was a Cyrix 166+, which scored about as fast as an LC II. JIT put that up in the Quadra ballpark but was buggy as heck when it debuted.) I guess my one observation for today is that ARAnyM-no-JIT seems to be slower than Basilisk-no-JIT, assuming the benchmarks I'm using on either scale linearly (i.e., twice the number means twice as fast). Kronos says a "Milan 040", which was apparently a 25 MHz 68040, is an 8.1; even if I take the better score of 70.2, that makes this laptop only about 8.7 times faster than a 25 MHz 040 baseline vs. the claimed 19x for Basilisk. In fairness, ARAnyM *is* also emulating an MMU *and* is at least partially emulating an Atari ST's custom chips, so the higher load may be understandable.
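The comparison above boils down to a few divisions, shown here in Python so the numbers can be checked. Every figure comes from the posts in this thread; the only assumption is the one already stated, that both benchmarks scale linearly. The dictionary keys and function name are just labels for this sketch.

```python
# Kronos CPU scores quoted in this thread.
KRONOS_CPU_SCORES = {
    "Milan 040 (25 MHz 68040)": 8.1,
    "Hades 060": 23.8,
    "CT60-100": 49.5,
    "ARAnyM, MMU build, no JIT": 70.2,
}

# Speedometer 4 integer results for BasiliskII on the same laptop,
# already expressed relative to a 25 MHz 68040 baseline.
SPEEDOMETER_NO_JIT = 19.46
SPEEDOMETER_JIT = 63.68

# Frame rates quoted from the Vampire page.
VAMPIRE_FPS = 31.2
A4000_040_FPS = 6.8


def relative_to(scores, baseline):
    """Express every score as a multiple of the baseline entry."""
    base = scores[baseline]
    return {name: score / base for name, score in scores.items()}


KRONOS_RATIOS = relative_to(KRONOS_CPU_SCORES, "Milan 040 (25 MHz 68040)")
VAMPIRE_RATIO = VAMPIRE_FPS / A4000_040_FPS

for name, ratio in KRONOS_RATIOS.items():
    print(f"{name}: {ratio:.2f}x a 25 MHz 68040 (Kronos)")
print(f"BasiliskII, no JIT: {SPEEDOMETER_NO_JIT:.2f}x (Speedometer 4)")
print(f"Vampire vs. Amiga 4000/40: {VAMPIRE_RATIO:.2f}x (frame rate)")
```

The three ratios come from three different benchmarks, so they are only loosely comparable, but the spread between roughly 8.7x and 19x for the same laptop is clear.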

In any case, *IF* ARAnyM is telling the truth, it looks like MMU-enabled emulation faster than any 68060 (short of something cryogenically cooled, perhaps) is indeed possible on an Ivy Bridge i7 machine. (I stand corrected, it's an "Early 2013 Retina", not 2012.)

What I mean is that self-modifying code won't run properly on JIT.
Isn't self-modifying code pretty much verboten on a 68040 or above anyway? (I.e., wasn't that the single biggest compatibility problem introduced by the Quadra?) On one hand, that certainly is the theoretical advantage of creating your own syntho-CPU in an FPGA: you could design it to allow *anything* written for a 68000 to work properly. On the other, well, if you want to do that and *seriously* improve the performance compared to just an "over-clocked" 68000, it would seem that such a design decision would make it particularly complex to implement modern technologies such as pipelining and superscalar execution, both things the Apollo core claims to have. I'm curious how compatible it actually is with the sort of code that would blow up a JIT emulator *or* a 68040/60.
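Why self-modifying code trips up both instruction caches and JIT translators can be shown with a toy machine. The Python sketch below is a deliberately simplified model with made-up opcodes, not a model of any real 68k cache or of how BasiliskII or ARAnyM translate code: the program patches its own first instruction and then jumps back to it, and a cache that is never invalidated keeps executing the stale copy.

```python
# Toy program: address -> instruction tuple. All opcodes are hypothetical.
PROGRAM = {
    0: ("add", 1),                   # gets overwritten by the patch below
    1: ("patch", 0, ("add", 100)),   # self-modification: rewrite address 0
    2: ("jump_once", 0),             # go back and run address 0 a second time
    3: ("halt",),
}


def run(program, cache_enabled, flush_on_patch=False):
    """Run the toy program and return the accumulator at halt."""
    memory = dict(program)
    icache = {}          # stands in for an instruction cache or a JIT cache
    acc = 0
    pc = 0
    jumped = False
    while True:
        if cache_enabled:
            if pc not in icache:
                icache[pc] = memory[pc]   # fill on first fetch
            instr = icache[pc]
        else:
            instr = memory[pc]            # always fetch from memory
        op = instr[0]
        if op == "add":
            acc += instr[1]
            pc += 1
        elif op == "patch":
            memory[instr[1]] = instr[2]
            if flush_on_patch:
                icache.clear()            # explicit invalidation
            pc += 1
        elif op == "jump_once":
            if not jumped:
                jumped = True
                pc = instr[1]
            else:
                pc += 1
        elif op == "halt":
            return acc


UNCACHED = run(PROGRAM, cache_enabled=False)
STALE = run(PROGRAM, cache_enabled=True)
FLUSHED = run(PROGRAM, cache_enabled=True, flush_on_patch=True)
```

Without a cache the patched instruction takes effect on the second pass; with a cache and no flush the old instruction runs again and the result differs. Code that flushes after patching behaves correctly either way, which is the discipline later CPUs and JIT emulators both depend on.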

 

Gorgonops

Moderator
Staff member
Isn't self-modifying code pretty much verboten on a 68040 or above anyway?
Actually, I guess this page claims they break on anything above a 68020, but your mileage may vary.

One "complaint" I saw about the Vampire was some grumbling that it was unusually difficult to get ancient games running on it reliably via WHDLoad. I don't have enough information to pin that on any fault in its 68k core regarding self-modifying code, since there are a lot of factors at work there, but... still, generally speaking, "maximum performance" and "perfect backwards compatibility" usually don't occupy the same spot on a map.

 

Gorgonops

Moderator
Staff member
I know it's bad form to keep replying to myself, but...
 

And what's actually strange is the MMU binary runs faster than the no-MMU one. It's claiming a CPU score of "70.2" vs 49.5 for the CT60-100 (And about 50 for the non-MMU binary) and 23.8 for a Hades 060. I have no explanation as to why the MMU binary is faster but at least here it seems to believe it is. (If I could find a disk image with NetBSD already installed on it I'd be tempted to try to load that just to make sure it's working as advertised.)
Just to note, I did stumble across a pre-built Debian Sarge disk image and successfully fired that up, so apparently that ST benchmark is telling the truth that the MMU-capable binary runs faster than the plain no-JIT.

Of course, I have no idea how to meaningfully benchmark its performance running a UNIX-oid operating system short of building a kernel or something, and I don't have a Unix-running m68k machine to compare it to, so the resulting wall clock time wouldn't mean a whole lot. Maybe there's some category of operations related to UNIX task switching that it does abnormally slowly and thus fails to beat a 68060 at some real-world task, I dunno.

 

johnklos

Well-known member
Ha ha ha... Yes, don't stop replying to yourself. It's educational!

I'll have to make some time to play with ARAnyM on an AMD Bulldozer (not the fastest available these days, but I have eight cores which are idle 98% of the time available 24/7/365.242). I'll even make a NetBSD image available along the lines of the images made for ARM machines like the Raspberry Pi. You've given me hope, Gorgonops!

 

Gorgonops

Moderator
Staff member
You've given me hope, Gorgonops!
Yay!

I'm really curious to see if you can make it work now. I did poke around to see how arduous it might be to make my own NetBSD installation, and I stumbled across this thread on the NetBSD mailing list. Unless they've changed the emulator to support the missing interrupt, it may be a nonstarter unless you're able to patch together a special ARAnyM-specific kernel. :(

(From what I can gather, the reason ARAnyM doesn't support it is that the interrupt was mostly dedicated to sound generation on earlier STE models and isn't used/needed on the Falcon... but it is apparently physically available in the hardware, so that's pretty annoying regardless.)

If the CPU core is really as fast as it seems to be I wonder how much work it would be to port it to something like TME and tinker together a really fast Sun 3-x emulator?

 

Bunsen

Admin-Witchfinder-General
I noticed one Vampire user on one of the forums (I forget which, Apollo maybe?) was successful in running a Mac emulator on AmigaOS on the Vampire.  Perhaps that is where the useful code to get this up and running on a hardware Mac might be found.

 

Gorgonops

Moderator
Staff member
I'd wager they were using either Shapeshifter or the Amiga version of BasiliskII so, yeah, extensive ROM patching would be the trick in both cases.

I remember browsing through the docs in the BasiliskII source years ago and being absolutely fascinated with how it pulls off running MacOS on Amigas lacking an MMU. It must be truly spectacular when it crashes.

 