
NeXTStation Clock Doubler!

Ah, I just puzzled out what they're doing in the vintage designs with 32 bits of buffering and 32 bits of transceivers total. They aren't using the multiplexed bus mode on the CPU on the card at all! Instead, they're faking the multiplexed mode: since the address is always driven, they just enable the address buffers at the relevant times and switch the data buffers on otherwise.

My approach is more elegant as I only have one set of transceivers and no need to route 32 additional lines.

1746122040332.jpeg 1746122056406.jpeg

Completed a second test card.... running into some DOA CPUs, unfortunately, which has complicated things a bit.
 
This is great! I have my original NeXTCube I need to dig out. Apart from possible physical fitment issues, I would think that this should work with the cube motherboard. At least with the cube, I can put the motherboard in any slot to make room (just have to get different drive cables to reach).

I'm happy to work on testing it in the Cube (25MHz 040 Cube motherboard).
 
There was a 50MHz accelerator advertised for the cube: Orb Pyro. I don't know if it ever shipped, but it states:

When available, the Pyro board will plug into the NeXT motherboard in place of the 68040 chip. It will be available in different form factors for the NeXT Cube, Mono Station, and Color Station. For those whose ‘040 CPU is socketed, a tool will be provided to remove the chip, and replace it with the Pyro board. For those whose CPU is soldered, a motherboard exchange program will be available. No additional software is required, and original memory can be preserved.

Other than the link above, I haven't found any technical details or whether it ever shipped. Anyone know?
 
They exist, and are extremely uncommon. It's a similar doubler type to mine, and should have identical performance unless they screwed something up in their implementation. They took a more complicated approach requiring two sets of buffers, but the end result ought to be identical.

For testing I can use some 68040 PGA sockets to space the accelerator out until it clears in at least the original cube boards. It's not the end of the world to design an alternate version to work in early cubes, though I'd want to know if Turbo Cubes use the same CPU orientation as that'd be a major reason to have that specific version. All types of slabs should be able to use one PCB design.

Did some quick benchmarking... it's definitely a large upgrade over the 25MHz; over the 33MHz it's still a nice bump but less compelling.

1746234920238.png
 
@zigzagjoe are you planning on producing a batch and selling them? Or just open-sourcing the design and allowing random folks to build them?
I'm interested in two of these...
Then again, re-reading... looks like these would work best in non-Turbo systems? Or down-clock a Turbo to 25MHz to run at 50MHz? CPU speed would certainly improve/double, but does RAM/disk throughput then suffer with the 33-->25 drop?

Just need one with a built-in 060 adapter and new ROM / OS patches, and then the old slab would really fly ;)
Speaking of ROM, anybody know if the normal TL866-ii+ will program a v71 ROM to v74?
 
These would be something I'd build, test, and sell.

You're correct that Turbo systems need to be downclocked to a 25MHz bus first. This only requires installing a jumper header. Conceptually, this would reduce IO and RAM performance if all else is held equal and we're only looking at raw throughput. However, this is offset by faster CPU performance (and the additional wait states at 33MHz), so the effective performance in the worst-case scenarios ends up the same as before. More often, it's slightly better. You can see above that the IO performance is effectively a wash (disk, video) or a slight gain (network).

1746405212198.png

Per this simple memory benchmark, the practical throughput to DRAM via memcpy appears to be improved, due to efficiencies elsewhere. For block sizes below 4K, the speed increase is linear since that's entirely within the CPU caches. Above that is where DRAM performance comes into play. I'll have to write a simple assembly memory benchmark at some point to get closer to the practical maximum, though, as who knows how effective the C memcpy implementation is. Still, it's a very good sign for practical use.
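For anyone wanting to reproduce this kind of measurement, here is a minimal sketch of a memcpy throughput test in C. It is not the benchmark used for the chart above; the function name `memcpy_mb_per_s` and the 64-copy batch size are made up for illustration, and it assumes a C99 compiler. Running it at block sizes below and above 4K should show the cache/DRAM split described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Copy a block of block_bytes over and over for at least min_seconds of
 * CPU time and return the observed throughput in MB/s (1 MB = 1e6 bytes).
 * Returns -1.0 if the buffers cannot be allocated.
 * Hypothetical helper, not the tool used in the post. */
double memcpy_mb_per_s(size_t block_bytes, double min_seconds)
{
    assert(block_bytes > 0 && min_seconds > 0.0);

    unsigned char *src = malloc(block_bytes);
    unsigned char *dst = malloc(block_bytes);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1.0;
    }
    memset(src, 0xA5, block_bytes);

    volatile unsigned char sink = 0;   /* keeps the copies observable */
    double copied = 0.0;
    double elapsed = 0.0;
    clock_t start = clock();

    do {
        /* Batch the copies so clock() overhead stays small. */
        for (int i = 0; i < 64; i++) {
            src[0] = (unsigned char)i;             /* vary the source a little */
            memcpy(dst, src, block_bytes);
            sink = (unsigned char)(sink ^ dst[block_bytes - 1]);
            copied += (double)block_bytes;
        }
        elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;
    } while (elapsed < min_seconds);

    free(src);
    free(dst);
    return copied / elapsed / 1e6;
}
```

Calling it with, say, 1K, 4K, 64K and 1M blocks gives one number per block size. The absolute values depend heavily on the libc memcpy, which is exactly the caveat raised above.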

Here are some additional compile benchmarks I was asked to run.

1746405267296.png

An 060 would need a different design as it doesn't support the multiplexed bus mode used in NeXTs. That said, there's nothing preventing faking the multiplexed bus. That's what the vintage accelerators do, but I skipped that as it's more complicated in some ways. But, yeah, you'd need a ROM that at least doesn't throw up when an 060 is installed, and you'd also need to install the appropriate FPSP and ISP from Motorola... not a trivial endeavour. I may look at putting a cache on a doubler someday, but given I've been skiving off on my 060 in Q650 project, I don't see getting around to it in NeXT anytime soon :)
 
Awesome, looking forward to it. And thank you for all the details...

Would they be fully populated with 040s? Or supply my own?

I thought there was a 3.3V 040 variant which would run cooler and maybe overclock better?
 
The V-suffix 68040 does not have a FPU, so it is no good for NeXT.

While my supplies last I'm intending to build these with QFP 68040s. These will mostly be the late L88M/K63H chips, and the boards would be fully assembled and tested at that speed. It looks like my first rev boards will be workable in slabs with a couple of bodges, so I'd likely use these PCBs for an initial run. Cubes I expect to require a new PCB design that can't work in a slab.

I can't get many of these CPUs so depending on how demand goes in the future I'd probably have to look at a PGA variant. That would probably be a bring your own CPU scenario for a variety of reasons. Pretty much any 040 aside from the early chips is able to operate at 50mhz as long as a heatsink and thermal paste is used with the CPU.

1746406157264.jpeg
 
I pulled my Cube logic board today, and the CPU is socketed. I can do some measurements for you to check the fitment. Just send me the dimensions of the board. I could 3D print a mockup too.
 
Hello ZigZagJoe: I am really impressed, and I hope I am a guy you will enjoy working with, sir. Kudos!

I've been carrying the NeXT torch as an incredible career: rescuing (heck, I've jumped in front of a bulldozer in a landfill at the School of Mines, true story, to save some :) ), repairing, trading, reselling, maintaining, buying and selling NeXT hardware, software and memorabilia for the past 32 years, lol. Amazing, like the Maytag repairman. The NeXT community and I have wanted this upgrade to manifest for so long. This is epic, thank you :) I suffer from an ailment called aging, and with my health conditions, working from home using a walker works out well.

The Turbo Color 50 MHz hack: I am the current owner, by the way. It was a gallant effort in 2003. I also own a NeXT 40 MHz Nitro and a few hundred NeXT workstations and Cubes. I will happily donate some NeXT motherboards to your cause if you need them, obviously 68040 25 MHz and 68030 25 MHz Cube boards. I also have Larry Ellison's Turbo Color and NeXT Cubes number 4 and 5 :)
My youtube.com//robblessin channel shows hundreds of my NeXT and other projects, some of them with an eclectic sense of humor.

Yes, apologies about the confusing registration for the nextcomputers.org forum. It isn't a rite of passage; it is because we were being attacked by bots that ran a denial of service attack(?). Whatever it was, we removed the registration form page for sanity: it was overwhelming the registration process with 80,000 fake registrations in 1 day. Then, once the damn thing beat the registration form, it got worse and started machine-gun blasting the threads with spam, putting Nitro, our sysadmin, into a hellscape nightmare of unwinding a spaghetti monster. A backup solved it. It seems like we may be able to set up an enquiry button (just thought of it); if you all have any ideas or expertise, we welcome them.

It was equivalent to 10,000 drones flying into your house all at once and crapping all over the place. The damn thing even booted me at one point; a few weeks of it was enough. Why our site, as we are friendly? My name is Rob, also computerpowwow on eBay and blackholeinc.com; overhauling the site as well.

Cloudflare handles it now. We would love to have you. If you would like to join (that goes for everyone here), you can email me direct at bhi1@ix.netcom.com with your preferred user name and email, and I'll make sure Nitro gets the info. My handle is rob blessin black hole.
I'm also battling spam, deleting hundreds of messages a day, and yes, I have a spam filter. I love my job.
I'm also recognizing some members in the threads here, very cool. I had signed up a while back. So many things in my orbit; this is incredible news and

ZigZag, we may have done business before.
PS: I'm also one of the coauthors of Inside NeXT, and as of a few days ago we are working on the 2025 edition with Luciano, the author. We have a few pages to fill, and if you like, I think this is worthy of an addition. I appreciate your time and look forward to hearing from you. Best regards, Rob Blessin
 

Good to hear from you Rob.

Nitro got me registered. There's now a parallel thread to this one over there. I sent you an email a few weeks ago, search for zigzagjoe and you ought to find it :)

I still haven't quite gotten the handling of the bus arbitration signal to a point where I'd be happy to call it done and go 100% on hardware testing mode, but despite that it works very well. I'm going to do some testing with some NeXT gear that a local guy has this Friday so should hopefully know if my accelerator works in the non-turbo hardware. I definitely could use some pictures of a Turbo Cube board so as to understand the CPU rotation. @powellb, if this is what you have then I would appreciate if you can take some pics!

I'm less worried about the non-turbo 040 Cubes.... I know those will need a new PCB design. It's not a huge priority as a revised PCB design is "easy" by comparison to the logic, and it can be functionally tested without needing a new board. The NeXTStation Color (non-turbo) may also need a tweak due to the location of the DSP RAM slot, or I may just preclude the use of that slot. Not like there's much to do with it.

I haven't looked at an 030 cube. It would probably be relatively simple to make an 030 doubler design similar to my socketed Boosters or possibly even a cached 030 design like the DiimoCache, but as I don't have a cube I don't expect to work on that. I *am* looking for the power board to a Sony N4006 color monitor, though, but that doesn't have anything to do with my as of yet unnamed accelerator.
 
Sorry, I have a non-turbo (25 MHz) Cube. Wikipedia has a nice-resolution photo you can zoom into for the 25MHz Cube. I can take a finer one if you prefer. I took a number of measurements for you. I have a lower-res photo attached below, with dimensions, to give you a sense of fitment.

The 68040 socket (A) raises the CPU 7mm off the board. The custom chip (B) is a total of 5mm high, so anything on the 68040 socket will clear chip (B). Chip (C) is also socketed (socket at 7mm), so it is flush with the CPU and presents an impediment. It is a total of 10mm high. Likewise, the grounded bar that divides the motherboard (RF? Heat sink?) is also 10mm high.

Horizontally, the 68040 has a lot of room towards the RAM. A PCB will have no impediment from the other chips, as it is 3mm above everything. The distance from the 68040 to the RAM (line 1) is 125mm. The distance to Chip B (line 2) is 22mm. But a PCB installed in the CPU socket is 2mm above the other chips (except for Chip C and the grounded bar).

Near the socket, the edge of the 68040 is 13mm (distance lines 3) away from both chip (C) and the grounded bar. Finally, from the 68040 towards the edge of the board, the PCB can't extend beyond 3mm, or the board cannot fit into its slot. Therefore, a PCB would want to sit right on the 68040 socket and extend towards the RAM.




cube.png
 
Thanks, I appreciate it. I'll need to find the turbo cube CPU rotation as I'd prefer to not have more than two board variants.
 
Seems like a lot of the cubes have soldered CPUs. I assume it's something to do with clearance to the PSU. The vintage Pyro accelerator certainly required you to change the slot the board went in; I don't see any way around that.

I did get to test it on a non-turbo NeXTStation today.... and it worked! I will need to do some testing to make sure it's stable and all functions work, however. So far they've been working great on my Turbo Color station; I've run a 3 day povray render, days of constant verified disk activity, and just generally given it hell without a hiccup.

Extremely early testing numbers:

3LOyVHA.png


Interesting insights: the non-turbo machine actually seems to have *better* memory bandwidth than the Turbo machines. I am assuming the main benefit of the interleaved (and faster) DRAM in the Turbo architecture is quicker random access rather than necessarily more bandwidth. Also interesting is that the expected memory performance loss due to the doubler is seen here, where it is not present on the Turbo. I'll have to take a look at actual timings, though; it might be a delay due to bus arbitration. I really don't like how NeXT handled DMA.

These numbers shouldn't be compared against my Turbo numbers as I haven't yet controlled for the SD card/disk image differences. I don't *think* that's a problem, but I want to eliminate it to be sure. Disk access does seem slower than I'd expect given the SCSI controller is identical between turbo and non-turbo machines.

Also amusing is that this will generate a Timer test failure with code C3 at boot. The timer test sets a millisecond timer and has the CPU wait to verify that it receives an interrupt in the expected time. However, it does not control for execution speed in the busy loop, so an accelerated CPU will always cause this error since it is faster. Apparently it affected the Pyro too.
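To make the failure mode concrete, here is a toy model in C of one plausible reading of that test: a busy loop with a fixed iteration count acts as the timeout, so its wall-clock length shrinks as the CPU clock goes up. Everything numeric here (loop count, cycles per pass, the helper names) is hypothetical and not taken from the NeXT ROM.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical numbers, chosen only to illustrate the failure mode. */
#define WAIT_LOOP_ITERATIONS 9375ULL     /* assumed fixed loop count */
#define CYCLES_PER_ITERATION 4ULL        /* assumed CPU cycles per loop pass */
#define TIMER_PERIOD_NS      1000000ULL  /* the 1 ms hardware timer */

/* How long the fixed-count busy loop lasts at a given CPU core clock. */
unsigned long long wait_loop_ns(unsigned cpu_mhz)
{
    assert(cpu_mhz > 0);
    return WAIT_LOOP_ITERATIONS * CYCLES_PER_ITERATION * 1000ULL / cpu_mhz;
}

/* The modelled test passes only if the timer interrupt can arrive
 * before the busy loop runs out. */
bool timer_test_passes(unsigned cpu_mhz)
{
    return TIMER_PERIOD_NS <= wait_loop_ns(cpu_mhz);
}
```

With these made-up constants the loop lasts 1.5 ms at 25 MHz and 0.75 ms at 50 MHz, so the model passes at 25 and 33 MHz and fails at 50 MHz even though the timer itself is fine.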
 
I've been fiddling with the possibility of a cached doubler, which could apply to both the Quadra 700/900 case and potential NeXT applications. Of course, it will require a much more complicated design (it adds 4 data SRAMs, 2 tag SRAMs, 9 buffers/registers, and a larger CPLD).

My rather bodged up development platform is my poor Q605 modified for external clock input, a doubler development card, and a cache development card. Electrically, this isn't ideal so it can only hit about 48mhz before requiring wait states on the SRAM. The final design would use buffers to isolate the high speed bus from the system bus.

1748837527890.jpeg

Performance appears worthwhile. Here's the usual set of system info benchmarks.

50ws is the clock doubler with cache (estimated 50mhz performance)
50 doubler is my optimized clock doubler without cache
50 formac is the formac doubler design bolle's been making
40 OC is the bus overclocked to 40mhz, and cache is the same but with 128K cache (similar to bolle's design)

1748836395525.png

And here's cache performance as measured by @David Cook's tool. This is showing the difference between 6 cycle and 9 cycle cache bursts. Lots of bandwidth.
1748836570958.png

The elephant in the room is cache coherency. Most Mac cards aren't bus mastering, nor is anything on the logic board, so it's not a big deal. However, the more desirable SCSI cards and a few other outliers are bus mastering, and appropriate care must be taken to maintain identical data in cache vs main RAM.

A few approaches are possible.
  1. If the external master never writes into a region of memory that was cachable, then there's no problem, because that data will never enter the external or internal caches (no work required at all).
  2. Ideally, the cache needs to snoop data written by an external master to maintain coherency. Complicated logic and sequencing. This approach can impact performance on accelerator designs.
  3. At bare minimum, the cache should invalidate the cache line so mismatched data isn't read. May allow higher performance under certain conditions.
The DiimoCache design Bolle used takes the snooping approach: all accesses by an external master will be captured and updated in cache (if the line is in cache in the first place). This would be possible to implement, if somewhat annoying, but I don't have a bus mastering card at the moment to do this with. Nor a huge amount of interest, truth be told, but we'll see.

NeXT uses DMA extensively, however, so the question is if all DMA accesses go to appropriately marked regions of RAM or if new DMA buffers can be allocated on the fly. If snooping is required on NeXT, this is going to hurt performance and also be a huge pain in my rear.
 
Well, I was wrong. It's the worst-case scenario: NeXTStep doesn't mark regions of RAM as non-cacheable, nor does it inhibit the cache on those accesses. Instead, NeXTStep seems to take the approach of pushing and invalidating DMA buffers (via cpush). This causes any dirty lines in the internal caches to be written to memory before all selected lines of cache are invalidated. This is.... fine.... and could lead to better performance, but is a big problem when you have multiple tiers of cache.

On writes, the external cache will maintain consistency with the flushed pages so not a problem there. There's no expectation that data in RAM changes.

On reads, we have a problem. The 040 has the cache cleared for that particular region of memory, but when the 040 goes to read the new data placed there by the DMA, it gets happily supplied with the stale data from the external cache instead of whatever new data the IO device placed in RAM.

Best I can tell, there's no mechanism to capture those invalidations in external hardware (except for, you know, properly marking pages as non-cachable in the first place). So this is not easily solvable.

One method of keeping cache coherency with alternate bus masters (DMA) is snooping their bus cycles. We know that an external transfer is in progress, so we capture the address on the multiplexed bus at /TS & ↑BCLK. If this address has a cache entry, we can either write the new data into the cache or at least invalidate the entry so it won't be supplied to the 040 without being updated.
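The difference between no snooping, snoop-invalidate and snoop-update can be shown with a small software model. This is only a sketch in C of the idea, not the CPLD logic: the 16-byte line and 128K direct-mapped geometry follow the figures in this thread, while the names (`cpu_read`, `dma_write`, `snoop_policy`) and the 1 MiB RAM size are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LINE_BYTES 16u            /* one 68040 burst */
#define NUM_LINES  8192u          /* 8192 * 16 bytes = 128K, direct mapped */
#define RAM_BYTES  (1u << 20)     /* 1 MiB of modelled RAM, arbitrary */

enum snoop_policy { SNOOP_NONE, SNOOP_INVALIDATE, SNOOP_UPDATE };

struct cache {
    uint8_t  valid[NUM_LINES];
    uint32_t tag[NUM_LINES];
    uint8_t  data[NUM_LINES][LINE_BYTES];
};

static uint8_t ram[RAM_BYTES];

static uint32_t line_index(uint32_t addr) { return (addr / LINE_BYTES) % NUM_LINES; }
static uint32_t line_tag(uint32_t addr)   { return addr / (LINE_BYTES * NUM_LINES); }

/* CPU read through the external cache: fill the line from RAM on a miss. */
uint8_t cpu_read(struct cache *c, uint32_t addr)
{
    assert(addr < RAM_BYTES);
    uint32_t i = line_index(addr);
    if (!c->valid[i] || c->tag[i] != line_tag(addr)) {
        uint32_t base = addr - (addr % LINE_BYTES);
        memcpy(c->data[i], &ram[base], LINE_BYTES);
        c->tag[i] = line_tag(addr);
        c->valid[i] = 1;
    }
    return c->data[i][addr % LINE_BYTES];
}

/* A write by an alternate bus master. RAM always changes; what happens
 * to a matching cache line depends on the snoop policy. */
void dma_write(struct cache *c, enum snoop_policy p, uint32_t addr, uint8_t value)
{
    assert(addr < RAM_BYTES);
    ram[addr] = value;
    uint32_t i = line_index(addr);
    int hit = c->valid[i] && c->tag[i] == line_tag(addr);
    if (!hit)
        return;                                    /* nothing cached, nothing to fix */
    if (p == SNOOP_INVALIDATE)
        c->valid[i] = 0;                           /* next CPU read refills from RAM */
    else if (p == SNOOP_UPDATE)
        c->data[i][addr % LINE_BYTES] = value;     /* keep the line and patch it */
    /* SNOOP_NONE: the line silently goes stale */
}
```

With `SNOOP_NONE`, a `cpu_read` after a `dma_write` to a cached address returns the old byte, which is the stale-read problem described above. Either of the other two policies returns the new byte.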

NeXT Turbo and Mac are well behaved and can support both snooping approaches. NeXT non-turbo is not well behaved; it does not pass transfer start or transfer acknowledge strobes when an external master is driving the bus. I suspect there are private strobes to the memory controller being used instead. Without a way to know an alternate master's bus cycle is in progress, I'm up a creek: I can't grab the address or even know that I need to invalidate/snoop.

This leaves the least desirable approaches:

1) Always invalidate the full cache on any DMA access.
This has dire implications on the cache hit rate and would basically be the same as no cache if sound or ethernet activity is happening.

2) Patch software so that cache invalidation is caused in the external caches.

There is one out available: what I've found so far from the software side is based on the NeXTStep 2.x sources. The Nitro accelerator solved this issue somehow. If I can find some proof in later versions of how this was supported for the Nitro's cache then perhaps I can leverage that or do something similar. I feel like this issue would have occurred in the x86 boxes, too...

I don't have a huge amount of interest in starting to disassemble the *Step kernels, though, so I'm likely to pause this project for a bit and focus on the non-cached accelerator, which still needs some ironing out. If anybody has any pointers or interest in digging into the kernel, I'm all ears....

Here's the new board, anyways.

4iSAUIi.jpeg


This adds a 128K direct mapped cache behind buffers so it's theoretically capable of supporting full-speed operation from cache while the bus is unavailable due to DMA access. I actually have found it can operate up to 57mhz with cache in my Mac testbed - ridiculously fast! That requires a bus overclock, however. On my Q605 test bed, a bus-mastering LC PDS ethernet card works great, as I think all DMA buffers are allocated with appropriate flags set on them based on my reading of the supermario sources.

Cache supports up to 160MB/s of bandwidth on cache hits using 2:1:1:1 timing. 5 cycles to read 16 bytes at 50mhz. This is 3x (or more) faster than main memory's maximum bandwidth, or something like 4.5x the non-turbo memory.
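The bandwidth figure is simple arithmetic: bytes per burst times clock rate, divided by cycles per burst. A small C helper makes it easy to compare burst timings; the function name is made up for illustration, and MB here means 10^6 bytes.

```c
#include <assert.h>

#define BURST_BYTES 16.0   /* one 68040 cache line per burst */

/* Peak burst bandwidth in MB/s (1 MB = 1e6 bytes) for a 16-byte burst
 * that takes cycles_per_burst clocks at clock_mhz. */
double burst_mb_per_s(double clock_mhz, unsigned cycles_per_burst)
{
    assert(clock_mhz > 0.0 && cycles_per_burst > 0);
    return BURST_BYTES * clock_mhz / (double)cycles_per_burst;
}
```

At 50 MHz, a 5 cycle (2:1:1:1) burst gives 160 MB/s, a 6 cycle burst about 133 MB/s, and a 9 cycle burst about 89 MB/s.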

In testing on NeXT Turbo, unoptimized, I found around a 30% gain in POVRay and 15% on gzip compression. Roughly agrees with what I found on the Mac side. However, the cache being enabled as-is on NeXT causes immediate dysfunction of ethernet and disk corruption so my ability to test was limited.

In the Mac world the primary application would be as a Q700/Q900 accelerator and would essentially be the most performance you could get out of these machines. Probably would win out over the 840av as fastest 68K mac, too!

Here's a fun benchmark showing the difference between 9 cycle, 6 cycle, and 5 cycle cache reads, again as measured by @David Cook's tool.

1750175791213.png
 
I'm extremely excited about this Joe! Thank you for your hard work. Can't wait to throw money at you for a couple. Mostly I'm just compiling stuff, so a little boring but it'd really benefit from it. Please add me to the list when you do a prod run. Wonderful work! 🥰
 
but given I've been skiving off on my 060 in Q650 project,
Now that's something I can get behind.

If you need a starting point, one of the Interware Booster cards was designed with 060 in mind but it only ever had an 040 in it.

As I'm sure you know (but others may not), 68k chips aren't always 100% backward compatible in hardware. Notably, with the '040, the built-in FPU drops IEEE transcendental instructions, which wasn't a huge deal to most people but had a huge performance penalty on those pretty fractal-generating programs people liked to use in the era vs. running the same thing on a high-end 030 with a "real" 68882.
In the same way, the 060 drops some hardware instructions to save transistors, with calls for these instructions being trapped and starting up a software emulator to complete them. Unfortunately, Macs call for these instructions in ROM/pre-boot, and there's no way to load the emulator that early. You'd have to recompile the ROM to get it to work (or do the NuBus Power Mac CPU upgrade approach where the onboard CPU boots and then gracefully hands off control to the upgrade once the extension loads, which means an 060 upgrade would either have to go in the 040 PDS or you'd have a big socketed card that retains the 040 alongside the 060).

Regardless, amazing work here. Makes me wish I had some NeXT gear.
 
Yes, the electrical adaptation is simple. Stefan Reinauer has an Amiga adapter that works for that piece. Replacing the CPU entirely is the sensible way to handle it.

The ROM must be modified to load the 060 ISP and FPSP (but the FPU can be disabled for early bring-up), along with some setup for some of the new 060 features. I've proceeded a bit along in this vein, but I've put off trying to capture a ROM execution trace so I can understand where it's getting hung up. It gets far enough to sad chime, showing basic viability.


Regarding the hyperdrive accelerator, I have implemented a method to invalidate cache lines from software and patched the kernel to invalidate cache lines prior to DMA. This has worked to prove basic viability of this method of keeping the cache coherent on NeXT, but I have a lot of work to do to firm this up as I've been rapidly iterating to have it available to show for VCF.
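As a rough illustration of the software side, here is a sketch in C of walking a DMA buffer line by line and invalidating each matching external cache line before the transfer. The actual mechanism on the card and the actual kernel patch are not shown in this thread, so `invalidate_line` is a hypothetical stand-in operating on a software model of the tag RAM, and the geometry (16-byte lines, 8192 lines) is assumed from the 128K direct-mapped figure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_BYTES 16u
#define NUM_LINES  8192u   /* 128K direct mapped, assumed geometry */

/* Software model of the external cache's tag RAM. */
struct ext_cache_tags {
    uint8_t  valid[NUM_LINES];
    uint32_t tag[NUM_LINES];
};

/* Hypothetical stand-in for whatever access the real hardware uses
 * to drop one line. */
static void invalidate_line(struct ext_cache_tags *c, uint32_t addr)
{
    uint32_t i = (addr / LINE_BYTES) % NUM_LINES;
    if (c->valid[i] && c->tag[i] == addr / (LINE_BYTES * NUM_LINES))
        c->valid[i] = 0;
}

/* Invalidate every cache line overlapping [base, base + len) before a
 * DMA transfer into that buffer. Returns the number of lines visited. */
size_t invalidate_for_dma(struct ext_cache_tags *c, uint32_t base, uint32_t len)
{
    if (len == 0)
        return 0;
    assert(len - 1 <= UINT32_MAX - base);          /* range must not wrap */

    uint32_t end   = base + (len - 1);             /* last byte of the buffer */
    uint32_t first = base - (base % LINE_BYTES);   /* line holding the first byte */
    uint32_t last  = end - (end % LINE_BYTES);     /* line holding the last byte */
    size_t lines = 0;

    for (uint32_t a = first; ; a += LINE_BYTES) {
        invalidate_line(c, a);
        lines++;
        if (a == last)
            break;
    }
    return lines;
}
```

Note that an unaligned buffer touches one more line than `len / 16` would suggest, so the walk is done on line boundaries rather than on the raw buffer bounds.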

Around 30% performance uplift is seen in povray, compiling, and boot time, and interestingly HTTP performance is also notably improved. Not sure if this is a direct result of the cache or indirect (better bus availability for DMA).
 