
601 processor replacement experiments

Franklinstein

Well-known member
.  .  .  and what the heck is that black box with red sides?
I'm 97% sure it's an inductor of some variety. Pretty much all of these things have at least one somewhere near the processor in the power supply section. 

The only concern regarding the FX I noted earlier: the package mounted atop an adapter may prove to be a little tall for a tight installation like this. Maybe measure the total height of the existing processor and thermal interface and then compare to the calculated height of the completed FX-on-adapter before pulling the trigger on anything crazy. You could still make it work but you'd need to engineer a new thermal solution in addition to the processor card. 

 

Trash80toHP_Mini

NIGHT STALKER
That interchange about package thickness and available cubic in the 603e PowerBooks made the 1400 processor card the only logical lab varmint.

P1010002.JPG

The 1400's heat spreader is a removable 2mm aluminum assembly, not an integral magnesium frame as in the 2300c target machine for the theoretical interstitial adapter. Its thickness provides wiggle room of 1mm by way of milling away the surface.

P1010003.JPG

That surface is a stamped indent on the assembly of about 4mm. The cubic it occupies under the top surface provides a last resort opportunity to mill a large opening for a massive copper adapter.

1400-117-ProcCard-002.JPG

Was hoping for a better shot, but this shows the spatial relationships clearly enough.

As I said, that space makes the easily pulled CPU daughtercard of the 1400 the most logical choice for a 603e PowerBook upgrade that challenges anything available in the day. The 1400 was the only possibility for accelerator manufacturers outside of the 2400c, which I've never even seen and which is far too valuable to be used as a lab rat.

The stack of 1400s, my attachment to the machine that led to it, and the several extra logic boards and processor cards from the stack reduction project make this a no-brainer for something this insane. I've been working my way from PBX and the logic board connectors toward the PCMCIA cage/daughtercard interconnect on and off for 15 years now. The aim has always been to translate that half of the 1400's split I/O bus into the Docking Connector of the Duo 2300c. Feasibility of physical installation of the 1400 PCMCIA assembly into the HDD bay of the 2300c was a done deal 15 years back! G3 insanity dovetails very nicely with that project on both sides of the shared PBX controllers in that Dynamic Duo.

dr. bob gave me a crash course on the basics of high frequency signalling after he finally gave in and admitted that the 1400 and 2300c are the same outside of the 1MB VRAM/ECSC upgrade of the 512K(?) VRAM/CSC Video Controller***** of the baseline 2300c. All other components of the 1400 were offloaded to the Docking Connector of the 2300c. Transplanting a 1400/G3 into the 2300c was the impossible dream of that full-on manic episode. Guess what? 15 years, the dawn of the Raspberry Pi/10x10cm Seeed PCB prototyping age, and a partial return to sanity make that proposition impractical, but maybe possible given interstitial adapter development?

1400-117-ProcCard-008.JPG

Note the "1600" printed on the lower left corner of the (333MHz?) Sonnet card; what's that all about, I wonder?

Another denizen of the 1400 stack reduction project box makes things a lot easier. Designing a breakout board PCB for the interboard connect is probably the first step on the journey. The PCB will have matching pads top and bottom for male/female header direct connection of harvested connectors, and the pair will act as an interstitial breadboard interface. First, for when the time comes to play with the 750FX on a header-matching protoboard. Hoping to underclock the logic board enough that breadboarding will be possible while retaining function of the external video card.

1400-117-ProcCard-006.JPG

This Minimalist/1400 configuration mockup is a direct descendant of the original DuoDock based Minimalist/Duo 230 project box. The PCMCIA Card Cage/Daughtercard assembly will be removed on the left side of the CPU, and the BookEndz Dock project box approach allows the breakout board to overhang KBD, speaker and LCD connectors at top and bottom. Figuring out an adequate line driver/buffer setup for such a massive breadboard prototyping contraption is something someone else will have to do for me. Cubic aggregation building blocks and visual PCB trace schematic development I can do, but not the complicated electron pusher kinda stuff.

Adapter stack PCB development is my first order of business if this methodology proposal passes muster. Lil' help with that determination please. :huh:

*****maddog successfully gave the 2300c an ECSC transplant, but may have never gotten to or past feasibility studies for a 2300c VRAM upgrade. Full screen 16-bit on that gorgeous 2300c LCD would be very, very nice indeed!

 
Last edited by a moderator:

rafthe030

Active member
To quote Red Hill Technology, "more cache good; faster cache gooder." External SRAMs were kinda slow, especially as processors ticked up into the GHz range, so while you may be able to stick 2MB of backside cache on something, if it's only running at half processor speed (or less), it doesn't provide any more of a performance benefit than 1MB of on-die cache
I'm not wanting to be a disturbance here; in fact I'm just keeping watch out of curiosity.

I believe more external backside cache (as opposed to less, faster on-chip cache) would be a bigger help, given that a slow frontside bus with even slower memory will disproportionately bottleneck a (300MHz+?) chip at each FSB transaction. To me the trick should be to avoid as many FSB transactions as possible, as they're the worst-case scenario (even more so here). More backside cache is likelier to do so, and the L1 will still do its job.

 

Trash80toHP_Mini

NIGHT STALKER
LOL! The more disruption in here the better. Hearing the "chioinnk" from across the room disturbed a half-nap I didn't really need.

External L2 on the backside is out of the question for anything in the soldered 603e PowerBook target range due to restricted cubic. Maybe if anything comes of this silliness we might see something along your line of thinking for the L2 Cache interface, but that's out in someone else's field of dreams.

That said, I'm confused by what you're saying. My understanding was that L1 is meant to keep the CPU's fingers off the cookies stored in the L2 jar, and that the L2 cookie jar is meant to stall a headlong rush by the CPU out to the slow rate of production at the cookie factory of main memory? Keeping as much of that kind of activity on the FSB would seem to be the thing to do to me. That said, there appear to be later versions with L2 on die that place additional clock-divided L2 out on the CPU card or logic board, between the die and 1x system bus memory.

Best case for the building blocks I've been able to line up thus far would be the 512K on die L2 of a 750FX running at 666MHz. Other options welcome!

I've got an 800MHz G4 accelerator card with what I assume must be L2 on die and external L3, but that's way out in BFE! :blink:

 

rafthe030

Active member
I know the L2 interface is impossible :sadmac: It was just to give my version of the "So what should give better performance, 1M clock divided backside L2 or 512K of on die (frontside?) L2? " in one of your previous posts.

By mentioning the L1 I just meant the branch predictor and data caching mechanism will do their job and try to utilize the L2 and main memory, no matter how slow, the right way. And by "less frontside bus transactions" I meant fewer stalls from cache misses (not caching itself), since the bigger and heavier the dictionary, the more entries you can look up before having to boot up your sluggish PC to google what you're looking for.

A while ago I switched daily drivers from a Quicksilver G4 with 2x 1.6GHz 7447 w/ 512KB on-chip L2 to an MDD with 2x 1.5GHz 7455 w/ 256KB on-chip L2 and 2MB L3, and I didn't know L3 made such a damn difference :I At times it feels like night and day.

 

Franklinstein

Well-known member
The thing with external L2 caches is that they're not always half speed; sometimes it's much slower, 25% or less of processor clock. That's still faster than the memory bus (and doesn't have to go through the memory controller), granted, but significantly slower than the core frequency. If you could keep the backside cache at 2:1 and increase the size to whatever level you want, that would be one thing, but SRAMs don't generally keep up with processor speed and are expensive if they do, and also the 750 is max'd out at 1MB of external cache anyway. I did mention that generally it's a 2x difference in cache size when comparing internal vs. external, not 4x or more. Unless you're running the slowest SRAMs a 1MB external L2 will have more benefit than a 256k on-die cache, but the gap narrows when you're comparing 512k internal to 1MB external; I doubt you'd notice the difference at all between the latter unless you're running specifically-coded programs, like benchmarks, that can put the bulk of their routines in L2. 

In addition, more devices = more board space, more power consumption, and more heat generation, all of which are pretty big factors in portable applications.
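
For anyone who wants to put rough numbers on the on-die vs. backside trade-off, here is a minimal sketch using the textbook average-memory-access-time formula. Every value in it (hit rates, cycle counts, the memory penalty) is a made-up placeholder chosen only to illustrate the shape of the argument, not a measurement of any 750-family part or any Mac.

```python
def amat(l1_hit, l1_cycles, l2_hit, l2_cycles, mem_cycles):
    """Average memory access time in CPU cycles for a two-level cache.

    AMAT = L1 time + L1 miss rate * (L2 time + L2 miss rate * memory time)
    """
    l1_miss = 1.0 - l1_hit
    l2_miss = 1.0 - l2_hit
    return l1_cycles + l1_miss * (l2_cycles + l2_miss * mem_cycles)


# ASSUMED illustrative values only; none come from a datasheet or benchmark.
L1_HIT, L1_CYCLES = 0.95, 1
MEM_CYCLES = 100  # slow main memory behind a slow frontside bus

# Smaller on-die L2 at core speed: cheap hits, lower hit rate.
on_die = amat(L1_HIT, L1_CYCLES, l2_hit=0.85, l2_cycles=4, mem_cycles=MEM_CYCLES)

# Larger backside L2 at a divided clock: costlier hits, higher hit rate.
backside = amat(L1_HIT, L1_CYCLES, l2_hit=0.90, l2_cycles=10, mem_cycles=MEM_CYCLES)

print(f"on-die: {on_die:.2f} cycles, backside: {backside:.2f} cycles")
```

With these placeholder inputs the two configurations land within a few percent of each other. Nudge the hit rates, the cache divider, or the memory penalty and either one can come out ahead, which is really the point: the answer depends on the workload and the SRAM speed, not on cache size alone.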

 

Trash80toHP_Mini

NIGHT STALKER
If anyone runs across a 750FX specific datasheet, tech sheet or best of all the user manual, that would be a big help. Everything I've found so far seems to be an indecipherable 740/750 morass. I have to try at work again, the filtering gets me hits I can't find at home. I absolutely HATE datasheetarchive.com and datasheetlib.com for munging up searches for everything about anything with irrelevance. The manuals versions of same are helpful/hurtful in random distribution.

The Shrier archive has info on PLL voodoo for the 1400, but no crystal swap overclocking info that might be used to slow the entire system down. Setting the multiplier to 1:1 will help, but I've got feasibility concerns about breadboarding a 33MHz processor interface. Does anyone have experience with breadboarding at 33MHz? I figure 16MHz or 8MHz would be more doable so long as I can retain clocked down Video expansion card function to run a multisync display. If my 33MHz worries are unfounded, please let me know. I look at this craziness as a logic analyzer crash course, so that makes brick wall collision/fail a positive goal! :approve:

Minimalist 1400 is installed in one of my project tool boxes so I can look at breakout board possibilities in 3D during downtime.

Do I recall a statement somewhere about a drop in G4 replacement for the 750FX?

 

Trash80toHP_Mini

NIGHT STALKER
Thanks for clearing that up, trag. That massive pole of CPUs would work great for the 1400's 666MHz limitation. On a 50MHz bus (very much if ever) a 15x multiplier would run them at a 2.4% overclocked 750MHz.
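
The clock arithmetic here is simple enough to sketch. The helper names below are hypothetical, and the 33.33MHz x 20 combination is only an assumption about one way to land near 666MHz; the 50MHz x 15 case and the 2.4% figure come from the post.

```python
def core_mhz(bus_mhz, multiplier):
    """Core clock in MHz = bus clock x PLL multiplier."""
    return bus_mhz * multiplier


def implied_rated_mhz(actual_mhz, overclock_pct):
    """Rated clock implied by running at actual_mhz with the given % overclock."""
    return actual_mhz / (1.0 + overclock_pct / 100.0)


fast = core_mhz(50, 15)  # the 50MHz bus, 15x multiplier case
print(fast, "MHz core")
print(f"a 2.4% overclock implies parts rated near {implied_rated_mhz(fast, 2.4):.0f} MHz")

# ASSUMPTION: a 33.33MHz bus with a 20x multiplier as one route to ~666MHz.
print(f"{core_mhz(100 / 3, 20):.1f} MHz core")
```

Swapping in the actual bus speed and the multipliers the PLL supports gives a quick table of reachable core clocks before any soldering starts.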

 

Trash80toHP_Mini

NIGHT STALKER
[:)] Thanks so much! Took a look and laser printed some pages at work tonight. So far it's complicated or really bad news for anything but the 1400 and maybe the 2400? The 750FX expects to find a formidable grid of decaps in place directly underneath, on the solder side of the board.

750FX-DecouplingCapacitors.JPG

I've never seen a 2400, but the 1400 card has zero problems providing for that grid. An interstitial adapter from 603e CQFP or BGA is another story entirely. Dunno how close the decaps need to be to where they're spec'd to be, but if they can't be moved to the periphery of the adapter, that's a brick wall. Thankfully the 750FX CPU is only a tiny 21mm square. 603e CQFP pads make for an SMT interstitial PCB of about 37mm square, leaving approx. 8mm on each side for relocating the scads of teensy capacitors above from the underside of the 750FX.
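
The margin figure above is just geometry, sketched here under the assumption that the 21mm package sits centered on a square 37mm adapter. The function names are hypothetical, and the border area is an upper bound: the CQFP pads and signal routing have to share that space with any relocated capacitors.

```python
def margin_per_side(adapter_mm, package_mm):
    """Free border on each side when a square package is centered on a square adapter."""
    return (adapter_mm - package_mm) / 2.0


def border_area_mm2(adapter_mm, package_mm):
    """Board area outside the package footprint (upper bound for relocated parts)."""
    return adapter_mm ** 2 - package_mm ** 2


print(margin_per_side(37, 21))   # 8.0 mm per side
print(border_area_mm2(37, 21))   # 1369 - 441 = 928 mm^2
```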

Signals are on the periphery for the most part, which makes threading the signals into the grid from the interboard connectors on either side less than a totally daunting proposition.

750FX-V-G-Signals.JPG

750FX-Ball-Placement-Pinout.JPG

Next up is finding out if signal differences between the 603e and 750FX will hose the project right out of the gate. Given signal compatibility the 1400 project looks socked in to maybe fair. CQFP interstitial adaptation for 2300c, 6400 and the rest looks socked in to zero visibility while flying through a mountain range.

BGA to BGA interstitial adaptation looks totally dependent upon how closely the bottom of a given 603e logic board resembles the decap map above. Numbers probably count more than locations; that's a whole lotta caplets. Basically it'd be rowing a dinghy into a Nor'easter.

The fat lady is practicing scales while waiting for the signal compatibility verdict.  :mellow:

edit: massive pole of CPUs! :lol:

 

Trash80toHP_Mini

NIGHT STALKER
Signal to Signal comparison came off pretty well, a few inconsistencies to research.

Looks like the 750FX may be last of the simple PLL clock multiplier setups. It has a variable setup for power saving modes, but MPC7447A doesn't seem to allow the multiplier to be set in hardware. Lovin' having that Migration from IBM 750FX to MPC7447A, Rev. 1 document available. Designing the 750FX board with provisions to do a 7447A "drop in" upgrade would be great, but methinks it better to relegate such dreams to a later revision. It's not like a 750FX board I do is gonna work given my lack of skills, but someone competent might follow up on the research at some point. :mellow:

 

Trash80toHP_Mini

NIGHT STALKER
LOL! I transferred all my files for this and a related project into a plastic storage bin about three hours ago! What's the native Amiga CPU and ROM? 1400 ROM/Toolbox is for PPC, not 680x0.

Couldn't find package specs on the Cyclone-III FPGA they used on Apollo? From various pics it looks like it might just fit; you've got me wondering about a PowerBook 100 processor card/accelerator again! :blink:

edit: iCrap, sounds a bit big in EQFP to fit with the SRAM on the card and support components for using it in a 5v environment.

--Plastic Enhanced Quad Flat Pack (EQFP), 144 pins, 22 mm x 22 mm

--BGA variant might work? 256-FBGA (17x17) is a bit better

 

uyjulian

Well-known member
FPGA would be interesting. However, the Apollo core is closed source (because they are intending to manufacture it on ASIC), so an alternative method would be needed.

It would be interesting to accelerate 68k on ARM processor. This exists: https://github.com/PandTomB/uae4arm

 

Trash80toHP_Mini

NIGHT STALKER
OK, tell me if I've got any of this at least partially right? Vampire accelerators for Amiga are FPGA based using a proprietary soft core called 68080 developed by a company named Apollo? Is the ASIC they're intending to produce Amiga specific with the SAGA core on die or are they building a general purpose CPU? Only the latter makes sense to me from a business perspective.

We've but two target laptop CPU boards for 680x0 code: PowerBook 100 and Blackbird, then there are 68000 and 68030 PDS machines, including the Portable/SE in the 68000 camp and IIsi/SE/30 in the 68030 camp and then there's the 68040 PDS. Lest we forget, any socketed or socket upgrade ready CPU machine might work?

I was excited to find memory built into the FPGA, hoping it might be used as virtual memory, but "MMU implementation is not currently planned for 68080 Core CPU." Maybe it could be configured as Cache?

Dunno, just trying to understand this a bit. Very interesting little tangent we've got going here. :approve:

 

Trash80toHP_Mini

NIGHT STALKER
It would be interesting to accelerate 68k on ARM processor. This exists: https://github.com/PandTomB/uae4arm
Can't get to and won't understand the GitHub info, but now I'm wondering if that may be what Apollo is doing on the Altera Cyclone FPGAs.

cycv-lowest-system-cost-16x9.jpg


Is that ARM implementation powerful enough to do what they're claiming for the 68080? If they're running the 68040 instruction set on ARM, maybe the 601 instruction set would work as well. Lotsa, lotsa work, but if some of it's open source already, there's a start?

 

Gorgonops

Moderator
Staff member
Can't get to and won't understand the GitHub info, but now I'm wondering if that may be what Apollo is doing on the Altera Cyclone FPGAs.
No, the Apollo core does not run on the ARM core built into Cyclone V FPGAs. This is evident by the fact that the core can be implemented on Cyclone III family devices that do not have the ARM core. (And to be clear, not all Cyclone Vs have it either, only the "SoC" variants. Which Apollo accelerators do not use.)

Re: the link to UAE4Arm, that's just a repository for an ARM-tweaked version of the UAE version of the Amiga Emulator, it's not a magic gateway to an already implemented "I stick an ARM CPU into an alien CPU socket and run 68k code MOAR FASTER while looking just like the original CPU" widget. That said: I suppose if someone *were* to use something like a Cyclone V SoC to essentially adapt an ARM core to interface to a 680x0 socket you certainly could use the UAE CPU core as the basis for your 68k emulator. But honestly that's by far the most trivial aspect of this build; the hard/interesting part is going to be making the bus logic to make the Macintosh "body" accessible to the ARM brain. In principle at least one should be able to pull it off; you'll need to build the appropriate state machines in the FPGA so the accelerator "looks" like a 680x0 from outside and behaves correctly in response to the interrupts/bus sizing signals/wait states/whatever. Then from the ARM's standpoint I suppose the most straightforward thing to do would be to chop and shuffle the Mac's memory map a little so you can address the whole thing as a memory mapped peripheral, route the interrupts appropriately, and then set the whole thing up so when powered on it executes a compact and highly optimized 680x0 emulator out of the onboard RAM/Flash in the FPGA.

This is going to be harder than a simple software emulator, of course, because you *will* need to be able to respond to hardware interrupts/etc, in real time, unlike in a complete emulation where you can handwave/delay things to your heart's content, but it's probably... totally possible, assuming your ARM is fast enough. Essentially what you'd be building here is the equivalent of a 680x0 ICE pod. The real question, of course, is exactly how *much* faster you'll be able to run than the original. The Vampire accelerators for the Amiga replace a lot more than the CPU; they include their own RAM, and *also* replicate most of the custom chipset features internally. Broadly speaking they basically wear the host Amiga like a mask and themselves run internally essentially the same way as a full FPGA Amiga like the MiST/Minimig does. Access to any hardware outside the Vampire is *muuuuch* slower than operations inside of it. So based on that I suspect that if you *just* replaced the CPU in an old Mac with an FPGA-interfaced ARM CPU (or even a full "real" CPU core like the Apollo "68080") and didn't include RAM, etc, on the accelerator then the total gain you could expect would be... very likely disappointing.
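
The "disappointing gain" point can be sketched with Amdahl's law: only the share of run time spent executing inside the replacement CPU gets faster, while time spent waiting on the host's original bus, RAM and I/O stays put. The fractions and the 10x core speedup below are assumptions for illustration, not measurements of any real Mac or accelerator.

```python
def overall_speedup(cpu_fraction, cpu_speedup):
    """Amdahl's law for a CPU-only swap.

    cpu_fraction: share of original run time that is pure CPU work (0..1)
    cpu_speedup:  how much faster the replacement executes that work
    The remainder (host bus, RAM and I/O waits) is assumed unchanged.
    """
    return 1.0 / ((1.0 - cpu_fraction) + cpu_fraction / cpu_speedup)


# ASSUMED fractions for illustration only.
for frac in (0.3, 0.5, 0.8):
    print(f"{frac:.0%} CPU-bound, 10x core -> {overall_speedup(frac, 10):.2f}x overall")
```

Under this simple model, a core that is ten times faster yields less than a 2x overall gain if half the original run time is spent waiting on the host, which is why on-board RAM matters so much for accelerators of this kind.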

 