Classic II possible ROM bug, weird 68030 instruction

Andy

Well-known member
I don't have much advice, but I can offer you words of encouragement. Last year I recapped two Classic II boards (one of each revision) and, even with my low experience at SMD soldering, I got it done without too much difficulty. I used an old Hakko iron with a chisel tip, some Kapton tape on nearby plastic components to give myself a bit of forgiveness, and took it slow. You got this!
 

dougg3

Well-known member
Thanks for the encouragement, Andy! That's a really good idea to cover nearby plastic with Kapton tape.
 

dougg3

Well-known member
Took care of the tricky ones tonight, in particular C3 and C13 because of their close proximity to other parts. Kapton tape helped with shielding, and the Hakko T18-BR02 tip worked well. I still think this is one of the trickier, if not the trickiest, fits I've ever had to deal with while recapping a logic board. C3 is reeeaaaaalllly close to the ADB port, and LP2 and LP3 stick up just enough to annoyingly be in the way. I'm not sure the picture does it justice. It's not my prettiest soldering job of all time, but it'll do. I wonder if I can buy those 10uF tantalums one size smaller. There's barely any room on the factory pads to fit them.

I'll finish the recap job later this week. I'm waiting for my RGBtoHDMI to arrive, and then assuming the logic board works, I should be able to run some tests to prove one way or another if there's really a bug. Interestingly, the ROM chips on my logic board are all really Intel D27C010 EPROMs. Theoretically, if I had a UV eraser, I could reprogram the factory chips with my hacked ROMs! But I'd rather leave them alone.

newcaps1.jpg
 

Andy

Well-known member
If it works it works, even if it is a little messy. I don't have any photos of my recapped board and I don't want to pull it out of the computer, but I'm pretty sure that C3 cap looks better than mine ;)
 

dougg3

Well-known member
Drats...recapping didn't fix this Classic II logic board on its own. I'm sooo close to being able to verify this bug on hardware.

I cobbled together a video solution that doesn't depend on the analog board using mac-se-video-converter with a Pi Pico. I was originally going to wait for the RGBtoHDMI to arrive, but I'm impatient. I'm even playing dangerously and not using level shifters for the video signal from the Classic to the RP2040 because the RP2040 is unofficially "sort of" 5V tolerant -- who cares if I fry one of my many Pi Picos?

I get a video signal, but it's just a black screen with a wide vertical white bar down the middle. (Maybe default RAM contents or something?)

I know my power and video setup is good because the original Classic logic board I also recapped at the same time, which didn't leak nearly as badly, works fine.

I was worried this would happen after reading lots of stories on this forum about the Classic II. Sounds like I might need to do a deeper cleaning, and possibly remove and resolder the Egret. And of course check to make sure all the RAM and ROM pins are making their way to the CPU.
 

nathall

Well-known member
Some years ago when the caps on my Classic II started to go, that is the symptom I got— the single wide white bar down the middle of the screen on startup. That, and the sound would get really quiet. Usually if I left it turned on long enough on that white bar screen, it would actually eventually bong and start up and I could use it without issue other than sound that got progressively quieter. And by long enough, I mean I would turn it on, go have lunch or watch TV and return when I heard it bong from the other room. I removed all caps years ago and started to recap it, but I lifted a pad on one of the last ones in a bad spot and shelved it, where it remains to this day. Some day I’ll go back and run a bodge wire, but I never got to see if simply recapping actually fixed the white bar issue.
 

dougg3

Well-known member
I figured it out! The Classic II logic board lives, and successfully booted from ZuluSCSI. Didn't have to remove any chips. I thought the Egret looked pretty clean and I had already absolutely doused it with 99% IPA, dripped it through underneath, and scrubbed well with a toothbrush earlier, so I'm very glad I didn't just start pulling chips off.

I started by confirming that every single ROM address and data pin went to the CPU, which they all did.

Then, I decided to check all of the pins on the Egret, and sure enough, /RESET (pin 15) was shorted to ground. This is bad because it would mean everything is stuck in reset. It was pretty easy to see by looking at my pics from before the recap that the likely culprit was the negative terminal of C7, so I removed it:

c7removed.jpg

You can definitely see a few spots where the solder mask isn't quite there anymore on the trace coming from pin 15 (all the way on the left of the row closest to C7). And yep, the short was gone. So I covered everything around the pad with some UV curable solder mask. Dang, I'm really lucky this didn't happen near any of the other caps (as far as I know anyway!). Maybe another point in favor of using caps that are in a form closer to the originals.

Just to be safe based on everyone's recommendations in this thread, I used a brand new cap when I replaced it. Thanks again to everyone who made it clear to me that tantalums are sensitive to heat.

uvmask.jpg

That was all it took. Now it comes right up with no problem. The black screen with white bar seems to possibly be a default RAM state before the CPU fills it with anything, kind of like the checkerboard pattern on the original Classic. I would test this myself by holding the reset button while powering it on, but it turns out my reset button also doesn't work. The button looks like it got corroded really badly somehow, and it doesn't connect its two pins together when pressed anymore. Hopefully I can find a suitable replacement.

The video quality I'm seeing with the mac-se-video-converter is pretty noisy, but that is likely due to my hodgepodge wiring, or maybe the fact that I'm not using the exact VGA port circuit that they recommended. Either way, I have a way to run tests on hardware now!

It's getting too late tonight, and I would like to have nice video quality for my screenshots proving one way or the other what's going on, so hopefully I can figure out how to clean up the video signal this weekend and run my tests! I also thought of a third ROM hack to try: replace the invalid CAS instruction with NOPs and see if the hardware Classic II fails to boot in 32-bit mode just like MAME.

Some years ago when the caps on my Classic II started to go, that is the symptom I got— the single wide white bar down the middle of the screen on startup. That, and the sound would get really quiet. Usually if I left it turned on long enough on that white bar screen, it would actually eventually bong and start up and I could use it without issue other than sound that got progressively quieter.

Thanks for the extra data point, much appreciated! I'm guessing your recap will have fixed it. Maybe one of the bad caps was causing things to stay in reset (unhappy Egret?)
 

David Cook

Well-known member
recap that the likely culprit was the negative terminal of C7

Ooh. Bad routing on whomever laid out that board. Always stay out of a part's land pattern if possible. There was plenty of open space to route the traces better.

Better routing.jpg
 

dougg3

Well-known member
Oh, that's a nasty one to find. Well done...

Thanks! So that might be a good one for future Classic II troubleshooters, make sure that pin isn't shorted to ground.

Ooh. Bad routing on whomever laid out that board.

No joke, it's crazy how unnecessarily close to the pad that trace runs. Although in fairness, this wouldn't have happened if I had put caps that matched what the factory put in. At least it wasn't 12V like what happened to Adrian's Digital Basement on that SE/30!

I got the Pi Pico VGA converter working well enough to run some tests with custom ROMs on hardware. (The video has a column or two of weird pixels on the left, but oh well...) Running my tests...

Test 1: Replace the code at 0x40A43B9C (move.b #$90, ($1c00,A1)) with a jump to my special code that draws A1 to the screen. In other words, replace the instruction that causes a Sad Mac in MAME so that we see what A1 is in the hardware at that point. Result:

1737234053792.png

Interesting, how did 0x40A4BBB2 get in there? That's a ROM address, so it's not really a valid base address to write to, but it doesn't cause a Sad Mac on hardware.
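As a quick sanity check on the "that's a ROM address" observation, here is a minimal Python sketch. The ROM base of 0x40A00000 and the 512 KB size (four 128 KB 27C010-class chips) are assumptions inferred from the addresses and chip type mentioned in this thread, not values read off the hardware, and `in_rom` / `rom_offset` are hypothetical helper names.

```python
# Assumptions (inferred from this thread, not measured on hardware):
#   - the ROM is mapped starting at 0x40A00000
#   - the ROM is 512 KB: four EPROMs of 128 KB (1 Mbit) each
ROM_BASE = 0x40A00000
ROM_SIZE = 4 * 128 * 1024


def in_rom(addr: int) -> bool:
    """Return True if a 32-bit address falls inside the assumed ROM window."""
    return ROM_BASE <= addr < ROM_BASE + ROM_SIZE


def rom_offset(addr: int) -> int:
    """Return the byte offset into the ROM image for an address in the window."""
    if not in_rom(addr):
        raise ValueError(f"0x{addr:08X} is outside the assumed ROM window")
    return addr - ROM_BASE


# Addresses taken from the tests in this thread.
for label, addr in [
    ("patched instruction (Test 1)", 0x40A43B9C),
    ("A1 seen in Test 1", 0x40A4BBB2),
    ("A1 seen in MAME", 0xFFFF8FBA),
]:
    where = f"ROM offset 0x{rom_offset(addr):05X}" if in_rom(addr) else "not in ROM"
    print(f"{label}: 0x{addr:08X} -> {where}")
```

Under those assumptions, the A1 value from Test 1 sits inside the ROM window, which is consistent with the write being harmlessly ignored instead of hitting a bad address.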

Test 2: Replace the code at 0x40A43B94 (the invalid code that MacsBug says is CAS.W D1,D2,$0004(A4)) with the same "jump to draw A1". In other words, replace the invalid CAS instruction so we can see what A1 is when we first reach it. Result:

1737234610326.png

Yep, A1 has that same 0xFFFF8FBA value that we see on MAME. And the fact that we actually jumped to my drawing code there means that hardware is definitely making that same invalid out-of-bounds table jump we see on MAME.

Test 3: NOP out the invalid CAS instruction at 0x40A43B94 so we can see what happens if the 68030 doesn't accidentally repair the value stored in A1. Result:

1737234834164.png

Exactly the same problem that happens in MAME. BTW, all of these tests behave exactly the same regardless of whether I have 24-bit or 32-bit addressing selected. So MAME is being more forgiving than actual hardware with the invalid access in 24-bit addressing mode.

Conclusion: I've confirmed that hardware follows the same code path as MAME, and that the invalid CAS instruction is changing A1 to point to somewhere in ROM and thus preventing a Sad Mac. This is a bug in the Classic II's ROM that the 68030 magically/accidentally worked around. Pretty cool!
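To make the consequence concrete: the final move.b writes to A1 plus a 16-bit displacement of $1C00. Here is a small Python sketch of that address arithmetic, using the two A1 values from the screenshots above. The function names are made up for illustration, and this only models the (d16,An) addressing mode, not the bus, the MMU, or 24-bit vs 32-bit mode.

```python
def sign_extend_16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed displacement."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def ea_d16_an(an: int, d16: int) -> int:
    """Effective address for (d16,An): An plus sign-extended d16, wrapped to 32 bits."""
    return (an + sign_extend_16(d16)) & 0xFFFFFFFF


DISPLACEMENT = 0x1C00        # from move.b #$90, ($1c00,A1)
A1_BEFORE_CAS = 0xFFFF8FBA   # A1 when the invalid CAS is first reached (Test 2)
A1_AFTER_CAS = 0x40A4BBB2    # A1 when the move.b is reached (Test 1)

for label, a1 in [("before CAS", A1_BEFORE_CAS), ("after CAS", A1_AFTER_CAS)]:
    target = ea_d16_an(a1, DISPLACEMENT)
    print(f"A1 {label}: 0x{a1:08X} -> write target 0x{target:08X}")
```

With these inputs the unrepaired A1 gives a write target of 0xFFFFABBA, while the A1 left behind by the CAS gives 0x40A4D7B2, which is still up in the 0x40A... range where the ROM addresses in this thread live.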
 

David Cook

Well-known member
Great work! This has been really interesting to follow.

I suspect that we all have our share of bugs that have somehow magically worked out, but it is really cool to see one get caught.
 

zigzagjoe

Well-known member
I wonder if it was intentional somehow, as the ROM wouldn't assemble like that without deliberately coding a literal value into the source (or triggering bad behavior from the assembler, anyways). Perhaps it could be a bodge to salvage an issue in the production of the mask ROMs?

Just spitballing, anyways.
 

dougg3

Well-known member
Thank you all! Very fun stuff.

I wonder if it was intentional somehow, as the ROM wouldn't assemble like that without deliberately coding a literal value into the source (or triggering bad behavior from the assembler, anyways). Perhaps it could be a bodge to salvage an issue in the production of the mask ROMs?

Just spitballing, anyways.

Interestingly, the early Classic II that I bought has EPROMs with stickers for the Apple part numbers instead of mask ROMs. Not that it would disprove your mask ROM theory, but I found it interesting nonetheless.

I still think it was a mistake. It wasn't assembled that way -- the ROM is accidentally jumping into the middle of a valid instruction. Here is what the intended code looks like, directly after a jump table filled with branch instructions:

Code:
movea.l $cec.w, A1           2278 0CEC
bclr #$4, ($1800,A1)         08A9 0004 1800
move.b #$90, ($1c00,A1)      137C 0090 1C00

What this is doing is loading the address of VIA2 into A1, then clearing a bit in one of its registers and writing a value to another register, for enabling sound interrupts.

What really happens, though, is that because we end up jumping past the bounds of the table of branch instructions, the program counter ends up pointing at the $0CEC word, which is smack dab in the middle of the movea.l instruction. The same bytes end up being interpreted like this instead (MacsBug interprets this invalid CAS instruction slightly differently; this disassembly is from MAME):

Code:
cas.w D1, D0, ($4,A4)        0CEC 08A9 0004
move.b D0, D4                1800
move.b #$90, ($1c00,A1)      137C 0090 1C00

So A1 doesn't get initialized to the correct value, but we end up using it as an offset anyway with that final move.b #$90 instruction. The intended code for the Classic II, as seen in the IIvx ROM where they finally increased the size of the jump table, does nothing at all, so we're not really missing out on any vital initialization when it does this.
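The "same bytes, different instructions" effect can be sketched in a few lines of Python. This is not a disassembler: it just takes the eight words from the listings above and regroups them using the instruction lengths shown in those listings, starting either at word 0 (intended) or one word later (what actually happens). `split_stream` is a hypothetical helper name.

```python
# The eight 16-bit words exactly as they appear in the two listings.
ROM_WORDS = [0x2278, 0x0CEC, 0x08A9, 0x0004, 0x1800, 0x137C, 0x0090, 0x1C00]

# (mnemonic, length in words), copied from the listings rather than decoded here.
INTENDED = [
    ("movea.l $cec.w, A1", 2),
    ("bclr #$4, ($1800,A1)", 3),
    ("move.b #$90, ($1c00,A1)", 3),
]
ACTUAL = [
    ("cas.w D1, D0, ($4,A4)", 3),
    ("move.b D0, D4", 1),
    ("move.b #$90, ($1c00,A1)", 3),
]


def split_stream(words, start, layout):
    """Group words into instructions, beginning at word index `start`."""
    result = []
    pos = start
    for mnemonic, length in layout:
        chunk = words[pos:pos + length]
        result.append((mnemonic, " ".join(f"{w:04X}" for w in chunk)))
        pos += length
    return result


for title, start, layout in [("intended", 0, INTENDED), ("actual", 1, ACTUAL)]:
    print(f"{title} (entry at word {start}):")
    for mnemonic, hex_words in split_stream(ROM_WORDS, start, layout):
        print(f"  {mnemonic:<26} {hex_words}")
```

Both groupings end on the same three words (137C 0090 1C00), so the two decodings fall back into step just in time for the final move.b to run with whatever A1 happens to hold.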

Apple would have easily found this bug during development of the Classic II ROM if not for the pesky 68030 hiding it from them by setting A1 to an address that doesn't crash on that final move.b instruction. :cool:
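For the curious, here is a rough Python sketch of why 0CEC 08A9 gets called an invalid CAS. It assumes the CAS layout as I understand it from Motorola's 68000-family programmer's reference manual: opcode word 0000 1ss0 11 + 6-bit effective address, followed by an extension word whose only defined fields are Du (bits 8-6) and Dc (bits 2-0), with every other bit zero. Treat that layout, and the helper name `decode_cas`, as assumptions to check against the manual.

```python
SIZE_NAMES = {1: "b", 2: "w", 3: "l"}   # CAS size field (assumed encoding)
EXT_RESERVED_MASK = 0xFE38              # extension bits 15-9 and 5-3, assumed must be 0


def decode_cas(opcode: int, ext: int) -> dict:
    """Pull the CAS fields out of an opcode word and its extension word."""
    if (opcode & 0xF9C0) != 0x08C0:
        raise ValueError(f"0x{opcode:04X} does not match the CAS pattern")
    size = (opcode >> 9) & 0x3
    if size not in SIZE_NAMES:
        raise ValueError("size field 0 is not a CAS encoding")
    return {
        "size": SIZE_NAMES[size],
        "ea_mode": (opcode >> 3) & 0x7,   # 5 means (d16,An)
        "ea_reg": opcode & 0x7,
        "du": (ext >> 6) & 0x7,           # update operand register
        "dc": ext & 0x7,                  # compare operand register
        "reserved_bits_set": ext & EXT_RESERVED_MASK,
    }


fields = decode_cas(0x0CEC, 0x08A9)
print(fields)
if fields["reserved_bits_set"]:
    print(f"reserved extension bits set: 0x{fields['reserved_bits_set']:04X}")
```

Under that layout the fields come out as a word-sized CAS on (d16,A4) with Dc = D1 and Du = D2, which lines up with the MacsBug reading quoted earlier, and the extension word has bits set that are supposed to be zero. Why MAME prints D0 for the second register is not something this sketch explains.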
 

zigzagjoe

Well-known member
Oh, yeah, I didn't quite catch that. Absolutely unintended behavior then if it's ending up in the middle of an instruction.
 

nickpunt

Well-known member
Whoah this is a nice discovery @dougg3! Looking at your Classic II board, it's got the D62C stepping of the 68030, which is rev B at 1.2 micron. Just for completeness' sake, I wonder if the rev C of 68030 (F91C stepping @ 0.8 micron) which came out years later (PGA/QFP in 95, probably earlier CPGA) had this as well.

You could test it in a IIci w an accelerator and a CPGA 68030 F91C drop-in replacement. The 68030 data sheet (see pg 15) shows no errata, but who knows given this thing is undocumented in the first place.

None of this would change the funny part about the accidental bootability of the Classic II tho :)
 

Andy

Well-known member
Great write up! There are two revisions of the Classic II, one with 4 ROM chips and one with 2. Do the two revisions have the same ROM? I would assume so if the folks at Apple didn't realize there was a bug.

Also very interested if later revisions of the 030 had this undocumented behavior.
 

dougg3

Well-known member
I guess I should point out that I posted a blog post about this discovery today!

Whoah this is a nice discovery @dougg3! Looking at your Classic II board, it's got the D62C stepping of the 68030, which is rev B at 1.2 micron.

Thanks! Is there some kind of reference of all of the steppings, revisions, etc.?

You could test it in a IIci w an accelerator and a CPGA 68030 F91C drop-in replacement.

I have a IIci...definitely could be worth testing!

Great write up! There are two revisions of the Classic II, one with 4 ROM chips and one with 2. Do the two revisions have the same ROM? I would assume so if the folks at Apple didn't realize there was a bug.

Thank you! I could have sworn I read somewhere that rev B still has the same ROM, but I don't have evidence to prove it. I definitely checked to make sure that the stock four ROM chips in mine have the 3193670E checksum. It would be interesting for someone to dump a rev B logic board just to double-check it.
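If someone does dump a rev B board, comparing checksums is easy to script. The sketch below assumes the scheme commonly described for these old Mac ROMs: the first four bytes hold the checksum, and it equals the sum of every big-endian 16-bit word after those four bytes, truncated to 32 bits. That scheme is an assumption here, as are the function names and the `classic2.rom` filename in the usage note; the image must already be a single combined dump, not the four individual chip files.

```python
import struct


def stored_checksum(image: bytes) -> int:
    """Read the 32-bit big-endian value stored in the first four bytes."""
    return struct.unpack_from(">I", image, 0)[0]


def computed_checksum(image: bytes) -> int:
    """Sum all big-endian 16-bit words after the first four bytes, mod 2**32."""
    body = image[4:]
    if len(body) % 2:
        raise ValueError("ROM image length must be even")
    total = 0
    for (word,) in struct.iter_unpack(">H", body):
        total = (total + word) & 0xFFFFFFFF
    return total


def checksum_ok(image: bytes) -> bool:
    """True if the stored and computed checksums agree."""
    return stored_checksum(image) == computed_checksum(image)
```

Usage would look something like `image = open("classic2.rom", "rb").read()` followed by `print(f"{stored_checksum(image):08X}", checksum_ok(image))`; a dump matching the ROM discussed here would be expected to show 3193670E as the stored value.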
 