SAS 1068e bootable on G5 Quad but PCIe weirdness: Ethernet stops

Melkhior

Well-known member
Hello,

This might be of interest to some of you, so I'm reposting some old messages I sent to the 'rescue@sunhelp.org' mailing list. Brief version:
* The SAS controller is flashed with a Sun ROM to get OpenBoot/OpenFirmware support
* I did get my G5 Quad to boot from a SAS drive on a SAS controller, but as long as the PCIe board is plugged in, the built-in Ethernet ports stop working

Maybe someone here will have an idea of why this Ethernet issue happens.

LSI1068 PCI-X bootable on Sun, howto & SAS 1068e bootable on G5, PCIe weirdness.

First message:
So, for those interested:

a) get a 1068 (or 1064 presumably)-based PCI-X device, put it in a PC
(... with a suitable slot, but 64 bits is not required, the
32-bit/33 MHz slot in my X8DTL did just fine for my HP board).

b) flash it to the latest and greatest IT firmware P21 (didn't try IR,
might work) using a PC (you might want to go for FreeDOS so that you
can safely erase the Flash to move from IR to IT, sometimes 'sasflash'
is annoying), this gives you FW Ver 01.33.00.00 and x86-BIOS
06.36.00.00. Plenty of resources on flashing LSI boards on the
internet if you have doubts.

c) get yourself Solaris patch 122165-02, extract it, there's a
'370-7696-04_FCode_1_00_40.ROM' in it. That's what's needed for the
1064/1068, there could be newer versions but I didn't look further.

d) flash it with "sasflash -c <mycontroller> -b
370-7696-04_FCode_1_00_40.ROM", exactly like a BIOS file

e) shut down the PC and move the card to your PCI-X Sun, power up to
the PROM, check with 'probe-scsi-all' ; I get a new
'/pci@1c,60000/LSILogic,sas@2' bus :) It wasn't there with the x86
BIOS, so the upgrade did something

f) shut down and plug in a SAS drive (SATA might have a shot in the
future, but I have a stack of 2 TB SAS drives so one step at a time),
probe-scsi doesn't see it but probe-scsi-all does

g) boot from the Debian install CD-ROM, start installing to the SAS
drive, no issue (didn't try Solaris)

h) fix the boot-device to point to the SAS drive, type boot, wait a
bit more than for the SCA drive (seems the OF driver isn't fast) to
load grub and then the kernel, and eventually the kernel loads a
driver and everything is fine :)
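
For anyone unfamiliar with the path printed in step e): Open Firmware device paths are a chain of `name@unit-address` components separated by `/`, optionally followed by `:arguments`. Below is a minimal sketch that splits such a path, handy when comparing `probe-scsi-all` output or building a `boot-device` value. The `Node` type and `parse_of_path` helper are names made up for this illustration; the sample path is the one from the post, written with `@` as Open Firmware prints it.

```python
from typing import List, NamedTuple, Optional


class Node(NamedTuple):
    name: str            # node name, e.g. "LSILogic,sas"
    unit: Optional[str]  # unit address after '@', e.g. "2"; None if absent
    args: Optional[str]  # device arguments after ':'; None if absent


def parse_of_path(path: str) -> List[Node]:
    """Split an Open Firmware device path into its components (hypothetical helper)."""
    nodes = []
    for comp in path.strip("/").split("/"):
        if not comp:
            continue
        # Arguments, if any, follow the first ':' of the component.
        comp, _, args = comp.partition(":")
        # The unit address follows the '@'; it may itself contain commas.
        name, _, unit = comp.partition("@")
        nodes.append(Node(name, unit or None, args or None))
    return nodes


if __name__ == "__main__":
    for node in parse_of_path("/pci@1c,60000/LSILogic,sas@2"):
        print(node)
```

Note that the comma in `LSILogic,sas` belongs to the node name (vendor prefix), while the comma in `1c,60000` belongs to the unit address, which is why the split is done on `@` first.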

Second message:
So I figured, if it works on Suns, it should also work on other
OpenFirmware-based systems, right?
So I gave it a go with a PCIe board with a 1068e in it. Flashed the
same ROM file from 122165-02.zip, put it in a G5 Quad, installed OSX
10.5 on it - no issue whatsoever with the drive, the G5 reboots fine
on the SAS drive.

However - no network. Neither on-board GbE port sees a carrier, whether the
cable is plugged in beforehand or after boot. OSX sees them fine, just
no carrier... So I rebooted from the on-board SATA - same thing, even with no
drive plugged. Remove the 1068e, the link is brought up during POST
(the light on the switch comes up just before the start-up 'booong'
noise). Put back the 1068e, no carrier. Try Debian, no carrier either.
Remove the Fcode-1068e, put one with the original x86 firmware (OSX
can use those but not boot from them) - same: no carrier! Try the
Fcode board in a different PCIe slot - still no carrier. Remove the
board, everything's fine again.

Does anyone have a suggestion about what's going on? I don't see why adding a
PCIe board would affect the on-board GbE. It doesn't really matter as
it was purely an experiment (G5s accept SATA drives anyway so plenty
of options for them, G4s have only 5V PCI slots and I've never seen a
5V-compatible SAS board, only 3.3V-only PCI[-X], so it's not a
solution to replace the PATA drive in them, sadly), but I'm curious. I
don't remember G5s having PCIe issues back in the day.
 

Franklinstein

Well-known member
Interesting. I had read that there were a few attempts to get SAS into a G5 but they were non-bootable, sadly.

Maybe the problem is the use of hard-coded configs that conflict? Like the Ethernet controller wants to be ID 5 but the SAS card is also ID 5 and it overrides because it's earlier in the chain. I don't know if you can look at the device tree in OpenFirmware and compare when you have the SAS card installed vs. when it's not. You may be able to edit the SAS card's ROM and reflash if that's the case.
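
One way to do that comparison, as a rough sketch: save a text dump of the device tree once without the card and once with it (for example from `dev / ls` in Open Firmware, or `ioreg -p IODeviceTree` under OS X), then diff the two. The `tree_diff` helper is a made-up name and the sample dumps use placeholder node names, not real G5 paths.

```python
import difflib
from typing import List


def tree_diff(without_card: str, with_card: str) -> List[str]:
    """Return only the lines that differ between two device-tree dumps.

    Lines prefixed '- ' exist only in the dump taken without the card,
    lines prefixed '+ ' only in the dump taken with it.
    """
    a = [line.rstrip() for line in without_card.splitlines()]
    b = [line.rstrip() for line in with_card.splitlines()]
    return [d for d in difflib.ndiff(a, b) if d.startswith(("+ ", "- "))]


if __name__ == "__main__":
    # Placeholder dumps for illustration only (not real G5 device paths).
    without_card = "/bus@0\n/bus@0/ethernet@1\n"
    with_card = "/bus@0\n/bus@0/ethernet@1\n/bus@0/sas@2\n"
    for line in tree_diff(without_card, with_card):
        print(line)
```

If the dumps include properties, a changed value on an Ethernet node would show up as a paired `- ` / `+ ` line, which is the kind of conflict worth looking for.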
 

Melkhior

Well-known member
PCI[e] is quite good at avoiding configuration issues (after the mess that was PC bus/ISA/EISA/MCA/... they finally got it reasonably right, at least on the hardware side; firmware is still a CPU-dependent mess to this day). And the Ethernet devices are still visible to the operating systems - they just don't see or don't report a carrier signal, for some unknown reason. It might be some weird electrical issue.

I don't have a PCI-X G5 to check whether a Sun-flashed PCI-X SAS board would be bootable and whether it would cause some weird issue(s) as well. 1068-based SAS PCI boards are still available and cheap on eBay, so it might be an option. But it's not super useful as all G5s support SATA anyway.

It would be a lot more useful in a G4, but annoyingly enough the G4's 64-bit PCI slots are 5V only (at least on my MDDs) while all SAS boards I've seen are 3.3V only, like PCI-X itself. Apparently pre-X 64-bit PCI slots were rare enough, in particular 5V-only ones, that nobody bothered to make a dual-voltage SAS board.
 

demik

Well-known member
I've a feeling that the G5 PCI(e) implementation is not correct / incomplete.

I get the internal NIC to hang on single G5s if I increase the RAM above 2 GB. Confirmed this with 3 logic boards…
On a dual 2.0, using a FibreChannel PCI-X card can hang the SATA controller under heavy load.

The same card is working fine on a G4.
 

Melkhior

Well-known member
demik said: "I get the internal NIC to hang on single G5s if I increase the RAM above 2 GB"
That looks like a 32/64-bit addressing issue; a 2/4 GiB ceiling is rather common for 32-bit DMA.
Is that OSX or Linux or both? Normally such issues can be solved by having a bounce buffer in the lower 2 GiB for the DMA, so it's mostly an OS issue. The RPi4 had such an issue IIRC (checking... it does).
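
To make the bounce-buffer idea concrete, here is a toy model (not kernel code) of the check an OS would do before handing a buffer to a device with a limited DMA address range: if any byte of the buffer lies above the highest address the device can reach, the data has to be copied through a buffer in low memory first. The 31-bit mask and the helper names are assumptions for illustration; the actual limit of the G5's NIC is not established in this thread.

```python
GIB = 1 << 30


def dma_mask(bits: int) -> int:
    """Highest physical address reachable by a device with `bits` address bits."""
    return (1 << bits) - 1


def needs_bounce(phys_addr: int, length: int, mask: int) -> bool:
    """True if any byte of [phys_addr, phys_addr + length) lies above `mask`."""
    last_byte = phys_addr + length - 1
    return last_byte > mask


if __name__ == "__main__":
    mask = dma_mask(31)  # hypothetical device with a 2 GiB ceiling
    # A page at 1 GiB is directly reachable by the device.
    print(needs_bounce(1 * GIB, 4096, mask))
    # A page at 3 GiB is out of reach and must go through a bounce buffer.
    print(needs_bounce(3 * GIB, 4096, mask))
    # A buffer straddling the 2 GiB boundary is also out of reach.
    print(needs_bounce(2 * GIB - 2048, 4096, mask))
```

With 2 GB of RAM or less, every buffer passes this check, which would fit the symptom of trouble only appearing once more memory is installed.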

The SATA hang looks more like a real design problem. "hang under heavy load" is never a good sign :-( Mine is closer to that I think, with the Ethernet devices not seeing the carrier.

My Quad has a full complement of 16 GiB of ECC memory in it, so maybe I should try again anyway with a lower amount and see what happens.
 