Let's see your best disk speeds (PPC) (68k)

herd

Well-known member
I've tested that too. Even in 60x mode the G3 throughput is much lower than the G4. Adding MPX mode makes the difference bigger, but even earlier bridge chips with no support for MPX have much higher speeds with a G4.

68kmla.org/bb/index.php?threads/interest-in-adapting-744x-to-7400.41677/post-536211

I don't think the difference is the OS or compiler either, because I've done testing with linux and different compilers and compiler options too. Back to back tests with G3 code and G3 kernel builds will run faster on a G4 chip. Mysterious...
 

herd

Well-known member
Ok, I think I figured it out. There is a hardware explanation for what I've been seeing in various testing. A G4 is 3x faster than a G3. I missed that memo.


68kmla.org/bb/index.php?threads/ppc750gx-vs-ppc750gl.39043/post-554049

Sorry if I derailed the disk thread a bit...
 

joevt

Well-known member
I would definitely be interested in some sort of fast and bootable storage for a G4 machine. SATA, SAS, nvme, etc... any of these should be able to max out 64-bit PCI.
We could port an NVMe or XHCI driver to PowerPC Mac OS X.
GenericUSBXHCI has source code for later versions of macOS.
The NVMe website has source code.

The problem with these PCIe solutions is the lack of PCI-X to PCIe adapters. Only 32-bit 33 MHz adapters are currently sold, which is fine for Old World Power Macs but not for New World Macs, which have 64-bit and/or 66 or 133 MHz slots. There are some appropriate PCI-X to PCIe bridge chips:
https://www.mouser.ca/datasheet/2/115/PI7C9X130-1140658.pdf
https://www.mouser.com/datasheet/2/38/PEX_8114-PBv5_0-11-06-07-471439.pdf

The fastest PCI-X slot in a Mac is 64-bit 133 MHz = 1067 MB/s.
No Mac implemented PCI-X 2.0.

The fastest AGP slot in a Mac is 32-bit 66 MHz 4X = 1067 MB/s.
No Mac implemented AGP 8X.

PCIe 1.0 x4 is 1000 MB/s.
(using decimal MB instead of binary MiB)
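The peak figures above follow from bus width times clock times transfers per clock. Here is a rough sketch of that arithmetic. The function names are made up for illustration, the nominal 66/133 MHz clocks are taken as 66.67/133.33 MHz, and the PCIe figure assumes 2.5 GT/s per lane with 8b/10b encoding:

```python
def parallel_bus_mb_s(width_bits, clock_mhz, transfers_per_clock=1):
    """Theoretical peak of a parallel bus in decimal MB/s."""
    return width_bits / 8 * clock_mhz * transfers_per_clock


def pcie_gen1_mb_s(lanes):
    """Theoretical peak of a PCIe 1.0 link, per direction, in decimal MB/s.

    Assumes 2.5 GT/s per lane and 8b/10b encoding
    (8 data bits carried per 10 bits on the wire).
    """
    line_rate_mbit_s = 2500
    return line_rate_mbit_s * 8 / 10 / 8 * lanes


if __name__ == "__main__":
    print(round(parallel_bus_mb_s(64, 400 / 3)))     # PCI-X 64-bit 133 MHz -> 1067
    print(round(parallel_bus_mb_s(32, 200 / 3, 4)))  # AGP 4X 32-bit 66 MHz -> 1067
    print(round(pcie_gen1_mb_s(4)))                  # PCIe 1.0 x4          -> 1000
```

These are theoretical ceilings only; protocol overhead and the half-duplex nature of PCI are not modelled.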

PCIe is bidirectional (full duplex?). PCI is not (half duplex). I don't know if that's an issue for these interfaces.

That's its theoretical speed, or can you measure it somehow? Every measure of G3 memory throughput that I've seen was unusually low. I don't know why. For example, here is xbench on a Pismo before/after swapping the CPU chip. Note how different the memory speed is, with no other change:
I would like to make a benchmark to measure this.

Code:
memory access benchmark options:

OS          :   classic, OS X
bench       :   register move ( 32bit, 64bit, string, altivec ); 
where       :   RAM, ROM, VRAM 
direction   :   read, write
timer       :   seconds transition, timebase/rtc
interrupts  :   on, off (classic OS and timebase/rtc timer only)

other benchmarks : md5_or_sha1 ; dingusbench ; fft
The "seconds transition" timer is for DingusPPC emulator (and maybe some other emulators) which has no accurate clocks except the time of day. Basically, a benchmark would start when the time changes (the seconds). Then it would end after a few seconds.
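The "seconds transition" timer idea could look roughly like this. It is a sketch in Python rather than the benchmark's own code. The function name and the injectable `now` parameter are assumptions for illustration; the only clock it relies on is one with whole-second resolution:

```python
import time


def bench_between_second_ticks(work, duration_s=3, now=time.time):
    """Return calls of work() per second, timed only by whole-second ticks.

    now() is any clock returning seconds; only its integer part is used,
    which mimics an emulator whose sole trustworthy clock is the time of day.
    """
    # Wait for the seconds value to change so the run starts right on a tick.
    last = int(now())
    while True:
        tick = int(now())
        if tick != last:
            break

    # Run the workload until duration_s further ticks have gone by.
    end_tick = tick + duration_s
    calls = 0
    while int(now()) < end_tick:
        work()
        calls += 1
    return calls / duration_s
```

For example, `bench_between_second_ticks(lambda: bytearray(1 << 20), duration_s=3)` would report how many 1 MiB buffers get zero-filled per second. The clock is polled once per call, so `work` should be large enough that the polling cost does not dominate.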
 

chriscarruthers

New member
Power Mac G5 (Late 2005):
  • 2.0GHz Dual Core
  • 8GB RAM (2GB x4)
  • GeForce 7800 GT 256MB
  • 250GB OWC Mercury Electra 3G SSD on internal SATA (in NewerTech AdaptaDrive)
Xbench 1.3 on Mac OS X Tiger 10.4.11:
Test                         Score     Throughput
Disk Test                    310.97
  Sequential                 210.67
    Uncached Write (4K)      244.37    150.04 MB/sec
    Uncached Write (256K)    222.16    125.70 MB/sec
    Uncached Read (4K)       151.61     44.37 MB/sec
    Uncached Read (256K)     263.33    132.35 MB/sec
  Random                     593.56
    Uncached Write (4K)      453.84     48.04 MB/sec
    Uncached Write (256K)    391.63    125.37 MB/sec
    Uncached Read (4K)      2475.74     17.54 MB/sec
    Uncached Read (256K)     633.62    117.57 MB/sec
 

DarthNvader

Well-known member
The fastest AGP slot in a Mac is 32-bit 66 MHz 4X = 1067 MB/s.
No Mac implemented AGP 8X.
All the G5's have 8X AGP???
Some of the 64-bit PCI cards can approach the theoretical maximum of 267MB/s in a G4, but I have not found a way to boot from them. With linux I think there is a way to boot on something slow and then change over.

68kmla.org/bb/index.php?threads/g4-raid.40817/

In a G3 (or earlier PPC?), I would be surprised to see anything over 100MB/s because not even the RAM is that fast.
I've seen benchmarks of the B&W hitting near the 266MB/s on 64 bit PCI RAID controllers.

If the RAM can't move data that fast, what is it, magic?
Would have rather used a larger size ... not that it would probably make a ton of difference. But who knows?
Generally speaking the lower numbers, 32k-1024k are going to be some of your slowest benchmarks.

Try and use a real benchmarking tool, not XBench.


A 64-bit bus at 100MHz has a theoretical peak speed of 800MB/s (8 bytes at 100MHz). So I'm guessing the 266MB/s that you mentioned is the theoretical peak speed of a 33MHz bus?

I forgot the system bus was a 64-bit data path.
Let me give you another example. I have a 1.1GHz 750G G3 CPU in an AGP with a 100MHz bus (theoretical speed of 800MB/s). Using the AJA System Test with file system cache enabled basically results in a RAM test until the file size exceeds the available RAM. Even with the super G3, the throughput is around 100MB/s. Swapping in a 300MHz 7400 G4 CPU (no other change), results in 350MB/s.

AltiVec has always sped up memory operations.


This is why I think that even with a fast 64-bit PCI card for disk IO (267MB/s peak) a G3 machine will not be able to move data any faster than what it can do with RAM.
We don't really have a great benchmark for system RAM for PPC on the Mac OS; if @joevt can write us one, we may get some idea of how fast the memory is really moving.

Disk IO and RAM are always related when we have DMA involved, so you're not side tracking things.

With everything up to the G5 we only had SDRAM (except the OW), so it just runs at the system bus speed over a 64-bit path, theoretically. Real-world tests show we rarely get 266MB/s out of the lower-end NWMs.

The Beige G3 was kind of a dog, the B&W/Yikes were better, then they started cranking up the bus for real.

With drive controllers on a PCI bus there is a lot of overhead, and only ATTO really did great work there on PPC.
 

GorfTheChosen

Well-known member
Couple more:

G4 800 Dual, OS 9.2.2, 1.5GB, 372GB Hardware RAID 0 array on an ACARD 6880:

G4 800 Dual RAID 0.jpg

This one kind of surprised me:

G4 450 Dual, OS 10.4.11, 2GB, 137 GB Software RAID 0 array on an ACARD 6260 using OS X's striping function:

G4 450 Dual RAID 0.jpg

Probably not all that surprising though ... I think the drives in the 800 are 7200 rpm (vs the drives in the 450 which are 5400) ... plus it's a faster card + faster machine.

File sharing turned off in both cases.
 

nathall

Well-known member
I would like to see the Memory Tests in that screenshot.

@joevt

Here you go. Sorry for the relatively poor image quality; taking photos of a period-correct Trinitron is a game of trial and error.

EDIT: The RAM is 1GB of Micron Technology chips with a speed of 50ns.


9D8B6D28-ED17-407C-A0FF-A93D14DD4D30.png
 

joevt

Well-known member
Made a tiny benchmark app.
Only tests write 32-bit and write 64-bit so far.

Should work on any Power Mac with 16 MB of RAM. I tested it in Mac OS 9.0.4 but not on a real Mac yet.
Built with CodeWarrior Pro 4. The optimizer is good enough to make each write a single PPC instruction. I manually unrolled the loop more than the optimizer would.
 

Attachments

  • joevtbenchmark.zip
    105.3 KB

GorfTheChosen

Well-known member
Got the iMac G5's malfunctioning HDD replaced, so here are the numbers for that one.

iMac G5 @ 1.9 GHz, 4.5GB, 250 GB SPCC SSD on the built-in SATA bus:

iMac G5.jpg
 

nathall

Well-known member
Attached are results from the same PM 8500/G3 I used for Xbench, a PM 6300/120, and a PowerBook 1400/133.

EDIT: Under OS 9.2.2, 9.1 and 8.6, respectively.
 

Attachments

  • C5CC64A0-DAAD-4794-820B-8BA99D51A91A.jpeg
    3.1 MB
  • A428E7CB-CB89-473F-A6D6-7452E4FC9A51.jpeg
    3.1 MB
  • 2FAC9163-F40F-481D-B076-18BC2C848F26.jpeg
    3.5 MB

joevt

Well-known member
Hmm, doesn't that seem terrible compared to the Xbench numbers?
Are the caches and such fully enabled on the PM 8500/G3 in Mac OS 9.2.2?
I'll make changes and do more testing.
 

joevt

Well-known member
I probably need to add some dcbz instructions. When you write to an address that is not in cache, the CPU first has to read the whole cache line around that address from RAM into cache. Then the data in cache is modified, and the line is written back to RAM later. The dcbz instruction instead allocates the line in cache and fills it with zeros without fetching it from RAM, thus avoiding the read. That's my theory. I'll test it later today.
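To make the theory concrete, here is a simplified model of bus traffic for one fill pass, with and without dcbz. It is only a sketch: the function name is made up, the 32-byte line size is assumed for G3/G4-class CPUs, and the model assumes a line-aligned buffer that starts out uncached and has every dirty line written back exactly once.

```python
CACHE_LINE_BYTES = 32  # assumed L1 line size for G3/G4-class PowerPC CPUs


def fill_bus_traffic(buffer_bytes, use_dcbz, line_bytes=CACHE_LINE_BYTES):
    """Return (bytes_read, bytes_written) on the memory bus for one fill pass."""
    lines = -(-buffer_bytes // line_bytes)  # ceiling division
    # Every line touched by the fill is eventually written back once.
    written = lines * line_bytes
    # Without dcbz, each store miss first fetches the whole line from RAM.
    # With dcbz, the line is established in cache as zeros, so nothing is read.
    read = 0 if use_dcbz else lines * line_bytes
    return read, written
```

In this model plain stores move twice as many bytes over the bus as the dcbz version, so the best case is a 2x gain. Real hardware can differ in either direction.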
 

DarthNvader

Well-known member
Code:
write32 size:10481664 iterations:1   9.915 ms = 1057. MB/s
write32 size:10481664 iterations:2   1.398 ms = 7498. MB/s
write32 size:10481664 iterations:4   1.394 ms = 7519. MB/s
write32 size:10481664 iterations:8   1.371 ms = 7647. MB/s

write64 size:10481664 iterations:1   1.042 ms = 1.006e+04 MB/s
write64 size:10481664 iterations:2   0.9165 ms = 1.144e+04 MB/s
write64 size:10481664 iterations:4   0.8535 ms = 1.228e+04 MB/s
write64 size:10481664 iterations:8   0.8525 ms = 1.230e+04 MB/s
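The MB/s column in this output appears to be the buffer size divided by the time in ms, in decimal megabytes. That reading assumes the ms figure is the time for one pass over the buffer. A quick check (the helper name is hypothetical):

```python
def mb_per_s(size_bytes, ms_per_pass):
    """Decimal MB/s for one pass over size_bytes taking ms_per_pass milliseconds."""
    seconds = ms_per_pass / 1000
    return size_bytes / seconds / 1e6


if __name__ == "__main__":
    # First write32 line: 10481664 bytes in 9.915 ms
    print(f"{mb_per_s(10481664, 9.915):.0f} MB/s")
```

Small differences from the printed figures are expected, because the ms values shown are rounded to four significant digits.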
 

joevt

Well-known member
Code:
write32 size:10481664 iterations:8   1.371 ms = 7647. MB/s

write64 size:10481664 iterations:8   0.8525 ms = 1.230e+04 MB/s
What computer is that from?

I can get 78 MB/s from my Power Mac 8600 (1GiB RAM, G4 1GHz).
Using dcbz, I can get 192 MB/s.

With a G4 1GHz in my B&W G3 (768 MiB RAM), I can get ≈162 MB/s for all three tests (bus speed is 66MHz instead of the normal 100MHz). Not sure why dcbz didn't improve that. I need to check CPU settings.

Both are running Mac OS 9.2.2.

The ppc xnu bcopy.s and bzero.s files have fast copy and fill algorithms for 32-bit and 64-bit PowerPC CPUs.
 

DarthNvader

Well-known member
It was QEMU running Mac OS 9.2.1 on my M2 MacBook, with -cpu 7410.
 

nathall

Well-known member
I tried running it on the 8500/400MHz G3 under the Classic environment on 10.2 to see if it made any difference, and the answer is no, not really:

85EDA987-F383-43D3-AF16-1BF6A9FE6748.png

Want to put up a copy of the new one I can try?
 

joevt

Well-known member
This one adds the dcbz test.
Renamed the previous tests to the names of the instructions they use.
There's a ### ms = ### s line that compares the Microseconds timer to the time-of-day (seconds) timer. DingusPPC doesn't yet have a Microseconds timer that tracks real time. SheepShaver seems to be a closer match, but I should check that with a different app that runs longer to get a more accurate comparison.
 

Attachments

  • joevtbenchmark 2.zip
    107 KB