Testing a 6200 and comparison with 6100

The 040 bus clock is definitely going to be a multiple of the PPC clock... anything else makes the bus adaptation logic much more complicated than it already must be. As Phipli said, Apple almost certainly specified the ASIC for operation at 40MHz speeds, but what configuration (i.e. programmable wait states) is required to support that remains to be seen. Again, I recommend referencing the memc documentation for some examples of how different DRAM grades work into 040 bus cycle timing. Something like 4-2-2-2 is more likely; 2-1-1-1 is the minimum cycle and difficult to achieve.
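To put rough numbers on those burst patterns, here is a minimal sketch of the peak burst bandwidth each would give. It assumes a 40MHz bus clock (the speculated figure above, not a confirmed one) and a 32-bit data bus moving one longword per beat, so a four-beat burst is one 16-byte line. The function name `burst_bandwidth` is made up for illustration.

```python
def burst_bandwidth(bus_hz, pattern, bytes_per_beat=4):
    """Peak bytes/second for back-to-back bursts with the given clock pattern.

    pattern is the number of bus clocks per beat, e.g. (4, 2, 2, 2).
    bytes_per_beat=4 assumes a 32-bit data bus (one longword per beat).
    """
    clocks_per_burst = sum(pattern)
    bytes_per_burst = bytes_per_beat * len(pattern)
    return bus_hz * bytes_per_burst / clocks_per_burst


BUS_HZ = 40_000_000  # assumption: the 40MHz figure discussed above

for pattern in [(4, 2, 2, 2), (2, 1, 1, 1)]:
    rate = burst_bandwidth(BUS_HZ, pattern)
    label = "-".join(str(c) for c in pattern)
    print(f"{label}: {rate / 1e6:.0f} million bytes/s peak")
```

Under those assumptions 4-2-2-2 takes 10 clocks per line and 2-1-1-1 takes 5, so the faster pattern doubles the peak rate. Real throughput would be lower once refresh, arbitration and non-burst accesses are counted.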

The 60MHz clock would be for the Valkyrie private framebuffer DRAM only; that being uncoupled from the bus clock would be normal and expected. Similar to what the Epson chip on my 30video cards does, this should run as fast as practically supported by the DRAM in order to maximize bandwidth.
 
Hi folks,

OK, I've written the test application. It's not very big. Source code and application are included. The results take a bit of interpreting.

My PB1400c is a 603e Mac (as we all know). So, it has a 4-way set-associative, 16kB write-back L1 cache. It also has a 128kB write-through L2 cache. So, I need to test up to "8 sets" in my test to force a flush to L2 (which also forces a flush to what's called the I/O bus on the PB5300/PB1400, which is kinda equivalent to the '040 bus on the 6200). Hence my version of the application is different to the one for the P5200/6200.

Mine's also written in CW11 Gold; I don't know how easily that converts to the later versions people tend to use. Here are my results and my interpretation of them:

Test    Ticks/Loop  Ticks  Rem    Count      Bandwidth (MB/s)
1 Set   129848      93     57871  134217728  134217728*4/1048576/((93+(129848-57871)/129848)/60) = 328MB/s
2 Sets  129738      96     9139   134217728  134217728*4/1048576/((96+(129738-9139)/129738)/60) = 317MB/s
4 Sets  129421      98     95702  134217728  134217728*4/1048576/((98+(129421-95702)/129421)/60) = 313MB/s
8 Sets  129780      62     77761  2097152    2097152*4/1048576/((62+(129780-77761)/129780)/60) = 7.69MB/s
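For anyone who wants to plug in their own numbers, the bandwidth column can be reproduced with a few lines of Python. This is only a sketch of the formula as written in the table, not the test application itself. The parameter names are my own labels for the columns, and it assumes, as the formula does, 60 ticks per second, 4 bytes per write and 1048576 bytes per MB.

```python
TICKS_PER_SECOND = 60   # the table's formula divides by 60
BYTES_PER_WRITE = 4     # 32-bit stores
BYTES_PER_MB = 1048576  # the table's formula uses 2**20 bytes per MB


def bandwidth_mb_per_s(ticks_per_loop, ticks, rem, count):
    """Mirror the table's formula: whole ticks plus a fractional final tick."""
    elapsed_ticks = ticks + (ticks_per_loop - rem) / ticks_per_loop
    seconds = elapsed_ticks / TICKS_PER_SECOND
    return count * BYTES_PER_WRITE / BYTES_PER_MB / seconds


# Rows copied from the table: (test, Ticks/Loop, Ticks, Rem, Count)
rows = [
    ("1 Set", 129848, 93, 57871, 134217728),
    ("2 Sets", 129738, 96, 9139, 134217728),
    ("4 Sets", 129421, 98, 95702, 134217728),
    ("8 Sets", 129780, 62, 77761, 2097152),
]

for test, ticks_per_loop, ticks, rem, count in rows:
    mb_s = bandwidth_mb_per_s(ticks_per_loop, ticks, rem, count)
    print(f"{test}: {mb_s:.3g} MB/s")
```

Results from a P5200/6200 run can be dropped into `rows` in the same column order for a side-by-side comparison.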

So, what we see here is a demonstration of how great L1 cache is and the dramatic difference between L1 cache memory and the main I/O bus on a PB1400c. But are the L1 cache values realistic? Well, the PB1400c runs at 166MHz. 328MB/s represents about 86M x 32-bit writes/s (328 x 1048576 / 4), equivalent to about 2 cycles per store instruction, which is probably correct. It also looks like there's a slight penalty for accessing different sets, which is interesting.

And my guess is that the poor main RAM performance is because the PB1400c's bus is just 32 bits wide and uses pseudo-static RAM. 7.69MB/s is about 2.0M 32-bit bus cycles per second, for an average of roughly 500ns per bus cycle. I mean, that's bad, huh?
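The two sanity checks (CPU cycles per store from the L1 figure, and time per bus cycle from the main RAM figure) can be sketched like this. It assumes 2^20-byte MB as in the bandwidth formula, 4 bytes per write, and that each 32-bit write to main RAM costs one bus cycle. The function names are made up for illustration. With 2^20-byte MB the results come out slightly different from a quick divide-by-four estimate.

```python
BYTES_PER_MB = 1048576  # assumption: same 2**20-byte MB as the bandwidth formula
BYTES_PER_WRITE = 4     # 32-bit stores


def writes_per_second(bandwidth_mb_s):
    """Convert a bandwidth in MB/s into 32-bit writes per second."""
    return bandwidth_mb_s * BYTES_PER_MB / BYTES_PER_WRITE


def cpu_cycles_per_write(bandwidth_mb_s, cpu_hz):
    """Average CPU clocks spent per 32-bit store at the given bandwidth."""
    return cpu_hz / writes_per_second(bandwidth_mb_s)


def ns_per_write(bandwidth_mb_s):
    """Average nanoseconds per 32-bit write at the given bandwidth."""
    return 1e9 / writes_per_second(bandwidth_mb_s)


CPU_HZ = 166_000_000  # PB1400c clock, as stated in the post

print(f"L1 (328MB/s): {cpu_cycles_per_write(328, CPU_HZ):.2f} CPU cycles per store")
print(f"Main RAM (7.69MB/s): {ns_per_write(7.69):.0f} ns per 32-bit write")
```

Either way the conclusion holds: about 2 CPU cycles per store when hitting L1, and on the order of half a microsecond per 32-bit write once the test spills to main RAM.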

The version supplied will go through up to 8 sets and then wait for you to press the mouse button. When you do, it'll run the Valkyrie VRAM test, writing directly to what I believe are the VRAM addresses beginning at $F9000000. This could be totally wrong: for all I know those are its I/O regs, so it could be a major disaster (@noglin ... are the I/O regs there, or the screen memory itself?). Perhaps I should just take the address of ScreenBits; then it would also work for my Mac. So, it's probably best to Restart the Mac instead of pressing the mouse button the first time, i.e. don't press the mouse button at all, just perform a physical Restart. The next version will have a real event loop, not a wait for Button()!

I intend to build and submit a version that should work for the 630 at some point too. Still, this version should be useful for comparing your P6200 with my PB1400.
 
