David Cook
BlockMove is a Macintosh operating system function that copies memory from one location to another. It does the same job as the C library memmove function (that is, memcpy that also handles overlapping ranges correctly). It is commonly used to copy one structure to another, but it can also shift a chunk of memory up or down, such as when deleting or inserting an element in an array.
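To make the array case concrete, here is a minimal sketch in portable C. It assumes memmove as a stand-in for BlockMove, since both tolerate overlapping ranges. The names block_move, insert_at and delete_at are hypothetical helpers made up for this example, not Toolbox calls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for BlockMove(src, dst, byteCount).
   memmove is used because it handles overlapping ranges. */
static void block_move(const void *src, void *dst, size_t byte_count)
{
    memmove(dst, src, byte_count);
}

/* Insert value at index by shifting the tail up one element.
   Assumption: the caller has room for one more element. */
static void insert_at(long *items, size_t *count, size_t index, long value)
{
    assert(index <= *count);
    block_move(&items[index], &items[index + 1],
               (*count - index) * sizeof items[0]);
    items[index] = value;
    *count += 1;
}

/* Delete the element at index by shifting the tail down one element. */
static void delete_at(long *items, size_t *count, size_t index)
{
    assert(index < *count);
    block_move(&items[index + 1], &items[index],
               (*count - index - 1) * sizeof items[0]);
    *count -= 1;
}
```

Both shifts move overlapping ranges, which is exactly the case where a plain memcpy would be unsafe.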
In an earlier thread, I examined the performance of NewPtrClear. I was surprised by how simplistic it was, and by how much it could be optimized for clearing larger ranges. The conclusion was that applications must rarely clear large chunks of memory, so a more complicated NewPtrClear would not have been beneficial in practice, and Apple didn't implement one. BlockMove turns out to be a different story.
1. In the ROM source code BlockMove.a, check out what Apple developers concluded about the usage of BlockMove:

The reference to a 512K Mac suggests the analysis was performed in the early years.
2. When you look at the code, it is crazy optimized, hand-written assembly. It really takes advantage of unusual instructions and of the differences between 68K processors. So, Apple must have seen usage on larger chunks. Moves of 12 bytes or less go through registers, and larger moves use crazy branches and unrolled loops for each pass of 32 bytes or less.
I believe the code is so good that nothing I write in C is going to beat it.
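For readers who don't read 68K assembly, here is a rough sketch of the size-tiered shape described above. It is not Apple's code and makes no performance claim. The 12-byte and 32-byte thresholds come from the post; the name tiered_copy, the forward-only direction, and the use of a fixed-size memcpy to stand in for an unrolled run of moves are all assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { SMALL_LIMIT = 12, CHUNK = 32 };

/* Hypothetical forward-only copy. Assumption: src and dst do not overlap. */
static void tiered_copy(const unsigned char *src, unsigned char *dst, size_t n)
{
    assert(src != NULL && dst != NULL);

    if (n <= SMALL_LIMIT) {
        /* Tiny move: plain byte loop, standing in for the register path. */
        for (; n > 0; n--)
            *dst++ = *src++;
        return;
    }

    /* Bulk move: one 32-byte pass per iteration. The fixed-size memcpy
       stands in for an unrolled sequence of move instructions. */
    while (n >= CHUNK) {
        memcpy(dst, src, CHUNK);
        src += CHUNK;
        dst += CHUNK;
        n -= CHUNK;
    }

    /* Tail: whatever is left after the last full pass. */
    for (; n > 0; n--)
        *dst++ = *src++;
}
```

The point is only the structure: a cheap path for tiny moves, a fixed-size inner pass for the bulk, and a cleanup step for the remainder.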
3. However, BlockMove has a side effect that hurts performance. It usually clears the CPU instruction cache after completion. Presumably, this is because BlockMove sometimes moves memory containing code under circumstances that most programmers don't expect, and stale cached instructions would result in crashes.
But, BlockMove doesn't clear the instruction cache for chunks 12 bytes or smaller. Why? Is there something special about that number? Nope:

That's right --- they just guess it doesn't move code.
In late 1992 code, Apple introduced a special variation of BlockMove called BlockMoveData that does not clear the instruction cache. If you are certain you aren't moving code, this is the faster call.
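The flush policy described above can be sketched in portable C. This is a simulation under stated assumptions, not the ROM logic: flush_instruction_cache is a hypothetical stand-in that only bumps a counter, and block_move_sketch / block_move_data_sketch are made-up names that mirror the behavior the post attributes to BlockMove and BlockMoveData. Only the 12-byte cutoff comes from the post.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { NO_FLUSH_LIMIT = 12 };  /* cutoff taken from the post */

static unsigned long g_flush_count = 0;  /* counts simulated flushes */

/* Hypothetical stand-in for an instruction cache flush. */
static void flush_instruction_cache(void)
{
    g_flush_count++;
}

/* Shared core: copy, then flush unless the caller promised "data only"
   or the move is small enough to be assumed not to contain code. */
static void move_core(const void *src, void *dst, size_t n, bool data_only)
{
    assert(src != NULL && dst != NULL);
    memmove(dst, src, n);
    if (!data_only && n > NO_FLUSH_LIMIT)
        flush_instruction_cache();
}

/* Mirrors BlockMove as described: flushes for moves larger than 12 bytes. */
static void block_move_sketch(const void *src, void *dst, size_t n)
{
    move_core(src, dst, n, false);
}

/* Mirrors BlockMoveData as described: never flushes. */
static void block_move_data_sketch(const void *src, void *dst, size_t n)
{
    move_core(src, dst, n, true);
}
```

In this model, a tight loop of large data-only copies pays for a flush on every call through the first entry point and for none through the second, which is the whole appeal of the newer call.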

This option is not mentioned in any of the original Inside Macintosh books, nor in the follow-up Inside Macintosh: Memory. The books were published before Apple shipped a ROM or system with this code. Because BlockMoveData is just the BlockMove trap with an option bit set, presumably you can call it on a ROM or system that predates the new code and it will simply execute the normal BlockMove trap. Apple updated the Universal Header Memory.h at some point:

4. In 1990/1991, the BlockMove code contained NOP hacks to compensate for defective 68040s.

The 1993 code still contains NOPs throughout the ROM. I guess this is a different defect that made it into shipping Macintosh CPUs.

- David