Interesting finds in BlockMove

David Cook · Jan 14, 2024

BlockMove is a Macintosh operating system function that copies memory from one location to another. It does the same job as the C library memcpy function (strictly speaking memmove, since BlockMove copes with overlapping ranges). It is commonly used to copy one structure to another, but it can also shift a chunk of memory up or down, such as when deleting or inserting an element in an array.
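To make the array case concrete, here is a minimal sketch in portable C. It uses memmove as a stand-in for BlockMove, because the source and destination ranges overlap when shifting. The function names delete_at and insert_at are made up for illustration. Note that BlockMove takes its arguments as (source, destination, byteCount), the reverse of memmove's (destination, source, count).

```c
#include <stddef.h>
#include <string.h>

/* Remove arr[index] by sliding the tail down one slot.
 * Returns the new element count. */
size_t delete_at(long *arr, size_t count, size_t index)
{
    if (index >= count) {
        return count;                       /* nothing to delete */
    }
    /* Overlapping ranges: memmove here, BlockMove on the Mac. */
    memmove(&arr[index], &arr[index + 1],
            (count - index - 1) * sizeof arr[0]);
    return count - 1;
}

/* Insert value at arr[index] by sliding the tail up one slot.
 * The caller must guarantee room for count + 1 elements.
 * Returns the new element count. */
size_t insert_at(long *arr, size_t count, size_t index, long value)
{
    if (index > count) {
        index = count;                      /* clamp to "append" */
    }
    memmove(&arr[index + 1], &arr[index],
            (count - index) * sizeof arr[0]);
    arr[index] = value;
    return count + 1;
}
```

On a classic Mac the two memmove calls would become BlockMove(&arr[index + 1], &arr[index], bytes) and BlockMove(&arr[index], &arr[index + 1], bytes) respectively, with the arguments swapped as noted above.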

In an earlier thread, I examined the performance of NewPtrClear. I was surprised how simplistic it was, and how it could be optimized for clearing larger ranges. The conclusion was that applications must rarely clear large chunks of memory -- so a more complicated NewPtrClear would not be beneficial in practice. So, Apple didn't implement one.

1. In the ROM source code BlockMove.a, check out what Apple developers concluded about the usage of BlockMove:

The reference to a 512K Mac suggests the analysis was performed in the early years.

2. When you look at the code, it is crazy optimized in hand-written assembly. It really takes advantage of unusual instructions and differences in 68K processors. So, Apple must have seen usage on larger chunks. They use registers for 12 bytes or less, and they use crazy branches and unrolled loops for each pass of 32 bytes or less.

I believe the code is so good that nothing I write in C is going to beat it.
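For anyone who doesn't read 68K assembly, the overall shape described in point 2 can be sketched in C. This is emphatically not Apple's code and makes no attempt to match its speed. Only the two thresholds (12 bytes and 32 bytes) come from the description above. The name sketch_copy and everything else are assumptions for illustration.

```c
#include <stddef.h>
#include <string.h>

enum { SMALL_LIMIT = 12, PASS_BYTES = 32 };

/* Copy n bytes from src to dst (non-overlapping for n > SMALL_LIMIT),
 * dispatching on size the way the post describes. */
void sketch_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if (n <= SMALL_LIMIT) {
        /* Small case: read everything into a temporary first, then
         * write it out. The temporary stands in for CPU registers. */
        unsigned char tmp[SMALL_LIMIT];
        memcpy(tmp, s, n);
        memcpy(d, tmp, n);
        return;
    }

    /* Bulk case: fixed-size passes that a compiler (or a human writing
     * assembly) can fully unroll. */
    while (n >= PASS_BYTES) {
        memcpy(d, s, PASS_BYTES);
        d += PASS_BYTES;
        s += PASS_BYTES;
        n -= PASS_BYTES;
    }

    /* Tail: whatever is left after the last full pass. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
}
```

The real routine additionally deals with alignment, overlap direction, and per-processor differences, none of which are modelled here.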

3. However, BlockMove has a side-effect performance impact. It usually clears the CPU instruction cache after completion. Presumably, this is because BlockMove sometimes moves memory containing code under circumstances that most programmers don't expect, and a stale instruction cache would then result in crashes.

But BlockMove doesn't clear the instruction cache for chunks of 12 bytes or smaller. Why? Is there something special about that number? Nope:

That's right --- they just guess it doesn't move code.

In late 1992 code, Apple introduced a special variation of BlockMove called BlockMoveData that does not clear the instruction cache. If you are certain you aren't moving code, this is the faster call.

This option is not mentioned in any of the original Inside Macintosh books, nor the follow-up Inside Macintosh: Memory. The books were published before Apple shipped a ROM or system with this code. Because it just has an option bit set, presumably you can call BlockMoveData before the code existed and it will just execute the normal BlockMove trap. Apple updated the Universal Header Memory.h at some point:

4. In 1990/1991, the BlockMove code contained NOP hacks to compensate for defective 68040s.

The 1993 code still contains NOPs throughout the ROM. I guess this is a different defect that made it into shipping Macintosh CPUs.

- David

halkyardo · Jan 17, 2024

I've been doing a bit of investigation of BlockMove as well, while trying to squeeze more performance out of my ethernet driver - in particular, I've been looking at BlockMove versus the newlib memcpy() implementation in Retro68. I wrote a quick and dirty benchmark, moving memory around in various sizes and alignments between two buffers, and while I wouldn't call my results particularly scientific, they were interesting!

On an SE, things were pretty straightforward: BlockMove was substantially faster than memcpy() for anything greater than about 32 bytes - pretty good, given the extra overhead of BlockMove's cleverness.

Interestingly, on my SE/30 and IIfx, for aligned data, memcpy()'s simpler implementation was always a bit faster than BlockMove, but for unaligned data, BlockMove's alignment correction made it faster for anything over 64 bytes - indeed, there was little difference between BlockMove's performance for aligned or unaligned data, while memcpy()'s performance fell off a cliff when given an unaligned source or destination.

If it's of interest to anyone else, I might revisit this and modify my benchmarks to output data to something easier to chart - I started copying numbers down off the screen but eventually got bored and gave up!
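A benchmark of the kind described in this post, writing rows that are easy to chart, might look roughly like the sketch below. It is not halkyardo's actual benchmark. The function names, buffer sizes, and size list are assumptions, and it times memcpy with the standard clock() function. On a real Mac one would presumably add a second timing loop that calls BlockMove and might prefer TickCount or a finer timer.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

enum { BUF_BYTES = 4096, MAX_OFFSET = 3 };

/* Extra slack so an odd starting offset still fits. */
static unsigned char src_buf[BUF_BYTES + MAX_OFFSET];
static unsigned char dst_buf[BUF_BYTES + MAX_OFFSET];

/* Time `reps` copies of `size` bytes at the given byte offsets.
 * Returns elapsed clock() ticks. An aggressive optimizer may collapse
 * the loop, so build with optimization in mind when comparing. */
clock_t time_copies(size_t size, size_t src_off, size_t dst_off, long reps)
{
    clock_t start = clock();
    long i;

    for (i = 0; i < reps; i++) {
        memcpy(dst_buf + dst_off, src_buf + src_off, size);
    }
    return clock() - start;
}

/* Write one CSV row per size/alignment combination. */
void write_csv(FILE *out, long reps)
{
    static const size_t sizes[] = { 4, 12, 32, 64, 256, 1024, 4096 };
    size_t i, off;

    fprintf(out, "size,src_offset,dst_offset,ticks\n");
    for (i = 0; i < sizeof sizes / sizeof sizes[0]; i++) {
        for (off = 0; off <= 1; off++) {   /* 0 = aligned, 1 = odd address */
            long ticks = (long)time_copies(sizes[i], off, 0, reps);
            fprintf(out, "%lu,%lu,0,%ld\n",
                    (unsigned long)sizes[i], (unsigned long)off, ticks);
        }
    }
}
```

Calling write_csv(stdout, 10000L) prints a header plus fourteen rows that can be pasted into a spreadsheet.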

eharmon · Jan 22, 2024

David Cook said:
In late 1992 code, Apple introduced a special variation of BlockMove called BlockMoveData that does not clear the instruction cache. If you are certain you aren't moving code, this is the faster call.

This option is not mentioned in any of the original Inside Macintosh books, nor the follow-up Inside Macintosh: Memory. The books were published before Apple shipped a ROM or system with this code. Because it just has an option bit set, presumably you can call BlockMoveData before the code existed and it will just execute the normal BlockMove trap. Apple updated the Universal Header Memory.h at some point:

Interestingly, I happened to discover this was actually documented in the System Update 3.0 technical notes:

David Cook · Jan 22, 2024

eharmon said:
Interestingly, I happened to discover this was actually documented in the System Update 3.0 technical notes:

Thank you for that information. It is helpful to confirm that using the new call won't fail in previous systems. System Update 3.0 was released in May 1994.
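The reason the new call should fall back gracefully can be sketched from the trap words themselves. The values below (0xA02E for BlockMove, 0xA22E for BlockMoveData) are quoted from memory of Apple's Traps.h and should be checked against your own headers. The helper name os_trap_number is hypothetical. The assumption is that an OS trap is dispatched on the low 8 bits of the trap word, so a system that doesn't know about the extra flag bit still lands in the ordinary BlockMove routine.

```c
#include <stdint.h>

/* Trap words as recalled from Traps.h. Treat these as assumptions
 * and verify them against the Universal Headers you build with. */
enum {
    kBlockMoveTrap     = 0xA02E,
    kBlockMoveDataTrap = 0xA22E
};

/* For OS traps, the low 8 bits select the routine in the dispatch
 * table. The higher bits are flags. */
unsigned os_trap_number(unsigned trap_word)
{
    return trap_word & 0x00FFu;
}

/* The bits that distinguish two trap words. */
unsigned differing_bits(unsigned a, unsigned b)
{
    return a ^ b;
}
```

With those values, both trap words map to the same routine number, and they differ only in bit 9 (0x0200), which would be the "option bit" mentioned earlier in the thread.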

David Cook · Feb 2, 2024

I was looking into how to detect whether System Update 3.0 is installed. There are three Gestalt selectors, each with 32 flag bits, that indicate which fixes have been patched in. There are some interesting fixes. : )

gestaltBugFixAttrs equ 'bugz'

gestaltFixPrinting equ 0
gestaltResponderCrashFix equ 1
gestaltResponderVersionFix equ 2
gestaltPurgeFonts equ 3
gestaltAliasMgrFix equ 4
gestaltSCSIFix equ 5
gestaltKeyboardFix equ 6
gestaltTrueTypeFix equ 7
gestaltFixedMicroseconds equ 8
gestaltSaveLastSPExtra equ 9
gestaltVMCursorTaskFix equ 10
gestaltDietPatches equ 11
gestaltBackgroundPrintingPatch equ 12
gestaltNoPreferredAlertPatch equ 13
gestaltAllocPtrPatches equ 14
gestaltEPPCConnectionTableFix equ 15
gestaltDAHandlerPatch equ 16
gestaltLaunchFix equ 17
gestaltDeathNoticePatches equ 18
gestaltBacklightFix equ 19
gestaltPrintDriverFix equ 20
gestaltPMSegmentTweaks equ 21
gestaltWDEFZeroFix equ 22
gestaltPACKSixFix equ 23
gestaltResolveFileIDRefFix equ 24
gestaltDisappearingFolderFix equ 25
gestaltPowerBookSerialFix equ 26 ; <40> Next 5 are PowerBook 100/140/170 bug fixes
gestaltPowerBookSleepQFix equ 27 ; <40>
gestaltPowerBookFloppyEjectFix equ 28 ; <40>
gestaltPowerBookSleepFPUFix equ 29 ; <40>
gestaltPowerBookRestFPUFix equ 30 ; <40>
gestaltMtCheckFix equ 31

gestaltBugFixAttrsTwo equ 'bugy'

gestaltEgretSCCFix equ 0
gestaltEgretRdTimeFix equ 1
gestaltEgretIRQPatch equ 2
gestaltEgretTickHandlerFix equ 3
gestaltSCSIFastAckFix equ 4
gestaltAFEHomeResFileFix equ 5
gestaltPowerOffDelayFix equ 6
gestaltSndIntRestoreFix equ 7
gestaltPMgrMIDIFix equ 8 ; <52> PMgrOp fix for MIDI on PowerBooks
gestaltMoveHHiExtraStackSpace equ 9
gestaltMMUOverwriteByQuadraRAMDiskFix equ 10 ; <58>
gestaltTerrorADBReInitFix equ 11 ; <58>
gestaltCentrisOnBoardGreenVGASyncFix equ 12 ; <58>
gestaltGetIndResourceSysMapHandleFix equ 13 ; <58>
gestaltCentrisBluishWhiteFix equ 14 ; <58>
gestaltCentrisFlashWhileScrollingFix equ 15 ; <58>
gestaltEightToSixteenMegBlockMoveFix equ 16 ; <60>
gestaltReleaseTheFontFlagFix equ 17 ; <61>
gestaltMSFlightSimDrawCrsrFix equ 18 ; <62>
gestaltRISCV0ResMgrPatches equ 19 ; <63> ProcessManager patches removed for RISC
gestaltSCSIBusyBugFix equ 20 ; <64> HFS Bug fixes for AppleShare
gestaltHFSDeferredTaskStackSwitch equ 21 ; <64>
gestaltTETrashExpandMemRecVersionField equ 22 ; <67> found this one in GestaltPrivateEqu.h but not here...
gestaltDartPMgrOpTimeoutBadBranchFix equ 23 ; <67> System Update 3.0 bug fixes
gestaltPwrBookLowPwrNotificationFix equ 24 ; <67> "
gestaltBlockMoveDataPatch equ 25 ; <67> "
gestaltFSpExchangeFilesCompatibilityFix equ 26 ; <67> "
gestaltSyncReadCacheFlushFix equ 27 ; <68> "
gestaltUpdateResFileFlushIfSystemFix equ 28 ; <68> "
gestaltMacPlusSizeResourceFix equ 29 ; <68> "
gestaltProcessMgrIdleTimeToRemovedDriverFix equ 30 ; <68> "
gestaltAboutThisMacSystemSizeBarFix equ 31 ; <69> "

gestaltBugFixAttrsThree equ 'bugx'
gestaltPartialResourceRangeCheckFix equ 0 ; <71> System Update 3.0 bug fix

Running 7.1.2 System Update 3.0 on Basilisk shows that the gestaltBlockMoveDataPatch is installed, but 8.1 indicates it is not.

Perhaps this is because 8.1 already includes the code and thus doesn't need the 'fix'. But, that's not how a Gestalt flag is supposed to work. It is supposed to indicate when a feature is present -- regardless of whether by native support, init, or patch.
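For reference, testing one of these flags comes down to a selector and a bit number. The sketch below is portable C so it can be compiled anywhere. The macro and function names are made up for illustration, and the real Gestalt call appears only in a comment. The selector 'bugy' and bit 25 come from the equates listed above.

```c
#include <stdint.h>

/* Pack four characters into a 32-bit selector, most significant
 * character first, as a 68K compiler does for 'bugy'. */
#define FOUR_CC(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8)  |  (uint32_t)(d))

#define BUG_FIX_ATTRS_TWO         FOUR_CC('b', 'u', 'g', 'y') /* gestaltBugFixAttrsTwo */
#define BLOCK_MOVE_DATA_PATCH_BIT 25                          /* gestaltBlockMoveDataPatch */

/* Gestalt attribute bits are numbered from the least significant bit. */
int attr_bit_set(uint32_t response, unsigned bit)
{
    return (int)((response >> bit) & 1u);
}

/* On a real Mac (not compiled here), the check would look like:
 *
 *     long response;
 *     if (Gestalt('bugy', &response) == noErr &&
 *         attr_bit_set((uint32_t)response, BLOCK_MOVE_DATA_PATCH_BIT)) {
 *         ... the System Update 3.0 patch reports itself as installed ...
 *     }
 */
```

Given the 8.1 result above, a clear bit only says the patch isn't installed. It doesn't say BlockMoveData is unavailable.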

Anyway, I thought that was interesting.

- David
