
MacSnow - project in THINK C

Mu0n

Well-known member
I wanted to do this for last winter, but I got sidetracked. Right now, I haven't even completed the proof-of-concept, but my goals are:

phase 1:
-Have a nice little demo program with a winter snowfall theme in the same vein as the famous C64 Christmas demo, possibly with music out of the 4-tone synth.
-Specifically targeting a Mac Plus so that I force myself to optimize the graphics for all Macs that I care about

phase 2:

-Provide scenery (houses, cottages, trees) so that the snow accumulates over these uneven surfaces instead of on empty, flat ground
-Let users provide their own curated image (force a white or black background so that the program detects the accumulation polygonal line and let snow accrue there)

phase 3:
-Let users edit snow accumulation polygonal lines

Here's what I have so far, running in mini-vMac at 8x speed. I'm not happy with how slow it is on a real Mac Plus. It works as a slow snowfall, but I can't believe I can't get it to run faster than this:


I'm pre-generating the snowflakes as a char (8-bit) array 64 characters wide (which covers exactly 512 pixels at 1 bit per pixel, the same as the compact Mac screen width) and 400 lines high (so that the visual repetition of the snowflakes isn't too obvious while memory usage stays low).

Instead of using a purely random function, the snowflakes are pulled from a hard-coded array of possible byte values, such as 0x01, 0x02, 0x20, 0x80, padded with lots of 0x00 entries so that I can control the density of snowflakes until it becomes visually appealing instead of a uniformly gray random mess.
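For anyone following along, here is a minimal sketch of that weighted-palette idea in portable C. It is not the actual MacSnow source: the names `kFlakePalette`, `GenerateSnowLine` and `GenerateSnowField` are hypothetical, the number of 0x00 padding entries is an arbitrary assumption, and `rand()` stands in for whatever random source the real program uses.

```c
#include <assert.h>
#include <stdlib.h>

#define SNOW_ROW_BYTES 64   /* 64 bytes x 8 bits = 512 pixels per line */
#define SNOW_LINES     400  /* height of the pre-generated snow field */

/* Hypothetical palette: four single-pixel flakes plus padding.
   Adding more 0x00 entries makes the snow sparser. */
static const unsigned char kFlakePalette[] = {
    0x01, 0x02, 0x20, 0x80,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00
};
#define PALETTE_SIZE ((int)(sizeof kFlakePalette / sizeof kFlakePalette[0]))

/* Fill one 64-byte line by picking every byte from the palette. */
static void GenerateSnowLine(unsigned char *line)
{
    int i;
    for (i = 0; i < SNOW_ROW_BYTES; i++)
        line[i] = kFlakePalette[rand() % PALETTE_SIZE];
}

/* Fill the whole 64 x 400 byte field, one line at a time. */
static void GenerateSnowField(unsigned char *field, unsigned seed)
{
    int y;
    assert(field != NULL);
    srand(seed);
    for (y = 0; y < SNOW_LINES; y++)
        GenerateSnowLine(field + (long)y * SNOW_ROW_BYTES);
}
```

With 4 flake bytes out of 16 palette entries, roughly one byte in four carries a flake, so about one pixel in 32 is set. Tuning the density only means changing how many 0x00 entries the palette holds.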

Algorithms for displaying and scrolling the snow:

---

Algorithm #1: Copy into an offscreen buffer using Longs with an offset, then copy back the whole buffer into screen memory
-Lock the handle to snow data buffer
-Generate a new line of snow data in an offscreen buffer
-Get a long pointer to the start of the screen buffer
-Get a long pointer to a few lines after the start of the offscreen buffer (this is adjustable according to the number of skipped lines you want to increase the sensation of fall speed)
-Copy the screen content to the offscreen buffer using 2 nested for loops (outer: every line from the top of the screen down to the snow line; inner: all 16 longs, 32 bits each, of that line), stopping at the last detected snow line on the ground
-The ground line gets a special treatment, its content is updated with a bitwise OR to *ACCUMULATE* snow on top of what it already had
-Count the number of pixels in the ground line, if it's above a threshold, make it empty and bump up the line 1 pixel above
-Copy back the offscreen to the screen using nested for loops
-Unlock the handle to snow data buffer
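The steps above could look roughly like this in portable C. This is a sketch under assumptions, not the real program: the screen is a plain 512x342 1-bit array rather than real screen memory, `uint32_t` stands in for THINK C's 32-bit `long`, handle locking is left out, the copy back uses `memcpy` instead of nested loops, and `SnowStep`, `CountLinePixels` and their parameters are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LONGS_PER_LINE 16   /* 16 longs x 32 bits = 512 pixels */
#define SCREEN_LINES   342  /* assumed compact Mac screen height */
#define LINE_BYTES     (LONGS_PER_LINE * sizeof(uint32_t))

/* Count the set bits (pixels) in one 512-pixel line. */
static int CountLinePixels(const uint32_t *line)
{
    int i, count = 0;
    for (i = 0; i < LONGS_PER_LINE; i++) {
        uint32_t v = line[i];
        while (v != 0) {        /* clears the lowest set bit each pass */
            v &= v - 1;
            count++;
        }
    }
    return count;
}

/* One update. newLines holds `skip` fresh lines of snow.
   Returns the ground line, raised by one if it filled up. */
static int SnowStep(uint32_t *screen, uint32_t *offscreen,
                    const uint32_t *newLines, int skip,
                    int groundLine, int threshold)
{
    int y, i;
    int oldGround = groundLine;
    uint32_t *ground;

    assert(skip > 0 && skip <= groundLine && groundLine < SCREEN_LINES);
    ground = offscreen + (long)groundLine * LONGS_PER_LINE;

    /* Fresh snow enters at the top; the ground starts from the screen. */
    memcpy(offscreen, newLines, (size_t)skip * LINE_BYTES);
    memcpy(ground, screen + (long)groundLine * LONGS_PER_LINE, LINE_BYTES);

    /* Screen -> offscreen, displaced `skip` lines down. Lines that
       would land on or past the ground are ORed into it instead. */
    for (y = 0; y < groundLine; y++) {
        const uint32_t *src = screen + (long)y * LONGS_PER_LINE;
        if (y + skip < groundLine) {
            uint32_t *dst = offscreen + (long)(y + skip) * LONGS_PER_LINE;
            for (i = 0; i < LONGS_PER_LINE; i++)
                dst[i] = src[i];
        } else {
            for (i = 0; i < LONGS_PER_LINE; i++)
                ground[i] |= src[i];
        }
    }

    /* Too many pixels: empty the line and move the ground up by one. */
    if (CountLinePixels(ground) > threshold) {
        memset(ground, 0, LINE_BYTES);
        groundLine--;
    }

    /* Offscreen -> screen, only the lines this step touched. */
    memcpy(screen, offscreen, (size_t)(oldGround + 1) * LINE_BYTES);
    return groundLine;
}
```

Note that every line above the ground is read once and written twice per update, whether or not it holds any snow, which is where most of the time goes.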

How many times can it be done between vertical blankings (VBL)? TO BE DETERMINED (will post later)
If not tied to VBL, how many ticks between them? TO BE DETERMINED (will post later)

---

Algorithm #2: Use ScrollRect directly on screen and draw a new top line
-Lock the handle to snow data buffer
-Use ScrollRect on the main window pointer to the portRect and scroll it a number of lines (adjustable in code)
-Generate the new top line directly on the screen memory
-Currently has no management of the ground snow line
-Unlock the handle to snow data buffer

-Will crash/exit out once the scrolling reaches a threshold since it scrolls an important Rect related to the screen

How many times can it be done between vertical blankings? TO BE DETERMINED (will post later)
If not tied to VBL, how many ticks between them? TO BE DETERMINED (will post later)

---

Algorithm #3: Use OffsetRect on screen memory, use CopyBits to an offscreen buffer, then CopyBits back to screen memory
-Lock the handle to snow data buffer
-Use OffsetRect on a copy of the Rect assigned to the main window ptr
-Use CopyBits from screen to offscreen and using the offset rect to create a displacement
-Generate the new top line on the offscreen buffer
-Use CopyBits from offscreen back to screen memory
-Currently has no management of the ground snow line
-Unlock the handle to snow data buffer

How many times can it be done between vertical blankings? TO BE DETERMINED (will post later)
If not tied to VBL, how many ticks between them? TO BE DETERMINED (will post later)

---

Algorithm #4: Roughly similar to #3, but using BlockMove instead of CopyBits.
-Lock the handle to snow data buffer
-Currently not working as it should
-Unlock the handle to snow data buffer
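Since the post doesn't say what goes wrong with #4, here is only a guess at what a block-move scroll could look like, written in portable C with `memmove` standing in for BlockMove (both cope with overlapping source and destination). Because of that, this variant scrolls in place with no offscreen buffer, which is a departure from #3. `ScrollDownInPlace` and its parameters are hypothetical names, and the screen is assumed to be a plain 64-bytes-per-row array.

```c
#include <assert.h>
#include <string.h>

#define ROW_BYTES    64    /* 512 pixels at 1 bit per pixel */
#define SCREEN_LINES 342   /* assumed compact Mac screen height */

/* Scroll lines 0..regionLines-1 down by `skip` lines in place, then
   put `skip` fresh lines at the top. Lines below the region are not
   touched, and the bottom `skip` lines of the region are overwritten. */
static void ScrollDownInPlace(unsigned char *screen, int regionLines,
                              int skip, const unsigned char *newLines)
{
    assert(skip > 0 && skip <= regionLines && regionLines <= SCREEN_LINES);

    /* Source and destination overlap, so this must be memmove
       (or BlockMove), never memcpy. */
    memmove(screen + (long)skip * ROW_BYTES, screen,
            (size_t)(regionLines - skip) * ROW_BYTES);

    /* The fresh lines come from a separate buffer, so memcpy is fine. */
    memcpy(screen, newLines, (size_t)skip * ROW_BYTES);
}
```

One thing worth checking in a real BlockMove version is the byte count: it has to be (regionLines - skip) rows, otherwise the move runs past the bottom of the region.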

How many times can it be done between vertical blankings? TO BE DETERMINED (will post later)
If not tied to VBL, how many ticks between them? TO BE DETERMINED (will post later)
 

Crutch

Well-known member
Cool project. (Get it …. Cool ……. )

No matter what you do, you’ll have to visit every byte of screen memory with a snowflake in it, and every byte representing the pixel just below that one, once per refresh. But you don’t have to visit all the bytes (i.e. you don’t need to worry about the ones that don’t have snowflakes either in them or about to fall into them). Based on your example, it looks like there will be decently many horizontal gaps of bytes with no snow in them.

So, I would probably do this by storing an array of offsets per snowflake (# bytes to skip ahead to get to the next snowflake). When you update the screen, use the offsets to skip ahead. For each snowflake, copy its byte from the background image, then draw the snowflake 64 bytes later (=512 pixels). Multiple snowflakes in a byte, or one snowflake in the byte directly below another, are just edge cases to deal with. If possible code the inner loop in assembly or use register vars then look at a disassembly to ensure the compiler is really using them.
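A rough portable-C sketch of that idea, with some liberties taken: it stores an absolute byte offset per snowflake instead of skip-ahead deltas, and it handles the edge cases (two flakes in one byte, one flake directly below another) by erasing all flakes first and drawing them all afterwards. The `Flake` struct, `StepFlakes`, and the wrap back to the top line are assumptions made for the example.

```c
#include <assert.h>

#define ROW_BYTES    64    /* one line down = 64 bytes later */
#define SCREEN_LINES 342   /* assumed compact Mac screen height */
#define SCREEN_BYTES ((long)ROW_BYTES * SCREEN_LINES)

typedef struct {
    long          offset;  /* byte offset of the flake in screen memory */
    unsigned char bits;    /* the flake's pixels within that byte */
} Flake;

/* Move every flake down one line, touching only the bytes involved. */
static void StepFlakes(unsigned char *screen,
                       const unsigned char *background,
                       Flake *flakes, int count)
{
    int i;

    /* Pass 1: erase each flake by restoring the background byte. */
    for (i = 0; i < count; i++) {
        assert(flakes[i].offset >= 0 && flakes[i].offset < SCREEN_BYTES);
        screen[flakes[i].offset] = background[flakes[i].offset];
    }

    /* Pass 2: advance one line (wrapping to the top) and OR the flake
       in, so flakes that share a byte do not erase each other. */
    for (i = 0; i < count; i++) {
        long next = flakes[i].offset + ROW_BYTES;
        if (next >= SCREEN_BYTES)
            next -= SCREEN_BYTES;
        flakes[i].offset = next;
        screen[next] |= flakes[i].bits;
    }
}
```

The work per frame is two byte writes per snowflake rather than a pass over all 64 x 342 bytes, which is the whole point of the technique. The two passes cost an extra loop but avoid special-casing stacked flakes.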

The slow part of CopyBits, BlockMove, and ScrollRect is all the same thing: visiting every single byte in your bitmaps. I don’t think you’ll see an order of magnitude speed difference between them, but BlockMove would be fastest since it has zero overhead. The skip-ahead technique will be your best bet (and I think the fastest possible implementation). Hope this maybe helps!
 

Mu0n

Well-known member
The first tests were done in mini-vMac (going by memory of other tests years ago, I suspect a real Mac Plus is even slightly slower).

This is a baseline without optimizing anything. I perform the screen->offscreen->screen copy an integer multiple of 50 times and check the difference in TickCount.

mini-vMac @ 1X speed
Run #1: Iterations: 100, TickCount: 1526, avg tickcount per iteration: 15.26 ticks (254.333 ms)
Run #2: Iterations: 850, TickCount: 12970, avg tickcount per iteration: 15.2588 ticks (254.313 ms)
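For reference, the conversion behind those numbers, as a small C sketch. It assumes one tick is 1/60 of a second, which is what the millisecond figures above imply; `AvgTicksPerIteration` and `TicksToMs` are hypothetical helper names, and on the Mac the tick totals would come from the difference between two TickCount() readings.

```c
#include <assert.h>

#define MS_PER_TICK (1000.0 / 60.0)  /* assumes 60 ticks per second */

/* Average cost of one update, in ticks. */
static double AvgTicksPerIteration(long totalTicks, long iterations)
{
    assert(iterations > 0);
    return (double)totalTicks / (double)iterations;
}

/* Convert a tick count (possibly fractional) to milliseconds. */
static double TicksToMs(double ticks)
{
    return ticks * MS_PER_TICK;
}
```

At about 254 ms per update, that works out to a little under 4 updates per second.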

As expected, some updates happen while the screen is being refreshed, which makes some of the pixels flicker (which could be...a welcome effect??)
I'm expecting HUGE improvements from manipulating only the screen itself and updating only the chars that need updating.
 