56byte c2p for
size limited demos
It's no secret that I'm a musician, and absolutely no hardcore demo-coder (and I
have no desire to be one either :) )... On the other hand, size limited demos
have tickled my interest somewhat; for me at least, size is a much more
interesting optimisation than speed. The smallest size category is almost
universally 128 bytes of code, which on Atari grows to 160 bytes once you add 32
bytes of TOS header... and no cheating by using the header bytes for code :)
Anyone who has looked at any of these 128bytros will know the cool fx you can
squeeze out of those few bytes... Perhaps the ultimate 128bytro is the famous
rotozoomer of DefJam/Checkpoint, although the one that makes me smile the most
is Pipeline by MrPink/RG. There seem to be more 128bytros for Atari Falcon than
for the normal ST - the true colour screen modes and FPU instructions make
things a lot easier, and the 030 enhancements don't go amiss either :)
Perhaps you think it would be crazy to try nu-skool fx in 128 bytes on a normal
ST? Perhaps it is, but it's definitely possible - take, for example, the
previously mentioned rotozoomer. Forgetting all the amazing points of the
rotozoomer for a moment, its main sucking point is the usage of a single bit
plane only... Which is why I was so inspired by DefJam's other 128bytro
'Something Like a Plasma Tunnel', which DefJam boasts has a 16 colour c2p
technique... And if at this point you don't know what c2p is, then I suggest you
read Ray/TSSC's excellent article on the subject.
So now we come to the point of this article - as a direct result of SLAPT I
created my own bytro sized c2p routine. I didn't disassemble DefJam's code, so I
have no idea if this is the technique used in his demo - however the end result
is the same - a 16 colour c2p technique with 40x25 chunky pixels of 8x8 screen
pixels each. Mine is 56 bytes big, leaving you a full 72 bytes to kick ass with
a cool effect (no table precalculations etc), and maybe you can specialise the
code to reduce things by a few bytes more. The main point of this article is
more inspiration than a coding tutorial, because I'm no MC68k expert. If I can
handle it, then imagine what's possible for YOU... Coding a bytro is perfect for
a rainy Sunday :)
Let's not mess around any more, let's get to those asm instructions:
First let's assume you have already put the processor into supervisor mode, so
we can access anywhere with (hopefully) no problems.
This instruction loads the current address of the screen into a2.
        lea     chkscr(pc),a3   ; base address of 40*25 chunky screen
Here we point a3 to the address of our chunky screen. Simple.
        moveq   #24,d3          ; 25 chunky rows
row25:  moveq   #7,d4           ; each chunky block 8 pixels high
high8:  moveq   #19,d5          ; 40/2 chunky blocks wide
col40:  moveq   #0,d2           ; do c2p on 2 chunky blocks
Perhaps you can guess that the whole c2p runs in a phatt nested loop. Bad for
performance, good for code size :) Here we set up some looping variables. We
loop over every single row of chunky data. Each row is mapped 8 pixels high on
the planar screen. Each step we map 2 chunky blocks - so we loop 20 times per
line (remember the 40 * 25 resolution). d2 keeps track of the bit plane we are
working on.
bploop: btst    d2,(a3)         ; test bit d2 of the current chunky byte
Each byte of the chunky screen holds the colour for one chunky pixel. We test
the least significant 4 bits of that byte (16 colours) and set the bit planes of
the screen accordingly. This gives enhanced 8x8 screen blocks with very little
overhead... Here we could optimise hugely - we could work with 16x16 blocks, or
only a single bit plane.
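To make the bit test concrete, here is a small C model of what happens to one chunky byte. It is only a sketch of the idea, not the 68k routine itself, and the name expand_block is made up for illustration: bit p of the colour decides whether the 8 pixels of the block are set or clear in plane p.

```c
#include <stdint.h>

/* Hypothetical helper (not from the original routine): expand one chunky
 * colour into four plane bytes, one byte (8 pixels) per bit plane. */
void expand_block(uint8_t colour, uint8_t planes[4])
{
    for (int p = 0; p < 4; p++) {
        /* same idea as the btst: look at bit p of the chunky byte */
        planes[p] = ((colour >> p) & 1) ? 0xFF : 0x00;
    }
}
```

Colour 5 (binary 0101), for example, fills planes 0 and 2 and clears planes 1 and 3. Only the low 4 bits matter, so anything above bit 3 in the chunky byte is ignored.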
Here we go on to the next bit-plane.
Adding 2 to a3 causes a3 to point to the next pair of chunky blocks to be c2p'd.
        lea     -40(a3),a3      ; back to start of chunky row
Getting to the end now :) After we have converted one chunky row to planar
format we have to go back and do it again - why? Because each block is 8 screen
pixels high.
        lea     40(a3),a3       ; next chunky row
After converting that chunky row 8 times we can at last go onto the next one.
And of course we do this for each of the 25 rows.
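For anyone who wants to check their own version against something readable, the whole loop nest can be modelled in C. This is a sketch under assumptions, not a translation of the asm: it assumes the standard ST low resolution layout (320x200, 160 bytes per line, four interleaved bit plane words per 16 pixels, left 8 pixels in the first byte of each word), and the names c2p_8x8, chunky and screen are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { CHUNKY_W = 40, CHUNKY_H = 25, BLOCK = 8, LINE_BYTES = 160, PLANES = 4 };

/* Model of the c2p: 'chunky' is 40*25 bytes, 'screen' is a 32000 byte
 * ST low resolution frame buffer (assumed layout, see the text above). */
void c2p_8x8(const uint8_t *chunky, uint8_t *screen)
{
    assert(chunky != NULL && screen != NULL);
    for (int row = 0; row < CHUNKY_H; row++) {          /* 25 chunky rows      */
        for (int line = 0; line < BLOCK; line++) {      /* each 8 pixels high  */
            uint8_t *dst = screen + (row * BLOCK + line) * LINE_BYTES;
            for (int x = 0; x < CHUNKY_W; x++) {        /* 40 blocks per line  */
                uint8_t colour = chunky[row * CHUNKY_W + x];
                for (int p = 0; p < PLANES; p++) {      /* 4 bit planes        */
                    /* 8 bytes per 16 pixel group, 2 bytes per plane word,
                     * even x is the left byte, odd x the right byte */
                    dst[(x >> 1) * 8 + p * 2 + (x & 1)] =
                        ((colour >> p) & 1) ? 0xFF : 0x00;
                }
            }
        }
    }
}
```

The model converts the same chunky row 8 times, once per screen line, exactly as described above. It spends no effort on size, of course; it is only useful as a reference picture of where each byte ends up.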
That's it for the c2p code.... But here's a look at the bss segment. It rules -
because as we get the bss segment for free, we can be really extravagant with
it.
        ds.b    40              ; upper guard
chkscr: ds.b    1000            ; 40*25
        ds.b    40              ; lower guard
Here we have 1000 bytes of chunky screen, with upper and lower guard rows which
reduce weird edge effects in your effect.
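Why the guard rows help is easiest to see in a C model of the same layout. This is just a sketch: the buffer mirrors the three ds.b lines, while vertical_average is a made-up example filter and not something taken from gfire128.

```c
#include <stdint.h>

enum { W = 40, H = 25 };

/* upper guard row + 40*25 chunky screen + lower guard row */
static uint8_t buffer[W + W * H + W];
static uint8_t *const chkscr = buffer + W;

/* Hypothetical filter: average of the cells directly above and below.
 * No range checks are needed, because row -1 and row 25 exist as guards. */
uint8_t vertical_average(int x, int y)
{
    const uint8_t *p = chkscr + y * W + x;
    return (uint8_t)((p[-W] + p[W]) / 2);
}
```

Without the guards, the top and bottom rows would need special case code, which costs bytes. Note that the guards only cover the vertical direction: a left or right neighbour at the edge of a row simply lands in the adjacent row.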
Finally, let's look at an example bytro - gfire128.prg. Run it and see the top
part of your screen burning furiously with an evil electric blue fire!! The
latest micro_virus? Or just some example code gwEm came up with to illustrate
his c2p technique? You decide. Either way, switching off the machine will stop
those flames before they set fire to your curtains.
Here's an overview of those 128 bytes:
* First, we go into supervisor mode.
* Then we generate that blue->black palette. This is expensive as it goes and
uses a few bytes I can tell you. I can see why DefJam used the ugly default
palette for his fake plasma tunnel.
* Next we generate random heat spots over the top part of the screen. Because of
code size limitations, we can only burn the top part of the screen... But hey,
the DHS Falcon demo that started it all, with its fiery Atari logo, also had a
similar limitation. This is very cheap in bytes, but if you look closely the
random number generation is pretty un-random :)
* Of course next we have to spread the flames <fx: evil laughter>. Thanks to
byte savings in other areas I've been able to implement this pretty nicely. I
had to trade the good smoothing algorithm for other things - a simple palette,
and just the top part of the screen burning. Looking at the alternatives I
think this was a good choice.
* Finally we have the c2p rout which was discussed earlier.
* Then we seed more heat spots and do it all over again.
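The steps in the list can be sketched in C as well. To be clear, this is a generic textbook style fire written for illustration, not the code inside gfire128.prg: the function names, the random generator, the blue ramp and the averaging kernel are all assumptions. The palette values use the plain ST format (a word of the form 0x0RGB with 3 bits per gun).

```c
#include <stdint.h>

enum { W = 40, H = 25 };

static uint8_t buffer[W + W * H + W];       /* chunky screen with guard rows */
static uint8_t *const chkscr = buffer + W;
static uint32_t seed = 1;                   /* state of a toy random generator */

/* Black to blue ramp: colour i gets blue level i/2 (0..7), red and green 0. */
void make_palette(uint16_t pal[16])
{
    for (int i = 0; i < 16; i++)
        pal[i] = (uint16_t)(i >> 1);
}

/* Drop a few maximum heat cells into the bottom row. */
void seed_heat(int spots)
{
    for (int i = 0; i < spots; i++) {
        seed = seed * 1103515245u + 12345u; /* simple LCG, an assumption */
        chkscr[(H - 1) * W + (seed >> 16) % W] = 15;
    }
}

/* Each cell becomes the average of itself and three cells from the row
 * below, so heat drifts upwards and fades as it goes. */
void spread(void)
{
    for (int y = 0; y < H - 1; y++) {
        for (int x = 0; x < W; x++) {
            uint8_t *cell = chkscr + y * W + x;
            const uint8_t *below = cell + W;
            *cell = (uint8_t)((below[-1] + below[0] + below[1] + *cell) / 4);
        }
    }
}
```

One frame would then be seed_heat, spread and finally the c2p of chkscr to the screen. In a real bytro every one of these steps has to fight for bytes, which is exactly the trade-off described in the list.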
That's all. I hope that even if you didn't learn something, you at least got
some ideas... And if you make anything using this method, or if you make any
bytros at all, then I would be interested to see them - drop me an email!
gwEm (gwem(at)preromanbritain.com ) for Alive June 2004