Alive 9 - gwemc2p

56byte c2p for use in size limited demos

It's no secret that I'm a musician, and absolutely no hardcore demo-coder (and I
have no desire  to be either :) )... On  the other hand  size limited demos have
tickled my  interest somewhat; for me  at least, size is a much more interesting
optimisation  than speed. The  smallest size  category is almost universally 128
bytes of code, which  on Atari grows to  160 bytes once  you add 32 bytes of TOS
header... and no cheating by using the header bytes for code :)

Anyone who has looked at any of these 128bytros  will know the  cool fx you  can
squeeze out  of those few  bytes... Perhaps the  ultimate 128bytro is the famous
rotozoomer of DefJam/Checkpoint, although  the one that  makes me smile the most
is Pipeline by  MrPink/RG. There seem to be more 128bytros for Atari Falcon than
normal ST - the true colour screen modes  and FPU instructions make things a lot
easier, and the 030 enhancements don't go amiss either :)

Perhaps you think it would be crazy to try nu-skool fx in 128bytes on normal ST?
Perhaps  it  is, but  it's  definitely   possible: for  example, the  previously
mentioned rotozoomer. Forgetting  all the amazing points of the rotozoomer for a
moment, its main  sucking point is the usage of a single bit plane only... Which
is why I was so  inspired by  DefJam's  other  128bytro 'Something Like a Plasma
Tunnel', which  DefJam  boasts has  a 16 colour  c2p technique... And if at this
point you don't know what c2p is, then I  suggest you  read Ray/TSSC's excellent
article on the subject.

So now we come to the point of  this  article - as a  direct  result of  SLAPT I
created my own bytro sized c2p routine. I didn't disassemble DefJam's code, so I
have no idea  if this is the technique used in his demo - however the end result
is the same - a 16 colour c2p technique with 40x25 blocks of 8x8 pixels. Mine is
56bytes big, giving you a full 72bytes to kick ass with a cool effect (no table
precalculations etc), and maybe you can specialise the code to reduce things by
a few bytes more. The main point of this article is more inspiration than a
coding tutorial, because I'm no MC68k expert. If I can handle it then imagine
what's possible for YOU... Coding a bytro is perfect for a rainy Sunday :)
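
As a sanity check on the 56byte figure, the sizes of the instructions in the
routine listed below can simply be added up. The sketch is plain Python for a
modern machine, not Atari code; the sizes are the standard 68000 encodings, and
the names C2P_SIZES and CODE_LIMIT are made up for this illustration.

```python
# (instruction, size in bytes) for the c2p routine in this article.
# Sizes are standard 68000 encodings: a 2-byte opcode word plus
# 2 bytes for each extension word (displacement, immediate, address).
C2P_SIZES = [
    ("move.l $44E.w,a2",  4),  # opcode + 16-bit absolute address
    ("lea chkscr(pc),a3", 4),  # opcode + 16-bit PC displacement
    ("moveq #24,d3",      2),
    ("moveq #7,d4",       2),
    ("moveq #19,d5",      2),
    ("moveq #0,d2",       2),
    ("btst d2,(a3)",      2),
    ("sne (a2)+",         2),
    ("btst d2,1(a3)",     4),  # opcode + 16-bit displacement
    ("sne (a2)+",         2),
    ("addq.b #1,d2",      2),
    ("cmpi.b #3,d2",      4),  # opcode + immediate word
    ("bls.s bploop",      2),
    ("addq #2,a3",        2),
    ("dbra d5,col40",     4),  # opcode + 16-bit displacement
    ("lea -40(a3),a3",    4),
    ("dbra d4,high8",     4),
    ("lea 40(a3),a3",     4),
    ("dbra d3,row25",     4),
]

CODE_LIMIT = 128                      # bytes of code allowed in a 128bytro
c2p_bytes = sum(size for _, size in C2P_SIZES)
remaining = CODE_LIMIT - c2p_bytes    # left over for setup and the effect

print(c2p_bytes, remaining)           # prints: 56 72
```

Remember the remaining bytes also have to pay for things like going into
supervisor mode, not just the effect itself.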

Let's not mess around anymore, let's get to those asm instructions:


First let's assume you have already put the processor into supervisor mode (the
system variables in low memory, such as $44E, are protected in user mode); then
we can access anywhere with (hopefully) no problems.

         move.l   $44E.w,a2

This instruction loads the current address of the screen into a2 ($44E is the
TOS system variable _v_bas_ad, the logical screen base).

         lea      chkscr(pc),a3  ; base address of 40*25 chunky screen

Here we point a3 to the address of our chunky screen. Simple.

         moveq    #24,d3         ; 25 chunky rows
row25:   moveq    #7,d4          ; each chunky block 8 pixels high
high8:   moveq    #19,d5         ; 40/2 chunky blocks wide
col40:   moveq    #0,d2          ; start at bit plane 0 for 2 chunky blocks

Perhaps you  can guess that  the whole c2p runs  in a phatt nested loop. Bad for
performance, good for  code size :) Here we set up some looping variables. So we
loop over  every single row  of chunky data. Each row is mapped 8 pixels high on
the  planar  screen. Each  step we  map  2 chunky  blocks - so  we loop 20 times
(remember the 40 * 25 resolution). d2 keeps track of the bit plane we are
converting.

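bploop:  btst     d2,(a3)        ; plane bit set in left chunky block?
         sne      (a2)+          ; set: write $FF (8 pixels on), else $00
         btst     d2,1(a3)       ; same test for the right chunky block
         sne      (a2)+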

Each byte of the chunky screen holds the colour for one chunky pixel. We test
the least significant 4 bits of the chunky block (16 colours) one at a time:
btst clears the Z flag when the bit is set, so sne then writes $FF to that
bitplane byte (8 pixels on), otherwise $00. This gives 16 colour 8x8 screen
blocks with very little overhead... Here we could optimise hugely - we could
work with 16x16 blocks, or only a single bitplane.
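
For anyone who prefers to see the bit twiddling outside of asm, here is a rough
Python model of what bploop writes for one pair of chunky bytes. It is an
illustration only, assuming the standard ST low resolution layout (four
interleaved plane words per 16 pixels, so the two bytes of each word cover the
left and the right 8 pixels); c2p_pair is a made-up name.

```python
def c2p_pair(left, right):
    """Model of bploop for one pair of chunky bytes.

    Returns the 8 bytes written through (a2)+: for each bit plane 0..3,
    one byte for the left block followed by one for the right block.
    """
    out = bytearray()
    for plane in range(4):                # d2 = 0..3
        for chunky in (left, right):      # (a3), then 1(a3)
            # btst tests one bit; sne stores $FF if it was set, else $00
            out.append(0xFF if chunky & (1 << plane) else 0x00)
    return bytes(out)


# Colour 5 = %0101 sets planes 0 and 2; colour 10 = %1010 sets planes 1 and 3
print(c2p_pair(5, 10).hex())              # prints: ff0000ffff0000ff
```

Only the low 4 bits matter, so an effect can leave junk in the top nibble of a
chunky byte without changing the colour on screen.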

         addq.b   #1,d2          ; next bit plane
         cmpi.b   #3,d2          ; loop while d2 <= 3 (planes 0..3)
         bls.s    bploop

Here we go on to the next bit-plane.

         addq     #2,a3

Adding 2 to a3 causes a3 to point to the next pair of chunky blocks to be c2p'd.

         dbra     d5,col40
         lea      -40(a3),a3  ; back to start of chunky row

Getting to the end now :) After we have converted one chunky row to planar
format we have to go back and do it again - why? Because each block is 8 screen
pixels high.

         dbra     d4,high8
         lea      40(a3),a3   ; next chunky row

After converting that chunky row 8 times we can at last go onto the next one.

         dbra     d3,row25

And of course we do this for each of the 25 rows.
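
Putting all the loops together, a Python model of the whole routine might look
like the following. Again this is a sketch for checking the logic on a modern
machine, not the bytro itself: c2p_frame and read_pixel are hypothetical
helpers, and the model repeats each converted line 8 times instead of
re-converting it as the asm does (the bytes written are the same). 25 rows * 8
lines * 20 pairs * 8 bytes gives the full 32000 byte low-res screen.

```python
COLS, ROWS, BLOCK = 40, 25, 8          # chunky size and block height


def c2p_frame(chunky):
    """1000 chunky bytes in, 32000 bytes of ST low-res planar data out."""
    assert len(chunky) == COLS * ROWS
    screen = bytearray()
    for row in range(ROWS):                          # row25 loop (d3)
        line = bytearray()
        for col in range(0, COLS, 2):                # col40 loop (d5)
            pair = chunky[row * COLS + col: row * COLS + col + 2]
            for plane in range(4):                   # bploop (d2)
                for c in pair:
                    line.append(0xFF if c & (1 << plane) else 0x00)
        screen += line * BLOCK                       # high8 loop (d4)
    return bytes(screen)


def read_pixel(screen, x, y):
    """Decode one low-res pixel (0..15) back out of the planar data."""
    group = y * 160 + (x // 16) * 8    # 160 bytes per line, 8 per 16 pixels
    colour = 0
    for plane in range(4):
        word = (screen[group + plane * 2] << 8) | screen[group + plane * 2 + 1]
        if word & (0x8000 >> (x % 16)):              # leftmost pixel is bit 15
            colour |= 1 << plane
    return colour


# Test pattern: diagonal colour stripes across the chunky screen
chunky = bytes((x + y) & 15 for y in range(ROWS) for x in range(COLS))
screen = c2p_frame(chunky)
print(len(screen))                                   # prints: 32000
```

Reading pixels back with read_pixel is a handy way to convince yourself that
every 8x8 block really ends up with the colour of its chunky byte.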

That's it for the c2p code... But here's a look at the BSS segment. It rules -
because the BSS segment takes up no space in the file we get it for free, so we
can be really extravagant with it :)

         section  BSS
         ds.b     40          ; upper guard
chkscr:  ds.b     1000        ; 40*25
         ds.b     40          ; lower guard

Here we have 1000 bytes of  chunky screen, with upper and lower guard rows which
reduce weird edge effects in your effect.
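
To show what the guard rows buy you, here is a small Python sketch of an effect
pass that reads the cell above and below every chunky pixel without any edge
checks. The averaging filter is a generic example and not the one used in
gfire128; blur_vertical, GUARD and CHKSCR are names invented here.

```python
COLS, ROWS = 40, 25
GUARD = COLS                           # one spare row above and one below
CHKSCR = GUARD                         # offset of the visible 40*25 area


def blur_vertical(buf):
    """Average each visible cell with the cells directly above and below.

    Hypothetical filter for illustration only. Thanks to the guard rows,
    i - COLS and i + COLS always stay inside the buffer, so no edge checks.
    """
    out = bytearray(buf)
    for i in range(CHKSCR, CHKSCR + COLS * ROWS):
        out[i] = (buf[i - COLS] + buf[i] + buf[i + COLS]) // 3
    return out


buf = bytearray(GUARD + COLS * ROWS + GUARD)    # ds.b 40 / 1000 / 40
buf[CHKSCR + 5] = 15                            # hot spot on the top visible row
buf = blur_vertical(buf)
print(buf[CHKSCR + 5], buf[CHKSCR + COLS + 5])  # prints: 5 5
```

Note the guard rows only cover the vertical direction: left and right
neighbours at the ends of a row still wrap into the adjacent row.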

Finally, let's look at an example bytro - gfire128.prg. Run it and see the top
part of your screen burning furiously with an evil electric blue fire!! The
latest micro_virus? Or just some example code gwEm came up with to illustrate
his c2p technique? You decide. Either way, switching off the machine will stop
those flames before they set fire to your curtains.


Here's an overview of those 128bytes:

* First, we go into supervisor mode.

* Then we generate  that blue->black  palette. This is  expensive as it goes and
  uses a few  bytes I can  tell you. I can see why DefJam  used the ugly default
  palette for his fake plasma tunnel.

* Next we generate random heat spots over the top part of the screen. Because of
  code size limitations, we can only burn the top part of the screen... But hey,
  the DHS Falcon demo that started it all, with its fiery Atari logo, also had a
  similar  limitation. This is very  cheap in bytes, but if you look closely the
  random number generation is pretty un-random :)

* Of course next we have to spread the flames <fx: evil laughter>. Thanks to
  byte savings in other areas I've been able to implement this pretty nicely. I
  had to pay for the good smoothing algorithm with other things - a simple
  palette, and just the top part of the screen burning. Looking at the
  alternatives I think this was a good choice.

* Finally we have the c2p rout which was discussed earlier.

* Then we seed more heat spots and do it all over again.
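
As a rough idea of what the palette step in the list above has to produce: on a
plain ST each of the 16 palette registers is a word of the form $0RGB with 3
bits per channel, so a black to blue ramp only has 8 distinct levels. The
Python sketch below is one possible ramp, assumed purely for illustration; the
actual values and colour order used in gfire128 are not given in this article,
and blue_ramp is a made-up name.

```python
def blue_ramp():
    """16 ST palette words ($0RGB, 3 bits per channel on a plain ST).

    Assumed ramp for illustration: colour 0 is black, colour 15 is full
    blue. Only 8 blue levels exist, so neighbouring colours share a level.
    """
    return [(i >> 1) & 7 for i in range(16)]    # blue sits in the low nibble


print([f"${w:03X}" for w in blue_ramp()])
```

With this assumption a hotter chunky value simply maps to a brighter blue.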


That's all. I hope that if you didn't learn something then at least you got some
ideas... And if you make  anything using this  method, or if you make any bytros
at all, then I would be interested to see them - drop me an email!

Peace out!

gwEm (gwem(at)preromanbritain.com ) for Alive June 2004