Alive 10


       Hi there everybody, let's talk about  getting small. And before everyone
starts thinking perverse, let me clarify that  I'm talking about small code. How
small? Well...

       The subject  here  is  bootsector  coding.  I  wrote  this  one  for the
bootsector compo of the Outline 05  party  (sorry  I couldn't make it folks!). I
had the idea just 1 day  before  the  party  started,  so  I  was under a lot of
pressure and had to code at work (hehe :) as well as at home. The result is this
hideous thing I'll try to explain here.

       The effect of the bootsector consists of 2 routines:
a) Draw the (well known) busy bee
b) Do some funky effect with it

       Well, many people didn't understand what the effect was, so permit me to
say that it's an outline (perhaps you understand now where I got the concept for
the effect :)

       There are a couple of things I'd  like  to discuss and then I'll explain
the whole code, line by line.

       How do we make an outline of an object in the first place? Well, there
isn't just one way to do it, and which one you pick is a matter of personal
preference. The following techniques apply only to monochrome images; color
images need different techniques, which are beyond the scope of this document
(plus, I don't know them ;)

       So, let's assume we have the following bitmap we want to outline:

  KK  KK  UU   UU      AA
  KK KK               AAAA
  KKKK    UU   UU    AA  AA
  KK  KK   UUUUU   AA      AA

(ok, so I'm not a good artist, shoot me :) As I said, there is more than one way
to outline this, and that's because the screen is composed of a matrix of
pixels: a pixel touches 4 neighbours directly, or 8 if you count the diagonals.
So, two ways to outline the above image (stop laughing!) are these:

 kk  kk  uu   uu      aa             kkkkkkkkuuuu uuuu    aaaa
kKKkkKKkuUUu uUUu    aAAa            kKKkkKKkuUUu uUUu   aaAAaa
kKKkKKk  uu   uu    aAAAAa           kKKkKKkkuuuu uuuu  aaAAAAaa
kKKKKk  uUUu uUUu  aAAaaAAa          kKKKKkk uUUu uUUu aaAAaaAAaa
kKKkKKk uUUuuuUUu aAAAAAAAAa         kKKkKKkkuUUuuuUUuaaAAAAAAAAaa
kKKkkKKk uUUUUUu aAAaaaaaaAAa        kKKkkKKkuuUUUUUuuaAAaaaaaaAAa
 kk  kk   uuuuu   aa      aa         kkkkkkkk uuuuuuu aaaa    aaaa

(and you can thank God - or Drizzt &  Earx who coded the shell! - that the shell
has colors, or you wouldn't be able to understand this too easily :)

       But I hear you say "That's not an outline!". True, an outline doesn't
need the original image, and that gives us the following results:

 kk  kk  uu   uu      aa             kkkkkkkkuuuu uuuu    aaaa
k  kk  ku  u u  u    a  a            k  kk  ku  u u  u   aa  aa
k  k  k  uu   uu    a    a           k  k  kkuuuu uuuu  aa    aa
k    k  u  u u  u  a  aa  a          k    kk u  u u  u aa  aa  aa
k  k  k u  uuu  u a        a         k  k  kku  uuu  uaa        aa
k  kk  k u     u a  aaaaaa  a        k  kk  kuu     uua  aaaaaa  a
 kk  kk   uuuuu   aa      aa         kkkkkkkk uuuuuuu aaaa    aaaa

And now, without further ado, how to do this effect (the left one):

a) make a copy of the original image
b) blit the copy with OR one pixel above the original image
c) blit the copy with OR one pixel left of the original image
d) blit the copy with OR one pixel right of the original image
e) blit the copy with OR one pixel below the original image
f) XOR the copy with the original image
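
The steps above can be sketched in a few lines of Python. This is only a model
of the idea, not the bootsector code: each picture row is an integer used as a
bit mask, and the names WIDTH, shifted, outline and show are made up for the
illustration.

```python
# Sketch only: rows are Python ints used as bit masks, MSB = leftmost pixel.
WIDTH = 8  # hypothetical picture width in pixels

def shifted(rows, dx, dy):
    """Return a copy of rows moved dx pixels right and dy pixels down."""
    mask = (1 << WIDTH) - 1
    out = []
    for y in range(len(rows)):
        src = y - dy
        row = rows[src] if 0 <= src < len(rows) else 0
        row = (row >> dx) if dx >= 0 else (row << -dx)
        out.append(row & mask)
    return out

def outline(rows, directions):
    copy = list(rows)                          # a) keep a copy of the original
    acc = list(rows)
    for dx, dy in directions:                  # b)-e) OR the copy in, shifted
        moved = shifted(copy, dx, dy)
        acc = [a | m for a, m in zip(acc, moved)]
    return [a ^ c for a, c in zip(acc, copy)]  # f) XOR removes the original

FOUR = [(0, -1), (-1, 0), (1, 0), (0, 1)]            # the left picture
EIGHT = FOUR + [(-1, -1), (1, -1), (-1, 1), (1, 1)]  # the right picture

def show(rows):
    return ["".join("#" if row >> (WIDTH - 1 - x) & 1 else "."
                    for x in range(WIDTH)) for row in rows]

# A 2x2 block in the middle of an 8x6 picture
image = [0, 0, 0b00011000, 0b00011000, 0, 0]
print("\n".join(show(outline(image, FOUR))))
```

Passing EIGHT instead of FOUR fills in the corners as well, and feeding the
result back into outline() gives the "outline of the outline".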

       That's it! Now we are left with an outline in the place of the original
image! (for the image on the right we also need to blit copies with OR to the
upper left, upper right, lower left and lower right of the original image) Now,
what if we outline the outline? Well, we would end up with something like this:

 ]]  ]]  ]]   ]]      ]]            ]]]]]]]]]]]]]]]]]]]  ]]]]]]
]  ]]  ]]  ] ]  ]    ]  ]           ]            ]    ] ]]    ]]
] ]]  ]]  ]] ] ]]    ] ]] ]          ] ]]  ]]  ]] ] ]] ]]]  ]]  ]]
] ]] ]] ]]  ] ]  ]  ] ]]]] ]         ] ]] ]]      ]    ]]  ]]]]  ]]
] ] ]] ]] ]] ] ]]  ] ]]  ]] ]        ] ]]]]  ] ]] ] ]] ]  ]]  ]]  ]]
] ]] ]] ] ]]   ]] ] ]]]]]]]] ]       ] ]] ]]   ]]   ]]   ]]]]]]]]  ]
] ]]  ]] ] ]]]]] ] ]]      ]] ]      ] ]]  ]]   ]]]]]   ]]      ]] ]
]  ]]  ] ]     ] ]  ]]]]]]  ]       ]        ]       ]    ]]]]    ]
 ]]  ]]   ]]]]]   ]]      ]]        ]]]]]]]]]]]]]]]]]]]]]]]  ]]]]]]

       As you can see, progressive outlines of the same image can produce some
trippy effects! And so, without further ado, the source code with more comments
& explanations. Code will be colored like this, while my comments will be
colored like this.

; Started  24/3/05 10:16 (approx)
; Finished 23/3/05 15:08 (hopefully!)
; Rev.2 started  23/3/05 22:00 (more or less) - better in everything :)
; Rev.2 finished 23/5/05 23:52 (hopefully for a second time :)))
; Rev.3 started  30/3/05 00:23
; Rev.3 finished 30/3/05 00:34

Damn! I even forgot to write some credits!  Written by GGN in Turbo Assembler on
Steem Engine.

                IFEQ 0
               OPT X+
               pea     start(PC)
               move.w  #$26,-(SP)
               trap    #14
               addq.l  #6,SP
start:         clr.b   $FFFF8260.w

This is just some code to test the whole intro under a debugger (Bugaboo for
me). As you probably know, when the boot sector is being executed, we are
already in supervisor mode and in low resolution, but when a program is run
normally, we have to switch to supervisor mode manually. The switch to low-res
is needed because Turbo Assembler uses medium-res.

; plane/color 0 1 2 3 4 5 6 7 8 9 A B C D E F
;          0  o x o x o x o x o x o x o x o x
;          1  o o x x o o x x o o x x o o x x
;          2  o o o o x x x x o o o o x x x x
;          3  o o o o o o o o x x x x x x x x

               lea     $FFFF8240.w,A0
               moveq   #7,D0
pal:           move.l  #$0FFF0000,(A0)+
               dbra    D0,pal
;                move.l  #$0FFF,$FFFF8242.w *size optimising!
;                move.l  #$0FFF,$FFFF8246.w
;                move.l  #$0FFF,$FFFF824A.w
;                move.l  #$0FFF,$FFFF824E.w
;                move.l  #$0FFF,$FFFF8252.w
;                move.l  #$0FFF,$FFFF8256.w
;                move.l  #$0FFF,$FFFF825A.w
;                move.w  #$00,$FFFF825E.w

;                clr.w   $FFFF8242.w     ;black the 1st plane col
;                move.w  #$0FFF,$FFFF8244.w ;white the 2nd plane col
;explanation: 1st plane=draw plane
;              2nd plane=copy plane (invisible!)
;(was that an explanation???)

What was that all about?? Well, I need to explain some things first.

Do you remember where I wrote above that we need to make a copy of the image to
outline? Well, another feature of the bootsector is that it is loaded & run at a
specific address (which depends on your TOS version), and that address is quite
low. 512 bytes of memory are allocated specifically for that purpose. After this
area comes system data.

One of the goals of the compo was that the bootsector must have a clean exit to
the desktop, so that means that it has to be system legal (otherwise I could
have just used the memory immediately after the bootsector as a buffer). Now, I
didn't want to mess around with malloc or crap like that (!), and then I
realised that I used only 1 bitplane of the screen!

So, I just used 2 of the other bitplanes for buffer & screen swapping! "Wait a
minute, if you go and use the other bitplanes, the result will be visible, as
the screen will be filled with random colors!". True, that's why I fill the
whole palette! Let's take a closer look at the table above (an x denotes that
the plane is active for the given color and an o that the plane is inactive -
i.e. we don't draw on that plane to get that color):

; plane/color 0 1 2 3 4 5 6 7 8 9 A B C D E F
;          0  o x o x o x o x o x o x o x o x
;          1  o o x x o o x x o o x x o o x x
;          2  o o o o x x x x o o o o x x x x
;          3  o o o o o o o o x x x x x x x x

Now, my idea was to use the first bitplane only and make the other 2 invisible.
If I didn't write to the other planes, just blanking the 1st color would
suffice. Alas, since the other 2 planes are involved, mixed colors would appear
when I use them for buffering (for example, if a bit is on, i.e. 1, in both
plane 0 & plane 1, color 3 would be displayed). So the idea is to make all the
colors in which plane 0 is involved black as well, and the rest the background
color (i.e. white)!

And that's what the above code does!  The  commented  code was left for me, just
for reference (you can  notice  how  it  was  space optimised!). Confused? Maybe
you'll understand as you read further.
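
The palette rule can be checked with a small Python sketch. It is only a model:
palette() and longwords() are names invented here, and the only facts taken
from the listing are the 16 color registers and the $0FFF0000 longword that the
pal: loop stores 8 times.

```python
# Sketch: which of the 16 low-res colors must be black and which white.
WHITE, BLACK = 0x0FFF, 0x0000

def palette():
    # Bit n of a color index is the pixel's bit in plane n, so every odd
    # index has plane 0 switched on and must show up as black.
    return [BLACK if index & 1 else WHITE for index in range(16)]

def longwords(colors):
    # Pair the colors up the way the pal: loop writes them, 2 per move.l
    return [(colors[i] << 16) | colors[i + 1] for i in range(0, 16, 2)]

pal = palette()
print([hex(value) for value in longwords(pal)])
```

All 8 longwords come out identical, which is what lets a single move.l inside a
dbra loop replace the 8 separate writes kept in the commented-out code.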

               pea     cls(PC)         ;clear the screen (for tos 2.06)
               move.w  #9,-(SP)
               trap    #1
;                addq.l  #6,SP           ;put this in the add below :)

               move.w  #2,-(SP)        ;get phys address
               trap    #14
               addq.l  #2+6,SP

Here we do some standard calls to GEMDOS and XBIOS to clear the screen (by
printing ESC+E) & print the message, and to get the screen's address,
respectively. One small optimisation, as you see, is that I merged the 2 stack
corrections into 1.

               movea.l D0,A0
               lea     48+35*160(A0),A0
               movea.l A0,A6           ;a6=phys address (always!)

This just adds an offset to center  the  effect,  otherwise it would be drawn at
the top-left of the screen. Also, a copy  of this address is made into a6, which
will be used for the rest of the code.

               lea     drawlogo4(PC),A4 ;for eori below
               lea     gfx(PC),A1
               moveq   #14,D7          ;d7=lines to process

Here I initialise some variables for the loop below. The first lea will be
explained further down, don't worry :) A1 points to the gfx, which is the busy
bee, a word for each line (since the graphic is 16x16). Since the last line is
completely blank (see below), I don't draw it at all (and save 2 bytes!), that's
why d7 is 14 and not 15.

drawlogo:      move.w  (A1)+,D0        ;get h-line pixels
               moveq   #0,D1           ;d1=current line draw offset
               moveq   #15,D4          ;d4=pixels left to draw for current line
drawlogo2:     lsl.w   #1,D0           ;draw pixel?
               bcc.s   drawlogo4       ;nope skip drawing

               moveq   #7,D2           ;d2=bytes to draw
               move.w  D1,D3
drawlogo3:     st      0(A0,D3.w)
               st      4(A0,D3.w)      ;put the same into plane 3
               add.w   #160,D3
               dbra    D2,drawlogo3

drawlogo4:     addq.w  #1,D1           ;point to next pixel on screen
               eori.b  #%1100,(A4)     ;wicked smc shit!
               dbra    D4,drawlogo2

drawlogo99:    lea     8*160(A0),A0    ;point to next line
               dbra    D7,drawlogo

What the hell is going on here???? Well, firstly a gfx line is read (16 bits).
Each pixel will be "magnified" to an 8x8 block, which is handy, since 8 bits=1
byte! To determine if we need to draw a block, we do the lsl/bcc combination:
lsl shifts the leftmost bit out into the carry flag, and bcc skips the drawing
if that bit was clear (trace it with your debugger to see what it's all about if
you don't understand - it's not that tough!).

The drawing is done using the st instruction (which means "set true": it sets
all 8 bits of the byte at a given address). So, we need 8 st writes, for 8
horizontal strips (8 pixels wide), one on top of the other, to create an 8x8
block.
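
As a rough model of the drawing loop, here is a Python sketch that blows one
gfx word up into a row of 8x8 blocks. The screen is assumed to be a plain
bytearray with 160 bytes per line, draw_row() is a made-up name, and only
plane 0 is written (the listing also stores the same byte 4 bytes further on).

```python
LINE_BYTES = 160          # bytes per screen line in ST low resolution

def draw_row(screen, base, word):
    """Blow one 16-bit gfx word up into a row of 8x8 blocks on plane 0."""
    for block in range(16):
        bit_set = word & 0x8000          # the bit that lsl.w would shift out
        word = (word << 1) & 0xFFFF
        if not bit_set:
            continue                     # bcc.s: nothing to draw here
        # plane 0 bytes sit at 0,1,8,9,16,17... within a line
        offset = (block // 2) * 8 + (block % 2)
        for line in range(8):            # 8 strips of 8 pixels = one block
            screen[base + offset + line * LINE_BYTES] = 0xFF   # like st

screen = bytearray(LINE_BYTES * 8)
draw_row(screen, 0, 0b1100011010000100)  # 5th line of the bee
print([i for i in range(LINE_BYTES) if screen[i]])
```

The offset is computed with a division here for readability only; the listing
gets the same series far more cheaply, as explained next.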

d1 contains the offset to be drawn. And that's the real tricky bit here: how do
we calculate the next offset with the minimum required code? Let's construct a
series with all the screen offsets for the first bitplane for a given line.
Since the planes are interleaved, the series goes like this:

8x8 block: 00 01 02 03 04 05 06 07 08 09 .....
Offset   : 00 01 08 09 16 17 24 25 32 33 .....

So, to get from block #0 to block #1 we have  to add 1, from #1 to #2 we have to
add 7, from #2 to #3 we have to add  1,  from  #3 to #4 we have to add 7 etc. Do
you see a pattern here? 1,7,1,7,1,7... Let's break down the numbers in binary:

1: 00000001
7: 00000111

So, the difference between 1 and 7 is bits 1 & 2. Say we have a data register
loaded with 1. How can we get to 7? A simple OR with %110 would suffice, but how
would we get back to 1? An AND with %001 would be the answer, but how do we do
it in one instruction for both cases?

Simple, we use XOR (or EOR for those who prefer it this way) with %110. Now, the
legal way to do it would be to use a data register, XOR it with %110 and add it
to d1, but as I was a bit pressed for time I didn't think of that, and used
self-modifying code, changing the instruction at drawlogo4!!! (The constant in
the eori is %1100 rather than %110 because the count of an addq sits one bit
higher inside the opcode byte.) As I say above: this is not a good way to do it,
and in this case it yielded no size optimisation over the method I described.
But as I was pressed for time, I couldn't sit down and think it through :)
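
A quick way to convince yourself is this Python sketch of the "legal" variant
(a step value toggled with XOR and added to the offset). The function name
offsets() is made up; the two opcode words are the standard 68000 encodings of
addq.w #1,D1 and addq.w #7,D1.

```python
def offsets(count):
    """Plane 0 byte offsets of the first `count` blocks of a line."""
    step, offset, result = 1, 0, []
    for _ in range(count):
        result.append(offset)
        offset += step
        step ^= 0b110          # 1 -> 7 -> 1 -> 7 ...
    return result

print(offsets(10))             # [0, 1, 8, 9, 16, 17, 24, 25, 32, 33]

# The same toggle applied to the opcode itself: the quick data of addq
# lives in bits 11-9 of the opcode word, i.e. bits 3-1 of its first byte,
# which is why the listing uses %1100 and not %110.
ADDQ_W_1_D1 = 0x5241           # addq.w #1,D1
ADDQ_W_7_D1 = 0x5E41           # addq.w #7,D1
assert (ADDQ_W_1_D1 >> 8) ^ 0b1100 == ADDQ_W_7_D1 >> 8
```

Since the toggle runs 16 times per gfx line, the instruction is back to
addq.w #1 when the next line starts.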

Oh yeah, as you can see I use 2 st instructions: I just write the same data to
planes 0 and 2 (the 1st and 3rd bitplane). You'll see why below.

BLiTTER         EQU $FFFF8A20
Src_Xinc        EQU 32-32
Src_Yinc        EQU 34-32
Src_Addr        EQU 36-32
Endmask1        EQU 40-32
Endmask2        EQU 42-32
Endmask3        EQU 44-32
Dst_Xinc        EQU 46-32
Dst_Yinc        EQU 48-32
Dst_Addr        EQU 50-32
X_Count         EQU 54-32
Y_Count         EQU 56-32
HOP             EQU 58-32
OP              EQU 59-32
Line_Num        EQU 60-32
Skew            EQU 61-32

ylines          EQU 160

Standard equates for blitter, as well as  the number of vertical lines allocated
for the effect.

               lea     BLiTTER+Src_Xinc.w,A2
               movem.l blitterdata(PC),D0-D7 ;load blitter data
               movem.l D0-D7,(A2)      ;put data in blitter

A small trick my brother and I thought of about a decade ago (roughly!): Since
most of the blitter data doesn't change, plus loading the blitter registers with
move.l/w instructions would take a lot of space, we can load all the registers
in one go with a movem.l! The blitter data isn't anything fancy: blit a
rectangle of 160x160, full endmasks, no halftoning, variable skew, etc.
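
To see why a single movem.l is enough, the blitterdata table from the end of
the listing can be packed in Python and measured. This is only a check of the
layout: LAYOUT, NAMES and VALUES are names invented for the sketch, and the
values are copied from the table.

```python
import struct

# Field order and sizes follow the blitterdata table in the source:
# 'h' = signed word, 'H' = word, 'I' = longword, 'B' = byte, '>' = big endian
LAYOUT = ">hhIIHhhIHHBBBBH"
NAMES = ["sxinc", "syinc", "saddr", "endmask12", "endmask3", "dxinc",
         "dyinc", "daddr", "xcount", "ycount", "hop", "op", "line_num",
         "skew", "pad"]
VALUES = [8, 8 + 80, 0, 0xFFFFFFFF, 0xFFFF, 8, 8 + 80, 0, 10, 160,
          2, 6, 0, 1, 0]

block = struct.pack(LAYOUT, *VALUES)
print(len(block))                    # 32 bytes = 8 longwords = D0-D7

# The same block viewed the way movem.l sees it
registers = struct.unpack(">8I", block)

# 9 steps of 8 bytes inside a line plus the y increment give one screen line
assert (10 - 1) * 8 + (8 + 80) == 160
```

The byte positions also line up with the equates: block[27] is OP (59-32) and
block[29] is Skew (61-32).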

               lea     -8-16*160(A6),A6
               lea     2(A6),A5        ;for step 1
               lea     4(A6),A1        ;point to plane 3!!!
               lea     160(A1),A4      ;for step 2
               lea     -160(A1),A3     ;for step 3

Here I adjust the screen pointers for the blitting, and create some new ones
pointing to the other bitplanes, and some pointing one line above and below a1.

               moveq   #99,D7          ;wait 2 sec (on 50hz)
               bra.s   vs

A small delay to marvel at the bee before I destroy it!

               move.b  #18,$FFFFFC02.w ;turn mouse off

               moveq   #2,D7           ;this changed to d7 to work with TOS2.06
vs:            move.w  #37,-(SP)       ;vsync
               trap    #14
               addq.l  #2,SP
               dbra    D7,vs

Wait for as many VBLs as d7 says (either 100 or 3).

;step 1: make a copy of the graphic plane 3 in plane 1
s1:            move.b  #3,OP(A2)       ;copy source
               clr.b   Skew(A2)        ;no skew

               lea     4(A6),A1        ;vsync sucks :(
               move.l  A1,Src_Addr(A2) ;set source address=plane 3
               move.l  A6,Dst_Addr(A2) ;set dest address=plane 1
               move.w  #ylines,Y_Count(A2) ;y lines
               move.b  #192,Line_Num(A2) ;blit!

;step 2: make a copy of plane 1 in plane 2
               move.l  A6,Src_Addr(A2) ;set source address=plane 1
               move.l  A5,Dst_Addr(A2) ;set dest address=plane 2
               move.w  #ylines,Y_Count(A2) ;y lines
               move.b  #192,Line_Num(A2) ;blit!

;step 2: OR plane 2 into plane 3 (1 pixel up)
s2:             move.b  #7,OP(A2)       ;source OR destination
;               move.l  A5,Src_Addr(A2) ;source address=plane 2
;               move.l  A3,Dst_Addr(A2) ;dest address=plane 1
;               move.w  #ylines,Y_Count(A2) ;y lines
;               move.b  #192,Line_Num(A2) ;blit!

;step 3: OR plane 2 into plane 3 (1 pixel down)
;s3:             move.l  A5,Src_Addr(A2) ;source address=plane 2
;                move.l  A4,Dst_Addr(A2) ;dest address=plane 1
;                move.w  #ylines,Y_Count(A2) ;y lines
;                move.b  #192,Line_Num(A2) ;blit!

;step 4: OR plane 2 into plane 3 (1 pixel to the right)
;s4:             move.b  #1,Skew(A2)     ;1 pixel right skew (+NFSR???)
;                move.l  A5,Src_Addr(A2) ;source address=plane 2
;                move.l  A6,Dst_Addr(A2) ;dest address=plane 1
;                move.w  #ylines,Y_Count(A2) ;y lines
;                move.b  #192,Line_Num(A2) ;blit!

;step 5: OR plane 2 into plane 3 (1 pixel to the right & down)
               move.b  #1,Skew(A2)     ;1 pixel right skew (+NFSR???)
               move.l  A5,Src_Addr(A2) ;source address=plane 2
               move.l  A4,Dst_Addr(A2)
               move.w  #ylines,Y_Count(A2) ;y lines
               move.b  #192,Line_Num(A2) ;blit!

;step 6: OR plane 2 into plane 3 (1 pixel to the right & up)
s6:            move.l  A5,Src_Addr(A2) ;source address=plane 2
               move.l  A3,Dst_Addr(A2)
               move.w  #ylines,Y_Count(A2) ;y lines
               move.b  #192,Line_Num(A2) ;blit!

;step 7: OR plane 2 into plane 3 (1 pixel to the left)
;s7:             move.b  #-1,Skew(A2)    ;1 pixel left skew (+NFSR???)
;                move.l  A5,Src_Addr(A2) ;source address=plane 2
;                move.l  A6,Dst_Addr(A2)
;                move.w  #ylines,Y_Count(A2) ;y lines
;                move.b  #192,Line_Num(A2) ;blit!

;step 8: OR plane 2 into plane 3 (1 pixel to the left & up)
s8:            move.b  #-1,Skew(A2)    ;1 pixel left skew (+NFSR???)
               move.l  A5,Src_Addr(A2) ;source address=plane 2
               move.l  A3,Dst_Addr(A2)
               move.w  #ylines,Y_Count(A2) ;y lines
               move.b  #192,Line_Num(A2) ;blit!

;step 9: OR plane 2 into plane 3 (1 pixel to the left & down)
s9:            move.l  A5,Src_Addr(A2) ;source address=plane 2
               move.l  A4,Dst_Addr(A2)
               move.w  #ylines,Y_Count(A2) ;y lines
               move.b  #192,Line_Num(A2) ;blit!

;step 10: XOR plane 2 into plane 3 (original position!)
s10:           move.b  #6,OP(A2)       ;source xor destination
               clr.b   Skew(A2)        ;no skew
               move.l  A5,Src_Addr(A2) ;set source address=plane 1
               move.l  A1,Dst_Addr(A2) ;set dest address=plane 2
               move.w  #ylines,Y_Count(A2) ;y lines
               move.b  #192,Line_Num(A2) ;blit!

Well, not much to comment on here, other than to describe what is being done:
Firstly, some of the comments are false, and I'm not in the mood for fixing them
(now and probably ever). Now, plane 3 is copied to plane 1. Then plane 1 is
copied to plane 2. Then plane 3 is outlined with the help of plane 2 and I
repeat that until...

               cmpi.b  #57,$FFFFFC02.w ;space pressed?
               bne     mainloop        ;loop until space is pressed

               move.b  #8,$FFFFFC02.w


Restore mouse and return to the OS (or debugger)

cls:           ;dc.b 27,'E',0 ;clears the screen using vt52
               DC.B 27,'E'
               DC.B 'KÜA software productions: Atari or buST!'

Our small message.

;                     1234567890123456789012345678901234567890
blitterdata:   DC.W 8          ;sxinc
               DC.W 8+80       ;syinc
               DC.L 0          ;saddr
               DC.L $FFFFFFFF  ;endmask1,2
               DC.W $FFFF      ;endmask3
               DC.W 8          ;dxinc
               DC.W 8+80       ;dyinc
               DC.L 0          ;daddr
               DC.W 10         ;xcount
               DC.W 160        ;ycount
               DC.B 2          ;hop
               DC.B 6          ;op (xor!)
               DC.B 0          ;line number
               DC.B 1          ;skew
               DS.W 1          ;pad for 32 bytes

The data loaded to the blitter. One tiny thing I wanted to add above: shifting
right is no problem, as I simply have to load the skew with #1. But I was afraid
that to shift left instead I would have to use some special bits of the skew
register, and I would have to spend a lot of time looking them up. But, as I
found out, setting the skew register to -1 (i.e. 255, or $FF as a byte) set all
the correct bits automagically!!! (those geeks who designed the hardware were
pretty clever)
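
What the -1 actually sets can be spelled out with a small Python sketch. The
bit layout used here (bit 7 = FXSR, bit 6 = NFSR, bits 3-0 = skew) is the
commonly documented one for the ST blitter and is an assumption as far as this
article goes; decode_skew() is a made-up helper.

```python
def decode_skew(value):
    """Split a byte written to the skew register into its fields.

    Assumed layout: bit 7 = FXSR, bit 6 = NFSR, bits 3-0 = skew count.
    """
    value &= 0xFF                       # move.b #-1 stores $FF
    return {
        "fxsr": bool(value & 0x80),     # force extra source read
        "nfsr": bool(value & 0x40),     # no final source read
        "skew": value & 0x0F,           # right shift in pixels
    }

print(decode_skew(-1))   # {'fxsr': True, 'nfsr': True, 'skew': 15}
print(decode_skew(1))    # {'fxsr': False, 'nfsr': False, 'skew': 1}
```

Under that assumption, a right shift of 15 combined with an extra source word
read at the start of each line behaves like a shift of 1 pixel to the left,
which would explain why -1 "just works".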

gfx:           DC.W %0000100000000000
               DC.W %0000100000111100
               DC.W %0000000001100010
               DC.W %0000011011000010
               DC.W %1100011010000100
               DC.W %0001100110001010
               DC.W %0001101100010100
               DC.W %0000011011100000
               DC.W %0001110101011000
               DC.W %0011001111111100
               DC.W %0110000101100000
               DC.W %0100001011011110
               DC.W %0100010011011000
               DC.W %0100101001010110
               DC.W %0011010000010100
;               dc.w %0000000000000000  ;last line not needed ;)

Our glorious bee :)


...and that's it. Note that the main loop of the outline is very messy, and I
could have used 2 bitplanes instead of 3, which would result in faster & smaller
code, but I must say again that I didn't have enough time (nor do I now :)

       Another point I would like to discuss in the main loop: if you look
closely, I don't use any of the outline methods described above! I blit up-left,
up-right, down-left and down-right! Why? Firstly, I save some bytes (typical :).
Secondly, because of the blocky image I wanted to outline, blitting like that
yields the same result as blitting in every direction! (think about it a little
and it will become clear - if not, mail me :) That's true for the first frame;
as for the others, I'm certain that it's not the case.
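
The claim about blocky images can be tested with a Python sketch that keeps the
picture as a set of (x, y) pixels. It models the idea only, not the blitter
code, and the shape passed to blocky() is a made-up example rather than the bee.

```python
def dilate(pixels, directions):
    """OR the picture into itself, moved by every (dx, dy) in directions."""
    grown = set(pixels)
    for dx, dy in directions:
        grown |= {(x + dx, y + dy) for x, y in pixels}
    return grown

def outline(pixels, directions):
    return dilate(pixels, directions) ^ set(pixels)   # XOR out the original

DIAGONAL = [(-1, -1), (1, -1), (-1, 1), (1, 1)]
ALL = DIAGONAL + [(0, -1), (-1, 0), (1, 0), (0, 1)]

def blocky(cells, size=8):
    """Blow a set of (column, row) cells up into size x size pixel blocks."""
    return {(cx * size + x, cy * size + y)
            for cx, cy in cells for x in range(size) for y in range(size)}

shape = blocky({(0, 0), (1, 0), (1, 1), (3, 2)})      # hypothetical shape
print(outline(shape, DIAGONAL) == outline(shape, ALL))  # True

# A lone pixel is where the shortcut breaks down: 4 corners instead of 8
single = {(0, 0)}
print(len(outline(single, DIAGONAL)), len(outline(single, ALL)))  # 4 8
```

The shortcut works whenever every pixel has a neighbour both beside it and
above or below it in the picture, which 8x8 blocks guarantee; once thinner
features show up, the diagonal blits alone can miss pixels.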

       Well, ok, that's a wrap I guess. I  hope I didn't bore you too much. The
current length of this text is  about  47  times  bigger than the assembled code
itself,  which  goes  to  show  how  much  more  thought  must  go  into  coding

Until next time,

GGN/KÜA software productions in 2005
(colored using AKT)
