PSYCHO BEE
(THE MAKING OF)
Hi there everybody, let's talk about getting small. And before everyone
starts thinking perverse, let me clarify that I'm talking about small code. How
small? Well...
The subject here is bootsector coding. I wrote this one for the
bootsector compo of the Outline 05 party (sorry I couldn't make it folks!). I
had the idea just 1 day before the party started, so I was under a lot of
pressure and had to code at work (hehe :) as well as home. The result is this
hideous thing I'll try to explain here.
The effect of the bootsector consists of 2 routines:
a) Draw the (well known) busy bee
b) Do some funky effect with it
Well, many people didn't understand what the effect was, so permit me to
say that it's an outline (perhaps you understand now where I got the concept for
the effect :)
There are a couple of things I'd like to discuss and then I'll explain
the whole code, line by line.
How do you make an outline of an object in the first place? Well, there
isn't just one way to do it, and which one you pick is a matter of personal
preference. The following techniques apply only to monochrome images; color
images need different techniques, which are beyond the scope of this document
(plus, I don't know them ;)
So, let's assume we have the following bitmap we want to outline:
KK KK UU UU AA
KK KK AAAA
KKKK UU UU AA AA
KK KK UU UU AAAAAAAA
KK KK UUUUU AA AA
(ok, so I'm not a good artist, shoot me :) As I said, there are two main ways to
do an outline of this, and it's because the screen is composed of a matrix of
pixels. So, two ways to outline the above image (stop laughing!) are these:
kk kk uu uu aa kkkkkkkkuuuu uuuu aaaa
kKKkkKKkuUUu uUUu aAAa kKKkkKKkuUUu uUUu aaAAaa
kKKkKKk uu uu aAAAAa kKKkKKkkuuuu uuuu aaAAAAaa
kKKKKk uUUu uUUu aAAaaAAa kKKKKkk uUUu uUUu aaAAaaAAaa
kKKkKKk uUUuuuUUu aAAAAAAAAa kKKkKKkkuUUuuuUUuaaAAAAAAAAaa
kKKkkKKk uUUUUUu aAAaaaaaaAAa kKKkkKKkuuUUUUUuuaAAaaaaaaAAa
kk kk uuuuu aa aa kkkkkkkk uuuuuuu aaaa aaaa
(and you can thank God - or Drizzt & Earx who coded the shell! - that the shell
has colors, or you wouldn't be able to understand this too easily :)
But I hear you say "That's not an outline!". True, an outline doesn't
need the original image, and that gives us the following results:
kk kk uu uu aa kkkkkkkkuuuu uuuu aaaa
k kk ku u u u a a k kk ku u u u aa aa
k k k uu uu a a k k kkuuuu uuuu aa aa
k k u u u u a aa a k kk u u u u aa aa aa
k k k u uuu u a a k k kku uuu uaa aa
k kk k u u a aaaaaa a k kk kuu uua aaaaaa a
kk kk uuuuu aa aa kkkkkkkk uuuuuuu aaaa aaaa
And now, without further ado, how to do this effect (the left one):
a) make a copy of the original image
b) OR-blit the copy onto the original image, shifted one pixel up
c) OR-blit the copy onto the original image, shifted one pixel left
d) OR-blit the copy onto the original image, shifted one pixel right
e) OR-blit the copy onto the original image, shifted one pixel down
f) XOR the (unshifted) copy into the original image
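The steps above can be modelled in a few lines of Python. This is only a sketch under assumptions: the image is a set of (x, y) pixel coordinates instead of a real bitplane, OR-blitting becomes set union, and the names `outline`, `CROSS` and `DIAGONALS` are invented for this example.

```python
# Model of the outline recipe: the image is a set of (x, y) pixels,
# an OR blit is a set union and the final XOR is a symmetric difference.
CROSS = [(0, -1), (-1, 0), (1, 0), (0, 1)]        # up, left, right, down
DIAGONALS = [(-1, -1), (1, -1), (-1, 1), (1, 1)]  # the four corners


def outline(image, offsets=CROSS):
    copy = set(image)                  # a) keep a copy of the original
    work = set(image)                  # the "original" that gets blitted onto
    for dx, dy in offsets:             # b)-e) OR the shifted copy in
        work |= {(x + dx, y + dy) for (x, y) in copy}
    return work ^ copy                 # f) XOR the copy back out


square = {(0, 0), (1, 0), (0, 1), (1, 1)}          # a 2x2 test image
thin = outline(square)                              # cross-shaped neighbourhood
thick = outline(square, CROSS + DIAGONALS)          # all eight neighbours
print(len(thin), len(thick))
print(len(outline(thin)))                           # outline of the outline
```

Passing a different list of offsets changes the shape of the outline, and feeding the result back into `outline` gives the progressive effect.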
That's it! Now we are left with an outline in the place of the original
image! (For the image on the right we need to OR-blit copies to the
upper left, upper right, lower left and lower right of the original image.) Now,
what if we outline the outline? Well, we would end up with something like this:
]] ]] ]] ]] ]] ]]]]]]]]]]]]]]]]]]] ]]]]]]
] ]] ]] ] ] ] ] ] ] ] ] ]] ]]
] ]] ]] ]] ] ]] ] ]] ] ] ]] ]] ]] ] ]] ]]] ]] ]]
] ]] ]] ]] ] ] ] ] ]]]] ] ] ]] ]] ] ]] ]]]] ]]
] ] ]] ]] ]] ] ]] ] ]] ]] ] ] ]]]] ] ]] ] ]] ] ]] ]] ]]
] ]] ]] ] ]] ]] ] ]]]]]]]] ] ] ]] ]] ]] ]] ]]]]]]]] ]
] ]] ]] ] ]]]]] ] ]] ]] ] ] ]] ]] ]]]]] ]] ]] ]
] ]] ] ] ] ] ]]]]]] ] ] ] ] ]]]] ]
]] ]] ]]]]] ]] ]] ]]]]]]]]]]]]]]]]]]]]]]] ]]]]]]
As you can see, progressive outlines of the same image can produce some
trippy effects! And so, without further ado, the source code with more comments
& explanations. Code will be colored like this, while my comments will be
colored like this.
; Started 24/3/05 10:16 (approx)
; Finished 23/3/05 15:08 (hopefully!)
; Rev.2 started 23/3/05 22:00 (more or less) - better in everything :)
; Rev.2 finished 23/5/05 23:52 (hopefully for a second time :)))
; Rev.3 started 30/3/05 00:23
; Rev.3 finished 30/3/05 00:34
Damn! I even forgot to write some credits! Written by GGN in Turbo Assembler on
Steem Engine.
IFEQ 0
OPT X+
pea start(PC)
move.w #$26,-(SP)
trap #14
addq.l #6,SP
rts
start: clr.b $FFFF8260.w
ENDC
This is just some code to test the whole intro under a debugger (Bugaboo for
me). As you probably know, when the boot sector is being executed, we are
already in supervisor mode and in low resolution, but when a program is run
normally, we have to turn on supervisor mode manually. The switch to low-res is
needed as Turbo Assembler uses medium-res.
; plane/color 0 1 2 3 4 5 6 7 8 9 A B C D E F
; 0 o x o x o x o x o x o x o x o x
; 1 o o x x o o x x o o x x o o x x
; 2 o o o o x x x x o o o o x x x x
; 3 o o o o o o o o x x x x x x x x
lea $FFFF8240.w,A0
moveq #7,D0
pal: move.l #$0FFF0000,(A0)+
dbra D0,pal
; move.l #$0FFF,$FFFF8242.w *size optimising!
; move.l #$0FFF,$FFFF8246.w
; move.l #$0FFF,$FFFF824A.w
; move.l #$0FFF,$FFFF824E.w
; move.l #$0FFF,$FFFF8252.w
; move.l #$0FFF,$FFFF8256.w
; move.l #$0FFF,$FFFF825A.w
; move.w #$00,$FFFF825E.w
; clr.w $FFFF8242.w ;black the 1st plane col
; move.w #$0FFF,$FFFF8244.w ;white the 2nd plane col
;explanation: 1st plane=draw plane
; 2nd plane=copy plane (invisible!)
;(was that an explanation???)
What was that all about?? Well, I need to explain some things first.
Remember where I wrote above that we need to make a copy of the image to
outline? Well, another feature of the bootsector is that it is loaded & run at a
specific address (which depends on your TOS version), and that address is quite
low. 512 bytes of memory are allocated specifically for that purpose, and system
data sits right after this area.
One of the goals of the compo was that the bootsector must exit cleanly to
the desktop, which means that it has to be system legal (otherwise I could
have just used the memory immediately after the bootsector as a buffer). Now, I
didn't want to mess around with malloc or crap like that (!), and then I
realised that I was using only 1 bitplane of the screen!
So, I just use 2 of the other bitplanes for buffering & screen swapping! "Wait a
minute, if you go and use the other bitplanes, the result will be visible, as
the screen will be filled with random colors!". True, that's why I fill the
whole palette!
Let's take a closer look at the table above (an x denotes that the plane is
active for the given color and an o that the plane is inactive - i.e. we don't
draw on that plane to get that color):
; plane/color 0 1 2 3 4 5 6 7 8 9 A B C D E F
; 0 o x o x o x o x o x o x o x o x
; 1 o o x x o o x x o o x x o o x x
; 2 o o o o x x x x o o o o x x x x
; 3 o o o o o o o o x x x x x x x x
Now, my idea was to show the first bitplane only and make the other 2 invisible.
If I didn't write to the other planes, just setting up color 1 would
suffice. Alas, since the other 2 planes are involved, mixed colors would appear
when I use them for buffering (for example, if a bit is on -i.e. 1- in both
plane 0 & 1, color 3 would be displayed). So the idea is to make all the
colors in which plane 0 is involved black as well, and the rest the
background color (i.e. white)!
And that's what the above code does: each move.l writes $0FFF (white) to an
even color and $0000 (black) to the odd color after it. The commented code was
left in just for my reference (notice how it was space optimised!). Confused?
Maybe you'll understand as you read further.
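The palette trick can be checked with a small model. A minimal sketch, assuming the usual low-res rule that the colour number is built from one bit per plane (plane 0 being the lowest bit); `colour_index` and `visible` are names made up here.

```python
# The palette written by the loop: $0FFF (white) into every even colour,
# $0000 (black) into every odd colour.
WHITE, BLACK = 0x0FFF, 0x0000
palette = [WHITE, BLACK] * 8


def colour_index(p0, p1, p2, p3):
    """Colour number selected by the four plane bits of one pixel."""
    return p0 | (p1 << 1) | (p2 << 2) | (p3 << 3)


def visible(p0, p1, p2, p3):
    """Colour actually shown on screen for that combination of plane bits."""
    return palette[colour_index(p0, p1, p2, p3)]


# Whatever garbage sits in planes 1-3, only plane 0 decides the colour.
for bits in range(16):
    p0, p1, p2, p3 = [(bits >> n) & 1 for n in range(4)]
    assert visible(p0, p1, p2, p3) == (BLACK if p0 else WHITE)
```

The loop at the end walks all 16 combinations, so the buffer planes can hold anything without showing up.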
pea cls(PC) ;clear the screen (for tos 2.06)
move.w #9,-(SP)
trap #1
; addq.l #6,SP ;put this in the add below :)
move.w #2,-(SP) ;get phys address
trap #14
addq.l #2+6,SP
Here we do some standard calls to GEMDOS and XBIOS: the first clears the screen
(by printing ESC+E) & prints the message, the second gets the screen's address.
One small optimisation, as you see, is that I merged the 2 stack corrections
into 1.
movea.l D0,A0
lea 48+35*160(A0),A0
movea.l A0,A6 ;a6=phys adress (always!)
This just adds an offset to center the effect, otherwise it would be drawn at
the top-left of the screen. Also, a copy of this address is made into a6, which
will be used for the rest of the code.
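Where the 48+35*160 comes from can be sketched with the low-res address arithmetic. This assumes the standard ST low-res layout (160 bytes per line, 16 pixels stored as four interleaved words); `low_res_offset` is a hypothetical helper, not something from the bootsector.

```python
LINE_BYTES = 160      # one low-res scanline
GROUP_BYTES = 8       # 16 pixels = 4 interleaved plane words of 2 bytes


def low_res_offset(x, y, plane=0):
    """Byte offset of the word that holds pixel (x, y) in plane 0-3."""
    return y * LINE_BYTES + (x // 16) * GROUP_BYTES + plane * 2


BEE_WIDTH = 16 * 8                    # 16 blocks, each 8 pixels wide
left_edge = (320 - BEE_WIDTH) // 2    # x position that centres the bee
print(left_edge, low_res_offset(left_edge, 35))
print(48 + 35 * 160)
```

Both printed offsets agree, so the 48 is exactly the horizontal centring of a 128-pixel-wide picture, and the 35 is the starting line.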
lea drawlogo4(PC),A4 ;for eori below
lea gfx(PC),A1
moveq #14,D7 ;d7=lines to process
Here I initialise some variables for the loop below. The first lea will be
explained, don't worry :) A1 points to the gfx, which is the busy bee, one word
for each line (since the graphic is 16x16). Since the last line is completely
blank (see below), I don't draw it at all (and save 2 bytes!), that's why d7 is
14 and not 15.
drawlogo: move.w (A1)+,D0 ;get h-line pixels
moveq #0,D1 ;d1=current line draw offset
moveq #15,D4 ;d4=pixels left to draw for current line
drawlogo2: lsl.w #1,D0 ;draw pixel?
bcc.s drawlogo4 ;nope skip drawing
moveq #7,D2 ;d2=bytes to draw
move.w D1,D3
drawlogo3: st 0(A0,D3.w)
st 4(A0,D3.w) ;put the same into plane 3
add.w #160,D3
dbra D2,drawlogo3
drawlogo4: addq.w #1,D1 ;point to next pixel on screen
eori.b #%1100,(A4) ;wicked smc shit!
dbra D4,drawlogo2
drawlogo99: lea 8*160(A0),A0 ;point to next line
dbra D7,drawlogo
What the hell is going on here???? Well, firstly a gfx line is read (16 bits).
Each pixel will be "magnified" to an 8x8 block, which is handy, since 8 bits=1
byte! To determine if we need to draw a block, we do the lsl/bcc combination:
lsl shifts the top bit of the line out into the carry flag, and bcc skips the
drawing if that bit was 0 (trace it with your debugger if you don't understand
- it's not that tough!).
The drawing is done using the st instruction (which means set true or
something like that), which fills a byte at a given address with $FF. So, we
execute st 8 times, for 8 horizontal strips (8 pixels wide), one on top of the
other, to create an 8x8 block.
d1 contains the offset to be drawn. And that's the real tricky bit here: how do
we calculate the next offset with the minimum required code? Let's construct a
series with all the screen offsets for the first bitplane for a given line.
Since the planes are interleaved, the series goes like this:
8x8 block: 00 01 02 03 04 05 06 07 08 09 .....
Offset : 00 01 08 09 16 17 24 25 32 33 .....
So, to get from block #0 to block #1 we have to add 1, from #1 to #2 we have to
add 7, from #2 to #3 we have to add 1, from #3 to #4 we have to add 7 etc. Do
you see a pattern here? 1,7,1,7,1,7... Let's break down the numbers in binary:
Bit76543210
1: 00000001
7: 00000111
So, the difference between 1 and 7 is bits 1 & 2. Say we have a data register
loaded with 1. How can we get to 7? A simple OR with %110 would suffice, but how
would we get back to 1? An AND with %001 would be the answer, but how do we do
it with one instruction for both cases?
Simple, we use XOR (or EOR for those who prefer it this way) with %110. Now, the
legal way to do it would be to keep the step in a data register, XOR it with
%110 and add it to d1, but as I was a bit pressed for time I didn't think of
that, and used self-modifying code, changing the instruction at drawlogo4!!!
(The eori there uses %1100 rather than %110 because the immediate of the addq
sits one bit higher inside the first opcode byte.) To be clear: this is
not a good way to do it, and in this case it yielded no size optimisation over
the method I described. But as I was pressed for time, I couldn't sit down and
think it through :)
Oh yeah, as you can see I use 2 st instructions: I just write the same data to
plane 0 and plane 2 (the code comments count from 1 and call them plane 1 and
plane 3). You'll see why below.
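The shift-and-test loop and the 1,7,1,7 stepping can be modelled like this. It is a sketch, not the bootsector code: `block_offsets` is a made-up name, and the opcode values assume the standard 68000 ADDQ encoding (immediate in bits 11-9 of the opcode word).

```python
def block_offsets(line_word):
    """Plane 0 byte offsets of the 8x8 blocks to draw for one 16-bit gfx line."""
    offsets = []
    d1 = 0                                  # current draw offset, like D1
    step = 1                                # what the patched addq adds next
    for _ in range(16):
        carry = (line_word >> 15) & 1       # lsl.w #1,D0 pushes bit 15 out
        line_word = (line_word << 1) & 0xFFFF
        if carry:                           # bcc.s skips when the bit was 0
            offsets.append(d1)
        d1 += step
        step ^= 0b110                       # 1 -> 7 -> 1 -> 7 ...
    return offsets


print(block_offsets(0xFFFF))                # every block of a full line

# The same toggle applied to the instruction itself:
ADDQ_W_1_D1 = 0x5241                        # addq.w #1,D1
patched = ADDQ_W_1_D1 ^ (0b1100 << 8)       # eori.b #%1100 on the first opcode byte
print(hex(patched))                         # addq.w #7,D1
```

Since a line always has 16 pixels, the toggle runs an even number of times and the instruction is back to adding 1 when the next line starts.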
BLiTTER EQU $FFFF8A20
Src_Xinc EQU 32-32
Src_Yinc EQU 34-32
Src_Addr EQU 36-32
Endmask1 EQU 40-32
Endmask2 EQU 42-32
Endmask3 EQU 44-32
Dst_Xinc EQU 46-32
Dst_Yinc EQU 48-32
Dst_Addr EQU 50-32
X_Count EQU 54-32
Y_Count EQU 56-32
HOP EQU 58-32
OP EQU 59-32
Line_Num EQU 60-32
Skew EQU 61-32
ylines EQU 160
Standard equates for blitter, as well as the number of vertical lines allocated
for the effect.
lea BLiTTER+Src_Xinc.w,A2
movem.l blitterdata(PC),D0-D7 ;load blitter data
movem.l D0-D7,(A2) ;put data in blitter
A small trick my brother and I thought of about a decade ago (roughly!): Since
most of the blitter data doesn't change, and loading the blitter registers with
move.l/w instructions would take a lot of space, we can load all the registers
in one go with a movem.l! The blitter data isn't anything fancy: blit a
rectangle of 160x160, full endmasks, no halftoning, variable skew, etc.
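That the data block really lines up with the registers can be checked by packing it the way the DC directives lay it out. A minimal sketch: the values and offsets are copied from the listing, while `block`, `OFFSETS` and `read_word` are names invented for the check.

```python
import struct

# Register offsets relative to Src_Xinc, same as the equates ("address - 32").
OFFSETS = {
    "Src_Xinc": 0, "Src_Yinc": 2, "Src_Addr": 4,
    "Endmask1": 8, "Endmask2": 10, "Endmask3": 12,
    "Dst_Xinc": 14, "Dst_Yinc": 16, "Dst_Addr": 18,
    "X_Count": 22, "Y_Count": 24,
    "HOP": 26, "OP": 27, "Line_Num": 28, "Skew": 29,
}

# Big-endian, no padding: one entry per DC.W / DC.L / DC.B of blitterdata.
block = struct.pack(
    ">HHIIHHHIHHBBBBH",
    8, 8 + 80,      # source X / Y increment
    0,              # source address (set before every blit)
    0xFFFFFFFF,     # endmask 1 and 2
    0xFFFF,         # endmask 3
    8, 8 + 80,      # destination X / Y increment
    0,              # destination address (set before every blit)
    10, 160,        # X count (words per line), Y count (lines)
    2, 6, 0, 1,     # HOP, OP, line number, skew
    0,              # pad up to 32 bytes
)


def read_word(name):
    """Read the 16-bit value stored at a register's offset in the block."""
    return struct.unpack_from(">H", block, OFFSETS[name])[0]


# 32 bytes = 8 longwords, exactly what movem.l D0-D7 moves.
print(len(block), read_word("X_Count"), read_word("Y_Count"), block[OFFSETS["OP"]])

# 9 X increments plus one Y increment bring us to the next screen line.
line_stride = (read_word("X_Count") - 1) * read_word("Src_Xinc") + read_word("Src_Yinc")
print(line_stride)
```

The stride comes out as 160, one full low-res line, which is why the Y increment is written as 8+80.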
lea -8-16*160(A6),A6
lea 2(A6),A5 ;for step 1
lea 4(A6),A1 ;point to plane 3!!!
lea 160(A1),A4 ;for step 2
lea -160(A1),A3 ;for step 3
Here I adjust the screen pointers for the blitting, and create some new ones
pointing to the other bitplanes, and some pointing one line above and below a1.
moveq #99,D7 ;wait 2 sec (on 50hz)
bra.s vs
A small delay to marvel at the bee before I destroy it!
mainloop:
move.b #18,$FFFFFC02.w ;turn mouse off
moveq #2,D7 ;this changed to d7 to work with TOS2.06
vs: move.w #37,-(SP) ;vsync
trap #14
addq.l #2,SP
dbra D7,vs
Wait for as many VBLs as d7 says (either 100 or 3).
;step 1: make a copy of the graphic plane 3 in plane 1
s1: move.b #3,OP(A2) ;copy source
clr.b Skew(A2) ;no skew
lea 4(A6),A1 ;vsync sucks :(
move.l A1,Src_Addr(A2) ;set source address=plane 3
move.l A6,Dst_Addr(A2) ;set dest address=plane 1
move.w #ylines,Y_Count(A2) ;y lines
move.b #192,Line_Num(A2) ;blit!
;step 2: make a copy of plane 1 in plane 2
move.l A6,Src_Addr(A2) ;set source address=plane 1
move.l A5,Dst_Addr(A2) ;set dest address=plane 2
move.w #ylines,Y_Count(A2) ;y lines
move.b #192,Line_Num(A2) ;blit!
;step 2: OR plane 2 into plane 3 (1 pixel up)
s2: move.b #7,OP(A2) ;source OR destination
; move.l A5,Src_Addr(A2) ;source address=plane 2
; move.l A3,Dst_Addr(A2) ;dest address=plane 1
; move.w #ylines,Y_Count(A2) ;y lines
; move.b #192,Line_Num(A2) ;blit!
;step 3: OR plane 2 into plane 3 (1 pixel down)
;s3: move.l A5,Src_Addr(A2) ;source address=plane 2
; move.l A4,Dst_Addr(A2) ;dest address=plane 1
; move.w #ylines,Y_Count(A2) ;y lines
; move.b #192,Line_Num(A2) ;blit!
;step 4: OR plane 2 into plane 3 (1 pixel to the right)
;s4: move.b #1,Skew(A2) ;1 pixel right skew (+NFSR???)
; move.l A5,Src_Addr(A2) ;source address=plane 2
; move.l A6,Dst_Addr(A2) ;dest address=plane 1
; move.w #ylines,Y_Count(A2) ;y lines
; move.b #192,Line_Num(A2) ;blit!
;step 5: OR plane 2 into plane 3 (1 pixel to the right & down)
s5:
move.b #1,Skew(A2) ;1 pixel right skew (+NFSR???)
move.l A5,Src_Addr(A2) ;source address=plane 2
move.l A4,Dst_Addr(A2)
move.w #ylines,Y_Count(A2) ;y lines
move.b #192,Line_Num(A2) ;blit!
;step 6: OR plane 2 into plane 3 (1 pixel to the right & up)
s6: move.l A5,Src_Addr(A2) ;source address=plane 2
move.l A3,Dst_Addr(A2)
move.w #ylines,Y_Count(A2) ;y lines
move.b #192,Line_Num(A2) ;blit!
;step 7: OR plane 2 into plane 3 (1 pixel to the left)
;s7: move.b #-1,Skew(A2) ;1 pixel left skew (+NFSR???)
; move.l A5,Src_Addr(A2) ;source address=plane 2
; move.l A6,Dst_Addr(A2)
; move.w #ylines,Y_Count(A2) ;y lines
; move.b #192,Line_Num(A2) ;blit!
;step 8: OR plane 2 into plane 3 (1 pixel to the left & up)
s8: move.b #-1,Skew(A2) ;1 pixel left skew (+NFSR???)
move.l A5,Src_Addr(A2) ;source address=plane 2
move.l A3,Dst_Addr(A2)
move.w #ylines,Y_Count(A2) ;y lines
move.b #192,Line_Num(A2) ;blit!
;step 9: OR plane 2 into plane 3 (1 pixel to the left & down)
s9: move.l A5,Src_Addr(A2) ;source address=plane 2
move.l A4,Dst_Addr(A2)
move.w #ylines,Y_Count(A2) ;y lines
move.b #192,Line_Num(A2) ;blit!
;step 10: XOR plane 2 into plane 3 (original position!)
s10: move.b #6,OP(A2) ;source xor destination
clr.b Skew(A2) ;no skew
move.l A5,Src_Addr(A2) ;set source address=plane 1
move.l A1,Dst_Addr(A2) ;set dest address=plane 2
move.w #ylines,Y_Count(A2) ;y lines
move.b #192,Line_Num(A2) ;blit!
Well, not much to comment here, other than to write what is being done:
Firstly, some of the comments are false, and I'm not in the mood to fix them
(now and probably ever). Now, plane 3 is copied to plane 1. Then plane 1 is
copied to plane 2. Then plane 3 is outlined with the help of plane 2 (four
diagonal OR blits followed by one XOR blit) and I repeat that until...
cmpi.b #57,$FFFFFC02.w ;space pressed?
bne mainloop
...space is pressed
move.b #8,$FFFFFC02.w
rts
Restore mouse and return to the OS (or debugger)
cls: ;dc.b 27,'E',0 ;clears the screen using vt52
DC.B 27,'E'
DC.B 'KÜA software productions: Atari or buST!'
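Our small message. Note that the string has no terminating 0 byte of its own:
the blitterdata that follows starts with DC.W 8, whose high byte is 0 and ends
the string for free.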
; 1234567890123456789012345678901234567890
blitterdata: DC.W 8 ;sxinc
DC.W 8+80 ;syinc
DC.L 0 ;saddr
DC.L $FFFFFFFF ;endmask1,2
DC.W $FFFF ;endmask3
DC.W 8 ;dxinc
DC.W 8+80 ;dyinc
DC.L 0 ;daddr
DC.W 10 ;xcount
DC.W 160 ;ycount
DC.B 2 ;hop
DC.B 6 ;op (xor!)
DC.B 0 ;line number
DC.B 1 ;skew
DS.W 1 ;pad for 32 bytes
The data loaded to the blitter. One tiny thing I wanted to add above: shifting
right is no problem, as I simply have to load the skew with #1, but I was
afraid that to shift left I would have to use some special bits of
the skew register, and I would have to spend a lot of time looking them up. But,
as I found out, setting the skew register to -1 (i.e. 255, or $FF) set all the
correct bits automagically!!! (those geeks who designed the hardware were pretty
clever ;)
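Why -1 does the job can be shown with a small model. This is a sketch under stated assumptions: it takes the usual description of the skew register (bit 7 = FXSR, force an extra source read; bit 6 = NFSR, no final source read; low 4 bits = right shift count) and models one row with a 32-bit source buffer, ignoring endmasks and logic ops. `blit_row` and the two helpers are names invented here.

```python
def blit_row(src, skew_reg):
    """Simplified model of one blitter row (source copy, full endmasks)."""
    fxsr = bool(skew_reg & 0x80)        # extra source read before the first word
    nfsr = bool(skew_reg & 0x40)        # skip the source read for the last word
    skew = skew_reg & 0x0F              # right shift applied to the buffer
    words = list(src)
    buf, pos = 0, 0
    if fxsr:                            # prime the buffer with the first word
        buf = words[pos]
        pos += 1
    out = []
    for i in range(len(words)):
        if nfsr and i == len(words) - 1:
            nxt = 0                     # nothing is fetched for the last word
        else:
            nxt = words[pos] if pos < len(words) else 0
            pos += 1
        buf = ((buf << 16) | nxt) & 0xFFFFFFFF
        out.append((buf >> skew) & 0xFFFF)
    return out


def row_to_int(words):
    """Join 16-bit words into one big integer, first word on the left."""
    value = 0
    for w in words:
        value = (value << 16) | w
    return value


def int_to_row(value, count):
    """Split a big integer back into count 16-bit words."""
    return [(value >> (16 * (count - 1 - i))) & 0xFFFF for i in range(count)]


row = [0x8001, 0x0180, 0xF00F]
print((-1) & 0xFF)                               # the byte that move.b #-1 writes
print([hex(w) for w in blit_row(row, 0x01)])     # skew 1: one pixel right
print([hex(w) for w in blit_row(row, 0xFF)])     # skew -1: one pixel left
```

In this model a right shift of 15 on a buffer that was primed one word ahead is the same thing as a left shift of 1, so the three fields set by $FF work together.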
gfx: DC.W %0000100000000000
DC.W %0000100000111100
DC.W %0000000001100010
DC.W %0000011011000010
DC.W %1100011010000100
DC.W %0001100110001010
DC.W %0001101100010100
DC.W %0000011011100000
DC.W %0001110101011000
DC.W %0011001111111100
DC.W %0110000101100000
DC.W %0100001011011110
DC.W %0100010011011000
DC.W %0100101001010110
DC.W %0011010000010100
; dc.w %0000000000000000 ;last line not needed ;)
Our glorious bee :)
END
...and that's it. Note that the main loop of the outline is very messy, and I
could have used 2 bitplanes instead of 3, which would result in faster & smaller
code, but I must say again that I didn't have enough time (nor do I now :)
Another point I would like to discuss in the main loop: if you look
closely, I don't use any of the outline methods described above! I blit only
up-left, up-right, down-left and down-right! Why? Firstly, I save some bytes
(typical :). Secondly, because of the blocky image I wanted to outline, blitting
like that yields the same result as blitting in every direction! (think about it
a little and it will become clear - if not, mail me :) That's true in the first
frame; for the others I'm certain that it's not the case.
Well, ok, that's a wrap I guess. I hope I didn't bore you too much. The
current length of this text is about 47 times bigger than the assembled code
itself, which goes to show how much more thought must go into coding
bootsectors!
Until next time,
GGN/KÜA software productions in 2005
(ggn[at]atari[dot]org)
(colored using AKT)