Alive 10 - dim2st

Converting DIM files
What the hell is a DIM file? Well, let's see if we can shed some light 
on this subject within the following article. I asked ggn to write a bit 
about it just to get some attention to the subject and perhaps to help 
him to solve some of the issues he will describe in the following. Since 
he had only one day for this last minute contribution 



Introduction
------------

Isn't the three letter extension on filenames a wonderful thing? Not
only do you get instant information about the file without opening it,
but you also know what program to run to open it! Right? Bah!

Let's face the facts, three letters for an extension is too little.
Maybe in the old days it sufficed, but even 10 years ago it was not
enough! Some extensions, such as txt, doc, gif, jpg, img, prg, acc and
app, are well defined and easy to understand, but most of the rest are a
grey area.

Take for example the DIM extension. For all these years I had
encountered this extension with only one Atari program, Fastcopy Pro,
where it was used for storing Disk IMages. That's a fine idea and an
easy to use mnemonic. But what if somebody thought of this earlier or
later?

Now I know of three programs that use this extension: Fastcopy Pro,
Ecopy and Gcopy! Sheesh!

The purpose of this article is to describe (as well as I can) the format
of DIM files, how to parse them, and how to convert them to other
formats (.ST is the easiest one).



Description
-----------

Now then, let's begin with Fastcopy Pro, as I'm more confident with it
(relatively). These files have a 32 byte header and following is the
data for the disk. Removing the header MIGHT make the image identical to
a .ST image, but in most cases it won't. Why? Well, let me write down
some features of FCopy:

- Image disk using user typed info: Here the program ignores all legal
  ways to obtain the disk geometry (tracks/sectors/sides) and trusts the
  user's info. Also, the user can select the ending as well as the start
  track (so you can image tracks 12-45 for example).

- Image disk using info from BPB (BIOS Parameter Block - stuff stored on
  the bootsector of the disk containing the disk geometry amongst others):
  The whole disk is stored here.

- Image disk using info from BPB + FAT: This is the most complex of the
  three, and takes the info from BPB, plus reads the FAT and stores only
  clusters that are reportedly used. This makes for smaller images, and
  more irregular sizes.

As you can see from above, only the 2nd method of imaging ensures
standard image sizes (for example: 80 tracks x 2 sides x 9 sectors x
512 bytes = 737280 bytes + 32 bytes header = 737312 bytes). The other
two can have variable sizes. So that leaves us with the header.
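
As a quick check of the arithmetic above, here is a minimal Python
sketch. The function name and its parameters are made up for this
example; the 512 byte sector size and the 32 byte header are the values
mentioned in the text.

```python
def expected_dim_size(tracks, sides, sectors, sector_size=512, header_size=32):
    """Size in bytes of a whole-disk FCopy Pro image, header included."""
    return tracks * sides * sectors * sector_size + header_size


# The standard double sided, 80 track, 9 sector disk from the text.
print(expected_dim_size(80, 2, 9))
```

A file whose size differs from this figure was most likely made with one
of the other two imaging methods.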

As you can probably guess, the header doesn't just contain a message by
the author. It also provides us with useful information to understand
how the image is stored. Let's see a breakdown of the header:

Offset Contains
$00     $4242 ("BB") identifier
$02     low byte=1=get used sectors, hi byte=1=read disk conf
$04     seems to be always(?) zero

$06     hi byte=sides
$08     hi byte=sectors
$0a     hi byte=start track
$0c     hi byte=end track

$0e     RECSIZ sector size (bytes) 
$10     CLSIZ  sectors per cluster
$12     CLSIZB cluster size (bytes) 
$14     RDLEN  root dir size (sectors) 
$16     FSIZ   FAT size (sectors) 
$18     FATREC first sector of 2nd FAT (the one that is used by TOS)
$1a     DATREC Number of 1st data sector 
$1c     NUMCL  Number of data clusters: (total sectors - DATREC) / CLSIZ
$1e     BFLAGS (Bit 0 is 0 for 12 bit FAT, 1 for 16 bit FAT)

Let's go into detail a bit.

The first word contains an identifier. If it isn't equal to $4242, then
the file wasn't created with FCopy Pro and falls outside the scope of
this article. After this identifier come the flags for the imaging
options described above. Byte 2 contains the "read BPB" flag (1=on,
0=off). Byte 3 contains the "read FAT" (used sectors only) flag.
Obviously if byte 2 is 0, byte 3 is 0 too. That leaves us with the three
imaging options. I haven't seen bytes 4 and 5 be anything other than 0,
so I cannot comment further.

Byte 6     contains the number of sides. Byte 7 is always 0 (guess).

Byte 8     contains the number of sectors. Byte 9 is always 0 (guess).

Byte 10    contains the start track. Byte 11 is always 0 (guess).

Byte 12    contains the end track. Byte 13 is always 0 (I guess).

Taking the other disk structure information into account, I rather
believe that these values are stored in Intel manner (low byte first).


The following information block can easily be identified as a Bios
Parameter Block (BPB). The contained data is encoded in Motorola manner.

Byte 14+15 RECSIZ contains the sector's size in bytes. It's usually 512.

Byte 16+17 CLSIZ contains the number of sectors per cluster.

Byte 18+19 CLSIZB contains the cluster's size in bytes.

Byte 20+21 RDLEN contains the root directory's size in sectors

Byte 22+23 FSIZ contains FAT's size in sectors

Byte 24+25 FATREC contains the startsector of the second FAT

Byte 26+27 DATREC contains the number of the first data sector, which 
           starts after the sum of the system's sectors (FATREC + FSIZ +
           RDLEN). Basically it's the sum of the previous three values.

Byte 28+29 NUMCL contains the number of clusters on the disk: (SEC -
           DATREC) / CLSIZ, where SEC is the total number of sectors
           given in the bootsector.

Byte 30+31 BFLAGS Officially there should be some flags present here.
           However, it seems only bit 0 is in use; it indicates the FAT
           type. A 0 indicates FAT12 while a 1 indicates FAT16.
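
The breakdown above can be turned into a small parser. What follows is
only a sketch in Python, not the program that accompanies this article.
The names DimHeader, parse_dim_header, read_bpb and used_only are made
up for the example, and it assumes the layout is exactly as in the
table: flags in bytes 2 and 3, geometry values in the first byte of each
word, and the BPB copy stored as big endian words.

```python
import struct
from collections import namedtuple

# Field names follow the header table; "read_bpb" and "used_only" are
# labels invented for this sketch.
DimHeader = namedtuple(
    "DimHeader",
    "read_bpb used_only sides sectors start_track end_track "
    "recsiz clsiz clsizb rdlen fsiz fatrec datrec numcl bflags",
)


def parse_dim_header(raw):
    """Decode the 32-byte header of an FCopy Pro DIM file."""
    if len(raw) < 32:
        raise ValueError("need at least 32 header bytes")
    # Identifier word, two flag bytes, one unknown word (bytes 4-5).
    ident, read_bpb, used_only, _unknown = struct.unpack_from(">HBBH", raw, 0)
    if ident != 0x4242:
        raise ValueError("not an FCopy Pro image (no 'BB' identifier)")
    # Geometry: the value sits in the first byte of each word; the second
    # byte is assumed to be zero, as guessed in the text.
    sides, sectors, start_track, end_track = raw[6], raw[8], raw[10], raw[12]
    # Nine big-endian (Motorola) words make up the BPB copy at $0e-$1f.
    bpb = struct.unpack_from(">9H", raw, 14)
    return DimHeader(read_bpb, used_only, sides, sectors,
                     start_track, end_track, *bpb)
```

Typical use would be parse_dim_header(open(name, "rb").read(32)), after
which the three imaging options can be told apart by looking at
read_bpb and used_only.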


So, the proper way to convert an FCopy image to .ST is the following:

a)  Read in the header. If the identifier is missing, abort. Else, read
    in the rest of the header's values.

b)  Determine with what switches the image was stored. Then:

c1) If the image was stored with BPB info simply dump the clusters in
    another file.

c2) If the image was stored with user typed info then create a blank
    image with the size of 512*sides*tracks*sectors and dump only the
    tracks stored in it. If standard sizes and start/end track numbers
    are involved, this is identical to (c1).

c3) If the image was stored using BPB + FAT info, then we need to parse
    the FAT. If a cluster is reported used, then read one from the file
    and dump it on our image, else dump a blank cluster on our file.
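
Cases (c1) and (c2) can be sketched in a few lines of Python. This is
an illustration only: expand_track_range is a made-up name, and the
sketch assumes that the stored tracks follow the header back to back in
ascending order, that the blank parts of the image are zero filled, and
that the disk has end_track + 1 tracks unless total_tracks says
otherwise.

```python
def expand_track_range(payload, sides, sectors, start_track, end_track,
                       total_tracks=None, sector_size=512):
    """Place the stored tracks at their position inside a blank image."""
    track_bytes = sides * sectors * sector_size
    if total_tracks is None:
        # Assumption: nothing exists beyond the last stored track.
        total_tracks = end_track + 1
    image = bytearray(total_tracks * track_bytes)   # blank, zero filled
    offset = start_track * track_bytes
    wanted = (end_track - start_track + 1) * track_bytes
    chunk = payload[:wanted]
    # Equal-length slice assignment, so the image keeps its size.
    image[offset:offset + len(chunk)] = chunk
    return bytes(image)
```

Here payload is everything in the file after the 32 byte header. With
start track 0 and a standard geometry the result is simply the payload
itself, which is case (c1).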

Theoretically, that's about it. Now, let me add a few notes about the
implementation.



The bootsector
--------------

If you don't know, it's the first sector on the first side of the first
track of the disk. This sector contains some quite useful information
that describes our disk, and can contain a small 480 byte program to be
executed when this sector is read at boot time. I won't bother you with
all the details (Cyclone has another article that explains the
bootsector better), but I will mention some useful offsets of it as we
can use them for the conversion of disk images.

Note that 2-byte values are stored in little endian format. To read
them you have to apply the following formula: word=byte1+byte2*256 (or
you just swap the byte order).

Offset  Name  Meaning
26-27   SIDE  Number of sides (1 (eg. SF354) or 2 (eg. SF314))
24-25   SPT   Sectors per track
19-20   SEC   Total number of disk sectors
17-18   NDIRS Number of entries in the root directory (normally 112)
22-23   SPF   Sectors per FAT
11-12   BPS   Bytes per sector
13      SPC   Sectors per cluster

Some items that we use have to be derived from these figures. For
example, the number of tracks is (number of disk sectors)/(sectors per
track*sides).
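
Reading these fields could look like the following Python sketch. The
function name and the dictionary keys are chosen for this example; the
offsets and the little endian byte order are the ones from the table
above.

```python
import struct


def read_boot_geometry(boot):
    """Pull the geometry fields out of a raw bootsector (bytes)."""
    # "<" means little endian: word = byte1 + byte2 * 256.
    bps, spc = struct.unpack_from("<HB", boot, 11)         # offsets 11-13
    ndirs, sec = struct.unpack_from("<HH", boot, 17)       # offsets 17-20
    spf, spt, side = struct.unpack_from("<HHH", boot, 22)  # offsets 22-27
    if spt == 0 or side == 0:
        raise ValueError("bootsector holds no usable geometry")
    # Derived value: tracks = total sectors / (sectors per track * sides).
    tracks = sec // (spt * side)
    return {"BPS": bps, "SPC": spc, "NDIRS": ndirs, "SEC": sec,
            "SPF": spf, "SPT": spt, "SIDE": side, "TRACKS": tracks}
```

A bootsector full of zeros (or of garbage) raises an error here instead
of producing a division by zero, which is a convenient signal to fall
back on the values stored in the DIM header.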



FAT
---

Again, this is only a bare-bones description instead of a full blown one
(it would take too much time, and there isn't any, and it would require
some reading on my part, and I lack the resources and the time).

The FAT can be described as a cluster map. That is, it is a map that
contains a descriptor for each of the disk's clusters. Thus, a cluster
can have various states assigned to it. There are two copies of the FAT,
and TOS uses the 2nd FAT stored on disk as primary and keeps a copy of
it on FAT 1 (MS DOS does it the other way around).

Now, disks have a version of FAT called FAT12. That's because it uses 12
bits per cluster. Now, as we all know, a byte consists of 8 bits, so an
entry won't fit in one. It won't fit exactly in 2 bytes (a word) either,
so whoever designed FAT12 decided to use 3 bytes for every 2 clusters,
with 1 byte being shared between the two. In effect a FAT entry uses 1.5
bytes :)

The first FAT begins directly after the bootsector, but it's the copy,
so we should skip it in case it's corrupt or not synced. The first 3
bytes are not used for mapping the first 2 clusters; instead they are
reserved. The first byte contains the byte pattern that the empty
clusters are filled with, and the remaining 2 bytes are $ff.

So for each 2 clusters this is the way their FAT entries are stored.

             Byte 1    Byte 2   Byte 3
        Bits 76543210 76543210 76543210
Cluster bits 76543210 3210BA98 BA987654
Cluster no   11111111 22221111 22222222

So, to retrieve an "even" FAT entry (that is, its offset aligned to 3
bytes) we could use the following formula (assuming that 'buffer' has
the correct 2 sectors):

entry = PEEK(buffer) + (PEEK(buffer+1) AND 15)*256

While, to retrieve an "odd" entry (offset not aligned to 3) we could use:

entry = (PEEK(buffer) AND 240)/16 + PEEK(buffer+1)*16
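
The same two cases can be folded into one Python function. This is a
sketch with a made-up name (fat12_entry); 'fat' is assumed to hold the
raw bytes of one FAT copy and 'n' is the entry number, so the byte
where entry n starts is n*1.5 rounded down.

```python
def fat12_entry(fat, n):
    """Return the 12-bit FAT entry number n from the raw FAT bytes."""
    offset = n + n // 2            # n * 1.5 rounded down
    first, second = fat[offset], fat[offset + 1]
    if n % 2 == 0:
        # "Even" entry: whole first byte plus low nibble of the next one.
        return first + (second & 15) * 256
    # "Odd" entry: high nibble of the shared byte plus the whole next byte.
    return (first & 240) // 16 + second * 16


fat = bytes([0xF9, 0xFF, 0xFF, 0x03, 0x40, 0x00])
print([hex(fat12_entry(fat, i)) for i in range(4)])
```

The six sample bytes are the three reserved bytes followed by one pair
of entries, so entries 2 and 3 come out as 3 and 4.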

Now that we have extracted our entry, we can check it and see if the
cluster was stored on the file or not. In fact a FAT entry can have the
following values:

$000       Cluster is unused
$001       FAT doesn't use this value (impossible)
$002-$7FF  Points to next cluster
$800-$FEF  FAT doesn't use these values (impossible)
$FF0-$FF7  Cluster is defective
$FF8-$FFF  End of file

So in practice whenever we encounter $000 or $FF0 through $FF7, the
cluster is definitely not stored and we can dump zeros (or an array
filled with the FAT's first byte - described above). If we find values
that the FAT doesn't use we can easily assume that the FAT is corrupt
and that we need to rely on just the bootsector info (or FCopy's stored
values) and fall back to dumping the whole image.
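
Put together, the decision described above might look like this in
Python. It is a sketch under several assumptions, none of which are
confirmed by FCopy itself: the value ranges are the ones from the table
above, the used clusters are assumed to sit in the file in ascending
order, and 'entries' is assumed to be the list of FAT entries for the
data clusters (cluster 2 onwards). The names cluster_state and
rebuild_data_area are made up.

```python
def cluster_state(entry):
    """Classify a FAT12 entry using the value table above."""
    if entry == 0x000:
        return "free"
    if 0x002 <= entry <= 0x7FF or 0xFF8 <= entry <= 0xFFF:
        return "used"              # part of a file, or its last cluster
    if 0xFF0 <= entry <= 0xFF7:
        return "bad"
    return "invalid"               # $001 and $800-$FEF


def rebuild_data_area(entries, stored, cluster_bytes, fill=0):
    """Rebuild the data area from the clusters that were stored."""
    out = bytearray()
    pos = 0
    for entry in entries:
        state = cluster_state(entry)
        if state == "invalid":
            # FAT looks corrupt: the caller should dump the whole image.
            raise ValueError("unexpected FAT entry $%03X" % entry)
        if state == "used":
            chunk = stored[pos:pos + cluster_bytes]
            if len(chunk) != cluster_bytes:
                raise ValueError("image data ends early")
            out += chunk
            pos += cluster_bytes
        else:
            out += bytes([fill]) * cluster_bytes   # blank cluster
    return bytes(out)
```

The 'fill' parameter is there so that the blank clusters can be filled
with the FAT's first byte instead of zeros, if that turns out to be the
better choice.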



The end?
--------

Most certainly not! Included with this article should be a small program
that converts a "used sectors" DIM image to ST. Some of the variable
names might be misleading. Please bear with that, as I'm still learning
about the subject (I bet that Cyclone will over-edit this article to be
concise :)

Also, this program is still under development, and it will eventually
support ECopy too. (ECopy uses a model like the one described above, but
with some variations that I haven't to this date been able to
understand, so I omitted all ECopy support for this source and from the
article itself).

While writing this article I read that the STEem authors plan to add
FCopy DIM support to their excellent emulator, plus the MSA converter
plans on doing the same thing.

To be continued.....

GGN/KuA for Alive, 2005-05-13