The 3" bible - part II

Continued from page 1.

Disk formats

A disc can, in some ways, be compared to a tape: the main differences are the size (disc) and the fact that data can be accessed randomly instead of sequentially. That is: the read head does not have to search through the disc the way a tape head has to search through the tape; it can jump to the requested part of the disc right away. Of course this access is in fact not random: it has been systematically defined in a logical format layout, not to be confused with the physical format. A standard PCW 8512, for example, has only one physical format (the 3" disc) but uses at least two logical formats: CF2 for the single-sided drive A and CF2DD for the double-sided drive B.

Before a disc can be used in a PCW it has to be given such a logical format. This formatting process is usually performed by DISCKIT, by LocoScript (from version 2 onwards) or by another utility program. In this process the disc is magnetically divided into a pattern of tracks, each subdivided into sectors. It is a bit like a record album, except that a disc has several circular magnetic tracks rather than one physical groove. Tracks are divided into sectors because of the speed of the computer compared to that of the drive mechanism: a whole track cannot be read into memory at once, so it is read one sector at a time.

These numbered sectors have not been laid out sequentially: the numbers are interleaved, a 'skewing'. A physical sequence of 1,2,3,4,5,6,7,8,9 would pose a problem if the drive were instructed to read sectors 1 to 9 one after another. By the time the data in sector 1 has been read by the drive and interpreted by the PCW, the read head has already passed the start of sector 2 because of the mechanical speed, and the disc would have to make a full revolution before sector 2 comes round again. Although disc drives have stepper motors which are very accurate, this is a speed issue that can only be compensated for by the 'sector skew'. A simple trick: just skip a position and lay out the track as 1,6,2,7,3,8,4,9,5 (a sector skew of 1, needless to say…). In this way the read head reaches sector 2 by the time the system has interpreted the data from sector 1, simply by skipping sector 6, whose contents were of no use to the system at that moment anyway… A simple trick that can easily mislead you when looking at data at such a low level.
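The interleaved layout can be reproduced with a few lines of Python. This is only an illustrative sketch: the function name skewed_layout and the convention that a skew of 1 means "leave one physical position between consecutive sector numbers" are assumptions made for this example, not something taken from DISCKIT or the PCW firmware.

```python
def skewed_layout(sectors_per_track: int, skew: int) -> list[int]:
    """Return the sector numbers in the physical order they pass the head.

    `skew` is the number of physical positions left free between two
    consecutive sector numbers (hypothetical convention for this sketch).
    """
    layout = [0] * sectors_per_track      # 0 marks a position not yet assigned
    pos = 0
    for sector in range(1, sectors_per_track + 1):
        # If the wanted position is already taken, slide on to the next free one.
        while layout[pos] != 0:
            pos = (pos + 1) % sectors_per_track
        layout[pos] = sector
        pos = (pos + skew + 1) % sectors_per_track
    return layout


print(skewed_layout(9, 1))   # [1, 6, 2, 7, 3, 8, 4, 9, 5]
print(skewed_layout(9, 0))   # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

With 9 sectors and a skew of 1 the result is the sequence quoted above; a skew of 0 gives the plain sequential layout.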

The first tracks of a logical format usually have special functions. The very first track on discs for the PCW is often used by the system for booting and identification purposes. Subsequent tracks hold the directory listing, and this is the great difference between a disc and a tape. A tape reader has to go sequentially through the entire tape, usually from the beginning to the end, until it has found the data it is looking for. A disc drive accesses the directory tracks (also sequentially), retrieves the location of the data it is looking for and then jumps to that location (the so-called random access feature).

Some of the data that makes up the logical format of a disc:

This block size is also a difference with a tape, which can be used to its maximum limit: a disc cannot. Because a disc needs to be indexed in a directory, all segments of that disc need an address by means of which they can be located on the disc. As a directory should, by definition, be an abbreviated listing of details, this address is a reference that takes less space than the data block itself. The larger the block, the more efficient the storage in the directory is. However, as you can see in the explanations with the CF2 and CF2DD formats, the waste in data storage can be considerable. First of all, the gross data capacity of a disc is truncated to the largest whole number of data blocks that fit in the data tracks. And, even worse, a file always consists of at least one block. If the block size is 10 kilobytes and the file actually contains one kilobyte, 9 kilobytes are wasted: the system does not allow a block to be shared by more than one file (that would corrupt the directory). Block size is a compromise between speed and directory efficiency on the one hand and storage efficiency on the other. Library and compression systems (filing all individual files into one file and compressing them on the basis of algorithms) can reduce this problem.
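The waste described here is easy to put into numbers. The sketch below is only an illustration: the function names are invented for this example, and the rule that even the smallest file takes one whole block simply follows the text above.

```python
import math

KB = 1024


def blocks_needed(file_bytes: int, block_bytes: int) -> int:
    """Whole blocks occupied by a file; at least one, as stated in the text."""
    return max(1, math.ceil(file_bytes / block_bytes))


def wasted_bytes(file_bytes: int, block_bytes: int) -> int:
    """Bytes allocated to the file but not actually holding file data."""
    return blocks_needed(file_bytes, block_bytes) * block_bytes - file_bytes


# The example from the text: a 1 kilobyte file on a disc with 10 kilobyte blocks.
print(wasted_bytes(1 * KB, 10 * KB) // KB)   # 9
```

The same file on a disc with 1 kilobyte blocks wastes nothing, which is the storage side of the compromise.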

Other aspects of a logical disc format are compromises as well (in fact all aspects of computers are compromises between some sort of speed and efficiency). The ratio between the size of the directory and the data capacity is a good example. A user who stores only large files has little use for a directory that can hold a lot of small directory entries: that would have suited a user with many small files… As you can see in the section on the CF2 format, the standard A drive format can store a maximum of 64 entries in the directory and has 173kb of storage. If a user stores 64 files of 1kb each, the directory is full: although there is still 109kb of free data space, the disc is 'full'…
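The 'full with free space' situation can be checked with a small calculation. This is a simplified sketch: the function name is invented, every file is assumed to need exactly one directory entry, and file sizes are assumed to be whole kilobytes that fit the blocks without waste.

```python
def directory_state(file_sizes_kb: list[int],
                    dir_entries: int = 64,
                    capacity_kb: int = 173) -> tuple[int, int]:
    """Return (free directory entries, free data space in kb).

    The defaults are the CF2 figures quoted in the text. One entry per
    file is a simplification made for this sketch.
    """
    free_entries = dir_entries - len(file_sizes_kb)
    free_kb = capacity_kb - sum(file_sizes_kb)
    return free_entries, free_kb


# 64 files of 1kb each: no directory entries left, 109kb of data space unused.
print(directory_state([1] * 64))   # (0, 109)
```

As soon as the first number reaches zero no further file can be created, whatever the second number says.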

Bearing this information in mind you can see that the standard logical formats of a PCW, CF2 and CF2DD are not economical sizes: whether they fit your needs is a question you can now answer for yourselves…

CF2 and CF2DD formats

CF2 and CF2DD are the standard formats on the PCW 8256/8512: the later models use the CF2DD format only. Both formats are double density: CF2 is single sided and has 40 tracks of 9 sectors each, with 512 bytes per sector; CF2DD is double sided and has 80 tracks. The full data is:

Do note that the reserved track is a waste if the disc is not an Early Morning Start disc (that is, if no valid .EMS file for CP/M or LocoScript is present on the disc). The space reserved for EMS purposes is useless with CF2DD, as a standard PCW 8000 will never boot from drive B (though there are several companies, like Pinboard, that offer B drives the PCW can boot from). Perhaps Amstrad already anticipated the arrival of the PCW 9000 series, on which only CF2DD is used… The data also applies to the PCW 9512 and, I guess, also to the PcW 9256, PcW 9512 and PCW 10 (although I have not seen these). The PCW 16, 'Anne' (instead of 'Joyce'), uses a different format and even a different operating system (from Creative Technology, the maker of MicroDesign, the excellent DTP package).
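The capacity figures quoted in this document (173kb for CF2, 706kb for CF2DD) can be cross-checked from the geometry. The sketch below is a rough reconstruction, not the official format definition: the 1kb block size for CF2, the 256 directory entries for CF2DD and the single reserved track on both formats are assumptions made for this example, and the class and field names are invented.

```python
import math
from dataclasses import dataclass

SECTOR_BYTES = 512
SECTORS_PER_TRACK = 9
DIR_ENTRY_BYTES = 32


@dataclass(frozen=True)
class LogicalFormat:
    name: str
    sides: int
    tracks_per_side: int
    reserved_tracks: int   # assumed: one track kept for booting
    block_bytes: int       # CF2 value is an assumption, CF2DD is from the text
    dir_entries: int       # CF2 value is from the text, CF2DD is an assumption

    def gross_kb(self) -> int:
        total = self.sides * self.tracks_per_side * SECTORS_PER_TRACK * SECTOR_BYTES
        return total // 1024

    def net_kb(self) -> int:
        data_tracks = self.sides * self.tracks_per_side - self.reserved_tracks
        data_bytes = data_tracks * SECTORS_PER_TRACK * SECTOR_BYTES
        blocks = data_bytes // self.block_bytes          # truncated to whole blocks
        dir_blocks = math.ceil(self.dir_entries * DIR_ENTRY_BYTES / self.block_bytes)
        return (blocks - dir_blocks) * self.block_bytes // 1024


CF2 = LogicalFormat("CF2", sides=1, tracks_per_side=40, reserved_tracks=1,
                    block_bytes=1024, dir_entries=64)
CF2DD = LogicalFormat("CF2DD", sides=2, tracks_per_side=80, reserved_tracks=1,
                      block_bytes=2048, dir_entries=256)

for fmt in (CF2, CF2DD):
    print(fmt.name, fmt.gross_kb(), fmt.net_kb())
```

Under these assumptions the net figures come out at 173kb and 706kb, against gross capacities of 180kb and 720kb: the difference is the reserved track, the directory and the part of the data area that does not fill a whole block.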

Less common formats

A disc drive, however, is not limited to a single logical format: it is common knowledge that a 3.5" 1.44mb high density drive can also read double density 720kb discs, and that the PCW drive B can read (though not write) the discs from drive A. If you are interested you may want to read the details on how to customise logical formats: if not, this paragraph will suffice, as it gives an overview of other logical formats that are often used on the PCW series, not just for the 3" but also for the 5.25" and 3.5" disc sizes.

 

This table has been taken from DISCTOOL, the excellent formatting tool from Matthijs Vermeulen (see download section). Besides these, the PCW is also capable of reading the various formats of its predecessor, the Amstrad CPC series. Do note that if you start using one of the non-standard formats you can no longer use the system utility DISCKIT and are "restricted" to DISCTOOL. DISCKIT can only deal with CF2/173kb and CF2DD/706kb. You may run into problems when using MS-DOS tools like MS Oddball, the JOYCE emulator and 22DISK (the latter has been provided with extended definitions to deal with SF2DD and XF2DD: see the download section).

Directory

The function of the directory is known: listing the contents and whereabouts of the files on disc. As it sits on one of the first tracks of the disc and is the most frequently used area, errors occur here earlier than in other parts of the disc. Error messages referring to the directory often do not pose a risk to the data, which usually is fully intact: only the listing has been damaged. Data salvage is often possible and not that hard (in theory, that is…).

The directory only contains data on the data (files):

Disc is Double Density, Double Sided, 96 tpi (80-track)
Physical Sector Sequence for Head 1, Track 0 (9 Sectors)
(Decimal) : 1 6 2 7 3 8 4 9 5
(Hex)     : 1 6 2 7 3 8 4 9 5
Probable gap sizes : Gap3 (Format) = 90   Gap3 (R/W) = 46
Sector 1 is 512 (&0200) bytes. C=&00, H=&01, R=&01, N=&01
Head = 1, Track = 0, Sector = 1

Addr : 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F  0123456789ABCDEF
0000 : 00 42 41 53 49 53 20 20 20 4d 44 54 00 00 00 6D  .BASIS   MDT...m
0010 : 04 00 05 00 06 00 07 00 08 00 09 00 0A 00 00 00  ................
0020 : 00 42 41 53 49 53 31 20 20 4D 44 54 00 00 00 80  .BASIS1  MDT....
0030 : 0B 00 0C 00 0D 00 0E 00 0F 00 10 00 11 00 12 00  ................
0040 : 00 42 41 53 49 53 31 20 20 4D 44 54 01 00 00 48  .BASIS1  MDT...H
0050 : 13 00 14 00 15 00 16 00 17 00 00 00 00 00 00 00  ................
0060 : 00 44 49 52 44 41 54 20 20 53 43 52 00 00 00 80  .DIRDAT  SCR....
0070 : 18 00 19 00 1A 00 1B 00 1C 00 1D 00 1E 00 1F 00  ................

A)nalyse Disc Format   H)ead no.   T)rack no.   S)ector no.
R)ead sector   E)dit sector   W)rite sector   EXIT to end
← Prev Sector   ↓ Prev Track   → Next Sector   ↑ Next Track
- Display previous part of sector   + Display next part of sector
Select : n

 

Almost all disc editors look alike: this is Moonstone's Multiformat. There is a header with information on the logical format (shown is a CF2DD), a main field displaying the data on the disc, and a lower part of the screen showing the keys to use. The main section shows the first half of a 512-byte sector, with addresses on the left, the hexadecimal display in the middle and the ASCII display on the right.

In the picture 4 directory entries are present, each 32 bytes long and spanning two lines of 16 bytes. Relevant information can be obtained from both the hexadecimal and the ASCII display (note that where a . is displayed in ASCII, there is most likely no printable ASCII character for that byte). The first part of an entry looks like this:

 

  1. User number of the file. In LocoScript these are called groups (0..7 + Limbo groups). Usually this will be 0 but it can range from 0 to 15 decimal (0F hexadecimal): a special code is E5. This code indicates that a file has been marked as deleted: it will no longer be shown in a directory listing and will be overwritten when the system needs its space. Un-erasing files in CP/M is therefore very easy to do: just change the code back from E5 into 00 and the file will be present again (if no directory extents are in use: see further on). There are several utilities that will do this for you but DIY is very easy… This differs from MS-DOS, where the first character of the name itself is replaced: unerasing in DOS requires you to fill in the first character of the name again;
  2. The name of the file: 8 characters in hexadecimal code. A fixed-length string that is padded with 20 (the space character) when not all the characters are used;
  3. The extension name of the file: 3 characters in hexadecimal code. Note that the usual . is missing from the actual storage: CP/M knows where it should be, so it is redundant. The extension is often reserved for the system and it is not a good idea to use .COM, .$$$ and other important extensions for the naming of your own files;
  4. Extent number. An entry in the directory takes up 32 bytes, 16 of which are available for the storage of allocation block numbers. With CF2DD each block number takes 2 bytes, so an entry can list 8 blocks, and the block size is 2kb. This results in an 8x2kb=16kb capacity per directory entry for CF2DD. When a file exceeds that 16kb a second entry is created in the directory: the extent;
  5. Reserved for file attributes;
  6. Record count showing the actual number of records in this entry. For compatibility reasons this counter does not count in physical sectors (512 bytes) but in standard 128-byte records. The maximum is 80 hexadecimal (128 records of 128 bytes, i.e. 16kb);
  7. ASCII display of the hexadecimal part.
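The un-erase trick from item 1 can be sketched in Python on a copy of a directory sector held in memory. This is a hedged illustration only: the function name undelete is invented, reading and writing the sector on a real disc is left out, and nothing checks whether the data blocks of the file have been reused in the meantime.

```python
DELETED = 0xE5      # marker for a deleted entry, as described in item 1
ENTRY_SIZE = 32     # one directory entry


def undelete(sector: bytes, name: str, ext: str, user: int = 0) -> bytes:
    """Return a copy of `sector` in which deleted entries for name.ext are restored.

    Only entries whose 11 name bytes match are touched, because a slot
    filled entirely with E5 would otherwise be 'restored' as well.
    """
    wanted = (name.upper().ljust(8) + ext.upper().ljust(3)).encode("ascii")
    data = bytearray(sector)
    for offset in range(0, len(data) - ENTRY_SIZE + 1, ENTRY_SIZE):
        if data[offset] == DELETED and bytes(data[offset + 1:offset + 12]) == wanted:
            data[offset] = user   # put the user number back in place of E5
    return bytes(data)


# A deleted copy of the BASIS.MDT entry from the screen above, padded with E5 bytes.
deleted_entry = bytes.fromhex(
    "E5 42 41 53 49 53 20 20 20 4D 44 54 00 00 00 6D"
    "04 00 05 00 06 00 07 00 08 00 09 00 0A 00 00 00"
)
sector = deleted_entry + bytes([DELETED]) * (512 - ENTRY_SIZE)
restored = undelete(sector, "BASIS", "MDT")
print(hex(restored[0]))   # 0x0
```

Matching on the name means that every extent of the file is restored in one pass; the E5 padding after the entry is left alone.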

The second part of the directory entry reads:

 

  1. The number of the first allocated block: note that in the CF2DD format every block number takes up a pair of bytes, low byte first, because such a disc holds more than 255 blocks; the second byte of each pair is zero here;
  2. The first unused pair (00 00). Seven blocks (04 to 0A) are in use, so the file occupies 7x2kb=14kb. The record counter from the first line reads 6D (decimal 109) x 128 bytes=13,952 bytes in actual use. The allocated blocks are 7x2,048=14,336 bytes, so 384 bytes of the last block have been wasted. As explained earlier, unused space from one allocation block cannot be allocated to another file. By placing files in (compressed) libraries this waste can be prevented or reduced.
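The two halves of a directory entry can be decoded with a short Python sketch, using the first entry from the screen above as input. The class and function names are invented for this example, the 2-byte block numbers and 2kb block size apply to CF2DD only, and masking the top bit of the name bytes is a precaution assumed here rather than something described in the text.

```python
import struct
from dataclasses import dataclass

RECORD_SIZE = 128    # bytes per record
BLOCK_SIZE = 2048    # CF2DD block size


@dataclass
class DirEntry:
    user: int
    name: str
    ext: str
    extent: int
    records: int
    blocks: list[int]


def parse_entry(raw: bytes) -> DirEntry:
    """Decode one 32-byte directory entry (layout as in the two lists above)."""
    if len(raw) != 32:
        raise ValueError("a directory entry is exactly 32 bytes")
    # Assumption: clear the top bit before decoding, in case it is used as a flag.
    text = bytes(b & 0x7F for b in raw[1:12]).decode("ascii")
    pointers = struct.unpack("<8H", raw[16:32])   # eight 16-bit numbers, low byte first
    return DirEntry(
        user=raw[0],
        name=text[:8].rstrip(),
        ext=text[8:].rstrip(),
        extent=raw[12],
        records=raw[15],
        blocks=[p for p in pointers if p != 0],   # 0 marks an unused slot
    )


entry = parse_entry(bytes.fromhex(
    "00 42 41 53 49 53 20 20 20 4D 44 54 00 00 00 6D"
    "04 00 05 00 06 00 07 00 08 00 09 00 0A 00 00 00"
))
used = entry.records * RECORD_SIZE
allocated = len(entry.blocks) * BLOCK_SIZE
print(entry.name, entry.ext, entry.blocks)   # BASIS MDT [4, 5, 6, 7, 8, 9, 10]
print(used, allocated, allocated - used)     # 13952 14336 384
```

The figures match the calculation in item 2: 13,952 bytes in use, 14,336 bytes allocated and 384 bytes wasted.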

Data salvage

Knowledge of the filing system is a must when solving disc errors. When you experience problems with the data on a disc you have to combine several pieces of the information on disc formats with your own estimates about the file.

Here are some tips and hints:

Easier said than done. The easy way out, when facing disc errors, is to consult an expert in the field.
