Building a retro DOS PC. Part-2 Theoretical overview of DOS



This post is a continuation of my previous post about building a DOS PC. Today I want to give a theoretical overview of DOS and highlight some of its limitations and strengths. Without this context, many realities of the DOS era, which made sense for the technologies of the past, might seem weird today, so I decided to write a dedicated post on the theory before sharing any practical advice.

16 Bit Real Mode

DOS itself and the majority of DOS applications operated in the 16-bit Real Mode of x86 processors.

It means that there was no memory protection: applications could access the same memory as the OS, and application faults often couldn’t be fixed by any means other than restarting the whole system.

While the CPU operated with 16-bit registers, the address bus used in the first x86 computers was 20 bits wide. That means that you had 1 MB as your memory limit, and to address a byte in memory it was required to combine two different 16-bit values: segment and offset. The implicit formula for addressing one byte was ((segment << 4) + offset) % 2^20. This addressing scheme divides the memory space into overlapping regions called segments. A new segment begins every 16 bytes, but the size of each segment is 64 KB, therefore a single byte in memory can be addressed by many different combinations of segment and offset values.
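To make the formula concrete, here is a minimal Python sketch of the address calculation. The function name linear_address and the sample values are my own choices for illustration, they do not come from any DOS API.

```python
def linear_address(segment: int, offset: int) -> int:
    """Translate a real-mode segment:offset pair into a 20-bit physical address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    return ((segment << 4) + offset) % (1 << 20)


# Three different segment:offset pairs that point at the same byte.
aliases = [(0x1234, 0x0010), (0x1235, 0x0000), (0x1000, 0x2350)]
for seg, off in aliases:
    print(f"{seg:04X}:{off:04X} -> {linear_address(seg, off):05X}")

# On a 20-bit bus the sum wraps around: FFFF:0010 lands back at address 0.
print(f"FFFF:0010 -> {linear_address(0xFFFF, 0x0010):05X}")
```

All three pairs in the list resolve to the same physical address, which is exactly the overlap shown in the diagram below.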

| .. segment #1 .....|           
    | .. segment #2 .....|       
       | .. segment #3 .....|    
           | .. segment #4 .....|

The CPU has standard registers for storing segment values of the currently executed program: for its code (CS), data (DS) and stack (SS). There are also several other segment registers which can be used by applications.

In practice this segmentation scheme means that using more than 64 KB for the program code, static data, stack or dynamic memory involves additional computations to properly maintain the segment registers, which might be non-trivial. Small 16-bit real mode DOS applications which do not use more than 64 KB for their code, static data, stack or heap need to change only offset values in the course of their execution and therefore tend to run faster than real mode applications which require more memory.
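A rough illustration of that extra work, as a Python sketch. The helper names advance_near and advance_far are hypothetical. They only model the arithmetic a program (or its compiler) has to perform, not any real API.

```python
def advance_near(offset: int, count: int) -> int:
    """Move a pointer that is an offset only: cheap, but it wraps inside the segment."""
    return (offset + count) & 0xFFFF


def advance_far(segment: int, offset: int, count: int) -> tuple[int, int]:
    """Move a full segment:offset pointer, recomputing the segment when needed.

    The result is normalized so that the offset is always in the range 0..15.
    """
    linear = (segment << 4) + offset + count
    return (linear >> 4) & 0xFFFF, linear & 0xF


# Stepping one byte past the end of segment 0x2000:
print(f"near: 2000:{advance_near(0xFFFF, 1):04X}")   # wraps to the start of the segment
seg, off = advance_far(0x2000, 0xFFFF, 1)
print(f"far:  {seg:04X}:{off:04X}")                  # moves on to the next 64 KB
```

The near version is a single 16-bit addition. The far version needs a shift, an addition and a split back into two registers on every step that may cross a segment boundary.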

The limitations described above might sound discouraging in the 21st century, but imagine the performance of an environment with small applications already written and tested, where the whole system memory could easily fit into the CPU cache of a modern processor.

DOS memory layout

In DOS the original 1 MB of memory addressable with a 20-bit bus is split into two parts - Conventional Memory (the first 640 KB) and the Upper Memory Area (UMA) (the remaining 384 KB). The first is the memory available to applications and the second is reserved for such things as the BIOS, video memory and optional memory-mapped devices. For compatibility reasons the sizes of Conventional Memory and the Upper Memory Area remained unchanged. In order to get more memory for applications, special programs called Memory Managers were used. Memory Managers employed various techniques which I’d like to briefly explain below.

Since the 80286, x86 computers started using more than 20 bits in the address bus, so they technically were able to address more than 1 MB of memory. With that the effective formula for obtaining an address in 16-bit real mode became ((segment << 4) + offset) % 2^bus_width, where bus_width is the number of address lines. The only difference is that the 21st bit is not dropped any longer, so for certain values of segment and offset the result of this operation could slightly exceed 2^20-1, providing access to a small region beyond the original 1 MB limit. This region of size 64 KB minus 16 bytes is called the High Memory Area (HMA). Since this behaviour was not backward compatible with older software, the electrical line transmitting the 21st bit was disabled by default in real mode, but Memory Managers could enable it by controlling a special gate called A20.
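The effect of the A20 gate can be sketched in a few lines of Python. This is only a model of the arithmetic. The function name and the a20_enabled flag are made up for the example.

```python
def real_mode_address(segment: int, offset: int, a20_enabled: bool) -> int:
    """Physical address produced in real mode on a CPU with more than 20 address lines."""
    linear = (segment << 4) + offset   # at most 0x10FFEF, which needs 21 bits
    if not a20_enabled:
        linear &= 0xFFFFF              # bit 20 is forced to zero: wrap around like an 8086
    return linear


hma_start = real_mode_address(0xFFFF, 0x0010, a20_enabled=True)
hma_end = real_mode_address(0xFFFF, 0xFFFF, a20_enabled=True)
hma_size = hma_end - hma_start + 1

print(f"HMA: {hma_start:#x}..{hma_end:#x}, {hma_size} bytes")
print(f"A20 off: FFFF:0010 -> {real_mode_address(0xFFFF, 0x0010, a20_enabled=False):#x}")
```

The computed size equals 64 * 1024 - 16 bytes, which matches the size of the HMA given above.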

The High Memory Area is a part of Extended Memory, which is everything above the original 1 MB limit.

The Extended Memory Specification (XMS), implemented by Memory Managers, allowed applications to copy data between Conventional Memory and Extended Memory (this involved switching the processor to protected mode, in which it could address memory above 1 MB). Additionally, XMS covered the usage of free regions in Upper Memory by mapping RAM to them. These regions are called Upper Memory Blocks (UMBs).

An additional technique called Expanded Memory was used to access more memory. It was based on bank-switching of memory pages originally provided by external hardware. Only a few pages could be accessed at any given time (in EMS, four pages of 16 KB each), and they were mapped to a region of the Upper Memory Area called the page frame. The most common standard for Expanded Memory was the Expanded Memory Specification (EMS). Since the 80386, Expanded Memory could be simulated in software by using Extended Memory as a source.
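Bank switching is easier to grasp with a toy model. The Python sketch below assumes the layout used by EMS, a 64 KB page frame that holds four 16 KB pages. The class and method names are hypothetical, and a real program would ask the memory manager to do the mapping instead.

```python
PAGE_SIZE = 16 * 1024   # size of one expanded memory page
FRAME_SLOTS = 4         # assumption: 64 KB page frame = 4 slots of 16 KB


class ExpandedMemory:
    """Toy model: many logical pages, only FRAME_SLOTS of them visible at once."""

    def __init__(self, logical_pages: int) -> None:
        self.pages = [bytearray(PAGE_SIZE) for _ in range(logical_pages)]
        self.mapping: list[int | None] = [None] * FRAME_SLOTS

    def map_page(self, slot: int, logical_page: int) -> None:
        """Make a logical page visible in one slot of the page frame."""
        if not 0 <= slot < FRAME_SLOTS:
            raise IndexError("no such slot in the page frame")
        if not 0 <= logical_page < len(self.pages):
            raise IndexError("no such logical page")
        self.mapping[slot] = logical_page

    def _locate(self, frame_offset: int) -> tuple[bytearray, int]:
        slot, offset = divmod(frame_offset, PAGE_SIZE)
        logical_page = self.mapping[slot]
        if logical_page is None:
            raise RuntimeError("nothing is mapped at this frame offset")
        return self.pages[logical_page], offset

    def read(self, frame_offset: int) -> int:
        page, offset = self._locate(frame_offset)
        return page[offset]

    def write(self, frame_offset: int, value: int) -> None:
        page, offset = self._locate(frame_offset)
        page[offset] = value


ems = ExpandedMemory(logical_pages=64)   # 1 MB of expanded memory
ems.map_page(0, 10)
ems.write(0, 0xAB)                       # lands in logical page 10
ems.map_page(0, 11)                      # same frame address, different page
print(ems.read(0))                       # page 11 is still empty
ems.map_page(0, 10)
print(hex(ems.read(0)))                  # the value written earlier is back
```

The application always reads and writes the same addresses inside the page frame. What changes is which physical page sits behind them.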

|      Memory accessible in real mode                | not accessible in rm |
| Conventional Memory  |  Upper Memory  Area   | HMA |                      |
|                      |    |page frame| |UMB| |  Extended Memory  ...      |
                                 /                                 /|\
                                 |                                  |
    | Expanded Memory: page 1| page 2|... |    <--------------------/      

Finally, I would like to mention DOS extenders, which enabled running applications in protected mode (by switching the CPU to and from real mode when necessary to provide the DOS API to the application). Since the 80386, programs could be compiled as 32-bit protected mode applications, and DOS extenders made it possible to run such applications in DOS. In 32-bit mode applications could address up to 4 GB of memory. I found some mentions of 64-bit extenders for DOS but can’t confirm any details.

Start-up and execution of real mode applications

A DOS session starts by processing directives in the main configuration file (often called CONFIG.SYS). Among other things, the configuration file specifies the command shell program. The command shell supports execution of commands in batches and also provides an interactive shell to the user. During start-up, before switching to the interactive mode, the command shell can execute a batch file (often called AUTOEXEC.BAT) which can be edited by the user to change the start-up behaviour. The command shell supports various commands which utilize the DOS API and is capable of launching other applications. All of this is accomplished by invoking a special software interrupt handled by the OS. In the same way as the command shell, any other DOS application can launch other DOS programs via the DOS API (software interrupt).

During the preparation to run a program the caller application is required to make an additional system call to prepare a block of 256 bytes with the program metadata. This block is called the Program Segment Prefix (PSP) and it’s placed at the beginning of some segment. Among many other things, the PSP contains the command line arguments and the segment value of the block of memory with environment variables. When all preparations are done, the caller application calls the exec API, and what happens next depends on the type of the executable. Applications in DOS come in one of two types of executable files - COM and EXE (the executable files usually have the corresponding filename extensions).

COM files are super simple. They came from CP/M - an operating system developed earlier for computers of an older generation. COM files don’t have any headers and contain only program code and static data sections. They are simply loaded after the PSP into the same segment, so that the code entry point is located right after the PSP block. All segment registers are set to the PSP segment. The stack pointer is set to the offset of the last word in the segment, and from there the stack grows down. The maximum binary size is limited to 64 KB minus 256 bytes, which is the size of one segment minus the size of the PSP, but any uninitialized static data and the stack must also fit into the same segment:

| segment begins                                        segment ends|
---------------------------------------------------------------------
| PSP        |.code|.data|uninitialized data(.bss) |free space|stack|
---------------------------------------------------------------------
{ 256 bytes }{ File size }                                          |

This format is sufficient for small applications and should be preferred for them, as it involves fewer steps for starting the program and does not increase the binary size.
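The layout of a COM program in its segment can be worked out with simple arithmetic. Here is a Python sketch of it. The function com_layout and its parameters bss_size and stack_reserve are my own inventions for illustration, and the amount of stack a program really needs depends on the program.

```python
SEGMENT_SIZE = 0x10000   # one 64 KB segment
PSP_SIZE = 0x100         # 256-byte Program Segment Prefix


def com_layout(file_size: int, bss_size: int = 0, stack_reserve: int = 0x200) -> dict[str, int]:
    """Compute where the parts of a COM program end up inside its single segment."""
    if file_size > SEGMENT_SIZE - PSP_SIZE:
        raise ValueError("COM image does not fit into one segment")
    image_end = PSP_SIZE + file_size
    bss_end = image_end + bss_size
    free = SEGMENT_SIZE - stack_reserve - bss_end
    if free < 0:
        raise ValueError("code, data and stack do not fit into one segment")
    return {
        "entry_offset": PSP_SIZE,         # code starts right after the PSP
        "image_end": image_end,           # end of the bytes loaded from the file
        "bss_end": bss_end,               # end of uninitialized data
        "free_bytes": free,               # gap between the data and the reserved stack
        "initial_sp": SEGMENT_SIZE - 2,   # offset of the last word in the segment
    }


layout = com_layout(file_size=12_000, bss_size=4_000)
for name, value in layout.items():
    print(f"{name:>12}: {value:#06x}")
```

The entry offset is always 0x100, which is why COM programs are assembled with their origin at that offset.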

EXE files contain a header with a relocation table for supporting relocatable code. It allows having multiple segments loaded at arbitrary addresses, and unlike COM files, EXE binaries can be larger than 64 KB. The benefits come with a price - segment addresses in the object code need to be adjusted in memory during the load of the application to point to the correct locations (which are unknown at compile time).
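The adjustment step can be sketched in Python as follows. This is a simplified model that assumes the relocation table has already been read from the header as a list of (segment, offset) pairs relative to the start of the loaded image. The function name and the sample image are made up for the example.

```python
import struct


def apply_relocations(image: bytearray, relocations: list[tuple[int, int]], load_segment: int) -> None:
    """Add the load segment to every 16-bit segment value listed in the relocation table."""
    for seg, off in relocations:
        pos = (seg << 4) + off                        # position of the word inside the image
        (value,) = struct.unpack_from("<H", image, pos)
        struct.pack_into("<H", image, pos, (value + load_segment) & 0xFFFF)


# A tiny fake image: a far jump (opcode 0xEA) to 0010:0000, i.e. offset word, then segment word.
image = bytearray(0x110)
image[0:5] = bytes([0xEA, 0x00, 0x00, 0x10, 0x00])

# The segment word of the jump target sits at image offset 3 and must be fixed up.
apply_relocations(image, [(0x0000, 0x0003)], load_segment=0x2000)

(patched_segment,) = struct.unpack_from("<H", image, 3)
print(f"jump target segment after loading: {patched_segment:04X}")
```

The linker writes segment values as if the program were loaded at segment 0, and the loader corrects each of them once the real load address is known.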

Background programs and device drivers

Before multitasking was considered necessary for personal computers, background programs already existed without the need for any multitasking support from the OS. In DOS such programs are called Terminate-and-stay-resident programs (TSRs). These programs use a special interrupt to terminate their execution but remain in memory. Before exiting, TSRs can install interrupt handlers to react to hardware or timer events or to provide functions that can be called from other applications via software interrupts. Interrupt handlers can be chained by TSRs if several programs should act upon the same event. Interrupt handlers need to save and restore only those registers which they use, and thus the cost of context switching for background programs in DOS could be lower than the cost of standard context switching between processes in a multitasking OS. Loading TSRs (as well as DOS system libraries) to Upper Memory or the High Memory Area was a common practice to preserve Conventional Memory for user applications.
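Handler chaining can be modelled in Python with a dictionary standing in for the interrupt vector table. Everything here is a simulation with hypothetical names. A real TSR would read and replace the vector through DOS calls and jump to the previous handler in machine code.

```python
from typing import Callable

Handler = Callable[[list[str]], None]

TIMER_INTERRUPT = 0x08   # vector used by the hardware timer on the PC


def bios_timer(log: list[str]) -> None:
    """Stand-in for the original handler that was installed first."""
    log.append("BIOS timer")


def install_tsr(vectors: dict[int, Handler], int_no: int, name: str) -> None:
    """Hook an interrupt: remember the old handler, install a new one that chains to it."""
    previous = vectors[int_no]        # "get vector"

    def handler(log: list[str]) -> None:
        log.append(f"{name} tick")    # the TSR's own work
        previous(log)                 # pass the event on to whoever was there before

    vectors[int_no] = handler         # "set vector"


vectors: dict[int, Handler] = {TIMER_INTERRUPT: bios_timer}
install_tsr(vectors, TIMER_INTERRUPT, "clock")
install_tsr(vectors, TIMER_INTERRUPT, "screensaver")

log: list[str] = []
vectors[TIMER_INTERRUPT](log)         # simulate one timer interrupt
print(log)
```

The handler installed last runs first, and every handler is responsible for calling the one it replaced. If a single TSR forgets to chain, everything installed before it stops receiving the event.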

DOS computers used fewer device drivers compared to modern operating systems. As an application developer you could directly access video and sound hardware, or you could send AT commands directly to a modem through a COM port. Mice, however, required a dedicated driver in the form of a TSR. Some devices such as sound cards had “drivers” which were simple configuration programs that ran only once during start-up and did not stay in memory afterwards.

File System

DOS works with the FAT file system, so there are no advanced features such as symlinks. File and directory names are not case sensitive. Moreover, DOS does not support long file and directory names natively (files with long names are still visible under mangled short names and can be worked with). Finally, DOS does not internally provide any caching for file system blocks, but luckily this can be achieved by running a specialized TSR.

Conclusion

The theoretical properties of DOS described above are aligned with the niche chosen in the previous post: small, well-tested interactive applications whose performance can benefit from the low OS overhead and direct access to hardware.