
Sunday, January 6, 2013

Structure of .COM file



A .COM file consists entirely of executable code and data. When the file Hello.COM is executed, for example (by typing either Hello or Hello.COM at the DOS prompt), the contents of the file are simply loaded into memory. When the file has been loaded, execution starts with the first byte. All of the segment registers are set to point to a single 64K segment starting 256 bytes before the address where the program was loaded, so in fact execution starts at CS:0100. The first 256 bytes of the segment comprise the Program Segment Prefix (PSP), which contains a variety of pieces of information about the executing program.
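The loading rule can be made concrete with a small simulation. This is only a sketch in Python, not DOS code: the names `load_com` and `physical_address` are invented for illustration, the 64K segment is modelled as a plain bytearray, and the size check ignores the room a real program also needs for its stack. The address calculation uses the standard real-mode rule that a physical address is the segment value times 16 plus the offset.

```python
SEGMENT_SIZE = 0x10000  # 64K: everything reachable with a 16-bit offset
PSP_SIZE = 0x100        # the 256-byte Program Segment Prefix


def load_com(image: bytes) -> bytearray:
    """Return a simulated 64K segment with the .COM image copied in after the PSP."""
    if len(image) > SEGMENT_SIZE - PSP_SIZE:
        raise ValueError("image does not fit in one segment after the PSP")
    segment = bytearray(SEGMENT_SIZE)  # the PSP area is left zeroed in this sketch
    segment[PSP_SIZE:PSP_SIZE + len(image)] = image
    return segment


def physical_address(segment_reg: int, offset: int) -> int:
    """Real-mode translation: segment * 16 + offset."""
    return (segment_reg << 4) + offset


image = bytes([0xC3])        # hypothetical one-byte image (0xC3 is the x86 near RET opcode)
segment = load_com(image)
print(hex(segment[0x100]))   # first byte of the file sits at offset 0100h
print(hex(physical_address(0x1234, 0x100)))  # entry point if CS happened to be 1234h
```

The segment value 1234h is arbitrary; DOS picks the actual segment at load time, which is why a .COM program only ever deals in offsets.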
The most useful field in the PSP is the tail of the command line; for example, if Hello.COM had been executed by typing Hello Ram, then everything typed after the program name would be stored in the PSP. The program can access this argument string starting at offset 80h: the first byte gives the length of the tail, and that many bytes starting at 81h contain the string itself. DOS keeps the space that separates the tail from the program name, so in the example the stored string is " Ram" and the length byte is 4 rather than 3. The string is terminated with a carriage return character (ASCII code 0Dh), which is not included in the count.
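Reading the command tail follows directly from that description. The sketch below is Python rather than assembly, and it works on a fake PSP built by a hypothetical helper (`make_psp`) that fills in only the command-tail field; `command_tail` is likewise an invented name. The demo tail is written with a leading space, since DOS normally keeps the space that separates the tail from the program name, and the caller strips it.

```python
PSP_SIZE = 0x100
TAIL_LENGTH_OFFSET = 0x80  # one byte: number of characters in the tail
TAIL_TEXT_OFFSET = 0x81    # the tail itself starts here
CARRIAGE_RETURN = 0x0D     # terminator, not included in the count


def make_psp(tail: bytes) -> bytearray:
    """Build a fake PSP that holds only the command-tail field (the rest is zero)."""
    if len(tail) > PSP_SIZE - TAIL_TEXT_OFFSET - 1:
        raise ValueError("tail too long for the PSP")
    psp = bytearray(PSP_SIZE)
    psp[TAIL_LENGTH_OFFSET] = len(tail)
    psp[TAIL_TEXT_OFFSET:TAIL_TEXT_OFFSET + len(tail)] = tail
    psp[TAIL_TEXT_OFFSET + len(tail)] = CARRIAGE_RETURN
    return psp


def command_tail(psp: bytes) -> str:
    """Read the length byte at 80h, then that many characters starting at 81h."""
    length = psp[TAIL_LENGTH_OFFSET]
    raw = psp[TAIL_TEXT_OFFSET:TAIL_TEXT_OFFSET + length]
    return raw.decode("ascii", errors="replace")


psp = make_psp(b" Ram")
print(repr(command_tail(psp)))          # the tail exactly as stored
print(repr(command_tail(psp).strip()))  # the argument with the separator removed
```

Because the length byte is authoritative, the reader never needs to scan for the carriage return; it is there for programs that prefer to treat the tail as a terminated string.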
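Since all of the segment registers point to the same segment, the structure of a typical .COM program in memory is as follows, from the lowest offset to the highest: the PSP at offsets 0000h to 00FFh, the program text and initialized data starting at 0100h, the uninitialized data, free space, and finally the stack at the top of the segment.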
The program text and initialized data are the bytes that are read in from the .COM file, corresponding to the code and data sections of the .asm source. The PSP is generated by the operating system, and the stack is automatically arranged to grow down from the top of the segment. The uninitialized data, corresponding to the bytes reserved in the source, is carved out of the free space between the loaded bytes and the growing stack. Since these bytes were not explicitly initialized before execution, they start out containing whatever garbage was left in those locations of physical memory by previous programs.
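To put numbers on this layout, the following Python sketch computes where each region would fall for a given file size and amount of reserved space. The function name `com_layout` and the example sizes are made up, and the sketch assumes the reserved bytes are placed immediately after the loaded image, which is typical but not something DOS itself enforces.

```python
PSP_SIZE = 0x100
SEGMENT_SIZE = 0x10000


def com_layout(image_size: int, reserved_size: int) -> list:
    """Return (name, start, end_exclusive) tuples for each region of the segment.

    Assumes the uninitialized (reserved) bytes directly follow the loaded image.
    """
    text_end = PSP_SIZE + image_size
    reserved_end = text_end + reserved_size
    if reserved_end > SEGMENT_SIZE:
        raise ValueError("program does not fit in a single 64K segment")
    return [
        ("PSP", 0, PSP_SIZE),
        ("program text and initialized data", PSP_SIZE, text_end),
        ("uninitialized data", text_end, reserved_end),
        ("free space and stack (stack grows down)", reserved_end, SEGMENT_SIZE),
    ]


# Hypothetical program: a 512-byte .COM file that reserves 128 more bytes.
for name, start, end in com_layout(0x200, 0x80):
    print(f"{start:04X}h-{end - 1:04X}h  {name}")
```

Nothing separates the last two regions at run time: if the stack grows far enough down, it simply overwrites the program's data.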

