|Published (Last):||23 November 2013|
|PDF File Size:||7.44 Mb|
|ePub File Size:||6.28 Mb|
|Price:||Free* [*Free Regsitration Required]|
The remainder of the header word will contain various information depending upon the code. Specifically, Code Meaning 0 Start of segment. A segment name of up to 10 characters follows in the next two words. Bytes of the header contain a relative address. The following word is an absolute value which should be loaded into memory at the relative address.
The following word should be loaded into that location and the address field of the loaded value should be modified by the addition of the base address of the current segment. The next two words contain the character symbolic name of an entry point. Its relative address is in bytes of the header word. The next two words contain the character symbolic name of an external. The instruction which must already have been loaded which is at the relative address given in bytes of the header uses this external.
Bytes contain the relative address in the current segment where execution should begin. Bytes contain the length of the segment. Control can now be transferred to the starting address. As before, with our absolute loader input, we should consider the efficiency of this representation. Each instruction to be loaded here takes two words, at the least, plus additional words for segment definitions, entry points, and externals. This needs to be reduced if possible.
There will probably be few entry points, externals, starting addresses or segments, so we can ignore those codes for now and concentrate on the representation of codes 1 and 2; these will be the bulk of our loader input. With the input to the absolute loader, it was reasonable to assume that there would be a large number of contiguous locations to be loaded, and that one header word could serve a block of contiguous words as well as one word.
For relocatable input, that is not strictly true, however. While it is reasonable to assume that blocks of contiguous words will need to be loaded, it is not reasonable to assume that blocks of contiguous absolute words or blocks of contiguous relocatable words will be loaded.
Rather, it is more likely that relocatable and absolute words will be intermixed. With the exception of adding the base address to the address field of a relocatable instruction, however, relocatable and absolute instructions can be treated the same.
Thus, rather than an entire header word, only one extra bit per instruction is needed to distinguish between relocatable and absolute instructions. If that bit were supplied separately, then the loader could accept blocks of input instructions, loading them as if they were absolute, and then later returning to convert the address fields of some instructions from relative to absolute by the addition of the base address.
This is the approach we take here. The number of words is given in bytes of the header word. Thus, we redefine the code 1 input as Code Meaning 1 Absolute. Bytes of the header contain a relative address; bytes contain a number of words N. Load the N following words into memory at the relative address given in Some of these words may be later modified as external or relocatable references.
Code 2 input must be redefined in light of this approach. What is needed is simply the relative addresses of all instructions which need relocation.
One approach would be to simply list the addresses of the relocatable instructions, one after another, with bytes specifying the number of such addresses. The addresses could also be packed two to a word, to reduce the amount of space needed. A slightly different approach is more commonly used on binary computers. From examinations of typical programs, you notice that most instructions use an address and most of these are relocatable. Thus, many instructions are relocatable and contiguous to other relocatable instructions.
Hence a list of addresses of relocatable instructions would be a list of addresses which could typically differ only by one or two locations. Thus rather than an address for each, we need only the address of the first which can be stored in bytes of the header and an indicator, for each of the following words, whether it is or is not relocatable. This requires only one bit per word, so that the need for relocation of 30 words can be packed into one 5-byte word on a binary MIX machine.
Consider the binary word b29 b28 b27 … b1b in bytes Whichever representation is used for the code 2 information on relocation addresses, it should be clear how to write the code for relocating addresses in a program to be loaded. Both entry points and externals are defined by their symbolic names.
The problem is to match the definitions of these symbols, as entry points, to their uses, as externals. This matching is performed by the use of a data structure called a symbol table, or in this case, an entry point table. This table is a list of the known entry points, both their symbolic name and their address in memory.
This table is defined by the collection of all code 3 entry point definitions in the input. As each entry point definition is found, the name of the new entry point and its absolute address are entered into the entry point table.
This solution to the linking problem works very well, except for one problem: forward references. If an external reference to an entry point is encountered before the definition of that entry point is encountered, then the search of the entry point table will not find an address for the external. Since an entry point can be declared after it is used as an external, it cannot be known for certain if an external reference which finds no corresponding entry point is a reference to an undefined entry point an error or simply a reference to an entry point which is yet to be defined a forward reference.
This can be known only when the end of loader input code 7 is encountered. At that point, any undefined externals are errors. As always, there are many techniques for solving the problem of forward references. The simplest is to use a two-pass loader. After inputting all of the loader input once, all entry points are defined. On the first pass through the loader input, we can ignore all loader input except the entry point definitions and the length of the segments so we can convert the relative addresses of the entry point definitions to absolute addresses.
When the end of loader input is encountered, the loader input is read a second time pass two. This time the segments are loaded into memory, and all external references are resolved.
Forward references will not be a problem, since all entry points were defined and entered into the symbol table during pass one. Any forward reference during pass two is a reference to an undefined external, and is an error. If the loader input is on a mass storage device magnetic tape, disk, or drum , then the loader can reread the input by rewinding or repositioning the input to the beginning for the second pass. If input is on cards or paper tape, it is often copied to a mass storage device during pass one, and then read back from this device during pass 2.
If no mass storage is available, the loader input must simply be read into the computer twice. A two-pass solution to forward references is conceptually simple, and hence easy to write. This is quite unfortunate, since the loader is so frequently used.
One variant is to notice that all of the loading functions storing in memory, relocation, entry point table definition can be accomplished in one pass, except external reference resolution. Even this can be done except for forward references.
Thus, instead of inputting the entire loader input twice, on pass one we do all possible loading. If forward external references are encountered, then the absolute address of the reference and the name of the referenced external are written to mass storage. Pass two consists of nothing more than rewinding this device and rereading it. A more radical approach is to eliminate the second pass entirely, rather than simply trying to reduce it in size.
In this case, rather than storing the external references on a mass storage device, they are stored in memory. Two tables are constructed during pass one, an entry point table, and an external reference table often call a use table. The external reference table stores, for each forward external reference, the name of the external and the absolute address where it is used.
During pass one, if a forward external reference is found, then it is entered into the external reference table. When an entry point is defined, the external reference table is searched, and any references to this newly defined entry point are resolved i. The entry in the external reference table is then deleted. When pass one is completed, the external reference table should be empty. Any remaining entries are references to externals which were never defined as entry points.
The entry point table records where entry points are defined to be; the external reference table records where they are used. The problem is now reduced to the construction and maintenance of the two tables, the entry point table and the external reference table.
The entry point table is an easier table to construct and maintain since deletions are never made from it. Thus only search and enter subroutines are needed. The external reference table is much more difficult. The table must be able to add entries, search for entries, and delete entries. This requires care that deleted entries are not used after they are deleted. Also, the memory space they occupied should be reused if possible to reduce the total amount of memory needed.
This requires very careful design of an appropriate data structure. Some loaders have merged the entry point table and the external reference table into one table.
Each entry in this table consists of several fields. One field is a name, another is a bit. If the bit is 0, then the name is the name of an external reference which has not yet been defined as an entry point a forward external reference. If the bit is 1, the name is the name of an entry point.
For an entry point, we need an absolute address for that entry point. For an external reference, we need a list of all the addresses which reference that external. The problem is that the length of this list is variable and unbounded. How much space should be allocated for the list? Since the list is of variable length, the standard approach is to use a linked list.
LOADERS AND LINKERS
It generates the executable module of a source program. It loads the executable module to the main memory. Input It takes as input, the object code generated by an assembler. It takes executable module generated by a linker. Function It combines all the object modules of a source code to generate an executable module. It allocates the addresses to an executable module in main memory for execution.
Difference Between Linker and Loader
Typically, an object file can contain three kinds of symbols: defined "external" symbols, sometimes called "public" or "entry" symbols, which allow it to be called by other modules, undefined "external" symbols, which reference other modules where these symbols are defined, and local symbols, used internally within the object file to facilitate relocation. For most compilers, each object file is the result of compiling one input source code file. When a program comprises multiple object files, the linker combines these files into a unified executable program, resolving the symbols as it goes along. Linkers can take objects from a collection called a library or runtime library.