Annotation of src/external/gpl3/binutils.old/dist/ld/ldint.texinfo, Revision 1.1.1.1.2.1

1.1       christos    1: \input texinfo
                      2: @setfilename ldint.info
1.1.1.1.2.1! pgoyette    3: @c Copyright (C) 1992-2015 Free Software Foundation, Inc.
1.1       christos    4:
                      5: @ifnottex
                      6: @dircategory Software development
                      7: @direntry
                      8: * Ld-Internals: (ldint).       The GNU linker internals.
                      9: @end direntry
                     10: @end ifnottex
                     11:
                     12: @copying
                     13: This file documents the internals of the GNU linker ld.
                     14:
1.1.1.1.2.1! pgoyette   15: Copyright @copyright{} 1992-2015 Free Software Foundation, Inc.
1.1       christos   16: Contributed by Cygnus Support.
                     17:
                     18: Permission is granted to copy, distribute and/or modify this document
                     19: under the terms of the GNU Free Documentation License, Version 1.3 or
                     20: any later version published by the Free Software Foundation; with the
                     21: Invariant Sections being ``GNU General Public License'' and ``Funding
                     22: Free Software'', the Front-Cover texts being (a) (see below), and with
                     23: the Back-Cover Texts being (b) (see below).  A copy of the license is
                     24: included in the section entitled ``GNU Free Documentation License''.
                     25:
                     26: (a) The FSF's Front-Cover Text is:
                     27:
                     28:      A GNU Manual
                     29:
                     30: (b) The FSF's Back-Cover Text is:
                     31:
                     32:      You have freedom to copy and modify this GNU Manual, like GNU
                     33:      software.  Copies published by the Free Software Foundation raise
                     34:      funds for GNU development.
                     35: @end copying
                     36:
                     37: @iftex
                     38: @finalout
                     39: @setchapternewpage off
                     40: @settitle GNU Linker Internals
                     41: @titlepage
                     42: @title{A guide to the internals of the GNU linker}
                     43: @author Per Bothner, Steve Chamberlain, Ian Lance Taylor, DJ Delorie
                     44: @author Cygnus Support
                     45: @page
                     46:
                     47: @tex
                     48: \def\$#1${{#1}}  % Kluge: collect RCS revision info without $...$
                     49: \xdef\manvers{2.10.91}  % For use in headers, footers too
                     50: {\parskip=0pt
                     51: \hfill Cygnus Support\par
                     52: \hfill \manvers\par
                     53: \hfill \TeX{}info \texinfoversion\par
                     54: }
                     55: @end tex
                     56:
                     57: @vskip 0pt plus 1filll
1.1.1.1.2.1! pgoyette   58: Copyright @copyright{} 1992-2015 Free Software Foundation, Inc.
1.1       christos   59:
                     60:       Permission is granted to copy, distribute and/or modify this document
                     61:       under the terms of the GNU Free Documentation License, Version 1.3
                     62:       or any later version published by the Free Software Foundation;
                     63:       with no Invariant Sections, with no Front-Cover Texts, and with no
                     64:       Back-Cover Texts.  A copy of the license is included in the
                     65:       section entitled "GNU Free Documentation License".
                     66:
                     67: @end titlepage
                     68: @end iftex
                     69:
                     70: @node Top
                     71: @top
                     72:
                     73: This file documents the internals of the GNU linker @code{ld}.  It is a
                     74: collection of miscellaneous information with little form at this point.
                     75: Mostly, it is a repository into which you can put information about
                     76: GNU @code{ld} as you discover it (or as you design changes to @code{ld}).
                     77:
                     78: This document is distributed under the terms of the GNU Free
                     79: Documentation License.  A copy of the license is included in the
                     80: section entitled "GNU Free Documentation License".
                     81:
                     82: @menu
                     83: * README::                     The README File
                     84: * Emulations::                 How linker emulations are generated
                     85: * Emulation Walkthrough::      A Walkthrough of a Typical Emulation
                     86: * Architecture Specific::      Some Architecture Specific Notes
                     87: * GNU Free Documentation License::  GNU Free Documentation License
                     88: @end menu
                     89:
                     90: @node README
                     91: @chapter The @file{README} File
                     92:
                     93: Check the @file{README} file; it often has useful information that does not
                     94: appear anywhere else in the directory.
                     95:
                     96: @node Emulations
                     97: @chapter How linker emulations are generated
                     98:
                     99: Each linker target has an @dfn{emulation}.  The emulation includes the
                    100: default linker script, and certain emulations also modify certain types
                    101: of linker behaviour.
                    102:
                    103: Emulations are created during the build process by the shell script
                    104: @file{genscripts.sh}.
                    105:
                    106: The @file{genscripts.sh} script starts by reading a file in the
                    107: @file{emulparams} directory.  This is a shell script which sets various
                    108: shell variables used by @file{genscripts.sh} and the other shell scripts
                    109: it invokes.
                    110:
                    111: The @file{genscripts.sh} script will invoke a shell script in the
                    112: @file{scripttempl} directory in order to create default linker scripts
                    113: written in the linker command language.  The @file{scripttempl} script
                    114: will be invoked between 5 and 9 times, with different
                    115: assignments to shell variables, to create different default scripts.
                    116: The choice of script is made based on the command line options.
                    117:
                    118: After creating the scripts, @file{genscripts.sh} will invoke yet another
                    119: shell script, this time in the @file{emultempl} directory.  That shell
                    120: script will create the emulation source file, which contains C code.
                    121: This C code permits the linker emulation to override various linker
                    122: behaviours.  Most targets use the generic emulation code, which is in
                    123: @file{emultempl/generic.em}.
                    124:
                    125: To summarize, @file{genscripts.sh} reads three shell scripts: an
                    126: emulation parameters script in the @file{emulparams} directory, a linker
                    127: script generation script in the @file{scripttempl} directory, and an
                    128: emulation source file generation script in the @file{emultempl}
                    129: directory.
                    130:
                    131: For example, the Sun 4 linker sets up variables in
                    132: @file{emulparams/sun4.sh}, creates linker scripts using
                    133: @file{scripttempl/aout.sc}, and creates the emulation code using
                    134: @file{emultempl/sunos.em}.
                    135:
                    136: Note that the linker can support several emulations simultaneously,
                    137: depending upon how it is configured.  An emulation can be selected with
                    138: the @code{-m} option.  The @code{-V} option will list all supported
                    139: emulations.
                    140:
                    141: @menu
                    142: * emulation parameters::        @file{emulparams} scripts
                    143: * linker scripts::              @file{scripttempl} scripts
                    144: * linker emulations::           @file{emultempl} scripts
                    145: @end menu
                    146:
                    147: @node emulation parameters
                    148: @section @file{emulparams} scripts
                    149:
                    150: Each target selects a particular file in the @file{emulparams} directory
                    151: by setting the shell variable @code{targ_emul} in @file{configure.tgt}.
                    152: This shell variable is used by the @file{configure} script to control
                    153: building an emulation source file.
                    154:
                    155: Certain conventions are enforced.  Suppose the @code{targ_emul} variable
                    156: is set to @var{emul} in @file{configure.tgt}.  The name of the emulation
                    157: shell script will be @file{emulparams/@var{emul}.sh}.  The
                    158: @file{Makefile} must have a target named @file{e@var{emul}.c}; this
                    159: target must depend upon @file{emulparams/@var{emul}.sh}, as well as the
                    160: appropriate scripts in the @file{scripttempl} and @file{emultempl}
                    161: directories.  The @file{Makefile} target must invoke @code{GENSCRIPTS}
                    162: with two arguments: @var{emul}, and the value of the make variable
                    163: @code{tdir_@var{emul}}.  The value of the latter variable will be set by
                    164: the @file{configure} script, and is used to set the default target
                    165: directory to search.
                    166:
                    167: By convention, the @file{emulparams/@var{emul}.sh} shell script should
                    168: only set shell variables.  It may set shell variables which are to be
                    169: interpreted by the @file{scripttempl} and the @file{emultempl} scripts.
                    170: Certain shell variables are interpreted directly by the
                    171: @file{genscripts.sh} script.
                    172:
                    173: Here is a list of shell variables interpreted by @file{genscripts.sh},
                    174: as well as some conventional shell variables interpreted by the
                    175: @file{scripttempl} and @file{emultempl} scripts.
                    176:
                    177: @table @code
                    178: @item SCRIPT_NAME
                    179: This is the name of the @file{scripttempl} script to use.  If
                    180: @code{SCRIPT_NAME} is set to @var{script}, @file{genscripts.sh} will use
                    181: the script @file{scripttempl/@var{script}.sc}.
                    182:
                    183: @item TEMPLATE_NAME
                    184: This is the name of the @file{emultempl} script to use.  If
                    185: @code{TEMPLATE_NAME} is set to @var{template}, @file{genscripts.sh} will
                    186: use the script @file{emultempl/@var{template}.em}.  If this variable is
                    187: not set, the default value is @samp{generic}.
                    188:
                    189: @item GENERATE_SHLIB_SCRIPT
                    190: If this is set to a nonempty string, @file{genscripts.sh} will invoke
                    191: the @file{scripttempl} script an extra time to create a shared library
                    192: script.  @xref{linker scripts}.
                    193:
                    194: @item OUTPUT_FORMAT
                    195: This is normally set to indicate the BFD output format to use (e.g.,
                    196: @samp{"a.out-sunos-big"}).  The @file{scripttempl} script will normally
                    197: use it in an @code{OUTPUT_FORMAT} expression in the linker script.
                    198:
                    199: @item ARCH
                    200: This is normally set to indicate the architecture to use (e.g.,
                    201: @samp{sparc}).  The @file{scripttempl} script will normally use it in an
                    202: @code{OUTPUT_ARCH} expression in the linker script.
                    203:
                    204: @item ENTRY
                    205: Some @file{scripttempl} scripts use this to set the entry address, in an
                    206: @code{ENTRY} expression in the linker script.
                    207:
                    208: @item TEXT_START_ADDR
                    209: Some @file{scripttempl} scripts use this to set the start address of the
                    210: @samp{.text} section.
                    211:
                    212: @item SEGMENT_SIZE
                    213: The @file{genscripts.sh} script uses this to set the default value of
                    214: @code{DATA_ALIGNMENT} when running the @file{scripttempl} script.
                    215:
                    216: @item TARGET_PAGE_SIZE
                    217: If @code{SEGMENT_SIZE} is not defined, the @file{genscripts.sh} script
                    218: uses this to define it.
                    219:
                    220: @item ALIGNMENT
                    221: Some @file{scripttempl} scripts set this to a number to pass to
                    222: @code{ALIGN} to set the required alignment for the @code{end} symbol.
                    223: @end table
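
As a rough illustration of the conventions above, a hypothetical
@file{emulparams/myemul.sh} might contain nothing but assignments.  The
emulation name and all values below are invented for illustration and do
not describe a real target.  The last lines are only a sketch of the
defaults described in the table (@code{TEMPLATE_NAME} falling back to
@samp{generic}, @code{SEGMENT_SIZE} falling back to
@code{TARGET_PAGE_SIZE}); they are not copied from @file{genscripts.sh}.

```shell
# Hypothetical emulparams/myemul.sh: assignments only, values invented.
SCRIPT_NAME=aout
OUTPUT_FORMAT="a.out-sunos-big"
ARCH=sparc
TEXT_START_ADDR=0x2020
TARGET_PAGE_SIZE=0x2000

# Sketch of the defaults described in the table (an assumption about how
# genscripts.sh could apply them, not its actual code).
TEMPLATE_NAME=${TEMPLATE_NAME-generic}
SEGMENT_SIZE=${SEGMENT_SIZE-${TARGET_PAGE_SIZE}}

echo "scripttempl/${SCRIPT_NAME}.sc emultempl/${TEMPLATE_NAME}.em segment=${SEGMENT_SIZE}"
```

Because the file only sets variables, it can be sourced by any script
that needs the parameters without side effects.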
                    224:
                    225: @node linker scripts
                    226: @section @file{scripttempl} scripts
                    227:
                    228: Each linker target uses a @file{scripttempl} script to generate the
                    229: default linker scripts.  The name of the @file{scripttempl} script is
                    230: set by the @code{SCRIPT_NAME} variable in the @file{emulparams} script.
                    231: If @code{SCRIPT_NAME} is set to @var{script}, @code{genscripts.sh} will
                    232: invoke @file{scripttempl/@var{script}.sc}.
                    233:
                    234: The @file{genscripts.sh} script will invoke the @file{scripttempl}
                    235: script 5 to 9 times.  Each time it will set the shell variable
                    236: @code{LD_FLAG} to a different value.  When the linker is run, the
                    237: options used will direct it to select a particular script.  (Script
                    238: selection is controlled by the @code{get_script} emulation entry point;
                    239: this describes the conventional behaviour).
                    240:
                    241: The @file{scripttempl} script should just write a linker script, written
                    242: in the linker command language, to standard output.  If the emulation
                    243: name--the name of the @file{emulparams} file without the @file{.sh}
                    244: extension--is @var{emul}, then the output will be directed to
                    245: @file{ldscripts/@var{emul}.@var{extension}} in the build directory,
                    246: where @var{extension} changes each time the @file{scripttempl} script is
                    247: invoked.
                    248:
                    249: Here is the list of values assigned to @code{LD_FLAG}.
                    250:
                    251: @table @code
                    252: @item (empty)
                    253: The script generated is used by default (when none of the following
                    254: cases apply).  The output has an extension of @file{.x}.
                    255: @item n
                    256: The script generated is used when the linker is invoked with the
                    257: @code{-n} option.  The output has an extension of @file{.xn}.
                    258: @item N
                    259: The script generated is used when the linker is invoked with the
                    260: @code{-N} option.  The output has an extension of @file{.xbn}.
                    261: @item r
                    262: The script generated is used when the linker is invoked with the
                    263: @code{-r} option.  The output has an extension of @file{.xr}.
                    264: @item u
                    265: The script generated is used when the linker is invoked with the
                    266: @code{-Ur} option.  The output has an extension of @file{.xu}.
                    267: @item shared
                    268: The @file{scripttempl} script is only invoked with @code{LD_FLAG} set to
                    269: this value if @code{GENERATE_SHLIB_SCRIPT} is defined in the
                    270: @file{emulparams} file.  The @file{emultempl} script must arrange to use
                    271: this script at the appropriate time, normally when the linker is invoked
                    272: with the @code{-shared} option.  The output has an extension of
                    273: @file{.xs}.
                    274: @item c
                    275: The @file{scripttempl} script is only invoked with @code{LD_FLAG} set to
                    276: this value if @code{GENERATE_COMBRELOC_SCRIPT} is defined in the
                    277: @file{emulparams} file or if @code{SCRIPT_NAME} is @code{elf}. The
                    278: @file{emultempl} script must arrange to use this script at the appropriate
                    279: time, normally when the linker is invoked with the @code{-z combreloc}
                    280: option.  The output has an extension of
                    281: @file{.xc}.
                    282: @item cshared
                    283: The @file{scripttempl} script is only invoked with @code{LD_FLAG} set to
                    284: this value if @code{GENERATE_COMBRELOC_SCRIPT} is defined in the
                    285: @file{emulparams} file or if @code{SCRIPT_NAME} is @code{elf} and
                    286: @code{GENERATE_SHLIB_SCRIPT} is defined in the @file{emulparams} file.
                    287: The @file{emultempl} script must arrange to use this script at the
                    288: appropriate time, normally when the linker is invoked with the @code{-shared
                    289: -z combreloc} option.  The output has an extension of @file{.xsc}.
                    290: @item auto_import
                    291: The @file{scripttempl} script is only invoked with @code{LD_FLAG} set to
                    292: this value if @code{GENERATE_AUTO_IMPORT_SCRIPT} is defined in the
                    293: @file{emulparams} file.  The @file{emultempl} script must arrange to
                    294: use this script at the appropriate time, normally when the linker is
                    295: invoked with the @code{--enable-auto-import} option.  The output has
                    296: an extension of @file{.xa}.
                    297: @end table
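
The mapping in the table above can be summarized in a few lines of
shell.  This is only a sketch: @code{ld_flag_extension} is a
hypothetical helper and @samp{myemul} a made-up emulation name; neither
appears in @file{genscripts.sh}.

```shell
# Map an LD_FLAG value to the output extension listed in the table.
# ld_flag_extension is a hypothetical helper, not part of genscripts.sh.
ld_flag_extension () {
  case "$1" in
    "")          echo x ;;
    n)           echo xn ;;
    N)           echo xbn ;;
    r)           echo xr ;;
    u)           echo xu ;;
    shared)      echo xs ;;
    c)           echo xc ;;
    cshared)     echo xsc ;;
    auto_import) echo xa ;;
    *)           echo "unknown LD_FLAG: $1" >&2; return 1 ;;
  esac
}

EMULATION_NAME=myemul   # hypothetical emulation name
# The five values that are always generated.
for flag in "" n N r u; do
  echo "LD_FLAG='${flag}' -> ldscripts/${EMULATION_NAME}.$(ld_flag_extension "$flag")"
done
```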
                    298:
                    299: Besides the shell variables set by the @file{emulparams} script, and the
                    300: @code{LD_FLAG} variable, the @file{genscripts.sh} script will set
                    301: certain variables for each run of the @file{scripttempl} script.
                    302:
                    303: @table @code
                    304: @item RELOCATING
                    305: This will be set to a non-empty string when the linker is doing a final
                    306: relocation (i.e., all scripts other than @code{-r} and @code{-Ur}).
                    307:
                    308: @item CONSTRUCTING
                    309: This will be set to a non-empty string when the linker is building
                    310: global constructor and destructor tables (i.e., all scripts other than
                    311: @code{-r}).
                    312:
                    313: @item DATA_ALIGNMENT
                    314: This will be set to an @code{ALIGN} expression when the output should be
                    315: page aligned, or to @samp{.} when generating the @code{-N} script.
                    316:
                    317: @item CREATE_SHLIB
                    318: This will be set to a non-empty string when generating a @code{-shared}
                    319: script.
                    320:
                    321: @item COMBRELOC
                    322: This will be set to a non-empty string when generating @code{-z combreloc}
                    323: scripts to a temporary file name which can be used during script generation.
                    324: @end table
                    325:
                    326: The conventional way to write a @file{scripttempl} script is to first
                    327: set a few shell variables, and then write out a linker script using
                    328: @code{cat} with a here document.  The linker script will use variable
                    329: substitutions, based on the above variables and those set in the
                    330: @file{emulparams} script, to control its behaviour.
                    331:
                    332: When there are parts of the @file{scripttempl} script which should only
                    333: be run when doing a final relocation, they should be enclosed within a
                    334: variable substitution based on @code{RELOCATING}.  For example, on many
                    335: targets special symbols such as @code{_end} should be defined when doing
                    336: a final link.  Naturally, those symbols should not be defined when doing
                    337: a relocatable link using @code{-r}.  The @file{scripttempl} script
                    338: could use a construct like this to define those symbols:
                    339: @smallexample
                    340:   $@{RELOCATING+ _end = .;@}
                    341: @end smallexample
                    342: This will do the symbol assignment only if the @code{RELOCATING}
                    343: variable is defined.
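
The effect of the substitution can be checked directly in a shell.  The
sketch below uses a hypothetical function, @code{emit_fragment}, as a
stand-in for part of a @file{scripttempl} script; the section contents
are illustrative only.  It expands the same here document once with
@code{RELOCATING} set and once with it unset.

```shell
# emit_fragment is a hypothetical stand-in for part of a scripttempl script.
emit_fragment () {
  cat <<EOF
  .bss :
  {
    *(.bss)
    *(COMMON)
    ${RELOCATING+ _end = .;}
  }
EOF
}

# Final link: RELOCATING is set, so the assignment is emitted.
RELOCATING=" "
final=$(emit_fragment)

# Relocatable link (-r): RELOCATING is unset, so the substitution is empty.
unset RELOCATING
relocatable=$(emit_fragment)

printf 'final link:\n%s\n\nrelocatable link:\n%s\n' "$final" "$relocatable"
```

Note that the @samp{+} form tests only whether the variable is set, so
the value assigned to @code{RELOCATING} is irrelevant.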
                    344:
                    345: The basic job of the linker script is to put the sections in the correct
                    346: order, and at the correct memory addresses.  For some targets, the
                    347: linker script may have to do some other operations.
                    348:
                    349: For example, on most MIPS platforms, the linker is responsible for
                    350: defining the special symbol @code{_gp}, used to initialize the
                    351: @code{$gp} register.  It must be set to the start of the small data
                    352: section plus @code{0x8000}.  Naturally, it should only be defined when
                    353: doing a final relocation.  This will typically be done like this:
                    354: @smallexample
                    355:   $@{RELOCATING+ _gp = ALIGN(16) + 0x8000;@}
                    356: @end smallexample
                    357: This line would appear just before the sections which compose the small
                    358: data section (@samp{.sdata}, @samp{.sbss}).  All those sections would be
                    359: contiguous in memory.
                    360:
                    361: Many COFF systems build constructor tables in the linker script.  The
                    362: compiler will arrange to output the address of each global constructor
                    363: in a @samp{.ctors} section, and the address of each global destructor in
                    364: a @samp{.dtors} section (this is done by defining
                    365: @code{ASM_OUTPUT_CONSTRUCTOR} and @code{ASM_OUTPUT_DESTRUCTOR} in the
                    366: @code{gcc} configuration files).  The @code{gcc} runtime support
                    367: routines expect the constructor table to be named @code{__CTOR_LIST__}.
                    368: They expect it to be a list of words, with the first word being the
                    369: count of the number of entries.  There should be a trailing zero word.
                    370: (Actually, the count may be -1 if the trailing word is present, and the
                    371: trailing word may be omitted if the count is correct, but, as the
                    372: @code{gcc} behaviour has changed slightly over the years, it is safest
                    373: to provide both).  Here is a typical way that might be handled in a
                    374: @file{scripttempl} file.
                    375: @smallexample
                    376:     $@{CONSTRUCTING+ __CTOR_LIST__ = .;@}
                    377:     $@{CONSTRUCTING+ LONG((__CTOR_END__ - __CTOR_LIST__) / 4 - 2)@}
                    378:     $@{CONSTRUCTING+ *(.ctors)@}
                    379:     $@{CONSTRUCTING+ LONG(0)@}
                    380:     $@{CONSTRUCTING+ __CTOR_END__ = .;@}
                    381:     $@{CONSTRUCTING+ __DTOR_LIST__ = .;@}
                    382:     $@{CONSTRUCTING+ LONG((__DTOR_END__ - __DTOR_LIST__) / 4 - 2)@}
                    383:     $@{CONSTRUCTING+ *(.dtors)@}
                    384:     $@{CONSTRUCTING+ LONG(0)@}
                    385:     $@{CONSTRUCTING+ __DTOR_END__ = .;@}
                    386: @end smallexample
                    387: The use of @code{CONSTRUCTING} ensures that these linker script commands
                    388: will only appear when the linker is supposed to be building the
                    389: constructor and destructor tables.  This example is written for a target
                    390: which uses 4 byte pointers.
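
The count expression can be verified with a little arithmetic.  The
table holds one count word, one word per constructor, and one trailing
zero word, so its size in words minus 2 is the number of constructors.
The sketch below uses a hypothetical helper, @code{ctor_count}, and
made-up addresses, and assumes 4 byte pointers as in the example.

```shell
# ctor_count LIST END: mirrors (__CTOR_END__ - __CTOR_LIST__) / 4 - 2.
# Hypothetical helper; the addresses passed in are made up.
ctor_count () {
  echo $(( ($2 - $1) / 4 - 2 ))
}

# Three constructors: 1 count word + 3 entries + 1 zero word
# = 5 words = 20 (0x14) bytes.
ctor_count 0x1000 0x1014   # prints 3
```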
                    391:
                    392: Embedded systems often need to set a stack address.  This is normally
                    393: best done by using the @code{PROVIDE} construct with a default stack
                    394: address.  This permits the user to easily override the stack address
                    395: using the @code{--defsym} option.  Here is an example:
                    396: @smallexample
                    397:   $@{RELOCATING+ PROVIDE (__stack = 0x80000000);@}
                    398: @end smallexample
                    399: The value of the symbol @code{__stack} would then be used in the startup
                    400: code to initialize the stack pointer.
                    401:
                    402: @node linker emulations
                    403: @section @file{emultempl} scripts
                    404:
                    405: Each linker target uses an @file{emultempl} script to generate the
                    406: emulation code.  The name of the @file{emultempl} script is set by the
                    407: @code{TEMPLATE_NAME} variable in the @file{emulparams} script.  If the
                    408: @code{TEMPLATE_NAME} variable is not set, the default is
                    409: @samp{generic}.  If the value of @code{TEMPLATE_NAME} is @var{template},
                    410: @file{genscripts.sh} will use @file{emultempl/@var{template}.em}.
                    411:
                    412: Most targets use the generic @file{emultempl} script,
                    413: @file{emultempl/generic.em}.  A different @file{emultempl} script is
                    414: only needed if the linker must support unusual actions, such as linking
                    415: against shared libraries.
                    416:
                    417: The @file{emultempl} script is normally written as a simple invocation
                    418: of @code{cat} with a here document.  The document will use a few
                    419: variable substitutions.  Typically each function name uses a
                    420: substitution involving @code{EMULATION_NAME}, for ease of debugging when
                    421: the linker supports multiple emulations.
                    422:
                    423: Every function and variable in the emitted file should be static.  The
                    424: only globally visible object must be named
                    425: @code{ld_@var{EMULATION_NAME}_emulation}, where @var{EMULATION_NAME} is
                    426: the name of the emulation set in @file{configure.tgt} (this is also the
                    427: name of the @file{emulparams} file without the @file{.sh} extension).
                    428: The @file{genscripts.sh} script will set the shell variable
                    429: @code{EMULATION_NAME} before invoking the @file{emultempl} script.
                    430:
                    431: The @code{ld_@var{EMULATION_NAME}_emulation} variable must be a
                    432: @code{struct ld_emulation_xfer_struct}, as defined in @file{ldemul.h}.
                    433: It defines a set of function pointers which are invoked by the linker,
                    434: as well as strings for the emulation name (normally set from the shell
                    435: variable @code{EMULATION_NAME}) and the default BFD target name (normally
                    436: set from the shell variable @code{OUTPUT_FORMAT} which is normally set
                    437: by the @file{emulparams} file).
                    438:
                    439: The @file{genscripts.sh} script will set the shell variable
                    440: @code{COMPILE_IN} when it invokes the @file{emultempl} script for the
                    441: default emulation.  In this case, the @file{emultempl} script should
                    442: include the linker scripts directly, and return them from the
                    443: @code{get_script} entry point.  When the emulation is not the default,
                    444: the @code{get_script} entry point should just return a file name.  See
                    445: @file{emultempl/generic.em} for an example of how this is done.
                    446:
                    447: At some point, the linker emulation entry points should be documented.
                    448:
                    449: @node Emulation Walkthrough
                    450: @chapter A Walkthrough of a Typical Emulation
                    451:
                    452: This chapter is to help people who are new to the way emulations
                    453: interact with the linker, or who are suddenly thrust into the position
                    454: of having to work with existing emulations.  It will discuss the files
                    455: you need to be aware of.  It will tell you when the given "hooks" in
                    456: the emulation will be called.  It will, hopefully, give you enough
                    457: information about when and how things happen that you'll be able to
                    458: get by.  As always, the source is the definitive reference to this.
                    459:
                    460: The starting point for the linker is in @file{ldmain.c} where
                    461: @code{main} is defined.  The bulk of the code that's emulation
                    462: specific will initially be in @code{emultempl/@var{emulation}.em} but
                    463: will end up in @code{e@var{emulation}.c} when the build is done.
                    464: Most of the work to select and interface with emulations is in
                    465: @code{ldemul.h} and @code{ldemul.c}.  Specifically, @code{ldemul.h}
                    466: defines the @code{ld_emulation_xfer_struct} structure your emulation
                    467: exports.
                    468:
                    469: Your emulation file exports a symbol
                    470: @code{ld_@var{EMULATION_NAME}_emulation}.  If your emulation is
                    471: selected (it usually is, since usually there's only one),
                    472: @code{ldemul.c} sets the variable @var{ld_emulation} to point to it.
                    473: @code{ldemul.c} also defines a number of API functions that interface
                    474: to your emulation, like @code{ldemul_after_parse} which simply calls
                    475: your @code{ld_@var{EMULATION}_emulation.after_parse} function.  For
                    476: the rest of this section, the functions will be mentioned, but you
                    477: should assume the indirect reference to your emulation also.
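
The indirection described above can be sketched in a few lines of C.  This is only an illustration under stated assumptions: @code{demo_emulation_xfer}, @code{ld_demo_emulation}, @code{demo_ld_emulation} and the @code{demo_} functions are hypothetical stand-ins, and the real @code{ld_emulation_xfer_struct} in @code{ldemul.h} has many more entry points.

```c
#include <string.h>

/* Hypothetical, heavily trimmed stand-in for ld_emulation_xfer_struct;
   the real structure in ldemul.h has many more entry points.  */
struct demo_emulation_xfer
{
  void (*before_parse) (void);
  void (*after_parse) (void);
  const char *emulation_name;
};

/* Records which hooks fired, so the dispatch can be observed.  */
static char demo_trace[64];

static void demo_before_parse (void) { strcat (demo_trace, "before_parse;"); }
static void demo_after_parse (void) { strcat (demo_trace, "after_parse;"); }

/* Plays the role of the exported ld_EMULATION_NAME_emulation symbol.  */
static struct demo_emulation_xfer ld_demo_emulation =
{
  demo_before_parse,
  demo_after_parse,
  "demo"
};

/* Plays the role of the ld_emulation pointer that ldemul.c sets.  */
static struct demo_emulation_xfer *demo_ld_emulation = &ld_demo_emulation;

/* Thin wrappers in the style of ldemul_before_parse and
   ldemul_after_parse: each simply calls through the pointer.  */
void demo_ldemul_before_parse (void) { demo_ld_emulation->before_parse (); }
void demo_ldemul_after_parse (void) { demo_ld_emulation->after_parse (); }

/* Runs both wrappers and returns the record of which hooks fired.  */
const char *demo_run_hooks (void)
{
  demo_trace[0] = '\0';
  demo_ldemul_before_parse ();
  demo_ldemul_after_parse ();
  return demo_trace;
}
```

Calling @code{demo_run_hooks} exercises both wrappers; the caller never names the emulation's own functions directly, which is the point of the indirection.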
                    478:
                    479: We will also skip or gloss over parts of the link process that don't
                    480: relate to emulations, like setting up internationalization.
                    481:
                    482: After initialization, @code{main} selects an emulation by pre-scanning
                    483: the command line arguments.  It calls @code{ldemul_choose_target} to
                    484: choose a target.  If you set @code{choose_target} to
                    485: @code{ldemul_default_target}, it picks your @code{target_name} by
                    486: default.
                    487:
                    488: @code{main} calls @code{ldemul_before_parse}, then @code{parse_args}.
                    489: @code{parse_args} calls @code{ldemul_parse_args} for each arg, which
                    490: must update the @code{getopt} globals if it recognizes the argument.
                    491: If the emulation doesn't recognize the argument, @code{parse_args}
                    492: then checks whether it recognizes the argument itself.
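
A rough sketch of this "emulation first, generic parser second" order follows.  Everything here is an assumption made for illustration: @code{demo_optind} merely imitates the @code{getopt} index, @code{--demo-emul-option} is an invented option, and the function names and signatures are not those used in @file{lexsup.c} or @file{ldemul.c}.

```c
#include <string.h>

/* Stand-in for getopt's index of the next argument to examine.  */
static int demo_optind = 1;

/* Hypothetical emulation hook.  If it recognizes the current argument
   it consumes it by advancing the index and returns 1; otherwise it
   leaves the index alone and returns 0.  */
int demo_emul_parse_args (int argc, char **argv)
{
  if (demo_optind < argc
      && strcmp (argv[demo_optind], "--demo-emul-option") == 0)
    {
      demo_optind++;
      return 1;
    }
  return 0;
}

/* Offers every argument to the emulation first.  Returns how many
   arguments the emulation claimed; all others fall through to the
   generic handling, which is reduced here to skipping the argument.  */
int demo_parse_args (int argc, char **argv)
{
  int claimed = 0;

  demo_optind = 1;
  while (demo_optind < argc)
    {
      if (demo_emul_parse_args (argc, argv))
        claimed++;
      else
        demo_optind++;   /* generic option handling would go here */
    }
  return claimed;
}
```

The emulation is responsible for advancing the index itself, which mirrors the requirement that the real hook update the @code{getopt} globals when it accepts an argument.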
                    493:
                    494: Now that the emulation has had access to all its command-line options,
                    495: @code{main} calls @code{ldemul_set_symbols}.  This can be used for any
                    496: initialization that may be affected by options.  It is also supposed
                    497: to set up any variables needed by the emulation script.
                    498:
                    499: @code{main} now calls @code{ldemul_get_script} to get the emulation
                    500: script to use (based on arguments, no doubt, @pxref{Emulations}) and
                    501: runs it.  While parsing, @code{ldgram.y} may call @code{ldemul_hll} or
                    502: @code{ldemul_syslib} to handle the @code{HLL} or @code{SYSLIB}
                    503: commands.  It may call @code{ldemul_unrecognized_file} if you asked
                    504: the linker to link a file it doesn't recognize.  It will call
                    505: @code{ldemul_recognized_file} for each file it does recognize, in case
                    506: the emulation wants to handle some files specially.  All the while,
                    507: it's loading the files (possibly calling
                    508: @code{ldemul_open_dynamic_archive}) and symbols and stuff.  After it's
                    509: done reading the script, @code{main} calls @code{ldemul_after_parse}.
                    510: Use the after-parse hook to set up anything that depends on stuff the
                    511: script might have set up, like the entry point.
                    512:
                    513: @code{main} next calls @code{lang_process} in @code{ldlang.c}.  This
                    514: appears to be the main core of the linking itself, as far as emulation
                    515: hooks are concerned(*).  It first opens the output file's BFD, calling
                    516: @code{ldemul_set_output_arch}, and calls
                    517: @code{ldemul_create_output_section_statements} in case you need to use
                    518: other means to find or create object files (e.g. shared libraries
                    519: found on a path, or fake stub objects).  Despite the name, nobody
                    520: creates output sections here.
                    521:
                    522: (*) In most cases, the BFD library does the bulk of the actual
                    523: linking, handling symbol tables, symbol resolution, relocations, and
                    524: building the final output file.  See the BFD reference for all the
                    525: details.  Your emulation is usually concerned more with managing
                    526: things at the file and section level, like "put this here, add this
                    527: section", etc.
                    528:
                    529: Next, the objects to be linked are opened and BFDs created for them,
                    530: and @code{ldemul_after_open} is called.  At this point, you have all
                    531: the objects and symbols loaded, but none of the data has been placed
                    532: yet.
                    533:
                    534: Next comes the Big Linking Thingy (except for the parts BFD does).
                    535: All input sections are mapped to output sections according to the
                    536: script.  If a section doesn't get mapped by default,
                    537: @code{ldemul_place_orphan} will get called to figure out where it goes.
                    538: Next it figures out the offsets for each section, calling
                    539: @code{ldemul_before_allocation} before and
                    540: @code{ldemul_after_allocation} after deciding where each input section
                    541: ends up in the output sections.
                    542:
                    543: The last part of @code{lang_process} is to figure out all the symbols'
                    544: values.  After assigning final values to the symbols,
                    545: @code{ldemul_finish} is called, and after that, any undefined symbols
                    546: are turned into fatal errors.
                    547:
                    548: OK, back to @code{main}, which calls @code{ldwrite} in
                    549: @file{ldwrite.c}.  @code{ldwrite} calls BFD's final_link, which does
                    550: all the relocation fixups and writes the output BFD to disk, and we're
                    551: done.
                    552:
                    553: In summary,
                    554:
                    555: @itemize @bullet
                    556:
                    557: @item @code{main()} in @file{ldmain.c}
                    558: @item @file{emultempl/@var{EMULATION}.em} has your code
                    559: @item @code{ldemul_choose_target} (defaults to your @code{target_name})
                    560: @item @code{ldemul_before_parse}
                    561: @item Parse argv, calls @code{ldemul_parse_args} for each
                    562: @item @code{ldemul_set_symbols}
                    563: @item @code{ldemul_get_script}
                    564: @item parse script
                    565:
                    566: @itemize @bullet
                    567: @item may call @code{ldemul_hll} or @code{ldemul_syslib}
                    568: @item may call @code{ldemul_open_dynamic_archive}
                    569: @end itemize
                    570:
                    571: @item @code{ldemul_after_parse}
                    572: @item @code{lang_process()} in @file{ldlang.c}
                    573:
                    574: @itemize @bullet
                    575: @item create @code{output_bfd}
                    576: @item @code{ldemul_set_output_arch}
                    577: @item @code{ldemul_create_output_section_statements}
                    578: @item read objects, create input bfds - all symbols exist, but have no values
                    579: @item may call @code{ldemul_unrecognized_file}
                    580: @item will call @code{ldemul_recognized_file}
                    581: @item @code{ldemul_after_open}
                    582: @item map input sections to output sections
                    583: @item may call @code{ldemul_place_orphan} for remaining sections
                    584: @item @code{ldemul_before_allocation}
                    585: @item gives input sections offsets into output sections, places output sections
                    586: @item @code{ldemul_after_allocation} - section addresses valid
                    587: @item assigns values to symbols
                    588: @item @code{ldemul_finish} - symbol values valid
                    589: @end itemize
                    590:
                    591: @item output bfd is written to disk
                    592:
                    593: @end itemize
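
When writing a hook, it can help to check which other hooks have already run.  The table below is a small sketch that encodes only the unconditional hooks from the summary above, in the order listed; the conditional ones (@code{ldemul_hll}, @code{ldemul_syslib}, @code{ldemul_place_orphan} and so on) are deliberately left out.  The @code{demo_} names are hypothetical helpers, not part of @code{ld}.

```c
#include <stddef.h>
#include <string.h>

/* Unconditional emulation hooks, in the order given in the summary.  */
static const char *const demo_hook_order[] =
{
  "ldemul_choose_target",
  "ldemul_before_parse",
  "ldemul_parse_args",
  "ldemul_set_symbols",
  "ldemul_get_script",
  "ldemul_after_parse",
  "ldemul_set_output_arch",
  "ldemul_create_output_section_statements",
  "ldemul_after_open",
  "ldemul_before_allocation",
  "ldemul_after_allocation",
  "ldemul_finish",
};

/* Returns the position of NAME in the table, or -1 if it is absent.  */
int demo_hook_index (const char *name)
{
  size_t i;

  for (i = 0; i < sizeof demo_hook_order / sizeof demo_hook_order[0]; i++)
    if (strcmp (demo_hook_order[i], name) == 0)
      return (int) i;
  return -1;
}

/* Returns nonzero if both hooks are in the table and FIRST is called
   before SECOND.  */
int demo_hook_precedes (const char *first, const char *second)
{
  int a = demo_hook_index (first);
  int b = demo_hook_index (second);

  return a >= 0 && b >= 0 && a < b;
}
```

For instance, @code{demo_hook_precedes ("ldemul_after_open", "ldemul_before_allocation")} is nonzero, matching the fact that all input files are open before any allocation takes place.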
                    594:
                    595: @node Architecture Specific
                    596: @chapter Some Architecture Specific Notes
                    597:
                    598: This is the place for notes on the behavior of @code{ld} on
1.1.1.1.2.1! pgoyette  599: specific platforms.  Currently, only Intel x86 is documented (and
1.1       christos  600: of that, only the auto-import behavior for DLLs).
                    601:
                    602: @menu
                    603: * ix86::                        Intel x86
                    604: @end menu
                    605:
                    606: @node ix86
                    607: @section Intel x86
                    608:
                    609: @table @emph
                    610: @code{ld} can create DLLs that operate with various runtimes available
1.1.1.1.2.1! pgoyette  611: on a common x86 operating system.  These runtimes include native (using
1.1       christos  612: the mingw "platform"), cygwin, and pw.
                    613:
1.1.1.1.2.1! pgoyette  614: @item auto-import from DLLs
1.1       christos  615: @enumerate
                    616: @item
1.1.1.1.2.1! pgoyette  617: With this feature on, DLL clients can import variables from a DLL
1.1       christos  618: without any concern from their side (for example, without any source
1.1.1.1.2.1! pgoyette  619: code modifications).  Auto-import can be enabled using the
        !           620: @code{--enable-auto-import} flag, or disabled via the
1.1       christos  621: @code{--disable-auto-import} flag.  Auto-import is disabled by default.
                    622:
                    623: @item
                    624: This is done completely within the bounds of the PE specification (to
1.1.1.1.2.1! pgoyette  625: be fair, there's a minor violation of the spec at one point, but in
1.1       christos  626: practice auto-import works on all known variants of that common x86
1.1.1.1.2.1! pgoyette  627: operating system).  So, the resulting DLL can be used with any other PE
1.1       christos  628: compiler/linker.
                    629:
                    630: @item
                    631: Auto-import is fully compatible with the standard import method, in
                    632: which variables are decorated using attribute modifiers.  Libraries of
                    633: either type may be mixed together.
                    634:
                    635: @item
                    636: Overhead (space): 8 bytes per imported symbol, plus 20 for each
1.1.1.1.2.1! pgoyette  637: reference to it; Overhead (load time): negligible; Overhead
        !           638: (virtual/physical memory): should be less than the effect of DLL
1.1       christos  639: relocation.
                    640: @end enumerate
                    641:
                    642: Motivation
                    643:
1.1.1.1.2.1! pgoyette  644: The obvious and only way to get rid of the dllimport insanity is
        !           645: to make the client access the variable directly in the DLL, bypassing
1.1       christos  646: the extra dereference imposed by ordinary DLL runtime linking.
                    647: I.e., whenever the client contains something like
                    648:
                    649: @code{mov dll_var,%eax},
                    650:
1.1.1.1.2.1! pgoyette  651: the address of dll_var in the instruction should be relocated to point
        !           652: into the loaded DLL.  The aim is to make the OS loader do so, and then
        !           653: make ld help with that.  The import section of a PE file is laid out
        !           654: as follows: there's a vector of structures, each describing the imports
        !           655: from a particular DLL.  Each such structure points to two other
        !           656: parallel vectors: one holding the imported names, and one which
        !           657: will hold the address of the corresponding imported name.  So, the
        !           658: solution is to de-vectorize these structures, making the import
1.1       christos  659: locations sparse and pointing directly into code.
                    660:
                    661: Implementation
                    662:
1.1.1.1.2.1! pgoyette  663: For each reference to a data symbol to be imported from a DLL (that is,
        !           664: a symbol named <sym> for which __imp_<sym> is found in the import
        !           665: library), an import fixup entry is generated.  That entry is of type
        !           666: IMAGE_IMPORT_DESCRIPTOR and is stored in the .idata$3 subsection.  Each
        !           667: fixup entry contains a pointer to the symbol's address within the .text
        !           668: section (marked with a __fuN_<sym> symbol, where N is an integer), a
        !           669: pointer to the DLL name (so the DLL name is referenced by multiple
        !           670: entries), and a pointer to the symbol name thunk.  The symbol name
        !           671: thunk is a singleton vector (__nm_th_<symbol>) pointing to an
        !           672: IMAGE_IMPORT_BY_NAME structure (__nm_<symbol>) that directly contains
        !           673: the imported name.  Here comes the ``on the edge'' problem mentioned
        !           674: above: the PE specification says that the name vector (OriginalFirstThunk)
        !           675: should run in parallel with the address vector (FirstThunk), i.e. that
1.1       christos  676: they should have the same number of elements and be terminated with zero.
1.1.1.1.2.1! pgoyette  677: We violate this, since FirstThunk points directly into machine code.  But
        !           678: in practice, the OS loader is implemented the sane way: it goes through
        !           679: OriginalFirstThunk and puts addresses into FirstThunk, and does nothing
        !           680: else.  Note again that the DLL and symbol name structures are reused
        !           681: across fixup entries and would be needed anyway to support the standard
        !           682: import mechanism, so the sustained overhead is 20 bytes per reference.
        !           683: Another question is whether having several IMAGE_IMPORT_DESCRIPTORs for
        !           684: the same DLL is possible.  The answer is yes; even the native
        !           685: compiler/linker does it (libth32's functions are in fact resident in the
        !           686: windows9x kernel32.dll, so if you use it, you have two
        !           687: IMAGE_IMPORT_DESCRIPTORs for kernel32.dll).  Yet another question is
        !           688: whether referencing the same PE structures several times is valid.  The
        !           689: answer is: why not?  Prohibiting it (detecting the violation) would
1.1       christos  690: require more work from the loader than allowing it.
                    691:
                    692: @end table
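
The 20-byte figure quoted for each reference is the size of one import descriptor.  The sketch below mirrors the five 32-bit fields of the PE IMAGE_IMPORT_DESCRIPTOR under a hypothetical name, @code{demo_import_descriptor}, and adds a hypothetical helper that applies the space estimate from the list above (8 bytes per imported symbol plus 20 per reference).  It is not the code @code{ld} uses to emit these entries.

```c
#include <stdint.h>

/* Field layout of a PE import descriptor, as used for one auto-import
   fixup entry.  All fields are 32-bit relative virtual addresses or
   plain 32-bit values.  */
struct demo_import_descriptor
{
  uint32_t OriginalFirstThunk; /* name vector; here the singleton
                                  __nm_th_<symbol> thunk */
  uint32_t TimeDateStamp;
  uint32_t ForwarderChain;
  uint32_t Name;               /* DLL name, shared by many entries */
  uint32_t FirstThunk;         /* address vector; here the __fuN_<sym>
                                  location inside .text */
};

/* Five 32-bit fields: this is the 20 bytes charged per reference.  */
_Static_assert (sizeof (struct demo_import_descriptor) == 20,
                "import descriptor must be 20 bytes");

/* Space estimate from the text: 8 bytes per imported symbol plus one
   descriptor for each reference to it.  */
unsigned long demo_auto_import_overhead (unsigned long symbols,
                                         unsigned long references)
{
  return 8UL * symbols
         + (unsigned long) sizeof (struct demo_import_descriptor) * references;
}
```

Under this estimate, two imported variables referenced three times in total cost 8 * 2 + 20 * 3 = 76 bytes.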
                    693:
                    694: @node GNU Free Documentation License
                    695: @chapter GNU Free Documentation License
                    696:
                    697: @include fdl.texi
                    698:
                    699: @contents
                    700: @bye
