
The Dld Library

This chapter describes how to use the dld library. To use any of the dld functions, you must include the header file `dld.h' for the declaration of the functions and error code constants.

Initializing Dld

The function dld_init must be called before any other dld functions.

Function: int dld_init (const char *progname)

where progname is the name of the currently running program, as given by argv[0].

This function initializes the internal data structures of dld and loads into memory the symbol definitions of the executing process. By doing so, other dynamically loaded functions can reference symbols already defined or share functions that already exist in the executing process.

dld_init returns 0 when successful; otherwise, it returns an error code that is non-zero (see section Definition of Error Codes).

Locating the Executable File

The path name of the executing process, as required by dld_init, is not always easy to obtain. Not all systems pass the entire path name of the executable file as the first argument (argv[0]) to main. In order to obtain the full path of the executable file, dld_init uses the dld_find_executable function.

Function: char *dld_find_executable (const char *command)

dld_find_executable returns the absolute path name of the file that would be executed if command were given as a command. It looks up the environment variable PATH, searches each of the directories listed for command, and returns the absolute path name of the first occurrence.

Note: If the current process is executed using the execve call without passing the correct path name as argument 0, dld_find_executable (argv[0]) will also fail to locate the executable file.

dld_find_executable returns zero if command is not found in any of the directories listed in PATH.
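
The PATH search described above can be sketched in a few lines of C. This is only an illustration of the lookup, not dld's implementation: find_in_path is a hypothetical helper, it takes the directory list as an argument instead of reading the environment, and as a simplification it skips empty or relative entries.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper, not part of dld: search the colon-separated
   directory list path_var for an executable regular file named
   command.  Returns a malloc'ed absolute path that the caller must
   free, or NULL if nothing is found.  Simplification: empty and
   relative directory entries are skipped. */
char *
find_in_path (const char *command, const char *path_var)
{
  assert (command != NULL && path_var != NULL);

  const char *dir = path_var;
  while (*dir != '\0')
    {
      /* Length of the current entry, up to the next ':' or the end. */
      size_t dir_len = strcspn (dir, ":");

      if (dir_len > 0 && dir[0] == '/')
        {
          size_t need = dir_len + 1 + strlen (command) + 1;
          char *candidate = malloc (need);
          struct stat st;

          if (candidate == NULL)
            return NULL;
          snprintf (candidate, need, "%.*s/%s", (int) dir_len, dir, command);

          /* The first entry naming an executable regular file wins. */
          if (stat (candidate, &st) == 0 && S_ISREG (st.st_mode)
              && access (candidate, X_OK) == 0)
            return candidate;
          free (candidate);
        }

      dir += dir_len;
      if (*dir == ':')
        dir++;
    }
  return NULL;
}
```

A caller would typically pass getenv ("PATH") as the second argument and treat a NULL result the same way as a zero return from dld_find_executable.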

Dynamically Linking in New Modules

The function dld_link dynamically links in the named relocatable object or library file into memory.

Function: int dld_link (const char *filename)

where filename is the path name of the file to be linked. Specifically, if the named file is a relocatable object file, it is completely loaded into memory. If it is a library file, only those modules defining an unresolved external reference are loaded. Since a module in the library may itself reference other routines in the library, loading it may generate more unresolved external references. Therefore, a library file is searched repeatedly until a scan through all library members is made without having to load any new modules.
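
The repeated library scan can be illustrated with a toy model. The sketch below is not dld's code: struct member and scan_library are hypothetical names, and each symbol is represented by one bit of an unsigned mask so that the sets of referenced and defined symbols are easy to update.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model (hypothetical, not dld's data structures): every symbol
   is one bit in an unsigned mask. */
struct member
{
  unsigned defines;   /* symbols this library member defines */
  unsigned needs;     /* symbols this library member references */
  int loaded;         /* nonzero once the member has been loaded */
};

/* Load every member of lib that defines a currently undefined symbol,
   and rescan until one full pass loads nothing new.  *referenced and
   *defined are updated in place.  Returns the number of members
   loaded. */
int
scan_library (struct member *lib, size_t count,
              unsigned *referenced, unsigned *defined)
{
  assert (lib != NULL && referenced != NULL && defined != NULL);

  int total = 0;
  int loaded_this_pass;

  do
    {
      loaded_this_pass = 0;
      for (size_t i = 0; i < count; i++)
        {
          unsigned undefined = *referenced & ~*defined;

          if (!lib[i].loaded && (lib[i].defines & undefined) != 0)
            {
              lib[i].loaded = 1;
              *defined |= lib[i].defines;
              /* Loading a member may introduce new unresolved
                 references, which is why another pass is needed. */
              *referenced |= lib[i].needs;
              loaded_this_pass++;
            }
        }
      total += loaded_this_pass;
    }
  while (loaded_this_pass > 0);

  return total;
}
```

In this model a member that appears early in the archive but is needed only by a later member is picked up on the second pass, which is the situation the repeated scan is meant to handle.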

Storage for the text and data of the dynamically linked modules is allocated using malloc. In other words, they are kept in the heap of the executing process.

After all modules are loaded, dld_link resolves as many external references as possible. Note that some symbols might still be undefined at this stage, because the modules defining them have not yet been loaded.

If the specified module is linked successfully, dld_link returns 0; otherwise, it returns a non-zero error code (see section Definition of Error Codes).

Unlinking a Module

The major difference between dld and other dynamic linkers is that dld allows object modules to be removed from the process anytime during execution. Unlinking a module is simply the reverse of the link operation (see section Dynamically Linking in New Modules). The specified module is removed and the memory allocated to it is reclaimed. Additionally, resolution of external references must be undone.

There are two unlink functions:

Function: int dld_unlink_by_file (const char *path, int hard)

Function: int dld_unlink_by_symbol (const char *id, int hard)

The two unlink functions are basically the same except that dld_unlink_by_file takes as argument the path name (path) of a file corresponding to a module previously linked in by dld_link, but dld_unlink_by_symbol unlinks the module that defines the specified symbol (id).

Both functions take a second argument hard. When hard is nonzero (hard unlink), the specified module is removed from memory unconditionally. On the other hand, if hard is zero (soft unlink), this module is removed from memory only if it is not referenced by any other modules. Furthermore, if unlinking a module leaves some other modules unreferenced, these unreferenced modules are also removed.

Hard unlink is usually used when you want to explicitly remove a module and probably replace it by a different module with the same name. For example, you may want to replace the system's printf by your own version. When you link in your version of printf, dld will automatically redirect all references to printf to the new version.

Soft unlink should be used when you are not sure if the specified module is still needed. If you just want to clean up unnecessary functions, it is always safe to use soft unlink.

Both unlink functions return 0 if the specified object file or symbol was previously loaded. Otherwise, they return a non-zero error code (see section Definition of Error Codes).

Important Points in Using Unlink

When a module is being unlinked, dld tries to clean up as much as it can to restore the executing process to a state as if this module had never been linked. This cleanup includes removing and reclaiming the memory for storing the text and data segments of the module, and un-defining any global symbols defined by this module.

However, side effects--such as modification of global variables, input/output operations, and allocations of new memory blocks--caused by the execution of any function in this module are not reversed. Thus, it is the responsibility of the programmer to explicitly carry out all necessary clean up operations before unlinking a module.

Invoking Dynamically Linked Functions

Dynamically linked functions may still be invoked from modules (e.g., main) that do not contain references to such functions.

Function: unsigned long dld_get_symbol (const char *id)

Returns the address of the global variable named id if found, 0 if not found.

Function: unsigned long dld_get_func (const char *func)

Returns the entry point of the function named func if found, 0 if not found. Non-zero returned values can be used as pointers to the functions.

A typical use of dld_get_func would be:

{
    void (*func) ();
    int error_code;

    ...

    /* First, link in the object file "my_object_file.o".  Proceed
       only if the link operation is successful, i.e. it returns 0.
       "my_new_func" is a function defined in "my_object_file.o".
       Set func to point at the entry point of this function and
       then invoke it indirectly through func. */

    if ((error_code = dld_link ("my_object_file.o")) == 0) {
        if ((func = (void (*) ()) dld_get_func ("my_new_func")) != 0)
            (*func) ();
        ...
    } else {

    ...
    }
}

Determining if a Function is Executable

Since dld allows modules to be added to or removed from an executing process dynamically, some global symbols may not be defined. As a result, an invocation of a function might reference an undefined symbol. We say that a function is executable if and only if all its external references have been fully resolved and all functions that it might call are executable.

Function: int dld_function_executable_p (const char *func)

The predicate function dld_function_executable_p helps solve this problem by tracing the cross references between modules and returns non-zero only if the named function is executable.

Note that the implementation of dld_function_executable_p is not complete according to the (recursive) definition of executability. External references through pointers are not traced. That is, dld_function_executable_p will still return non-zero if the named function uses a pointer to indirectly call another function which has already been unlinked. Furthermore, if one external reference of an object module is unresolved, all functions defined in this module are considered unexecutable. Therefore, dld_function_executable_p is usually too conservative.

However, it is advisable to use dld_function_executable_p to check if a function is executable before its invocation. In such a dynamic environment where object modules are being added and removed, a function that is executable at one point in time might not be executable at another. Under most circumstances, dld_function_executable_p is accurate. Also, the implementation of this function has been optimized and it is relatively cheap to use.

Listing the Undefined Symbols

Function: char **dld_list_undefined_sym ()

The function dld_list_undefined_sym returns an array of undefined global symbol names.

The list returned contains all the symbols that have been referenced by some modules but have not been defined. This function is designed for debugging, especially in the case when a function is found to be not executable but you do not know what the missing symbols are.

The length of the array is given by the global variable dld_undefined_sym_count, which always holds the current total number of undefined global symbols. Note that all C symbols are listed in their internal representation--i.e., they are prefixed by the underscore character `_'.

Storage for the array returned is allocated by malloc. It is the programmer's responsibility to release this storage by free when it is not needed anymore.

Explicitly Referencing a Symbol

Normally, a library module is loaded only when it defines one or more symbols that have been referenced. To force a library routine to be loaded, one needs to explicitly create a reference to a symbol defined by that library routine. The function dld_create_reference is designed for this purpose:

Function: int dld_create_reference (const char *name)

Usually name is the name of the library routine that should be loaded, but it can be any symbol defined by that routine. After such a reference has been created, linking the appropriate library by dld_link would cause the required library routine to be loaded.

If the call is successful, dld_create_reference returns 0; otherwise, it returns a non-zero error code (see section Definition of Error Codes).

The library routine loaded by this method can be unlinked by dld_unlink_by_symbol (name). Once it has been unlinked, the corresponding reference created by dld_create_reference is also removed so that this routine will not be loaded in again by subsequent linking of the library.

Explicitly Defining a Symbol

Dld allows a programmer to explicitly define global symbols. That is, a programmer can force a symbol to have storage assigned for it. This is especially useful in incremental program testing where the function being tested needs to access some global variables which are defined by another function not yet linked in (or even not yet written). There are two functions related to explicit definition:

Function: int dld_define_sym (const char *name, unsigned int size)

dld_define_sym forces dld to allocate size bytes for symbol name. It can be called before or after a reference to name is made. If references to name already exist when it is defined, all such references are directed to point to the correct address allocated for name.

dld_define_sym returns 0 if successful. Otherwise, it returns a non-zero error code (see section Definition of Error Codes). The typical error is a multiple definition of name.

Function: void dld_remove_defined_symbol (const char *name)

When the definition of name is no longer needed, it can be removed by dld_remove_defined_symbol.

C++ Constructor Support

The current version of dld does not support C++ global constructors and destructors. There was support in versions 3.2.7 and 3.2.8, but it was removed because the implementation was not portable.

Adding support for global constructors and destructors to dld should not be difficult: all that dld needs to know is the names of the symbols that should be executed after linking and before unlinking an object file.

Linking Other Languages

The easiest way to link in functions from other (i.e. not C) languages is to write a C function that makes the necessary calls to the other language. This is simpler than trying to link modules written in the other language directly, since you do not need to know what the actual symbol names are in the other language.

Every C++ compiler, for example, uses a different name mangling system, and it would be almost impossible for dld to know about them all (though a future version of dld should support the mangling systems of popular C++ compilers).

Here is how a C interface function can be used to let dld link in methods from a C++ class, no matter what mangling system is used:

// Declaration of class Foo.
class Foo
{
 public:
  Foo ();                  // constructor
  Foo (const Foo &other);  // copy constructor
  ~Foo ();                 // destructor
  int member_func ();      // member function

 private:
  int my_member;           // member variable
};

// Declare call_foo to be a C (unmangled) symbol.
extern "C" int call_foo (Foo a);

int
call_foo (Foo a)
{
  Foo b (a);            // use copy constructor
  Foo c;                // use constructor

  // Call member function.
  return c.member_func ();

  // Destructor is implicitly called for b and c.
}

Simply use dld to load and link the C function, call_foo, and dld will automatically resolve the references to the C++ member functions that call_foo contains.

If you do know the symbol names of functions in the other language, then you may use them directly as arguments to dld functions. You can investigate what symbol names are by running nm objfile.o where `objfile.o' is the name of the module in which the functions are defined.

As an example, here is how to find out what naming conventions your Fortran-77 compiler uses (this example was run on SunOS 4.1.3):

bash$ cat coef.f
**********************************************************
* COEF generates the coefficients and store them in the
* Y array for Newton's interpolation polynomial
**********************************************************

      subroutine coef(n,x,y)
      dimension x(n), y(n)
      do 2 j=1,n-1
      do 2 i=1,n-j
 2       y(n-i+1)=(y(n-i+1)-y(n-i))/(x(n-i+1)-x(n-i-j+1))
      return
      end
bash$ f77 -c coef.f
coef.f:
	coef:
bash$ nm coef.o
000001a8 b VAR_SEG1
00000000 T _coef_
bash$

For the SunOS 4.1.3 `nm', the `T' symbol type means that a global symbol is defined in that file. So, SunOS `f77' uses a leading and trailing underscore for global symbols.

Use the symbol coef_ as the argument to dld_get_func. Note: omit the leading underscore, because C functions, by convention, use a leading underscore and dld is written to automatically add it when dld_get_func is called.
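
The conversion from a name printed by nm to the name passed to dld_get_func is mechanical and can be wrapped in a small helper. The function below is a hypothetical convenience, not part of dld, and it assumes a system such as the one in the example, where the C compiler adds exactly one leading underscore.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of dld: given a symbol name as
   printed by nm, return the name without the single leading
   underscore that the C convention adds.  The result points into
   nm_name; no memory is allocated. */
const char *
strip_c_prefix (const char *nm_name)
{
  assert (nm_name != NULL);
  return nm_name[0] == '_' ? nm_name + 1 : nm_name;
}
```

For the Fortran routine above, strip_c_prefix ("_coef_") yields a pointer to "coef_", the form expected by dld_get_func. Only the leading underscore is dropped; the trailing one added by f77 is part of the name.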

Printing Error Messages

Function: void dld_perror (const char *user_mesg)

where user_mesg is a user-supplied string prepended to the error message. The function dld_perror prints out a short message explaining the error returned by the last dld function.

Function: char *dld_strerror (int code)

The function dld_strerror returns the error message string corresponding to the given error code (from dld_errno).

Definition of Error Codes

The dld functions return a non-zero error code when they fail. The global variable dld_errno also contains the most recent error code. The definitions of these error codes are:


DLD_ENOFILE
cannot open file.
DLD_EBADMAGIC
bad magic number.
DLD_EBADHEADER
failure reading header.
DLD_ENOTEXT
premature eof in text section.
DLD_ENOSYMBOLS
premature eof in symbols.
DLD_ENOSTRINGS
bad string table.
DLD_ENOTXTRELOC
premature eof in text relocation.
DLD_ENODATA
premature EOF in data section.
DLD_ENODATRELOC
premature EOF in data relocation.
DLD_EMULTDEFS
multiple definitions of symbol.
DLD_EBADLIBRARY
malformed library archive.
DLD_EBADCOMMON
common block not supported.
DLD_EBADOBJECT
malformed input file (not object file or archive).
DLD_EBADRELOC
bad relocation info.
DLD_ENOMEMORY
virtual memory exhausted.
DLD_EUNDEFSYM
undefined symbol.
