Why Linux/gnu linker chose address 0x400000?

Issue

I’m experimenting with ELF executables and the gnu toolchain on Linux x86_64:

I’ve linked and stripped (by hand) a “Hello World” test.s:

        .global _start
        .text
_start:
        mov     $1, %rax
        ...

into a 267 byte ELF64 executable…

0000000: 7f45 4c46 0201 0100 0000 0000 0000 0000  .ELF............
0000010: 0200 3e00 0100 0000 d400 4000 0000 0000  ..>[email protected]
0000020: 4000 0000 0000 0000 0000 0000 0000 0000  @...............
0000030: 0000 0000 4000 3800 0100 4000 0000 0000  [email protected]@.....
0000040: 0100 0000 0500 0000 0000 0000 0000 0000  ................
0000050: 0000 4000 0000 0000 0000 4000 0000 0000  [email protected]@.....
0000060: 0b01 0000 0000 0000 0b01 0000 0000 0000  ................
0000070: 0000 2000 0000 0000 0000 0000 0000 0000  .. .............
0000080: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000090: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000a0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000b0: 0400 0000 1400 0000 0300 0000 474e 5500  ............GNU.
00000c0: c3b0 cbbd 0abf a73c 26ef e960 fc64 4026  .......<&..`[email protected]&
00000d0: e242 8bc7 48c7 c001 0000 0048 c7c7 0100  .B..H......H....
00000e0: 0000 48c7 c6fe 0040 0048 c7c2 0d00 0000  [email protected]
00000f0: 0f05 48c7 c03c 0000 0048 31ff 0f05 4865  ..H..<...H1...He
0000100: 6c6c 6f2c 2057 6f72 6c64 0a              llo, World.

It has one program header (LOAD) and no sections:

There are 1 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x000000000000010b 0x000000000000010b  R E    200000

This seems to load the entire file (file offset 0 thru 0x10b – elf header and all) at address 0x400000.

The entry point is:

 Entry point address:               0x4000d4

Which corresponds to 0xd4 offset in the file, and as we can see that address is the start of the machine code (mov $1, %rax1)

My question is why (how) did the gnu linker choose address 0x400000 to map the file to?

Solution

The start address is usually set by a linker script.

For example, on GNU/Linux, looking at /usr/lib/ldscripts/elf_x86_64.x we see:

...
PROVIDE (__executable_start = SEGMENT_START("text-segment", 0x400000)); \
    . = SEGMENT_START("text-segment", 0x400000) + SIZEOF_HEADERS;

The value 0x400000 is the default value for the SEGMENT_START() function on this platform.

You can find out more about linker scripts by browsing the linker manual:

% info ld Scripts

Answered By – jkoshy

This Answer collected from stackoverflow, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Leave a Reply

(*) Required, Your email will not be published