How to embed binary data in an ELF file
An object file can be created from a raw binary data file and linked into an application. At run time the application can obtain the start address, end address, and size of the binary data through three symbols that the linker (ld) places in the object file it creates from the data.
This technique can be used to prepare installer programs in which the original program resides as data. The setup program can either have in-built unzip capability or depend on external commands like gunzip etc.
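The installer idea can be sketched as follows: once the embedded payload is reachable as a byte range, the setup program writes it out to a file. This is only a minimal sketch under stated assumptions, not code from the original article. `write_blob` is a hypothetical helper, and the start and end pointers are assumed to come from the linker-provided symbols described in the steps below.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copy the byte range [start, end) to the file at path.
 * Returns 0 on success, -1 on any I/O error. */
int write_blob(const char *path, const unsigned char *start,
               const unsigned char *end)
{
    size_t len = (size_t)(end - start);
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;

    size_t written = fwrite(start, 1, len, fp);

    /* fclose can also report a write error when buffered data is flushed. */
    if (fclose(fp) != 0 || written != len)
        return -1;
    return 0;
}
```

In a program linked against info.o, the call might look like `write_blob("info.out", (const unsigned char *)&_binary_info_dat_start, (const unsigned char *)&_binary_info_dat_end)`; the output file name here is an arbitrary choice. Decompressing the payload afterwards (for example with gunzip) is left out of the sketch.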
- An object file info.o can be created from a binary file info.dat as follows:
# ld -s -r -o info.o -b binary info.dat
- Link this object file with an application:
# gcc test.c info.o -o test
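- In the application code, the following three symbols can be used to access the contents of the info.dat file. Only the addresses of the symbols are meaningful, so the code takes their addresses and never reads their values:
extern int _binary_info_dat_end;
extern int _binary_info_dat_size;
extern int _binary_info_dat_start;
...
/* %p prints the addresses; the "size" is itself encoded as an address. */
printf("start %p end %p size %lu bytes\n",
       (void *)&_binary_info_dat_start,
       (void *)&_binary_info_dat_end,
       (unsigned long)&_binary_info_dat_size);
...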
- When the application is run, it prints the following:
# ./test
start 0x804900c end 0x80e1e18 size 626188 bytes
- The address of the _binary_info_dat_size symbol equals the size of the binary file that was used to create the object file:
bash# ls
info.dat
bash# ld -s -r -o info.o -b binary info.dat
bash# ll
total 1232
-rwxr-xr-x 1 root root 626188 Dec 22 18:16 info.dat
-rw-r--r-- 1 root root 626671 Dec 22 18:19 info.o
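As the session above shows, the size obtained through the size symbol matches the file size reported by the directory listing. The same number can also be derived by subtracting the start address from the end address, which avoids interpreting a symbol's address as an integer. A minimal sketch, assuming the two addresses are passed in as pointers; `blob_size` is a hypothetical name, not part of the original article:

```c
#include <stddef.h>

/* Hypothetical helper: number of bytes in the range [start, end). */
size_t blob_size(const void *start, const void *end)
{
    return (size_t)((const unsigned char *)end - (const unsigned char *)start);
}
```

With the addresses printed by the test program earlier, 0x80e1e18 - 0x804900c = 0x98e0c, which is 626188 bytes and agrees with the size of info.dat. Some developers prefer this form when the executable may be built position-independent, though that is an assumption about the toolchain and is not something the article tests.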
- The objdump -h output of the object file is as follows. There is only one section, .data:
bash# objdump -h info.o
info.o:     file format elf32-i386
Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .data         00098e0c  00000000  00000000  00000034  2**0
                  CONTENTS, ALLOC, LOAD, DATA
- The objdump -x output of the object file is as follows:
bash# objdump -x info.o
info.o:     file format elf32-i386
info.o
architecture: i386, flags 0x00000010:
HAS_SYMS
start address 0x00000000
Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .data         00098e0c  00000000  00000000  00000034  2**0
                  CONTENTS, ALLOC, LOAD, DATA
SYMBOL TABLE:
00000000 l    d  .data  00000000
00000000 l    d  *ABS*  00000000
00000000 l    d  *ABS*  00000000
00000000 l    d  *ABS*  00000000
00098e0c g       .data  00000000 _binary_info_dat_end
00098e0c g       *ABS*  00000000 _binary_info_dat_size
00000000 g       .data  00000000 _binary_info_dat_start
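The symbol names in the table are derived from the input file name: ld prefixes `_binary_`, replaces each character that is not a letter or digit with an underscore, and appends `_start`, `_end`, or `_size`. The sketch below reproduces that mapping so the expected symbol names for another data file can be worked out in advance. `binary_symbol_name` is a hypothetical helper, and it assumes the name is given exactly as it would appear on the ld command line, including any directory part.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: build "_binary_<mangled name>_<suffix>" into out.
 * Returns 0 on success, -1 if out (of capacity cap) is too small. */
int binary_symbol_name(const char *file_name, const char *suffix,
                       char *out, size_t cap)
{
    static const char prefix[] = "_binary_";
    size_t need = strlen(prefix) + strlen(file_name) + 1 + strlen(suffix) + 1;
    if (need > cap)
        return -1;

    char *p = out;
    memcpy(p, prefix, strlen(prefix));
    p += strlen(prefix);

    /* Every character that is not alphanumeric becomes an underscore. */
    for (const char *s = file_name; *s != '\0'; s++)
        *p++ = isalnum((unsigned char)*s) ? *s : '_';

    *p++ = '_';
    strcpy(p, suffix);
    return 0;
}
```

For the file used in this article, passing "info.dat" and "start" yields `_binary_info_dat_start`, matching the symbol table above. A path such as "data/my-file.bin" would give names beginning with `_binary_data_my_file_bin`, which is one reason to run ld from the directory that contains the data file.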