The document discusses various tools that are part of the GNU toolchain used for embedded software development. It provides descriptions of common tools used with the GNU Compiler Collection (GCC) such as ar, autoconf, as, gdb, ld, libtool, make, nm, objcopy, objdump, ranlib, readelf, size, strings and strip. It also covers topics like the preprocessor, compiler options, anonymous unions, arrays of zero length, and using attributes with GCC.
4. GNU Toolchain
• The GNU toolchain is a blanket term given to the
programming tools produced by the GNU project
• Projects included in the GNU toolchain are
– GNU make, for build and compilation automation
– GNU Compiler Collection (GCC), with compilers for
several languages
– GNU Binutils, linker, assembler and other tools
– GNU Debugger (GDB)
– GNU build system (autotools)
Embedded Linux Course
5. Software Tools Used with GCC
Tool Description
ar •A program to maintain library archive files by
adding, removing, and extracting files from the
archive.
•The most common use for this utility is to create
and manage object library archives used by the
linker.
autoconf Produces shell scripts that automatically configure
a source code package to compile on a specific
version of UNIX.
as The GNU assembler.
6. Tool Description
gdb The GNU debugger, which can be used to examine
the values and actions inside a program while it is
running.
ld The GNU linker. This program combines a
collection of object
files into an executable program.
libtool A generic library support script used in makefiles
to simplify
the use of shared libraries.
7. Tool Description
make •A utility that reads a Makefile to determine which
parts of a program need compiling and linking and
then issues the commands necessary to do so.
•Makefile defines file relationships and
dependencies.
nm •Lists the symbols defined in an object file.
objcopy •Copies and translates an object file from one
binary format to another.
8. Tool Description
objdump •Displays several different kinds of information
stored inside one or more object files.
ranlib •Creates and adds an index to an ar archive file.
•The index is the one used by ld to locate modules
in the library.
readelf •Displays information from an ELF formatted
object file.
9. Tool Description
size •Lists the names and sizes of each of the sections in
an object file.
strings •Reads through a file of any type and extracts the
character strings for display.
strip •Removes the symbol table, along with any other
information required for debugging, from an object
file or an archive library.
12. Object File (cont.)
Section Description
.bss static variables that are uninitialized or initialized to zero
.data initialized static variables
.rodata read-only data (i.e., constant data)
.text all of the code blocks
The program loader initializes the memory allocated
for the bss section when it loads the program.
13. Example
Version 1: largearrayinit.c
int myarray[50000] = {1, 2, 3, 4};
int main(void) {
myarray[0] = 3;
return 0;
}
Version 2: largearray.c
int myarray[50000]={0};
int main(void) {
myarray[0] = 3;
return 0;
}
gcc -c largearrayinit.c
size largearrayinit.o
ls -l largearrayinit.o
gcc -c largearray.c
size largearray.o
ls -l largearray.o
15. objdump
• The objdump utility can be used to extract
information from object files, static libraries, and
shared libraries and then list this information in a
human-readable form.
$ objdump -f -h -EL linkedlist.o
18. Symbol Table
• There is also usually a symbol table somewhere in the
object file that contains the names and locations of all
the static variables and functions referenced within
the source file
– Parts of this table may be incomplete
– Symbols are a central concept: the programmer uses
symbols to name things, the linker uses symbols to link,
and the debugger uses symbols to debug.
• Linker will resolve unresolved references
– Symbols that refer to variables and functions defined in
other source files
19. Symbol Names
• The nm utility can be used to list all the symbols
defined in (or referenced from) an object file, a static
archive library, or a shared library.
$ nm linkedlist.o
00000004 C avail
00000115 T delafter
         U exit
0000009c T freenode
         U funcA
00000056 T getnode
00000004 C info
00000000 T init_list
000000b9 T insafter
00000050 C list
00000043 T list_empty
00000185 T main
         U printf
         U puts
20. strip
• The strip utility removes the unused information from
the object file
• Stripping can dramatically reduce the size of the file.
$ strip linkedlist.o
$ file linkedlist.o
linkedlist.o: ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV), stripped
23. Preprocessor
• Predefined symbols
• #define, #undef
• Conditional directives
– #ifdef, #if, defined, #else, #elif
• # and ##
• Diagnostics
24. Example -1
#define SKIP_SPACES(p, limit) \
{ char *lim = (limit); \
while (p < lim) { \
if (*p++ != ' ') { \
p--; break; }}}
/* the semicolon causes trouble before else */
if (p != 0)
SKIP_SPACES (p, lim);
else ...
#define SKIP_SPACES(p, limit) \
do { char *lim = (limit); \
while (p < lim) { \
if (*p++ != ' ') { \
p--; break; }}} \
while (0)
Now SKIP_SPACES (p, lim);
expands into do {...} while (0);
25. Example -2
• Calculate the address of the base of the structure
given its type, and an address of a field within the
structure.
#define get_struct_addr(p, type, m) \
((type *)((char *)(p) - (size_t)&(((type *)0)->m)))
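The same calculation can be written with the standard offsetof macro from <stddef.h>. The sketch below is only an illustration: struct sensor, struct_from_member and sensor_from_reading are hypothetical names that do not appear in the slides.

```c
#include <stddef.h>   /* offsetof */

/* Hypothetical record type used only for this illustration. */
struct sensor {
    int id;
    double reading;
    char tag[8];
};

/* Same idea as get_struct_addr, written with the standard offsetof macro:
 * step back from the member's address by the member's offset. */
#define struct_from_member(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Given a pointer to the 'reading' field, recover the enclosing struct. */
static struct sensor *sensor_from_reading(double *reading_ptr)
{
    return struct_from_member(reading_ptr, struct sensor, reading);
}
```

Calling sensor_from_reading(&s.reading) on a struct sensor s yields &s.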
26. Predefined Macros
The standard C preprocessor macros provide minimal
information useful for display output at run time
• __DATE__ - string containing current date (like "Jan 1 2000")
• __FILE__ - string containing source file name (like "file.c")
• __LINE__ - current line number in source file, (like 7)
• __FUNCTION__ - current function name
printf ("The function %s in file %s, was compiled on: %s.\n",
__FUNCTION__, __FILE__, __DATE__ );
27. Example 1: #ifdef
#ifdef MAVIS
# include "horse.h" /* gets done if MAVIS is #defined*/
# define STABLES 5
#else
# include "cow.h" /* gets done if MAVIS isn't #defined */
# define STABLES 15
#endif
gcc test.c -DMAVIS
28. Once-Only Headers
• If a header file happens to be included twice, the compiler will
process its contents twice. This is very likely to cause an error,
e.g. when the compiler sees the same structure definition
twice.
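The usual remedy is an include guard. The sketch below pastes the contents of a hypothetical header (point.h, with the illustrative guard macro POINT_H) twice into one file to mimic a double #include; the second copy is skipped, so struct point is defined only once.

```c
/* First "inclusion" of the hypothetical header point.h. */
#ifndef POINT_H
#define POINT_H
struct point { int x; int y; };
#endif /* POINT_H */

/* Second "inclusion": POINT_H is already defined, so the body is
 * skipped and struct point is not redefined. */
#ifndef POINT_H
#define POINT_H
struct point { int x; int y; };
#endif /* POINT_H */

/* Uses the single definition of struct point. */
static int point_sum(struct point p)
{
    return p.x + p.y;
}
```

Without the #ifndef/#define pair, the second copy would be a redefinition of struct point and the compile would fail.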
29. defined()
• defined(name) - evaluates to 1 if name has been
defined
• #if defined(DEBUG) is the same as #ifdef DEBUG
• #if expression - if expression evaluates to a non-zero
value, process the following code
#if 0
/*code which should not be compiled*/
#endif
31. #elif
#if X == 1
...
#else /* X != 1 */
#if X == 2
...
#else /* X != 2 */
...
#endif /* X != 2 */
#endif /* X != 1 */
#if X == 1
...
#elif X == 2
...
#else /* X != 2 and X != 1*/
...
#endif /* X != 2 and X != 1*/
32. The # Operator
• # : Creating Strings from Macro Arguments
• #define PSQR(X) printf("The square of X is %d.\n", ((X)*(X)))
• PSQR(8); the output is:
The square of X is 64.
• #define PSQR(x) printf("The square of " #x " is %d.\n", ((x)*(x)))
• PSQR(8); the output is:
The square of 8 is 64.
Note: #x was replaced by "8"
33. The ## Operator: Concatenation
• ##: Concatenation means joining two tokens into one
• In the context of macro expansion, concatenation
refers to joining two lexical units into one longer one
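A minimal sketch of token pasting, modeled on the command-table idiom described in the GNU cpp manual. The handler functions and the COMMAND macro are illustrative names, not part of the slides.

```c
#include <string.h>

/* Hypothetical command handlers; names are illustrative only. */
static int quit_command(void) { return 1; }
static int help_command(void) { return 2; }

struct command {
    const char *name;
    int (*function)(void);
};

/* #NAME makes the string "quit"; NAME ## _command pastes two tokens
 * into the single identifier quit_command. */
#define COMMAND(NAME) { #NAME, NAME ## _command }

static struct command commands[] = {
    COMMAND(quit),
    COMMAND(help),
};
```

Each table entry is written once, and the preprocessor derives both the string name and the function identifier from it.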
35. Example
#ifdef DEBUG
#define debug(fmt,args...) printf (fmt ,##args)
#define debugX(level,fmt,args...) do { if (DEBUG>=level) printf(fmt,##args); } while (0)
#else
#define debug(fmt,args...)
#define debugX(level,fmt,args...)
#endif /* DEBUG */
If the variable arguments are omitted or empty, the ‘##’ operator causes
the preprocessor to remove the comma before it.
#define debug(format, ...) fprintf (stderr, format, ## __VA_ARGS__)
36. Diagnostics
• The directive `#error' causes the preprocessor to
report a fatal error.
• The directive `#warning' causes the preprocessor to
issue a warning and continue preprocessing
#ifdef __vax__
#error "Won't work on VAXen. See comments at get_last_object."
#endif
39. Remote-debugging under Linux (cont.)
• The purpose of GDB is to allow you to see what is going on
"inside" another program while it executes or what another
program was doing at the moment it crashed.
– Start your program, specifying anything that might affect its behavior.
– Make your program stop on specified conditions.
– Examine what has happened, when your program has stopped.
– Change things in your program, so you can experiment with correcting
the effects of one bug and go on to learn about another.
• Must specify the ‘-g’ option when you run the cross-compiler.
45. Data Display Debugger
• The Data Display Debugger (DDD) is a graphical front end to
GDB and other command line debuggers.
46. Suggestions for Additional Reading
• GDB: The GNU Project Debugger
Online Documentation
http://sourceware.org/gdb/onlinedocs/
48. Compiler options
void foo(void)
{
unsigned int a = 6;
char b = -20;
(a+b > 6) ? puts("X") : puts("Y");
}
-funsigned-char
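To see what the option changes, the sketch below spells out both interpretations of char explicitly, assuming 8-bit char. The function names are illustrative. In both cases b is converted to unsigned int before the addition, so the sum exceeds 6 either way and foo() prints "X" with or without -funsigned-char; only the value of the sum differs.

```c
/* Sum as computed when plain char is signed (GCC's default on x86). */
static unsigned int sum_with_signed_char(void)
{
    unsigned int a = 6;
    signed char b = -20;
    return a + b;      /* b becomes a very large unsigned value first */
}

/* Sum as computed when plain char is unsigned (as with -funsigned-char). */
static unsigned int sum_with_unsigned_char(void)
{
    unsigned int a = 6;
    unsigned char b = (unsigned char)-20;   /* stored as 236 */
    return a + b;                           /* 6 + 236 */
}
```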
49. Anonymous Unions
• Within a struct, a union can be declared without a
name, making it possible to address the union
members directly, just as if they were members of the
struct.
struct {
char code;
union {
char chid[4];
int numid;
};
char *name;
} morx;
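A short usage sketch, assuming a compiler that accepts anonymous unions (a GNU extension, standard since C11). The type name struct record and the helper set_text_id are illustrative; the point is that chid is reached as r->chid, with no intermediate union name.

```c
#include <stddef.h>
#include <string.h>

struct record {
    char code;
    union {              /* anonymous: the union has no member name */
        char chid[4];
        int numid;
    };
    char *name;
};

/* Fills in a record using the union member directly (r->chid). */
static void set_text_id(struct record *r, const char *id4)
{
    r->code = 'T';
    memcpy(r->chid, id4, sizeof r->chid);
}
```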
50. Arrays of Zero Length
• GNU C allows the declaration of arrays of zero length
to facilitate the creation of variable-length structures.
• This only makes sense if the zero-length array is the
last member of a struct.
• The effective size of the array is determined by how much
extra space is allocated beyond the struct itself.
typedef struct {
int size;
char string[0];
} vlen;
51. Cont.
#include <stdio.h>
#include <stdlib.h>
int main(int argc,char *argv[])
{
int i;
int count = 22;
char letter = 'a';
vlen *line = (vlen *)malloc(sizeof(vlen) + count);
line->size = count;
for(i=0; i<count; i++)
line->string[i] = letter++;
printf("sizeof(vlen)=%zu, line->size=%d\n",sizeof(vlen),line->size);
for(i=0; i<line->size; i++)
printf("%c ",line->string[i]);
printf("\n");
free(line);
return(0);
}
52. Data Alignment -Example
• struct is a way of encapsulating a group of related values in a
single object
• Structure members are placed in memory in the same order
in which they are declared
struct MixedData
{
char Data1;
short Data2;
int Data3;
char Data4;
};
Question: sizeof (struct MixedData) =?
53. Data Alignment Restrictions
• Most 16-bit and 32-bit processors do not allow words
and long words to be stored at any offset.
– For example, the Motorola 68000 does not allow a 16 bit
word to be stored at an odd address. Attempting to write a
16 bit number at an odd address results in an exception
54. Compiler Byte Padding
• Compilers have to follow the byte alignment
restrictions defined by the target microprocessors.
• This means that compilers have to add pad bytes into
user defined structures so that the structure does not
violate any restrictions imposed by the target
microprocessor
55. Natural alignment
• Natural alignment means storing data items at an
address that is a multiple of their size (for instance,
8-byte items go in an address multiple of 8).
• To enforce natural alignment while preventing the
compiler from moving fields around, you should use
filler fields that avoid leaving holes in the data
structure.
56. General Byte Alignment Rules
• The type of each member of the structure usually has
a default alignment
• Unless the programmer requests otherwise, each member is
aligned on a predetermined boundary
• char (one byte) will be 1-byte aligned.
short (two bytes) will be 2-byte aligned.
int (four bytes) will be 4-byte aligned.
float (four bytes) will be 4-byte aligned.
double (eight bytes) will be 8-byte aligned
57. Cont.
#include <stdio.h>
#include <linux/types.h>
/*
* Define several data structures, all of them start with a lone char
* in order to present an unaligned offset for the next field
*/
struct c {char c; char t;} c;
struct s {char c; short t;} s;
struct i {char c; int t;} i;
struct l {char c; long t;} l;
struct ll {char c; long long t;} ll;
struct p {char c; void * t;} p;
struct u1b {char c; __u8 t;} u1b;
struct u2b {char c; __u16 t;} u2b;
struct u4b {char c; __u32 t;} u4b;
struct u8b {char c; __u64 t;} u8b;
59. • After compilation the data structure will be
supplemented with padding bytes to ensure a proper
alignment for each of its members:
struct MixedData (after compilation)
{
char Data1;
char Padding0[1];
short Data2;
int Data3;
char Data4;
char Padding1[3];
};
60. • The compiled size of the structure is now 12 bytes.
• It is important to note that padding is added after the last
member so that the total size is a multiple of the alignment of
the largest member of the structure
• In this case 3 bytes are added after the last member to
pad the structure to a multiple of the size of a long word (4 bytes).
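The layout can be checked with offsetof and sizeof. This is a sketch that assumes a 2-byte short and a 4-byte int with natural alignment, as is typical for GCC on x86 and ARM; the helper function names are illustrative.

```c
#include <stddef.h>   /* offsetof, size_t */

struct MixedData {
    char  Data1;
    short Data2;
    int   Data3;
    char  Data4;
};

/* Offsets of the members and the total size, as laid out by the compiler. */
static size_t offset_data2(void) { return offsetof(struct MixedData, Data2); }
static size_t offset_data3(void) { return offsetof(struct MixedData, Data3); }
static size_t offset_data4(void) { return offsetof(struct MixedData, Data4); }
static size_t total_size(void)   { return sizeof(struct MixedData); }
```

Under these assumptions Data2 sits at offset 2 (one pad byte after Data1), Data3 at 4, Data4 at 8, and three trailing pad bytes bring the size to 12.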
61. #pragma
• The #pragma directives offer a way for each
compiler to offer machine- and operating system-specific
features while retaining overall compatibility
with the C and C++ languages.
• Pragmas are machine- or operating system-specific
by definition, and are usually different for every
compiler
62. #pragma pack
• The #pragma pack directive modifies the current
alignment rule for members of structures following
the directive
#pragma pack(1)
struct s_t {
char a;
int b;
short c;
int d;
}S;
struct s_t1 {
char a;
int b;
#pragma pack(1)
struct s_t2 {
char x;
int y;
} S2;
char c;
int d;
} S1;
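Because #pragma pack stays in effect for everything that follows it, a common pattern is to bracket only the structures that need packing with the push/pop forms, which GCC accepts. The sketch below compares the two layouts, assuming a 4-byte int and 2-byte short; the struct and function names are illustrative.

```c
#include <stddef.h>

struct natural_t {      /* default alignment rules */
    char a;
    int b;
    short c;
    int d;
};

#pragma pack(push, 1)   /* save the current setting, then pack to 1 byte */
struct packed_t {
    char a;
    int b;
    short c;
    int d;
};
#pragma pack(pop)       /* restore the previous alignment rule */

static size_t natural_size(void) { return sizeof(struct natural_t); }
static size_t packed_size(void)  { return sizeof(struct packed_t); }
```

With these assumptions the packed struct occupies 1 + 4 + 2 + 4 = 11 bytes, while the naturally aligned one occupies 16.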
63. Attributes
• The __attribute__ keyword can be used to assign an
attribute to a function or data declaration.
• The primary purpose of assigning an attribute to a
function is to make it possible for the compiler to
perform optimization.
void fatal_error(char *message) __attribute__ ((noreturn));
. . .
void fatal_error(char *message)
{
fprintf(stderr,"FATAL ERROR: %s\n",message);
exit(1);
}
struct mong {
char id;
int code __attribute__ ((aligned(4)));
};
64. Attribute: packed
• A variable with this attribute has the smallest possible
alignment.
• A variable will be separated no more than one byte
from its predecessor field.
• In a struct, a field with this attribute will be allocated
with no space between it and the field before it.
struct zrecord {
char id;
int zar[32] __attribute__ ((packed));
};
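The effect of the attribute can be observed by comparing the packed struct with an otherwise identical unpacked one. This sketch assumes a 4-byte int; the two struct names are illustrative variants of zrecord.

```c
#include <stddef.h>

struct zrecord_default {
    char id;
    int zar[32];                              /* aligned to 4 bytes */
};

struct zrecord_packed {
    char id;
    int zar[32] __attribute__ ((packed));     /* no gap after id */
};
```

Under that assumption zar starts at offset 4 in the default layout and at offset 1 in the packed one, and the struct shrinks from 132 to 129 bytes.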
65. Memory Formats
• Pentium (80x86), VAX are little-endian
• IBM 370, Motorola 680x0 (Mac), and most
RISC are big-endian
• Internet is big-endian
– Makes writing Internet programs on PC more awkward!
– WinSock provides htonl/htons and ntohl/ntohs (host to network &
network to host) functions to convert
66. Byte Order
• Byte ordering is the order of bytes within a word.
• Processors can number the bytes in a word such that
the MSB is either the first (left-most) or last (right-most)
value in the word.
• 0x12345678 can be stored in 4x8bit locations as
follows
Address Big-endian Little-endian
184 0x12 0x78
185 0x34 0x56
186 0x56 0x34
187 0x78 0x12
i.e. read top down or bottom up?
67. • A simple code snippet to test whether a given
architecture is big- or little-endian:
int x = 1;
if (*(char *)&x == 1)
/* little endian */
else
/* big endian */
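The snippet above can be wrapped in a function, and the POSIX htonl() call can be used to produce big-endian (network order) bytes regardless of the host. This is a sketch; host_is_little_endian and put_be32 are illustrative names.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* htonl, ntohl (POSIX) */

/* Returns 1 on a little-endian host, 0 on a big-endian host. */
static int host_is_little_endian(void)
{
    int x = 1;
    return *(char *)&x == 1;   /* is the low-order byte stored first? */
}

/* Stores a 32-bit value into buf in network (big-endian) byte order. */
static void put_be32(unsigned char buf[4], uint32_t value)
{
    uint32_t wire = htonl(value);      /* host order to network order */
    memcpy(buf, &wire, sizeof wire);
}
```

After put_be32(buf, 0x12345678), buf holds 0x12, 0x34, 0x56, 0x78 on any host, matching the big-endian column of the table.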
68. Example of C Data Structure
69. Byte Ordering in the Kernel
• Each supported architecture in Linux defines one of
__BIG_ENDIAN or __LITTLE_ENDIAN in
<asm/byteorder.h> in correspondence to the
machine's byte order.
u32 __cpu_to_be32(u32); /* convert cpu's byte order to big-endian */
u32 __cpu_to_le32(u32); /* convert cpu's byte order to little-endian */
u32 __be32_to_cpu(u32); /* convert big-endian to cpu's byte order */
u32 __le32_to_cpu(u32); /* convert little-endian to cpu's byte order */
70. volatile Qualifiers
• A volatile variable is one that can change
unexpectedly.
• Consequently, the compiler can make no assumptions
about the value of the variable. In particular, the
optimizer must be careful to reload the variable every
time it is used instead of holding a copy in a register.
71. Examples of volatile variables
• Hardware registers in peripherals (for example, status
registers)
• Non-automatic variables referenced within an
interrupt service routine
• Variables shared by multiple tasks in a multi-threaded
application
72. Cont.
unsigned int check_iobuf(void)
{
volatile unsigned int *iobuf = IOBUF;
unsigned int val;
while (*iobuf == 0) { }
val = *iobuf;
*iobuf = 0;
return(val);
}
• If iobuf had not been declared volatile, the compiler would
notice that nothing inside the loop changes *iobuf and could read it
only once, so the loop would no longer poll the hardware
73. Questions
• Can a parameter be both const and volatile ? Explain.
• Can a pointer be volatile ? Explain.
• What's wrong with the following function?:
int square(volatile int *ptr)
{
return *ptr * *ptr;
}
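One possible answer to the last question, as a sketch: the expression *ptr * *ptr reads the volatile object twice, and the value may change between the two reads, so the result is not guaranteed to be a square. Reading once into a local avoids this. The name square_once is illustrative.

```c
/* Reads the volatile object exactly once, then squares the local copy. */
static int square_once(volatile int *ptr)
{
    int value = *ptr;      /* single read of the volatile location */
    return value * value;
}
```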
76. Object Files and Libraries
• When combining object codes together to create a
single executable, the linker can find
a) the object codes as separate files in a directory (.o),
b) as object codes stored in a shared library (libxxx.so), or
c) as object codes stored in a static library (libxxx.a).
• Static library is also known as an archive
• Shared library is also known as a dynamic library
77. Object Files in a Directory
• The following sequence of commands will compile
all of the *.c files into object files and link them into an
executable program.
– always includes all named files (*.o)
$ gcc -c main.c -o main.o
$ gcc -c inlet.c -o inlet.o
$ gcc -c outlet.c -o outlet.o
$ gcc -c genspru.c -o genspru.o
$ gcc main.o inlet.o outlet.o genspru.o -o spinout
• The same result in a single step:
$ gcc main.c inlet.c outlet.c genspru.c -o spinout
78. Creating a Static Library
/* hellofirst.c */
#include <stdio.h>
void hellofirst()
{
printf("The first hello\n");
}
/* hellosecond.c */
#include <stdio.h>
void hellosecond()
{
printf("The second hello\n");
}
gcc -c hellofirst.c hellosecond.c
ar -crv libhello.a hellofirst.o hellosecond.o
The naming convention for static libraries is to begin the name with the
three letters lib and end the name with the suffix .a.
80. Object Files in a Static Library
• Linker will automatically search through the contents
of the library and include only the object files that are
necessary.
• Smaller executable files than the ones produced by
linking from a collection of object files in a directory
$ gcc -c inlet.c outlet.c genspru.c
$ ar -r libspin.a inlet.o outlet.o genspru.o
$ gcc main.c libspin.a -o spinner
81. Creating a Shared Library
• $ gcc -c -fpic hellofirst.c hellosecond.c
– -fpic option causes the output object codes to be generated using relocatable
addressing.
– The acronym pic stands for position independent code.
• $ gcc -shared hellofirst.o hellosecond.o -o libhello-1.0.1.so
• the .so suffix on the file name tells GCC that the object files are to be
linked into a shared library
– Normally the linker locates and uses the main() function as the entry point of a
program, but this output module has no such entry point.
• $ gcc -fpic -shared hellofirst.c hellosecond.c -o libhello-1.0.1.so
• $ ln -sf libhello-1.0.1.so libhello.so
• $ gcc twohellos.c libhello.so -o twohellos
• $ gcc twohellos.c -L. -lhello -o twohellos
83. Static Library vs. Shared Library
[Figure: with a static library, Application A and Application B each contain their own copy of the library; with a shared library, Application A and Application B use a single shared copy.]
85. Object Files in a Dynamic Library
• A dynamic library contains object files that are
loaded into memory and linked with a program only
when the program starts to run.
• Two advantages :
– program’s executable file is much smaller, and
– two or more programs are able to share object modules
loaded from the same dynamic library
• which is the reason dynamic libraries are also called shared
libraries
86. Locating the Libraries
• For a program to link properly, the linker must be
able to locate the libraries required to resolve the
external references.
• A shared library must be available at the time the
program is linked and again every time the program
is run.
• The libraries are located by so-name
– Ex. libm.so.6 , libutil-2.2.4.so, etc
87. Linking Time
1. Link Time
2. Load Time
– so-name
3. Run Time (Dynamic Loading)
– Using dlopen, dlsym, …
$ gcc -L. -L/home/fred/lib prog.o
$ gcc -lmilt prog.o
$ gcc libjj.a /home/fred/lib/libmilt.so prog.o
88. Load Time
• Whenever a program loads and prepares to run, the
shared libraries it needs are sought in the following
places:
1. Each of the directories listed in the colon-separated list in
the environment variable LD_LIBRARY_PATH
2. The list of libraries found in the file /etc/ld.so.cache,
• maintained by the ldconfig utility
3. The directory /lib
4. The directory /usr/lib
89. ldd
• The ldd utility reads through the object files in the
binary executable or shared library and lists all the
shared library dependencies.
– but the cross toolchain doesn't contain an ldd utility, so we use readelf
• arm-linux-readelf -a <exe> | grep Shared
[root@localhost fbv-1.0b]# arm-linux-readelf -a fbv | grep Shared
0x00000001 (NEEDED) Shared library: [libpng12.so.0]
0x00000001 (NEEDED) Shared library: [libungif.so.4]
0x00000001 (NEEDED) Shared library: [libc.so.6]
90. Using Run-Time Dynamic Linking
• You can use the same shared library in both load-time and
run-time dynamic linking
• The following example uses the dlopen() to get a handle to
the libsayfn.so. If dlopen() succeeds, the program uses the
returned handle in the dlsym() function to get the address of
the shared library's sayhello function. After calling the
function, the program calls the dlclose() to unload the shared
library.
• Because the program uses run-time dynamic linking, it is not
necessary to link against the shared library when building the program.
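The sequence described above can be sketched as follows. The library name libsayfn.so and the symbol sayhello come from the text; the void sayhello(void) signature and the helper name call_sayhello are assumptions made for this illustration.

```c
#include <dlfcn.h>   /* dlopen, dlsym, dlclose, dlerror */
#include <stdio.h>

/* Loads the library at 'path', calls its sayhello() function, and unloads it.
 * Returns 0 on success, -1 if the library or the symbol cannot be found.
 * The void sayhello(void) signature is an assumption. */
static int call_sayhello(const char *path)
{
    void *handle = dlopen(path, RTLD_LAZY);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    /* Convert the void * returned by dlsym into a function pointer. */
    void (*sayhello)(void);
    *(void **)&sayhello = dlsym(handle, "sayhello");
    if (sayhello == NULL) {
        fprintf(stderr, "dlsym: %s\n", dlerror());
        dlclose(handle);
        return -1;
    }

    sayhello();
    dlclose(handle);   /* unload the shared library */
    return 0;
}
```

A program would call call_sayhello("./libsayfn.so"). On older glibc versions the program must be linked with -ldl to get dlopen(), although libsayfn.so itself is never named on the link line.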
93. linker
• A software development tool that accepts one or more
object files as input and outputs a relocatable
program.
• The linker is thus run after all of the source files have
been compiled or assembled.
• The job of the linker is to combine object files and, in
the process, to resolve all of the unresolved symbols.
• The GNU linker is a powerful application as well, but
in many cases there is no need to invoke ld directly—
gcc invokes it automatically unless you use the -c
(compile only) option
94. • The output of the linker is a new object file that contains all
of the code and data from the input object files and is in the
same object file format.
[Figure: the .bss, .data, .rodata and .text sections of the input object files are merged into the corresponding .bss, .data, .rodata and .text sections of the output file.]
95. • If the compiler wasn't pointing to the correct libraries
or was using the host's libraries , we would have to
tell the compiler which libraries to use by setting the
link flags as follows:
LDFLAGS = -nostdlib -L${TARGET_PREFIX}/lib
To link your application statically:
LDFLAGS += -nostdlib -L${TARGET_PREFIX}/lib -static
96. • If, however, you had used the -nostdlib option in LDFLAGS,
which you should not normally need to do, you would also
need to change the section describing the files required for the
build and the rule for generating the binary:
• -nostdlib tells the linker not to use the standard startup files and libraries
• If you do not explicitly mention them while having disabled
standard linking, the linker will complain about the missing
_start symbol and fail.
97. Cont.
STARTUP_FILES = ${TARGET_PREFIX}/lib/crt1.o \
${TARGET_PREFIX}/lib/crti.o \
${PREFIX}/lib/gcc-lib/${TARGET}/2.95.3/crtbegin.o
END_FILES = ${PREFIX}/lib/gcc-lib/${TARGET}/2.95.3/crtend.o \
${TARGET_PREFIX}/lib/crtn.o
LIBS = -lc
OBJS = daemon.o
LINKED_FILES = ${STARTUP_FILES} ${OBJS} ${LIBS} ${END_FILES}
...
daemon: ${OBJS}
$(CC) -o $(EXEC_NAME) ${LINKED_FILES} $(LDFLAGS)
Note: crt1.o, crti.o, crtbegin.o, crtend.o, and crtn.o
are special startup, initialization, constructor, destructor,
and finalization files, respectively, which are usually automatically
linked to your applications.
It is through these files that your application's main( ) function is called.
98. Startup Code
• A piece of assembly language code that prepares the
way for software written in a high-level language.
• Each high-level language has its own set of
expectations about the runtime environment.
– e.g., allocate stack space, zero BSS block, etc.
– typically named startup.asm, crt0.s (short for C runtime), or
something similar
• Most C/C++ cross-compilers come with startup code
that you can modify, compile, and link with your
embedded programs.
99. The hardware and software initialization process
100. • Startup code for C/C++ programs usually consists of the
following actions:
1. Disable all interrupts.
2. Copy any initialized data from ROM to RAM.
3. Zero the BSS.
4. Allocate space for and initialize the stack.
5. Initialize the processor's stack pointer.
6. Create and initialize the heap.
7. Execute the constructors and initializers for all global variables (C++
only).
8. Enable interrupts.
9. Call main
101. • After merging all of the code and data sections and
resolving all of the symbol references, the linker
produces a special "relocatable" copy of the program
• The final step is to use a locator to fix the remaining
relocatable addresses within the code. The result of
that process is an executable.
– Assign code and data section to the absolute memory
addresses
103. Locator
• A software development tool that assigns physical
addresses to the relocatable program produced by the
linker.
– Needs information about the memory on the target board as
input to the locator
• This is the last step in the preparation of software for
execution by an embedded system, and the resulting
file is called an executable.
• In some cases, the locator's function is hidden within
the linker.
104. linker script
• Most of ld's functionality is controlled using linker
command files, which are text files that describe
things like the final output file's memory organization
• See
http://www.redhat.com/docs/manuals/enterprise/RHEL-
• The main purpose of the linker script is to describe
– how the sections in the input files (object file format )
should be mapped into the output file (an object file,
executable), and
– to control the memory layout of the output file
109. • The LMA, which follows the AT keyword, specifies the load
address of the section.
– designed to build a ROM image
• The address is an expression for the VMA (the
virtual memory address) of the output section. If you
do not provide address, the linker will set it based on
REGION if present, or otherwise based on the current
value of the location counter.
110. • If you provide neither address nor region, then the
address of the output section will be set to the current
value of the location counter aligned to the alignment
requirements of the output section.
.text . : { *(.text) }
.text : { *(.text) }
.text ALIGN(0x10) : { *(.text) } /* align the section on a 0x10 byte boundary */
111. Output section LMA
SECTIONS
{
.text 0x1000 : { *(.text) _etext = . ; }
.data 0x2000 :
AT ( ADDR (.text) + SIZEOF (.text) )
{ _data = . ; *(.data); _edata = . ; }
.bss 0x3000 :
{ _bstart = . ; *(.bss) *(COMMON) ; _bend = . ;}
}
Note:
1. ADDR: return the absolute address (the VMA) of the named section
2. location counter holds the VMA value, not the LMA value
112. Run-time initialization
• copy the initialized data from the ROM image to its
runtime address.
extern char _etext, _data, _edata, _bstart, _bend;
char *src = &_etext;
char *dst = &_data;
/* ROM has data at end of text; copy it. */
while (dst < &_edata) {
*dst++ = *src++;
}
/* Zero bss */
for (dst = &_bstart; dst < &_bend; dst++)
*dst = 0;
114. Cont.
• A section may be marked as loadable , meaning that
the contents should be loaded into memory when the
output file is run.
• A section with no contents may be allocatable , which
means that an area in memory should be set aside, but
nothing in particular should be loaded there
– in some cases this memory must be zeroed out
115. Cont.
• A section, which is neither loadable nor allocatable,
typically contains some sort of debugging
information
116. LMA & VMA
• Every loadable or allocatable output section has two
addresses.
• VMA (Virtual Memory Address) : the address the
section will have when the output file is run.
• LMA (Load Memory Address) : the address at which
the section will be loaded.
• In most cases the two addresses will be the same.
117. Example
• The LMA and VMA differ when a data section is loaded into ROM
and then copied into RAM when the program starts up (this technique
is often used to initialize global variables in a ROM based system).
• In this case the ROM address would be the LMA, and the RAM address
would be the VMA.
• $ objdump -h *.o
118. • $ ld -T u-boot.lds start.o loop.o -o sprig
– -T specifies the name of the script file
119. GCC inline assembly
• what is inline assembly ? some assembly routines
written as inline functions
• inline : instruct the compiler to insert the code of a
function into the code of its callers, to the point where
actually the call is to be made
• To declare inline assembly functions, we use the
keyword asm.
121. Basic Inline
• Basic inline assembly contains only instructions
• /* moves the contents of ecx to eax */
• asm("movl %ecx, %eax");
• /* moves the byte from bh to the memory pointed to by eax */
• __asm__("movb %bh, (%eax)");
• gcc sends each instruction as a string to as
__asm__ ("movl %eax, %ebx\n\t"
"movl $56, %esi\n\t"
"movl %ecx, $label(%edx,%ebx,$4)\n\t"
"movb %ah, (%ebx)");
122. • If our code touches (i.e., changes the contents of) some
registers and returns from asm without restoring them,
something bad is going to happen.
• This is because GCC has no idea about the changes
in the register contents, and this leads to trouble,
especially when the compiler makes some optimizations.
123. Extended Asm
asm ( assembler template /*assembly instructions*/
: output operands /* optional */
: input operands /* optional */
: list of clobbered registers /* optional */
);
asm ( "cld\n\t"
"rep\n\t"
"stosl"
: "+c" (count), "+D" (dest) /* %ecx and %edi are read and modified inside asm */
: "a" (fill_value)
: "memory" /* the instruction writes to memory through %edi */
);
124. Example (cont.)
int main(void) {
int a=10, b=5;
asm ("movl %1, %%eax; "
" movl %%eax, %0;"
:"=r"(b) /* output */
:"r"(a) /* input */
:"%eax" /* clobbered register */
);
printf("b=%d\n", b);
}
"=" : Means that this operand is write-only for this instruction; the
previous value is discarded and replaced by output data.
125. • "b" is the output operand, referred to by %0 and "a" is the input operand,
referred to by %1.
• "r" is a constraint on the operands.
– "r" says to GCC to use any register for storing the operands.
– output operand constraint should have a constraint modifier "=". And this
modifier says that it is the output operand and is write-only.
• There are two %’s prefixed to the register name. This helps GCC to
distinguish between the operands and registers. operands have a single %
as prefix.
• The clobbered register %eax after the third colon tells GCC that the value
of %eax is to be modified inside "asm", so GCC won’t use this register to
store any other value.
126. Cont.
static inline void delay (unsigned long loops)
{
__asm__ volatile ("1:\n"
"subs %0, %1, #1\n"
"bne 1b":"=r" (loops):"0" (loops));
}
/* some delay between MPLL and UPLL */
delay (8000);
127. Clobber List
• Some instructions clobber some hardware registers,
so we have to list those registers in the clobber-list
• Clobber List is to inform gcc that we will use and
modify them ourselves. So gcc will not assume that
the values it loads into these registers will be valid.
• We shouldn't list the input and output registers in this
list, because gcc knows that "asm" uses them
128. Clobber List (cont.)
• If our instruction modifies memory in an
unpredictable fashion, add "memory" to the list of
clobbered registers
– This will cause GCC to not keep memory values cached in
registers across the assembler instruction.
asm ( " movl %0,%%eax;"
" movl %1,%%ecx;"
" call _foo"
: /* no outputs */
: "g" (from), "g" (to)
: "eax", "ecx"
);
129. Building a Cross Compiler
• CrossTools (http://kegel.com/crosstool/)
creates a cross gcc, and also builds glibc for the target.
130. References
• GNU Manuals Online
– http://www.gnu.org/manual/manual.html
• “GCC: The Complete Reference” by Arthur Griffith
– McGraw-Hill, September 12, 2002
• Using ld, the GNU Linker