There are four distinct steps involved in transforming a C source file into an executable binary: preprocessing, compiling, assembling, and linking.
In theory, each step is the responsibility of a dedicated tool: the preprocessor cpp, the compiler cc, the assembler as, and the linker ld. In practice, the compiler will happily orchestrate all four steps for us and we can build a simple C program using a single command:
$ cc -o outname program.c
We'll look briefly at each step below and outline some of the most useful options for each.
The interface we'll describe was originally developed for GCC, the GNU Compiler Collection (originally the GNU C Compiler), and its supporting toolchain. It was later mimicked by Clang, which aimed to be a drop-in replacement for GCC, and so now applies to both. It's a little crufty and inconsistent, but the desire for backwards compatibility means we're stuck with it for the foreseeable future.
cpp is responsible for executing # directives and expanding macros. It takes a .c source file as input and outputs an expanded source file, still written in C.
Preprocessed files typically aren't retained, but when they are the convention is to give them a .i extension.
We can use the compiler's -E flag to view the preprocessed source. Output is printed to standard output unless we use the -o flag to specify an output filename.
$ cc -E program.c > program.i
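To make the expansion step concrete, here is a minimal sketch of a source file that uses two macros. The names BUFFER_SIZE, SQUARE and buffer_area are made up for this illustration, and the comment shows roughly what the marked line should look like in the -E output.

```c
/* Hypothetical macros, chosen only to illustrate expansion. */
#define BUFFER_SIZE 64
#define SQUARE(x) ((x) * (x))

int buffer_area(void)
{
    /* After preprocessing, the next line reads: return ((64) * (64)); */
    return SQUARE(BUFFER_SIZE);
}
```

The #define lines themselves do not appear in the preprocessed output; only their expansions do.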
The following preprocessor options are available:
|-C|Retain source comments in output.|
|-D name[=value]|Define the named symbol before preprocessing. If the value is omitted the symbol will have a default value of 1.|
|-I dir|Add the specified directory to the include search path.|
|-P|Omit line markers (debugging information) from the output.|
|-U name|Undefine the named symbol before preprocessing.|
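As a rough sketch of how symbols defined on the command line reach the source, the file below checks two symbols that could be supplied with the -D flag. GREETING, VERBOSE and both function names are hypothetical, picked only for this example.

```c
/* GREETING and VERBOSE are hypothetical symbols for this example. */
#ifndef GREETING
#define GREETING "hello"   /* fallback when -DGREETING=... is not given */
#endif

/* Returns 1 if VERBOSE was defined at build time, 0 otherwise. */
int verbose_enabled(void)
{
#ifdef VERBOSE
    return 1;
#else
    return 0;
#endif
}

/* Returns the greeting selected at build time. */
const char *greeting(void)
{
    return GREETING;
}
```

Compiled with no extra flags, greeting() returns "hello" and verbose_enabled() returns 0. Adding -DVERBOSE -DGREETING='"hi"' to the cc command line should flip both, assuming a shell that passes the inner double quotes through unchanged.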
cc translates a source file written in C into assembly language.
Assembly language is a human-readable representation of the binary machine code that actually runs on the computer's hardware; as such it's specific to the CPU architecture of the target system.
Assembly language files typically aren't retained, but we can view them using the compiler's -S flag, which halts compilation after they've been generated.
$ cc -S program.c
This will generate a .s assembly file for each input file provided.
as translates source files written in assembly language into binary machine code. It outputs a single .o object file for each input file provided.
The compiler defaults to automatically deleting these object files but we can retain them using the -c flag:
$ cc -c program.c
This instructs the compiler to compile and assemble the source files but stop before linking the resulting object files into an executable.
Linking is the final stage of the compilation process. The linker ld combines multiple object files into a single executable file. It also links in code from the standard library and any other external libraries referenced by the files.
The C standard library is linked in automatically. To link in a static library libfoo.a located on the default library search path we use the -l flag:
$ cc -o outname program.c -lfoo
Note that the standard lib prefix and .a (archive) extension are omitted. To link to a library that isn't on the default search path we have two options:
We can specify the library's full filepath as if it were a source or object file:
$ cc -o outname program.c /path/to/lib/libfoo.a
We can add the containing directory to the search path using the -L flag:
$ cc -o outname program.c -L/path/to/lib -lfoo
Note that libraries should be specified after the source or object files that reference them.
The compiler will happily accept multiple input files in varying stages of compilation:
$ cc -o outname src.c asm.s obj.o
In this case src.c will be compiled and assembled, asm.s will be assembled, and the two resulting object files will be linked with obj.o into an executable.
Warnings & Standards
Turn on compiler warnings with the following flags:
-Wall -Wextra --std=c99 --pedantic
The -Wall and -Wextra flags turn on most of the compiler's commonly useful warnings. The --std=c99 flag instructs the compiler to use the C99 standard; other available options include c89 and c11. The final --pedantic flag turns on a number of additional warnings specific to the particular standard chosen.
Warnings can be turned off individually, e.g. -Wno-unused-parameter will turn off warnings about unused parameters.
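Rather than disabling a warning for the whole build, it is often possible to silence it at the one place it fires. The sketch below assumes the unused-parameter warning (-Wunused-parameter in GCC and Clang, normally enabled by -Wextra). The function and parameter names are hypothetical.

```c
#include <stddef.h>

/* on_event and its parameters are hypothetical names for illustration. */
int on_event(int code, void *context)
{
    (void)context;   /* marks the parameter as deliberately unused */
    return code * 2;
}
```

Without the (void)context line, the flags above would be expected to produce a warning for this function. With it, the file should compile cleanly.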
Compiling Static Libraries
A static library is simply a collection or archive of object files. Static libraries are created using the ar (archiver) tool and by convention are given a lib prefix and a .a extension:
$ ar -rv libfoo.a *.o
Static libraries are built into the executable at compile time; they do not have to be present on the system at runtime.
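As a sketch of what might go into such an archive, the listing below shows a tiny library in a single block. The file names foo.h and foo.c and the functions foo_add and foo_clamp are hypothetical. In a real project the declarations would live in the header and the definitions in one or more .c files, each compiled to a .o file with cc -c before being archived.

```c
/* foo.h (hypothetical): the interface that programs using libfoo would include. */
int foo_add(int a, int b);
int foo_clamp(int value, int low, int high);

/* foo.c (hypothetical): compiled with cc -c into foo.o, then archived with ar. */
int foo_add(int a, int b)
{
    return a + b;
}

/* Restricts value to the inclusive range [low, high]. */
int foo_clamp(int value, int low, int high)
{
    if (value < low) return low;
    if (value > high) return high;
    return value;
}
```

A program that includes the header and calls these functions would then be linked against the archive, for example with -L and -lfoo.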
Compiling Dynamic Libraries
A dynamic or shared object library is a special collection of object files that can be loaded by a program at runtime. Dynamic libraries are created using the compiler's --shared flag and by convention are given a lib prefix and a .so extension:
$ cc --shared -o libfoo.so *.o
Dynamic libraries can be used in two ways:
An executable can be linked against a dynamic library at compile time. Multiple executables can then share a single library instance, which must be available on the system at runtime.
An executable can dynamically load and unload library files at runtime using the system's dynamic linking functions. Libraries used in this way can form the basis of a plugin system for an application.
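On POSIX systems the dynamic linking functions are dlopen, dlsym and dlclose, declared in dlfcn.h. The following is a minimal sketch of the second approach, assuming each plugin exports a function that takes and returns an int. The plugin_fn type and the call_plugin helper are hypothetical, and on some systems the program must also be linked with -ldl.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Signature every plugin is assumed to export (hypothetical). */
typedef int (*plugin_fn)(int);

/*
 * Loads the library at `path`, calls the function named `symbol` with `arg`,
 * stores the return value in *result, then unloads the library.
 * Returns 0 on success, -1 if the library or the symbol cannot be found.
 */
int call_plugin(const char *path, const char *symbol, int arg, int *result)
{
    void *handle = dlopen(path, RTLD_NOW);
    if (handle == NULL) {
        return -1;              /* dlerror() would describe the failure */
    }

    /* Idiom from the POSIX rationale for storing dlsym's void * result
       in a function pointer without a direct cast. */
    plugin_fn fn;
    *(void **)(&fn) = dlsym(handle, symbol);
    if (fn == NULL) {
        dlclose(handle);
        return -1;
    }

    *result = fn(arg);
    dlclose(handle);
    return 0;
}
```

An application could call this helper once per plugin file found in a plugins directory, treating a return value of -1 as "skip this file".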