The Darren Mulholland


C Compiler Cheatsheet

A no-frills guide to the C compiler toolchain.



There are four distinct steps involved in transforming a C source file into an executable binary: preprocessing, compiling, assembling, and linking.

In theory each step is the responsibility of a dedicated tool: the preprocessor cpp, the compiler cc, the assembler as, and the linker ld. In practice the compiler will happily orchestrate all four steps for us and we can build a simple C program using a single command:

$ cc program.c

By default, the resulting executable will be given the rather unappealing name of a.out — short for assembler output — but we can fix this by specifying a custom output name:

$ cc -o myprog program.c

We'll look briefly below at each step of the compilation process and summarize some of the most useful options available.

The interface we'll describe was developed originally for GCC (the GNU Compiler Collection, originally the GNU C Compiler) and its supporting toolchain. This interface was later mimicked by Clang, which aimed to be a drop-in replacement for GCC, and so now applies to both. It's a little crufty and inconsistent but the desire for backwards compatibility means we're stuck with it for the foreseeable future.

Preprocessing

The C preprocessor cpp is responsible for executing # directives and expanding macros. It takes a .c source file as input and outputs an expanded source file, still written in C.

Preprocessed files typically aren't retained, but when they are the convention is to give them a .i extension. (I have no idea why.)

We can use the compiler's -E flag to view the preprocessed source. Output is printed to standard output unless we use the -o flag to specify an output filename.

$ cc -E program.c

The following preprocessor options are available (and can be passed directly to the compiler):

-C Retain source comments in the output.
-D<name>=<value> Define the named symbol before preprocessing. If no value is specified the symbol will have a default value of 1.
-I<directory> Add the specified directory to the search path for #include files.
-P Omit linemarkers (the lines beginning with # that record source file names and line numbers) from the output.
-U<name> Undefine the named symbol before preprocessing.
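To see the -D option in action, here's a minimal sketch. The filename greeting.c and the names GREETING, VERBOSE, greeting, and is_verbose are all hypothetical, invented purely for illustration.

```c
/* greeting.c: a hypothetical file used to illustrate the -D option. */

/* GREETING can be overridden on the command line; this is the fallback. */
#ifndef GREETING
#define GREETING "hello"
#endif

/* Returns whichever string the preprocessor substituted for GREETING. */
const char *greeting(void)
{
    return GREETING;
}

/* VERBOSE is never defined in the source. Passing -DVERBOSE with no
   value defines it as 1, so the first branch becomes 'return 1;'. */
int is_verbose(void)
{
#ifdef VERBOSE
    return VERBOSE;
#else
    return 0;
#endif
}
```

Running the preprocessor with no definitions should show the fallback values in the expanded source:

$ cc -E -P greeting.c

Supplying the symbols on the command line should instead show greeting returning "hi" and is_verbose returning 1 (the single quotes stop the shell from stripping the double quotes):

$ cc -E -P -DVERBOSE -DGREETING='"hi"' greeting.c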

Compiling

The compiler cc translates a source file written in C into assembly language.

Assembly language is a human-readable representation of the binary machine code that actually runs on the computer's hardware; as such it's specific to the CPU architecture of the target system.

Assembly language files typically aren't retained but we can view them using the compiler's -S flag which halts the compilation process after they've been generated.

$ cc -S program.c

This will generate a .s assembly file for each input file provided.
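Assembly output is easiest to read when the input is tiny. As a rough sketch, a one-function file like the following works well for experimenting (the filename square.c and the function name are hypothetical):

```c
/* square.c: a hypothetical one-function file, small enough that its
   assembly output is easy to read. */

/* Returns the sum of the squares of a and b. */
int sum_of_squares(int a, int b)
{
    return a * a + b * b;
}
```

The file has no main function, which is fine here since we stop before linking:

$ cc -S square.c

This should leave a square.s file alongside the source. Its exact contents depend on the compiler, its version, and the target CPU, so expect different output on different machines.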

Assembling

The assembler as translates source files written in assembly language into binary machine code. It outputs a single .o object file for each input file provided. An object file isn't yet executable on its own; that's the linker's job.

The compiler defaults to automatically deleting these object files but we can retain them using the -c flag.

$ cc -c program.c

This instructs the compiler to compile and assemble the source files into object files but stop before linking them into an executable.

Linking

Linking is the final stage of the compilation process. The linker ld combines multiple object files into a single executable file. It also links in code from the standard library and any other external libraries referenced by the files.

The C standard library is linked in automatically. To link in a static library libfoo.a located on the default library search path we use the -l flag:

$ cc program.c -lfoo

Note that the standard lib prefix and .a (archive) extension are omitted. To link to a library that isn't on the default search path we have two options:

  1. We can specify the library's full filepath as if it were a source or object file:

$ cc program.c /path/to/lib/libfoo.a

  2. We can add the containing directory to the search path using the -L flag:

$ cc program.c -L/path/to/lib -lfoo

Note that libraries must be specified after the source or object files that reference them.

Multiple Files

The compiler will happily accept multiple input files in varying stages of compilation:

$ cc src.c asm.s obj.o

In this case src.c will be compiled and assembled, asm.s will be assembled, and the two resulting object files will be linked with obj.o into an executable.

Warnings & Standards

Turn on compiler warnings with the following flags:

-Wall -Wextra -std=c99 -pedantic

The -Wall and -Wextra flags turn on most of the commonly useful warnings, though despite the name not literally all of them. The -std=c99 flag instructs the compiler to use the C99 standard; other available options include c90 and c11. The final -pedantic flag turns on a number of additional warnings demanded by strict conformance to the particular standard chosen. (The double-dash spellings --std=c99 and --pedantic are also accepted.)

Warnings can be turned off individually, e.g.

-Wno-unused-parameter

will tell the compiler to stop bugging us about unused parameters.
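As a small illustration of what that warning looks like in practice, here's a sketch of a file that should trigger it when compiled with the flags above. The filename warn.c and both function names are hypothetical.

```c
/* warn.c: a hypothetical file that triggers the unused-parameter warning. */

/* 'context' is never read, so the compiler reports it as unused. */
int on_event(int code, void *context)
{
    return code * 2;
}

/* Casting the parameter to void marks it as deliberately unused and
   silences the warning for this one function only. */
int on_event_quiet(int code, void *context)
{
    (void)context;
    return code * 2;
}
```

Compiling with warnings on should produce a complaint about on_event but not about on_event_quiet:

$ cc -Wall -Wextra -c warn.c

The cast is an alternative to -Wno-unused-parameter when we only want to quieten a single function rather than the whole build.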

Compiling Static Libraries

A static library is simply a collection or archive of object files. Static libraries are created using the ar (archiver) tool and by convention are given a lib prefix and .a extension.

$ ar -rv libfoo.a one.o two.o three.o

Static libraries are built into the executable at link time, so they do not have to be present on the system at runtime.
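To make the workflow concrete, here's a minimal sketch of a source file that could go into such a library. The filename foo.c and the function foo_clamp are hypothetical; a real library would normally put the declaration in a separate header file.

```c
/* foo.c: hypothetical source for a library we'll call libfoo.a. */

/* Declaration that would normally live in a header such as foo.h. */
int foo_clamp(int value, int low, int high);

/* Restricts value to the inclusive range [low, high]. */
int foo_clamp(int value, int low, int high)
{
    if (value < low)
        return low;
    if (value > high)
        return high;
    return value;
}
```

We compile the file to an object file, then archive it:

$ cc -c foo.c
$ ar -rv libfoo.a foo.o

A program that calls foo_clamp can then link against the archive, assuming it sits in the current directory:

$ cc program.c -L. -lfoo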

Compiling Dynamic Libraries

A dynamic or shared object library is a special collection of object files that can be loaded by a program at runtime. Dynamic libraries are created using the compiler's -shared flag and by convention are given a lib prefix and .so extension. On most platforms the object files need to be compiled as position-independent code using the -fPIC flag.

$ cc -shared -o libfoo.so one.o two.o three.o

Dynamic libraries can be used in two ways: