Preprocessors
Posted by shabeer Ahamed F
Before a C program is compiled, its source code is processed by a program
called the preprocessor. This step is called preprocessing. The commands the
preprocessor understands are called preprocessor directives, and they begin
with the “#” symbol. Below is the list of preprocessor directives that the C
language offers.
S.no | Preprocessor | Syntax | Description
1 | Macro | #define | Defines a macro, most often a named constant whose value can be of any basic data type.
2 | Header file inclusion | #include | The source code of the file “file_name” is included in the main program at the specified place.
3 | Conditional compilation | #ifdef, #endif, #if, #else, #ifndef | A set of commands is included in or excluded from the source program before compilation, depending on a condition.
4 | Other directives | #undef, #pragma | #undef removes the definition of a macro. #pragma passes implementation-defined instructions to the compiler; some compilers, for example, provide pragmas that run a function before or after main.
The C Preprocessor is not part of the compiler proper, but is a separate step
in the compilation process. In simplistic terms, the C Preprocessor is just a
text substitution tool, and it instructs the compiler to do the required
preprocessing before actual compilation. We'll refer to the C Preprocessor as
CPP.
All preprocessor commands begin with a pound symbol (#). It must be the first
nonblank character on the line, and for readability a preprocessor directive
should begin in the first column. The following table lists the important
preprocessor directives:
Directive | Description
#define | Substitutes a preprocessor macro
#include | Inserts a particular header from another file
#undef | Undefines a preprocessor macro
#ifdef | Returns true if this macro is defined
#ifndef | Returns true if this macro is not defined
#if | Tests if a compile-time condition is true
#else | The alternative for #if
#elif | #else and #if in one statement
#endif | Ends a preprocessor conditional
#error | Prints an error message on stderr
#pragma | Issues special commands to the compiler, using a standardized method
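To see several of these directives working together, here is a small sketch. BUFFER_SIZE, BUFFER_CLASS, TRACE and the three helper functions are names made up for this illustration; they are not part of any library.

```c
/* Hypothetical configuration value, chosen only for illustration. */
#define BUFFER_SIZE 64

/* #if / #elif / #else pick exactly one branch at preprocessing time. */
#if BUFFER_SIZE < 16
#error "BUFFER_SIZE must be at least 16"
#elif BUFFER_SIZE <= 256
#define BUFFER_CLASS "small"
#else
#define BUFFER_CLASS "large"
#endif

/* TRACE is deliberately left undefined, so the #ifndef branch is kept. */
#ifndef TRACE
#define TRACE_STATE "off"
#else
#define TRACE_STATE "on"
#endif

/* After #undef, BUFFER_SIZE is no longer a macro, so #ifdef fails. */
#undef BUFFER_SIZE
#ifdef BUFFER_SIZE
#define SIZE_STILL_DEFINED 1
#else
#define SIZE_STILL_DEFINED 0
#endif

const char *buffer_class(void) { return BUFFER_CLASS; }
const char *trace_state(void) { return TRACE_STATE; }
int size_still_defined(void) { return SIZE_STILL_DEFINED; }
```

Changing BUFFER_SIZE to a value below 16 should stop the build at the #error line instead of producing a program.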
The C preprocessor modifies a source code file before handing it over to the
compiler. You're most likely used to using the preprocessor to include files
directly into other files, or #define constants, but the preprocessor can also
be used to create "inline" code using macros expanded at compile time
and to prevent code from being compiled twice.
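Preventing code from being compiled twice is usually done with an "include guard". The sketch below simulates it inside a single file by pasting the text of a hypothetical header, point.h, two times. POINT_H, Point and point_sum are names invented for this example.

```c
/* ---- contents of a hypothetical header, point.h ---- */
#ifndef POINT_H
#define POINT_H
typedef struct { int x; int y; } Point;
#endif /* POINT_H */

/* ---- the same header text again, as if it had been included twice ---- */
#ifndef POINT_H
#define POINT_H
typedef struct { int x; int y; } Point;   /* skipped: POINT_H is already defined */
#endif /* POINT_H */

/* Uses the single surviving definition of Point. */
int point_sum(Point p) { return p.x + p.y; }
```

Without the #ifndef / #define / #endif lines, the second typedef would reach the compiler and be reported as a conflicting definition.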
There are essentially three uses of the preprocessor:
1. Directives,
2. Constants, and
3. Macros.
Directives are commands that tell the
preprocessor to skip part of a file, include another file, or define a constant
or macro. Directives always begin with a sharp sign (#) and for readability
should be placed flush to the left of the page. All other uses of the
preprocessor involve processing #defined constants or macros. Typically,
constants and macros are written in ALL CAPS to indicate they are special.
1. Directives
The #include directive tells the
preprocessor to grab the text of a file and place it directly into the current
file. Typically, such statements are placed at the top of a program--hence the
name "header file" for files thus included.
Analyze the following examples to
understand various directives.
#define MAX_ARRAY_LENGTH 20
This directive tells the preprocessor to replace
instances of MAX_ARRAY_LENGTH with 20. Use #define
for constants to increase readability.
#include <stdio.h>
#include "myheader.h"
The first directive tells the preprocessor to get
stdio.h from the system libraries and add its text to the current source
file. The second tells it to get myheader.h from the local directory
and add its content to the current source file.
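As a rough sketch of how these two directives show up in real code, the snippet below uses MAX_ARRAY_LENGTH as an array size and pulls printf in from stdio.h. The functions fill_squares and print_length are hypothetical helpers written for this example.

```c
#include <stdio.h>

#define MAX_ARRAY_LENGTH 20

/* Fills the array with squares and returns the last element. */
int fill_squares(int values[MAX_ARRAY_LENGTH])
{
    for (int i = 0; i < MAX_ARRAY_LENGTH; i++) {
        values[i] = i * i;
    }
    return values[MAX_ARRAY_LENGTH - 1];
}

/* Prints the array length; printf is declared in <stdio.h>. */
void print_length(void)
{
    printf("array length: %d\n", MAX_ARRAY_LENGTH);
}
```

If the array ever needs to grow, only the #define line has to change.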
2. Constants
If we write
#define [identifier name] [value]
then whenever [identifier name] shows up
in the file, it will be replaced by [value].
If we are defining a constant in terms of a mathematical expression, it is wise to surround the entire value in parentheses:
#define PI_PLUS_ONE (3.14 + 1)
By doing so, you avoid the
possibility that an order of operations issue will destroy the meaning of your
constant:
x = PI_PLUS_ONE * 5;
Without parentheses, the above would
be converted to
x = 3.14 + 1 * 5;
This would result in 1 * 5 being
evaluated before the addition, not after. Oops!
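A quick way to convince yourself is to compile both versions side by side. In this sketch PI_PLUS_ONE_BARE is a made-up name for the version without parentheses, and the two functions are hypothetical helpers.

```c
#define PI_PLUS_ONE_BARE 3.14 + 1      /* no parentheses: fragile */
#define PI_PLUS_ONE      (3.14 + 1)    /* parenthesized: safe */

/* Expands to 3.14 + 1 * 5, so the multiplication happens first. */
double times_five_bare(void) { return PI_PLUS_ONE_BARE * 5; }

/* Expands to (3.14 + 1) * 5, which is what was intended. */
double times_five(void) { return PI_PLUS_ONE * 5; }
```

The first function returns roughly 8.14, the second roughly 20.7.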
It is also possible to write simply
#define [identifier name]
which defines [identifier name] without giving it a value. This can be useful
in conjunction with another set of directives that allow conditional
compilation.
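Here is a minimal sketch of that idea. VERBOSE is a hypothetical flag name and log_mode is a helper invented for this example; the macro has no value at all, and #ifdef only asks whether it exists.

```c
#define VERBOSE   /* defined, but with no value */

/* Only one of the two return statements survives preprocessing. */
const char *log_mode(void)
{
#ifdef VERBOSE
    return "verbose";
#else
    return "quiet";
#endif
}
```

Deleting the #define line (or commenting it out) switches the function to the "quiet" branch without touching its body.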
3. Macros
The other major use of the preprocessor
is to define macros. The advantage of a macro is that it can be type-neutral
(this can also be a disadvantage, of course), and it is expanded inline,
directly into the code, so there isn't any function call overhead.
Note that in C++, it's possible to get around both of these issues with
template functions and the inline keyword.
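To make "type-neutral" concrete, here is a small sketch in which one macro serves both int and double. SQUARE, square_int and square_double are names made up for this example, and the parentheses in the macro body are there for safety.

```c
/* One macro, usable with any arithmetic type. */
#define SQUARE(x) ((x) * (x))

/* The same macro text is pasted into an int context... */
int square_int(int n) { return SQUARE(n); }

/* ...and into a double context, with no second definition needed. */
double square_double(double d) { return SQUARE(d); }
```

A plain C function would need one version per type, which is the trade-off the paragraph above describes.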
A macro definition is usually of the following form:
#define MACRO_NAME(arg1, arg2, ...) [code to expand to]
For instance, a simple increment macro
might look like this:
#define INCREMENT(x) x++
They look a lot like function calls,
but they're not so simple. There are actually a couple of tricky points when it
comes to working with macros. First, remember that the exact text of the macro
argument is "pasted in" to the macro. For instance, if you wrote
something like this:
#define MULT(x, y) x * y
and then wrote
int z = MULT(3 + 2, 4 + 2);
what value do you expect z to end up
with? The obvious answer, 30, is wrong! That's because what happens when the
macro MULT expands is that it looks like this:
int z = 3 + 2 * 4 + 2; // 2 * 4 will be evaluated first!
So z would end up with the value 13!
This is almost certainly not what you want to happen. The way to avoid it is to
force the arguments themselves to be evaluated before the rest of the macro
body. You can do this by surrounding them by parentheses in the macro
definition:
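#define MULT(x, y) (x) * (y)
// now MULT(3 + 2, 4 + 2) will expand to (3 + 2) * (4 + 2)
Parenthesizing the arguments is not the whole story, though: the code
surrounding a macro call can also change how the macro body is evaluated.
#define ADD_FIVE(a) (a) + 5
int x = ADD_FIVE(3) * 3;
// this expands to (3) + 5 * 3, so 5 * 3 is evaluated first
// Now x is 18, not 24!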
To fix this, you generally want to
surround the whole macro body with parentheses to prevent the surrounding
context from affecting the macro body.
#define ADD_FIVE(a) ((a) + 5)
int x = ADD_FIVE(3) * 3;
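Putting both rules together, the sketch below keeps the unsafe and the safe definitions next to each other so the difference can be checked. The _NAIVE names and the four wrapper functions are hypothetical, added only for comparison.

```c
#define MULT_NAIVE(x, y)  x * y          /* arguments not protected */
#define MULT(x, y)        ((x) * (y))    /* arguments and body protected */
#define ADD_FIVE_NAIVE(a) (a) + 5        /* body not protected */
#define ADD_FIVE(a)       ((a) + 5)      /* body protected */

int mult_naive(void)     { return MULT_NAIVE(3 + 2, 4 + 2); }  /* 3 + 2 * 4 + 2 */
int mult_safe(void)      { return MULT(3 + 2, 4 + 2); }        /* (3 + 2) * (4 + 2) */
int add_five_naive(void) { return ADD_FIVE_NAIVE(3) * 3; }     /* (3) + 5 * 3 */
int add_five_safe(void)  { return ADD_FIVE(3) * 3; }           /* ((3) + 5) * 3 */
```

The naive versions give 13 and 18; the fully parenthesized ones give 30 and 24.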
On the other hand, if you have a
multiline macro that you are using for its side effects, rather than to compute
a value, you probably want to wrap it within curly braces so you don't have
problems when using it following an if statement.
// We use a trick involving exclusive-or to swap two variables
#define SWAP(a, b) a ^= b; b ^= a; a ^= b;
int x = 10;
int y = 5;
// works OK
SWAP(x, y);
// What happens now?
if(x < 0)
SWAP(x, y);
When SWAP is expanded in the second
example, only the first statement, a ^= b, is governed by the conditional; the
other two statements will always execute. What we really meant was that all of
the statements should be grouped together, which we can enforce using curly
braces:
#define SWAP(a, b) {a ^= b; b ^= a; a ^= b;}
Now, there is still a bit more to our
story! What if you write code like so:
#define SWAP(a, b) { a ^= b; b ^= a; a ^= b; }
int x = 10;
int y = 5;
int z = 4;
// What happens now?
if(x < 0)
SWAP(x, y);
else
SWAP(x, z);
Then it will not compile, because the
semicolon after the closing curly brace breaks the flow between if and
else. The solution? Use a do-while loop:
#define SWAP(a, b) do { a ^= b; b ^= a; a ^= b; } while ( 0 )
int x = 10;
int y = 5;
int z = 4;
// What happens now?
if(x < 0)
SWAP(x, y);
else
SWAP(x, z);
Now the semicolon doesn't break
anything, because it simply completes the do-while statement. (By the way, note
that we didn't surround the arguments in parentheses because we don't expect
anyone to pass an expression into SWAP!)
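Here is the do-while version wrapped in a function so it can be compiled and tried out. swap_by_sign is a hypothetical helper written for this sketch; it returns the new value of x after the if/else.

```c
#define SWAP(a, b) do { a ^= b; b ^= a; a ^= b; } while (0)

/* Swaps x with y when x is negative, otherwise swaps x with z.
   Returns the value x holds afterwards. */
int swap_by_sign(int x, int y, int z)
{
    if (x < 0)
        SWAP(x, y);   /* the trailing semicolon ends the do-while */
    else
        SWAP(x, z);
    return x;
}
```

With the plain curly-brace version of SWAP, this same function should fail to compile at the else line.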
By now, you've probably realized why
people don't really like using macros. They're dangerous, they're picky, and
they're just not that safe. Perhaps the most irritating problem with macros is
that you don't want to pass arguments with "side effects" to macros.
By side effects, I mean any expression that does something besides evaluate to
a value. For instance, ++x evaluates to x+1, but it also increments x. This
increment operation is a side effect.
The problem with side effects is that macros don't evaluate their arguments; they just paste them into the macro text when performing the substitution. So something like
#define MAX(a, b) ((a) < (b) ? (b) : (a))
int x = 5, y = 10;
int z = MAX(x++, y++);
will end up looking like this:
int z = ((x++) < (y++) ? (y++) : (x++));
The problem here is that y++ ends up
being evaluated twice! The nasty consequence is that after this expression, y
will have a value of 12 rather than the expected 11. This can be a real pain to
debug!
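One common way around this, sketched here as an assumption rather than the only fix, is to use an ordinary function when the arguments might have side effects, because a function evaluates each argument exactly once. max_int, with_macro and with_function are hypothetical names used only in this comparison.

```c
#define MAX(a, b) ((a) < (b) ? (b) : (a))

/* A plain function: each argument is evaluated once, before the call. */
static int max_int(int a, int b) { return a < b ? b : a; }

/* Macro version: y++ appears twice in the expansion and runs twice. */
void with_macro(int *z_out, int *y_out)
{
    int x = 5, y = 10;
    *z_out = MAX(x++, y++);
    *y_out = y;
}

/* Function version: x++ and y++ each run once. */
void with_function(int *z_out, int *y_out)
{
    int x = 5, y = 10;
    *z_out = max_int(x++, y++);
    *y_out = y;
}
```

The macro version leaves z at 11 and y at 12, while the function version leaves z at 10 and y at 11.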