[C/C++] Do you understand preprocessing?

Contents

1. Predefined symbols

2. #define

2.1 Defining identifiers with #define

2.2 Defining macros with #define

2.3 Replacement rules for #define

2.4 # and ##

2.5 Macro parameters with side effects

2.6 Comparison of macros and functions

Naming convention

3. #undef

4. Command line definition

5. Conditional compilation

6. File inclusion

6.1 How header files are included

① Including local files

② Including library files

6.2 Nested file inclusion

7. Other preprocessing directives

1. Predefined symbols

__FILE__    //name of the source file being compiled

__LINE__   //current line number in the file

__DATE__   //date the file was compiled

__TIME__   //time the file was compiled

__STDC__   //1 if the compiler conforms to ANSI C; otherwise undefined

These predefined symbols are built into the C language.

Output these predefined symbols:

#include<stdio.h>

int main()
{
	printf("%s\n", __FILE__);
	printf("%d\n", __LINE__);
	printf("%s\n", __DATE__);
	printf("%s\n", __TIME__);

	return 0;
}

2. #define

2.1 Defining identifiers with #define

#define is often used to define identifiers when writing code, for example:

#define MAX 1000
#define reg register // Create a short name for the keyword register
#define do_forever for(;;) // Give an idiom a more descriptive name
#define CASE break;case // Write break automatically when writing a case statement.
// If the replacement text is too long, it can be split across several lines. Every line except the last ends with a backslash (the line continuation character).
#define DEBUG_PRINT printf("file:%s\tline:%d\t \
             date:%s\ttime:%s\n" ,\
             __FILE__,__LINE__ ,    \
             __DATE__,__TIME__ ) 

Note: it is best not to put a ; at the end of a #define. Compare:

#define MAX 100
#define MAX 100;

Adding ; may cause problems like this:

#define MAX 100;
int main()
{
	int a = 0;
	int max = 0;
	if (a == 0)
		max = MAX;
	else
		max = 0;
	return 0;
}

The reason is that during preprocessing MAX is replaced with 100; rather than 100, so the code the compiler actually sees looks like this:

int main()
{
	int a = 0;
	int max = 0;
	if (a == 0)
		max = 100;;//note the extra ;
	else
		max = 0;
	return 0;
}

The extra ; is an empty statement that ends the if, so the following else has no matching if and compilation fails. To avoid this kind of syntax error, do not add ; at the end of a #define.

2.2 Defining macros with #define

The #define mechanism includes a provision for substituting parameters into the text; this form is usually called a macro.

A macro is declared like this:

#define name(parameter-list) stuff

The parameter-list is a comma-separated list of symbols that may appear in stuff.

Note:

The opening parenthesis of the parameter list must immediately follow name.

If there is any whitespace between the two, the parameter list is interpreted as part of stuff.

E.g:

#define SQUARE(x) x*x

This macro takes one argument, x.

Suppose we put the following in the program:

SQUARE(5);

During preprocessing, the preprocessor replaces it with:

5*5

Code:

#include <stdio.h>
#define SQUARE(x) x*x
int main()
{
	printf("%d\n", SQUARE(5));
	return 0;
}

At this point, the seemingly perfect macro actually has a flaw.

Consider another use:

printf("%d\n", SQUARE(5+1));

At first glance, you might expect this code to print 36, but it actually prints 11.

The reason is that the parameter x is replaced textually with 5+1, so the statement actually becomes:

printf("%d\n", 5 + 1 * 5 + 1);

To fix this, put parentheses around each use of the parameter in the macro definition:

#define SQUARE(x) (x)*(x)

Similarly:

#define DOUBLE(x) ((x)+(x)) // the outer parentheses prevent problems such as 10*DOUBLE(5) expanding to 10*(5)+(5)

Summary: macro definitions that evaluate numeric expressions should be parenthesized this way, both around each parameter and around the whole body, to avoid unexpected interactions with operators in the arguments or next to the macro call.

2.3 Replacement rules for #define

Expanding #define symbols and macros in a program involves several steps.

1. When a macro is called, its arguments are first checked for any symbols defined by #define. If there are any, they are replaced first.

2. The replacement text is then inserted into the program in place of the original text. For macros, parameter names are replaced by their values.

3. Finally, the result is scanned again to see whether it contains any symbols defined by #define. If it does, the process is repeated.

Note:

1. Other symbols defined by #define may appear in macro arguments and in #define definitions. Macros, however, cannot be recursive.

2. When the preprocessor searches for symbols defined by #define, it does not look inside string literals.

2.4 # and ##

Suppose we want to define a macro PRINT that prints a value.

First, note that adjacent string literals are concatenated automatically, for example:

printf("Hello ""World""!");

So can our PRINT macro be written like this?

#define PRINT(FORMAT,VALUE) printf("the value is FORMAT\n",VALUE)

int main()
{
	int a = 0;
	PRINT("%d", a);
	return 0;
}

This is clearly wrong: FORMAT appears inside a string literal, so it is not replaced, and the text FORMAT is printed literally.

The correct way:

#define PRINT(FORMAT,VALUE) printf("the value is "FORMAT"\n",VALUE)

This works only when the macro argument is itself a string literal, so that it can be joined with the adjacent string literals.

Besides this, there is another technique:

Use # to turn a macro argument into the corresponding string literal.

For example:

#include <stdio.h>
#define PRINT(FORMAT,VALUE)\
			printf("the value of "#VALUE " is "FORMAT"\n", VALUE)
int main()
{
	int a = 0;
	PRINT("%d", a+100);//prints: the value of a+100 is 100
	return 0;
}

The role of ##:

## joins the tokens on either side of it into a single token.

It allows macro definitions to build identifiers from separate pieces of text.

E.g:

#define ADD_TO_SUM(num, value)\
		sum##num += value

int main()
{
	int sum5 = 0;
	ADD_TO_SUM(5, 10);//The effect is: add 10 to sum5.
	return 0;
}

2.5 Macro parameters with side effects

When a macro parameter appears more than once in the macro definition and the argument has side effects, using the macro can be dangerous and lead to unpredictable results. A side effect is a lasting change that occurs when an expression is evaluated.

E.g:

x+1;  //no side effects

x++;  //with side effects

The MAX macro can demonstrate problems caused by parameters with side effects:

#include <stdio.h>
#define MAX(a, b) ( (a) > (b) ? (a) : (b) )
int main()
{
	int x = 5;
	int y = 8;
	int z = MAX(x++, y++);
	printf("x=%d y=%d z=%d\n", x, y, z);//What is the output?
    return 0;
}

After preprocessing, the actual code looks like this:

z = ( (x++) > (y++) ? (x++) : (y++));

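Running it prints x=6 y=10 z=9: because 5 > 8 is false, y++ is evaluated a second time, so y is incremented twice.

Is that what you expected?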

2.6 Comparison of macros and functions

Macros are often used for simple operations. For example, find the largest of two numbers.

#define MAX(a, b) ( (a) > (b) ? (a) : (b) )

So why not use a function for this task?

There are two reasons:

1. The code needed to call a function and return from it may take more time than the small computation itself. So for such small operations, a macro beats a function in both program size and speed.

2. More importantly, function parameters must be declared with specific types, so a function can only be used with expressions of the appropriate type. This macro, by contrast, works with integers, long integers, floating-point values, and any other type that can be compared with >. Macros are type independent.

Of course, macros also have disadvantages compared with functions:

1. Every time a macro is used, a copy of its code is inserted into the program. Unless the macro is short, this can increase the length of the program considerably.

2. Macros cannot be stepped through in a debugger.

3. Because macros are type independent, they are not type checked and are therefore less rigorous.

4. Macros can introduce operator precedence problems, which makes programs error-prone.

Macros can sometimes do things that functions cannot. For example, a macro parameter can be a type, but a function parameter cannot.

E.g:

#include <stdlib.h>
#define MALLOC(num, type)\
		(type*)malloc((num) * sizeof(type))
int main()
{
    int* newNode=MALLOC(10, int);//type as parameter
    free(newNode);//release the memory when done
    return 0;
}

After preprocessing:

	int* newNode = (int*)malloc((10) * sizeof(int));

A comparison of macros and functions:

| Attribute | #define macro | Function |
| --- | --- | --- |
| Code length | Macro code is inserted into the program each time it is used. Except for very small macros, the length of the program can grow substantially. | The function code appears in only one place; every time the function is used, the same code in that place is called. |
| Execution speed | Faster. | There is additional overhead for function calls and returns, so it is relatively slow. |
| Operator precedence | Macro parameters are evaluated in the context of all surrounding expressions. Unless parentheses are added, the precedence of adjacent operators may produce unpredictable results, so it is recommended to add more parentheses when defining macros. | Function parameters are evaluated only once, when the function is called, and the resulting value is passed to the function. The result of evaluating an expression is easier to predict. |
| Parameters with side effects | Parameters may be substituted in multiple places within the macro body, so evaluating parameters with side effects may produce unpredictable results. | Function parameters are evaluated only once, when they are passed, so the result is easier to control. |
| Parameter types | Macro parameters are independent of type; as long as the operation on the parameters is legal, the macro can be used with any parameter type. | Function parameters are tied to a type. If the parameter types differ, different functions are needed, even though the tasks they perform are the same. |
| Debugging | Macros are inconvenient to debug. | Functions can be debugged statement by statement. |
| Recursion | Macros cannot be recursive. | Functions can be recursive. |

Naming convention

In general, function-like macros and functions are used with very similar syntax, so the language itself cannot help us tell the two apart.

A common convention is therefore:

Write macro names in all uppercase.

Do not write function names in all uppercase.

3. #undef

This directive removes a macro definition.

#undef NAME
//If an existing name needs to be redefined, its old definition must first be removed.

For example:

#include <stdio.h>
#define MAX 100

int main()
{
	printf("%d\n", MAX);//Normal use at this point

#undef MAX
	printf("%d", MAX);  //Compilation error: MAX is no longer defined

	return 0;
}

4. Command line definition

Many C compilers let you define symbols on the command line when starting the compilation.

For example, this feature is useful when we want to build different versions of a program from the same source file. (Suppose a program declares an array of a certain length. On a machine with limited memory we need a small array, while on a machine with more memory the array can be larger.)

(Demonstrated in a Linux environment.)

MAX can then be set to 10 for one build and to 20 for another when compiling.

5. Conditional compilation

Conditional compilation directives make it easy to compile or skip a statement (or a group of statements) when building a program.

For example:

Debugging code is a pity to delete but gets in the way if kept, so we can compile it selectively.

#include <stdio.h>
#define __DEBUG__
int main()
{
	int i = 0;
	int arr[10] = { 0 };
	for (i = 0; i < 10; i++)
	{
		arr[i] = i;
#ifdef __DEBUG__
		printf("%d\n", arr[i]);//In order to observe whether the array assignment is successful.
#endif //__DEBUG__
	}
	return 0;
}

#ifdef: if the symbol is defined, the code that follows is compiled.

#endif: ends the #ifdef block.

Common conditional compilation directives:

1.
#if constant expression
//...
#endif
//The constant expression is evaluated by the preprocessor.
For example:
#define __DEBUG__ 1
#if __DEBUG__
//..
#endif

2. Conditional compilation with multiple branches
#if constant expression
//...
#elif constant expression
//...
#else
//...
#endif

3. Checking whether a symbol is defined
#if defined(symbol)
#ifdef symbol
#if !defined(symbol)
#ifndef symbol

4. Nested directives
#if defined(OS_UNIX)
	#ifdef OPTION1
		unix_version_option1();
	#endif
	#ifdef OPTION2
		unix_version_option2();
	#endif
#elif defined(OS_MSDOS)
	#ifdef OPTION2
		msdos_version_option2();
	#endif
#endif

6. File inclusion

We already know that the #include directive causes another file to be compiled, just as if its contents appeared at the position of the #include directive.

The replacement is simple:

The preprocessor removes the directive and replaces it with the contents of the included file.

If a file is included 10 times, it is actually compiled 10 times.

6.1 How header files are included

① Including local files

#include "filename"

Search strategy: the directory containing the source file is searched first. If the header file is not found there, the compiler searches the standard locations, just as it does for library headers.

If the file is still not found, a compilation error is reported.

The path to the standard header files on Linux:

/usr/include

The path to the standard header files in the VS environment:

C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\include

Adjust the path to match your own installation.

② Including library files

#include <filename.h>

The header file is searched for directly in the standard locations. If it is not found, a compilation error is reported.

Does this mean that library headers can also be included with " "?

The answer is yes.

However, the search is less efficient that way, and it becomes harder to tell library headers from local files.

6.2 Nested file inclusion

Consider the following scenario:

comm.h and comm.c are a common module.

test1.h and test1.c use the common module.

test2.h and test2.c use the common module.

test.h and test.c use the test1 module and the test2 module.

As a result, two copies of comm.h end up in the final program, which duplicates the file's contents.

How can this be solved?

Answer: conditional compilation.

At the beginning of each header file, write:

#ifndef __TEST_H__
#define __TEST_H__
//Contents of the header file
#endif  //__TEST_H__

or:

#pragma once

Either way prevents the header file from being included more than once.

7. Other preprocessing directives

#error
#pragma
#line
...

Reference book for this article: "Deep Anatomy of C Language"

Tags: C

Posted by afrim12 on Sat, 10 Dec 2022 16:51:38 +0530