Phases of translation

C++ source files are processed by the compiler to produce C++ programs.

# Notes

Source files, translation units and translated translation units need not be stored as files, nor need there be any one-to-one correspondence between these entities and any external representation. The description is conceptual only, and does not specify any particular implementation.

The conversion performed at phase 5 can be controlled by command line options in some implementations. gcc and clang use -finput-charset to specify the encoding of the source character set, and -fexec-charset and -fwide-exec-charset to specify the ordinary and wide literal encodings respectively. Visual Studio 2015 Update 2 and later use /source-charset and /execution-charset to specify the source character set and literal encoding respectively.
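The effect of the literal encoding can be observed by counting the code units a literal occupies. The following is a minimal sketch, not part of the original page: the helper `code_units` and the constant names are hypothetical, and the character is written as a universal character name so that the result does not depend on how the source file itself is encoded.

```cpp
#include <cstddef>

// Hypothetical helper: number of code units in a string literal,
// excluding the terminating null character.
template <class CharT, std::size_t N>
constexpr std::size_t code_units(const CharT (&)[N])
{
    return N - 1;
}

// U+00E9 (LATIN SMALL LETTER E WITH ACUTE).

// Ordinary literal: the count depends on the ordinary literal encoding,
// which is what options such as -fexec-charset or /execution-charset select.
constexpr std::size_t ordinary_units = code_units("\u00e9");

// UTF-8 and UTF-16 literals: the encoding is fixed by the prefix,
// so these counts do not depend on the options above.
constexpr std::size_t utf8_units  = code_units(u8"\u00e9"); // 2 code units
constexpr std::size_t utf16_units = code_units(u"\u00e9");  // 1 code unit
```

With a UTF-8 ordinary literal encoding, `ordinary_units` would be expected to equal `utf8_units`; with a single-byte encoding that contains the character, it would be expected to be 1.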

Some compilers do not implement instantiation units (also known as template repositories or template registries). Instead, they compile each template instantiation at phase 7, storing the code in the object file where it is implicitly or explicitly requested; the linker then collapses these compiled instantiations into one at phase 9.
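The two ways an instantiation can be requested in a translation unit can be sketched as follows. The function template `twice` and the function `twice_half` are hypothetical names used only for illustration. Whether the resulting code is deduplicated by the linker or handled through a repository is up to the implementation.

```cpp
// Hypothetical function template used only for illustration.
template <class T>
T twice(T value)
{
    return value + value;
}

// Explicit request: an explicit instantiation definition asks for
// twice<int> to be instantiated in this translation unit.
template int twice<int>(int);

// Implicit request: calling twice with a double argument causes
// twice<double> to be implicitly instantiated.
double twice_half()
{
    return twice(0.5);
}
```

If another translation unit also instantiates `twice<int>`, a compiler of the kind described above would typically emit a second copy in that object file and leave it to the linker to keep one.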

# Defect reports

| DR | Applied to | Behavior as published | Correct behavior |
|---|---|---|---|
| CWG 787 | C++98 | the behavior was undefined if a non-empty source file does not end with a newline character at the end of phase 2 | add a terminating newline character in this case |
| CWG 1104 | C++98 | the alternative token `<:` caused `std::vector<::std::string>` to be treated as `std::vector[:std::string>` | added an additional lexing rule to address this case |
| CWG 1775 | C++11 | forming a universal character name inside a raw string literal in phase 2 resulted in undefined behavior | made well-defined |
| CWG 2747 | C++98 | phase 2 checked the end-of-file splice after splicing, this is unnecessary | removed the check |
| P2621R3 | C++98 | universal character names were not allowed to be formed by line splicing or token concatenation | allowed |
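The lexing rule added by CWG 1104 can be illustrated with a short sketch. The function names `make_names` and `second_of_three` are hypothetical, and the example assumes a compiler that implements the resolution (in practice, one compiling as C++11 or later).

```cpp
#include <string>
#include <vector>

// With the resolution of CWG 1104, "<::" below is lexed as "<" followed
// by "::", not as the alternative token "<:" followed by ":".
inline std::vector<::std::string> make_names()
{
    return {"alpha", "beta"};
}

// The alternative tokens are still recognized elsewhere:
// "<:" and ":>" are equivalent to "[" and "]".
inline int second_of_three()
{
    int values<:3:> = {10, 20, 30};
    return values<:1:>;
}
```

Without the added rule, the first declaration would need a space between `<` and `::` to be parsed as intended.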

# See also