And Ezra opened the book in the sight of all the people; (for he was above all the people;) and when he opened it, all the people stood up: .. Also Jeshua, and Bani, and Sherebiah, Jamin, Akkub, Shabbethai, Hodijah, Maaseiah, Kelita, Azariah, Jozabad, Hanan, Pelaiah, and the Levites, caused the people to understand the law: and the people stood in their place. So they read in the book in the law of God distinctly, [translating to give] the sense, and caused them to understand the reading.
Nehemiah 8:5-8
What is a compiler?
Input: source code
Output: binary (machine) code, assembly code, or even code in another high-level language such as C
Best definition: Translator. Thus, the name of the course: Language Translation Systems
Grace Murray Hopper, a programmer for the Mark I computer, found herself coding certain tasks repeatedly (sin, cos, etc.) and devised the concept of subroutines. She coined the term "compiler" for a program that would insert the called subroutines at the correct places in a program (today we might call such a program a linker).
First modern compiler: FORTRAN, mid-1950s, written by hand by a team led by John Backus. It required 18 person-years to build.
Theoretical work in the 1960s led to tools that automate much of the process of writing a compiler, reducing the required effort to a matter of person-days.
Interpreters intermingle translation with execution
Assemblers encode assembly language into machine language ("encode" better describes the procedure than does "translate")
Linkers assemble a collection of separately compiled modules to form a single executable
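The difference between an interpreter and a translator can be sketched in a few lines. This is only an illustration under stated assumptions: the postfix "language" and the PUSH/ADD/SUB/MUL stack-machine instructions are made up for this example and do not come from the course text.

```python
# Hypothetical toy language: space-separated postfix arithmetic, e.g. "2 3 + 4 *".
APPLY = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}  # made-up target instructions


def interpret(source):
    """Interpreter: each token is executed as soon as it is read."""
    stack = []
    for tok in source.split():
        if tok in APPLY:
            right = stack.pop()
            left = stack.pop()
            stack.append(APPLY[tok](left, right))
        else:
            stack.append(int(tok))
    return stack.pop()


def translate(source):
    """Translator: only emits target instructions; nothing is executed here."""
    return [OPCODES[tok] if tok in OPCODES else f"PUSH {tok}"
            for tok in source.split()]


def run(code):
    """A stand-in for the target machine that executes the emitted code."""
    by_opcode = {name: APPLY[sym] for sym, name in OPCODES.items()}
    stack = []
    for instr in code:
        if instr.startswith("PUSH "):
            stack.append(int(instr.split()[1]))
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(by_opcode[instr](left, right))
    return stack.pop()
```

Here interpret("2 3 + 4 *") produces the value directly, while translate("2 3 + 4 *") produces a list of instructions that must later be handed to run to obtain the same value.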
The translation process involves several steps (p. 7):
Lexical analysis: Convert a stream of characters into a stream of tokens (lexical units). Performed by a scanner.
Syntactic analysis: "Sentence Diagramming" - Verify that the program is written according to the language's grammar rules; may produce a tree data structure. Performed by a parser.
Semantic analysis: Verify that the program is legal beyond its grammar (for example, that identifiers are declared and types agree), collect information about the program, and decorate the tree with it
Optimization: Remove unused subroutines, evaluate constant arithmetic at compile time, etc.
Code Generation: Generate code in the target language
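The five steps above can be sketched end to end for a very small expression language. This is a minimal sketch, not a definitive design: the grammar (integers, identifiers, +, -, *, parentheses), the tuple-based tree, the `declared` set standing in for a symbol table, and the PUSH/LOAD/ADD/SUB/MUL target instructions are all assumptions made for illustration.

```python
import re

TOKEN = re.compile(r"\s*(?:(?P<num>\d+)|(?P<id>[A-Za-z_]\w*)|(?P<op>[-+*()]))")

def scan(text):
    """Phase 1, lexical analysis: characters -> (kind, value) tokens."""
    text, tokens, pos = text.rstrip(), [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"bad character at position {pos}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens

def parse(tokens):
    """Phase 2, syntactic analysis: tokens -> tree (recursive descent)."""
    pos = 0
    def peek():
        return tokens[pos] if pos < len(tokens) else ("eof", "")
    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok
    def factor():
        kind, value = advance()
        if kind == "num":
            return ("num", int(value))
        if kind == "id":
            return ("id", value)
        if value == "(":
            node = expr()
            if advance()[1] != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected {value!r}")
    def term():
        node = factor()
        while peek()[1] == "*":
            advance()
            node = ("bin", "*", node, factor())
        return node
    def expr():
        node = term()
        while peek()[1] in ("+", "-"):
            op = advance()[1]
            node = ("bin", op, node, term())
        return node
    tree = expr()
    if peek()[0] != "eof":
        raise SyntaxError("trailing input")
    return tree

def check(tree, declared):
    """Phase 3, semantic analysis: every identifier must be declared."""
    if tree[0] == "id" and tree[1] not in declared:
        raise NameError(f"undeclared identifier {tree[1]!r}")
    if tree[0] == "bin":
        check(tree[2], declared)
        check(tree[3], declared)

def fold(tree):
    """Phase 4, optimization: perform constant arithmetic ahead of time."""
    if tree[0] != "bin":
        return tree
    _, op, left, right = tree
    left, right = fold(left), fold(right)
    if left[0] == "num" and right[0] == "num":
        a, b = left[1], right[1]
        return ("num", {"+": a + b, "-": a - b, "*": a * b}[op])
    return ("bin", op, left, right)

def generate(tree):
    """Phase 5, code generation: tree -> made-up stack-machine code."""
    if tree[0] == "num":
        return [f"PUSH {tree[1]}"]
    if tree[0] == "id":
        return [f"LOAD {tree[1]}"]
    _, op, left, right = tree
    return generate(left) + generate(right) + [{"+": "ADD", "-": "SUB", "*": "MUL"}[op]]

def compile_expr(text, declared):
    tree = parse(scan(text))      # front end
    check(tree, declared)
    return generate(fold(tree))   # back end
```

For example, compile_expr("x + 2 * (3 + 4)", {"x"}) folds the constant subexpression to 14 before any code is generated, so only three instructions are emitted.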
The structure of a compiler can be divided into:
Front end: Phases 1-3 (lexical, syntactic, and semantic analysis)
Back end: Phases 4-5 (optimization and code generation)
A cross-compiler is a compiler that produces target code for an architecture different from the one the compiler itself runs on.
Example: GCC, which can be built to target an architecture other than its host
A single pass compiler processes an input program in a single traversal.
A multipass compiler typically builds an intermediate representation (syntax tree or atoms) and makes several traversals / passes to produce the final target code.
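As a rough illustration of the single pass idea, the sketch below emits code while it parses and never builds a tree. It assumes a deliberately tiny input (a list of integer tokens separated by "+" and "*", no parentheses or error handling) and a made-up PUSH/ADD/MUL stack machine; a multipass design would instead build an intermediate representation first and traverse it again to produce the code.

```python
def single_pass(tokens):
    """Emit stack-machine code during one left-to-right traversal of tokens."""
    code = []
    pos = 0

    def operand():
        nonlocal pos
        code.append(f"PUSH {tokens[pos]}")   # emitted immediately, no tree node
        pos += 1

    def term():
        nonlocal pos
        operand()
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            operand()
            code.append("MUL")

    term()
    while pos < len(tokens) and tokens[pos] == "+":
        pos += 1
        term()
        code.append("ADD")
    return code
```

Calling single_pass("2 + 3 * 4".split()) emits the multiplication before the addition, which shows that operator precedence can be respected even without an intermediate tree.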