CpS 450 Language Translation Systems

Phase 4: Code Generation

Your Submission Repository

Requirements

Enhance your compiler to translate MiniDream programs to 32-bit or 64-bit x86 assembly code. The code that you generate must assemble and run on Ubuntu 22.04.

Your compiler must continue to perform semantic checks. After all semantic checks are complete, if no errors are detected, generate target code for the program.

Set Up Environment

I recommend using the Windows Subsystem for Linux for your development and testing.

  • Install JDK 21 in Ubuntu (tip: Execute the commands given here as root)
  • Install 32-bit libraries:
    sudo dpkg --add-architecture i386 
    sudo apt-get update
    sudo apt-get install libc6:i386 libstdc++6:i386 gcc-multilib
    

Usage Specifications

Your program will be invoked from the command line as follows:

build/install/dream/bin/dream [-ds] [-dp] [-S] <Dream_source_filename> 

where <Dream_source_filename> is the name of a MiniDream source file.

When run with the -S option, your program should produce a file containing assembly code. The file should have the same name as the Dream_source_filename, but with a .s extension. The output file must assemble with gcc and link with stdlib.o to produce a working executable, as follows:

gcc -m32 myprog.s stdlib.o -o myprog

(Note the -m32 option which specifies 32-bit code.)

When run without the -S option, your program should generate the assembly file, then invoke gcc to produce an executable with the same name as Dream_source_filename, minus its extension (myprog in the example above).

Your compiler should expect to find stdlib.o located in the current working directory.
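The naming rules above can be sketched in Java as follows. This is only an illustration: the OutputNames class, its method names, and the .dream source extension are assumptions, and placing the outputs next to the source file is one reasonable reading of the specification, not a requirement stated here.

```java
import java.nio.file.Path;

// Hypothetical helper (not part of the assignment): derives output paths
// from the MiniDream source file name.
final class OutputNames {
    /** Removes the final extension, e.g. "myprog.dream" becomes "myprog". */
    static String stripExtension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        // dot > 0 leaves names without an extension (or starting with '.') unchanged
        return dot > 0 ? fileName.substring(0, dot) : fileName;
    }

    /** Assembly output: same base name as the source, with a .s extension. */
    static Path assemblyFile(Path source) {
        String base = stripExtension(source.getFileName().toString());
        return source.resolveSibling(base + ".s");
    }

    /** Executable output: same base name as the source, with no extension. */
    static Path executableFile(Path source) {
        String base = stripExtension(source.getFileName().toString());
        return source.resolveSibling(base);
    }
}
```

With these helpers, a source file given as tests/myprog.dream would yield tests/myprog.s and tests/myprog.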

Test Files

Test files are provided in the class files in /tests/phase4.

Additional Requirements

Assembler Output

The assembler output your compiler produces should be clearly commented, so that it’s easy to tell what assembler instructions are generated from each source line. This feature will be absolutely essential when you attempt to debug your compiler’s output. You should emit a comment that prints the source line number and Dream text of the statement being generated before emitting the assembler instructions for that statement.
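One possible way to build such a comment is sketched below. The SourceComments class, the method name, and the exact comment layout are assumptions; the only fixed point is that the GNU assembler treats # as the start of a line comment on x86. The statement text in the usage note is made up for illustration.

```java
// Hypothetical helper (not part of the assignment): formats the comment
// emitted before the instructions for one source statement.
final class SourceComments {
    static String forStatement(int lineNumber, String sourceText) {
        // Collapse all whitespace runs, including newlines, so that a
        // statement spanning several source lines stays inside one comment.
        String oneLine = sourceText.strip().replaceAll("\\s+", " ");
        return "# Line " + lineNumber + ": " + oneLine;
    }
}
```

For example, forStatement(12, "x := 5 + y") produces the text "# Line 12: x := 5 + y", which can be emitted ahead of the instructions for that statement.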

I/O Functions

The code for writeint is provided for you in /examples/codegen/stdlib_phase4/stdlib.c. You must implement the code for readint using Linux system calls only (no C library functions may be used). I strongly suggest that you develop and test your readint code using a C test driver program. Make sure that your readint works correctly when the program is run with I/O redirection (since that is how I will test your programs). Tip: Use the read() system call, in a loop, with a size of 1, until you encounter a newline.

Note: If you create any other functions for use by your compiler’s generated code, be sure to include them in stdlib.c (don’t put them in a separate file).

Invoking gcc from your compiler

Use the ProcessBuilder class to launch gcc from your compiler. Use the Process.waitFor() method to wait for gcc to finish and get its exit code, so you can determine whether gcc successfully assembled the program or not.
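A minimal sketch of this approach is shown below. The GccRunner class and its method names are assumptions; the gcc arguments mirror the command line given in the usage specification, and stdlib.o is resolved relative to the current working directory because the child process inherits it by default.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical helper (not part of the assignment): launches gcc on the
// generated assembly file.
final class GccRunner {
    /** Builds the argument list: gcc -m32 <asmFile> stdlib.o -o <exeFile>. */
    static List<String> command(String asmFile, String exeFile) {
        return List.of("gcc", "-m32", asmFile, "stdlib.o", "-o", exeFile);
    }

    /** Runs gcc, waits for it to finish, and returns its exit code. */
    static int run(String asmFile, String exeFile)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command(asmFile, exeFile));
        pb.inheritIO(); // pass gcc's diagnostics through to the user's terminal
        Process gcc = pb.start();
        return gcc.waitFor();
    }
}
```

By convention an exit code of 0 means gcc succeeded; any other value can be reported as an assembly or link failure.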

Extra Credit (+10%)

Add support for a -g command line option that enables source-level debugging. See these notes for details. Note that, when the -g option is omitted, the debugging directives should be omitted from the assembly output. That will help you disable this feature in Phase 5, if needed.

Getting Started

  1. Set up your submission repository and replace the dream folder in it with your dream folder from phase 3. Copy examples/codegen/stdlib_phase4/stdlib.c into your dream folder.

  2. Define a class named TargetInstruction as follows:

    class TargetInstruction {
       String label;
       String instruction;
       String operand1, operand2;
       String comment;
       String directive;
    }
    

    Instances of this class will represent target instructions.

  3. Create a tree traversal class named, for example, CodeGen to generate the code. Update your main() method to use this class to perform a traversal.

    Define a list of TargetInstruction as an instance variable in your CodeGen class. This list will hold the target instructions emitted during the traversal of the tree.

  4. Decide how you want to output target instructions. I suggest defining a method emit() that builds a TargetInstruction from its parameters and adds the TargetInstruction to the list of target instructions.

    You may want to define overloads of this method for convenience.

  5. Write a method that loops over the list of target instructions and outputs each to the target assembly file.

  6. Begin your code generation with variable declarations and assignment statements. See the lecture notes for guidance.
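Steps 2, 4, and 5 above might fit together roughly as follows. This is a sketch under assumptions, not a required design: the EmitSketch class, the emitLabel, emitDirective, and emitComment helpers, and the line layout chosen in format() are all hypothetical, and the tree traversal itself is omitted. The TargetInstruction fields are the ones listed in step 2.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Fields as given in step 2; fields that do not apply are left null.
class TargetInstruction {
    String label;
    String instruction;
    String operand1, operand2;
    String comment;
    String directive;
}

// Hypothetical sketch of the emit and output parts of a code generator.
class EmitSketch {
    private final List<TargetInstruction> instructions = new ArrayList<>();

    // Step 4: emit() and its convenience overloads append to the list.
    void emit(String instruction, String operand1, String operand2) {
        TargetInstruction t = new TargetInstruction();
        t.instruction = instruction;
        t.operand1 = operand1;
        t.operand2 = operand2;
        instructions.add(t);
    }
    void emit(String instruction, String operand1) { emit(instruction, operand1, null); }
    void emit(String instruction) { emit(instruction, null, null); }

    void emitLabel(String label) {
        TargetInstruction t = new TargetInstruction();
        t.label = label;
        instructions.add(t);
    }
    void emitDirective(String directive) {
        TargetInstruction t = new TargetInstruction();
        t.directive = directive;
        instructions.add(t);
    }
    void emitComment(String comment) {
        TargetInstruction t = new TargetInstruction();
        t.comment = comment;
        instructions.add(t);
    }

    // Formats one entry as a single line of assembly text.
    static String format(TargetInstruction t) {
        if (t.directive != null) return "\t" + t.directive;
        StringBuilder sb = new StringBuilder();
        if (t.label != null) sb.append(t.label).append(':');
        if (t.instruction != null) {
            sb.append('\t').append(t.instruction);
            if (t.operand1 != null) sb.append(' ').append(t.operand1);
            if (t.operand2 != null) sb.append(", ").append(t.operand2);
        }
        if (t.comment != null) {
            if (sb.length() > 0) sb.append('\t');
            sb.append("# ").append(t.comment);
        }
        return sb.toString();
    }

    // Step 5: loop over the list and produce the assembly text.
    String render() {
        StringBuilder out = new StringBuilder();
        for (TargetInstruction t : instructions) out.append(format(t)).append('\n');
        return out.toString();
    }

    void writeTo(Path asmFile) throws IOException {
        Files.writeString(asmFile, render());
    }
}
```

Keeping the instructions in a list rather than writing them immediately leaves room to inspect or rearrange them before output; whether that flexibility is needed depends on your design.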

Debugging Tips

You can step through your generated assembly code using gdb or VSCode. See Debugging Assembly for details.

Submission

  1. Create README.md in the root of the dream folder. In it, state the number of hours you spent on this phase and list any known bugs. Also include an academic integrity statement indicating what help you received, if any.

  2. Ensure that your code is pushed to the submission system.