A sophisticated compiler implementation for a custom programming language that performs complete lexical analysis, syntax parsing, semantic analysis, and intermediate code generation via quadruples.
- Project Overview
- Features
- System Architecture
- Language Features
- Error Handling
- Tokens
- Quadruples
- Building and Running
- Examples
QuadStack-Compiler is a complete compiler implementation that transforms a high-level programming language into an intermediate representation using quadruples. The compilation process follows the standard multi-phase design pattern:
- Lexical Analysis: Tokenizes the source code using Flex
- Syntax Parsing: Validates grammatical structure using Bison/YACC
- Semantic Analysis: Ensures meaningful code with static type checking
- Symbol Table Management: Tracks variables, functions, scopes, and types
- Intermediate Code Generation: Produces quadruples representing the program
The compiler's design emphasizes robust error detection and reporting, including syntax errors, type mismatches, scope violations, and semantic inconsistencies.
The lexical analyzer (lexer.l) is implemented using Flex. It:
- Converts source code into a stream of tokens
- Identifies keywords, identifiers, literals, operators, and punctuation
- Tracks line numbers for error reporting
- Handles comments (single and multi-line)
- Passes token values to the parser
The parser (parser.y) is implemented using Bison/YACC. It:
- Defines the formal grammar of the language
- Constructs a parse tree according to syntax rules
- Initiates semantic analysis checks
- Calls the quadruple generator during parsing
- Handles syntax error recovery
Semantic analysis is integrated into the parser. It performs:
- Type checking for expressions and assignments
- Variable declaration and usage validation
- Constant reassignment prevention
- Function argument validation
- Control flow validation (e.g., break outside loops)
- Array access validation
The symbol table module (symbol_table.c/h) provides a robust system for tracking program entities:
- Structure: Linked list with scope-based lookup
- Entries: Store name, type, scope, initialization status, usage info
- Scopes: Support nested blocks with proper scope entry/exit
- Operations:
- Variable declaration tracking
- Function and parameter declaration
- Type retrieval and validation
- Constant tracking
- Array information
- Usage and initialization tracking
The quadruples generator (quadruples.c/h) produces a stack-based intermediate representation:
- Model: Stack machine with push/pop operations
- Structure: Operation code + up to three operands
- Generation: Emits quadruples during parsing
- Labels: Manages control flow with automatically generated labels
- Output: Produces human-readable quadruple file
-
Basic Types:
int: Integer valuesfloat: Floating-point valueschar: Single character valuesstring: Text stringsbool: Boolean values (true/false)void: Used for functions without return values
-
Type Checking:
- Static type checking at compile time
- Implicit conversions where appropriate (int → float)
- Warning messages for potentially unsafe conversions
- Error messages for incompatible types
-
Conditional:
if-elsestatements with proper nestingswitch-casestatements with default case handling
-
Loops:
whileloops with condition testing at startrepeat-untilloops with condition testing at endforloops with initialization, condition, and update expressionsbreakandcontinuestatements with context validation
- Arithmetic: Addition, subtraction, multiplication, division, modulo, power
- Comparison: Equal, not equal, greater than, less than, greater/less or equal
- Logical: AND, OR, NOT
- Bitwise: AND, OR, left shift, right shift
- Assignment: Simple assignment, compound assignments
- Increment/Decrement: Pre/post-increment, pre/post-decrement
- Declaration: Function name, return type, parameter list
- Parameters: Type-checked parameters with proper scope
- Return Values: Type-checked return statements
- Calls: Argument validation against parameter types and counts
- Quadruples Generation: Function calls are translated to appropriate CALL/RET operations
- Declaration: Fixed-size arrays with type and dimensions
- Access: Index-based access with bounds checking
- Assignment: Element assignment with type checking
- Declaration: Using
constmodifier with any data type - Protection: Prevents reassignment after initialization
The compiler performs extensive error detection and reporting:
- Syntax Errors: Reports unexpected tokens with line numbers
- Type Errors: Validates type compatibility in expressions and assignments
- Declaration Errors: Prevents variable redeclaration in the same scope
- Usage Errors: Detects use of undeclared variables
- Initialization Warnings: Warns about using variables before initialization
- Constant Modification: Prevents reassignment to constants
- Function Errors: Validates return types, parameter counts, and argument types
- Control Flow Errors: Ensures break/continue are used in proper contexts
- Array Errors: Validates array indices and dimensions
| Token Category | Examples | Description |
|---|---|---|
| Keywords | if, else, while, for, int, float |
Language keywords for control flow, types, etc. |
| Identifiers | Variable and function names | User-defined names for program entities |
| Literals | 123, 45.67, "hello", true |
Number, string, character, boolean constants |
| Operators | +, -, *, /, =, ==, !=, &&, || |
Arithmetic, assignment, comparison, logical operators |
| Bitwise Operators | &, |, <<, >> |
Operators for bit manipulation |
| Punctuation | ;, ,, (, ), {, }, [, ] |
Delimiters and structural tokens |
| Special Tokens | ++, --, -> |
Increment, decrement, and other special operators |
Quadruples are four-part instructions that represent operations in a stack-based execution model:
| Quadruple | Description | Example |
|---|---|---|
PUSH x |
Push the value of x onto the stack | PUSH 5 |
POP x |
Pop a value from the stack and store it in x | POP result |
ADD |
Pop two values, add them, push the result | ADD |
SUB |
Pop two values, subtract second from first, push result | SUB |
MUL |
Pop two values, multiply them, push result | MUL |
DIV |
Pop two values, divide first by second, push result | DIV |
MOD |
Pop two values, compute modulo, push result | MOD |
NEG |
Pop a value, negate it, push result | NEG |
EQ |
Compare for equality, push boolean result | EQ |
NE |
Compare for inequality, push boolean result | NE |
LT |
Compare less than, push boolean result | LT |
GT |
Compare greater than, push boolean result | GT |
LE |
Compare less than or equal, push boolean result | LE |
GE |
Compare greater than or equal, push boolean result | GE |
LOGICAL_AND |
Logical AND of two values | LOGICAL_AND |
LOGICAL_OR |
Logical OR of two values | LOGICAL_OR |
BITWISE_AND |
Bitwise AND of two values | BITWISE_AND |
BITWISE_OR |
Bitwise OR of two values | BITWISE_OR |
LSHIFT |
Left shift operation | LSHIFT |
RSHIFT |
Right shift operation | RSHIFT |
NOT |
Logical NOT of a value | NOT |
JMP Label |
Unconditional jump to label | JMP EndLoop_1 |
JF Label |
Jump to label if top of stack is false | JF FalseLabel_2 |
Label: |
Define a label for jumps | StartWhile_1: |
CALL func |
Call a function | CALL sum |
RET |
Return from a function | RET |
END func |
End of function declaration | END sum |
To build the compiler on Windows, you can use the provided batch file:
run.batGraphical User Interface
The project includes a graphical user interface built with Python and Tkinter for a more user-friendly compilation experience.
Features
Source code editor with syntax highlighting
File open/save functionality
One-click compilation
Tabbed output display showing compiler messages and generated quadruples
Status bar for process feedback