Minishell is a small UNIX shell that reproduces the core behavior of bash. It takes user input, tokenizes and parses it, builds an Abstract Syntax Tree (AST), and executes the resulting command structure while handling pipes, redirections, command lists, logical operators, blocks, heredocs, quoting, and environments.
- Install the Readline library Minishell depends on the GNU readline library for interactive input. You must install Readline on your system before building the project
brew install readline
sudo apt-get install libreadline-dev
- Update the Makefile (
H_PATHandLIB_PATH)
H_PATH := -I ./include -I <readline-install-path>/include
LIB_PATH := -L ./lib/libft -L <readline-install-path>/lib
Replace <readline-install-path> with the actual directory on your machine.
- Build the project
make all
./minishell [command-line]
- Manual tokenizer with operator-aware lookahead
- Recursive-descent parser respecting shell precedence
- AST-based command representation
- Pipeline and redirection engine
- Logical operators (&&, ||) and command lists
- Built-in command suite
- Environment/variable management
- Signal integration
- Heredoc handling (with Command Substitution)
- Wildcards
- History handling
-
AST-based command representation
This project uses an Abstract Syntax Tree (AST) to represent the command line. More specifically, it follows a type-tagged, polymorphic AST design.
At the core is a minimal base struct (t_cmd) that contains only anint typefield. Every concrete command —pipeline, list operator, redirection, exec command, logical AND/OR, heredoc, and so on - has its own specialized struct that embeds the base struct's header field (int type) and then carries whatever additional data.Because all nodes share this commonint typeheader, the parser and executor can treat any specialized node as a generict_cmdand select the correct behavior based solely on the type.
During parsing, the input command line is broken into logical components and assembled into an AST. Binary operators such as pipelines, command lists, and logical AND/OR become nodes containing two children (leftandright). Commands like redirections or heredocs wrap a single subcommand. Leaf nodes are exec-command nodes, which hold the final command arguments and all information needed to run the program.
Execution is performed by recursively traversing the AST and handling each node according to its type, allowing the shell to evaluate complex command lines in a structured and predictable way. -
Lookahead-based tokenizer
The shell uses a hand-written, lookahead-based tokenizer to scan the input line and produce tokens for the parser. The tokenizer walks the input character by character, skips irrelevant whitespace, and classifies each token according to shell syntax rules. It handles multi-character operators (such as&&,||,<<,>>), quotes, parentheses, redirections, and ordinary words. Each call identifies a token’s type, returns the token’s start and end positions, and advances the input pointer so the parser can continue from the correct location.

-
Operator Precedence and Parsing Strategy
The shell follows a well-defined operator precedence, which determines how complex command lines are grouped. From strongest binding to weakest:
Parentheses
Group commands and force them to be parsed as a unit.
Redirections (<,>,>>,<<)
Attach tightly to the command they modify.
Simple commands and arguments
The basic building blocks of execution.
Pipelines (|)
Connect the output of one command to the input of another.
Logical operators (&&,||)
Control execution flow based on command success or failure.
Command lists (;)
The weakest binding, used to sequence independent commands.
To implement these rules, the project uses a recursive-descent parser with top-down precedence handling. Each parsing function corresponds to one precedence level. Higher-precedence constructs call lower-precedence ones first, ensuring that tightly bound operations (like redirections or pipelines) are parsed before more general ones (like lists or logical operators). Parenthesized expressions re-enter the parser at the highest level, allowing a complete command line to be nested inside a block. This structure produces a clean and predictable Abstract Syntax Tree (AST) where each node reflects the operator precedence defined by the shell grammar.
