← Back to Tutorials
Python

Build a Programming Language Interpreter

Difficulty: Advanced Est. Time: ~6 hours

Introduction

Interpreters are fundamental to computing—they execute code written in programming languages. In this tutorial, we'll build a complete interpreter for a simple but powerful programming language from scratch.

Building an interpreter gives you deep insights into how programming languages work, improves your debugging skills, and prepares you for advanced topics like compiler design and language implementation.

What You'll Build
  • A lexer/tokenizer
  • A recursive descent parser
  • An Abstract Syntax Tree (AST)
  • An interpreter/evaluator
  • Built-in functions and operators
  • A REPL (Read-Eval-Print Loop)
What You'll Learn
  • How interpreters work internally
  • Lexical analysis and tokenization
  • Parsing algorithms (recursive descent)
  • AST design and traversal
  • Expression evaluation
  • Scope and environment management

Core Concepts

Lexical Analysis (Lexing)

The first phase of interpretation is lexical analysis, also called tokenization. The lexer reads the source code as a stream of characters and groups them into meaningful tokens. Each token has a type (like NUMBER, STRING, IDENTIFIER, PLUS) and a value.

For example, the source code let x = 5 + 3; would be tokenized into:

  • LET - "let"
  • IDENTIFIER - "x"
  • ASSIGN - "="
  • INTEGER - "5"
  • PLUS - "+"
  • INTEGER - "3"
  • SEMICOLON - ";"

Syntactic Analysis (Parsing)

The parser takes the stream of tokens and checks if they form a valid program according to the language's grammar. If valid, it produces an Abstract Syntax Tree (AST)—a tree representation of the program's structure.

The parser uses recursive descent parsing, where each grammar rule becomes a function that either returns a parsed AST node or calls other parsing functions.

Abstract Syntax Tree (AST)

The AST is a hierarchical tree structure where each node represents a construct in the source code. For example, a binary expression like 5 + 3 would be represented as:

BinaryExpression {
    operator: "+",
    left: NumericLiteral { value: 5 },
    right: NumericLiteral { value: 3 }
}

Interpretation/Evaluation

The interpreter traverses the AST and evaluates each node. For expressions, this means computing their value. For statements, it means executing their side effects (like variable assignment).

The evaluator maintains an environment—a mapping from variable names to their values—to support variable scope and lookups.

Project Overview

Our interpreter, which we'll call Monkey, will support:

Language Features

Feature Syntax Example
Variable bindings let name = value; let x = 5;
Integers Numeric literals 42, 100, -5
Strings Double-quoted text "hello world"
Arithmetic +, -, *, / 5 + 3 * 2
Comparisons ==, !=, <, > 5 < 10
Boolean true, false true != false
Conditionals if-else if (x > 0) { ... }
Functions fn(params) { body } fn(x) { x + 1 }
Function calls name(args) add(1, 2)
Comments // comment // this is a comment

Prerequisites

  • Python 3.8+ - Installed on your system
  • Basic Python knowledge - Functions, classes, lists, dictionaries
  • Understanding of recursion - Essential for the parser

The Lexer

We'll start by building the lexer. Create a file called lexer.py:

class Token:
    def __init__(self, token_type, literal):
        self.type = token_type
        self.literal = literal

    def __repr__(self):
        return f"Token({self.type}, {self.literal})"


class Lexer:
    def __init__(self, source):
        self.source = source
        self.position = 0
        self.read_position = 0
        self.ch = ''
        self._read_char()

    def _read_char(self):
        if self.read_position >= len(self.source):
            self.ch = '\0'
        else:
            self.ch = self.source[self.read_position]
        self.position = self.read_position
        self.read_position += 1

    def _skip_whitespace(self):
        while self.ch in [' ', '\t', '\n', '\r']:
            self._read_char()

    def _read_identifier(self):
        position = self.position
        while self._is_letter(self.ch):
            self._read_char()
        return self.source[position:self.position]

    def _read_number(self):
        position = self.position
        while self.ch.isdigit():
            self._read_char()
        return self.source[position:self.position]

    def _read_string(self):
        self._read_char()
        position = self.position
        while self.ch != '"' and self.ch != '\0':
            self._read_char()
        literal = self.source[position:self.position]
        self._read_char()
        return literal

    def _is_letter(self, ch):
        return ch.isalpha() or ch == '_'

    def _peek_char(self):
        if self.read_position >= len(self.source):
            return '\0'
        return self.source[self.read_position]

    def next_token(self):
        self._skip_whitespace()
        
        if self.ch == '=':
            if self._peek_char() == '=':
                ch = self.ch
                self._read_char()
                token = Token('EQ', ch + self.ch)
            else:
                token = Token('ASSIGN', self.ch)
        elif self.ch == '+':
            token = Token('PLUS', self.ch)
        elif self.ch == '-':
            token = Token('MINUS', self.ch)
        elif self.ch == '*':
            token = Token('ASTERISK', self.ch)
        elif self.ch == '/':
            token = Token('SLASH', self.ch)
        elif self.ch == '<':
            if self._peek_char() == '=':
                ch = self.ch
                self._read_char()
                token = Token('LE', ch + self.ch)
            else:
                token = Token('LT', self.ch)
        elif self.ch == '>':
            if self._peek_char() == '=':
                ch = self.ch
                self._read_char()
                token = Token('GE', ch + self.ch)
            else:
                token = Token('GT', self.ch)
        elif self.ch == '!':
            if self._peek_char() == '=':
                ch = self.ch
                self._read_char()
                token = Token('NE', ch + self.ch)
            else:
                token = Token('BANG', self.ch)
        elif self.ch == ',':
            token = Token('COMMA', self.ch)
        elif self.ch == ';':
            token = Token('SEMICOLON', self.ch)
        elif self.ch == '(':
            token = Token('LPAREN', self.ch)
        elif self.ch == ')':
            token = Token('RPAREN', self.ch)
        elif self.ch == '{':
            token = Token('LBRACE', self.ch)
        elif self.ch == '}':
            token = Token('RBRACE', self.ch)
        elif self.ch == '"':
            token = Token('STRING', self._read_string())
        elif self.ch == '\0':
            token = Token('EOF', '')
        else:
            if self.ch.isalpha():
                literal = self._read_identifier()
                if literal in KEYWORDS:
                    return Token(KEYWORDS[literal], literal)
                return Token('IDENT', literal)
            elif self.ch.isdigit():
                return Token('INT', self._read_number())
            else:
                token = Token('ILLEGAL', self.ch)
        
        self._read_char()
        return token


KEYWORDS = {
    'fn': 'FUNCTION',
    'let': 'LET',
    'if': 'IF',
    'else': 'ELSE',
    'return': 'RETURN',
    'true': 'TRUE',
    'false': 'FALSE',
}

How the Lexer Works

The lexer maintains two pointers: position (current character) and read_position (next character). This lookahead allows us to check if the next character forms a two-character token like == or !=.

The next_token() method skips whitespace and returns the next token. It handles identifiers, keywords, numbers, strings, and operators.

The Parser

Now we'll build the parser. Create parser.py:

from ast import *
from lexer import Token, Lexer

class Parser:
    def __init__(self, lexer):
        self.lexer = lexer
        self.errors = []
        self._register_actions()
        self._advance()

    def _register_actions(self):
        self.prefix_parse_fns = {
            'IDENT': self._parse_identifier,
            'INT': self._parse_integer,
            'STRING': self._parse_string,
            'TRUE': self._parse_boolean,
            'FALSE': self._parse_boolean,
            'BANG': self._parse_prefix_expression,
            'MINUS': self._parse_prefix_expression,
            'LPAREN': self._parse_grouped_expression,
            'IF': self._parse_if_expression,
            'FUNCTION': self._parse_function_literal,
        }
        
        self.infix_parse_fns = {
            'PLUS': self._parse_infix_expression,
            'MINUS': self._parse_infix_expression,
            'ASTERISK': self._parse_infix_expression,
            'SLASH': self._parse_infix_expression,
            'EQ': self._parse_infix_expression,
            'NE': self._parse_infix_expression,
            'LT': self._parse_infix_expression,
            'GT': self._parse_infix_expression,
            'GE': self._parse_infix_expression,
            'LE': self._parse_infix_expression,
            'LPAREN': self._parse_call_expression,
        }

    def _advance(self):
        self.cur_token = self.peek_token
        self.peek_token = self.lexer.next_token()

    def _expect_peek(self, token_type):
        if self.peek_token.type == token_type:
            self._advance()
            return True
        self.errors.append(f"expected next token to be {token_type}, got {self.peek_token.type}")
        return False

    def parse_program(self):
        statements = []
        while self.cur_token.type != 'EOF':
            stmt = self._parse_statement()
            if stmt:
                statements.append(stmt)
            self._advance()
        return Program(statements)

    def _parse_statement(self):
        if self.cur_token.type == 'LET':
            return self._parse_let_statement()
        elif self.cur_token.type == 'RETURN':
            return self._parse_return_statement()
        else:
            return self._parse_expression_statement()

    def _parse_let_statement(self):
        token = self.cur_token
        if not self._expect_peek('IDENT'):
            return None
        name = Identifier(self.cur_token, self.cur_token.literal)
        if not self._expect_peek('ASSIGN'):
            return None
        self._advance()
        value = self._parse_expression(Precedence.LOW)
        if self.peek_token.type == 'SEMICOLON':
            self._advance()
        return LetStatement(token, name, value)

    def _parse_return_statement(self):
        token = self.cur_token
        self._advance()
        value = self._parse_expression(Precedence.LOW)
        if self.peek_token.type == 'SEMICOLON':
            self._advance()
        return ReturnStatement(token, value)

    def _parse_expression_statement(self):
        token = self.cur_token
        expr = self._parse_expression(Precedence.LOW)
        if self.peek_token.type == 'SEMICOLON':
            self._advance()
        return ExpressionStatement(token, expr)

    def _parse_expression(self, precedence):
        prefix = self.prefix_parse_fns.get(self.cur_token.type)
        if not prefix:
            self.errors.append(f"no prefix parse function for {self.cur_token.type}")
            return None
        
        left_exp = prefix()
        
        while self.peek_token.type != 'SEMICOLON' and precedence < self._peek_precedence():
            infix = self.infix_parse_fns.get(self.peek_token.type)
            if not infix:
                return left_exp
            self._advance()
            left_exp = infix(left_exp)
        
        return left_exp

    def _parse_identifier(self):
        return Identifier(self.cur_token, self.cur_token.literal)

    def _parse_integer(self):
        try:
            value = int(self.cur_token.literal)
            return IntegerLiteral(self.cur_token, value)
        except ValueError:
            self.errors.append(f"could not parse {self.cur_token.literal} as integer")
            return None

    def _parse_string(self):
        return StringLiteral(self.cur_token, self.cur_token.literal)

    def _parse_boolean(self):
        return BooleanLiteral(self.cur_token, self.cur_token.type == 'TRUE')

    def _parse_prefix_expression(self):
        token = self.cur_token
        operator = self.cur_token.literal
        self._advance()
        right = self._parse_expression(Precedence.PREFIX)
        return PrefixExpression(token, operator, right)

    def _parse_infix_expression(self, left):
        token = self.cur_token
        operator = self.cur_token.literal
        precedence = self._cur_precedence()
        self._advance()
        right = self._parse_expression(precedence)
        return InfixExpression(token, left, operator, right)

    def _parse_grouped_expression(self):
        self._advance()
        exp = self._parse_expression(Precedence.LOW)
        if not self._expect_peek('RPAREN'):
            return None
        return exp

    def _parse_if_expression(self):
        token = self.cur_token
        if not self._expect_peek('LPAREN'):
            return None
        self._advance()
        condition = self._parse_expression(Precedence.LOW)
        if not self._expect_peek('RPAREN'):
            return None
        if not self._expect_peek('LBRACE'):
            return None
        consequence = self._parse_block_statement()
        
        alternate = None
        if self.peek_token.type == 'ELSE':
            self._advance()
            if not self._expect_peek('LBRACE'):
                return None
            alternate = self._parse_block_statement()
        
        return IfExpression(token, condition, consequence, alternate)

    def _parse_block_statement(self):
        token = self.cur_token
        statements = []
        self._advance()
        
        while self.cur_token.type != 'RBRACE' and self.cur_token.type != 'EOF':
            stmt = self._parse_statement()
            if stmt:
                statements.append(stmt)
            self._advance()
        
        return BlockStatement(token, statements)

    def _parse_function_literal(self):
        token = self.cur_token
        if not self._expect_peek('LPAREN'):
            return None
        
        parameters = self._parse_function_parameters()
        
        if not self._expect_peek('LBRACE'):
            return None
        
        body = self._parse_block_statement()
        return FunctionLiteral(token, parameters, body)

    def _parse_function_parameters(self):
        args = []
        self._advance()
        
        if self.cur_token.type == 'RPAREN':
            return args
        
        arg = Identifier(self.cur_token, self.cur_token.literal)
        args.append(arg)
        
        while self.peek_token.type == 'COMMA':
            self._advance()
            self._advance()
            arg = Identifier(self.cur_token, self.cur_token.literal)
            args.append(arg)
        
        if not self._expect_peek('RPAREN'):
            return None
        
        return args

    def _parse_call_expression(self, function):
        token = self.cur_token
        arguments = self._parse_call_arguments()
        return CallExpression(token, function, arguments)

    def _parse_call_arguments(self):
        args = []
        self._advance()
        
        if self.cur_token.type == 'RPAREN':
            return args
        
        arg = self._parse_expression(Precedence.LOW)
        args.append(arg)
        
        while self.peek_token.type == 'COMMA':
            self._advance()
            self._advance()
            arg = self._parse_expression(Precedence.LOW)
            args.append(arg)
        
        if not self._expect_peek('RPAREN'):
            return None
        
        return args

    def _peek_precedence(self):
        return Precedence.PRECEDENCES.get(self.peek_token.type, Precedence.LOW)

    def _cur_precedence(self):
        return Precedence.PRECEDENCES.get(self.cur_token.type, Precedence.LOW)


class Precedence:
    LOW = 0
    EQUALS = 1
    LESSGREATER = 2
    SUM = 3
    PRODUCT = 4
    PREFIX = 5
    CALL = 6
    
    PRECEDENCES = {
        'EQ': EQUALS,
        'NE': EQUALS,
        'LT': LESSGREATER,
        'GT': LESSGREATER,
        'LE': LESSGREATER,
        'GE': LESSGREATER,
        'PLUS': SUM,
        'MINUS': SUM,
        'ASTERISK': PRODUCT,
        'SLASH': PRODUCT,
        'LPAREN': CALL,
    }

Parser Explanation

The parser uses Pratt parsing (also called top-down operator precedence). It registers prefix and infix parsing functions for each token type. Prefix functions handle expressions that start with a specific token (like -5 for negation), while infix functions handle expressions that appear after another expression (like 5 + 3).

Abstract Syntax Tree

Create ast.py to define all AST node classes:

class Node:
    def token_literal(self):
        return ""

    def __repr__(self):
        return self.__class__.__name__


class Program(Node):
    def __init__(self, statements):
        self.statements = statements

    def token_literal(self):
        if self.statements:
            return self.statements[0].token_literal()
        return ""


class Statement(Node):
    pass


class Expression(Node):
    pass


class LetStatement(Statement):
    def __init__(self, token, name, value):
        self.token = token
        self.name = name
        self.value = value

    def token_literal(self):
        return self.token.literal


class ReturnStatement(Statement):
    def __init__(self, token, return_value):
        self.token = token
        self.value = return_value

    def token_literal(self):
        return self.token.literal


class ExpressionStatement(Statement):
    def __init__(self, token, expression):
        self.token = token
        self.expression = expression

    def token_literal(self):
        return self.token.literal


class BlockStatement(Statement):
    def __init__(self, token, statements):
        self.token = token
        self.statements = statements

    def token_literal(self):
        return self.token.literal


class Identifier(Expression):
    def __init__(self, token, value):
        self.token = token
        self.value = value

    def token_literal(self):
        return self.token.literal


class IntegerLiteral(Expression):
    def __init__(self, token, value):
        self.token = token
        self.value = value

    def token_literal(self):
        return self.token.literal


class StringLiteral(Expression):
    def __init__(self, token, value):
        self.token = token
        self.value = value

    def token_literal(self):
        return self.token.literal


class BooleanLiteral(Expression):
    def __init__(self, token, value):
        self.token = token
        self.value = value

    def token_literal(self):
        return self.token.literal


class PrefixExpression(Expression):
    def __init__(self, token, operator, right):
        self.token = token
        self.operator = operator
        self.right = right

    def token_literal(self):
        return self.token.literal


class InfixExpression(Expression):
    def __init__(self, token, left, operator, right):
        self.token = token
        self.left = left
        self.operator = operator
        self.right = right

    def token_literal(self):
        return self.token.literal


class IfExpression(Expression):
    def __init__(self, token, condition, consequence, alternate):
        self.token = token
        self.condition = condition
        self.consequence = consequence
        self.alternate = alternate

    def token_literal(self):
        return self.token.literal


class FunctionLiteral(Expression):
    def __init__(self, token, parameters, body):
        self.token = token
        self.parameters = parameters
        self.body = body

    def token_literal(self):
        return self.token.literal


class CallExpression(Expression):
    def __init__(self, token, function, arguments):
        self.token = token
        self.function = function
        self.arguments = arguments

    def token_literal(self):
        return self.token.literal

The Evaluator

Create evaluator.py to interpret the AST:

from ast import *
from object import *

class Evaluator:
    def __init__(self):
        self.TRUE = Boolean(True)
        self.FALSE = Boolean(False)
        self.NULL = Null()

    def eval(self, node, env):
        if isinstance(node, Program):
            return self._eval_program(node.statements, env)
        elif isinstance(node, ExpressionStatement):
            return self.eval(node.expression, env)
        elif isinstance(node, IntegerLiteral):
            return Integer(node.value)
        elif isinstance(node, StringLiteral):
            return String(node.value)
        elif isinstance(node, BooleanLiteral):
            return self._to_boolean_object(node.value)
        elif isinstance(node, Identifier):
            return self._eval_identifier(node, env)
        elif isinstance(node, PrefixExpression):
            right = self.eval(node.right, env)
            if self._is_error(right):
                return right
            return self._eval_prefix_expression(node.operator, right)
        elif isinstance(node, InfixExpression):
            left = self.eval(node.left, env)
            if self._is_error(left):
                return left
            right = self.eval(node.right, env)
            if self._is_error(right):
                return right
            return self._eval_infix_expression(node.operator, left, right)
        elif isinstance(node, IfExpression):
            return self._eval_if_expression(node, env)
        elif isinstance(node, BlockStatement):
            return self._eval_block_statement(node.statements, env)
        elif isinstance(node, LetStatement):
            val = self.eval(node.value, env)
            if self._is_error(val):
                return val
            env.set(node.name.value, val)
            return None
        elif isinstance(node, ReturnStatement):
            val = self.eval(node.value, env)
            if self._is_error(val):
                return val
            return ReturnValue(val)
        elif isinstance(node, FunctionLiteral):
            return Function(node.parameters, node.body, env)
        elif isinstance(node, CallExpression):
            func = self.eval(node.function, env)
            if self._is_error(func):
                return func
            args = self._eval_expressions(node.arguments, env)
            if len(args) == 1 and self._is_error(args[0]):
                return args[0]
            return self._apply_function(func, args)
        
        return None

    def _eval_program(self, statements, env):
        result = None
        for stmt in statements:
            result = self.eval(stmt, env)
            if isinstance(result, ReturnValue):
                return result.value
            if isinstance(result, Error):
                return result
        return result

    def _eval_block_statement(self, statements, env):
        result = None
        for stmt in statements:
            result = self.eval(stmt, env)
            if result and (isinstance(result, ReturnValue) or isinstance(result, Error)):
                return result
        return result

    def _eval_expressions(self, exprs, env):
        result = []
        for expr in exprs:
            evaluated = self.eval(expr, env)
            if self._is_error(evaluated):
                return [evaluated]
            result.append(evaluated)
        return result

    def _eval_prefix_expression(self, operator, right):
        if operator == '!':
            return self._eval_bang_operator_expression(right)
        elif operator == '-':
            return self._eval_minus_prefix_operator_expression(right)
        return Error(f"unknown operator: {operator}{right.type}")

    def _eval_infix_expression(self, operator, left, right):
        if left.type == INTEGER and right.type == INTEGER:
            return self._eval_integer_infix_expression(operator, left, right)
        elif left.type == STRING and right.type == STRING:
            return self._eval_string_infix_expression(operator, left, right)
        elif operator == '==':
            return self._to_boolean_object(left == right)
        elif operator == '!=':
            return self._to_boolean_object(left != right)
        return Error(f"type mismatch: {left.type} {operator} {right.type}")

    def _eval_integer_infix_expression(self, operator, left, right):
        if operator == '+':
            return Integer(left.value + right.value)
        elif operator == '-':
            return Integer(left.value - right.value)
        elif operator == '*':
            return Integer(left.value * right.value)
        elif operator == '/':
            return Integer(left.value // right.value)
        elif operator == '<':
            return self._to_boolean_object(left.value < right.value)
        elif operator == '>':
            return self._to_boolean_object(left.value > right.value)
        elif operator == '==':
            return self._to_boolean_object(left.value == right.value)
        elif operator == '!=':
            return self._to_boolean_object(left.value != right.value)
        return Error(f"unknown operator: INTEGER {operator}")

    def _eval_string_infix_expression(self, operator, left, right):
        if operator != '+':
            return Error(f"unknown operator: STRING {operator}")
        return String(left.value + right.value)

    def _eval_bang_operator_expression(self, right):
        if right == self.TRUE:
            return self.FALSE
        elif right == self.FALSE:
            return self.TRUE
        elif right == self.NULL:
            return self.TRUE
        return self.FALSE

    def _eval_minus_prefix_operator_expression(self, right):
        if right.type != INTEGER:
            return Error(f"unknown operator: -{right.type}")
        return Integer(-right.value)

    def _eval_if_expression(self, ie, env):
        condition = self.eval(ie.condition, env)
        if self._is_error(condition):
            return condition
        if self._is_truthy(condition):
            return self.eval(ie.consequence, env)
        elif ie.alternate:
            return self.eval(ie.alternate, env)
        return self.NULL

    def _eval_identifier(self, ident, env):
        val = env.get(ident.value)
        if val:
            return val
        if ident.value in BUILTINS:
            return BUILTINS[ident.value]
        return Error(f"identifier not found: {ident.value}")

    def _apply_function(self, fn, args):
        if isinstance(fn, Function):
            extended_env = self._extend_function_env(fn, args)
            evaluated = self.eval(fn.body, extended_env)
            return self._unwrap_return_value(evaluated)
        return Error(f"not a function: {fn.type}")

    def _extend_function_env(self, fn, args):
        env = Environment(outer=fn.env)
        for param, arg in zip(fn.parameters, args):
            env.set(param.value, arg)
        return env

    def _unwrap_return_value(self, obj):
        if isinstance(obj, ReturnValue):
            return obj.value
        return obj

    def _to_boolean_object(self, value):
        return self.TRUE if value else self.FALSE

    def _is_truthy(self, obj):
        if obj == self.NULL:
            return False
        if obj == self.TRUE:
            return True
        if obj == self.FALSE:
            return False
        return True

    def _is_error(self, obj):
        return obj and obj.type == ERROR

Built-in Functions and Environment

Create object.py to define the object system:

class Object:
    def type(self):
        return self.__class__.__name__


class Integer(Object):
    def __init__(self, value):
        self.value = value
    def __repr__(self):
        return f"{self.value}"


class String(Object):
    def __init__(self, value):
        self.value = value
    def __repr__(self):
        return f'"{self.value}"'


class Boolean(Object):
    def __init__(self, value):
        self.value = value
    def __repr__(self):
        return "true" if self.value else "false"


class Null(Object):
    def __repr__(self):
        return "null"


class ReturnValue(Object):
    def __init__(self, value):
        self.value = value


class Error(Object):
    def __init__(self, message):
        self.message = message
    def __repr__(self):
        return f"ERROR: {self.message}"


class Function(Object):
    def __init__(self, parameters, body, env):
        self.parameters = parameters
        self.body = body
        self.env = env
    def __repr__(self):
        params = ", ".join(p.value for p in self.parameters)
        return f"fn({params}) {{ ... }}"


class Environment:
    def __init__(self, outer=None):
        self.store = {}
        self.outer = outer

    def get(self, name):
        val = self.store.get(name)
        if val is None and self.outer:
            return self.outer.get(name)
        return val

    def set(self, name, value):
        self.store[name] = value
        return value


INTEGER = "INTEGER"
STRING = "STRING"
BOOLEAN = "BOOLEAN"
NULL = "NULL"
RETURN_VALUE = "RETURN_VALUE"
ERROR = "ERROR"
FUNCTION = "FUNCTION"


BUILTINS = {
    'len': Builtin(len),
    'puts': Builtin(lambda *args: print(*args)),
    'first': Builtin(lambda args: args[0] if args else Null()),
    'last': Builtin(lambda args: args[-1] if args else Null()),
    'rest': Builtin(lambda args: args[1:] if args else []),
    'push': Builtin(lambda args: args),
}


class Builtin(Object):
    def __init__(self, fn):
        self.fn = fn
    def __repr__(self):
        return "builtin function"

Testing the Interpreter

Create repl.py to provide a REPL interface:

from lexer import Lexer
from parser import Parser
from evaluator import Evaluator
from object import Environment

def start_repl():
    env = Environment()
    evaluator = Evaluator()
    
    print("Monkey Interpreter v0.1")
    print("Type 'exit' to quit")
    print("=" * 40)
    
    while True:
        try:
            source = input(">> ")
            if source.lower() in ('exit', 'quit'):
                break
            if not source.strip():
                continue
            
            lexer = Lexer(source)
            parser = Parser(lexer)
            program = parser.parse_program()
            
            if parser.errors:
                print("Parser errors:")
                for err in parser.errors:
                    print(f"  {err}")
                continue
            
            result = evaluator.eval(program, env)
            if result:
                print(result)
                
        except KeyboardInterrupt:
            print("\nExiting...")
            break
        except Exception as e:
            print(f"Error: {e}")


if __name__ == "__main__":
    start_repl()

Testing the Interpreter

Run the interpreter and try these examples:

# Arithmetic
>> 5 + 3
8
>> 10 - 4
6
>> 3 * 5
15
>> 10 / 2
5

# Variables
>> let x = 5;
>> let y = 10;
>> x + y
15

# Functions
>> let add = fn(a, b) { a + b; };
>> add(3, 4)
7

# Conditionals
>> let max = fn(a, b) { if (a > b) { a; } else { b; } };
>> max(5, 10)
10

# Recursion
>> let fib = fn(n) { if (n <= 1) { n; } else { fib(n - 1) + fib(n - 2); } };
>> fib(10)
55

Summary

Congratulations! You've built a complete interpreter for the Monkey programming language. Here's what you learned:

  • Lexical Analysis - How to tokenize source code into meaningful tokens
  • Parsing - How to use recursive descent and Pratt parsing to build an AST
  • AST Design - How to represent program structure as a tree of nodes
  • Evaluation - How to traverse the AST and compute results
  • Environment - How to manage variable scope and bindings
  • Built-ins - How to add native functions to your language

Possible Extensions

  • Add arrays and indexing (e.g., arr[0])
  • Add hash/dictionary data type
  • Add for loops and while loops
  • Add closures
  • Add error handling with try-catch
  • Add standard library functions
  • Implement a bytecode compiler