man

Compilers Are Not Magic: Let’s Create One in Swift

Compilers are complex programs that convert source code written in a programming language into machine code that can be executed by a processor. Despite their importance in software development, many people view compilers as something mystical and unattainable. However, creating a simple compiler is a very real task that can be accomplished, even using the Swift language. In this article, we will look at the basic steps to create a simple compiler.

Lexical analysis (lexer)

The first step in creating a compiler is lexical analysis, or creating a lexer. A lexer divides source code into “tokens”-minimal units of information such as keywords, operators, identifiers, and literals. In Swift, you can use regular expressions to implement a lexer.

Here is an example of a tokenizer in Swift:

enum Token {
case number(Int)
case plus
case minus
case eof
}

func lexer(input: String) -> [Token] {
var tokens: [Token] = []
let regex = try! NSRegularExpression(pattern: “\d+|\+|+|\-”)
let nsrange = NSRange(input.startIndex…, in: input)
regex.enumerateMatches(in: input, options: [], range: nsrange) { match, _, _ in
if let match = match {
let token = (input as NSString).substring(with: match.range)
if let number = Int(token) {
tokens.append(.number(number)))
} else if token == “+” {
tokens.append(.plus)
} else if token == “-” {
tokens.append(.minus)
}
}
}
tokens.append(.eof)
return tokens

}

Syntactic analysis (parser)

The next step is parsing, which turns a stream of tokens into a structure called a syntax tree (or AST – Abstract Syntax Tree). The parser analyzes the sequence of tokens and builds a tree that represents the grammar of the language.

An example of a parser for an arithmetic expression:

enum ASTNode {
case number(Int)
case addition(left: ASTNode, right: ASTNode)
case subtraction(left: ASTNode, right: ASTNode)
}

func parser(tokens: [Token]) -> ASTNode? {
var index = 0
func expression() -> ASTNode? {
var node = term()
while index < tokens.count, case .plus = tokens[index] {
index += 1
node = .addition(left: node!, right: term()!)
}
return node
}
func term() -> ASTNode? {
if case let .number(value) = tokens[index] {
index += 1
return .number(value)
}
return nil
}

return expression()

}

Code generation

Once the syntax tree has been created, the next step is to generate the machine code. In the case of our example, we can simply interpret the AST and execute the operation.

Example interpreter:

func evaluate(ast: ASTNode) -> Int {
switch ast {
case .number(let value):
return value
case .addition(let left, let right):
return evaluate(ast: left) + evaluate(ast: right)
case .subtraction(let left, let right):
return evaluate(ast: left) - evaluate(ast: right)
}
}

Conclusion

Creating a compiler is an exciting and multifaceted process that involves lexical analysis, syntax analysis, and code generation. Although compilers can be complex and require a deep knowledge of compilation theory, creating a simple compiler in Swift is a great way to learn the basics of compilation.