Creating immutable tree data structures in Ruby

What are the advantages of immutable data structures outside functional programming? We explore the benefits of immutability when designing a search query API in Ruby.

A data structure is immutable if it cannot be changed after it is created. In a recent Ruby application we decided to use immutability to our advantage. This article explains how. We will look at an interesting property of the API of Active Record and see how we can easily implement a similar API with an immutable tree structure.

Abstract syntax trees

One of our customers has a web application that is built with elasticsearch, an open source search engine. We developed an interface to query the search index: a simple API that makes it easy to build complex search queries. For example, if we want to get a list of the first five products sorted by name, this is how we might retrieve them:

              Document.where(tags: "product").sort("name").take(5)

The search queries are ultimately represented in JSON, but before they're sent to the search index they are internally represented as an abstract syntax tree. The query above might have a syntax tree roughly similar to the one below:

Query syntax Example query syntax

Before explaining more about our implementation, let's briefly discuss the interfaces of Active Record and the library that powers it: Arel. The reason for this detour is that they are similar to the API we created for our search index. It will also illustrate an important advantage of immutable data structures: persistence.

Persistent interfaces

Let's look at the API of Arel first. It is used internally by Active Record and can be accessed by calling the arel_table method on any Active Record model. If we call to_sql on the objects returned by Arel, we can inspect the SQL query that is generated.

In the examples below, notice how calling a query method changes the original object. In fact, all methods simply return the same object:

              reports = Report.arel_table
              all_reports = reports.project(reports[])
              #=> SELECT "reports".* FROM "reports"
              #=> SELECT "reports".* FROM "reports" WHERE "reports"."published" = 't'
              #=> SELECT "reports".* FROM "reports" WHERE "reports"."published" = 't' LIMIT 5
              #=> SELECT "reports".* FROM "reports" WHERE "reports"."published" = 't' LIMIT 5

Now let's compare this behaviour with Active Record. Its query methods return objects that represent the results of a database query. We can also inspect them by calling to_sql.

As you can see, calling a query method on an object does not change the previous objects:

              all_reports = Report.scoped
              #=> SELECT "reports".* FROM "reports"
              all_reports.where(published: true).to_sql
              #=> SELECT "reports".* FROM "reports" WHERE "reports"."published" = 't'
              #=> SELECT "reports".* FROM "reports" LIMIT 5
              #=> SELECT "reports".* FROM "reports"

Data structures that behave like Active Record's interface are called persistent data structures. Each operation exposes a new version of the data structure while the original is unmodified. This is very helpful if multiple queries are derived from the same basic query.

Active Record goes through great lengths to achieve a persistent interface. A lot of code is needed to keep track of the input to each query method and deliver it to Arel to build the SQL queries.

When designing the API for our search index, we discussed the various ways to build a syntax tree while having an interface that is persistent. Some options are:

  • Store the input to query methods and return a new value object that encapsulates all parameters. Delay building the syntax tree until the very last.
  • Build the syntax tree immediately but clone the entire tree every time a query method causes a change in the tree. Return the new tree. We can't reuse parts of the old tree because they may be modified accidentally.
  • Build the syntax tree immediately but use an immutable tree data structure. Immutable structures have a persistent interface by definition. We can reuse parts of the old tree without any risk.

Active Record uses the first approach. It works well, but is certainly not the easiest. The second approach is simpler but is needlessly expensive, especially with complex queries.

The third approach is quite efficient and simple to implement. It has the additional benefit of guaranteed thread-safety. The remainder of this article further explores how to use immutable trees in Ruby to represent an abstract syntax tree.

Immutable tree structures

Returning to our search index API, how do we use immutable trees to our advantage? Suppose we construct two queries, the second one being a variation on the first one:

              query1 = Document.where(tags: "product").sort("name")
              query2 = query1.take(5)

The calls to the query methods are internally converted to an abstract syntax tree. The first query contains a reference to a where node and to a sort node. The second query consists of a new root node but shares the where and sort nodes with the first query.

Conversion to an abstract syntax tree Diagram of conversion to an abstract syntax tree

Because each node is immutable there is no risk in sharing the subtrees. We share them because it's both easier to implement and more efficient.

Immutable objects in Ruby

Our first step is to make sure we can make objects immutable after their creation. For this purpose we're going to override the new class method. We call freeze on all newly created nodes as well as their instance variables. Our changes are isolated in a module.

              module Immutable
                def new(*)
                  # Create a new object, then freeze it along with its instance variables.
                  object = super
                  object.instance_variables.each do |var|

As long as the instance variables are primitives such as strings, it is impossible to change the state of a new node. If instance variables can be arrays, hashes or other complex types, we should recursively freeze them to ensure immutability. Because an abstract syntax tree typically only has primitive values or arrays of other (frozen) nodes we don't necessarily have to deal with it.

              class ExampleNode
                extend Immutable
                attr_reader :value
                def initialize(value)
                  @value = value
              node =
              # RuntimeError: can't modify frozen String
              node.instance_variable_set(:@value, "update")
              # RuntimeError: can't modify frozen ExampleNode

Implementation of an immutable syntax tree

With basic immutability implemented we can create classes for some of the nodes we need in our syntax trees. Except for the initialiser, none of the instance methods we define can modify the object state. We must return a replacement node instead.

              class Node
                extend Immutable
              class QueryNode < Node
                attr_reader :where_node, :sort_node, :limit_node
                def initialize(where_node =, sort_node = nil, limit_node = nil)
                  @where_node, @sort_node, @limit_node = where_node, sort_node, limit_node
                def where(predicates)
        , sort_node, limit_node)
                def sort(field)
        ,, limit_node)
                def take(limit)
        , sort_node,
              class WhereNode < Node
                attr_reader :condition_nodes
                def initialize(condition_nodes = [])
                  @condition_nodes = condition_nodes
                def append(predicates)
                  appended_condition_nodes = do |field, value|
          , value)
         + appended_condition_nodes)
              class LimitNode < Node
                attr_reader :limit
                def initialize(limit)
                  @limit = limit
              # And so on... SortNode and EqualityConditionNode omitted.

We want to make the query methods available as class methods on a Document model. All we have to do is delegate each method to an empty query node.

              require "forwardable"
              class Document
                class << self
                  extend Forwardable
                  delegate [:where, :sort, :take] => :all
                  def all

This is just a very basic example, but we've already managed to implement a working API:

              query1 = Document.where(tags: "product").sort("name")
              query2 = query1.take(5)
              #=> #<LimitNode @limit=5>
              #=> nil

The original query remained unaffected! Because we used an immutable tree structure the interface of the syntax tree is guaranteed to be persistent.


Immutability has practical applications outside functional programming. We explored how to implement immutable abstract syntax trees in a non-functional language like Ruby.

The resulting implementation has clear advantages over a traditional mutable tree structure. By definition it is thread-safe and it has a persistent interface. This means we can use multiple versions of the same tree without worrying about unintended changes or data corruption.

Thread-safety and persistent interfaces usually require some careful thought when implemented in a mutable data structure. However, they emerged automatically from our decision to favour strictly enforced immutability. An immutable implementation will therefore most likely contain fewer errors.