Learning Swift: Ordered Dictionaries

Note: this post begins a new series about the Swift programming language, introduced at WWDC 2014. I’m no more experienced in Swift than anyone else outside Apple, but I learn best by coding and talking through a problem. If there’s a better way to approach some of these topics, get in touch on Twitter!

An ordered dictionary is a useful data structure that combines attributes of an array and a dictionary. Like an array, it stores elements in the order they’re added, and enumeration happens in the same fixed order every time. It also satisfies the basic idea of a dictionary: it stores key-value pairs, and can look up values by key.

Ordered dictionaries are incredibly useful tools for all kinds of development. To crib an example from the Swift book, an ordered dictionary might help an app display people and their ages: you might use names for keys, ages for values, and use the order to provide data in table cells by index path (and support user-driven reordering, to boot).

In this article, we’ll build an ordered dictionary atop the two primitive collection types already in Swift: an Array and a Dictionary. Let’s go!

The Shoulders of Giants

As described in The Swift Programming Language, Apple provides us some basic collection classes already. Most iOS developers are probably already familiar with the concepts of an array and a dictionary, but there are a few key caveats about the Swift equivalents.

First, they’re no longer prefixed. NSArray and NSDictionary served us well for many years, but have been replaced in the Swift world. The equivalents, Array and Dictionary, are fundamental types in the language, bringing along with them a few new behaviors.

The most notable of these is increased type safety. Swift, unlike Objective-C, supports type generics: a way of stating that a variable’s type or types can be determined later without throwing away all type information about that variable. (This concept also goes by the alias parameterized types.)

For example, in Objective-C we were limited to declaring an NSArray of objects, then trusting that they were all NSStrings. By contrast, in Swift we can declare that a given Array only contains String instances. The full type of such a variable would be Array<String> (in a syntax that is probably familiar to former Java programmers). Swift also provides a shorthand for this type: String[]. Either is acceptable, and both are interchangeable.

Dictionary has similar semantics: when creating a new Dictionary instance, you’ll specify the type for both the key and the value. We write such a type as Dictionary<KeyType,ValueType>; our hypothetical name-to-age mapping as a Dictionary would thus have the full type Dictionary<String,Int>.

One final change we see in Swift collections is their memory semantics. In Objective-C, NSString and NSDictionary were classes, and had appropriate semantics: they were passed (effectively) by reference, and copied only when necessary. Swift’s Array and Dictionary are, by contrast, structs; this means that in most cases, they’re copied when modified, assigned to a variable, or passed to a function. (Array has some exceptions to this behavior: for performance, it doesn’t always copy on assignment or mutation, but can be forced to do so if necessary by calling unshare().)

Beginning an Implementation

Since we desire most of the same memory behavior for our hypothetical ordered dictionary, let’s begin implementing it as a struct:

struct OrderedDictionary {}

We’d like to specify that this dictionary always maps keys of a certain type to values of a certain (potentially different) type, so let’s add some generic types to its declaration:

struct OrderedDictionary<Tk,Tv> {}

Here, Tk stands in for the type of the dictionary’s keys, and Tv for the type of the values. Next, we’ll need a way to store the keys and values they map to; let’s provide an array for the former, and a dictionary for the latter.

struct OrderedDictionary<Tk,Tv> {
    var keys: Array<Tk>
    var values: Dictionary<Tk,Tv>
}

Whoa! We’ve hit our first compile error:

Type ‘Tk’ does not conform to protocol ‘Hashable’

That’s right - Swift dictionaries need to be able to hash their keys. The language provides a Hashable protocol for that, which in turn defines a single method: hashValue(). Let’s constrain our Tk generic type so that we’re sure it conforms to this protocol:

struct OrderedDictionary<TK: Hashable, Tv> { /* ... */ }

Just like declaring class inheritance or protocol conformance, we can have a generic type include a constraint by naming the required protocol after a colon. It’s almost as if we’re declaring the “type of the type,” so to speak.

Believe it or not, this is actually coming close to a complete ordered dictionary type definition. This little snippet will compile and is usable, thanks largely to Swift’s various bits of inference and automatic code generation. Let’s gin up a little command-line tool that makes one of these things and prints it out:

let od = OrderedDictionary(keys: ["Tim"], values: ["Tim" : 24])
println(od)

If we run that code (with the appropriate imports, if needed), we’d see the following output:

V19SwiftDataStructures17OrderedDictionary (has 2 children)

Hm. That seems remarkably unhelpful in a couple different ways:

We don’t really get any useful information about the dictionary, and
We have to specify its contents in a fairly roundabout way, naming the key twice (potentially introducing an inconsistency along the way)

Better Behaved

Any developer who tries to use our OrderedDictionary is first going to need to create an instance, so let’s focus some efforts there. As shown in the snippet above, Swift has automatically created a memberwise initializer for this struct, so that callers can make an instance by providing a value for every member. This isn’t ideal – it requires that callers specify keys twice, for example.

Let’s provide our own initializer on the struct. For now, we don’t want to take any initial keys or values in (we’ll provide that later); we just want to set up an empty OrderedDictionary instance for the caller.

struct OrderedDictionary<Tk: Hashable, Tv> {
    var keys: Array<Tk> = []
    var values: Dictionary<Tk,Tv> = [:]

    init() {}
}

This initializer is about as bare as it gets: no arguments and no body. It accomplishes its purpose, however, in that it prevents Swift from generating a memberwise initializer for this struct – the language will refuse to do so if it detects that the developer has provided any custom initializers.

At the same time, though, we’ve made another slight tweak here to ensure that our struct is still fully initialized by the time the init() method returns: we’ve added default values for both of our var declarations. Since a new OrderedDictionary has no contents, we’ve set up the keys variable to have an empty Array, and given the values variable a similarly empty Dictionary.

With these changes, our stub code to make and print an instance looks a little cleaner:

let od = OrderedDictionary()
println(od)

Now, we can’t add any values to this dictionary up front, but at least we’re not repeating ourselves. Let’s bring back that mutability.

Subscripting

Swift brings over a handy language feature from Objective-C: the ability to define a method that supports subscripting syntax. In Objective-C, we implemented the method -objectAtIndexedSubscript: to support array-style integer indexing, and -objectForKeyedSubscript: for dictionary-style key-value access.

Swift combines both of these methods into a single keyword subscript, which infers its behavior (integer-based or key-based) from the argument types with which it is implemented. For example, implementing subscript and typing its argument as an Int gets you array-like behavior, while typing the argument with an arbitrary key type will look more like a dictionary.

Let’s implement the basic dictionary subscripting on OrderedDictionary now:

struct OrderedDictionary<Tk: Hashable, Tv> {
    /* ... vars and init ... */

    subscript(key: Tk) -> Tv? {
        get {
            return self.values[key]
        }
        set(newValue) {
            if newValue == nil {
                self.values.removeValueForKey(key)
                self.keys.filter {$0 != key}
                return
            }
            
            let oldValue = self.values.updateValue(newValue!, forKey: key)
            if oldValue == nil {
                self.keys.append(key)
            }
        }
    }
}

That’s a lot of code all at once! Let’s go through it in a more fine-grained way:

subscript(key: Tk) -> Tv? {

This says we’re providing a subscripting implementation that takes our key type (Tk) as the subscript itself and returns something that is an optional value type (Tv?). Remember that it’s possible for us to return nil from subscript implementations – if we’re given a key that doesn’t exist, we can’t return a valid value, so we return nil and the return type has to be optional.

get {
    return self.values[key]
}

This is fairly straightforward: when getting the value for a given subscript, just ask the values dictionary for its own value for that key and return it unchanged.

set(newValue) {

This will be our implementation for a subscript setter; newValue is of the return type above, so Tv?. We’ll need to handle possible nil new values in this setter, so:

if newValue == nil {
    self.values.removeValueForKey(key)
    self.keys.filter {$0 != key}
    return
}

We’ll assume that setting a nil new value means we want to remove the key and associated value, so we remove the value from our values dictionary and the key from our keys array. The latter has to be accomplished by way of a filter for equality against the given key; since our key type is Hashable, we know it’s also Equatable, so we can use the equality operator != on instances.

let oldValue = self.values.updateValue(newValue!, forKey: key)
if oldValue == nil {
    self.keys.append(key)
}

On the other hand, if the new value isn’t nil, let’s insert it into our values dictionary (unboxing it from its optional type along the way). We also hang on to any value it’s replacing: if there wasn’t already a value for this key, we’ll need to tack the key onto the keys array as well, so as to maintain the right ordering.

Now, our stub runner code can not only make a new OrderedDictionary, but it can also insert some new values into it. Let’s keep on tracking names and ages using the same data from above, but this time set it using subscripting:

let od = OrderedDictionary()
od["Tim"] = 24
println(od)

Descriptivism

Now that we can shove values into an OrderedDictionary, let’s get them back out again. Much like Objective-C, we can implement a method description() that formats our instance to a nicer String:

var description: String {
    var result = "{\n"
    for i in 0..self.keys.count {
        let key = self.keys[i]
        result += "[\(i)]: \(key) => \(self[key])\n"
    }
    result += "}"
    return result
}

This implementation should be fairly straightforward, with the exception of a couple tiny new bits of syntax:

The declaration (var description: String { ...) is a read-only computed property. It doesn’t include “storage” (what we would think of as an instance variable), but instead provides a value based solely on other internal state.
The loop (for i in 0..self.count {) uses a range, which isn’t inclusive of its last index – we iterate up to, but not past, the count.

Now, our runner code can print out the description of our OrderedDictionary:

let od = OrderedDictionary()
od["Tim"] = 24
println(od.description)

And we should see the final output:

{
[0]: "Tim" => 24 
}

Voila! We have a working ordered dictionary struct with basic mutability and a handy description. A more complete implementation is available on GitHub.