Closures are the Generics for Go
Part Two in a series on Go
How do you solve the Go generics problem?
Go is a language with an intentionally restricted feature set; one of the features that Go leaves out is user-defined generic types and functions. There are, broadly speaking, two things that people use generics for.
The most common request is for generic data structures. There are times when we need something more complicated than a slice or map. In Go, you have two choices: either write the data structure over and over for each type that you need (or use a code generator to save some typing), or define your data type in terms of interface{} and use type assertions on the values when they are returned from the structure.
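To make the second choice concrete, here is a minimal sketch of an interface{}-based container. The Stack type and its methods are hypothetical names invented for this illustration; they are not taken from Checks-Out.

```go
package main

import "fmt"

// Stack is a hypothetical container that stores every value as interface{}.
type Stack struct {
	items []interface{}
}

// Push adds a value of any type to the top of the stack.
func (s *Stack) Push(v interface{}) {
	s.items = append(s.items, v)
}

// Pop removes and returns the top value; ok is false when the stack is empty.
func (s *Stack) Pop() (v interface{}, ok bool) {
	if len(s.items) == 0 {
		return nil, false
	}
	last := len(s.items) - 1
	v = s.items[last]
	s.items = s.items[:last]
	return v, true
}

func main() {
	s := &Stack{}
	s.Push("hello")
	s.Push(42)

	v, _ := s.Pop()
	// The caller has to assert the concrete type; the compiler cannot check it.
	n, ok := v.(int)
	fmt.Println(n, ok) // 42 true

	v, _ = s.Pop()
	_, ok = v.(int) // wrong guess: the stored value is a string
	fmt.Println(ok) // false
}
```

The second assertion fails only at run time, which is the cost of this approach: the type information is lost when the value goes into the structure.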
The other usage of generics is functional. Rather than just storing different kinds of values, you sometimes need to run the same algorithm over different data types. Unlike generic data structures, there is an approach that feels like The Go Way for this usage.
Generics in checks-out
As an example, let’s look at Checks-Out, a fork of LGTM, an open-source project that helps teams enforce code review policies via GitHub. You can find checks-out at https://github.com/capitalone/checks-out. As part of its integration with GitHub, Checks-Out calls several GitHub APIs that return lists of different things, including teams and users.
For example, here is the original getTeamMembers function:
A Not Very DRY Solution
This code was working well until a new user of Checks-Out tried to register a project in an organization with 45 teams. It turns out that all of the List* API calls are actually paginated, with a page size of 30. One-third of the teams for the new user weren’t ever loaded, which resulted in errors when the teams on the second page were referenced.
When this happened, we had to add pagination support to checks-out.
Our first fix looked like this:
And:
While this approach fixes the bug, it makes the code longer and quite repetitive. Is there a way to address this?
ListTeams and ListTeamMembers both paginate in very similar ways, but they expect different parameters and return very different data. If these were the only places in the code where we had to paginate, we might have left it as-is. But there are an additional ten locations in the code with the same issue. We didn’t want to repeat pagination code over and over again with minor variations.
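As a rough illustration of the kind of loop that ends up being repeated, here is a hand-written pagination sketch. It is not the Checks-Out code: ListOptions, Response, listTeams, and getAllTeams are stand-ins defined for this example that only mimic the shape of a paginated client call, using the 45 teams and page size of 30 from the bug described above.

```go
package main

import "fmt"

// ListOptions and Response are stand-ins that mimic a paginated client.
type ListOptions struct {
	Page    int // 1-based page to fetch; 0 means the first page
	PerPage int
}

type Response struct {
	NextPage int // 0 when there are no more pages
}

// listTeams is a fake remote call that serves 45 team names one page at a time.
func listTeams(opt *ListOptions) ([]string, *Response, error) {
	const total = 45
	page := opt.Page
	if page < 1 {
		page = 1
	}
	start := (page - 1) * opt.PerPage
	end := start + opt.PerPage
	if end > total {
		end = total
	}
	var teams []string
	for i := start; i < end; i++ {
		teams = append(teams, fmt.Sprintf("team-%d", i+1))
	}
	resp := &Response{}
	if end < total {
		resp.NextPage = page + 1
	}
	return teams, resp, nil
}

// getAllTeams repeats the call until the response reports no next page.
func getAllTeams() ([]string, error) {
	opt := &ListOptions{PerPage: 30}
	var all []string
	for {
		teams, resp, err := listTeams(opt)
		if err != nil {
			return nil, err
		}
		all = append(all, teams...)
		if resp.NextPage == 0 {
			break
		}
		opt.Page = resp.NextPage
	}
	return all, nil
}

func main() {
	first, _, _ := listTeams(&ListOptions{PerPage: 30})
	fmt.Println(len(first)) // 30: a single call misses the last 15 teams

	all, _ := getAllTeams()
	fmt.Println(len(all)) // 45
}
```

Everything in getAllTeams except the call itself and the element type would be identical for a list of team members, which is exactly the duplication we wanted to avoid.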
A Quick Detour to Java
I’ve spent most of the last two decades programming in Java. In Java, you would solve this problem with a generic method that looks something like this:
The initial empty List is passed in as a parameter along with the data that’s used to make the remote call. Loader is an interface that looks like:
Unlike Go, we can’t return multiple values from a Java method, so LoaderInfo is created as a simple wrapper class:
With this code in place, we can pull down all of the teams with:
And we can pull down all of the users in a team with:
This solution shows that the power of generics is more than just type-safe containers; generics allow you to abstract the same algorithm over multiple data types.
Back to Go
But Go doesn’t have generics. How do we elegantly solve this pagination problem?
We could use interface{} as a way to pass untyped input and output parameters around, but that misses the point. It creates ugly code that requires type assertions and subverts the type system that helps us write correct code.
Looking at the two GitHub API calls again, there are some data types in common. Both have *github.ListOptions as an input parameter (directly in the case of ListTeams and as an embedded struct in ListTeamMembers), and they both return *github.Response and error. Is there a way to pass along the parameters that are the same while still referring to the parameters that are different?
One of the most powerful features in Go is closures. A closure looks like a function declared inside of a function:
Calling outer with “hello” as the parameter will return 10:
You can try it out yourself at https://play.golang.org/p/U3HV-0nvTB
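The author’s version lives at the playground link. As a stand-in, here is one hypothetical way to write an outer function with the behavior described above; the body of outer and the name inner are assumptions made for this sketch.

```go
package main

import "fmt"

// outer is a hypothetical reconstruction, not the code behind the playground link.
func outer(name string) int {
	// inner is a closure: it reads name, a variable that belongs to outer.
	inner := func(multiplier int) int {
		return len(name) * multiplier
	}
	return inner(2)
}

func main() {
	fmt.Println(outer("hello")) // 10
}
```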
At first, it might seem silly to declare a function inside of another function. Why split up the logic? The power of closures comes from two things. First, they can refer to and modify variables declared in the outer function. Second, closures can be passed to other functions and even returned from their declaring function. When combined with references to local variables, this leads to some interesting behavior:
Calling outer2 returns 20:
The code can be run at https://play.golang.org/p/dNmhg_6x9T
Look at outer2. Its local variable total was modified when myClosure was passed to helper and called from there. There’s no reference to total in helper, but using a closure allowed it to be modified. Just like structs in Go, closures have state. This state provides the solution to our problem.
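For readers who cannot open the playground link, here is a small sketch that behaves the way this paragraph describes. The names outer2, helper, myClosure, and total come from the text; the function bodies and the values 5 and 15 are assumptions made for this illustration.

```go
package main

import "fmt"

// helper knows nothing about total; it only calls the function it is given.
func helper(f func(int)) {
	f(5)
	f(15)
}

func outer2() int {
	total := 0
	// myClosure captures total by reference, so calls made from inside
	// helper modify the variable declared here.
	myClosure := func(n int) {
		total += n
	}
	helper(myClosure)
	return total
}

func main() {
	fmt.Println(outer2()) // 20
}
```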
Putting It All Together
Using the closure technique, we can generate a slice of the type we need and modify that slice in a closure that’s passed to a function which does the pagination looping. Our closure needs one argument, the *github.ListOptions that has the current page info, and returns two values: a *github.Response and the ubiquitous error. The function looks like this:
And the subsequent code looks like:
And:
The final change we made is factoring out the code to get all of the Teams into its own function. Besides getTeamMembers, there is another function that needs a slice of *github.Team, so rather than repeating ourselves, we once again share code:
The final version of our getTeamMembers function is:
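The overall shape of this technique can be sketched without the GitHub client. In the sketch below, buildCompleteList follows the signature described in this section, but its body is an assumption, and ListOptions, Response, listPage, and this getTeams are stand-ins defined for the example rather than the real go-github types or the Checks-Out functions.

```go
package main

import "fmt"

// ListOptions and Response are stand-ins for the client's pagination types.
type ListOptions struct {
	Page    int
	PerPage int
}

type Response struct {
	NextPage int // 0 when there are no more pages
}

// buildCompleteList drives the pagination loop. It never sees the items;
// the closure it is given is responsible for storing them.
func buildCompleteList(process func(opt *ListOptions) (*Response, error)) error {
	opt := &ListOptions{PerPage: 30}
	for {
		resp, err := process(opt)
		if err != nil {
			return err
		}
		if resp == nil || resp.NextPage == 0 {
			return nil
		}
		opt.Page = resp.NextPage
	}
}

// listPage is a fake paginated call over an in-memory slice.
func listPage(all []string, opt *ListOptions) ([]string, *Response, error) {
	page := opt.Page
	if page < 1 {
		page = 1
	}
	start := (page - 1) * opt.PerPage
	if start > len(all) {
		start = len(all)
	}
	end := start + opt.PerPage
	if end > len(all) {
		end = len(all)
	}
	resp := &Response{}
	if end < len(all) {
		resp.NextPage = page + 1
	}
	return all[start:end], resp, nil
}

// getTeams collects every page into a typed slice owned by this function.
func getTeams(source []string) ([]string, error) {
	var teams []string
	err := buildCompleteList(func(opt *ListOptions) (*Response, error) {
		page, resp, err := listPage(source, opt)
		teams = append(teams, page...) // the closure modifies teams
		return resp, err
	})
	return teams, err
}

func main() {
	source := make([]string, 45)
	for i := range source {
		source[i] = fmt.Sprintf("team-%d", i+1)
	}
	teams, err := getTeams(source)
	fmt.Println(len(teams), err) // 45 <nil>
}
```

The loop is written once, and each caller keeps a slice of exactly the type it needs, with no interface{} and no type assertions.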
Sorting with Closures
The closure approach can be used in many places in Go code where you would turn to generics in other languages. Another example is sorting. In order to use the sort.Sort function, you need to pass it an implementation of the sort.Interface interface. Rather than creating a new type every time that you need to sort a slice, you can use the following code:
Then sort with:
Try it out at: https://play.golang.org/p/hYMcQ81AvN
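The author’s version is at the playground link. As a stand-in, here is one way a closure-backed sort.Interface could look. The sorter type, the person struct, and sortByAge are hypothetical names for this sketch; only sort.Sort and sort.Interface come from the standard library.

```go
package main

import (
	"fmt"
	"sort"
)

// sorter is a hypothetical adapter that satisfies sort.Interface with closures.
type sorter struct {
	length int
	swap   func(i, j int)
	less   func(i, j int) bool
}

func (s sorter) Len() int           { return s.length }
func (s sorter) Swap(i, j int)      { s.swap(i, j) }
func (s sorter) Less(i, j int) bool { return s.less(i, j) }

type person struct {
	Name string
	Age  int
}

// sortByAge sorts people in place; both closures capture the slice.
func sortByAge(people []person) {
	sort.Sort(sorter{
		length: len(people),
		swap:   func(i, j int) { people[i], people[j] = people[j], people[i] },
		less:   func(i, j int) bool { return people[i].Age < people[j].Age },
	})
}

func main() {
	people := []person{{"Ann", 41}, {"Bob", 23}, {"Cy", 35}}
	sortByAge(people)
	fmt.Println(people) // [{Bob 23} {Cy 35} {Ann 41}]
}
```

The sorter type is written once; sorting a different slice type only takes two new closures.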
A variation on this technique was added in Go 1.8 with the new sort.Slice function.
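With sort.Slice, the standard library supplies the length and swap logic itself, so only the less closure is needed. A minimal sketch follows; sortByLength is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"sort"
)

// sortByLength orders words from shortest to longest, in place.
func sortByLength(words []string) {
	// The less closure captures words and compares elements by index.
	sort.Slice(words, func(i, j int) bool {
		return len(words[i]) < len(words[j])
	})
}

func main() {
	words := []string{"banana", "fig", "apple"}
	sortByLength(words)
	fmt.Println(words) // [fig apple banana]
}
```

Note that sort.Slice does not guarantee a stable order for equal elements; sort.SliceStable exists for that case.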
Half A Loaf
So does this mean that we don’t need generics at all in Go? Not entirely. We can abstract over generic functions, but we still don’t get additional user-defined generic data structures.
The biggest drawback to using closures as a replacement for generics is that they work via side-effects. Modern software development discourages side-effects because they make it harder to reason about the flow of data. It’s easier to test and understand a program when its functions only rely on data that’s passed in as parameters and only modify data that’s created within the function. When a function relies on data that’s set up somewhere else in the program, and modifies non-local data, there is an invisible dependency that needs to be documented.
When you pass around a closure, you potentially turn local state into non-local data. That data needs to be managed. A closure that modifies captured variables can’t be called from multiple goroutines without proper locking. And there’s no way to know that a passed-in function is a closure; you need to understand how the closure flows through your program.
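As one sketch of what proper locking can look like, the closures below guard their captured variable with a sync.Mutex. The names newCounter and runConcurrently are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// newCounter returns two closures that share count and the mutex guarding it.
func newCounter() (increment func(), value func() int) {
	var mu sync.Mutex
	count := 0
	increment = func() {
		mu.Lock()
		defer mu.Unlock()
		count++
	}
	value = func() int {
		mu.Lock()
		defer mu.Unlock()
		return count
	}
	return increment, value
}

// runConcurrently calls f from n goroutines and waits for all of them.
func runConcurrently(n int, f func()) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			f()
		}()
	}
	wg.Wait()
}

func main() {
	increment, value := newCounter()
	runConcurrently(100, increment)
	fmt.Println(value()) // 100
}
```

Without the mutex, the count++ inside increment would be a data race, and nothing in the type func() would warn a caller about it.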
In the code above, it isn’t clear how to use buildCompleteList correctly, because the function’s input and output parameters don’t indicate how the data is created or stored. You need to look at existing uses of the function to see that the closures passed into buildCompleteList are modifying variables within their declaring function.
However, this doesn’t make closures bad; it just means that you need to use them when appropriate and be careful. Rather than pine for generics in Go, we can use the tools it does have to help us work around this limitation.