flooey.org

Dec 28, 2020

A December with Julia

I first did Advent of Code last year, and I decided then that I wanted to use it to learn a new programming language each year. Last year it was Rust; this year it was Julia. Unlike last year, this year I managed to complete every puzzle on the day it was released, and in the process I think I got a pretty good introduction to Julia.

For the most part, Julia lives in the same basic category as Python: an expressive, dynamically typed, garbage-collected language for writing small programs. Julia in particular has a focus on numerical analysis and scientific programming, which is also a niche where Python is heavily used with libraries like SciPy and pandas.

The quick summary of my experience is that I really like Julia. It’s thoughtfully designed in ways that make writing programs quick and efficient, while seeming to be more maintainable than Python, though I still don’t think I’d advocate writing large programs in it. Like Python, it has an expansive standard library with lots of helpful tools, though Julia’s is geared more towards numeric computing (for example, there is an entire linear algebra library). That said, the limits on how deep you can get into a language just writing Advent of Code solutions are real, so this is just a first impression.

Rather than walk through the basics of the language, which you can easily find elsewhere, I thought I’d discuss the areas that I particularly liked or disliked, from the major to the trivial.

Like: Multiple dispatch

Julia is based heavily on multiple dispatch, which means that a given function name can have multiple implementations that differ in the types of their arguments, and the correct implementation is selected based on the arguments provided. Since Julia is also dynamically typed, that selection happens at runtime. The compiler may use some tricks to narrow the possibilities ahead of time; I don’t know.
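As a minimal sketch of what that looks like: the Shape, Circle, and Square types and the describe function below are hypothetical names made up for illustration. One function name carries several methods, and the one that runs depends on the runtime types of all the arguments, not just the first:

```julia
# Hypothetical types, invented purely for illustration.
abstract type Shape end

struct Circle <: Shape
    radius::Float64
end

struct Square <: Shape
    side::Float64
end

# Three methods of the same function, differing only in argument types.
describe(a::Circle, b::Circle) = "two circles"
describe(a::Circle, b::Square) = "a circle, then a square"
describe(a::Shape, b::Shape) = "some other pair of shapes"  # generic fallback

# The element type is the abstract Shape, so the concrete types are only
# known at runtime; dispatch still picks the most specific matching method.
shapes = Shape[Circle(1.0), Square(2.0)]

first_result = describe(shapes[1], shapes[1])   # "two circles"
second_result = describe(shapes[1], shapes[2])  # "a circle, then a square"
third_result = describe(shapes[2], shapes[1])   # "some other pair of shapes"
```

The last call falls through to the generic (Shape, Shape) method because no more specific method accepts a Square as its first argument.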

It turns out that this works out really beautifully. Among other things, despite Julia having type hierarchies, it seems to discourage deep hierarchies and to sidestep a lot of the debates about composition versus inheritance. I’ve gradually soured on the usefulness of complicated OO hierarchies. One thing I enjoy about Go is its use of interfaces, rather than inheritance, for making interchangeable parts, and multiple dispatch seems like an even better way to go about that.

One awkward consequence, though, is that code completion becomes a lot less useful. Imagine you want to add an object to a container, and you remember that the function you need is called push!(), but you can’t remember which order the arguments go in. So you type out the function, and VS Code’s IntelliSense helpfully provides you with this info popup:

That indicator on the left shows you that there are 40 implementations of push!() available, and the first takes a Base.Nowhere (a type you can’t use) and a Core.Any (the root of the type hierarchy). That’s… not helpful. Some way to make the most relevant or indicative implementation show up first would go a long way here.

Like: Shared concept, shared name

One reason Julia’s use of multiple dispatch is so nice is the thoughtful way the standard library has been constructed. Shared implementations of an overall concept are all bundled under the same name, which makes code both more reusable and easier to write. This may be a case where my mathematical background biases me, but there are a lot of cases where what could be a very narrow interpretation of a name is avoided and an expansive one is used instead, ultimately making the language more useful.

My best example of this is the max() function, whose behavior is familiar from most languages. Or at least, its behavior on numbers and strings is familiar. But one of the Advent of Code programs involved a Conway’s Game of Life in 3D (and later 4D), where the Julia type needed is CartesianIndex. What should max(CartesianIndex(3, 1, 2), CartesianIndex(2, 3, 1)) return? In Julia, it returns CartesianIndex(3, 3, 2).

This violates what could be a maxim in another language: max() should always return one of its arguments. However, following that would mean that it fails on partially ordered types like CartesianIndex (as it happens, CartesianIndex is totally ordered by < in Julia, but whatever). With Julia’s implementation, though, it really provides a least upper bound on the arguments, which is more useful. It lets you write code like for i in min(a...):max(a...) to loop over all CartesianIndexes within the N-dimensional bounding box that contains all your points: exactly what you need for Game of Life in an infinite 3D space.
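A small sketch of the bounding-box trick, using made-up sample points (the variable names are just for illustration). Building a range between two CartesianIndex values with a colon needs a reasonably recent Julia; as far as I know that means 1.1 or later:

```julia
# Hypothetical sample points in 3D.
points = [CartesianIndex(3, 1, 2), CartesianIndex(2, 3, 1), CartesianIndex(1, 2, 4)]

# max and min work coordinate by coordinate, so the result need not be
# one of the inputs.
upper = max(points...)   # CartesianIndex(3, 3, 4)
lower = min(points...)   # CartesianIndex(1, 1, 1)

# A range between two CartesianIndex values covers the whole bounding box.
box = lower:upper
cell_count = length(box)  # 3 * 3 * 4 cells

# Every original point lies inside the box.
all_inside = all(p in box for p in points)
```

Iterating with for i in box then visits every cell in the box, each one as a CartesianIndex.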

Like: Type assertions, not hints

In Python, you can annotate variables, function parameters, and so forth with types, but they don’t actually do anything. Static analyzers can use them to tell you when you’ve screwed up your typing, but at runtime they have no effect, and actually getting a static analyzer set up properly is a lot of trouble for a small program. Julia has similar functionality, but it actually enforces types at runtime: assigning a value that can’t be converted to a variable’s annotated type triggers an exception, as does storing an item of the wrong type in an Array. This is way more useful than just hints and I found myself using types frequently to catch bugs early.

This is one of the features that makes it feel more maintainable than Python. Type assertions are useful, which means you’re much more likely to add them on their own merits, unlike in Python where they always feel like a chore to me.
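To make the runtime enforcement concrete, here is a minimal sketch. The helpers typed_total, double, and throws are hypothetical names invented for this example, not anything from Base. Note that an assignment to an annotated variable goes through a conversion first, so an exact value like 2.0 would be accepted for an Int while 3.5 is rejected:

```julia
# Hypothetical helper: `total` is declared as an Int, so every assignment
# to it is converted to Int, and throws if that is impossible.
function typed_total(values)
    total::Int = 0
    for v in values
        total += v
    end
    return total
end

# Hypothetical helper with an annotated parameter.
double(x::Int) = 2x

# Hypothetical helper that reports whether calling `f` throws.
function throws(f)
    try
        f()
        return false
    catch
        return true
    end
end

ok_total = typed_total([1, 2, 3])                 # 6
bad_total = throws(() -> typed_total([1, 2.5]))   # 3.5 cannot become an Int
bad_call = throws(() -> double("2"))              # no method accepts a String

ids = Int[]
push!(ids, 4)
bad_push = throws(() -> push!(ids, "four"))       # a String cannot go in an Int array
```

All three bad_ values come out true, and ids still holds just the one valid element afterwards.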

Neutral: Crowded namespaces

The Base module, which is the module that contains most of the commonly used functions and is automatically imported into your module, is surprisingly crowded. It includes names like one, something, devnull, retry, functionloc, and code_lowered, among hundreds of others. I’m not really sure how I feel about this, but it’s surprising. I basically never had to import anything, which is convenient. I was also constantly worried about stomping on some name in the Base namespace, but the name resolution rules make it so that largely doesn’t become a problem.

Dislike: 1-based indexing and closed ranges

These two are related: Julia uses both 1-based indexing for strings and arrays and closed range intervals, rather than the more typical 0-based indexing and half-open intervals. While this is sometimes a little more convenient for small problems, such as needing to translate “the ID starts at the fourth character of the line” into code, it makes more complicated operations quite annoying. In one chunk of code (which didn’t end up making the final version), I was using a regular array as a circular buffer, and I had to index it as (i - 1) % length(array) + 1, which is just ridiculous.
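A quick sketch of that wrap-around arithmetic on a made-up four-element buffer (wrap is a hypothetical helper name). For what it’s worth, Base also has a mod1 function that gives the same answer for positive indices, which takes some of the sting out:

```julia
buffer = ['a', 'b', 'c', 'd']
n = length(buffer)

# The expression from the text: shift to 0-based, wrap, shift back to 1-based.
wrap(i) = (i - 1) % n + 1

# Indices 1 through 6 wrap around to 1, 2, 3, 4, 1, 2.
wrapped = [wrap(i) for i in 1:6]

# mod1(i, n) returns a value in 1:n, matching wrap for positive i.
wrapped_mod1 = [mod1(i, n) for i in 1:6]

sixth = buffer[wrap(6)]  # same element as buffer[2]
```

The two differ for i less than 1, since % takes the sign of its left operand, so this sketch only compares them on positive indices.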

Julia does have the ability to specify that an array should use different indexing, but that doesn’t change the fact that intervals are still closed, and it also means your programs have a mixture of indexing schemes. This seems way more dangerous than just using the same (bad) indexing for everything, so unless you have some very dense code that makes much more sense with 0-based indexing, it doesn’t seem advisable.

Like: Appropriate use of magic

Julia has a few places where the authors have decided that a little sprinkling of syntactic sugar would make a common operation more straightforward, and I think basically all of them have merit in a language like this. The most magical is begin and end, which are normally keywords that open and close blocks, but can additionally be used in array indexing operations to indicate the first and last valid index, so you can write a[begin:end-1] to get a collection without the last element. This must be tricky to parse (especially since begin and end can legitimately show up in expressions already), but it’s nice to have and makes the code using it pretty clean.
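A few examples of the indexing sugar on a made-up array. As far as I know, begin inside an index needs Julia 1.4 or later, while end has worked for much longer:

```julia
a = [10, 20, 30, 40]

first_item = a[begin]          # 10
last_item = a[end]             # 40
all_but_last = a[begin:end-1]  # [10, 20, 30]
middle = a[begin+1:end-1]      # [20, 30]
```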

Like: High-quality REPL

This is just table stakes for a new language. Even C# has a reasonable REPL these days. Julia’s is quite good: nice presentation of values; good handling of multi-line operations like function definitions, especially in history; tolerable completion.

Dislike: Non-ASCII symbols as operators

It’s cute that you can write e ∈ a to check if e is an element of a, but I think it makes programs harder to write and the language harder to pick up for little benefit. It appears that all of the operators that do this also have functions with ASCII names to do the same thing (and the REPL allows typing the symbols using LaTeX syntax, also cute), but none of the documentation even mentions that, so you can only discover those if you go look in the reference manual for the operator’s definition.
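For reference, a sketch pairing a handful of the Unicode operators with ASCII spellings of the same operation. The arrays are made up, and this is only a sample, not the full list:

```julia
a = [1, 2, 3]
b = [3, 4]

# Each Unicode operator next to an ASCII spelling of the same operation.
member_unicode = 2 ∈ a             # typed as \in followed by Tab in the REPL
member_ascii = in(2, a)            # also written `2 in a`

absent_unicode = 5 ∉ a
absent_ascii = !(5 in a)

subset_unicode = [1, 2] ⊆ a
subset_ascii = issubset([1, 2], a)

union_unicode = a ∪ b
union_ascii = union(a, b)

common_unicode = a ∩ b
common_ascii = intersect(a, b)
```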

Like: Immutability by default

Many things (most notably, structs) in Julia are immutable by default, and most functions that work on mutable objects return copies rather than modifying the original. There’s a convention that functions that modify their arguments are named with a !, so you have sort(), which produces a sorted copy of something, and sort!(), which sorts it in place. (And immutable types simply have implementations for sort() but not sort!().) This is great for maintainability in larger programs as well as safety under concurrency, and the convention calls out any modifications quite clearly.
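A minimal sketch of both halves of this, with hypothetical Point and Counter types invented for the example:

```julia
scores = [3, 1, 2]

sorted_copy = sort(scores)   # new array; scores is left alone
snapshot = copy(scores)      # records that scores still holds [3, 1, 2]

sort!(scores)                # the ! signals that scores itself is modified

# Hypothetical types: a plain `struct` is immutable, `mutable struct` opts out.
struct Point
    x::Int
    y::Int
end

mutable struct Counter
    n::Int
end

c = Counter(0)
c.n += 1                     # allowed, Counter is mutable

p = Point(1, 2)
point_rejected = try
    p.x = 5                  # fields of an immutable struct cannot be reassigned
    false
catch
    true
end
```

Here point_rejected ends up true, and p keeps its original field values.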

Dislike: Documentation

Julia’s documentation is only okay. Some of the introductory material is really quite good, but it’s definitely aimed at scientists rather than software developers. For instance, the manual doesn’t appear to include the concept of a dictionary, and the “Noteworthy differences from other languages” section begins with comparisons to MATLAB. I wish they had a parallel track for software developers switching to Julia.
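For anyone hitting the same dictionary hurdle, a minimal sketch of the basics using Base’s Dict type. The data and variable names are made up for illustration:

```julia
# Hypothetical data: counts keyed by string.
counts = Dict("apple" => 2, "pear" => 5)

counts["plum"] = 1                   # insert or overwrite
pear_count = counts["pear"]          # lookup; throws a KeyError if the key is missing
fig_count = get(counts, "fig", 0)    # lookup with a default
has_apple = haskey(counts, "apple")

# Iteration yields key => value pairs, in no guaranteed order.
total = sum(v for (k, v) in counts)
```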

In addition, the reference manual leaves out a lot of details. For example, the entry for CartesianIndex doesn’t mention the max() behavior I liked above, nor does it mention that they’re totally ordered under < (apparently reverse lexicographically). The entry for max() similarly gives no indication it might be smarter than first assumed. Stack Overflow or random blog posts appear to be the only places you can learn these things, which is a sad state of affairs.
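A small sketch of what that ordering looks like in practice, based on the behavior described above rather than on anything the reference manual spells out. The indices are made up:

```julia
a = CartesianIndex(2, 1)
b = CartesianIndex(1, 2)

# The last coordinate is compared first, so a sorts before b even though
# a's first coordinate is larger.
a_before_b = a < b
b_before_a = b < a

# Sorting therefore groups indices by their last coordinate first.
ordered = sort([CartesianIndex(1, 2), CartesianIndex(2, 1), CartesianIndex(1, 1)])
```

This matches the order in which a column-major array is laid out in memory, which may be why it was chosen.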

Conclusion

Overall, I ended up with quite a good impression of the language. Once I got over a few hurdles like figuring out how to write a dictionary, I ended up with code that was as easy to write as Python but felt both cleaner and safer. Given that the language is so much younger, I suspect that the available libraries are probably a lot less mature for complicated topics like networking or crypto, but for the kinds of things I typically want a small scripting language for, it really fits the bill for me. For the next little while, I’m going to make an effort to reach for it first whenever I have a small problem to solve and see if it continues to impress or if I end up going back to Python.