Eric Holk

You wouldn't like me when I'm not coding.

Rust Project Updates

Happy New Year!

With the recent release of Rust 0.9, I’ve decided to start a tradition of tagging releases for my Rust projects to coincide with releases of the Rust language. Although things like Rust CI have helped keep Rust code running as the language matures, there’s still some frustration in using Rust projects if you’d rather not always run the latest master build. Sometimes, it may even be impossible to make two projects work together if their maintainers do not update at the same pace. By tagging releases with official Rust releases, it will become much easier to always find a version of my code that works with the latest release of Rust.

Without further ado, here are the projects:

  • rust-opencl – OpenCL bindings for Rust. Thanks to Colin Sheratt and Ian Daniher for their contributions and help keeping this code running.
  • rust-papi – Bindings to the PAPI performance counters library for Linux.
  • SciRust – Some linear algebra routines.
  • Boot2Rust – A small UEFI program that lets you boot and run nothing but Rust (and all the UEFI firmware stuff). See more in my previous post.

Besides shamelessly plugging my own software, I hope that this post will encourage others who maintain Rust projects to do the same. As the Rust community grows, more people will want to stick with official releases, and these are much more valuable when most of the Rust projects have easy-to-find versions that work with these releases.

Continuation Passing Style Interpreters

In my post on Scheme debuggers, I introduced a simple Scheme interpreter. Unfortunately, the interpreter was structured such that it was hard to make a terribly sophisticated debugger. While we won’t improve upon our debugger much today, I do want to look at a different style of interpreter that should enable more advanced debugging features.

This new style is called continuation passing style. You can think of a continuation as what to do next in a program, and so continuation passing style means we make the continuation an explicit argument of a function. This means we can do a lot more with the control flow of the program, which is important in a debugger, for example, since we want to be able to pause and inspect the execution at arbitrary points.

We’ll continue with a quick introduction to continuations and continuation passing style, then look at how to apply this to our interpreter. Once that is done, we will see how to implement call/cc in our new interpreter.

Booting to Rust

A couple nights ago I was looking over the UEFI spec, and I realized it shouldn’t be too hard to write UEFI applications in Rust. It turns out, you can, and here I will tell you how.

The thing that surprises me most about UEFI is that it now appears possible to boot your machine without ever writing a single line of assembly language. Booting used to require this tedious process of starting out in 16-bit real mode, then transitioning into 32-bit protected mode and then doing it all over again to get into 64-bit mode. UEFI firmwares, on the other hand, will happily load an executable file that you give it and run your code in 64-bit mode from the start. Your startup function receives a pointer to some functions that give you basic console support, as well as an API to access richer features. From a productivity standpoint, this seems like a win, but I also miss the sorcery you used to have to do when you were programming at this level.

Booting to Rust is a lot like writing bindings to any C library, except that the linking process is a bit more involved.

Improving the Performance of Harlan’s FFI

My last post showed that it’s now possible to call code written in Harlan from C++ programs. Sadly, the performance numbers I posted were pretty embarrassing. On the bright side, when you have a 20-30x slowdown like we saw before, it’s usually pretty easy to get most of that back. In this post, we’ll see how. The performance still isn’t where I’d like to be, but when we’re done today, we’ll only be seeing about a 4x slowdown relative to CUBLAS.

Using Harlan in C++ Programs

So far, Harlan programs have primarily existed in a vacuum. You’d compile them, run them, and they might produce some output. Certainly none of them received any input from the outside world. Most of the test cases use small sets of data, and the larger benchmarks generated incredibly synthetic data, like a vector of 16 million ones. My focus has been on building the compiler itself, so this has been a tolerable situation up to this point. However, Harlan is at the point where it needs more realistic applications and it’s clear the foreign function interface (FFI) story just won’t cut it anymore.

I’m happy to report that it’s now possible to pass non-trivial amounts of data from C++ to Harlan. Two new features made this possible. First, there are library functions like unsafe-deref-float and unsafe-set!-float which allow reading and writing from raw pointers. Second, there’s a new primitive form called unsafe-vec-ptr which gives a raw pointer to the contents of a vector. These are very low level, but they give us the tools we need to build a reasonably usable FFI. Let’s see how to use these to implement a dot product in Harlan and use it from a C++ program.

How to Write a Simple Scheme Debugger

A while back, Rich Loveland asked how one might write a Scheme debugger. I realized that I’ve written many a Scheme interpreter, but I’ve never really thought about how to write a debugger. This post is a first step down that path. We’ll write what I’d consider a minimally functional debugger. It will allow you to break a running program into the debugger (albeit by invoking a primitive function in the program you’re running) and then inspect the variables that are available. As an added bonus, you’ll be able to evaluate arbitrary Scheme expressions from the debugger and even change the values of variables. For the moment, however, we will not support stepping program execution, which is admittedly an important feature of debuggers.

We’ll start by writing an interpreter for a small but interesting subset of Scheme. Then we’ll show see how to add debugging features to the interpreter. As usual, you’ll be able to see all the code on Github.

Visualizing the Turing Tarpit

Jason Hemann and I recently had a paper accepted at FARM called “Visualizing the Turing Tarpit.” The idea grew out of a talk that Jason did at our weekly PL Wonks seminar on the minimalist programming languages, Iota and Jot. At the end of the talk, Ken Shan asked whether this could be used to do some kind of cool fractal visualization of programs. That night, several of us pulled out our computers and started hacking on Iota and Jot interpreters.

Iota and Jot are examples of Turing Tarpits, that is, languages “in which everything is possible but nothing of interest is easy.” The term comes from Alan Perlis. Turing Tarpits have some utility. The Lambda Calculus is arguably a Turing Tarpit, and yet it is quite useful in the study of programming languages and computability. Iota is notable as a Turing-complete language which consists of only two symbols. For example, i, *i*ii, and **iii are all legal programs in Iota. Sadly, some strings, such as i*i are not legal. This makes Iota less than ideal for enumerating many programs, as we can’t choose arbitrary strings but must instead be sure we follow the grammar. Jot was designed to fix this. Jot has the property that any binary string is a legal program.

Using Jot, it’s now incredibly easy to enumerate large numbers of programs. The next step is to plot the behavior. We chose to see how many steps a program executes for before terminating (or until we give up, as not all programs will terminate), and then use this to generate a color for a certain point. Below is an animated example of what we ended up with.

Why Write Compilers in Scheme?

One of the questions Klint Finley asked me for the Wired article about Harlan was “Why Scheme?” I wasn’t really satisfied with my answer, so I thought I’d answer it more completely here. Besides the fact that we were already experienced writing compilers in Scheme, Scheme has a lot of features that are very useful to compiler writers.

Why Is Harlan Called Harlan?

One of the more unexpected things to have happened after releasing Harlan was that I was contacted by a couple of people who are named Harlan. One of the common questions about Harlan is actually where the name comes from, so I thought I’d take the time to tell the story here.

A couple years back we started working on the design of a new language to handle GPUs. One of the initial observations was that people seem generally pretty content with languages like C or even Fortran for specifying computational code, but one of the difficulties in GPU programming is managing the data transfer between host and device memories and also within the memory hierarchy on the GPU. It seemed like this was an area that was ripe for a new language to address. Thinking about it some more, we realized the computational kernels really decide when data needs to be where, so we should let our language schedule data transfers around kernels. Indeed, the language we designed has no way of explicitly doing data transfers, but instead these are driven by the computations being done in kernels.

It became clear that kernels would play an important part in the language we were building. The word “kernel” happens to sound a bit like “colonel,” and one particularly famous colonel is Colonel Sanders. My advisor mentioned that Colonel Sanders’ first name was Harland, but I misheard this as Harlan, and by the time we figured this out, the name Harlan had already stuck. This is also why all of the test cases have the file extension .kfc.

It’s fitting, in a way, that we accidentally dropped the last letter from the name, given that a similar thing happened with Scheme, the language in which Harlan is implemented. Scheme was supposed to be called Schemer, but the operating system at the time limited filenames to only six characters, so the “r” was dropped.