GSoC Report - Part 1

When I originally envisioned this project, the selling point was "a piecemeal rewrite of key features in Rust". So, how did that fare?

Well, it's subjective. In terms of what I learned from attempting it, it was a success; in terms of actually doing it, not so much. The problem isn't that it can't be done, it's that it isn't the right approach for a project such as GJS.

A Very Brief Overview of How GJS Works

GJS is a complex piece of software that does some very low-level manipulation using various libraries: the GNOME libs (GLib and friends), libffi, and Mozilla's SpiderMonkey JS engine.

It uses the SpiderMonkey engine to load and run JS. The JS can then use many GJS-specific functions via FFI to create and manipulate objects - GNOME objects via GLib or otherwise. For example (using snippets from the gtk.js example in the GJS source):

imports.gi.versions.Gtk = '3.0';
// Import internal native GJS modules, in this case, gi
const Gtk = imports.gi.Gtk;

function onDestroy(widget) {
    log("destroy signal occurred");
    Gtk.main_quit();
}

Gtk.init(null);

// create a new window
let win = new Gtk.Window({ type: Gtk.WindowType.TOPLEVEL });
// connect the 'destroy' event to call onDestroy()
win.connect("destroy", onDestroy);

// display the window
win.show();

// All gtk applications must have a Gtk.main(). Control ends here
// and waits for an event to occur (like a key press or mouse event).
Gtk.main();

The line const Gtk = imports.gi.Gtk; triggers a long-winded setup that does a rather large number of things, the root of which is imports.gi. But first, let's go bottom-up and start from GJS initialization, which sets this (amongst other things) up:

  • on running GJS, it initializes and calls a function to create a JSContext; this is the internal state of the JS engine that our script will live in, a bit like a tab in a browser
  • when this function is called, it also sets up the JSContext and calls some more functions
  • one of these functions is importer_resolve, which is called as a property of imports.gi
  • in this chain is gjs_register_native_module; this function stores a GjsDefineModuleFunc (a custom typedef) in a hashtable - in this case a callback to gjs_define_repo for gi
  • along the way, when the JavaScript is loaded and parsed, repo_new (in gi/repo.cpp) is called; this starts a few other things, which result in a JS:ClassOps object that is given to the context
  • one of these things is calling gjs_import_native_module, which takes the context, the name of the import, and a JS::MutableHandleObject; it then looks up the name in the hashtable of modules,
  • pulls out that callback and gives it the context and handle
  • the callback is then called, and sets up a variety of things for us to use in JS
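
The registration and lookup steps above can be sketched in a few lines of Rust. This is only an illustration of the "hashtable of named callbacks" idea, not how GJS implements it: Context, ModuleObject, NativeModules and define_repo are hypothetical stand-ins for the JSContext, the JS object handle, the module hashtable and gjs_define_repo.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the JSContext and the JS object handle.
struct Context;

#[derive(Default)]
struct ModuleObject {
    properties: Vec<String>,
}

// Plays the role of GjsDefineModuleFunc: a callback that fills in a module object.
type DefineModuleFunc = fn(&Context, &mut ModuleObject) -> bool;

#[derive(Default)]
struct NativeModules {
    table: HashMap<String, DefineModuleFunc>,
}

impl NativeModules {
    // Roughly the job of gjs_register_native_module: remember the callback by name.
    fn register(&mut self, name: &str, func: DefineModuleFunc) {
        self.table.insert(name.to_owned(), func);
    }

    // Roughly the job of gjs_import_native_module: look the name up and
    // hand the context and the object to the stored callback.
    fn import(&self, cx: &Context, name: &str, module: &mut ModuleObject) -> bool {
        match self.table.get(name) {
            Some(func) => func(cx, module),
            None => false,
        }
    }
}

// Stand-in for gjs_define_repo: defines something on the module object.
fn define_repo(_cx: &Context, module: &mut ModuleObject) -> bool {
    module.properties.push("versions".to_owned());
    true
}

fn main() {
    let mut modules = NativeModules::default();
    modules.register("gi", define_repo);

    let cx = Context;
    let mut gi = ModuleObject::default();
    let found = modules.import(&cx, "gi", &mut gi);
    println!("found: {}, properties: {:?}", found, gi.properties);
}
```

The real code does this with C function pointers and SpiderMonkey handles, which is exactly where the pointer passing discussed below comes from.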

So calling new Gtk.Window({ type: Gtk.WindowType.TOPLEVEL }); in JS gets the JSContext to call a few functions that were set up in gjs_define_repo, in this case creating a new Gtk window with the chosen properties - this is really calling some native C functions in the GNOME libs at a low level. There is also some JavaScript in GJS/modules which sets up a few things too; I won't cover it here.

The reason I cover the above is that it gives an introduction to the main point of GJS: to allow JavaScript, via SpiderMonkey, to call native functions and create native objects. This results in much pointer passing, such as passing a pointer to the context or to a JS object to GJS functions, or creating native objects and passing pointers to those to JS (amongst many other things). It gets very complicated.

Please note: I've glossed over a huge number of things here.

To borrow from one of Federico Mena-Quintero's slides at GUADEC: "C is a sea of aliased pointers".

Why a Piecemeal Rewrite isn't Right for GJS

The problem lies in "C is a sea of aliased pointers".

To effectively use Rust, you need to be able to use safe Rust. As seen in my last post, you can write unsafe Rust within an unsafe block, and wrap it in safe Rust so long as you:

  1. Understand the contract the external function imposes on you
  2. Don't violate this contract in the unsafe block
  3. Enforce this contract in both the safe and unsafe Rust

A contract in this sense is the set of rules that unsafe (Rust or C) code imposes on you - it is up to you to follow these rules when using the code, or it may result in undefined behaviour, such as dereferencing a pointer that has been freed. When you use such code within Rust and don't honour the contract, you have broken what Rust promises in return: guaranteed safety.
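
As a minimal sketch of those three points: external_strlen below is a hypothetical stand-in for an external C function (written in Rust so the example is self-contained), and safe_strlen is the safe wrapper that makes the contract impossible to violate from safe code.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

// Hypothetical stand-in for an external C function.
// Contract: `ptr` must be non-null, point to a NUL-terminated string, and
// remain valid for the whole call.
unsafe fn external_strlen(ptr: *const c_char) -> usize {
    unsafe { CStr::from_ptr(ptr) }.to_bytes().len()
}

// Safe wrapper: a `&CStr` is never null, is always NUL-terminated and cannot
// be freed while it is borrowed, so every caller upholds the contract by
// construction and the unsafe block stays small and auditable.
fn safe_strlen(s: &CStr) -> usize {
    unsafe { external_strlen(s.as_ptr()) }
}

fn main() {
    let name = CString::new("Gtk.Window").unwrap();
    println!("{} bytes", safe_strlen(&name));
}
```

The wrapper only stays sound as long as nothing else hands a raw pointer to external_strlen directly, which is the situation a partially converted codebase ends up in.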

We can rewrite a function in Rust, and where the Rust code calls through FFI, we make sure the contract of the called function is honoured in our shiny new Rust code. But! In the case of GJS, the result is then likely to be passed as a raw pointer to another function within GJS - at this point the safety of Rust becomes moot.

To really get the benefits of Rust you need to be able to take ownership of the data that is created. This is a complex scenario with the GJS codebase as it stands, and it is why I think a from-scratch rewrite is better. With a piecemeal rewrite, each function that is rewritten in Rust will mainly be taking and giving raw pointers, since this is what the rest of the codebase does, and these converted functions then have to be refactored again as the rewrite progresses. That's going to be a lot of refactoring: the GJS codebase isn't small, and it is complex. So in this case, while a piecemeal rewrite is doable, it isn't the right answer.

Data Ownership

I will try to illustrate the issues of data ownership across C/C++ with pointers, and Rust.

One of the areas I was gearing up to rewrite (and the reason I rewrote gi-ffi in Rust) is the gi/function.cpp source file. It is heavily used and could seriously benefit from a Rust conversion in code layout alone - there are many switch/case blocks and nested if/else blocks.

Let's take a look at one of the functions I was gearing up to rewrite. GJS uses GObjectIntrospection to dynamically construct bindings to various GNOME libraries via the use of .gir files - these are an XML representation of the exported C API, which can then be compiled to .typelib. This reconstruction is done with GObjectIntrospection calls, which in turn call into libffi. Unfortunately for me, the functions in GObjectIntrospection that do this aren't in the .gir I used to create the Rust bindings, so I ended up actually rewriting them in Rust.

The function in question is static bool gjs_invoke_c_function in gi/function.cpp (the file that contains much of the C function reconstruction code). Its signature is relatively straightforward: it takes a JSContext by pointer (this is owned by SpiderMonkey), a pointer to a Function, a JS::HandleObject by value, some other objects, and a GIArgument.

One of the pointers created in this function is GIArgument *in_arg_cvalues;. It is given a new memory allocation through g_newa(), a type-safe allocation macro from GLib. This is an array of pointers to arguments, and this allocation is one of many. in_arg_cvalues is then passed by pointer to other functions in function.cpp, and the addresses of its elements are also stored in GIArgument *ffi_arg_pointers;. That pointer, along with a member of Function, is passed to the libffi function ffi_call, which then does its magic with it.

What I'm trying to highlight here is that this is just one of many intermingled pointer scenarios. It's not too bad, certainly not the worst, but it displays two things: the need for clear ownership (pointers don't carry ownership), and the passing of many pointers, which means that to gain the full benefit of Rust I need to rewrite each of the functions called here, and those that call this one. No small feat. A rewrite here would also demonstrate quite well the need to honour contracts to external code and unsafe Rust.

Philip Chimento mentioned when reviewing this post that g_newa() is a stack allocation and is hence very fast. Rust can match this for sizes known at compile time: plain values, structs and fixed-size arrays live on the stack by default, and heap allocation is explicit with the Box<T> type. The exceptions are types that manage a heap buffer internally, such as Vec, String and the reference-counted Rc and Arc.
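
To make the in_arg_cvalues / ffi_arg_pointers pattern concrete, here is a rough Rust sketch of the same shape. It is not the GJS code: i64 stands in for GIArgument, and sum_through_pointers is a hypothetical stand-in for a callee like ffi_call that only ever sees untyped pointers.

```rust
use std::os::raw::c_void;

// Hypothetical stand-in for a C callee such as ffi_call: it receives untyped
// pointers and simply trusts the caller that each one is valid.
unsafe fn sum_through_pointers(args: *const *mut c_void, n_args: usize) -> i64 {
    let mut total = 0;
    for i in 0..n_args {
        let arg = unsafe { *args.add(i) } as *const i64;
        total += unsafe { *arg };
    }
    total
}

fn call_with_pointers(in_arg_cvalues: &mut [i64]) -> i64 {
    // A second array holding the address of every element of the first one.
    let ffi_arg_pointers: Vec<*mut c_void> = in_arg_cvalues
        .iter_mut()
        .map(|value| value as *mut i64 as *mut c_void)
        .collect();

    // From here on the borrow checker cannot help: nothing stops the callee
    // from keeping these pointers after the values have gone away.
    unsafe { sum_through_pointers(ffi_arg_pointers.as_ptr(), ffi_arg_pointers.len()) }
}

fn main() {
    // A fixed-size array lives on the stack, much like the g_newa() allocation.
    let mut values: [i64; 3] = [1, 2, 3];
    println!("total = {}", call_with_pointers(&mut values));
}
```

Note that the pointer array here is a Vec, so it is heap-allocated; the values themselves stay on the stack.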

A Converted Function

I did convert one function in gi/function.cpp, one that simply returns a *const c_char. Let's discuss that:

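use std::ffi::{CStr, CString};
use std::os::raw::c_char;
// Function and the g_base_info_* bindings come from the project's own FFI bindings (not shown).

/// Copy a C string owned by GObjectIntrospection into a Rust String.
unsafe fn copy_c_str(ptr: *const c_char) -> String {
    unsafe { CStr::from_ptr(ptr) }.to_string_lossy().into_owned()
}

#[no_mangle]
pub extern "C" fn format_function_name(function: &Function, is_method: bool) -> *const c_char {
    // The raw pointers must be converted first: formatting them with {:?}
    // would print their addresses rather than the text they point to.
    let (namespace, name) = unsafe {
        (copy_c_str(g_base_info_get_namespace(function.info)),
         copy_c_str(g_base_info_get_name(function.info)))
    };
    let formatted = if is_method {
        let container = unsafe {
            copy_c_str(g_base_info_get_name(g_base_info_get_container(function.info)))
        };
        format!("method {}.{}.{}", namespace, container, name)
    } else {
        format!("function {}.{}", namespace, name)
    };
    // into_raw() gives up ownership without freeing (replaces as_ptr() + mem::forget()).
    CString::new(formatted).unwrap().into_raw()
}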

This function:

  • is exported by Rust in the C style for use in C code
  • takes a &Function, which on the C side is a pointer, converted implicitly to an immutable Rust borrow
  • takes a bool by value
  • returns a string, in C a const char *

This is very straightforward: it illustrates that Rust can deal with pointers fine, and can use its safety to restrict how we use that pointer within the Rust code. But there are some complex things going on here.

format! is a Rust macro which functions much like any string formatter. In this case it takes two or three arguments, all of which come from unsafe calls to GObjectIntrospection functions that return a *const gchar (gchar corresponds to the standard C char type). format! does its thing, the result is wrapped in a CString, and a raw pointer to its buffer (a *const c_char, which is *const i8 on x86) is returned without letting Rust free the string. Nice and simple!

But who owns the string, and how do we make sure it is freed? This is another thing that bogs us down in the pointer soup: in the above example, Rust no longer owns the resulting string once it is passed out as a raw pointer, so we now need to track ownership manually and make sure the string is freed.

In this case, I've used the existing utility type GjsAutoChar (defined in gjs/jsapi-util.h), which is a wrapper around unique_ptr specifically for freeing C strings. Its destructor calls g_free() on the char * when the variable goes out of scope.
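
One caveat worth sketching: a string allocated by Rust's CString is meant to be released by Rust again (via CString::from_raw) rather than by a C-side free function, since the two sides are not guaranteed to share an allocator. A common pattern is to export a matching free function next to the one that creates the string. The names example_name_new and example_name_free below are hypothetical and are not part of GJS.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

// Hands a heap-allocated, NUL-terminated string to the caller, who now owns it.
pub extern "C" fn example_name_new(is_method: bool) -> *mut c_char {
    let text = if is_method { "method Gtk.Window.show" } else { "function Gtk.init" };
    CString::new(text).unwrap().into_raw()
}

// Counterpart of example_name_new: takes the pointer back so that Rust's own
// allocator frees it. Contract: `ptr` must come from example_name_new and
// must not be used again afterwards.
pub unsafe extern "C" fn example_name_free(ptr: *mut c_char) {
    if !ptr.is_null() {
        drop(unsafe { CString::from_raw(ptr) });
    }
}

fn main() {
    let ptr = example_name_new(true);
    // Copy the text out before giving the pointer back.
    let seen = unsafe { CStr::from_ptr(ptr) }.to_str().unwrap().to_owned();
    unsafe { example_name_free(ptr) };
    println!("{}", seen);
}
```

On the C++ side, a unique_ptr with a custom deleter that calls the exported free function would give the same scope-based cleanup as GjsAutoChar.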

However, in a full Rust conversion format_function_name would own the string until it is moved, that is, until the return value is assigned to a binding in the caller - we would simply create a Rust String and return that to the caller. If we were to return the Rust equivalent of a pointer - a borrow (&) - then we would need to annotate the function signature with lifetimes, so that the compiler can check the borrow never outlives the data it points to. You can see runnable code examples of this at Rust by Example.
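
A small sketch of both options, assuming a pure-Rust caller. format_name and namespace_of are hypothetical helpers written for this illustration; they take plain &str arguments instead of calling GObjectIntrospection.

```rust
// Owned return value: the String is moved out to the caller, which drops it
// when its binding goes out of scope. No manual free is needed.
fn format_name(namespace: &str, name: &str, is_method: bool) -> String {
    let kind = if is_method { "method" } else { "function" };
    format!("{} {}.{}", kind, namespace, name)
}

// Borrowed return value: the lifetime 'a ties the result to `full`, so the
// compiler rejects any use of the returned slice after `full` is gone.
fn namespace_of<'a>(full: &'a str) -> &'a str {
    full.split('.').next().unwrap_or(full)
}

fn main() {
    let owned = format_name("Gtk", "init", false);
    let ns = namespace_of("Gtk.Window");
    println!("{} / {}", owned, ns);
}
```

In this simple case the lifetime could be elided, since there is only one input reference; it is written out here to show what the compiler is tracking.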

TL;DR: gi/function.cpp was a wonderful candidate for a rewrite. But in this case it was an all-or-nothing rewrite of 1500+ lines - a partial rewrite would have ended in much continuous refactoring - and that was not something I could have achieved in the remaining GSoC time-frame after research, wiring things up, rewriting some external dependencies and contributing to the tools required.

Another example: a function in GJS creates a GObject through the FFI layer. The object is allocated, and a pointer to it is given to the caller. In Rust we can convert this pointer so that we take ownership of the data, or just own the pointer, but then there is the issue of having to pass a raw pointer to the data to other functions - at which point we can no longer guarantee safety. Once the pointer is out of Rust's hands, all guarantees the Rust compiler gives us are lost.

In this example there is also the matter of having to free the GObject. If we owned the data in Rust, and we're not passing raw pointers, then we can perhaps call the correct freeing mechanism on it at the end of the function that created it, or implement the Drop trait for it, which does this automatically once the data goes out of scope. And what if you've passed a raw pointer to it to another FFI function? You get the possibility of use-after-free.
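
The Drop approach can be sketched as follows. Everything here is a hypothetical stand-in: foreign_new and foreign_unref play the role of a C library's allocate and release functions (implemented in Rust so the example runs on its own), and Owned is the wrapper that takes ownership of the raw pointer.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical stand-in for a C library: it allocates objects, hands out raw
// pointers and expects foreign_unref() to be called exactly once per object.
static LIVE_OBJECTS: AtomicUsize = AtomicUsize::new(0);

pub struct ForeignObject {
    pub width: i32,
}

fn foreign_new(width: i32) -> *mut ForeignObject {
    LIVE_OBJECTS.fetch_add(1, Ordering::SeqCst);
    Box::into_raw(Box::new(ForeignObject { width }))
}

unsafe fn foreign_unref(ptr: *mut ForeignObject) {
    drop(unsafe { Box::from_raw(ptr) });
    LIVE_OBJECTS.fetch_sub(1, Ordering::SeqCst);
}

fn live_objects() -> usize {
    LIVE_OBJECTS.load(Ordering::SeqCst)
}

// Owning wrapper: the only way to obtain one is new(), and Drop releases it.
struct Owned(*mut ForeignObject);

impl Owned {
    fn new(width: i32) -> Owned {
        Owned(foreign_new(width))
    }

    fn width(&self) -> i32 {
        unsafe { (*self.0).width }
    }

    // Escape hatch for FFI calls. If the callee keeps this pointer past the
    // lifetime of the wrapper, the result is a use-after-free.
    fn as_ptr(&self) -> *mut ForeignObject {
        self.0
    }
}

impl Drop for Owned {
    fn drop(&mut self) {
        unsafe { foreign_unref(self.0) }
    }
}

fn main() {
    {
        let window = Owned::new(640);
        println!("width {}, live {}", window.width(), live_objects());
        let _raw = window.as_ptr(); // safe to obtain, risky to store
    }
    println!("live after scope: {}", live_objects());
}
```

The as_ptr method is where the guarantees end: the wrapper frees the object on drop regardless of whether some other function is still holding the raw pointer.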

As you can see there are a number of things to consider:

  • data creation and ownership
  • freeing the data created outside of Rust
  • whether or not a raw-pointer passed from Rust to another function is being held after drop

Other Issues

SpiderMonkey's Rust bindings are seeing rather active development, as they are used by the next-gen browser engine, Servo, which is written in Rust. Incidentally, this actually makes a rewrite of GJS in Rust much easier - except for the fact that this development happens on the HEAD branch of mozjs. This meant that I needed to port GJS to mozjs-55. I did back-port the Rust bindings to mozjs-52 (ESR), but this was going to prove more hassle than it was worth unless I could dedicate time to maintaining it.

To Summarize:

Piecemeal rewrites are absolutely possible, and you don't need to look far to find projects doing this (librsvg comes to mind). But in my case, I was setting myself up to be continuously rewriting code as safety was introduced, and potentially getting very tangled up in pointers.

Because of everything outlined above and in previous sections, the goal of the project had to change so that I could in fact produce something useful - in a new branch I started to:

  1. introduce a wrapper to use unique_ptr for some GLib types that need g_free to release memory
  2. wrap more GLib types for use with unique_ptr in a similar fashion to GjsAutoChar - for example GjsAutoBytesUnref
  3. where possible, convert functions to take references instead of pointers
    • this will not be an easy task, as in some cases ownership is not clear, and in many cases these references need to be passed to GLib functions as pointers - meaning that this can be a pointer to a pointer, and if the reference is a unique_ptr this can't be done.

The above is a work in progress available here.

This is only part 1 of my final GSoC report; part 2 is actively being worked on and will be available soonish. In part 2 I try to discuss the use of unique_ptr, move semantics, ownership, and how this can affect a codebase such as GJS, which relies on many older C idioms and C libraries.

I have also published my GSoC Summary at this permanent link.