23 July 2007

A Question (and an Answer, sort of) About Scheme FFIs

Recently, I received the following email from Raoul Duke:

hi, I'm trying to learn up on what Scheme has the easiest FFI for talking to C (more ideally C++, but I assume nobody really does that). I've been reading up and it seems like PLT isn't bad. I came across your blog - might you have any opinions or recommendations about PLT vs. Chicken vs. Bigloo vs. Gambit vs. etc.? many thanks!

Raoul generously allowed me to post his original mail here along with my response, which I hope will be useful to others with similar questions. I believe I have my facts straight (I have some experience with this), but please correct me if I've made a mistake. Also, feel free to post your opinions on this topic in the comments! My response:

Raoul,

I do have extensive experience with the FFI from PLT Scheme, and some experience with Chicken's, Bigloo's and Gambit's. Do you mind if I post this message and your original request to my blog (unless you request otherwise, I would leave out your email address, but include your name)? I think this is something that many other people would be interested in, too.

Essentially, the FFIs split into two types:

PLT: Access everything dynamically from Scheme (i.e. at runtime).

Chicken, Bigloo, Gambit: Write some Scheme code which, when *compiled*, generates C stubs that wrap a given C library.

(Chicken has a dynamic FFI, too, in the lazy-ffi egg, but most people don't use it.)

Personally, I favor the PLT approach. The following is all you need to do, for example, to use the ever-popular daxpy from the BLAS library (if you really want to do this, you should check out the plt-linalg PlaneT package):

(module demonstrate-ffi mzscheme
  (require (lib "foreign.ss"))

  (unsafe!)

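  ;; Argument order: n, alpha, x, incx, y, incy.
  ;; Note: _int and _double* pass values, not pointers, so this type
  ;; assumes the library exports a by-value C entry point named daxpy;
  ;; a Fortran-style BLAS symbol would expect pointer arguments instead.
  (define daxpy
    (get-ffi-obj 'daxpy (ffi-lib "libblas")
                 (_fun _int _double* _f64vector _int _f64vector _int -> _void))))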

Run this in DrScheme under the "module" language, and you've got daxpy. Of course, since you're doing all this binding at runtime, it's really easy to do all sorts of fancy stuff because you're in the Scheme world the whole time. See this paper for a discussion of the design that went into PLT's FFI. I cannot emphasize enough how great it is to be able to stay in the Scheme world when importing C functions.

The drawback: it doesn't do C++ at all. To bind to C++, you will have to write some code with extern "C" linkage to produce stub functions which you can bind using the FFI. That is possible to do, but it could be a pain, depending on how much you want to wrap. I think the PLT guys themselves did this with wxWindows in order to get their MrEd graphical framework; you might want to ask on the mailing list about this.
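
To make the extern "C" idea concrete, here is a minimal sketch of what such stubs might look like. The Counter class and the counter_* function names are invented purely for illustration and do not come from any real library; the point is only that each stub is a flat function working on an opaque pointer, which is something a C-only FFI can bind.

```cpp
// Hypothetical example: Counter and the counter_* stubs are made-up names.
#include <cassert>
#include <new>

// An ordinary C++ class. A C-only FFI cannot call its members directly,
// because member functions have mangled names and an implicit `this`.
class Counter {
public:
    explicit Counter(int start) : value_(start) {}
    int increment(int by) { value_ += by; return value_; }
    int value() const { return value_; }
private:
    int value_;
};

// Flat stubs with C linkage. Each one takes or returns an opaque pointer,
// so the foreign side only needs plain pointer and int types.
extern "C" {

// Returns a new handle, or a null pointer if allocation fails
// (nothrow is used so that no exception crosses the C boundary).
void* counter_new(int start) {
    return new (std::nothrow) Counter(start);
}

// Adds `by` to the counter and returns the updated value.
int counter_increment(void* handle, int by) {
    assert(handle != nullptr);
    return static_cast<Counter*>(handle)->increment(by);
}

// Returns the current value without changing it.
int counter_value(void* handle) {
    assert(handle != nullptr);
    return static_cast<const Counter*>(handle)->value();
}

// Destroys a handle obtained from counter_new.
void counter_delete(void* handle) {
    delete static_cast<Counter*>(handle);
}

}  // extern "C"
```

Compiled into a shared library (with something like g++ -shared -fPIC), stubs of this shape could then be bound from PLT with get-ffi-obj, using _pointer for the handle and _int for the integers, in the same way as daxpy above. The cost is one hand-written stub per method you want to reach.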

The alternative approach (used by Chicken, Bigloo, and Gambit) is not without advantages. Chicken, for one, can bind directly to C++ (at least to a large subset of it); it imports C++ classes as tinyclos objects. Chicken also provides a really nice C/C++ parser (see the easyffi egg) that will automatically generate a lot of your bindings for you, which is something that PLT Scheme will not do. Bigloo has a similar tool called cigloo (though it's probably now named bglcigloo), which parses C header files and outputs the proper foreign declarations. I've had more trouble with it than with the Chicken tools, because it gets confused easily. For both, I have found it's better to pipe or copy and paste only the pieces of the header you need into these tools so they don't get confused.

Gambit doesn't have an FFI parser (or at least I haven't found one for it); you just write the input and output types in a (c-lambda ...) special form.

Since Chicken, Bigloo, and Gambit all require you to compile the FFI declarations before you can use them, you'll have to make sure that you have all the proper include and linking flags on the command line in order to make your library work, which can be a big drawback. In PLT, the only thing you need is to find the actual shared object file(s) which make up your library. You can do this using the extensive file-searching and existence-testing code in PLT, and you can do it at runtime, so it's easy to keep searching for the file in lots of strange places, in case your users don't always put their libraries in /usr/local or whatever.

In short: I heartily recommend the PLT approach if you're willing to write some extern "C" stubs for your C++ code. If you're not, then I recommend Chicken, as it can (probably) handle the C++ natively. The only reason I see to use Bigloo or Gambit is if you require serious speed in the Scheme part of your code; in that case, use Bigloo if you are going to write C++ style code in Scheme, and Gambit if you need things like call/cc.

Hope this helps! Please feel free to ask more questions if any occur to you.

Will

5 comments:

Jens Axel Søgaard said...

Nice post. This is just a quick comment to say that the two types of FFI aren't an either-or. PLT Scheme also supports the "good old" way:

PLT FFI Docs

James said...

But when doing anything complex with FFI's you almost always need to provide a translation layer that not only ensures valid type conversion, but usually provides higher-level functions that make it much easier to use the library from Scheme.

I've never used PLT's FFI, but I have used Gambit's and I love it. You almost never load in a shared object in your FFI either; you link to it statically and create your own Scheme shared object which you deploy with your application. That also lets you compile your FFI and use it dynamically from the REPL.

Will Farr said...

@James: you say "But when doing anything complex with FFI's you almost always need to provide a translation layer," and it's definitely true. One of my favorite things about PLT is that the syntax for describing function types allows you to put the translation layer (for the typical cases) right into the type definition. For example, I have a bunch of functions which take a pointer to the beginning of an array of objects. On the Scheme side, such pointers are represented as a type-safe "cvector"; on the C side, you need to pass both the pointer and the length of the array. Here's the relevant function type description:

(_fun (vec : _cvector) (_ulong = (cvector-length vec)) -> _double)

Basically, the FFI lets you name arguments to _fun's as you go along, and then construct values for other arguments (or even return values) from those names. The generated Scheme wrapper accepts only one argument, while the C code gets two. No need to wrap the base C layer and then manually write your own translation on top of it!

James said...

Ah, I get what you are saying now (I never have looked at PLT's mechanism). As neat as that is, you could do that in Gambit like so:

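;; c-lambda takes the argument types first, then the return type.
;; cvector here is a placeholder for a suitably declared C type.
(define %%compute
  (c-lambda (cvector int) int "compute"))

(define %compute
  (lambda (v) (%%compute v (vector-length v))))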

It may be slightly more verbose, but I like it in this case because I think it's clearer.

offby1 said...

I'm pretty sure Guile has an FFI, and I'd bet it's easy to use.

And don't overlook Java -- SISC might not have an FFI, strictly speaking, but it does (so I hear) let you get at all the goodies in the various Java libraries, so it's perhaps as useful.