08 January 2009

Profiling Using MzScheme and Errortrace

Lately I've needed to do some profiling in PLT Scheme, and the usual process in DrScheme hasn't been working for me. If you don't know, here's the usual process:

  1. Language --> Choose Language
  2. Show Details
  3. Debugging and Profiling
  4. Run Program
  5. View --> Show Profile
Robby suggested using the errortrace library directly from mzscheme. It worked perfectly. Here's how you can duplicate my success. (More information is available in the errortrace docs.)

First, get your program (or the parts of it you want to profile) in shape to be run easily from the REPL. I already have a run-tests.ss module which runs all my tests and exercises the code in a pretty realistic way, so I just entered (require "run-tests.ss"); you should do something similar. Then:

  1. Fire up mzscheme:
    mzscheme -i -l errortrace
    
    The -i tells mzscheme to start in interpreter mode; the -l errortrace says "load errortrace first thing".
  2. Set a couple of parameters for errortrace:
    (instrumenting-enabled #t)
    (profiling-enabled #t)
    (profiling-record-enabled #t)
    
  3. Run your program. For example:
    (require "run-tests.ss")
    
  4. Output the profile information:
    (output-profile-results #t #t)
    
    Or:
    (with-output-to-file "profile.dat"
      (lambda () (output-profile-results #t #t)))
    
    The two #ts tell errortrace to output the path-to-function-source information and to sort in terms of total time taken.
And that's it! I now know where my slow program is spending its time (and I think I even understand why). Fixing it is another matter, though....

03 December 2008

Interpolation in OCaml

I just wrote some code to do polynomial interpolation in O'Caml. The full darcs repository is at http://web.mit.edu/farr/www/onumerics/; here's the guts of the code. Eventually (depending on my time and level of continued interest), this may turn into a more general O'Caml numerical library. As usual, it's released under the GPL.

(*  interp.ml: Interpolation.   
    Copyright (C) 2008 Will M. Farr 

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program.  If not, see .
*)

let interp xs ys x = 
  let n = Array.length xs in 
  assert(Array.length ys = n);
  let ytab = Array.make n ys in 
  for i = 1 to n - 1 do 
    let m = n - i in 
    let last_col = ytab.(i-1) and 
        this_col = Array.make m 0.0 in 
    for j = 0 to m - 1 do 
      let xlow = xs.(j) and 
          xhigh = xs.(j+i) and 
          ylow = last_col.(j) and 
          yhigh = last_col.(j+1) in 
      this_col.(j) <- (x-.xlow)*.yhigh/.(xhigh-.xlow) +. (x-.xhigh)*.ylow/.(xlow-.xhigh)
    done;
    ytab.(i) <- this_col
  done;
  ytab.(n-1).(0)

Here's how it works. Initially, we have two arrays of points, xs and ys, and a point at which we want to evaluate the interpolated polynomial, x. This means we have an array of 0-order interpolating polynomials (the constants ys). We can use a linear interpolation between neighboring points to construct the values of first-order interpolating polynomials at x. Then we can linearly interpolate between the neighboring first-order polynomials to construct second-order interpolating polynomials. We iterate this process until we have a single value, which is the value of the n-point interpolating polynomial at x. In the case of four points, here is a graphical representation of the procedure:

x0: y0
       y01
x1: y1     y012
       y12      y0123
x2: y2     y123
       y23
x3: y3 
The y's subscripts indicate the points involved in the interpolation. Each y is a linear interpolation of the y values above and below it to the left at the corresponding x points.

11 June 2008

Very Basic Matrix Library for PLT Scheme

I just uploaded to PLaneT a very simple matrix library. The library provides a matrix datastructure, iterators over matrices and vectors, algebraic operations on matrices, and matrix-vector and matrix-matrix products. There is also an initial module import language which provides all of scheme except the +, -, * and / operations, which instead are generalized for all reasonable combinations of scalar, vector and matrix operands.

I feel like there is a need for this sort of library in PLT Scheme, even though its set of operations is pretty impoverished. Maybe people can build on it to do actual linear algebra in Scheme. I always find myself re-implementing these sorts of trivial matrix operations (since they're simple, it doesn't take very long); maybe now I'll just grab the PLaneT library instead.

By the way, I'm particularly proud of the sequence nature of the matrix struct, and the for/vector, for/matrix, and in-matrix forms. These take advantage of the new machinery for user-defined iterations and comprehensions in PLT Scheme version 4 (have a look at the source code if you're interested in how it's done). (Version 4, by the way, represents a tremendous improvement from version 3, in every part of the system, but most particularly in the documentation tools---have a look at the new documentation for my library here to see what I mean.)

07 September 2007

Quick Note on Continuations to comp.lang.scheme

I just posted a quick note on continuations to c.l.s. I'm afraid it's pretty basic, and doesn't really do the topic justice, but maybe the original poster will find it helpful. I'm noting it here because no ever seems to explain continuations the way I did in my note (at least, I haven't seen any explanations like this), which uses "return points" to illustrate what continuations do. I don't fully understand continuations (in particular, I have a lot of trouble with the delimited continuations that PLT Scheme has recently acquired), but thinking of them the way I talk about in the c.l.s. post is the way I manage to muddle through using them. How do others think about call/cc? (i.e. what images are going through your head while you figure out how to do what you want with call/cc in code, not how do you formalize call/cc, or how you explain it to someone else, or ....)

15 August 2007

One More Reason to Love Module Systems and Hygiene

I just released an Automatic Differentiation PlaneT Package, and am now converting my numerical relativity code to use it. The package exports procedures like +, -, =, etc, which are modified from the standard ones so it can compute derivatives of numerical procedures which use them. Of course, these modified procedures are slower than their standard cousins.

My numerical relativity code uses all sorts of looping constructs from the excellent SRFI-42. These loops iterate over vectors, over ranges of numbers, lists, matrices, .... The iteration macros expand to code like (this is just an illustration---the real code is more complicated than this)

(let loop ((vector-index 0))
  (if (>= vector-index (vector-length vector))
      'done
      (let ((vector-elt (vector-ref vector vector-index)))
        <body>
        (loop (+ vector-index 1)))))
You'll note that this code uses +, >=, etc. It is acceptable for these functions to be the addition and comparison from my PlaneT package, since that package preserves the meaning of the numerical functions operating on pure numbers. But, it would be nicer if the + and >= came from the mzscheme language module. In this case, the compiler could inline the addition and comparison, and even if it didn't there wouldn't be the differentiation overhead.

No problem! PLT Scheme's module system takes care of this automatically: SRFI-42 is defined in the mzscheme language, so all references to + and >= in its macros refer to those bindings from mzscheme. So, even if the module using the SRFI-42 macros is written in a different language (like "deriv-lang.ss"), the expansion of these macros uses the correct procedures.

This is a combined effect of of hygiene and the PLT module system. Neither alone would reproduce the correct behavior. For example, in schemes without a module system which handles macros properly, it would be necessary to re-define + and >= at the top level in the deriv.plt package. Then, the hygiene of the macro expander would ensure that the + and >= in the macro expansion of SRFI-42 iterators referred to the top-level bindings (even if there were local bindings for these symbols at the macro use-site), but these would be the deriv-enabled + and >=, not the usual, optimizable mzscheme ones.

In short: I know of no other module system that would handle this so elegantly (in a current scheme system, anyway---the proposed R6RS module system would do just as well for this). (In fact, I think only the module system of Chez scheme, immortalized in the portable syntax-case expander by Dybvig and Waddell handles this at all. Unfortunately, it requires you to list the local variables which a macro can expand into manually along with the usual exports of the module, which is not nearly as convenient.) Hooray for PLT!

Automatic Differentiation PlaneT Package Released

I just released a PlaneT Package which, when used as a PLT Scheme module's initial language import, allows to compute derivatives of functions written in the module exactly. (I've written about the technique here and in the posts which follow.) I'm going to use this code to compute some derivatives of gravitational actions in code which would be extremely complicated to do by hand. Enjoy!

08 August 2007

Persistency and Lazy Memoization in a Pairing Heap

I just released a pairing heap PlaneT package. Because the datastructure is both persistent and has amortized space bounds, it requires a neat trick to implement, which I will discuss below.

A pairing heap is an adaptive datastructure for a collection of objects which supports the following operations [edit: You can also read about an alternative implementation here in the Scheme Cookbook. This implementation doesn't achieve the amortized bounds on remove-min which mine does in the presence of persistence]:

Operation Time Bound
min O(1)
insert O(1)
merge O(1)
remove-min O(n) worst case, O(log(n)) amortized.

The pairing heaps in my PlaneT package are persistent, which means that the above operations do not change the heap given to them, but rather construct a new heap which shares some structure with the old heap. For example, the (remove-min h) procedure returns a new heap which contains all the elements of h except for the smallest element.

Note that the time-bound on remove-min in O(n) worst-case, and O(log(n)) amortized. This means that, while a single remove-min operation can take time which is O(n), most will take much less. In fact, the longer remove-min operations are so rare that, in a sequence of an infinite number of remove-min operations, the average time per operation is O(log(n)). (This bound is only conjectured at the moment, but supported by a lot of experimental evidence. Because the pairing-heap adapts its structure to the use pattern it observes---a really neat feature, by the way---it is extremely difficult to rigorously prove bounds for it.)

It turns out to be difficult to ensure that amortized bounds still obtain with a persistent datastructure. The reason is illustrated in the following code snippet which creates a list of 1000 threads, each of which does something with a heap:

(let ((h (make-expensive-remove-min-heap)))
  (list-ec (:range i 1000)
    (thread
     (lambda ()
       (let ((new-h (remove-min h)))
         (do-something-interesting-with new-h))))))
A persistent datastructure can have multiple futures! But, when proving amortized bounds, one typically exploits that the cost of an expensive operation (remove-min here) can be "accounted" to a sequence of prior cheap operations (basically, make-expensive-remove-min-heap must do a lot of inserts, each of which can "carry" a bit of the cost of the upcoming remove-min operation, so that you know that expensive remove-mins are very rare, and the cost averages out to O(log(n))). That style of accounting (often called the "Banker's Method") works if you know that only one remove-min will occur in the future of a given datastructure, as would be case if we were using ephemeral datastructures---in this case, the first remove-min would destroy the datastructure, so we could only perform it once. But we're not using ephemeral datastructures: each of the 1000 threads created above will re-perform the expensive remove-min on the same h, and we'll be in a pile of trouble, time-bounds-wise.

The solution to this problem is to use lazy evaluation with memoization. When constructing a heap, queue up the remove-min operation using delay. When you need to remove the minimum element, use force. Because delay and force memoize the result, you can guarantee that remove-min operations after the first will complete in O(1) time, which maintains the amortized bounds in the presence of persistence.

This is, of course, not my idea. It (and much more) are explained in the delightful thesis of Chris Okasaki, and also in his reportedly-delightful (I've never read it, but it's very well-regarded) book, Purely Functional Data Structures.

By the way, if you need a bunch of datastructures (more than my simple pairing-heap), please check out the Galore PlaneT library by Carl Eastlund, Jens Axel Soegaard, and Stevie Strickland (either Version 2.2---which contains more datastructures, but also more bugs and isn't as clean---or Version 3.6---which is the most modern version). It's what I use when I need a serious finite map or set.

06 August 2007

I Just Voted "Yes" on R6RS

Yep, I did. I'm including my explanation below, because I think that R6RS has come under a lot of fire lately. (By the way, for those who say, "why doesn't the standard come with a small base, and then lots of libraries attached to it?", it does! <snark>Try reading it, why don't you.</snark>)

Even if you don't agree with my choice, don't forget to vote if you signed up (just one of many reminders you'll surely get until voting closes Sunday)!

My explanation:

I would gladly trade a report full of mistakes---not that I think this is such a report, but some do---for a library system which properly handles hygienic macros. Therefore, I am voting "Yes" on R6RS. (The exception specification is just icing on the cake, as far as I'm concerned.)

Some consolation for those who do think this report is full of mistakes:

  1. I expect we'll be seeing lots of "(r6rs base)"-compliant implementations which leave off libraries they don't like, and
  2. ...in most matters it is more important that the applicable rule of law be settled than that it be settled right.
    --- Burnet v. Coronado Oil & Gas Co., 285 U.S. 393, 406--408 (1932) (Justice Brandeis dissenting).