Here's an easy technique for automatically memoizing the results of method calls in Ruby. Let's say that we're interested in looking up instances of the Employee class by their employee ID numbers via a class method called find_first_by_empid. Furthermore, this lookup might have to hit the disk or the network, so we want to invoke find_first_by_empid at most once for each employee ID.

We might want to just store employees in a hash, indexed by employee ID. If the hash had an entry for a given ID, we'd know that we had already looked up the employee with that ID. If not, we'd know that we needed to look up the employee with that ID and then place the employee object in the hash. This solution is straightforward, but could prove tedious to implement in practice. Fortunately, Ruby provides a mechanism to make this essentially transparent.

When you create a Hash in Ruby, you can create it in one of several ways. You can just call the constructor Hash.new (or {}), in which case attempting to look up a key that is not in the table will return nil. You can provide a single default object as an argument to the constructor, which will be returned instead of nil as the value for keys not in the table. Finally, you can provide a block that describes how to calculate the value for a missing key. This last option is interesting for the memoization application, since it mimics a straightforward implementation of call-by-need.
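A minimal sketch of the block form, assuming a hypothetical Employee class whose find_first_by_empid stands in for the expensive lookup (the class body and the lookup counter are illustrative assumptions, not a real implementation):

```ruby
# Hypothetical stand-in for the Employee class described in the text.
class Employee
  attr_reader :empid

  @lookups = 0

  class << self
    # Counts how many times the "expensive" lookup actually ran.
    attr_reader :lookups

    # Stands in for a lookup that would hit the disk or the network.
    def find_first_by_empid(empid)
      @lookups += 1
      new(empid)
    end
  end

  def initialize(empid)
    @empid = empid
  end
end

# The block runs only when a key is missing; it stores the result in the
# hash, so later lookups of the same ID never reach the block again.
employees = Hash.new do |hash, empid|
  hash[empid] = Employee.find_first_by_empid(empid)
end

employees[42]          # invokes Employee.find_first_by_empid(42)
employees[42]          # served from the hash; no second lookup
puts Employee.lookups  # prints 1
```

Note that the block must assign into the hash itself; a block that only returns the value would recompute it on every miss.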

The above code snippet shows an example of this technique. When we ask employees for an employee ID that it doesn't already contain, it will invoke the block with itself and the missing key as arguments. In executing the block, employees will insert a key-value pair in itself corresponding to the missing key and the result of the method call with that key as an argument.

There are some caveats here. First, you need to make sure that your arguments are hashable. It is also somewhat clunkier to deal with multiple arguments, since they must be in a list. Also, it must be safe to memoize the function: that is, as long as the hash exists, the memoized method must return the same value every time it is invoked with a given argument. This technique isn't perfect (and almost certainly isn't novel), but it is a pretty slick way to keep from repeatedly invoking expensive methods and depends on nothing more than Ruby's standard library.
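To make the multiple-argument caveat concrete, here is a sketch in which the arguments are packed into an array that serves as the hash key. The distance method and the DISTANCE_CALLS log are hypothetical names invented for this example.

```ruby
# Hypothetical expensive method of two arguments (invented for illustration).
DISTANCE_CALLS = []

def distance(x, y)
  DISTANCE_CALLS << [x, y]   # record each real invocation
  Math.sqrt(x * x + y * y)
end

# The key is a single array; the block destructures it into the arguments.
# Arrays hash by content, so equal argument lists map to the same entry.
distances = Hash.new do |hash, (x, y)|
  hash[[x, y]] = distance(x, y)
end

distances[[3, 4]]  # computes distance(3, 4)
distances[[3, 4]]  # reuses the stored value
puts DISTANCE_CALLS.length
```

Because the key is a mutable array, callers should avoid modifying it after the lookup.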

Updates to the Wallaby API


If you've built tools on the Wallaby API, you may be interested in some recent changes to the API; these are currently in source control and will appear in the upcoming 0.4.0 release of the wallaby agent and support libraries.

If you've been using a version of the Wallaby agent prior to 0.3.5, you'll want to note that the Wallaby classes are now in a different QMF package. In older versions, the QMF package name was mrg.grid.config; now, the QMF package name is com.redhat.grid.config. (The Ruby module names implementing Wallaby still begin with Mrg::Grid, but this is unlikely to affect most users.)

If you've been using version 0.3.5 of the Wallaby agent, then you'll want to note that most of the "getter" methods have been replaced by QMF properties. This should simplify most console applications. We've also renamed some methods to make their functionality clearer. The main changes are as follows:

  • The Feature#getName method is now the name property
  • The Feature#getFeatures method is now the included_features property
  • The Feature#modifyFeatures method is now called modifyIncludedFeatures
  • The Feature#getParams method is now the params property
  • The Feature#getParamMeta method is now the param_meta property
  • The Feature#getConflicts method is now the conflicts property
  • The Feature#getDepends method is now the depends property
  • The Group#getMembership method is now the membership property
  • The Group#getName method is now the name property
  • The Group#getFeatures method is now the features property
  • The Group#getParams method is now the params property
  • The Node#getIdentityGroup method is now the identity_group property
  • The Node#getMemberships method is now the memberships property
  • The Parameter#getType method is now the kind property
  • The Parameter#setType method is now called setKind
  • The Parameter#getDescription method is now the description property
  • The Parameter#getDefault method is now the default property
  • The Parameter#getDefaultMustChange method is now the must_change property
  • The Parameter#setDefaultMustChange method is now called setMustChange
  • The Parameter#getVisibilityLevel method is now the visibility_level property
  • The Parameter#getRequiresRestart method is now the requires_restart property
  • The Parameter#getDepends method is now the depends property
  • The Parameter#getConflicts method is now the conflicts property
  • The Subsystem#getParams method is now the params property

For full documentation of the Wallaby API, please see the API documentation link on getwallaby.com.

Here's a cheap and cheerful little script I threw together to automatically generate Markdown-formatted documentation for my QMF methods. I used this to make the Wallaby API documentation.

I'm pleased to announce yesterday's release of SPQR 0.3.0, which is available as source, on github, or as a RubyGem. (SPQR is a library to make it painless to publish Ruby objects over the Qpid Management Framework; more details are available elsewhere on this site.)

There are several new features and enhancements in this release, including the ability to access the current Qpid user and context from within managed methods. The most exciting new functionality, though, is support for QMF events. In order to use QMF events, you'll need a fairly recent version of the Qpid and QMF libraries from their source repository (later than revision 929717 for nearly-complete support; revision 942861 fixes a minor bug that won't affect most users). Once you have those installed, though, SPQR makes it characteristically easy to make QMF events; see the example below for details:

As usual, I welcome your feedback.

Notes on configuration


As most readers of this site know, I've been busy lately working on the Wallaby configuration service, which aims to make it painless to manage configurations for entire Condor pools. In this post, I'm going to discuss some of the issues of application configuration from the other side of the problem: the semantics and interface to the configuration subsystem from within the application itself. These issues and concerns are taken from Condor but are, I suspect, applicable to substantial configurable applications in general. (I am currently collaborating with Pete Keller from Wisconsin on a redesign of Condor's configuration subsystem that should address these problems.)

Problems with application configuration generally fall into several categories: value-gap problems, default value problems, type-safety problems for the configuration parameter values themselves, and type-safety problems of the programmatic interface to the configuration. We'll discuss each of these in turn.

Value-gap problems

Truth-value gaps are a problem of classical logic: we would like some way to reason about propositions that are apparently neither true nor false (e.g. Aristotle's example that "there will be a sea battle tomorrow or there will not be a sea battle tomorrow" is true), but we cannot encode an unknown or indeterminate truth value as "true" or "false." (The three-valued logic Ł3 of Jan Łukasiewicz is one such system that addresses this problem.)

We can generalize the problem of truth-value gaps to many situations that come up in programming, for example:

  • Imagine a sparse array of Java-style references. Does a null value correspond to an explicit null that we are interested in tracking, or merely to the absence of a value that we care about?
  • Consider C preprocessor macros: many a novice C programmer has been frustrated by the distinction between #ifdef FOO and #if FOO --- but only after setting FOO to 0 fails to have the desired effect.
  • Similarly, applications that are configured through environment variables may act on the mere existence of a variable in the process's environment or on the existence of a variable and some property of its actual value.

In Condor, these sorts of problems arising from the configuration subsystem are typically handled in an ad hoc manner. In some places, the mere definition of a configuration parameter is enough to enable a feature; in others, the parameter must be defined and its string representation must include something that corresponds to "true" (one example is "the first character is either 't' or 'T'"). The current configuration API also makes it inconvenient (but not impossible) to determine if a parameter is undefined or set to false; in any case, the burden for sensibly treating value gaps falls to the programmer. (I see an interface that does not depend on programmer discipline to be a win over one that does.)
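One way to take this burden off the programmer is for lookups to return one of three distinct outcomes instead of overloading a single false-like value. The sketch below is illustrative only: ConfigTable and its methods are hypothetical names, not Condor's API, and the truthiness rule is the first-character convention mentioned above.

```ruby
# Hypothetical configuration table (not Condor's actual API).
class ConfigTable
  def initialize(raw)
    @raw = raw  # parameter name => raw string from the configuration
  end

  # True if the parameter appears at all, whatever its value.
  def set?(name)
    @raw.key?(name)
  end

  # Returns :undefined, true, or false, so a missing parameter is never
  # conflated with one that was explicitly set to a false value.
  def boolean(name)
    return :undefined unless set?(name)
    @raw[name].start_with?("t", "T")
  end
end

config = ConfigTable.new("ENABLE_FOO" => "True", "ENABLE_BAR" => "false")
p config.boolean("ENABLE_FOO")  # true
p config.boolean("ENABLE_BAR")  # false
p config.boolean("ENABLE_BAZ")  # :undefined
```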

Default value problems

Configuration parameters should have sensible defaults. Unfortunately, the process of assigning default values may not be straightforward. A default may be not merely a value but the result of evaluating a function at runtime. Alternatively, defaults may depend on context: parameter FOO may have one value in subsystem X on platform Y, but another in subsystem Z or on platform W. (All but the last of these variable names have no deeper significance.)

In Condor, these defaults are handled in two ways. For parameters whose defaults are consistent across subsystems and platforms, the default is specified in the generic configuration file shipped with Condor. (About 275 parameters are given default values in this file.) For parameters that require context-sensitive defaults, the default values are supplied as an extra parameter to configuration API functions at each call site; these may also be conditionally compiled, so that, for example, whether thread-based parallelism is available is defined to be true within the condor_collector on all platforms except Windows, but false everywhere else.

A configuration subsystem that supported specifying rich defaults --- including immediate values as well as expressions that would evaluate to the correct default depending on the context in which they were evaluated --- would free application programmers from the tedious and error-prone work of encoding defaults in every call to param().
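As a rough sketch of what such rich defaults could look like, the registry below accepts either an immediate value or a block evaluated against a context at lookup time. Everything here is an assumption made for illustration: the Defaults class, the parameter names MAX_RETRIES and USE_THREADS, and the shape of the context hash are hypothetical and are not part of Condor.

```ruby
# Hypothetical defaults registry (illustrative; not Condor's param() API).
class Defaults
  def initialize
    @defaults = {}
  end

  # A default is either an immediate value or a block that computes
  # the value from the context when the parameter is looked up.
  def register(name, value = nil, &expr)
    @defaults[name] = expr || value
  end

  # context is a hash such as { subsystem: "collector", platform: "linux" }
  def lookup(name, context = {})
    default = @defaults.fetch(name)
    default.respond_to?(:call) ? default.call(context) : default
  end
end

defaults = Defaults.new
defaults.register("MAX_RETRIES", 3)
defaults.register("USE_THREADS") do |ctx|
  ctx[:subsystem] == "collector" && ctx[:platform] != "windows"
end

p defaults.lookup("MAX_RETRIES")
p defaults.lookup("USE_THREADS", { subsystem: "collector", platform: "linux" })
p defaults.lookup("USE_THREADS", { subsystem: "collector", platform: "windows" })
```

With the context-sensitive logic registered once, call sites no longer need to repeat it.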

Type-safety problems

Popular discussion of types and type safety is fraught with handwaving, imprecision, and nonsense (e.g. the ridiculous and contradictory appellation "dynamically typed," which is often applied to untyped languages). For the remainder of this discussion, I will be using definitions adapted from Luca Cardelli's survey chapter on type systems: namely, a type is the upper bound on a range of values, and a typed language is one in which nontrivial types can be ascribed to variables.

It should be clear that imposing a sufficiently expressive type system on configuration variables is generally desirable. A sufficiently expressive type system for an application like Condor would include not merely the classic types of low-level languages (e.g. unsigned int, boolean, character string, etc.), but also types that encode application-specific information (e.g. email addresses, hostnames, Pascal-style ranges, typed dictionaries). However, since configuration variables in Condor may be defined in terms of macro-expanded values or (in the case of some parameters) as ClassAd expressions to be evaluated later, it may be difficult to typecheck configuration variables a priori.
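For parameters whose values are plain strings (that is, setting aside macro expansion and ClassAd expressions), application-specific types can be sketched as checkers that either convert a raw string or reject it. The ParamTypes module below is hypothetical, and its hostname pattern is a deliberate simplification rather than a complete validator.

```ruby
# Hypothetical parameter types (illustrative; names are not from Condor).
# Each checker converts a raw string to a value or raises ArgumentError.
module ParamTypes
  def self.unsigned_int
    lambda do |raw|
      raise ArgumentError, "not an unsigned integer: #{raw.inspect}" unless raw =~ /\A\d+\z/
      raw.to_i
    end
  end

  # Pascal-style range: an integer restricted to low..high.
  def self.range(low, high)
    lambda do |raw|
      value = Integer(raw, 10)
      raise ArgumentError, "#{value} is outside #{low}..#{high}" unless (low..high).cover?(value)
      value
    end
  end

  # Simplified hostname check: dot-separated alphanumeric labels.
  def self.hostname
    lambda do |raw|
      label = /[a-z0-9]([a-z0-9-]*[a-z0-9])?/i
      raise ArgumentError, "not a hostname: #{raw.inspect}" unless raw =~ /\A#{label}(\.#{label})*\z/
      raw.downcase
    end
  end
end

port = ParamTypes.range(1, 65535)
p port.call("9618")
p ParamTypes.hostname.call("Collector.Example.COM")
begin
  port.call("70000")
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```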

Another type-safety problem deals with how (or how frequently) values are computed. Some configuration parameters have fixed values for the life of a process, but others (e.g. "the current time" or "the result of taking a random element from this list") may change every time that they are evaluated. The idea of using type systems to track how an expression is evaluated (and not merely the shape of its result) is not new: Lucassen and Gifford proposed effect systems for functional languages in 1988, and computer music languages like Max/MSP and SAOL have long distinguished between expressions that are evaluated once and expressions whose values could change at various intervals (like control signals, which can change hundreds of times per second, or audio signals, which vary tens of thousands of times per second). In the context of configuring Condor, it is sufficient to track whether a parameter's definition is to be evaluated (at least) once or every time its value is requested from the configuration subsystem. The values of parameter definitions that need only be evaluated once can be memoized; currently, Condor's configuration subsystem doesn't support this sort of type information and any memoization is performed on an ad hoc basis in application code.
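The once-versus-every-time distinction can be sketched as a parameter definition that carries its evaluation policy with it. ParamDef and the :once and :always policy names are hypothetical, chosen only for this illustration.

```ruby
# Hypothetical parameter definition that records how often it should be
# evaluated (illustrative; not part of Condor).
class ParamDef
  def initialize(policy, &definition)
    raise ArgumentError, "policy must be :once or :always" unless [:once, :always].include?(policy)
    @policy = policy
    @definition = definition
    @evaluated = false
  end

  def value
    return @definition.call if @policy == :always
    # :once parameters are memoized after the first evaluation; the flag
    # lets nil and false be cached like any other value.
    unless @evaluated
      @cached = @definition.call
      @evaluated = true
    end
    @cached
  end
end

counter = 0
fixed   = ParamDef.new(:once)   { counter += 1 }
varying = ParamDef.new(:always) { counter += 1 }

p [fixed.value, fixed.value]      # [1, 1]
p [varying.value, varying.value]  # [2, 3]
```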

One of the main motivations for improving Condor's configuration API is to reduce the potential avenues for developer error. Adding a typed interface to typed configuration data greatly simplifies application code that uses the configuration subsystem and eliminates ad hoc, per-call-site typechecking of configuration values. (It shouldn't be acceptable to support a high-level configuration API in 2010 that might require clients to manually coerce a user-supplied string into a value of some other type!) A typed interface could also automatically handle memoization of non-varying configuration variables, and provide methods to inspect whether and where parameters are defined, rather than merely returning a value (that might represent a value gap).

Going forward

As I mentioned earlier, these issues are the subject of active design and development. Currently, some of our notes on our solution to these problems in Condor are in a ticket on the Condor wiki; ideally, an implementation will be part of a developer release of Condor in the next few months. Of course, I welcome feedback on these ideas and on the concerns that they are meant to address.

At Condor Week



Like many of my colleagues on the MRG Grid team, I've been at Condor Week 2010 this week. Later today, I'll give a talk about some of the configuration management software we've been developing; for more details, you can visit getwallaby.com.

Introducing capricious


Last week I needed a good random number generator to make repeatable stress tests for a Ruby project. Ruby's standard library includes a good random number generator (the Mersenne Twister), but its implementation is unsuitable for use in simulations and testing because the generator state is global to the program. (For simulations, it is usually nice to have multiple independent generators, possibly seeded on different values, that coexist in the same program.)

I couldn't find one I liked, so I wrote my own. I'm pleased to announce the immediate availability of capricious (as source or as a RubyGem). capricious is a framework to manage randomness for simulations and tests; it includes an interface for pseudorandom number generators, a family of probability distribution simulators (currently uniform, normal, exponential, Erlang, and Poisson, but it is easy to add more), and a class that can track aggregate information (e.g. min, max, mean and variance) about a stream of samples in constant space. capricious also includes two pseudorandom number generators: a simple linear-feedback shift register and one of George Marsaglia's multiply-with-carry generators. Each of the distribution simulators is parameterized on a generator class as well as on a seed and any distribution parameters, so it is easy to evaluate new generators for different applications.
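Tracking min, max, mean, and variance in constant space is commonly done with Welford's online algorithm. The sketch below shows that general technique; it is not taken from capricious, and RunningStats is a hypothetical name.

```ruby
# Constant-space running statistics using Welford's online algorithm.
# Illustrative only; this is not the class shipped with capricious.
class RunningStats
  attr_reader :count, :min, :max, :mean

  def initialize
    @count = 0
    @mean = 0.0
    @m2 = 0.0   # running sum of squared deviations from the mean
    @min = nil
    @max = nil
  end

  # Folds one sample into the aggregates without storing it.
  def <<(sample)
    @count += 1
    @min = sample if @min.nil? || sample < @min
    @max = sample if @max.nil? || sample > @max
    delta = sample - @mean
    @mean += delta / @count
    @m2 += delta * (sample - @mean)
    self
  end

  # Sample variance; nil until at least two samples have been seen.
  def variance
    @count < 2 ? nil : @m2 / (@count - 1)
  end
end

stats = RunningStats.new
[2, 4, 4, 4, 5, 5, 7, 9].each { |x| stats << x }
puts stats.mean
puts stats.variance
```

Updating the mean incrementally avoids the loss of precision that the naive sum-of-squares formula can suffer on long streams.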

I will write more about some of the interesting technical details behind capricious soon; in the meantime, I welcome feedback, contributions, or bug reports.

SPQR update


I released version 0.1.2 of SPQR this morning; it is available from gemcutter (as an installable gem package) or from fedorahosted.org (as source). This version contains many enhancements and fixes when compared to the versions described in my previous SPQR posts. I've noticed that people are still installing older versions of SPQR and would like to encourage everyone who's interested in using SPQR to adopt the most recent version.

The 0.1.x series and the 0.0.x series are not API-compatible, but the incompatibility is for a good reason: it enables SPQR to expose normal Ruby methods (that is, those that don't use keyword arguments). To see what I mean, compare the "Hello, world" example from v0.1.2 with its counterpart in v0.0.4. (The former is also embedded below, but you might have to click through to the post to see it if you're reading this in a syndication feed.)

Here are some other enhancements to the code since the last time I mentioned SPQR:

  • Many fixes and enhancements to SPQR/Rhubarb (simple object-graph persistence) integration; most glue methods are now automatically generated at runtime.
  • Improved code generation from XML QMF schema files, including support for generating classes that are both exposed over QMF (with SPQR) and persistent (with Rhubarb).
  • A stable and mostly-repeatable test suite. (I hope to write up the process of developing unit tests for QMF agents at some point in the near future.)
  • Enhancements to the app skeleton, which can now specify the broker that manages the QMF bus.
  • Many fixes and stability enhancements, most of which have come out of my experience with a substantial SPQR/Rhubarb application I'm developing.

The full list is here. Enjoy, and please don't hesitate to write with questions.

Two brief SPQR updates


Here are two quick notes (and a bonus meta-note) about the quickly-evolving SPQR project:

  1. SPQR now includes a gem target (courtesy of jeweler). Pull a recent version and sudo rake install to try it out. Once the project has stabilized a little more, I will publish gems to gemcutter.
  2. I'm mirroring the fedorahosted SPQR repository on github; if you're already a github user, you might rather follow the repository there.

I'm pretty excited about SPQR, but this site won't become "all SPQR, all the time." Since interested hackers can easily follow the day-to-day project status on github or fedorahosted, I will reserve future SPQR-related blog posts for substantial announcements.

In a previous post, I introduced SPQR and presented a couple of examples of how one could use SPQR to publish Ruby objects over QMF. Sometimes, though, you aren't starting from an application -- instead, you're starting from an XML QMF schema document. SPQR includes spqr-gen, a tool designed to automatically generate a skeleton SPQR application from a QMF schema. In this post, we'll see an example of spqr-gen in action.

First, let's look at a simple QMF schema for a class that exposes one method, echo, which returns its argument (note that the code examples may not show up if you're viewing this in a feed reader):

Running spqr-gen on this example produces two files: agent-app.rb and examples/codegen/EchoAgent.rb. As you can see, these two files contain all of the boilerplate we need to start implementing these QMF methods --- and, by chance, the boilerplate methods actually have the behavior that our agent is meant to have!

spqr-gen is included in the SPQR repository; the system as documented in the last two posts is tagged "introducing-spqr".


About Chapeau

  • I work for Red Hat on the MRG project. I hold a PhD in computer sciences from the University of Wisconsin, where I mainly worked on program analysis and concurrency.
  • On this site, I write about topics related to things I'm working on now and things I've worked on in the past: distributed computing and programming languages. I don't speak for my employer, and any opinions on this site are mine alone.
