In 1996, health-care analysts at the University of Virginia announced research
results that contradicted a long-held assumption about a particular emergency
room procedure. The procedure, called right-heart catheterization, was commonly
administered to critically ill patients during their first 24 hours of
hospitalization and was thought to lead to better outcomes. Using conventional
statistical techniques, the 1996 study concluded that the procedure does the
opposite, actually increasing death rates.

That study, like most complex assessments, used parametric modeling methods,
which assume in advance a particular mathematical form for the relationships in
the data. It targeted a particular question or questions, built certain
assumptions into the analysis, and thus incorporated potential limitations.
Under normal circumstances, it is pragmatically necessary to focus a study this
way. To do otherwise, engaging in an open-ended consideration of a topic area
without regard for a specific query, would require analyzing mountains of data
(only a fraction of it ultimately relevant) and factoring in a mind-boggling web
of possible implications.

Thanks to advances in computing, however, such open-ended analysis is now
feasible. In 2001, a group of analysts that included Maxwell economist Jeffrey
Racine used computationally intensive nonparametric statistical methods to
reassess the efficacy of right-heart catheterization, using data obtained from
the University of Virginia. The approach they developed involved feeding large
amounts of data on the procedure into a supercomputer, with a minimum of
accompanying assumptions. To everyone’s surprise, this analysis yielded a
completely different result from the 1996 study. “Once we eliminated the rigid
assumptions that were built into the original analysis, a completely different
conclusion emerged,” says Racine. “We found that the procedure, if anything,
lowered the death rate for critically ill patients.”

Nonparametric modeling, which teases out patterns inherent in data (with a
minimum of imposed assumptions), sometimes has that sort of corrective effect.
But, until recently, the enormous cost of supercomputing placed nonparametric
analysis beyond the reach of many institutions.
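
To make the contrast between the two approaches concrete, here is a minimal
sketch in Python. It is not the method used in either study and does not use
the catheterization data. It generates synthetic data from a curved
relationship, then fits a straight line (a parametric model) and a
Nadaraya-Watson kernel regression (one common nonparametric estimator). The
function names, the sine-shaped "truth," the noise level, and the bandwidth of
0.3 are all assumptions chosen for illustration.

```python
import numpy as np


def linear_fit_predict(x, y, x_new):
    """Parametric model: assumes y is a straight-line function of x."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope * x_new + intercept


def kernel_regression_predict(x, y, x_new, bandwidth=0.3):
    """Nonparametric model: Nadaraya-Watson estimate with a Gaussian kernel.

    Each prediction is a weighted average of all observed y values, with
    weights that fall off as observations get farther from the query point.
    The bandwidth (an illustrative choice here) controls how fast they fall.
    """
    # Scaled distance between every query point and every observation.
    u = (x_new[:, None] - x[None, :]) / bandwidth
    weights = np.exp(-0.5 * u ** 2)
    return (weights @ y) / weights.sum(axis=1)


def compare_models(n=500, seed=0):
    """Return (linear MSE, kernel MSE) against the known synthetic truth."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 2.0 * np.pi, size=n)
    y = np.sin(x) + rng.normal(scale=0.3, size=n)  # curved truth plus noise

    # Evaluate away from the edges, where kernel estimates are least reliable.
    grid = np.linspace(0.5, 2.0 * np.pi - 0.5, 200)
    truth = np.sin(grid)
    mse_linear = float(np.mean((linear_fit_predict(x, y, grid) - truth) ** 2))
    mse_kernel = float(
        np.mean((kernel_regression_predict(x, y, grid) - truth) ** 2)
    )
    return mse_linear, mse_kernel


if __name__ == "__main__":
    mse_linear, mse_kernel = compare_models()
    print(f"linear MSE: {mse_linear:.4f}, kernel MSE: {mse_kernel:.4f}")
```

In this toy setting the straight line cannot follow the curve, so its error
against the true relationship is much larger than the kernel estimate's. The
sketch also hints at the cost: every prediction touches every observation, so
the work grows with the product of the number of observations and the number
of query points, before any repeated runs for bandwidth selection or
simulation.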

This is about to change at the Maxwell School. With a grant of $162,810 from the
National Science Foundation, Syracuse University has purchased a computer
cluster (housed in Maxwell’s Center for Policy Research) that can perform the
sophisticated data analysis that formerly only supercomputers could handle. It
has become practical to bypass a traditional supercomputer in favor of what’s
called a “Beowulf cluster.”

A Beowulf cluster uses open-source software and libraries (developed by a
government/private-sector consortium in the 1990s) to link and coordinate
off-the-shelf processors to achieve power comparable to that of traditional
supercomputers. (The third-fastest computer in the world today is, in fact, a
cluster.) And a cluster does this at a fraction of the cost of a supercomputer.

Racine says that at least nine faculty members and 20 to 30 graduate students
will benefit initially, with the number growing as other Maxwell researchers
develop projects that take advantage of the Beowulf cluster’s capabilities.

“CPR has a number of faculty members and students who work with numerically
intensive econometric methods or who routinely struggle with computational
aspects of modeling large datasets,” he says. “With this cluster computer, we
will be able to teach students not only the theory of these methods but how to
apply them to large, real-world databases like those with which they will work
after graduation.”

For his part, Racine has planned a number of projects for the cluster. One
involves verifying, via computer simulation, certain theoretical conjectures
regarding new nonparametric estimators; another, reassessing union and gender
wage gaps using these methods.
—Jill Leonhardt

This article appeared in the Fall 2003 print edition of Maxwell Perspective;
© 2003 Maxwell School of Syracuse University. To request a copy, e-mail
dlcooke@maxwell.syr.edu.