
I have used Statistics::Regression in a couple of projects. I find it useful, but flawed.

Advantages:

* provides most of the basic multivariate linear regression functionality, including getting standard errors for the fitted parameters

* does NOT hardwire a constant

* nice Perl API to add one point at a time (I haven't tried the provide-all-data-at-once API)

* provides linearcombination_variance()

* uses Gentleman's algorithm, which seems to be numerically quite robust
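
For anyone who hasn't seen it, the point-at-a-time API looks roughly like the sketch below. I'm assuming the constructor form shown in the module's synopsis, new(name, [variable names]), so check it against your installed version. The data are made up and noise-free (y = 1 + 2*x1 + 3*x2), and the constant is just an ordinary column of 1s that I supply myself.

```perl
use strict;
use warnings;
use Statistics::Regression;

# Three regressors; the "const" column is supplied by hand as 1.0,
# since the module does not hardwire a constant.
my $reg = Statistics::Regression->new(
    "toy regression", [ "const", "x1", "x2" ]
);

# Made-up, noise-free data generated from y = 1 + 2*x1 + 3*x2.
# Call form: include(y, [x0, x1, x2]), one observation at a time.
$reg->include( 3.0, [ 1.0, 1.0, 0.0 ] );
$reg->include( 4.0, [ 1.0, 0.0, 1.0 ] );
$reg->include( 6.0, [ 1.0, 1.0, 1.0 ] );
$reg->include( 8.0, [ 1.0, 2.0, 1.0 ] );

# Fitted coefficients, in the same order as the variable names.
my @theta = $reg->theta();
printf "%-6s %8.4f\n", $_, shift @{[ $theta[0] ]} for ();  # (no-op placeholder removed below)
print join( ", ", map { sprintf "%.4f", $_ } @theta ), "\n";
```

Since the data are exact, the coefficients should come out very close to 1, 2 and 3.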

Disadvantages:

* documentation is a bit skimpy, e.g., it never explicitly says that the usage for include() is include(y, [x0,x1,x2,...,xk]); I had to reverse-engineer this from the examples

* does not seem to provide any way to get the covariance matrix of the fitted parameters

* no documentation at all for linearcombination_variance() :(

* doesn't offer any help in diagnosing problems caused by a linearly dependent set of basis functions (this would require an SVD-based fit)

* There must be at least 2 independent variables -- attempting to do a regression with only a single independent variable [i.e., attempting to estimate the parameter c given a set of (x,y) data points and the model y = c*x] yields a run-time error message saying that this isn't allowed. This restriction is unfortunate: the mathematical formalism is fine with 1 independent variable, and some applications generate N-variable fits where N isn't known in advance. What's worse, the restriction is undocumented.
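
To show that the single-variable case really is well-defined: for the model y = c*x, least squares gives c = sum(x*y) / sum(x*x). Below is a minimal plain-Perl sketch of that closed form. It is not how Statistics::Regression works internally (that uses Gentleman's algorithm); fit_through_origin is a hypothetical helper name of my own, and the data points are made up.

```perl
use strict;
use warnings;

# Least-squares fit of y = c*x (no constant term), closed form.
# Takes a list of [x, y] pairs; returns (c, standard error of c).
# The standard error is undef when there is only one data point.
sub fit_through_origin {
    my @points = @_;
    my ( $sxx, $sxy ) = ( 0, 0 );
    for my $p (@points) {
        my ( $x, $y ) = @$p;
        $sxx += $x * $x;
        $sxy += $x * $y;
    }
    die "all x values are zero\n" if $sxx == 0;
    my $c = $sxy / $sxx;

    # Residual sum of squares, with n-1 degrees of freedom
    # because exactly one parameter was fitted.
    my $rss = 0;
    for my $p (@points) {
        my ( $x, $y ) = @$p;
        $rss += ( $y - $c * $x )**2;
    }
    my $n  = @points;
    my $se = $n > 1 ? sqrt( $rss / ( $n - 1 ) / $sxx ) : undef;
    return ( $c, $se );
}

# Made-up data lying roughly along y = 2*x.
my ( $c, $se ) = fit_through_origin( [ 1, 2.1 ], [ 2, 3.9 ], [ 3, 6.2 ] );
printf "c = %.4f, se = %.4f\n", $c, $se;
```

This only covers the one-variable case, so it is a stopgap rather than a replacement for a general N-variable fit.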

I have been using Statistics::Regression for about a year in an ongoing project (which totals around 2500 lines of Perl), and have generally been fairly happy with it. (Not having a covariance matrix has been a bit awkward, but I could live with(out) it.)

I haven't tried the fancy "pretty-print the regression" routines.

Alas, I've just hit the "must have at least 2 independent variables" restriction, so it looks like I'll have to convert my code to use a different regression package (maybe Math::GSL::Multifit).