
I have used Statistics::Regression in a couple of projects. I find it useful, but flawed.

Advantages:

* provides most of the basic multivariate linear regression functionality, including getting standard errors for the fitted parameters

* does NOT hardwire a constant

* nice Perl API to add one point at a time (I haven't tried the provide-all-data-at-once API)

* provides linearcombination_variance()

* uses Gentleman's algorithm, which seems to be numerically quite robust

Disadvantages:

* documentation is a bit skimpy. For example, it never explicitly says that the calling convention for include() is include(y, [x0,x1,x2,...,xk]); I had to reverse-engineer this from the examples

* does not seem to provide any way to get the covariance matrix of the fitted parameters

* no documentation at all for linearcombination_variance() :(

* doesn't offer any help in diagnosing problems caused by a linearly-dependent set of basis functions (this would require an SVD-based fit)

* there must be at least 2 independent variables. Attempting a regression with only a single independent variable (i.e., estimating the parameter c from a set of (x,y) data points and the model y = c*x) yields a run-time error saying that this isn't allowed. This restriction is unfortunate: the mathematical formalism works fine with 1 independent variable, and some applications generate N-variable fits where N isn't known in advance. What's worse, the restriction is undocumented.
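
For what it's worth, the single-variable case y = c*x has a simple closed form, c = sum(x*y) / sum(x*x), so it can be special-cased without any regression package. The following is only a minimal sketch in plain Perl; fit_through_origin is a hypothetical helper name of my own, not part of Statistics::Regression, and the standard error assumes the usual ordinary-least-squares model with n - 1 residual degrees of freedom.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Statistics::Regression):
# least-squares fit of y = c*x (no constant term) to a list of [x, y] pairs.
# Returns (c, standard error of c). The standard error is undef when there
# are fewer than 2 points, since no residual degrees of freedom remain.
sub fit_through_origin {
    my @points = @_;

    # Accumulate sum(x*x) and sum(x*y).
    my ($sxx, $sxy) = (0, 0);
    for my $p (@points) {
        my ($x, $y) = @$p;
        $sxx += $x * $x;
        $sxy += $x * $y;
    }
    die "all x values are zero; c is not determined\n" if $sxx == 0;

    my $c = $sxy / $sxx;

    # Residual sum of squares; sigma^2 is estimated as RSS / (n - 1)
    # because exactly one parameter is fitted.
    my $rss = 0;
    for my $p (@points) {
        my ($x, $y) = @$p;
        $rss += ($y - $c * $x) ** 2;
    }
    my $n  = @points;
    my $se = $n > 1 ? sqrt($rss / ($n - 1) / $sxx) : undef;

    return ($c, $se);
}

# Example with made-up data points.
my ($c, $se) = fit_through_origin([1, 2.1], [2, 3.9], [3, 6.2]);
printf "c = %.4f, se = %.4f\n", $c, $se;
```

Code that generates N-variable fits could dispatch to something like this when N turns out to be 1 and use the regression package otherwise.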

I have been using Statistics::Regression for about a year in an ongoing project (which totals around 2500 lines of Perl), and have generally been fairly happy with it. (Not having a covariance matrix has been a bit awkward, but I could live with(out) it.)
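
For reference, these are the textbook ordinary-least-squares relations connecting the standard errors, the covariance matrix, and the variance of a linear combination of the parameters. The symbols are my own notation, not the module's: X is the n-by-k matrix of independent variables, y the vector of observations, and a an arbitrary weight vector. Since linearcombination_variance() is undocumented, I am only assuming that it computes the last quantity.

```latex
\begin{align*}
\hat{\theta} &= (X^{\mathsf T} X)^{-1} X^{\mathsf T} y \\
\hat{\sigma}^2 &= \frac{\lVert y - X\hat{\theta} \rVert^2}{n - k} \\
\widehat{\operatorname{Cov}}(\hat{\theta}) &= \hat{\sigma}^2 \, (X^{\mathsf T} X)^{-1} \\
\operatorname{se}(\hat{\theta}_j) &= \sqrt{\widehat{\operatorname{Cov}}(\hat{\theta})_{jj}} \\
\widehat{\operatorname{Var}}(a^{\mathsf T} \hat{\theta}) &= a^{\mathsf T} \, \widehat{\operatorname{Cov}}(\hat{\theta}) \, a
\end{align*}
```

The standard errors give only the diagonal of the covariance matrix, which is why they cannot substitute for it when the off-diagonal terms matter.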

I haven't tried the fancy "pretty-print the regression" routines.

Alas, I've just hit the "must have at least 2 independent variables" restriction, so it looks like I'll have to convert my code to use a different regression package (maybe Math::GSL::Multifit).

Title: good job, but not yet done

1. Overall Impression: This package is useful and has fewer bugs than might be expected in a 0.x version.

2. The sigmasq method does not work. Here's the error message that I got when I tried to use it:

Internal Error: yelement is undef at C:/Perl/site/lib/Statistics/Regression.pm line 265.

3. For the sake of efficiency, there should be a mechanism for adding multiple observations in a single call.

4. The documentation should specify the formulas that are being used or at least provide references.

5. It would be nice to have a method that returns the coefficients, R^2, and sigma^2 in a single call (returning a reference to the coefficient vector).