Basis Splines
This chapter describes functions for the computation of smoothing basis splines (B-splines). A smoothing spline differs from an interpolating spline in that the resulting curve is not required to pass through each datapoint. For information about interpolating splines, see Interpolation.
The header file gsl_bspline.h contains the prototypes for the
bspline functions and related declarations.
Overview
B-splines are commonly used as basis functions to fit smoothing curves to large data sets. To do this, the abscissa axis is broken up into some number of intervals, where the endpoints of each interval are called breakpoints. These breakpoints are then converted to knots by imposing various continuity and smoothness conditions at each interface. Given a nondecreasing knot vector
t = \{t_0, t_1, \dots, t_{n+k-1}\}
the n basis splines of order k are defined by
\begin{aligned}
B_{i,1}(x) &= \left\{ \begin{array}{cc} 1, & t_i \le x < t_{i+1} \\ 0, & \text{otherwise} \end{array} \right. \\
B_{i,k}(x) &= \frac{x - t_i}{t_{i+k-1} - t_i} B_{i,k-1}(x) + \frac{t_{i+k} - x}{t_{i+k} - t_{i+1}} B_{i+1,k-1}(x)
\end{aligned}
for i = 0, \ldots, n-1. The common case of cubic B-splines is given by k = 4. The above recurrence relation can be evaluated in a numerically stable way by the de Boor algorithm.
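The recurrence above can be sketched directly in C. This is a minimal illustration, not the routine used by the library: the function name bspline_basis is hypothetical, the knots are passed as a plain array t, and any term whose denominator is zero (repeated knots) is taken to be zero, which is a common convention assumed here rather than stated in the text. Direct recursion is also slower than the de Boor algorithm mentioned above.

```c
#include <stddef.h>

/*
 * Hypothetical helper: evaluate B_{i,k}(x) by direct recursion on the
 * order k.  The array t must hold at least i + k + 1 nondecreasing knots.
 * Terms with a zero denominator are treated as zero (assumed convention).
 */
double
bspline_basis(size_t i, size_t k, double x, const double *t)
{
  double left = 0.0, right = 0.0, d;

  if (k == 1)
    return (t[i] <= x && x < t[i + 1]) ? 1.0 : 0.0;

  /* first term: (x - t_i) / (t_{i+k-1} - t_i) * B_{i,k-1}(x) */
  d = t[i + k - 1] - t[i];
  if (d > 0.0)
    left = (x - t[i]) / d * bspline_basis(i, k - 1, x, t);

  /* second term: (t_{i+k} - x) / (t_{i+k} - t_{i+1}) * B_{i+1,k-1}(x) */
  d = t[i + k] - t[i + 1];
  if (d > 0.0)
    right = (t[i + k] - x) / d * bspline_basis(i + 1, k - 1, x, t);

  return left + right;
}
```

For the linear case k = 2 with knots {0, 1, 2}, the single basis function is a hat centered at 1, so bspline_basis(0, 2, 0.5, t) evaluates to 0.5.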
If we define appropriate knots on an interval [a,b] then the B-spline basis functions form a complete set on that interval. Therefore we can expand a smoothing function as
f(x) = \sum_{i=0}^{n-1} c_i B_{i,k}(x)
given enough (x_j, f(x_j)) data pairs. The coefficients c_i can be readily obtained from a least-squares fit.
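As a sketch of how such a fit is posed, the coefficients can be chosen to minimize a weighted sum of squared residuals. The weights w_j, the data values y_j, the number of data points N, and the matrix name X are notation assumed here, chosen to match the example program later in this chapter.

```latex
\begin{aligned}
\chi^2 &= \sum_{j=0}^{N-1} w_j \left( y_j - \sum_{i=0}^{n-1} c_i B_{i,k}(x_j) \right)^2 \\
X_{ji} &= B_{i,k}(x_j), \qquad W = \mathrm{diag}(w_0, \dots, w_{N-1}) \\
(X^T W X)\, c &= X^T W y
\end{aligned}
```

The last line is the system of normal equations satisfied by the minimizing coefficient vector c. Each row of X holds all n basis functions evaluated at one data abscissa.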
Initializing the B-splines solver
- gsl_bspline_workspace
  The computation of B-spline functions requires a preallocated workspace.
- gsl_bspline_workspace * gsl_bspline_alloc(const size_t k, const size_t nbreak)
  This function allocates a workspace for computing B-splines of order k. The number of breakpoints is given by nbreak. This leads to n = nbreak + k - 2 basis functions. Cubic B-splines are specified by k = 4. The size of the workspace is O(2k^2 + 5k + nbreak).
- void gsl_bspline_free(gsl_bspline_workspace * w)
  This function frees the memory associated with the workspace w.
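The relation n = nbreak + k - 2 given for gsl_bspline_alloc() is often needed in both directions, for example when a fit is specified by its number of coefficients rather than its number of breakpoints. The two helpers below are hypothetical names introduced only for illustration and are not part of the library.

```c
#include <stddef.h>

/* Number of basis functions for order k and nbreak breakpoints.
   Assumes k >= 1 and nbreak >= 2. */
size_t
ncoeffs_from_nbreak(size_t k, size_t nbreak)
{
  return nbreak + k - 2;
}

/* Inverse relation: breakpoints needed to obtain ncoeffs basis functions.
   Assumes ncoeffs >= k, so that the result is at least 2. */
size_t
nbreak_from_ncoeffs(size_t k, size_t ncoeffs)
{
  return ncoeffs + 2 - k;
}
```

For cubic B-splines (k = 4), 10 breakpoints give 12 basis functions, and asking for 12 coefficients requires 10 breakpoints.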
Constructing the knots vector
- int gsl_bspline_knots(const gsl_vector * breakpts, gsl_bspline_workspace * w)
  This function computes the knots associated with the given breakpoints and stores them internally in w->knots.
- int gsl_bspline_knots_uniform(const double a, const double b, gsl_bspline_workspace * w)
  This function assumes uniformly spaced breakpoints on [a,b] and constructs the corresponding knot vector using the previously specified nbreak parameter. The knots are stored in w->knots.
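To make the step from breakpoints to knots concrete, the following standalone sketch builds a knot vector for uniform breakpoints on [a,b]. It assumes the common "clamped" convention in which the first and last breakpoints are repeated k times and interior breakpoints appear once, giving nbreak + 2k - 2 = n + k knots. The function names are hypothetical, and the sketch uses a plain array rather than the workspace.

```c
#include <stddef.h>

/* Number of knots for order k and nbreak breakpoints (clamped ends). */
size_t
clamped_knot_count(size_t k, size_t nbreak)
{
  return nbreak + 2 * k - 2;
}

/*
 * Fill t[0 .. clamped_knot_count(k, nbreak) - 1] for nbreak uniformly
 * spaced breakpoints on [a,b].  Returns 0 on success, -1 on bad input.
 */
int
uniform_clamped_knots(double a, double b, size_t k, size_t nbreak, double *t)
{
  size_t i, m = 0;
  double h;

  if (k < 1 || nbreak < 2 || !(a < b))
    return -1;

  h = (b - a) / (double) (nbreak - 1);   /* breakpoint spacing */

  for (i = 0; i < k; ++i)                /* left end, multiplicity k */
    t[m++] = a;
  for (i = 1; i + 1 < nbreak; ++i)       /* interior breakpoints, once each */
    t[m++] = a + h * (double) i;
  for (i = 0; i < k; ++i)                /* right end, multiplicity k */
    t[m++] = b;

  return 0;
}
```

With k = 4 and nbreak = 10 on [0, 15], this produces 16 knots: four copies of 0, eight interior values spaced 15/9 apart, and four copies of 15.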
Evaluation of B-splines
- int gsl_bspline_eval(const double x, gsl_vector * B, gsl_bspline_workspace * w)
  This function evaluates all B-spline basis functions at the position x and stores them in the vector B, so that the i-th element is B_i(x). The vector B must be of length n = nbreak + k - 2. This value may also be obtained by calling gsl_bspline_ncoeffs(). Computing all the basis functions at once is more efficient than computing them individually, due to the nature of the defining recurrence relation.
- int gsl_bspline_eval_nonzero(const double x, gsl_vector * Bk, size_t * istart, size_t * iend, gsl_bspline_workspace * w)
  This function evaluates all potentially nonzero B-spline basis functions at the position x and stores them in the vector Bk, so that the i-th element is B_{(istart+i)}(x). The last element of Bk is B_{iend}(x). The vector Bk must be of length k. By returning only the nonzero basis functions, this function allows quantities involving linear combinations of the B_i(x) to be computed without unnecessary terms (such linear combinations occur, for example, when evaluating an interpolated function).
- size_t gsl_bspline_ncoeffs(gsl_bspline_workspace * w)
  This function returns the number of B-spline coefficients given by n = nbreak + k - 2.
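The idea behind evaluating only the nonzero basis functions can be illustrated without the library. At any x inside a knot span t_mu <= x < t_{mu+1}, at most k basis functions of order k are nonzero, namely B_{mu-k+1}, ..., B_mu. The sketch below locates the span and fills the k values with the standard stable triangular scheme. The function name, the plain-array interface, and the MAX_ORDER limit are assumptions made for this example, and it is not claimed to match the library's internal implementation.

```c
#include <stddef.h>

#define MAX_ORDER 16   /* illustrative upper bound on k for this sketch */

/*
 * t holds n + k nondecreasing knots; x must lie in [t[k-1], t[n]] and
 * the last span must satisfy t[n-1] < t[n].  On success returns 0 with
 * Bk[r] = B_{istart + r}(x) for r = 0 .. k-1, and iend = istart + k - 1.
 * Returns -1 for invalid input.
 */
int
eval_nonzero_sketch(double x, size_t k, size_t n, const double *t,
                    double *Bk, size_t *istart, size_t *iend)
{
  double left[MAX_ORDER], right[MAX_ORDER];
  size_t mu, j, r;

  if (k < 1 || k > MAX_ORDER || n < k || x < t[k - 1] || x > t[n])
    return -1;

  /* find the span t[mu] <= x < t[mu+1]; the right endpoint uses the last span */
  mu = k - 1;
  while (mu + 1 < n && x >= t[mu + 1])
    ++mu;

  Bk[0] = 1.0;                        /* order 1 */
  for (j = 1; j < k; ++j)             /* raise the order one step at a time */
    {
      double saved = 0.0;
      left[j] = x - t[mu + 1 - j];
      right[j] = t[mu + j] - x;
      for (r = 0; r < j; ++r)
        {
          const double temp = Bk[r] / (right[r + 1] + left[j - r]);
          Bk[r] = saved + right[r + 1] * temp;
          saved = left[j - r] * temp;
        }
      Bk[j] = saved;
    }

  *istart = mu + 1 - k;
  *iend = mu;
  return 0;
}
```

A spline value f(x) then needs only the k products c_{istart+r} * Bk[r] instead of a sum over all n coefficients.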
Evaluation of B-spline derivatives
- int gsl_bspline_deriv_eval(const double x, const size_t nderiv, gsl_matrix * dB, gsl_bspline_workspace * w)
  This function evaluates all B-spline basis function derivatives of orders 0 through nderiv (inclusive) at the position x and stores them in the matrix dB. The (i,j)-th element of dB is d^jB_i(x)/dx^j. The matrix dB must be of size n = nbreak + k - 2 by nderiv + 1. The value n may also be obtained by calling gsl_bspline_ncoeffs(). Note that function evaluations are included as the zeroth order derivatives in dB. Computing all the basis function derivatives at once is more efficient than computing them individually, due to the nature of the defining recurrence relation.
- int gsl_bspline_deriv_eval_nonzero(const double x, const size_t nderiv, gsl_matrix * dB, size_t * istart, size_t * iend, gsl_bspline_workspace * w)
  This function evaluates all potentially nonzero B-spline basis function derivatives of orders 0 through nderiv (inclusive) at the position x and stores them in the matrix dB. The (i,j)-th element of dB is d^jB_{(istart+i)}(x)/dx^j. The last row of dB contains d^jB_{iend}(x)/dx^j. The matrix dB must be of size k by at least nderiv + 1. Note that function evaluations are included as the zeroth order derivatives in dB. By returning only the nonzero basis functions, this function allows quantities involving linear combinations of the B_i(x) and their derivatives to be computed without unnecessary terms.
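For orientation, the first derivative of a basis function can be written in terms of two basis functions of the next lower order. This is the standard B-spline derivative identity, stated here in the notation of the Overview. It is not taken from the text above, and any term with a zero denominator is assumed to be zero.

```latex
\frac{d}{dx} B_{i,k}(x) = (k-1) \left[ \frac{B_{i,k-1}(x)}{t_{i+k-1} - t_i} - \frac{B_{i+1,k-1}(x)}{t_{i+k} - t_{i+1}} \right]
```

Applying the identity repeatedly lowers the order by one per derivative, so derivatives of order k or higher vanish identically within each knot span.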
Working with the Greville abscissae
The Greville abscissae are defined to be the mean location of k-1
consecutive knots in the knot vector for each basis spline function of order
k. With the first and last knots in the gsl_bspline_workspace
knot vector excluded, there are gsl_bspline_ncoeffs() Greville abscissae
for any given B-spline basis. These values are often used in B-spline
collocation applications and may also be called Marsden-Schoenberg points.
- double gsl_bspline_greville_abscissa(size_t i, gsl_bspline_workspace * w)
  Returns the location of the i-th Greville abscissa for the given B-spline basis. For the ill-defined case when k = 1, the implementation chooses to return breakpoint interval midpoints.
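The definition above translates into a short averaging loop. The sketch below works on a plain knot array of n + k entries and is a hypothetical standalone helper, not the library routine. Skipping the first knot means the i-th abscissa averages t[i+1] through t[i+k-1], and the k = 1 branch mirrors the midpoint choice described above.

```c
#include <stddef.h>

/*
 * i-th Greville abscissa (0 <= i < n) for order k, computed from a knot
 * array t with n + k entries as the mean of t[i+1], ..., t[i+k-1].
 */
double
greville_sketch(size_t i, size_t k, const double *t)
{
  double sum = 0.0;
  size_t j;

  if (k == 1)
    return 0.5 * (t[i] + t[i + 1]);   /* midpoint of the i-th interval */

  for (j = 1; j < k; ++j)
    sum += t[i + j];

  return sum / (double) (k - 1);
}
```

For the cubic knot vector {0, 0, 0, 0, 1, 2, 3, 3, 3, 3} (n = 6), the six abscissae are 0, 1/3, 1, 2, 8/3 and 3, so the first and last coincide with the interval endpoints.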
Examples
The following program computes a linear least squares fit to data using cubic B-spline basis functions with uniform breakpoints. The data is generated from the curve y(x) = \cos{(x)} \exp{(-x/10)} on the interval [0, 15] with Gaussian noise added.
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <gsl/gsl_bspline.h>
#include <gsl/gsl_multifit.h>
#include <gsl/gsl_rng.h>
#include <gsl/gsl_randist.h>
#include <gsl/gsl_statistics.h>

/* number of data points to fit */
#define N        200

/* number of fit coefficients */
#define NCOEFFS  12

/* nbreak = ncoeffs + 2 - k = ncoeffs - 2 since k = 4 */
#define NBREAK   (NCOEFFS - 2)

int
main (void)
{
  const size_t n = N;
  const size_t ncoeffs = NCOEFFS;
  const size_t nbreak = NBREAK;
  size_t i, j;
  gsl_bspline_workspace *bw;
  gsl_vector *B;
  double dy;
  gsl_rng *r;
  gsl_vector *c, *w;
  gsl_vector *x, *y;
  gsl_matrix *X, *cov;
  gsl_multifit_linear_workspace *mw;
  double chisq, Rsq, dof, tss;

  gsl_rng_env_setup();
  r = gsl_rng_alloc(gsl_rng_default);

  /* allocate a cubic bspline workspace (k = 4) */
  bw = gsl_bspline_alloc(4, nbreak);
  B = gsl_vector_alloc(ncoeffs);

  x = gsl_vector_alloc(n);
  y = gsl_vector_alloc(n);
  X = gsl_matrix_alloc(n, ncoeffs);
  c = gsl_vector_alloc(ncoeffs);
  w = gsl_vector_alloc(n);
  cov = gsl_matrix_alloc(ncoeffs, ncoeffs);
  mw = gsl_multifit_linear_alloc(n, ncoeffs);

  /* this is the data to be fitted */
  for (i = 0; i < n; ++i)
    {
      double sigma;
      double xi = (15.0 / (N - 1)) * i;
      double yi = cos(xi) * exp(-0.1 * xi);

      sigma = 0.1 * yi;
      dy = gsl_ran_gaussian(r, sigma);
      yi += dy;

      gsl_vector_set(x, i, xi);
      gsl_vector_set(y, i, yi);
      gsl_vector_set(w, i, 1.0 / (sigma * sigma));

      printf("%f %f\n", xi, yi);
    }

  /* use uniform breakpoints on [0, 15] */
  gsl_bspline_knots_uniform(0.0, 15.0, bw);

  /* construct the fit matrix X */
  for (i = 0; i < n; ++i)
    {
      double xi = gsl_vector_get(x, i);

      /* compute B_j(xi) for all j */
      gsl_bspline_eval(xi, B, bw);

      /* fill in row i of X */
      for (j = 0; j < ncoeffs; ++j)
        {
          double Bj = gsl_vector_get(B, j);
          gsl_matrix_set(X, i, j, Bj);
        }
    }

  /* do the fit */
  gsl_multifit_wlinear(X, w, y, c, cov, &chisq, mw);

  dof = n - ncoeffs;
  tss = gsl_stats_wtss(w->data, 1, y->data, 1, y->size);
  Rsq = 1.0 - chisq / tss;

  fprintf(stderr, "chisq/dof = %e, Rsq = %f\n",
                   chisq / dof, Rsq);

  printf("\n\n");

  /* output the smoothed curve */
  {
    double xi, yi, yerr;

    for (xi = 0.0; xi < 15.0; xi += 0.1)
      {
        gsl_bspline_eval(xi, B, bw);
        gsl_multifit_linear_est(B, c, cov, &yi, &yerr);
        printf("%f %f\n", xi, yi);
      }
  }

  gsl_rng_free(r);
  gsl_bspline_free(bw);
  gsl_vector_free(B);
  gsl_vector_free(x);
  gsl_vector_free(y);
  gsl_matrix_free(X);
  gsl_vector_free(c);
  gsl_vector_free(w);
  gsl_matrix_free(cov);
  gsl_multifit_linear_free(mw);

  return 0;
} /* main() */
The output is shown below:
$ ./a.out > bspline.txt
chisq/dof = 1.118217e+00, Rsq = 0.989771
The data and fitted model are shown in Fig. 31.
References and Further Reading
Further information on the algorithms described in this section can be found in the following book,
- C. de Boor, A Practical Guide to Splines (1978), Springer-Verlag, ISBN 0-387-90356-9.
Further information on Greville abscissae and B-spline collocation can be found in the following paper,
- Richard W. Johnson, Higher order B-spline collocation at the Greville abscissae. Applied Numerical Mathematics, vol. 52, 2005, 63–75.
A large collection of B-spline routines is provided by the PPPACK library, available at http://www.netlib.org/pppack, which is also part of SLATEC.
