Algorithm Overview
This page explains the algorithm behind mpspline.
Background
mpspline implements the mass-preserving equal-area quadratic smoothing spline algorithm described in:
Bishop, T.F.A., McBratney, A.B., & Laslett, G.M. (1999). Modelling soil attribute depth functions with equal-area quadratic smoothing splines. Geoderma, 89(3-4), 185-208.
What It Does
Given soil horizon data at varying depths, the algorithm:
Fits a smooth function through the data
Generates 1cm predictions across the profile depth
Averages predictions to standardized depth intervals
Preserves mass: Average values at standard depths equal the original data
Why It Matters
Soil data is usually collected at irregular depths (horizons):
Depth (cm) Clay (%)
0-20 24.5 ← Ap horizon
20-50 35.2 ← Bt horizon
50-80 32.0 ← BC horizon
But analysis often requires standardized depths:
Depth (cm) Clay (%)
0-5 21.4 ← Interpolated
5-15 24.1 ← Interpolated
15-30 29.8 ← Interpolated
30-60 34.9 ← Interpolated
60-100 31.2 ← Interpolated
The spline algorithm ensures these new estimates have the same average value as the original horizons.
How It Works
The algorithm has three main steps:
Step 1: Estimate Spline Parameters
Fit a continuous function using quadratic spline pieces
Each piece defined by parameters at control points
Smoothing parameter (lambda) controls how tightly the spline fits
Step 2: Generate Predictions
Use the fitted spline to predict values at 1cm intervals
Creates a continuous depth function from discrete horizons
Ensures mass is preserved across depth intervals
Step 3: Aggregate to Standard Depths
Average predictions within each target depth interval
Output harmonized values at standardized depths
Result: Values that represent the horizon averages
Implementation Details
The roughness penalty matrix \(\mathbf{R}\) is constructed with diagonal elements:
Where \(h_i\) is the thickness of layer \(i\) and \(g_i\) is the gap between layers.
Key Properties
Mass Preservation
The algorithm ensures:
In plain English: The average of the spline function across the original horizon depth equals the original value.
Smoothing Parameter (Lambda)
Controls the trade-off between smoothness and fit:
Small lambda (0.001-0.01): Very smooth predictions
Default lambda (0.1): Balanced smoothness and fit
Large lambda (1.0-10.0): Closer fit to original data
Continuity
The function is: - Continuous: No jumps - Smooth: No sharp angles - Differentiable: Can compute slopes
Parameters That Matter
lambda (lam)
Smoothing parameter (default: 0.1)
Controls the trade-off between: - Smoothness: Smaller lambda = smoother curve - Fidelity: Larger lambda = follows data more closely
Example:
# Smooth curve for noisy data
result = mpspline(profile, lam=0.01)
# Balanced (default)
result = mpspline(profile, lam=0.1)
# Follows data closely
result = mpspline(profile, lam=1.0)
vlow and vhigh
Value constraints (default: 0.0 and 1000.0)
Ensures predictions stay within realistic ranges:
# Clay: 0-100%
result = mpspline(profile, var_name=['clay'], vlow=0, vhigh=100)
# Bulk density: 1-2 g/cm³
result = mpspline(
profile,
var_name=['bdensity'],
vlow=1.0,
vhigh=2.0
)
Strengths
Mass preservation: Maintains original horizon averages
Smooth predictions: No artificial discontinuities
Flexibility: Works with any horizon configuration
Robustness: Handles variable number of horizons
Speed: Efficient computation
Limitations
Extrapolation: Cannot extend beyond profile depth
Few horizons: Requires minimum 2 horizons per profile
Assumption: Assumes smooth depth function (not appropriate for all properties)
Parameter selection: Choice of lambda affects results
When to Use Different Lambda Values
Use lambda = 0.01 (smooth): - Noisy field measurements - Estimate is uncertain - Want conservative, smooth predictions - Variability between horizons seems excessive
Use lambda = 0.1 (default): - Normal soil profile data - Good data quality - Want balanced fit - When in doubt, use this
Use lambda = 1.0 (fit close to data): - Very accurate laboratory measurements - High precision data - Want predictions close to observed values - Large number of horizons with clear trends
Algorithm Diagram
Simplified flow:
Input Horizons
↓
Estimate Spline Parameters (using lambda)
↓
Generate Predictions at 1cm intervals
↓
Apply Constraints (vlow, vhigh)
↓
Aggregate to Standard Depths
↓
Output Harmonized Values
Example Walkthrough
Given this profile:
Horizon | Depth | Clay
---------|----------|------
Ap | 0-20cm | 24.5%
Bt | 20-50cm | 35.2%
Step 1: Fit spline function
The algorithm finds parameters that: - Pass through (0, 24.5) and (20, 24.5) boundary - Pass through (20, 35.2) and (50, 35.2) boundary - Smooth transition between horizons - Controlled by lambda parameter
Step 2: Generate 1cm predictions
Spline function evaluated at every 1cm:
Depth | Prediction
-------|----------
0-1cm | 24.1%
1-2cm | 24.2%
...
20-21cm| 28.5%
...
50cm | 35.2%
Step 3: Aggregate to standard depths
Average predictions within intervals:
Interval | Average of Predictions
------------|---------------------
0-5cm | (24.1+24.2+...)/5 = 24.3%
5-15cm | (average of 5-15) = 24.5%
15-30cm | (average of 15-30) = 28.9%
30-60cm | (average of 30-60) = 35.0%
Result: Harmonized values that preserve original horizon averages!
Further Reading
Bishop et al. (1999): Original algorithm paper https://doi.org/10.1016/S0016-7061(99)00003-8
Malone et al. (2009): Application to soil carbon mapping https://doi.org/10.1016/j.geoderma.2009.10.007
mpspline2 R package: Reference implementation https://github.com/obrl-soil/mpspline2