Parameters

Complete reference for mpspline() parameters.

Function Signature

mpspline(
    obj: Union[Dict, List[Dict]],
    var_name: Optional[Union[str, List[str]]] = None,
    target_depths: Optional[List[Tuple[int, int]]] = None,
    lam: float = 0.1,
    vlow: float = 0.0,
    vhigh: float = 1000.0,
    batch_size: int = 100,
    parallel: bool = False,
    n_workers: Optional[int] = None,
    strict: bool = False,
) -> Union[Dict, pd.DataFrame]

Parameters

obj : dict or list of dict

Input data.

Single profile (dict): Returns dict with harmonized values
Multiple profiles (list of dicts): Returns pandas DataFrame

Each profile must have a 'horizons' key containing a list of horizon dicts.
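As a minimal sketch of the input shape described above, the snippet below builds one profile and checks that it carries a 'horizons' list of dicts. The helper looks_like_profile and the "id" key are hypothetical illustrations, not part of the library; mpspline() performs its own validation.

```python
from typing import Any


def looks_like_profile(profile: Any) -> bool:
    """Return True if `profile` is a dict with a non-empty 'horizons' list of dicts.

    Hypothetical helper for illustration only.
    """
    if not isinstance(profile, dict):
        return False
    horizons = profile.get("horizons")
    return (
        isinstance(horizons, list)
        and len(horizons) > 0
        and all(isinstance(h, dict) for h in horizons)
    )


profile = {
    "id": "P001",  # hypothetical identifier key, not required by the text above
    "horizons": [
        {"upper": 0, "lower": 20, "clay": 24.5},
        {"upper": 20, "lower": 50, "clay": 31.0},
    ],
}

print(looks_like_profile(profile))         # True
print(looks_like_profile({"id": "P002"}))  # False
```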

var_name : str, list of str, or None

Property/properties to harmonize.

None (default): Harmonizes all numeric fields found in horizons
'clay': Harmonizes only clay
['clay', 'sand']: Harmonizes clay and sand

Numeric string values will be converted to floats. Non-numeric fields are skipped.
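The rule above (numeric strings are converted, non-numeric fields are skipped) can be pictured with the following sketch. The helpers to_float and numeric_fields are hypothetical and only approximate what the library might do internally; the depth keys are taken from the Horizon Input Format section.

```python
from typing import Any, Dict, List, Optional

# Depth keys listed under Horizon Input Format; these are not soil properties.
DEPTH_KEYS = {
    "upper", "lower", "top", "bottom", "start", "end",
    "depth_min", "depth_max", "hzdept_r", "hzdepb_r",
}


def to_float(value: Any) -> Optional[float]:
    """Convert numbers and numeric strings to float; return None otherwise."""
    if isinstance(value, bool):  # bool is a subclass of int; treat it as non-numeric
        return None
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, str):
        try:
            return float(value.strip())
        except ValueError:
            return None
    return None


def numeric_fields(horizons: List[Dict[str, Any]]) -> List[str]:
    """Names of non-depth fields that are numeric in at least one horizon."""
    names = set()
    for hz in horizons:
        for key, value in hz.items():
            if key not in DEPTH_KEYS and to_float(value) is not None:
                names.add(key)
    return sorted(names)


horizons = [
    {"upper": 0, "lower": 20, "clay": "24.5", "texture": "loam"},
    {"upper": 20, "lower": 50, "clay": 31.0, "sand": 40},
]
print(numeric_fields(horizons))  # ['clay', 'sand']
```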

target_depths : list of tuples or None

Depth intervals as (top, bottom) in cm.

Default: GlobalSoilMap standard depths

[(0, 5), (5, 15), (15, 30), (30, 60), (60, 100), (100, 200)]

Can be customized for your analysis:

# Shallow sampling
target_depths = [(0, 5), (5, 10), (10, 20), (20, 40)]

# Deep profile
target_depths = [(0, 10), (10, 30), (30, 60), (60, 100), (100, 200)]
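Before passing custom intervals, it can help to sanity-check them. The sketch below assumes intervals are listed from shallowest to deepest and should not overlap; check_target_depths is a hypothetical helper, and the source does not say whether mpspline() enforces these rules itself.

```python
from typing import List, Tuple


def check_target_depths(depths: List[Tuple[int, int]]) -> None:
    """Raise ValueError unless each interval has 0 <= top < bottom and none overlap.

    Assumes the intervals are given in order of increasing depth.
    """
    previous_bottom = None
    for top, bottom in depths:
        if top < 0 or top >= bottom:
            raise ValueError(f"invalid interval ({top}, {bottom})")
        if previous_bottom is not None and top < previous_bottom:
            raise ValueError(f"interval ({top}, {bottom}) overlaps the previous one")
        previous_bottom = bottom


# Passes silently for the shallow-sampling example above
check_target_depths([(0, 5), (5, 10), (10, 20), (20, 40)])
```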

lam : float

Smoothing parameter (default: 0.1)

Controls how closely the spline fits the original data:

0.01 (high smoothing): Smooth predictions, tolerates noisy data
0.1 (medium smoothing, default): Balanced fit between smoothness and data fit
1.0 (low smoothing): Closer to original data points

Try different values based on your data quality. Example:

# For noisy data
result = mpspline(profile, lam=0.01)

# For accurate data
result = mpspline(profile, lam=1.0)

vlow : float

Minimum constraint for predictions (default: 0.0)

Prevents unrealistic low values. For example:

# Clay percentage cannot be negative
result = mpspline(profile, var_name=['clay'], vlow=0.0)

# Bulk density is typically > 1 g/cm³
result = mpspline(profile, var_name=['bdensity'], vlow=1.0)

vhigh : float

Maximum constraint for predictions (default: 1000.0)

Prevents unrealistic high values. For example:

# Clay percentage cannot exceed 100%
result = mpspline(profile, var_name=['clay'], vhigh=100.0)

# Bulk density typically < 2 g/cm³
result = mpspline(profile, var_name=['bdensity'], vhigh=2.0)
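Together, vlow and vhigh bound the predictions. One way to picture their effect, assuming the bounds are applied by clamping predicted values (the source does not state the mechanism), is the sketch below. The clamp helper and the raw values are hypothetical.

```python
from typing import List


def clamp(values: List[float], vlow: float = 0.0, vhigh: float = 1000.0) -> List[float]:
    """Limit each value to the closed range [vlow, vhigh]."""
    if vlow > vhigh:
        raise ValueError("vlow must not exceed vhigh")
    return [min(max(v, vlow), vhigh) for v in values]


# Hypothetical raw spline predictions for clay (%), overshooting at both ends
raw = [-1.2, 18.4, 55.0, 101.7]
print(clamp(raw, vlow=0.0, vhigh=100.0))  # [0.0, 18.4, 55.0, 100.0]
```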

batch_size : int

Number of profiles per batch (default: 100)

Only applies when processing multiple profiles (list input). Profiles are processed in batches of this size to limit memory use; for very large datasets, reduce the batch size to lower peak memory usage:

# For memory-constrained systems
df = mpspline(large_profile_list, batch_size=50)
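To illustrate what "profiles per batch" means, the sketch below splits a list into consecutive chunks of at most batch_size items. The batches helper is hypothetical and shows only the general pattern, not the library's internal batching.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batches(items: Sequence[T], batch_size: int = 100) -> Iterator[List[T]]:
    """Yield consecutive slices of `items` holding at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# 230 placeholder items split with the default batch size
sizes = [len(b) for b in batches(list(range(230)), batch_size=100)]
print(sizes)  # [100, 100, 30]
```

A smaller batch_size means fewer profiles are held in working memory at once, at the cost of more batches to iterate over.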

parallel : bool

Enable multiprocessing (default: False)

Only applies when processing multiple profiles (list input). Speeds up processing on multi-core systems:

# Use all available CPU cores
df = mpspline(profiles, parallel=True)

n_workers : int or None

Number of worker processes (default: None, which uses the CPU count)

Only applies when parallel=True. Set it to limit CPU usage:

# Use only 2 cores
df = mpspline(profiles, parallel=True, n_workers=2)
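The "None means CPU count" default can be sketched with the standard library. The resolve_workers helper is hypothetical; it only shows how such a default is commonly resolved, using os.cpu_count(), which may itself return None on some platforms.

```python
import os
from typing import Optional


def resolve_workers(n_workers: Optional[int] = None) -> int:
    """Return n_workers if given, else the CPU count (falling back to 1)."""
    if n_workers is None:
        return os.cpu_count() or 1
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    return n_workers


print(resolve_workers(2))  # 2
print(resolve_workers())   # machine dependent
```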

strict : bool

Error handling mode (default: False)

False (lenient): Log warnings and skip problematic profiles/properties
True: Raise exceptions on validation errors

Example:

# Lenient mode: skip bad profiles, return results for valid ones
df = mpspline(profiles, strict=False)

# Strict mode: fail fast on first error
try:
    df = mpspline(profiles, strict=True)
except ValueError as e:
    print(f"Validation error: {e}")
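The difference between the two modes follows a common pattern: catch the validation error, then either re-raise it or log it and move on. The sketch below shows that pattern with stand-in functions; process_all, count_horizons, and the logger name are hypothetical and do not reflect the library's internals.

```python
import logging
from typing import Any, Callable, Dict, List

logger = logging.getLogger("mpspline_sketch")  # hypothetical logger name


def process_all(
    profiles: List[Dict[str, Any]],
    process_one: Callable[[Dict[str, Any]], Any],
    strict: bool = False,
) -> List[Any]:
    """Apply process_one to each profile; re-raise errors only when strict."""
    results = []
    for index, profile in enumerate(profiles):
        try:
            results.append(process_one(profile))
        except ValueError as exc:
            if strict:
                raise  # strict mode: fail fast on the first error
            logger.warning("skipping profile %d: %s", index, exc)
    return results


def count_horizons(profile: Dict[str, Any]) -> int:
    """Stand-in for the real work: fail when 'horizons' is missing."""
    if "horizons" not in profile:
        raise ValueError("missing 'horizons'")
    return len(profile["horizons"])


profiles = [{"horizons": [{}, {}]}, {}, {"horizons": [{}]}]
print(process_all(profiles, count_horizons))  # [2, 1]
```

In lenient mode the second profile is skipped with a warning; passing strict=True to the same call raises ValueError instead.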

Common Parameter Combinations

Single profile, basic usage:

result = mpspline(profile, var_name=['clay', 'sand'])

Multiple profiles, all properties:

df = mpspline(profiles)

Multiple profiles with constraints:

df = mpspline(
    profiles,
    var_name=['clay', 'sand'],
    vlow=0.0,
    vhigh=100.0
)

Large dataset with parallel processing:

df = mpspline(
    profiles,
    var_name=['clay'],
    parallel=True,
    batch_size=500,
    lam=0.1
)

Custom depths for a research project:

custom_depths = [(0, 5), (5, 25), (25, 50), (50, 100)]
result = mpspline(
    profile,
    target_depths=custom_depths,
    var_name=['clay', 'sand', 'silt']
)

Horizon Input Format

Depth field names are flexible; all of the following pairs are recognized:

# All of these work:
{'upper': 0, 'lower': 20}
{'top': 0, 'bottom': 20}
{'start': 0, 'end': 20}
{'depth_min': 0, 'depth_max': 20}
{'hzdept_r': 0, 'hzdepb_r': 20}
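A minimal sketch of how these alternative key pairs could be reduced to a single (top, bottom) tuple follows. The depth_bounds helper and the lookup order are assumptions made for illustration; the source only states that all five pairs are recognized.

```python
from typing import Any, Dict, Tuple

# (top key, bottom key) pairs from the list above
DEPTH_KEY_PAIRS = [
    ("upper", "lower"),
    ("top", "bottom"),
    ("start", "end"),
    ("depth_min", "depth_max"),
    ("hzdept_r", "hzdepb_r"),
]


def depth_bounds(horizon: Dict[str, Any]) -> Tuple[float, float]:
    """Return (top, bottom) from the first recognized key pair that is present."""
    for top_key, bottom_key in DEPTH_KEY_PAIRS:
        if top_key in horizon and bottom_key in horizon:
            return float(horizon[top_key]), float(horizon[bottom_key])
    raise ValueError("no recognized depth key pair in horizon")


print(depth_bounds({"hzdept_r": 0, "hzdepb_r": 20}))  # (0.0, 20.0)
```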

Property fields (any numeric field):

{
    'upper': 0,
    'lower': 20,
    'clay': 24.5,           # Will be harmonized
    'sand': 42.3,           # Will be harmonized
    'silt': 33.2,           # Will be harmonized
    'texture': 'loam',      # Skipped (non-numeric)
    'color': 'brown',       # Skipped (non-numeric)
}