Parameters
Complete reference for mpspline() parameters.
Function Signature
mpspline(
obj: Union[Dict, List[Dict]],
var_name: Union[str, List[str]] = None,
target_depths: Optional[List[Tuple[int, int]]] = None,
lam: float = 0.1,
vlow: float = 0.0,
vhigh: float = 1000.0,
batch_size: int = 100,
parallel: bool = False,
n_workers: Optional[int] = None,
strict: bool = False,
) -> Union[Dict, pd.DataFrame]
Parameters
- objdict or list of dict
Input data.
Single profile (dict): Returns dict with harmonized values
Multiple profiles (list of dicts): Returns pandas DataFrame
Each profile must have a ‘horizons’ key with list of horizon dicts.
- var_namestr, list of str, or None
Property/properties to harmonize.
None (default): Harmonizes all numeric fields found in horizons
‘clay’: Harmonizes only clay
[‘clay’, ‘sand’]: Harmonizes clay and sand
Numeric string values will be converted to floats. Non-numeric fields are skipped.
- target_depthslist of tuples or None
Depth intervals as (top, bottom) in cm.
Default: GlobalSoilMap standard depths
[(0, 5), (5, 15), (15, 30), (30, 60), (60, 100), (100, 200)]
Can be customized for your analysis:
# Shallow sampling target_depths = [(0, 5), (5, 10), (10, 20), (20, 40)] # Deep profile target_depths = [(0, 10), (10, 30), (30, 60), (60, 100), (100, 200)]
- lamfloat
Smoothing parameter (default: 0.1)
Controls how closely the spline fits the original data:
0.01 (high smoothing): Smooth predictions, tolerates noisy data
0.1 (medium smoothing, default): Balanced fit between smoothness and data fit
1.0 (low smoothing): Closer to original data points
Try different values based on your data quality. Example:
# For noisy data result = mpspline(profile, lam=0.01) # For accurate data result = mpspline(profile, lam=1.0)
- vlowfloat
Minimum constraint for predictions (default: 0.0)
Prevents unrealistic low values. For example:
# Clay percentage cannot be negative result = mpspline(profile, var_name=['clay'], vlow=0.0) # Bulk density is typically > 1 g/cm³ result = mpspline(profile, var_name=['bdensity'], vlow=1.0)
- vhighfloat
Maximum constraint for predictions (default: 1000.0)
Prevents unrealistic high values. For example:
# Clay percentage cannot exceed 100% result = mpspline(profile, var_name=['clay'], vhigh=100.0) # Bulk density typically < 2 g/cm³ result = mpspline(profile, var_name=['bdensity'], vhigh=2.0)
- batch_sizeint
Number of profiles per batch (default: 100)
Only applies when processing multiple profiles. Used for memory optimization. For very large datasets, reduce batch size to lower memory usage:
# For memory-constrained systems df = mpspline(large_profile_list, batch_size=50)
- parallelbool
Enable multiprocessing (default: False)
Only applies when processing multiple profiles (list input). Speeds up processing on multi-core systems:
# Use all available CPU cores df = mpspline(profiles, parallel=True)
- n_workersint or None
Number of worker processes (default: None = CPU count)
Only applies when parallel=True. Limit CPU usage:
# Use only 2 cores df = mpspline(profiles, parallel=True, n_workers=2)
- strictbool
Error handling mode (default: False)
False (lenient): Log warnings and skip problematic profiles/properties
True: Raise exceptions on validation errors
Example:
# Lenient mode: skip bad profiles, return results for valid ones df = mpspline(profiles, strict=False) # Strict mode: fail fast on first error try: df = mpspline(profiles, strict=True) except ValueError as e: print(f"Validation error: {e}")
Common Parameter Combinations
Single profile, basic usage:
result = mpspline(profile, var_name=['clay', 'sand'])
Multiple profiles, all properties:
df = mpspline(profiles)
Multiple profiles with constraints:
df = mpspline(
profiles,
var_name=['clay', 'sand'],
vlow=0.0,
vhigh=100.0
)
Large dataset with parallel processing:
df = mpspline(
profiles,
var_name=['clay'],
parallel=True,
batch_size=500,
lam=0.1
)
Custom depths for research project:
custom_depths = [(0, 5), (5, 25), (25, 50), (50, 100)]
result = mpspline(
profile,
target_depths=custom_depths,
var_name=['clay', 'sand', 'silt']
)
Horizon Input Format
Flexible depth field names (all recognized):
# All of these work:
{'upper': 0, 'lower': 20}
{'top': 0, 'bottom': 20}
{'start': 0, 'end': 20}
{'depth_min': 0, 'depth_max': 20}
{'hzdept_r': 0, 'hzdepb_r': 20}
Property fields (any numeric field):
{
'upper': 0,
'lower': 20,
'clay': 24.5, # Will be harmonized
'sand': 42.3, # Will be harmonized
'silt': 33.2, # Will be harmonized
'texture': 'loam', # Skipped (non-numeric)
'color': 'brown', # Skipped (non-numeric)
}