diive is currently being prepared for the v1.0 release.
diive is a Python library for time series processing, focused on ecosystem data. It was originally developed by the ETH Grassland Sciences group for Swiss FluxNet.
Cite diive using DOI 10.5281/zenodo.10884017. This concept DOI resolves to the latest release, so include the version number in your citation.
BibTeX format:
@software{diive2026,
author = {Hörtnagl, Lukas},
title = {diive: Python library for time series processing},
version = {0.91.0},
year = {2026},
doi = {10.5281/zenodo.10884017}
}
Replace version and year with the values for your target release.
Requires Python 3.12+
pip install diive
Or with uv:
uv pip install diive

import diive as dv
# Load example data
df = dv.load_exampledata_parquet()
# Plot time series
dv.plot_time_series(series=df['NEE']).plot()
# Gap-fill with Random Forest
from diive.core.ml.feature_engineer import FeatureEngineer
from diive.gapfilling.randomforest_ts import RandomForestTS
engineer = FeatureEngineer(target_col='NEE', features_lag=[-1, 1], features_rolling=[12, 24])
df_engineered = engineer.fit_transform(df)
model = RandomForestTS(input_df=df_engineered, target_col='NEE', n_estimators=100)
model.trainmodel()
model.fillgaps()

diive exposes its classes through a top-level namespace, available as both PascalCase and snake_case aliases:
import diive as dv
plot = dv.plot_time_series(series=data) # snake_case alias
plot = dv.TimeSeries(series=data) # PascalCase class name

| Area | Common exports |
|---|---|
| Plotting | TimeSeries, Cumulative, DielCycle, HeatmapDateTime |
| Gap-filling | RandomForestTS, XGBoostTS, FluxMDS |
| Analysis | GridAggregator, SeasonalTrendDecomposition |
| Eddy covariance | FluxProcessingChain, FluxDetectionLimit, WindDoubleRotation |
| I/O | load_parquet, save_parquet, load_exampledata_parquet |
For the full list, see diive.__all__.
104 runnable examples are organized by topic in examples/. They follow Sphinx Gallery format (# %% sections), so they run as plain scripts and convert to HTML docs automatically. Browse by use case in CATALOG.md, or check EXAMPLE_DATASET.md for documentation of the 37-variable dataset used throughout.
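As a rough illustration of that format, a Sphinx Gallery script starts with a docstring whose first lines form the page title, and each `# %%` marker opens a new section whose leading comment lines are rendered as text. The sketch below is hypothetical: the title, section names, and toy data are made up and do not correspond to any file in examples/.

```python
"""
Cumulative sum of a toy series
==============================

Hypothetical example script; the title and data are illustrative only.
"""

# %%
# Build a small series
# --------------------
# Comment lines directly after a ``# %%`` marker become rendered text.
values = [0.5, -0.2, 0.1, 0.4]

# %%
# Accumulate it
# -------------
# Plain code below the comment block is shown and executed as a code cell.
from itertools import accumulate

cumulative = list(accumulate(values))
print(cumulative)
```

Because the markers are ordinary comments, the same file runs unchanged with `python script.py`.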
uv run python examples/visualization/plot_heatmap_datetime_basic.py
uv run python examples/analysis/analysis_daily_correlation.py
uv run python examples/gapfilling/gapfill_randomforest.py
uv run python examples/flux/fluxprocessingchain/fluxprocessingchain_composable.py

FeatureEngineer runs an 8-stage feature pipeline (lag features, rolling stats, differencing, EMA, polynomial terms, STL decomposition, timestamps, record numbering). You build the features once and reuse them across models.
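To make the first two stages concrete, here is a dependency-free sketch of what lag and rolling features mean for a single driver. It is not FeatureEngineer's implementation: the helper names, the `Tair` column, and the sign convention for lags (positive looks into the past) are assumptions made for illustration.

```python
from statistics import fmean


def lag(values, k):
    """Shift values by k records. In this sketch k > 0 takes the value from
    k records earlier and k < 0 from k records later; edges become None."""
    n = len(values)
    return [values[i - k] if 0 <= i - k < n else None for i in range(n)]


def rolling_mean(values, window):
    """Trailing mean over `window` records; None until the window is full."""
    return [
        fmean(values[i - window + 1 : i + 1]) if i >= window - 1 else None
        for i in range(len(values))
    ]


# Hypothetical air temperature driver
tair = [1.0, 2.0, 4.0, 8.0]

features = {
    "Tair": tair,
    "Tair_lag+1": lag(tair, 1),          # previous record
    "Tair_lag-1": lag(tair, -1),         # next record
    "Tair_roll2": rolling_mean(tair, 2), # mean of current and previous record
}

for name, column in features.items():
    print(name, column)
```

Building such columns once and passing the same table to several models is the reuse pattern described above.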
| Method | How it works |
|---|---|
| XGBoostTS | Gradient boosting |
| RandomForestTS | Ensemble learning with SHAP importance |
| FluxMDS | Meteorological similarity, no training needed |
| Linear interpolation | Short gaps only |
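The "short gaps only" restriction on linear interpolation can be sketched in plain Python. This is an illustration, not diive's code: the function name and the `max_gap` parameter are hypothetical, and missing values are represented as `None`.

```python
def fill_short_gaps(values, max_gap):
    """Linearly interpolate runs of None that are at most max_gap records long
    and have a valid value on both sides. Longer or unbounded runs stay None."""
    filled = list(values)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i                      # first missing index of this run
        while i < n and filled[i] is None:
            i += 1
        end = i                        # first valid index after the run, or n
        gap = end - start
        if gap <= max_gap and start > 0 and end < n:
            left, right = filled[start - 1], filled[end]
            step = (right - left) / (gap + 1)
            for j in range(gap):
                filled[start + j] = left + step * (j + 1)
    return filled


series = [1.0, None, 3.0, None, None, None, 7.0, None]
print(fill_short_gaps(series, max_gap=2))
```

With `max_gap=2` only the single-record gap is filled. The three-record gap exceeds the limit, and the trailing gap has no right-hand neighbour to interpolate towards, which is where a model-based method from the table would take over.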
Long-term variants support multi-year data with USTAR scenario options. See examples/gapfilling/.
Post-processing from quality flags through gap-filling, covering Levels 2 to 4.1 following Swiss FluxNet standards. Two entry points:
- run_chain(data, config): a single call drives the full pipeline (L2 → L3.1 → L3.2 → L3.3 → L4.1) from one FluxConfig. Intentionally simple: fixed defaults for per-detector / per-model knobs (Hampel sub-options, MDS tolerances, ML hyperparameters). Use this for the standard FLUXNET-style workflow.
- Composable per-level callables (run_level2, run_level31, make_level32_detector + run_level32, run_level33_constant_ustar / run_level33_ustar_detection, run_level41_mds / _rf / _xgb): full control. Every detector class, model hyperparameter, MDS tolerance, and diagnostic flag is reachable here and only here.
Need a computed driver (e.g. VPD in kPa) for L4.1? Use add_driver(data, series) to put it where L4.1 actually reads from. Call data.gap_stats() at any level for a monthly/annual breakdown with long-gap listing. data.plot_gapfilled_heatmaps() puts all gap-filling methods side by side; data.plot_cumulative_comparison() overlays their cumulative sums on one axes.
Reference: Swiss FluxNet flux processing | Examples: examples/flux/fluxprocessingchain/
FlagQCF merges multiple test flags into a single quality indicator with daytime/nighttime separation and USTAR scenario support.
Nine outlier detection methods are available: Hampel filter, Z-score (global, rolling, or split by day/night), local SD, Local Outlier Factor, absolute limits, incremental detection, manual removal, trimmed mean, and stepwise chaining across multiple methods. See examples/preprocessing/outlier_detection/.
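For readers unfamiliar with the Hampel filter, the sketch below shows the textbook form: a value is flagged when it deviates from the median of a centred window by more than a chosen number of scaled median absolute deviations (MAD). The function name, `half_window`, and `n_sigma` are hypothetical and do not reflect the parameters or defaults of diive's own detector.

```python
from statistics import median


def hampel_flags(values, half_window=3, n_sigma=3.0):
    """Return a list of booleans, True where a value is a Hampel outlier.

    1.4826 * MAD approximates the standard deviation for normally
    distributed data; windows are truncated at the series edges.
    """
    flags = []
    for i, x in enumerate(values):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        window = values[lo:hi]
        med = median(window)
        mad = median([abs(v - med) for v in window])
        flags.append(abs(x - med) > n_sigma * 1.4826 * mad)
    return flags


data = [1.0, 1.1, 0.9, 1.0, 9.0, 1.1, 0.9, 1.0, 1.05]
flags = hampel_flags(data)
print([i for i, is_outlier in enumerate(flags) if is_outlier])
```

Because both the centre and the spread are medians, a single spike barely shifts the threshold, which is why the method suits noisy flux data. Note that a window with zero MAD flags any deviation at all, so real implementations usually guard against that case.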
Tools cover offset correction for measurements, radiation, humidity, and wind direction; threshold and missing value handling; and timestamp sanitization (validation, regularization, frequency detection). See examples/preprocessing/corrections/ and examples/times/.
Seasonal-trend decomposition (STL, classical, or harmonic), lagged correlation and binned analysis, 2D grid aggregation, gap detection with monthly/annual breakdown, and percentiles/histograms. See examples/analysis/.
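The monthly gap breakdown boils down to counting missing records per calendar month. A minimal standard-library sketch, with a hypothetical function name and `None` standing in for missing values, could look like this:

```python
from collections import Counter
from datetime import datetime, timedelta


def monthly_gap_counts(timestamps, values):
    """Count missing records (None) per (year, month)."""
    counts = Counter()
    for ts, value in zip(timestamps, values):
        if value is None:
            counts[(ts.year, ts.month)] += 1
    return dict(counts)


# Four half-hourly records straddling a month boundary
start = datetime(2024, 1, 31, 23, 0)
timestamps = [start + timedelta(minutes=30 * i) for i in range(4)]
values = [None, 1.2, None, None]

print(monthly_gap_counts(timestamps, values))
```

Summing the same counts by `ts.year` alone gives the annual breakdown.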
VPD from temperature and humidity, day/night flags from solar geometry, air density, aerodynamic resistance, unit conversions, lagged features, and clear-sky potential radiation. See examples/features/.
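As an example of the first item, VPD is the difference between saturation and actual vapor pressure. The sketch below uses the Tetens-type approximation with the FAO-56 coefficients for saturation vapor pressure; this choice of formula is an assumption, and diive may use a different one. The function names are hypothetical.

```python
import math


def saturation_vapor_pressure_kpa(ta_celsius):
    """Saturation vapor pressure over water in kPa (Tetens-type, FAO-56 coefficients)."""
    return 0.6108 * math.exp(17.27 * ta_celsius / (ta_celsius + 237.3))


def vpd_kpa(ta_celsius, rh_percent):
    """Vapor pressure deficit in kPa from air temperature (°C) and relative humidity (%)."""
    es = saturation_vapor_pressure_kpa(ta_celsius)
    ea = es * rh_percent / 100.0   # actual vapor pressure
    return es - ea


# At 20 °C and 50 % relative humidity the deficit is roughly 1.17 kPa
print(round(vpd_kpa(20.0, 50.0), 2))
```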
Flux detection limit from 20 Hz data, maximum covariance lag, pre-whitening bootstrap (PWB) for trace gases (CH4, N2O) with single-period and multi-file parallel variants, wind double rotation, self-heating correction for open-path IRGAs, USTAR filtering, and random error propagation. See examples/flux/.
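Wind double rotation is the most self-contained of these steps, so it makes a useful illustration. The standard procedure first rotates the horizontal wind components so that the mean crosswind is zero, then tilts the result so that the mean vertical wind is zero. The sketch below follows that textbook definition with plain lists; the function name and return order are hypothetical and this is not diive's WindDoubleRotation code.

```python
import math
from statistics import fmean


def double_rotation(u, v, w):
    """Rotate wind components so that mean(v) = 0 and mean(w) = 0.

    Returns the rotated (u, v, w) series as three lists.
    """
    # First rotation (yaw): align u with the mean horizontal wind
    theta = math.atan2(fmean(v), fmean(u))
    u1 = [ui * math.cos(theta) + vi * math.sin(theta) for ui, vi in zip(u, v)]
    v1 = [-ui * math.sin(theta) + vi * math.cos(theta) for ui, vi in zip(u, v)]

    # Second rotation (pitch): remove the mean vertical wind
    phi = math.atan2(fmean(w), fmean(u1))
    u2 = [ui * math.cos(phi) + wi * math.sin(phi) for ui, wi in zip(u1, w)]
    w2 = [-ui * math.sin(phi) + wi * math.cos(phi) for ui, wi in zip(u1, w)]
    return u2, v1, w2


# Toy high-frequency samples in m/s
u = [2.0, 3.0, 2.5]
v = [1.0, 0.5, 1.5]
w = [0.1, 0.3, 0.2]

u_rot, v_rot, w_rot = double_rotation(u, v, w)
print(fmean(u_rot), fmean(v_rot), fmean(w_rot))
```

After both rotations the mean of the rotated u equals the mean wind speed, while the fluctuations in w that carry the flux signal are preserved.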
14+ plot types including time series, cumulative, diel cycle, heatmaps (datetime and year-month), hexbin, histogram, ridgeline, scatter, and anomaly plots. Both Matplotlib and Plotly are supported. See examples/visualization/.
Load and save parquet files, read single or batch EddyPro output, detect and split irregular files, and format data for FLUXNET submission. See examples/io/.
See CLAUDE.md for development setup, coding standards, and testing.
diive is released under the GNU General Public License v3.0.
