Time series data processing

diive is currently being prepared for the v1.0 release.

diive is a Python library for time series processing, focused on ecosystem data. It was originally developed by the ETH Grassland Sciences group for Swiss FluxNet.

CHANGELOG | Releases

Citation

Cite diive using DOI 10.5281/zenodo.10884017. This concept DOI resolves to the latest release, so include the version number in your citation.

BibTeX format:

@software{diive2026,
  author = {Hörtnagl, Lukas},
  title = {diive: Python library for time series processing},
  version = {0.91.0},
  year = {2026},
  doi = {10.5281/zenodo.10884017}
}

Replace version and year with the values for your target release.

Installation

Requires Python 3.12+

pip install diive

Or with uv:

uv pip install diive

Quick start

import diive as dv
from diive.core.ml.feature_engineer import FeatureEngineer
from diive.gapfilling.randomforest_ts import RandomForestTS

# Load example data
df = dv.load_exampledata_parquet()

# Plot time series
dv.plot_time_series(series=df['NEE']).plot()

# Build lag and rolling-window features
engineer = FeatureEngineer(target_col='NEE', features_lag=[-1, 1], features_rolling=[12, 24])
df_engineered = engineer.fit_transform(df)

# Gap-fill with Random Forest: train the model, then fill the gaps
model = RandomForestTS(input_df=df_engineered, target_col='NEE', n_estimators=100)
model.trainmodel()
model.fillgaps()

API

diive exposes its classes through the top-level namespace. Each class is available under its PascalCase name and under a snake_case alias:

import diive as dv

plot = dv.plot_time_series(series=data)   # snake_case alias
plot = dv.TimeSeries(series=data)         # PascalCase class name

Area	Common exports
Plotting	`TimeSeries`, `Cumulative`, `DielCycle`, `HeatmapDateTime`
Gap-filling	`RandomForestTS`, `XGBoostTS`, `FluxMDS`
Analysis	`GridAggregator`, `SeasonalTrendDecomposition`
Eddy covariance	`FluxProcessingChain`, `FluxDetectionLimit`, `WindDoubleRotation`
I/O	`load_parquet`, `save_parquet`, `load_exampledata_parquet`

For the full list, see diive.__all__.

Examples

104 runnable examples are organized by topic in examples/. They follow Sphinx Gallery format (# %% sections), so they run as plain scripts and convert to HTML docs automatically. Browse by use case in CATALOG.md, or check EXAMPLE_DATASET.md for documentation of the 37-variable dataset used throughout.

uv run python examples/visualization/plot_heatmap_datetime_basic.py
uv run python examples/analysis/analysis_daily_correlation.py
uv run python examples/gapfilling/gapfill_randomforest.py
uv run python examples/flux/fluxprocessingchain/fluxprocessingchain_composable.py

Features

Gap-filling

FeatureEngineer runs an 8-stage feature pipeline (lag features, rolling stats, differencing, EMA, polynomial terms, STL decomposition, timestamps, record numbering). You build the features once and reuse them across models.
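To make the first two stages concrete, here is a minimal sketch of lag and rolling features in plain pandas. It is not the FeatureEngineer implementation: the helper name add_lag_and_rolling, the column naming scheme, and the synthetic TA column are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def add_lag_and_rolling(df, col, lags=(-1, 1), windows=(12, 24)):
    """Return a copy of df with shifted and rolling-mean versions of `col`.

    Hypothetical helper for illustration only; not part of diive.
    """
    out = df.copy()
    for lag in lags:
        # pandas convention: shift(1) holds the previous record, shift(-1) the next one
        out[f"{col}_shift{lag}"] = out[col].shift(lag)
    for window in windows:
        # Rolling mean over the last `window` records (fewer at the start of the series)
        out[f"{col}_roll{window}_mean"] = out[col].rolling(window, min_periods=1).mean()
    return out


# One day of synthetic half-hourly data (values 0.0 to 47.0)
index = pd.date_range("2024-01-01", periods=48, freq="30min")
df = pd.DataFrame({"TA": np.arange(48, dtype=float)}, index=index)

features = add_lag_and_rolling(df, "TA")
print(features.columns.tolist())
```

Building the feature table once and passing the same DataFrame to several models keeps the comparison between models fair, since every model sees identical inputs.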

Method	How it works
`XGBoostTS`	Gradient boosting
`RandomForestTS`	Ensemble learning with SHAP importance
`FluxMDS`	Meteorological similarity, no training needed
Linear interpolation	Short gaps only
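The "short gaps only" restriction in the last row can be sketched in plain pandas. Note that the limit argument of Series.interpolate alone is not enough, because it also fills the first records of longer gaps. The helper name interpolate_short_gaps and the max_gap parameter are hypothetical and do not reflect the diive API.

```python
import numpy as np
import pandas as pd


def interpolate_short_gaps(series, max_gap=3):
    """Linearly interpolate only gaps of at most `max_gap` consecutive NaNs.

    Hypothetical helper for illustration only; not part of diive.
    """
    is_nan = series.isna()
    # Give every run of consecutive missing / non-missing records its own id
    run_id = is_nan.astype(int).diff().ne(0).cumsum()
    # Length of the run each record belongs to
    run_len = is_nan.groupby(run_id).transform("size")
    short_gap = is_nan & (run_len <= max_gap)
    # Interpolate everything between valid values, then keep only the short gaps
    interpolated = series.interpolate(method="linear", limit_area="inside")
    return series.where(~short_gap, interpolated)


gappy = pd.Series([1.0, np.nan, 3.0, np.nan, np.nan, np.nan, np.nan, 8.0])
filled = interpolate_short_gaps(gappy, max_gap=2)
print(filled)
```

In this example the single missing record is filled, while the four-record gap is left untouched for one of the other methods.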

Long-term variants support multi-year data with USTAR scenario options. See examples/gapfilling/.

Flux processing chain

Post-processing from quality flags through gap-filling, covering Levels 2 to 4.1 following Swiss FluxNet standards. Two entry points:

run_chain(data, config): a single call drives the full pipeline (L2 → L3.1 → L3.2 → L3.3 → L4.1) from one FluxConfig. It is intentionally simple, with fixed defaults for per-detector and per-model settings (Hampel sub-options, MDS tolerances, ML hyperparameters). Use this for the standard FLUXNET-style workflow.
Composable per-level callables (run_level2, run_level31, make_level32_detector + run_level32, run_level33_constant_ustar / run_level33_ustar_detection, run_level41_mds / _rf / _xgb): full control. Every detector class, model hyperparameter, MDS tolerance, and diagnostic flag is configurable through these callables, and only through them.

Need a computed driver (e.g. VPD in kPa) for L4.1? Use add_driver(data, series) to put it where L4.1 actually reads from. Call data.gap_stats() at any level for a monthly/annual breakdown with a listing of long gaps. data.plot_gapfilled_heatmaps() shows all gap-filling methods side by side; data.plot_cumulative_comparison() overlays their cumulative sums on a single axes.
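For readers who want to see what a monthly gap breakdown looks like, here is a rough sketch in plain pandas. It does not reproduce data.gap_stats(): the helper name monthly_gap_stats and the output columns are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def monthly_gap_stats(series):
    """Count total and missing records per calendar month.

    Hypothetical helper for illustration only; not part of diive.
    """
    missing = series.isna()
    by_month = missing.groupby(series.index.to_period("M"))
    return pd.DataFrame({
        "n_records": by_month.size(),
        "n_missing": by_month.sum(),
        "missing_pct": by_month.mean() * 100,
    })


# Two months of synthetic daily data with gaps in both months
index = pd.date_range("2024-01-01", "2024-02-29", freq="D")
values = pd.Series(np.arange(len(index), dtype=float), index=index)
values.iloc[:5] = np.nan     # five missing days in January
values.iloc[40:43] = np.nan  # three missing days in February

gap_table = monthly_gap_stats(values)
print(gap_table)
```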

Reference: Swiss FluxNet flux processing | Examples: examples/flux/fluxprocessingchain/

Quality control and outlier detection

FlagQCF merges multiple test flags into a single quality indicator with daytime/nighttime separation and USTAR scenario support.

Nine outlier detection methods are available: Hampel filter, Z-score (global, rolling, or split by day/night), local SD, Local Outlier Factor, absolute limits, incremental detection, manual removal, trimmed mean, and stepwise chaining across multiple methods. See examples/preprocessing/outlier_detection/.
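As an illustration of the first method in the list, the sketch below implements a textbook Hampel filter with pandas: a record is flagged when it deviates from the rolling median by more than a multiple of the scaled median absolute deviation (MAD). The function name hampel_flag and its parameters are hypothetical; the diive implementation and its defaults may differ.

```python
import numpy as np
import pandas as pd


def _mad(window_values):
    """Median absolute deviation of one window, ignoring NaNs."""
    center = np.nanmedian(window_values)
    return np.nanmedian(np.abs(window_values - center))


def hampel_flag(series, window=5, n_sigma=3.0):
    """Return a boolean Series that is True where a record is an outlier.

    Hypothetical helper for illustration only; not part of diive.
    """
    k = 1.4826  # scales the MAD to a standard deviation for Gaussian data
    rolling = series.rolling(window, center=True, min_periods=1)
    median = rolling.median()
    mad = rolling.apply(_mad, raw=True)
    return (series - median).abs() > n_sigma * k * mad


values = pd.Series([1.0, 1.1, 0.9, 1.0, 10.0, 1.05, 0.95, 1.0, 1.1, 0.9])
flags = hampel_flag(values, window=5, n_sigma=3.0)
print(values[flags])
```

Because both the center and the spread are medians, a single spike barely changes the threshold used to detect it, which is what makes this filter robust.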

Corrections and preprocessing

Tools cover offset correction for measurements, radiation, humidity, and wind direction; threshold and missing value handling; and timestamp sanitization (validation, regularization, frequency detection). See examples/preprocessing/corrections/ and examples/times/.
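Timestamp regularization can be sketched with pandas alone: detect the dominant time step, then reindex onto a gap-free index so that missing records become explicit NaN rows. The helper name regularize is hypothetical, and the sketch assumes that all timestamps already lie on the regular grid; it is not the diive sanitizer.

```python
import numpy as np
import pandas as pd


def regularize(df):
    """Reindex df to a gap-free timestamp index at the most common time step.

    Hypothetical helper for illustration only; not part of diive.
    """
    # Drop duplicate timestamps and make sure the index is sorted
    df = df[~df.index.duplicated(keep="first")].sort_index()
    # Frequency detection: the most frequent difference between timestamps
    step = df.index.to_series().diff().mode().iloc[0]
    full_index = pd.date_range(df.index[0], df.index[-1], freq=step)
    return df.reindex(full_index)


# Ten half-hourly records, with two records removed and one duplicated
index = pd.date_range("2024-06-01 00:00", periods=10, freq="30min")
complete = pd.DataFrame({"TA": np.arange(10, dtype=float)}, index=index)
irregular = pd.concat([complete.iloc[[0, 1, 2, 5, 6, 7, 8, 9]], complete.iloc[[5]]])

regular = regularize(irregular)
print(regular)
```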

Analysis

Seasonal-trend decomposition (STL, classical, or harmonic), lagged correlation and binned analysis, 2D grid aggregation, gap detection with monthly/annual breakdown, and percentiles/histograms. See examples/analysis/.
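Lagged correlation, for example, amounts to correlating one series with shifted copies of another. The sketch below uses plain pandas; the function name lagged_correlation and the sign convention for the lag are assumptions made for illustration and may not match diive.

```python
import numpy as np
import pandas as pd


def lagged_correlation(x, y, max_lag=5):
    """Pearson correlation between x and y.shift(lag) for each lag in [-max_lag, max_lag].

    Hypothetical helper for illustration only; not part of diive.
    """
    lags = range(-max_lag, max_lag + 1)
    return pd.Series({lag: x.corr(y.shift(lag)) for lag in lags}, name="r")


rng = np.random.default_rng(42)
x = pd.Series(rng.normal(size=200))
y = x.shift(3)  # y repeats x three records later

r = lagged_correlation(x, y, max_lag=5)
best_lag = r.idxmax()
print(best_lag, r.max())
```

With this convention, shifting y back by three records (lag -3) realigns it with x, so the correlation peaks there.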

Derived variables

VPD from temperature and humidity, day/night flags from solar geometry, air density, aerodynamic resistance, unit conversions, lagged features, and clear-sky potential radiation. See examples/features/.
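As a worked example of the first item, VPD can be computed from air temperature and relative humidity using a Tetens-type approximation for saturation vapor pressure (the coefficients below are those used in FAO-56). This is one common formulation and an assumption here; the formula and function names used by diive may differ.

```python
import math


def saturation_vapor_pressure_kpa(ta_degc):
    """Saturation vapor pressure over liquid water in kPa (Tetens-type, FAO-56 coefficients)."""
    return 0.6108 * math.exp(17.27 * ta_degc / (ta_degc + 237.3))


def vpd_kpa(ta_degc, rh_percent):
    """Vapor pressure deficit in kPa: saturation minus actual vapor pressure.

    Hypothetical helper for illustration only; not part of diive.
    """
    es = saturation_vapor_pressure_kpa(ta_degc)
    ea = es * rh_percent / 100.0  # actual vapor pressure
    return es - ea


print(round(vpd_kpa(20.0, 50.0), 3))
```

At 20 °C and 50 % relative humidity this gives a VPD of roughly 1.17 kPa, half of the saturation vapor pressure at that temperature.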

Eddy covariance

Flux detection limit from 20 Hz data, maximum covariance lag, pre-whitening bootstrap (PWB) for trace gases (CH4, N2O) with single-period and multi-file parallel variants, wind double rotation, self-heating correction for open-path IRGAs, USTAR filtering, and random error propagation. See examples/flux/.
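Wind double rotation is compact enough to sketch in NumPy. The first rotation (yaw) aligns the u axis with the mean horizontal wind so that the mean of v becomes zero; the second (pitch) tilts the frame so that the mean of w becomes zero. The function name double_rotation and the synthetic data are assumptions made for illustration; this is not the WindDoubleRotation class.

```python
import numpy as np


def double_rotation(u, v, w):
    """Rotate wind components so that mean(v) = 0 and mean(w) = 0.

    Hypothetical helper for illustration only; not part of diive.
    """
    # First rotation (yaw) around the z axis
    theta = np.arctan2(np.mean(v), np.mean(u))
    u1 = u * np.cos(theta) + v * np.sin(theta)
    v1 = -u * np.sin(theta) + v * np.cos(theta)
    # Second rotation (pitch) around the new y axis
    phi = np.arctan2(np.mean(w), np.mean(u1))
    u2 = u1 * np.cos(phi) + w * np.sin(phi)
    w2 = -u1 * np.sin(phi) + w * np.cos(phi)
    return u2, v1, w2


# Synthetic high-frequency wind components in m/s
rng = np.random.default_rng(0)
n = 1000
u = 3.0 + rng.normal(0.0, 0.5, n)
v = 1.0 + rng.normal(0.0, 0.5, n)
w = 0.2 + rng.normal(0.0, 0.2, n)

u_rot, v_rot, w_rot = double_rotation(u, v, w)
print(u_rot.mean(), v_rot.mean(), w_rot.mean())
```

After both rotations the mean wind speed is carried entirely by the rotated u component.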

Visualization

14+ plot types including time series, cumulative, diel cycle, heatmaps (datetime and year-month), hexbin, histogram, ridgeline, scatter, and anomaly plots. Both Matplotlib and Plotly are supported. See examples/visualization/.

I/O

Load and save parquet files, read single or batch EddyPro output, detect and split irregular files, and format data for FLUXNET submission. See examples/io/.

Contributing

See CLAUDE.md for development setup, coding standards, and testing.

License

diive is released under the GNU General Public License v3.0.
