A Python package for working with mass spectrometry data in the QPX format.
- Convert data from DIA-NN, MaxQuant, FragPipe, QuantMS (mzTab), mzIdentML, and SDRF to QPX Parquet format
- Transform QPX data: gene mapping, protein quantification (DirectLFQ, MaxLFQ, iBAQ, TopN, …), accession normalization, metadata updates
- Query datasets with SQL, filter rows, or preview with head
- Inspect dataset summaries, Arrow schemas, and Parquet metadata
- Validate datasets against the canonical QPX schema
- Manage ontology data for PSI-MS and PRIDE CV terms
QPX datasets can be exported to MuData — the multi-modal container from the scverse ecosystem. This export is available only for quantification results (precursor/protein intensities and, optionally, protein expression and differential expression results).
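from qpx.dataset import Dataset # assumed import path, based on qpx/dataset.py in the project layout
ds = Dataset("path/to/PXD000000/")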
mdata = ds.to_mudata() # auto-detects label & available modalities
mdata.write("PXD000000.h5mu") # serialize to HDF5
Requires the optional mudata dependency:
pip install "qpx[mudata]"
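Because the MuData export relies on an optional extra, it can help to check for the dependency before calling the export. The following is a minimal sketch using only the standard library; the helper names `has_optional_dependency` and `mudata_install_hint` are hypothetical and not part of QPX.

```python
import importlib.util
from typing import Optional


def has_optional_dependency(module_name: str) -> bool:
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None


def mudata_install_hint() -> Optional[str]:
    """Return an install hint when mudata is missing, otherwise None."""
    if has_optional_dependency("mudata"):
        return None
    return 'MuData export needs the optional extra: pip install "qpx[mudata]"'


hint = mudata_install_hint()
if hint is not None:
    print(hint)
```

Checking up front gives a clear message about the missing extra instead of an import error raised partway through the export.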
pip install qpx
# With optional extras
pip install "qpx[quantify]" # protein quantification (mokume + DirectLFQ)
pip install "qpx[all]" # all optional dependencies
pip install git+https://github.com/bigbio/qpx.git
# Clone the repository
git clone https://github.com/bigbio/qpx.git
cd qpx
# Install the package locally
pip install .
uv is a fast Python package installer and resolver. The project supports PEP 621 and can be installed, built, and published with uv.
Prerequisites: Install uv (e.g. curl -LsSf https://astral.sh/uv/install.sh | sh or pip install uv).
# Install from GitHub
uv pip install "qpx @ git+https://github.com/bigbio/qpx.git"
# With optional extras (transforms, plotting)
uv pip install "qpx[transforms,plotting] @ git+https://github.com/bigbio/qpx.git"
From a local clone:
git clone https://github.com/bigbio/qpx.git
cd qpx
# Create a venv, install the project and its dependencies (recommended)
uv sync
# Or also include the optional dev dependencies
uv sync --extra dev
# Run the CLI without installing globally
uv run qpxc --help
Build distributable packages (sdist and wheel in dist/):
uv build
Publish to PyPI (after configuring credentials or trusted publishing):
uv build
uv publish
The pyproject.toml uses PEP 621 metadata with Hatchling as the build backend.
For development with all dependencies:
# Using uv (recommended for fast installs)
uv sync --extra dev
# Or using pip
pip install -e ".[dev]"
QPX depends on pyOpenMS, which requires certain system libraries. If you encounter errors related to missing shared libraries (e.g., libglib-2.0.so.0), install the required system dependencies:
Ubuntu/Debian:
sudo apt-get update
sudo apt-get install -y libglib2.0-0
macOS:
brew install glib
Using Conda/Mamba (Recommended for pyOpenMS):
Using mamba (faster dependency resolution):
mamba env create -f environment.yml
conda activate qpx
pip install git+https://github.com/bigbio/qpx.git
Or with conda:
conda env create -f environment.yml
conda activate qpx
pip install git+https://github.com/bigbio/qpx.git
The package provides a command-line interface (qpxc) with the following command groups:
qpxc [OPTIONS] COMMAND [ARGS]...
Commands:
  convert    Convert external tool outputs to QPX format.
  transform  Transform QPX data into derived representations.
  query      Query and inspect QPX datasets.
  info       Show information about a QPX dataset.
  validate   Validate a QPX dataset or structure against the canonical schema.
  ontology   Manage CV ontology data (PSI-MS, PRIDE CV).
qpxc convert [diann | maxquant | quantms | fragpipe | mzidentml | sdrf] [OPTIONS]
qpxc transform [gene-map | quantify | normalize-accessions | update-metadata] [OPTIONS]
# Run SQL against a dataset
qpxc query sql --dataset-path ./PXD014414 --sql "SELECT anchor_protein, COUNT(*) FROM feature GROUP BY 1"
# Filter rows
qpxc query filter --dataset-path ./PXD014414 --structure feature --condition "charge >= 3"
# Preview first N rows
qpxc query head --dataset-path ./PXD014414 --structure feature -n 20
# Dataset summary
qpxc info --dataset-path ./PXD014414
# Validate against canonical schema
qpxc validate --dataset-path ./PXD014414
Most commands support a --verbose flag that enables more detailed logging to stdout. The CLI uses standard logging configuration and does not require environment variables.
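The `qpxc query sql` command shown above can also be driven from a Python script. This is a minimal sketch using only the standard library; `build_sql_command` is a hypothetical helper, not part of QPX, and it uses only the `--dataset-path` and `--sql` options that appear in the examples above. The command runs only if `qpxc` is found on the PATH.

```python
import shlex
import shutil
import subprocess


def build_sql_command(dataset_path: str, sql: str) -> list[str]:
    """Build the argument list for `qpxc query sql` (hypothetical helper)."""
    return ["qpxc", "query", "sql", "--dataset-path", dataset_path, "--sql", sql]


command = build_sql_command(
    "./PXD014414",
    "SELECT anchor_protein, COUNT(*) FROM feature GROUP BY 1",
)

if shutil.which("qpxc") is not None:
    # Passing a list avoids shell quoting problems with the SQL string.
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    print(result.stdout)
else:
    print("qpxc not found on PATH; the command would be:")
    print(shlex.join(command))
```

Passing the arguments as a list rather than a single shell string keeps the SQL statement intact without manual escaping.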
qpx/
├── cli/ # Click CLI (entry point: qpx.cli.main:main)
│ ├── main.py # Top-level CLI group
│ └── convert.py # convert subcommands (maxquant, diann, quantms, fragpipe, mzidentml, sdrf)
├── converters/ # Tool-specific converters
│ ├── quantms/ # QuantMS (mzTab) converter
│ ├── diann/ # DIA-NN converter
│ ├── maxquant/ # MaxQuant converter
│ ├── fragpipe/ # FragPipe converter
│ ├── mzidentml/ # mzIdentML converter
│ └── sdrf.py # Shared SDRF converter
├── core/ # Core logic & formats
│ ├── data/ # Schema definitions (YAML + Python)
│ │ └── schemas/ # YAML schema files for all structures
│ ├── engine.py # DuckDB engine wrapper
│ ├── scores.py # Score normalization & ontology
│ └── ontology/ # OBO ontology registry
├── writers/ # Parquet writers (one per structure)
├── views/ # Analytical views (protein, peptide, QC)
└── dataset.py # Main Dataset class entry point
- Fork the repository
- Create a feature branch
- Make your changes
- Run tests
- Submit a pull request
This project is licensed under the Apache-2.0 License - see the LICENSE file for details.
The project is run by different groups:
- Yasset Perez-Riverol (PRIDE Team, European Bioinformatics Institute - EMBL-EBI, U.K.)
- Ping Zheng (Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, China)
IMPORTANT: If you contribute to this project, please make sure to add your name to the list of contributors.
As part of our efforts toward delivering open and inclusive science, we follow the Contributor Covenant Code of Conduct for Open Source Projects.
Copyright 2025 BigBio
Licensed under the Apache License, Version 2.0.
See the LICENSE file for details.