PyPeakRanker

PyPeakRanker is a Python package for extracting quantitative features from a predefined set of ATAC-seq peaks and assembling them into a reproducible, analysis-ready table.
The resulting peak × feature matrix enables systematic ranking and comparison of regulatory elements across cell types, conditions, or species.

PyPeakRanker does not perform peak calling. Instead, it standardizes feature extraction so that peak prioritization can be performed reproducibly and transparently using downstream ranking or modeling approaches.

Given a fixed set of genomic peaks, PyPeakRanker:

Extracts multiple quantitative features per peak from ATAC-seq data
Aggregates features consistently across cell types or groups
Produces a unified table where rows represent peaks and columns represent features
Enables reproducible ranking of peaks across biological contexts

This design separates feature generation from ranking logic, allowing users to apply custom scoring functions, statistical tests, or machine-learning models downstream.
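As a rough illustration of what a downstream scoring function could look like, the sketch below ranks peaks by a weighted sum of z-scored features. It is not part of PyPeakRanker; the function names (`zscores`, `rank_peaks`), the column names, the weights, and the toy values are all assumptions made for this example.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a list of numbers; a constant column maps to all zeros."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def rank_peaks(rows, weights):
    """Order peaks by a weighted sum of z-scored features (highest first).

    rows: list of dicts, one per peak, with a "peak_id" key plus feature columns.
    weights: dict mapping feature name -> weight (negative means lower is better).
    """
    scores = [0.0] * len(rows)
    for feature, weight in weights.items():
        z = zscores([float(row[feature]) for row in rows])
        scores = [s + weight * zi for s, zi in zip(scores, z)]
    order = sorted(range(len(rows)), key=lambda i: scores[i], reverse=True)
    return [(rows[i]["peak_id"], round(scores[i], 3)) for i in order]


# Toy table: column names and values are invented for illustration only.
table = [
    {"peak_id": "peak_1", "specificity": 0.9, "phylop": 1.2, "tss_distance": 500},
    {"peak_id": "peak_2", "specificity": 0.2, "phylop": 0.1, "tss_distance": 50000},
    {"peak_id": "peak_3", "specificity": 0.6, "phylop": 2.0, "tss_distance": 2000},
]
ranking = rank_peaks(
    table, {"specificity": 1.0, "phylop": 0.5, "tss_distance": -0.25}
)
for peak_id, score in ranking:
    print(peak_id, score)
```

Because the feature table is fixed before scoring, the weights or the scoring function can be changed without recomputing any features.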

Statement of Need

ATAC-seq experiments generate large sets of candidate regulatory regions. However, peak prioritization across cell types or conditions is often performed using ad hoc scripts with inconsistent feature definitions and normalization strategies. This limits reproducibility and cross-study comparability.

Existing tools primarily focus on:

Peak calling
Differential accessibility testing
Genomic annotation

However, these tools typically lack a standardized framework for reproducible peak-level feature extraction.

PyPeakRanker addresses this gap by providing a Python package that:

Systematically aggregates quantitative features for predefined ATAC-seq peaks
Produces a single, analysis-ready feature table
Enables transparent, reproducible peak ranking and comparative analyses

Features

PyPeakRanker currently supports extraction of the following peak-level features:

ATAC specificity
Sequence conservation (PhyloP score)
GC content
TSS distance
Peak skewness
Peak kurtosis
Peak bimodality
Gene marker score

The framework is modular and designed to be easily extended with additional peak-level features.
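To make the shape features more concrete, here is one possible way to compute skewness, kurtosis, and a bimodality coefficient from a per-base coverage profile. This is a sketch under stated assumptions, not PyPeakRanker's actual definition: it assumes that base positions are the variable and coverage is the weight, and it uses the population form of Sarle's bimodality coefficient. The function name `profile_shape` is hypothetical.

```python
def profile_shape(signal):
    """Weighted skewness, kurtosis, and bimodality coefficient of a profile.

    signal: non-negative per-base coverage values across one peak.
    Assumption: positions 0..n-1 are the variable, coverage is the weight.
    """
    total = sum(signal)
    if total <= 0:
        raise ValueError("signal must contain positive coverage")
    mean_pos = sum(i * w for i, w in enumerate(signal)) / total

    def central_moment(k):
        # k-th central moment of position, weighted by coverage.
        return sum(w * (i - mean_pos) ** k for i, w in enumerate(signal)) / total

    m2 = central_moment(2)
    if m2 == 0:
        raise ValueError("coverage is concentrated at a single position")
    skew = central_moment(3) / m2 ** 1.5
    kurt = central_moment(4) / m2 ** 2  # non-excess kurtosis
    bimodality = (skew ** 2 + 1) / kurt  # Sarle's coefficient, population form
    return skew, kurt, bimodality


# Toy profiles, invented for illustration: one single-summit, one two-summit.
unimodal = profile_shape([1, 2, 3, 2, 1])
bimodal = profile_shape([5, 1, 0, 1, 5])
print(unimodal, bimodal)
```

Under this definition a two-summit profile yields a higher bimodality coefficient than a single-summit one, which is the kind of contrast a shape feature is meant to capture.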

Installation

Install from source:

git clone https://github.com/AllenInstitute/PyPeakRankR
cd PyPeakRankR
pip install -e .

Or install directly from GitHub without cloning:

pip install git+https://github.com/AllenInstitute/PyPeakRankR.git

Quick Example

Initialize a feature table from a predefined peak set:

pypeakranker init \
  --peaks peaks.bed \
  --out features.tsv

Add signal summaries from BigWig files:

pypeakranker add-signal \
  --table features.tsv \
  --bigwig-files sample1.bigWig sample2.bigWig \
  --stat sum \
  --suffix summary \
  --out features.tsv

Add GC content from a reference genome:

pypeakranker add-gc \
  --table features.tsv \
  --reference-fasta genome.fa \
  --out features.tsv
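For reference, GC content of a peak sequence can be computed along the following lines. This is a minimal sketch, not the code behind `add-gc`: the function name `gc_content` is hypothetical, and excluding ambiguous bases such as N from the denominator is an assumption that may differ from the package's behavior.

```python
def gc_content(sequence):
    """Fraction of G/C among unambiguous A/C/G/T bases; None if there are none.

    Assumption: ambiguous bases (for example N) are ignored rather than
    counted in the denominator.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        return None
    return gc / (gc + at)


# Soft-masked (lowercase) bases are counted like uppercase ones.
print(gc_content("ATGC"))
print(gc_content("ggccNN"))
```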

The resulting features.tsv will contain:

Original peak coordinates and columns
One column per BigWig summary
A GC_content column
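The table can then be read with standard tools. The sketch below uses only the Python standard library to load a tab-separated feature table and select the top peaks by one column. The coordinate column names (`chrom`, `start`, `end`), the signal column name `sample1_summary`, and the values are assumptions for illustration; only `GC_content` is named above, and the actual header written by PyPeakRanker may differ.

```python
import csv
import io


def load_feature_table(handle):
    """Read a tab-separated feature table into a list of dicts, one per peak."""
    return list(csv.DictReader(handle, delimiter="\t"))


def top_peaks(rows, column, n=2):
    """Return the n rows with the largest numeric value in the given column."""
    return sorted(rows, key=lambda r: float(r[column]), reverse=True)[:n]


# In-memory stand-in for features.tsv; header and values are invented.
demo = io.StringIO(
    "chrom\tstart\tend\tsample1_summary\tGC_content\n"
    "chr1\t100\t600\t42.0\t0.61\n"
    "chr1\t5000\t5500\t7.5\t0.38\n"
    "chr2\t900\t1400\t88.0\t0.52\n"
)
rows = load_feature_table(demo)
best = top_peaks(rows, "sample1_summary")
for row in best:
    print(row["chrom"], row["start"], row["end"], row["sample1_summary"])
```

For a real file, the same functions can be called on a handle opened with `open("features.tsv", newline="")`.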

Author

Saroja Somasundaram

Acknowledgements

Development was assisted by AI-based coding tools.
