Clinical-Genomics/microSALT

Microbial Sequence Analysis and Loci-based Typing pipeline

The microbial sequence analysis and loci-based typing pipeline (microSALT) is used to analyse microbial samples. It performs quality control of each sample and determines the sample's organism-specific sequence type and resistance pattern. microSALT also provides database storage and report generation for these results.

microSALT uses a combination of Python, MySQL and Jinja2. Python is used for the majority of functionality, the database is handled through MySQL via SQLAlchemy and reports are rendered through Jinja2. All analysis activity by microSALT requires a SLURM cluster.

Installation

Quick install

Important

This install requires uv to be installed on the system. For installation instructions, see https://docs.astral.sh/uv/getting-started/installation/.

bash <(curl https://raw.githubusercontent.com/Clinical-Genomics/microSALT/master/install.sh)

Manual install

  1. Clone the repository and enter the directory
  2. Checkout the desired branch
  3. Install the package using uv pip install .

Configuration

Copy the example configuration file to a location of your choice, for example:

cp configExample.json $HOME/.microSALT/config.json

Important

Then edit the fields to match your environment.
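
The cp command above fails if the target directory does not exist yet. The following is a minimal Python sketch of the same copy step, assuming the file names shown above. The helper name install_config is hypothetical and not part of microSALT, and the sketch makes no assumption about which fields the configuration contains.

```python
import json
import shutil
from pathlib import Path


def install_config(example: Path, target: Path) -> dict:
    """Copy the example config to `target` and return its parsed contents.

    Hypothetical helper for illustration; not part of microSALT.
    """
    # Create the target directory (for example ~/.microSALT) if it is missing.
    target.parent.mkdir(parents=True, exist_ok=True)
    # Do not overwrite a configuration that has already been edited.
    if not target.exists():
        shutil.copyfile(example, target)
    # Parsing the file catches JSON syntax errors introduced while editing.
    with target.open() as handle:
        return json.load(handle)
```

Calling install_config(Path("configExample.json"), Path.home() / ".microSALT" / "config.json") mirrors the cp command, and re-running it after editing is a quick way to confirm the file is still valid JSON.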

Installing containers

microSALT uses Singularity containers to run the tools used in the analysis. These containers are available on Clinical Genomics' DockerHub and can be pulled using the following commands:

singularity pull docker://clinicalgenomics/microsalt-blast:latest
singularity pull docker://clinicalgenomics/microsalt-bwa:latest
singularity pull docker://clinicalgenomics/microsalt-picard:latest
singularity pull docker://clinicalgenomics/microsalt-quast:latest
singularity pull docker://clinicalgenomics/microsalt-samtools:latest
singularity pull docker://clinicalgenomics/microsalt-skesa:latest
singularity pull docker://clinicalgenomics/microsalt-trimmomatic:latest

Note

Remember to enter the correct path to the singularity images in the configuration file.
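
Since the pull commands differ only in the tool name, they can be generated rather than typed out. This is a minimal Python sketch under that assumption. The names TOOLS, pull_commands and pull_all are illustrative and not part of microSALT; the image names are taken from the commands listed above.

```python
import subprocess

# Tool names taken from the pull commands listed above.
TOOLS = ["blast", "bwa", "picard", "quast", "samtools", "skesa", "trimmomatic"]


def pull_commands(tag: str = "latest") -> list[list[str]]:
    """Build one `singularity pull` command per tool image."""
    return [
        ["singularity", "pull", f"docker://clinicalgenomics/microsalt-{tool}:{tag}"]
        for tool in TOOLS
    ]


def pull_all(tag: str = "latest") -> None:
    """Run the pulls in order, stopping at the first failure."""
    for command in pull_commands(tag):
        # check=True raises CalledProcessError if a pull exits non-zero.
        subprocess.run(command, check=True)
```

By default, singularity pull writes each image to the current working directory, so pull_all() is best run from the directory that the configuration file points to.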

Usage

  • microsalt analyse contains functions that start sbatch job(s) and write output to folders['results']. Afterwards, the parsed results are uploaded to the SQL back-end and HTML reports are generated, which are then automatically e-mailed to the user.
  • microsalt utils contains various utilities, including generating the sample description JSON, manually adding new reference organisms and re-generating reports.

Setup

Before running microSALT, the user must run the setup command, which creates the required database tables and downloads the required databases. It only needs to be run once, but can be re-run to reset the database or download new databases.

The setup is also dependent on

microsalt setup

Retrieving credentials

The credentials to access the pubMLST and Pasteur databases can be retrieved by running the following command:

microsalt utils get_bigsdb_credentials

The command lets the user specify which database to retrieve credentials for. Provided that the correct information has been entered as described in the Configuration section, the credentials are retrieved and stored on disk for later use.

Databases

MLST Definitions

microSALT will automatically download and use the MLST definitions for any organism on pubMLST or Pasteur. Other definitions may be used, as long as they retain the same format.

Resistance genes

microSALT will automatically download and use the resistance genes of ResFinder. Other definitions may be used, as long as they retain the same format.

Requirements

Hardware

  • A SLURM enabled HPC system

Software

  • uv >= 0.4
  • Python >= 3.10
  • MySQL server

Contributing to this repo

This repository follows the GitHub flow approach to adding updates. For more information, see https://guides.github.com/introduction/flow/

Credits

  • Isak Sylvin - Lead developer
  • Emma Sernstad - Accreditation ready reports
  • Tanja Normark - Various issues
  • Maya Brandi - Various issues

About

Microbial Sequence Analysis and Loci-based Typing pipeline for use on NGS WGS data.
