How to handle HERRO inference OOM errors on limited-RAM GPU nodes #84

@wlhCNU

Hi,
I am trying to correct an 80G ONT UL dataset with HERRO. Both preprocess.sh and create_batched_alignments.sh ran successfully on a large compute node with 1 TB of memory. That node has no GPU, so I have to switch to a GPU-equipped node for inference, but the GPU node has only 125 GB of memory. When I run the command `singularity run --nv herro-0.1.1/herro.sif inference --read-alns batches_dir/ -t 8 -d 0 -m model_R10_v0.1.pt -b 32 preprocessed.fastq.gz corrected_output.fasta`, memory fills up quickly and the job is interrupted. The same problem persists even after reducing the parameters to -t 1 -b 1.
I am considering splitting the preprocessed reads (the output of preprocess.sh) into several smaller files and running the inference command on each file separately. However, --read-alns batches_dir/ would still point to the alignments generated from the entire 80G ONT UL dataset. Would this approach reduce the accuracy of the corrected reads? What would be the appropriate way to handle this situation?
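
To make the splitting idea concrete, here is a minimal Python sketch of the step I have in mind. It only cuts the gzipped FASTQ into chunks of a fixed number of reads; the function name `split_fastq`, the chunk file names, and the chunk size are my own placeholders, and it assumes plain 4-line FASTQ records. Whether herro inference gives correct results when given one chunk together with the full batches_dir is exactly what I am unsure about.

```python
import gzip
from pathlib import Path


def split_fastq(src, out_dir, reads_per_chunk):
    """Split a gzipped FASTQ into chunks of at most reads_per_chunk reads.

    Assumes 4-line FASTQ records (header, sequence, '+', qualities).
    Lines are streamed one at a time, so memory use stays small.
    Returns the list of chunk paths that were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    lines_per_chunk = 4 * reads_per_chunk
    chunks = []
    out = None
    with gzip.open(src, "rt") as reads:
        for line_no, line in enumerate(reads):
            # Start a new chunk file every lines_per_chunk lines.
            if line_no % lines_per_chunk == 0:
                if out is not None:
                    out.close()
                path = out_dir / f"chunk_{len(chunks):04d}.fastq.gz"
                out = gzip.open(path, "wt")
                chunks.append(path)
            out.write(line)
    if out is not None:
        out.close()
    return chunks
```

For example, `split_fastq("preprocessed.fastq.gz", "chunks", 100000)` would write chunks/chunk_0000.fastq.gz, chunks/chunk_0001.fastq.gz, and so on, each of which I would then pass to the inference command in place of preprocessed.fastq.gz.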

Thank you.
