Generates typeset documents from genomic reference sequences
This code is designed to be run inside a single Docker image. The image is built
on top of texlive/texlive:latest, so it contains both refparse (Python) and a
full TeX Live installation (pdflatex). The whole pipeline — parsing the
reference sequence, typesetting it to .tex, and compiling that to PDF — happens
in this one image. There is no longer a second pdflatex container in the chain.
Docker must be running locally (for example, Docker Desktop).
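A quick preflight check can confirm this before running anything else. The sketch below assumes only the standard docker CLI; the function name docker_ready is illustrative and not part of this repository.

```shell
# Succeeds only if the docker CLI is installed and can reach a running daemon.
# docker_ready is a hypothetical helper name, not something refparse provides.
docker_ready() {
  command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1
}

if docker_ready; then
  echo "Docker is running"
else
  echo "Docker is not reachable; start Docker before continuing" >&2
fi
```

docker info exits with a non-zero status when the daemon cannot be reached, so the check catches both a missing CLI and a stopped daemon.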
The refparse.sh companion script handles everything: it builds the image if
needed, then converts one or more input files. Run ./refparse.sh --help for
details.
Examples::

    $ ./refparse.sh input/LRG_110.xml                      # one file -> PDF
    $ ./refparse.sh input/LRG_110.xml input/GB_TEST.gb     # several files -> PDF
    $ ./refparse.sh --text input/GB_TEST.gb                # plain-text transcription

PDFs are written to output/pdf/ (with the generated .tex kept in
output/tex/), and --text output to output/txt/.
To run the steps manually instead, build the Docker image with::

    docker build -t refparse:local .

The image can then be run by mounting the input, output, and primers directories::

    docker run -v ${PWD}/input:/input -v ${PWD}/output:/output -v ${PWD}/primers:/primers refparse:local -i input/LRG_TEST.xml

That produces a .tex file in output/. To compile it to a PDF, reuse the same
image, overriding the entrypoint to call pdflatex directly::

    docker run --rm --user "$(id -u):$(id -g)" -w /sources -v ${PWD}:/sources --entrypoint pdflatex refparse:local -interaction=nonstopmode -output-directory=output output/output_file.tex

Because the base image already bundles pdflatex, no separate image (previously
embix/pdflatex) is needed.
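The two manual steps above can be chained in a small wrapper. The following is a minimal sketch, not part of the repository: the function names typeset_reference and compile_pdf and the DRY_RUN switch are hypothetical, and the path of the generated .tex file is passed in explicitly rather than guessed. It uses id -u and id -g for the user mapping so that it does not depend on shell-specific variables.

```shell
#!/bin/sh
# Sketch only: wraps the two manual docker commands from this README.
# IMAGE matches the tag used in "docker build -t refparse:local ." above.
set -eu

IMAGE="refparse:local"

# Print the command when DRY_RUN=1 (hypothetical switch), otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Step 1: parse the reference sequence and write a .tex file.
# $1 is the input path as passed to -i, e.g. input/LRG_TEST.xml
typeset_reference() {
  run docker run \
    -v "${PWD}/input:/input" \
    -v "${PWD}/output:/output" \
    -v "${PWD}/primers:/primers" \
    "$IMAGE" -i "$1"
}

# Step 2: compile the generated .tex file with pdflatex from the same image.
# $1 is the .tex path relative to the current directory.
compile_pdf() {
  run docker run --rm --user "$(id -u):$(id -g)" \
    -w /sources -v "${PWD}:/sources" \
    --entrypoint pdflatex "$IMAGE" \
    -interaction=nonstopmode -output-directory=output "$1"
}
```

After sourcing these functions, calling typeset_reference input/LRG_TEST.xml followed by compile_pdf with the path of the resulting .tex file reproduces the manual sequence. Setting DRY_RUN=1 first prints the docker commands without running them, which is a cheap way to check the mounts and paths.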