Genomic (or generic) work pipeline and data managers?

ChuckP · April 18, 2024, 4:07pm

At PSU ICDS, I am getting frequent requests for bioinformatics and genomics pipeline support. We have deployed several tools for users, such as Galaxy (usegalaxy.org), and many folks just use scripted Python. I was wondering what others are doing at their institutions to ensure reproducible results?

mjs · April 21, 2024, 6:58pm

Hi Chuck,

In terms of command-line/config-file-based tools, I’ve seen quite a few people use Snakemake, Nextflow, or Cromwell for this kind of thing. They are all fairly similar to each other and can run pipeline steps written in pretty much any language. Not sure what kind of job scheduler you use, but these all work well with Slurm, as well as a few other schedulers (though I don’t have experience with those).
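
For anyone coming from plain scripted Python, here is a rough, tool-agnostic sketch of the idea these workflow managers share: each step declares its input and output files, and a step is re-run only when its output is missing or older than its inputs. This is not Snakemake, Nextflow, or Cromwell syntax. The names `run_step`, `count_bases`, and `sha256`, and the file names in the demo, are made up for illustration, and the checksum is just one possible way to confirm that a rerun reproduced the same output.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def run_step(inputs, output, action):
    """Run `action` only if `output` is missing or older than any input.

    Returns True if the step ran and False if it was skipped.
    """
    inputs = [Path(p) for p in inputs]
    output = Path(output)
    stale = (not output.exists()) or any(
        p.stat().st_mtime > output.stat().st_mtime for p in inputs
    )
    if stale:
        action(inputs, output)
    return stale


def count_bases(inputs, output):
    """Hypothetical toy step: count A/C/G/T characters in a sequence file."""
    seq = inputs[0].read_text().strip().upper()
    lines = [f"{base}\t{seq.count(base)}" for base in "ACGT"]
    output.write_text("\n".join(lines) + "\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        reads = Path(tmp) / "reads.txt"      # assumed input file name
        counts = Path(tmp) / "counts.tsv"    # assumed output file name
        reads.write_text("ACGTAC\n")

        print("first run executed:", run_step([reads], counts, count_bases))
        print("second run executed:", run_step([reads], counts, count_bases))
        print("output checksum:", sha256(counts))
```

The real tools add a lot on top of this (dependency graphs across many steps, container or environment pinning, and job submission to schedulers such as Slurm), so treat the sketch as a mental model rather than a replacement.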

These are the three I’ve personally supported, but there are a few more listed in Table 1 of this article that might be worth checking out too. Hope this is helpful.

Best,
Matt