Session 5: Run assemblies in IMP3
In this part of the course, we will use the assemblers in the IMP3 pipeline and have a look at the diagnostic plots.
In this session, you can use the IMP3 installation on crunchomics and the example data.
Data
In this session, you can use the small files prepared on crunchomics:
cd /zfs/omics/projects/metatools/SANDBOX/Metagenomics101/EXAMPLE_DATA/FILTERED_READS
ls
Running assembly within IMP3
We start with the already preprocessed read data. All you need is an appropriate configuration file. In this config file, you specify that you don’t want to do preprocessing, but that you want to do assembly and a summary. You set the input to the already preprocessed files and choose an output directory. There is an example config file which you can copy to your ~/personal folder:
cd ~/personal
cp /zfs/omics/projects/metatools/SANDBOX/Metagenomics101/05_assembly/test_assembly.config.yaml my.assembly.config.yaml
This config will work as is, but you could change the input to your own data.
First, you should always perform a dry run to detect potential problems in your configuration file.
cd ~/personal
/zfs/omics/projects/metatools/TOOLS/IMP3/runIMP3 -d my.assembly.config.yaml
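The dry run can be wrapped in a small helper that keeps its output in a log file, which is handy when the messages scroll past quickly. This is only a sketch: the function name `dry_run_check` and the log file name `dryrun.log` are made up for illustration, and it assumes that `runIMP3 -d` returns a non-zero exit status when the dry run fails, which you should verify on your installation.

```shell
# Path to the IMP3 launcher used in this session (override RUNIMP3 to test elsewhere).
RUNIMP3="${RUNIMP3:-/zfs/omics/projects/metatools/TOOLS/IMP3/runIMP3}"

# dry_run_check CONFIG [LOG]
# Hypothetical helper: run the dry run (-d) and keep all of its output in LOG.
# Assumption: runIMP3 exits with a non-zero status if the dry run fails.
dry_run_check() {
    config="$1"
    log="${2:-dryrun.log}"   # hypothetical default log file name
    if "$RUNIMP3" -d "$config" > "$log" 2>&1; then
        echo "Dry run OK: $config"
    else
        echo "Dry run failed for $config; see $log" >&2
        return 1
    fi
}
```

You would call it as `dry_run_check my.assembly.config.yaml` from your ~/personal folder and read `dryrun.log` if it reports a failure.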
If the dry run was successful, you’re set to submit the test run to the compute nodes. Remember, you can submit your job to the cluster like so:
sinfo -o "%n %e %m %a %c %C"
/zfs/omics/projects/metatools/TOOLS/IMP3/runIMP3 -c -r -n TESTASS -b omics-cn002 my.assembly.config.yaml
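In the `sinfo` format string, `%n` is the node hostname, `%e` the free memory, `%m` the total memory, `%a` the partition availability, `%c` the number of CPUs per node, and `%C` the CPU counts as allocated/idle/other/total. If you want to pick the node for `-b` from that table rather than by eye, something like the following could help. It is a sketch, not part of IMP3: the function name `pick_idle_node` is made up, and it simply prints the node with the most idle CPUs among those listed as `up`.

```shell
# pick_idle_node: read the output of
#   sinfo -h -o "%n %e %m %a %c %C"
# on stdin and print the available node with the most idle CPUs.
# Hypothetical helper; prints nothing if no node has idle CPUs.
pick_idle_node() {
    awk '$4 == "up" {
        split($6, cpus, "/")    # %C is allocated/idle/other/total
        if (cpus[2] + 0 > best) { best = cpus[2] + 0; node = $1 }
    } END { if (node != "") print node }'
}
```

On the cluster this would be used as `sinfo -h -o "%n %e %m %a %c %C" | pick_idle_node`, where `-h` suppresses the header line.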
As always, you can check the status in the output folder and in the Slurm queue for your user name:
squeue -u YourUserID
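If you would rather not re-type `squeue` by hand, a small polling loop can wait until your queue is empty. This is a minimal sketch under a few assumptions: the function name `wait_for_jobs` is made up, it waits for all of your jobs (not only the IMP3 ones), and an empty queue only tells you the jobs have ended, not that they succeeded.

```shell
# Command used to query the queue (override SQUEUE to test without Slurm).
SQUEUE="${SQUEUE:-squeue}"

# wait_for_jobs USER [INTERVAL_SECONDS]
# Hypothetical helper: poll the queue until USER has no jobs left.
wait_for_jobs() {
    user="$1"
    interval="${2:-60}"
    # -h suppresses the header, so empty output means no jobs are listed.
    while [ -n "$("$SQUEUE" -h -u "$user")" ]; do
        sleep "$interval"
    done
    echo "No jobs left in the queue for $user."
}
```

For example, `wait_for_jobs YourUserID 120` checks every two minutes. Remember to look at the output folder afterwards to see whether the run completed.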
Once the run is done, you will have a directory Assembly in your new output directory, which holds the assembly. IMP3 also maps the reads back to that assembly for your later use.
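A quick way to confirm that the run produced something is to check that the Assembly directory exists and is not empty. The helper below is a sketch with a made-up name (`check_assembly_dir`); it makes no assumption about the file names inside the directory and only lists whatever is there.

```shell
# check_assembly_dir OUTPUT_DIR
# Hypothetical helper: list the contents of OUTPUT_DIR/Assembly,
# or fail if the directory is missing or empty.
check_assembly_dir() {
    outdir="$1"
    if [ -d "$outdir/Assembly" ] && [ -n "$(ls -A "$outdir/Assembly")" ]; then
        ls -lh "$outdir/Assembly"
    else
        echo "No assembly output found in $outdir/Assembly" >&2
        return 1
    fi
}
```

Call it with the output directory you set in your config file.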
Other assembly options
Besides MEGAHIT, IMP3 can also run metaSPAdes. You can also decide to run two iterations of assembly if you want to maximise read usage. Originally, IMP was developed to also run more sophisticated assemblies of metagenomic and metatranscriptomic reads. Check out the IMP3 documentation on assembly if you want to know more.
End of today’s lesson. We can look at your assemblies next week.