.. _runAnno:

============================================
Session 6: Run analysis / annotation in IMP3
============================================

In this part of the course, we will use the analysis module of the `IMP3 <https://imp3.readthedocs.io/en/latest/>`_ pipeline and have a look at the output. 

In this session, you can use the IMP3 installation on crunchomics and the example data. 


.. warning:: 

   | If you haven't set up your conda environment yet, set it up now as described in the :ref:`first session <testrun>`.


Data
----

In this session, you can use the small files prepared on crunchomics:

.. code-block:: console

   cd /zfs/omics/projects/metatools/SANDBOX/Metagenomics101/EXAMPLE_DATA
   ls FILTERED_READS
   ls ASSEMBLY
   ls ALIGNMENTS


Running annotation within IMP3
------------------------------

We start with the assembly from last week. In addition, we need the processed reads, because the IMP3 analysis also returns the number of reads mapping to each functional feature it finds. You also need an appropriate configuration file. In this config file, you specify that you don't want to run preprocessing, assembly and taxonomy, but that you do want to run the analysis and some summary steps. You set the input to the already preprocessed files and choose an output directory.

There is an example config file which you can copy to your ``~/personal`` folder:

.. code-block:: console

   cd ~/personal
   cp /zfs/omics/projects/metatools/SANDBOX/Metagenomics101/06_annotation/test_annotation.config.yaml annotation.reads.config.yaml


This config will work as is, but you could change the input to your own data. 
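
Before you run anything, it can help to read what the copied config actually sets. The following is a minimal sketch, not part of IMP3: ``show_settings`` is a hypothetical helper name, and it simply prints every line of a file that is neither blank nor a full-line comment.

.. code-block:: console

   # show_settings is a hypothetical helper, not part of IMP3:
   # print every line of a file that is neither blank nor a full-line comment
   show_settings() {
       grep -v -E '^[[:space:]]*(#|$)' "$1"
   }

   # usage, assuming you copied the config as shown above:
   # show_settings ~/personal/annotation.reads.config.yaml

In the output, look for the settings mentioned above: which steps are run, where the input files are, and where the output goes.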

First, you should always perform a dry run to detect potential problems in your configuration file.

.. code-block:: console

   cd ~/personal
   /zfs/omics/projects/metatools/TOOLS/IMP3/runIMP3 -d annotation.reads.config.yaml


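If the dry run was successful, you're set to submit the test run to the compute nodes. Remember, you can first check the state of the nodes with ``sinfo`` (the format string below lists, per node, the hostname, free and total memory, availability, and CPU counts) and then submit your job to the cluster like so:

.. code-block:: console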

   sinfo -o "%n %e %m %a %c %C"
   /zfs/omics/projects/metatools/TOOLS/IMP3/runIMP3 -c -r -n RANN -b omics-cn002 annotation.reads.config.yaml


As always, you can check the status in the output folder and by checking the Slurm queue for your user name.

.. code-block:: console

   squeue -u YourUserID


Once the run is done, you will have a directory ``Analysis`` in your new output directory, which holds the analysis results in ``annotation``. IMP3 also maps the reads back to the assembly, because we supplied the reads and not an alignment.

Check out the `outputs <https://imp3.readthedocs.io/en/latest/output/Analysis.html>`_.
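
To get a first overview of what the run produced, you could list the files in the ``annotation`` folder. This is a minimal sketch under assumptions: ``list_annotation_files`` is a hypothetical helper (not an IMP3 command), and ``my_output_dir`` is a placeholder for the output directory set in your config.

.. code-block:: console

   # list_annotation_files is a hypothetical helper, not part of IMP3:
   # list the files directly inside <output directory>/Analysis/annotation
   list_annotation_files() {
       find "$1/Analysis/annotation" -maxdepth 1 -type f | sort
   }

   # usage (replace the placeholder with the output directory from your config):
   # list_annotation_files ~/personal/my_output_dir

Comparing this list with the documentation linked above should tell you what each file contains.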


Another option
--------------

The config file ``/zfs/omics/projects/metatools/SANDBOX/Metagenomics101/06_annotation/test_annotation2.config.yaml`` shows how to use an alignment (.bam) file as input instead of reads, and it uses different databases.
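
One way to see exactly what changes between the two example configs is to compare them line by line. This is a minimal sketch: ``config_diff`` is a hypothetical helper around the standard ``diff`` tool, and the variable name ``CONFIG_DIR`` is only for illustration.

.. code-block:: console

   # config_diff is a hypothetical helper, not part of IMP3:
   # print the lines that differ between two config files.
   # diff exits with 1 when the files differ, which is expected here,
   # so only real problems (e.g. a missing file) count as failure.
   config_diff() {
       diff "$1" "$2" || [ $? -eq 1 ]
   }

   # usage on crunchomics, with the two example configs from this session:
   # CONFIG_DIR=/zfs/omics/projects/metatools/SANDBOX/Metagenomics101/06_annotation
   # config_diff "$CONFIG_DIR/test_annotation.config.yaml" "$CONFIG_DIR/test_annotation2.config.yaml"

Lines starting with ``<`` come from the first file and lines starting with ``>`` from the second.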

End of today's lesson. We will look more in depth at the different functional databases next time.