Qiime2 updated and improved from Qiime
Qiime2.0.6 is live. Though billed as an incremental improvement, this is quite a jump, complete with a separate web presence, tutorials, and forum. In my opinion, the documentation is for the most part much better organized. Read on for more info.
Hi all. I recently had the opportunity to spend both group and one-on-one time with members of the Caporaso and Knight Labs. In addition to the improved documentation, tutorials, and other bells and whistles found at Qiime2.org, I have a few notes that may be of use to the UWyo community. There is an extensive set of documents at Qiime2.org:
- Qiime2 will operate in a plug-in fashion so that improvements or customizations no longer need to wait until the developers implement proposed additions (don't like Spearman or Pearson correlations…no problem).
- 454 is no longer supported.
- Qiime2 isn’t HPC ready yet, so the largest dataset you can work with at this point is ~1 MiSeq run. You can do the heavy lifting with Qiime1 if you prefer and then import the *.biom table(s) into Qiime2; however, doing so will not improve the OTU picking. You can only avail yourself of the improved data management and visualization portions of Qiime2. If all goes as planned, Qiime2 will be HPC ready (beta) next summer.
- There is a new reliance on “artifacts”, which will better allow for tracking of sample data and its provenance. This will allow tracking of ALL files from start to finish and how each bit of data was handled. This also comes with new file formats (*.qza, Qiime zipped artifact, and *.qzv, Qiime zipped visualization) that will seamlessly check data types and translate file formats to match intended analyses. These new formats are in fact zipped archives. You can change them to *.zip suffixes if you want to look at the sausage up close. You can also unpack the new archive formats like so:
messy-antelope$ qiime tools extract --help
Usage: qiime tools extract [OPTIONS] PATH
- Also, the use of an md5 OTU assigner throughout will allow for merging data from separate runs to make cross-run comparisons. This is new to Qiime2. If you want to see it in action, run through the FMT tutorial (https://docs.qiime2.org/2.0.6/tutorials/fmt/).
- Keemei is still used to check for formatting errors and will continue to only be available as a plug-in for Google Chrome.
- Big change/improvement: Dada2 is now used for denoising, and there has been a significant reduction in false-positive assignments in Qiime2. Benchmarking shows that this is real. Please see the paper for deeper discussion. This is a huge improvement. Dada2 is multithreaded, but a bug currently prevents this in Qiime2. This should be rectified in a few months.
- Rarefaction is still tricky, of course, but when you are setting your --p-sampling-depth number, the resultant artifacts from feature-table summarize and feature-table tabulate-seqs provide tables that are easier to read than before, which helps in making rarefaction decisions.
- Tree building is more sensitive and likely better reflects biology. This is because alignment masking is now dynamic and based on the data in hand rather than on a static mask developed long ago.
- Eventually, the provenance in artifacts will automatically spit out publication-ready citations for all tools used.
- There is a GUI in development (read: in development). It has a long way to go in my opinion, so keep your terminals open.
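Since the *.qza and *.qzv files mentioned above are ordinary zipped archives, you can also peek inside one without renaming or extracting it. Below is a minimal Python sketch using only the standard library. The function names and the demo archive are my own illustration, and the file names inside the demo archive are placeholders, not the real Qiime2 internal layout.

```python
import io
import zipfile


def list_archive_members(source):
    """Return the names of all files stored in a zip-format archive.

    `source` may be a path to a .qza/.qzv file or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        return archive.namelist()


def build_demo_archive():
    """Build a tiny in-memory zip that stands in for a real artifact.

    The member names are placeholders, NOT the actual Qiime2 layout.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("demo-artifact/metadata.txt", "type: placeholder\n")
        archive.writestr("demo-artifact/data/table.txt", "sample\tcount\n")
    buffer.seek(0)  # rewind so the archive can be read from the start
    return buffer


if __name__ == "__main__":
    # Swap build_demo_archive() for a real path to inspect an actual artifact.
    for name in list_archive_members(build_demo_archive()):
        print(name)
```

To inspect a real artifact, pass its path instead of the demo buffer, for example list_archive_members("table.qza"), where table.qza is a hypothetical file name.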