Depending on the analysis performed, BugSeq outputs the following raw data files:
The Metagenomic Classification Summary folder contains a file named metagenomic_classification-RUN_ID.tsv, which holds the raw data visualized in the above file. For example, opening this file in Excel will reveal:
| Taxon Rank | Taxon NCBI ID | Taxon Name | Sample 1 - Read count at this taxon and below | Sample 1 - Read count directly assigned to this taxon |
The first three columns are row labels that identify taxonomic nodes. BugSeq follows the NCBI taxonomic scheme. Taxon rank codes are (U)nclassified, (D)omain, (K)ingdom, (P)hylum, (C)lass, (O)rder, (F)amily, (G)enus, or (S)pecies. Intermediate ranks carry a numeric suffix: for example, F1 reflects one level below family.
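As an illustration of how the rank codes can be read programmatically, here is a minimal Python sketch. The function name `decode_rank` is hypothetical, and the reading of the numeric suffix as "number of levels below the named rank" generalizes from the single F1 example above, so treat it as an assumption rather than documented behaviour.

```python
import re

# Rank letters exactly as listed in the text above.
RANK_NAMES = {
    "U": "Unclassified",
    "D": "Domain",
    "K": "Kingdom",
    "P": "Phylum",
    "C": "Class",
    "O": "Order",
    "F": "Family",
    "G": "Genus",
    "S": "Species",
}


def decode_rank(code):
    """Split a rank code such as 'F1' into (rank name, levels below that rank).

    Assumption: an optional numeric suffix counts levels below the named rank.
    """
    match = re.fullmatch(r"([UDKPCOFGS])(\d*)", code.strip())
    if match is None:
        raise ValueError(f"unrecognised rank code: {code!r}")
    letter, depth = match.groups()
    return RANK_NAMES[letter], int(depth) if depth else 0


print(decode_rank("F1"))  # ('Family', 1)
print(decode_rank("S"))   # ('Species', 0)
```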
Each sample is then included as two columns:
- Read count at this taxon and below: This field contains the summed read count for this taxon and every taxon below it. In this example, 67449 reads were assigned to the superkingdom Bacteria or a rank below Bacteria.
- Read count directly assigned to this taxon: These reads could not be assigned to a lower taxonomic node because of mapping ambiguity and/or the nature of the taxonomic tree (e.g. if the reads are assigned to the lowest rank in the tree). In this example, 25 reads were identified as bacterial in origin, but could not be assigned to lower nodes such as Klebsiella pneumoniae.
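For readers who would rather process the file with a script than in Excel, the following is a minimal Python sketch using only the standard library. It assumes the file is tab-delimited with a single header row laid out as described above (three label columns, then two count columns per sample) and that the sample name is the text before " - " in each count header. The helper names `read_classification` and `reads_below` are hypothetical, and the in-memory row is reconstructed from the Bacteria example in the text; the rank code `D` used for it is an assumption.

```python
import csv
import io


def read_classification(handle):
    """Return one dict per taxon with per-sample (cumulative, direct) counts."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    # Columns 0-2 are the row labels; the rest come in pairs, one pair per sample.
    sample_columns = header[3:]
    rows = []
    for fields in reader:
        if not fields:
            continue
        counts = [int(value) for value in fields[3:]]
        samples = {}
        for i in range(0, len(counts) - 1, 2):
            # Assumed header form: "<sample name> - <column description>".
            name = sample_columns[i].split(" - ")[0]
            samples[name] = (counts[i], counts[i + 1])
        rows.append(
            {"rank": fields[0], "taxid": fields[1], "name": fields[2], "samples": samples}
        )
    return rows


def reads_below(row, sample):
    """Reads assigned strictly below this taxon: cumulative minus direct."""
    cumulative, direct = row["samples"][sample]
    return cumulative - direct


# Illustrative input built from the Bacteria example in the text.
example = io.StringIO(
    "Taxon Rank\tTaxon NCBI ID\tTaxon Name\t"
    "Sample 1 - Read count at this taxon and below\t"
    "Sample 1 - Read count directly assigned to this taxon\n"
    "D\t2\tBacteria\t67449\t25\n"
)
rows = read_classification(example)
print(reads_below(rows[0], "Sample 1"))  # 67424
```

With a real export, the same function can be given a handle from `open("metagenomic_classification-RUN_ID.tsv", newline="")`, with RUN_ID replaced by the actual run identifier.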
Assembled contigs are found in the Assembly folder under Metagenomic Bins as .fna files. These files contain all organisms with sufficient depth in the submitted sequencing data to be assembled. Details on each bin, such as its completeness (e.g. BUSCO count), antimicrobial resistance profile and more, are found in the summary and per-sample reports.
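Since .fna files are nucleotide FASTA, a quick sanity check on a bin is to count its contigs and their lengths. The sketch below is one way to do this in plain Python; the function name `contig_lengths` is hypothetical, and the sequences in the demo are made up purely to show the output shape.

```python
import io


def contig_lengths(handle):
    """Map each FASTA record ID to the length of its sequence."""
    lengths = {}
    current = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record ID is the header text up to the first whitespace.
            parts = line[1:].split()
            current = parts[0] if parts else ""
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


# Made-up two-contig FASTA, for illustration only.
demo = io.StringIO(">contig_1 made-up example\nACGTACGT\nACG\n>contig_2\nTTGA\n")
lengths = contig_lengths(demo)
print(lengths)                # {'contig_1': 11, 'contig_2': 4}
print(sum(lengths.values()))  # 15
```

For a real bin, pass a handle opened on the .fna file in place of `demo`.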