MetaEuk Downloads

Resources produced as part of the MetaEuk manuscript (directory 2019_11)

  1. Assembled Tara Oceans contigs >5 kbp in length and classified as eukaryotic (fasta format)
  2. Protein profiles computed from ~88 million clusters of MERC, MMETSP and Uniclust50 (MMseqs2 database format)
  3. MetaEuk ~12 million predicted proteins in Tara Oceans contigs (fasta format)
  4. MetaEuk ~6 million unique predicted proteins in Tara Oceans contigs (fasta format)
  5. Taxonomic databases (see 2020_TAX_DB for a newer version, not used in the manuscript)
  6. Taxonomic assignment of the MetaEuk predictions in Tara Oceans contigs (tsv format)
  7. Taxonomic assignment of Tara Oceans contigs based on their MetaEuk predictions (tsv format)
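
For readers who want to sanity-check the fasta resources above, here is a minimal Python sketch of the two operations the list implies: keeping sequences longer than a length threshold (item 1) and reducing a set of predictions to unique sequences (items 3 and 4). The function names are hypothetical. Treating "unique" as exact sequence identity is an assumption; the criterion actually used in the manuscript is not stated here.

```python
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, str]  # (header without the leading '>', sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from an iterable of fasta lines."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def longer_than(records: Iterable[Record], min_len: int = 5000) -> Iterator[Record]:
    """Keep records whose sequence is strictly longer than min_len."""
    for header, seq in records:
        if len(seq) > min_len:
            yield header, seq


def unique_by_sequence(records: Iterable[Record]) -> Iterator[Record]:
    """Keep the first record for each distinct sequence (assumed criterion)."""
    seen = set()
    for header, seq in records:
        if seq not in seen:
            seen.add(seq)
            yield header, seq


# Toy input standing in for a real fasta file; with a real file, pass an
# open file handle to read_fasta instead of this list.
demo = [">a", "ACGT", "AC", ">b", "ACGTAC", ">c", "GG"]
records = list(read_fasta(demo))
long_records = list(longer_than(records, min_len=3))
unique_records = list(unique_by_sequence(records))
```

Because the helpers are generators, they can be chained over a large file without holding every record in memory. `unique_by_sequence` still stores each distinct sequence in a set.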

Scripts for the main manuscript analyses

  1. The main scripts and bash files to run MetaEuk on the benchmark and Tara Oceans data are available in:
  2. The input for the RNA polymerase alignment and tree computation is available in: RNA_pol_tree.tar.gz
  3. MMETSP and Uniclust90 taxonomically annotated databases are available in: TAX_DBs (MMseqs2 database format)