Python API

sam2lca.main.list_available_db(dbdir, verbose=False)

List available taxonomy databases

Parameters

dbdir (str) – Path to sam2lca database directory

Returns

list: List of available taxonomy databases
list: List of available acc2tax databases

Return type

list
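A minimal sketch of calling list_available_db from a script. The helper names `default_dbdir` and `show_available_databases` are hypothetical, and `~/.sam2lca` is only an assumed database location. Because the exact shape of the return value is not spelled out above, the result is printed as-is rather than unpacked.

```python
from pathlib import Path


def default_dbdir():
    """Return ~/.sam2lca as a string (an assumed location; adjust as needed)."""
    return str(Path.home() / ".sam2lca")


def show_available_databases(dbdir=None):
    """Call list_available_db and print whatever it returns."""
    # Imported lazily so this module still loads where sam2lca is not installed.
    from sam2lca.main import list_available_db

    # The directory is passed positionally, matching the documented signature.
    available = list_available_db(dbdir or default_dbdir(), verbose=False)
    print(available)
    return available
```

Calling `show_available_databases()` requires sam2lca to be installed; pass an explicit directory if the databases are stored elsewhere.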

sam2lca.main.sam2lca(sam, output=None, dbdir='/home/docs/.sam2lca', taxonomy='ncbi', acc2tax='nucl', process=2, identity=0.8, distance=None, length=30, conserved=False, bam_out=False, bam_split_rank=False, bam_split_read=50)

Performs LCA on SAM/BAM/CRAM alignment file

Parameters
  • sam (str) – Path to SAM/BAM/CRAM alignment file

  • output (str) – Path to sam2lca output file

  • dbdir (str) – Path to database storing directory

  • taxonomy (str) – Type of Taxonomy database

  • acc2tax (str) – Type of acc2tax database

  • process (int) – Number of processes for parallelization

  • identity (float) – Minimum alignment identity threshold

  • distance (int) – Maximum edit distance threshold

  • length (int) – Minimum alignment length

  • bam_out (bool) – Write BAM output file with XT tag for TAXID

  • bam_split_rank (str) – Rank to split BAM output file by TAXID

  • bam_split_read (int) – Minimum number of reads to split BAM output file by TAXID
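The parameters above can be collected and passed to sam2lca as keyword arguments. The following is a sketch only: `build_lca_kwargs` and `run_lca` are hypothetical helpers, the range check on `identity` is an assumption based on the 0.8 default, and the return value of sam2lca is not documented here, so it is simply passed through.

```python
def build_lca_kwargs(sam, output=None, identity=0.8, distance=None,
                     length=30, process=2):
    """Collect keyword arguments using the parameter names from the signature."""
    # Assumption: identity is a fraction in [0, 1], as the 0.8 default suggests.
    if not 0.0 <= identity <= 1.0:
        raise ValueError("identity is expected to be between 0 and 1")
    return {
        "sam": sam,
        "output": output,
        "identity": identity,
        "distance": distance,
        "length": length,
        "process": process,
    }


def run_lca(sam, **options):
    """Run sam2lca on an alignment file with the given options."""
    # Imported lazily so this module still loads where sam2lca is not installed.
    from sam2lca.main import sam2lca

    return sam2lca(**build_lca_kwargs(sam, **options))
```

For example, `run_lca("sample.sorted.bam", identity=0.9, process=4)` would run the LCA with a stricter identity threshold; the file name is a placeholder.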

sam2lca.main.update_database(dbdir='/home/docs/.sam2lca', taxonomy=None, taxo_names=None, taxo_nodes=None, taxo_merged=None, acc2tax='nucl', acc2tax_json='https://raw.githubusercontent.com/maxibor/sam2lca/master/data/acc2tax.json')

Updates the sam2lca taxonomy and acc2tax databases

Parameters
  • dbdir (str) – Path to database storing directory

  • taxonomy (str) – Name of Taxonomy database

  • taxo_names (str) – names.dmp file for taxonomy database. None loads the NCBI taxonomy database

  • taxo_nodes (str) – nodes.dmp file for taxonomy database. None loads the NCBI taxonomy database

  • taxo_merged (str) – merged.dmp file for taxonomy database. None loads the NCBI taxonomy database

  • acc2tax (str) – Type of acc2tax database

  • acc2tax_json (str) – Path to acc2tax json file
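As an illustration of the parameters above, the sketch below points update_database at a custom set of taxonomy dump files. The helper names, the taxonomy name `my_taxonomy`, and the assumption that all three .dmp files sit in one directory are hypothetical; only the keyword names come from the signature.

```python
from pathlib import Path


def custom_taxonomy_kwargs(taxonomy, dump_dir, dbdir=None):
    """Build keyword arguments for a taxonomy stored as .dmp files in dump_dir."""
    dump_dir = Path(dump_dir)
    kwargs = {
        "taxonomy": taxonomy,
        "taxo_names": str(dump_dir / "names.dmp"),
        "taxo_nodes": str(dump_dir / "nodes.dmp"),
        "taxo_merged": str(dump_dir / "merged.dmp"),
        "acc2tax": "nucl",  # default value from the signature
    }
    # Leave dbdir out unless given, so the library default applies.
    if dbdir is not None:
        kwargs["dbdir"] = str(dbdir)
    return kwargs


def build_custom_database(taxonomy, dump_dir, dbdir=None):
    """Call update_database with the custom taxonomy files."""
    # Imported lazily so this module still loads where sam2lca is not installed.
    from sam2lca.main import update_database

    return update_database(**custom_taxonomy_kwargs(taxonomy, dump_dir, dbdir))
```

A call such as `build_custom_database("my_taxonomy", "dumps")` would then build the database from `dumps/names.dmp`, `dumps/nodes.dmp`, and `dumps/merged.dmp`.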