Table 4 Jobs in the GeneTree pipeline for Ensembl release 49

From: eHive: An Artificial Intelligence workflow system for genomic analysis

| Analysis | Number of jobs | Failed jobs | Granularity |
|---|---|---|---|
| GenomeDumpFasta | 39 | - | 1 per genome |
| GenomeLoadMembers | 39 | - | 1 per genome |
| GenomeSubmitPep | 39 | - | 1 per genome |
| CreateBlastRules | 39 | - | 1 per genome |
| SubmitPep_* | 682412 | - | 1 per peptide |
| blast_* | 26614068 | - | All vs all peptides |
| UpdatePAFids | 1 | - | 1 per pipeline |
| PAFCluster | 1 | - | 1 per pipeline |
| Muscle | 26484 | 7 | 1 per genetree |
| BreakPAFCluster | 95 | - | As many as required |
| TreeBeST | 26477 | 9 | 1 per genetree |
| OrthoTree | 26468 | - | 1 per genetree |
| CreateHomology_dNdSJob | 1 | - | 1 per pipeline |
| Homology_dNdS | 3646340 | 1364 | 1 per orthologous gene pair |
| Threshold_on_dS | 1 | - | 1 per pipeline |
| TOTAL | 31022503 | 1380 | |

  1. This table shows the final number of jobs run for each analysis during the execution of the GeneTree pipeline for 39 species, together with the number of jobs that failed. All the SubmitPep_xxxxx and blast_yyyyy jobs have been grouped for simplicity. Failed Muscle and TreeBeST jobs were recovered using the BreakPAFCluster analysis, which breaks the cluster and creates new Muscle jobs.
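
The failure counts in the table can be put in proportion with a short calculation. The following is a minimal Python sketch, not part of eHive; the names `counts` and `failure_rates` are made up for illustration, and the figures are copied from the three rows that report failed jobs.

```python
# (jobs, failed) pairs copied from the table; only analyses with failures are listed.
counts = {
    "Muscle": (26484, 7),
    "TreeBeST": (26477, 9),
    "Homology_dNdS": (3646340, 1364),
}


def failure_rates(counts):
    """Return {analysis: failed / jobs} for each analysis (hypothetical helper)."""
    return {name: failed / jobs for name, (jobs, failed) in counts.items()}


rates = failure_rates(counts)
total_failed = sum(failed for _, failed in counts.values())

for name, rate in rates.items():
    print(f"{name}: {rate:.4%} of jobs failed")
print(f"Total failed jobs: {total_failed}")
```

Each of the three rates comes out below 0.1%, and the failed counts sum to the 1380 given in the TOTAL row.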