Table 1. Structural rules used to extract sequence record properties from GenBank and GenPept. For each structural rule, the priority (lower numbers indicate higher priority) and the XPath expression are given. The proteinName property was extracted only from GenPept.

From: Rule-based knowledge aggregation for large-scale protein sequence analysis of influenza A viruses

| Property | Priority | XPath expression |
| --- | --- | --- |
| proteinName | 1 | /GBSeq/GBSeq_definition |
| | 2 | /GBSeq/GBSeq_feature-table/GBFeature/GBFeature_quals/GBQualifier[GBQualifier_name='gene']/GBQualifier_value |
| subtype | 1 | /GBSeq/GBSeq_feature-table/GBFeature/GBFeature_quals/GBQualifier[GBQualifier_name='strain']/GBQualifier_value |
| | 2 | /GBSeq/GBSeq_feature-table/GBFeature/GBFeature_quals/GBQualifier[GBQualifier_name='isolate']/GBQualifier_value |
| | 3 | /GBSeq/GBSeq_feature-table/GBFeature/GBFeature_quals/GBQualifier[GBQualifier_name='organism']/GBQualifier_value |
| isolate | 1 | /GBSeq/GBSeq_feature-table/GBFeature/GBFeature_quals/GBQualifier[GBQualifier_name='strain']/GBQualifier_value |
| | 2 | /GBSeq/GBSeq_feature-table/GBFeature/GBFeature_quals/GBQualifier[GBQualifier_name='isolate']/GBQualifier_value |
| | 3 | /GBSeq/GBSeq_feature-table/GBFeature/GBFeature_quals/GBQualifier[GBQualifier_name='organism']/GBQualifier_value |
| host | 1 | /GBSeq/GBSeq_feature-table/GBFeature/GBFeature_quals/GBQualifier[GBQualifier_name='specific_host']/GBQualifier_value |
| | 2 | /GBSeq/GBSeq_feature-table/GBFeature/GBFeature_quals/GBQualifier[GBQualifier_name='strain']/GBQualifier_value |
| | 3 | /GBSeq/GBSeq_feature-table/GBFeature/GBFeature_quals/GBQualifier[GBQualifier_name='isolate']/GBQualifier_value |
| | 4 | /GBSeq/GBSeq_feature-table/GBFeature/GBFeature_quals/GBQualifier[GBQualifier_name='organism']/GBQualifier_value |
| origin | 1 | /GBSeq/GBSeq_feature-table/GBFeature/GBFeature_quals/GBQualifier[GBQualifier_name='country']/GBQualifier_value |
| | 2 | /GBSeq/GBSeq_feature-table/GBFeature/GBFeature_quals/GBQualifier[GBQualifier_name='isolation_source']/GBQualifier_value |
| | 3 | /GBSeq/GBSeq_feature-table/GBFeature/GBFeature_quals/GBQualifier[GBQualifier_name='strain']/GBQualifier_value |
| | 4 | /GBSeq/GBSeq_feature-table/GBFeature/GBFeature_quals/GBQualifier[GBQualifier_name='isolate']/GBQualifier_value |
| | 5 | /GBSeq/GBSeq_feature-table/GBFeature/GBFeature_quals/GBQualifier[GBQualifier_name='organism']/GBQualifier_value |
| | 6 | /GBSeq/GBSeq_references/GBReference/GBReference_title |
| year | 1 | /GBSeq/GBSeq_feature-table/GBFeature/GBFeature_quals/GBQualifier[GBQualifier_name='note']/GBQualifier_value |
| | 2 | /GBSeq/GBSeq_feature-table/GBFeature/GBFeature_quals/GBQualifier[GBQualifier_name='isolation_source']/GBQualifier_value |
| | 3 | /GBSeq/GBSeq_feature-table/GBFeature/GBFeature_quals/GBQualifier[GBQualifier_name='strain']/GBQualifier_value |
| | 4 | /GBSeq/GBSeq_feature-table/GBFeature/GBFeature_quals/GBQualifier[GBQualifier_name='isolate']/GBQualifier_value |
| | 5 | /GBSeq/GBSeq_feature-table/GBFeature/GBFeature_quals/GBQualifier[GBQualifier_name='organism']/GBQualifier_value |
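
As an illustration of how priority-ordered structural rules of this kind might be applied, the following Python sketch evaluates the four host rules against a single GBSeq XML record and returns the text found by the highest-priority rule that matches. It is not the authors' implementation. It assumes that the first rule yielding a non-empty value wins, which the table does not state explicitly. The names `HOST_RULES` and `first_match` are hypothetical, and the sample record is invented for the example. Because the standard-library `xml.etree.ElementTree` supports only a subset of XPath and rejects absolute paths on an element, the leading `/GBSeq/` is rewritten as a path relative to the record root.

```python
import xml.etree.ElementTree as ET

QUALS = "/GBSeq/GBSeq_feature-table/GBFeature/GBFeature_quals/GBQualifier"

# The 'host' rules from Table 1 as (priority, XPath expression) pairs.
HOST_RULES = [
    (1, QUALS + "[GBQualifier_name='specific_host']/GBQualifier_value"),
    (2, QUALS + "[GBQualifier_name='strain']/GBQualifier_value"),
    (3, QUALS + "[GBQualifier_name='isolate']/GBQualifier_value"),
    (4, QUALS + "[GBQualifier_name='organism']/GBQualifier_value"),
]


def first_match(record, rules):
    """Return (priority, text) for the highest-priority matching rule, else None.

    Assumption: the first rule that yields non-empty text wins.
    """
    prefix = "/" + record.tag + "/"
    for priority, xpath in sorted(rules):
        if not xpath.startswith(prefix):
            continue  # rule is written for a different root element
        # Rewrite the absolute path as one relative to the record root.
        relative = "./" + xpath[len(prefix):]
        for node in record.findall(relative):
            text = (node.text or "").strip()
            if text:
                return priority, text
    return None


# Invented sample record (not real GenBank data): no 'specific_host' qualifier.
SAMPLE = """
<GBSeq>
  <GBSeq_definition>hemagglutinin [Influenza A virus]</GBSeq_definition>
  <GBSeq_feature-table>
    <GBFeature>
      <GBFeature_quals>
        <GBQualifier>
          <GBQualifier_name>organism</GBQualifier_name>
          <GBQualifier_value>Influenza A virus</GBQualifier_value>
        </GBQualifier>
        <GBQualifier>
          <GBQualifier_name>strain</GBQualifier_name>
          <GBQualifier_value>A/chicken/Example/1/2000(H5N1)</GBQualifier_value>
        </GBQualifier>
      </GBFeature_quals>
    </GBFeature>
  </GBSeq_feature-table>
</GBSeq>
"""

record = ET.fromstring(SAMPLE)
result = first_match(record, HOST_RULES)
print(result)
```

Here the priority-1 rule finds no `specific_host` qualifier, so the sketch falls back to the priority-2 rule and returns the raw `strain` value. Turning that raw text into an actual host name would need a further parsing step, which is outside the scope of this example.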