Feature | Principle | Scoring criteria | Result |
---|---|---|---|
Pseudogenes linked to HPs | It is generally believed that the majority of HPs are the products of pseudogenes. Follow-up of BLAST: if the hits do not have the start codon ATG in any of the six reading frames, the sequence may be assumed to be a pseudogene. | Predicted and synthetic sequences, and sequences with end-to-end alignment, are ignored. Sequences from Homo sapiens with E-value less than zero are considered. | Sequences that do not start with methionine and meet all the above criteria are given 1, otherwise 0. |
Homology Modelling | Because sequence and structure imply function, a function can be assigned to an HP if the protein can be modelled to find any interacting domains. | Based on % identity between query and PDB template | If there is more than 30% similarity, score = 1, otherwise 0. |
Non-coding RNAs associated with HPs | Most of the HPs from GenBank lack protein-coding capacity, and some of them may themselves be non-coding RNAs. | The top three hits are considered for sequences from Homo sapiens, while the top five hits are considered when there is no considerable difference between scores. | If the above criterion is met, score 1, otherwise 0. |
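
The binary scoring in the table can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the `BlastHit` record and its field names are hypothetical, the E-value filter is treated as a flag computed upstream, and the sketch assumes that one eligible hit lacking a start methionine is enough to score 1. The homology score takes the percent identity named in the scoring criteria column as its input.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class BlastHit:
    """Hypothetical record for one BLAST hit; field names are illustrative."""
    organism: str
    starts_with_met: bool          # translated hit begins with methionine (ATG)
    predicted_or_synthetic: bool   # annotated as a predicted or synthetic sequence
    end_to_end_alignment: bool     # aligns end to end with the query
    passes_evalue_filter: bool     # E-value criterion, assumed applied upstream


def pseudogene_score(hits: Iterable[BlastHit]) -> int:
    """Return 1 if an eligible human hit lacks a start methionine, else 0."""
    eligible = [
        h for h in hits
        if h.organism == "Homo sapiens"
        and not h.predicted_or_synthetic
        and not h.end_to_end_alignment
        and h.passes_evalue_filter
    ]
    # Assumption: a single eligible hit without methionine is sufficient.
    return int(any(not h.starts_with_met for h in eligible))


def homology_score(percent_identity: float, cutoff: float = 30.0) -> int:
    """Return 1 if identity to the PDB template exceeds the cutoff, else 0."""
    return int(percent_identity > cutoff)
```

Keeping each feature as a separate function that returns 0 or 1 mirrors the table, so further features can be added and summed without changing the existing ones.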