A partial snapshot of the Biozon hierarchical document classification model. A major distinction is made between descriptors and objects (see text for details). A class can appear in the hierarchy because of either physical or semantic differences in the nature of its documents. For example, amino acid and nucleic acid sequences are both stored as text strings in the database, so their internal representations are identical (although over different alphabets). However, they represent fundamentally different real-world objects and should be classified as such. A special subclass of objects is the locus. This type serves to localize information with respect to larger objects, or to efficiently represent objects that are essentially sub-entities of other existing objects (for example, a protein domain is a locus with respect to a protein sequence, with specific start and end positions).
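The hierarchy described in this caption can be illustrated with a minimal Python sketch. It is not Biozon's actual schema: all class and field names (`Document`, `Descriptor`, `BiozonObject`, `AminoAcidSequence`, `NucleicAcidSequence`, `Locus`, `residues`, `parent`) are hypothetical, and the 1-based inclusive coordinates are an assumption made only for this example. The sketch shows two points from the caption: two classes can share a physical representation yet remain semantically distinct, and a locus stores only a reference to a larger object plus start and end positions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Document:
    """Root of the hierarchy (hypothetical name)."""
    doc_id: int


@dataclass
class Descriptor(Document):
    """A document that describes objects, e.g. an annotation."""
    text: str = ""


@dataclass
class BiozonObject(Document):
    """A document that stands for a real-world entity."""


@dataclass
class AminoAcidSequence(BiozonObject):
    residues: str = ""  # text string over the amino acid alphabet


@dataclass
class NucleicAcidSequence(BiozonObject):
    residues: str = ""  # same representation, different alphabet


@dataclass
class Locus(BiozonObject):
    """A sub-entity of a larger object, stored as a reference plus positions."""
    parent: Optional[BiozonObject] = None
    start: int = 0  # assumed 1-based, inclusive
    end: int = 0    # assumed 1-based, inclusive

    def extract(self) -> str:
        # Recover the sub-sequence from the parent instead of storing a copy.
        return self.parent.residues[self.start - 1:self.end]


# A made-up protein and a domain located at positions 3..6 on it.
protein = AminoAcidSequence(doc_id=1, residues="MKTAYIAKQR")
domain = Locus(doc_id=2, parent=protein, start=3, end=6)
print(domain.extract())  # prints "TAYI"
```

Keeping `AminoAcidSequence` and `NucleicAcidSequence` as separate classes, even though both hold a plain string, mirrors the semantic distinction the caption draws. The `Locus` class avoids duplicating sequence data by deriving it from the parent on demand.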