The architecture of EHCO. (A) EHCO uses a content management system, Plone, to maintain different types of information. Plone supports workflow design, content sharing, front-end editing, and member registration. Softbots, software agents that interact with a software environment and interpret its feedback, serve as annotation collectors that retrieve genomic information scattered across the Internet. EHCO also implements Natural Language Processing, a subfield of artificial intelligence and linguistics, and the Gene Name Service, a comprehensive cross-reference service covering all widely used gene ID nomenclatures, to support the annotation engine, which is backed by a MySQL database and Python scripts. The Presentation Engine uses Wiki pages to allow dynamic information display as well as user commenting. (B) We performed biological information retrieval (IR) to obtain PubMed abstracts that may contain gene-HCC relationships, followed by two information extraction tasks: (1) Named Entity Recognition (NER), to recognize biomedical named entities (NEs), and (2) Named Entity Relation Recognition (NERR), as shown in the flowchart.