The Drug Ontology (DrOn) is a modular, extensible ontology of drug products, their ingredients, and their biological activity [1,2,3,4]. It was created to enable comparative effectiveness and health services researchers to query National Drug Codes (NDCs) [5] that represent products by ingredient, by molecular disposition (e.g., beta-adrenergic receptor molecule blockade), by therapeutic disposition (e.g., antihypertensive), and by physiological effect (e.g., diuretic). It is based on the RxNorm [6] drug terminology maintained by the U.S. National Library of Medicine (NLM), and on Chemical Entities of Biological Interest (ChEBI) [7].
This paper presents improvements to DrOn in its handling of identifiers used in RxNorm and elsewhere to manage drug information. These improvements include a new model for handling National Drug Codes, which are centrally registered identifiers that denote packaged drug products, as Information Content Entities (ICEs), and a handling of RxNorm unique concept identifiers (RXCUIs), also as ICEs. Our approach also makes it possible to model and track changes to these information entities over time. This comprehensive modeling of NDCs and RXCUIs as ICEs, as well as of the entities and processes involved in managing these identifiers, improves DrOn’s representation by bringing it into close correspondence with reality.
With this amendment, DrOn becomes even more useful for prescription data management, especially in those cases where an NDC is used to denote different packaged drug products at different points in time, or where an RXCUI that is now no longer active was used to denote a prescribed product in historic data that needs to be accurately interpreted. Because NDC codes may be re-used [8], without a rich representation of NDC histories, it can be difficult to determine which drug product is denoted by a given NDC in historic records, an issue solved by this proposed amendment to DrOn. RXCUIs are never reused once they are retired but they also require explicit modeling in DrOn to track their evolution over time through their participation in processes such as retirement/deactivation, and various types of remappings [1, 3].
National Drug Codes
National Drug Codes (NDCs) are numeric codes issued by the US Food and Drug Administration (FDA) and published in a National Drug Code Directory that is updated daily [5]. Each NDC has three segments, which uniquely identify 1) the labeling entity (drug manufacturers, distributors), 2) the drug product, including strength, dose, and formulation, and 3) packaging. Once deactivated, an NDC may be re-used as soon as five years later by the same labeling entity to identify a different product [9]. Though rare, NDC re-use can create difficulties when managing prescription records and other historic data that use NDCs (especially long-term longitudinal databases of pharmacy claims records that span five years or more), because an NDC is not guaranteed to uniquely identify a particular packaged drug product over time. RxNorm tracks the assignment and deactivation of NDC codes, and this information is made available for each NDC through regular releases of RxNorm. The NLM’s RxNav tools provide the ability to retrieve information about the history of a single NDC using the ndcstatus function of the RxNav [10, 11] REST API. As an example, accessing the RxNav REST API gives an XML version of the history for the NDC code 51655072052, shown in Listing 1.
From the perspective of RxNorm, this appears to be a single code that was created once, in 2007, and associated with its first concept/RxCUI at that time (RxCUI 308119, which identifies the concept Aminophylline 200 MG Oral Tablet). The nature of this association is that the NDC code identifies a packaged drug product that contains oral tablets with 200 mg of Aminophylline. Note that there are possibly several other NDC codes that we do not consider here but that also identify packaged drug products containing tablets of the same dose of the same substance, for instance packages that have a different number of tablets than the packaging identified by 51655072052, or packages produced by other manufacturers.
This NDC code 51655072052 was later deactivated, in 2012, ending its association with RxCUI 308119. Later yet, in 2017, this same NDC code was newly associated with a different RxCUI, 309114, which identifies the concept Cephalexin 500 MG Oral Capsule. It is important to note that Aminophylline Tablets and Cephalexin Capsules have no clear conceptual association but only this accidental connection caused by the reuse of an NDC.
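The difficulty that this history creates for interpreting older records can be made concrete with a minimal sketch. The code below is illustrative only and is not part of DrOn or RxNav: the names `NdcAssignment`, `HISTORY`, and `resolve_ndc` are hypothetical, the history is recorded at year granularity because only the years are given above, and treating the deactivation year as still covered is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NdcAssignment:
    """One period during which an NDC denoted a given RxNorm concept."""
    rxcui: str
    concept: str
    start_year: int
    end_year: Optional[int]  # None means the assignment is still active


# History of NDC 51655072052 as described in the text (years only).
HISTORY = {
    "51655072052": [
        NdcAssignment("308119", "Aminophylline 200 MG Oral Tablet", 2007, 2012),
        NdcAssignment("309114", "Cephalexin 500 MG Oral Capsule", 2017, None),
    ]
}


def resolve_ndc(ndc: str, year: int) -> Optional[NdcAssignment]:
    """Return the assignment in force for `ndc` in `year`, if any.

    Assumption: both boundary years are treated as inclusive.
    """
    for assignment in HISTORY.get(ndc, []):
        started = assignment.start_year <= year
        not_ended = assignment.end_year is None or year <= assignment.end_year
        if started and not_ended:
            return assignment
    return None
```

Under these assumptions, a claim dated 2010 resolves to the Aminophylline concept, a claim dated 2018 resolves to the Cephalexin concept, and a claim dated 2014 resolves to nothing because the code was inactive. A lookup that ignores the date cannot make this distinction.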
Currently in DrOn, each NDC is represented as an rdfs:label attached to a class for the corresponding packaged drug product. For example, as shown in Fig. 1, the product with NDC 51655072052, a packaged drug product that includes as part one or more Aminophylline 200 MG Oral Tablets, has its own class with 51655072052 as the label. In this scheme, it is not explicitly represented that 51655072052 is an NDC. Further, any historic information about NDCs using the symbol 51655072052 is unavailable to users of DrOn.
In Fig. 1, as in other figures depicting Web Ontology Language (OWL) [12,13,14] ontology fragments throughout this manuscript, we use the following conventions:
Ovals stand for OWL classes, with the rdfs:label for each class appearing as text in its oval. For example, in Fig. 1, the DrOn class ‘Packaged Drug Product’ is represented by an oval containing the class name.
Arrows stand for relations. In Fig. 1, as in other figures, a dotted subClassOf arrow connects a specific oral tablet class to the class DRON: drug tablet. Such arrows are dotted rather than solid to indicate that the relation in question is not directly asserted in DrOn, but is inferred through the transitivity of the subclass relation.
Empty circles stand for individuals.
Rectangles stand for annotation values (aside from rdfs:labels).
To enhance its potential as a tool for managing prescription data, we are adding to DrOn explicit representations of NDCs as Information Content Entities. This representation allows us to account for the creation/assignment, deactivation, and re-use of NDC symbols, and includes temporal information about these processes and links to the correct packaged drug product and drug product classes.
RxNorm unique concept identifiers
RxNorm is primarily organized around its concepts, identified by RXCUIs (RxNorm unique concept identifiers). These concepts correspond to such entities as drugs, ingredients, dose forms, brand names, etc., and are arranged into a graph that expresses relations among these entities. Concepts are also used to group together synonyms, for instance the different terms that name the same product across drug information sources. For example, the following are all names for an oral tablet consisting of 250 mg of Naproxen [15].
‘Naproxen Tab 250 MG’
‘Naproxen 250mg tablet (product)’
‘NAPROXEN@250 mg@ORAL@TABLET’
‘Naproxen 250 MILLIGRAM In 1 TABLET ORAL TABLET’
‘NAPROXEN 250MG TAB,UD [VA Product]’
RxNorm groups these as synonyms under the same concept, identified by the RXCUI, 198013.
Within RxNorm itself, each of these terms is represented by its own atom, and these atoms are grouped together to form concepts. As a realist ontology of drug products, DrOn focuses on representing the products themselves, and information about their ingredients and biological activity, rather than terminological information such as which string may be used to identify a product in different information sources. Hence, DrOn contains information about RxNorm concept identifiers, linking these directly to ontology classes that represent the entities that RxNorm concepts correspond to, but does not concern itself with the atoms used to organize synonyms within RxNorm.
Unlike NDCs, RXCUIs are never reused. They can and do, however, undergo other changes. RxNorm tracks concept changes over time with each release. These changes include the creation of a concept (e.g., for a newly available drug product), but also concept retirement. RXCUIs may be retired/deactivated for several different reasons, including the removal of a concept that is an error, merging two concepts that are discovered to be synonymous (in which case one of the two RXCUIs is retired), or splitting a concept into two or more new concepts (in which case the old RXCUI is retired). Once retired, an RXCUI is kept in the inactive state through all future releases of RxNorm [16].
Figure 2 shows the remapping of RXCUI 197523 (Clomiphene 50 MG Oral Tablet), which was created in the May 2006 release of RxNorm, to a new RXCUI 1093060 (Clomiphene Citrate 50 MG Oral Tablet) in March 2011. The original RXCUI is retired as part of the remapping.
Figure 3 shows a split in which a single RXCUI is replaced by two others. RXCUI 197587 (Glucose 50 MG/ML / Sodium Chloride 0.154 MEQ/ML Injectable Solution) first appears in the May 2006 release of RxNorm. In June 2016 it is split by retiring the original and marking it as replaced by two new RXCUIs: 1795344 (500 ML Glucose 50 MG/ML / Sodium Chloride 9 MG/ML Injection) and 1795346 (1000 ML Glucose 50 MG/ML / Sodium Chloride 9 MG/ML Injection). Note that the distinguishing feature between these two new drug solution concepts is their total volume (500 ML vs. 1000 ML). The original concept does not specify a volume, but only provides concentrations. Conceivably the original concept was used for both 500 ML and 1000 ML versions before the need to represent this distinction became evident and was included in RxNorm.
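The remapping and split described for Figs. 2 and 3 can be read as replacement links from a retired RXCUI to its successors. The following minimal sketch, which is illustrative and not DrOn's build process, follows such links until it reaches identifiers with no recorded replacement. The names `REPLACED_BY` and `current_rxcuis` are hypothetical, and the rule that an RXCUI absent from the table counts as active is an assumption made for the example.

```python
from typing import Dict, List

# Replacement links taken from the two examples in the text.
# An empty list would model a concept retired with no successor
# (e.g., one removed as an error).
REPLACED_BY: Dict[str, List[str]] = {
    "197523": ["1093060"],             # remapping (Fig. 2)
    "197587": ["1795344", "1795346"],  # split (Fig. 3)
}


def current_rxcuis(rxcui: str,
                   replaced_by: Dict[str, List[str]] = REPLACED_BY) -> List[str]:
    """Follow replacement links from `rxcui` to RXCUIs with no replacement.

    Assumption: an RXCUI that does not appear as a key is still active.
    """
    active: List[str] = []
    seen = set()
    stack = [rxcui]
    while stack:
        current = stack.pop()
        if current in seen:  # guard against malformed cyclic data
            continue
        seen.add(current)
        successors = replaced_by.get(current)
        if successors is None:
            active.append(current)
        else:
            stack.extend(successors)
    return sorted(active)
```

A split returns more than one candidate, so a historic record that uses the retired RXCUI 197587 can only be narrowed to the set of its successors, not to a single product.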
Though DrOn’s current build process does use and manage historic information about the provenance of RXCUIs [2] in order to determine which RXCUIs are currently active as of each release, and what entities they currently denote as of that release, a DrOn ontology release that is the output of the DrOn build process does not itself represent the history or provenance of RXCUIs, or make explicit representation of the RXCUIs as identifiers. Rather, it simply attaches RXCUIs to the ontology classes that they correspond to using the annotation property ‘has_rxcui,’ as shown in Fig. 1.
This simple representation of RXCUIs allows users of DrOn to easily see the current RXCUI for each class, and to easily query and retrieve current drug information based on RXCUIs, but it is of limited usefulness for working with historic data, for instance to make sense of five-year-old prescription records stored in an electronic health record system using RXCUIs that were current at the time the data were generated by the system. In this scenario, the current version of DrOn is very useful for answering questions about those entries using RXCUIs that have not changed since the prescriptions were originally written, questions such as: Which patients were prescribed a product that contains a dose of more than 50 mg of any opioid? However, for a prescription entry using an RXCUI that has been remapped or otherwise retired, DrOn will not be able to provide any information about the drug that was prescribed.