
Table 4 Drug characteristics with poorly discriminative index keywords that nonetheless achieved good overall predictive performance

From: BICEPP: an example-based statistical text mining method for predicting the binary characteristics of drugs

| Category | Characteristic | Position of IK(s) | n(pos) | Best AUC* | Top-20 predictive tokens/words (AUC†) |
| --- | --- | --- | --- | --- | --- |
| AE | Cystitis | 10.7 pct | 13 | 1.000 | nsaids, cyclooxygenase, nimesulide, meloxicam (0.999); nsaid, diclofenac, naproxen, antiinflammatory, non-steroidal (0.998); ibuprofen, anti-inflammatory, nonsteroidal (0.997); ketoprofen, antipyretic (0.996); indomethacin (0.993); osteoarthritis (0.991); pge2 (0.991); prostanoid (0.988); thromboxane (0.986); prostaglandin (0.985) |
| AE | Dyslipidaemia | 14.0 pct | 13 | 0.900 | aldosterone (0.95); acetazolamide, mineralocorticoid (0.94); deoxycorticosterone (0.93); pge2, indomethacin, hearing (0.88); spironolactone, mineralocorticoids, hyponatremia, renin, adh, ace (0.87); furosemide (0.86); insipidus, asthmatic, prostaglandins, fev1, pra, phenylbutazone (0.85) |
| AE | Migraine | 76.0 pct | 14 | 0.906 | angiotensin (0.89); plasminogen, dbp, insulin (0.85); infarction, low-density, losartan (0.84); hormonal, brachial, run-in, fixed-dose (0.83); lipoprotein, valsartan, endothelium-dependent, renin (0.82); pravastatin, hba1c (0.81); angiotensinogen, chd, smoking (0.80) |
| AE | Oesophagitis | 10.0 pct | 19 | 0.919 | metastases (0.92); marrow (0.90); weekly (0.88); metastatic, antitumor (0.87); cancer (0.86); 3-year, toxicity, regimen, nadir, breast, metastasis (0.85); myeloma, survival, cancers, prostate (0.84); regimens, remission, cytotoxic, melphalan (0.83) |
| AE | Paralytic ileus | 59.4 pct | 12 | 0.961 | amitriptyline (0.96); tricyclic (0.95); antidepressants, anticholinergic, antidepressant (0.94); neuroleptics, chlorpromazine, depressive, overdose (0.93); tca, serotonin, diazepam (0.92); intoxication (0.91); thioridazine, clonidine (0.90); psychotropic, affective, psychological, antinociceptive (0.89); constipation (0.87) |
| AE | Thrombocytopenic purpura | 10.6 pct | 11 | 1 | infarction, ejection (0.95); intra-arterial (0.94); echocardiography (0.93); st-segment (0.93); echocardiographic (0.90); beta-blocking (0.89); beta-blocker (0.87); cardiology, diacetolol, bopindolol, bucindolol, beta-adrenoceptor-blocking, adp, beta-ars, beta1-selective, beta-adrenoblockers, cardioselectivity, atenolol, non-fatal (0.86) |
| MC | 5HT3 antagonists | 10.0 pct | 4 | 1 | 5-ht3ra, 5-ht3ra/dexamethasone, granisetron, ondansetron, 1966-september, 5-ht3, tropisetron, 5-hydroxytryptamine3, anti-emetic (1); dolasetron, emetic, ramosetron, 5-ht3-receptor, cinv, 5ht3, emetogenic, type-3, emesis, setrons, pov (0.999) |
| MC | Antibacterials (ear) | 14.7 pct | 4 | 0.98 | enrofloxacin, chloramphenicol, gentamycin (0.99); oxytetracycline, kanamycin, polymyxin, colistin, gentamicin, bacitracin, neomycin, povidone-iodine, fusidic, streptomycin, bacterial, septicemia, swabs (0.98); anaerobic, peru, tetracycline, aminoglycoside (0.97) |

*) Best area under the ROC curve (AUC) achieved by any of the four machine learning algorithms, as evaluated by stratified cross-validation. †) AUCs used during the feature selection process. Abbreviations: AE, adverse events; IK, index keywords; MC, minor drug classes; pct, percentile; n(pos), number of positive examples in each drug characteristic dataset.
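
As a rough illustration of what a per-token AUC such as those in the last column measures, the sketch below scores a single token by how well a per-drug value for that token ranks positive drugs above negative ones. It is a minimal example under stated assumptions, not the procedure used in the paper: the function name `token_auc`, the use of raw occurrence counts as the per-drug score, and the toy data are all hypothetical.

```python
def token_auc(scores, labels):
    """Return the AUC of a single token's per-drug scores.

    Uses the pairwise (Mann-Whitney) definition: the fraction of
    positive/negative pairs in which the positive example has the
    higher score, with ties counted as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: how often one token occurs in the text for each
# of six drugs, and whether each drug has the characteristic (1) or not (0).
counts = [5, 3, 0, 1, 0, 0]
labels = [1, 1, 0, 0, 1, 0]

auc = token_auc(counts, labels)
print(f"token AUC = {auc:.3f}")
```

An AUC near 1 means the token alone separates positive from negative drugs almost perfectly, while 0.5 corresponds to no discrimination. With very few positive examples, as in the rows where n(pos) is 4, only a handful of pairs are compared, so high values are easier to reach by chance.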