Origins of events proposed by the stacked model. For each event in the stacked model's output, we show which base model(s) originally proposed it, or whether it was novel (generated only by the stacked model). Events are grouped by event type and by whether they were correct with respect to the gold standard. Event origin (only from the Stanford 2P predictions, only from UMass, from both Stanford 2P and UMass, or novel to the stacking output) is marked by hatching, while event correctness is indicated by color. Several observations can be made: novel events are more often incorrect than correct, events originating from both base models have high precision, and Gene Expression events show high agreement between the two base models.
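The grouping described in this caption can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes each event is a hashable tuple whose first element is the event type, and that two events match only if the tuples are identical. The function names, the tuple layout, and the toy events are all hypothetical.

```python
from collections import Counter

def event_origin(event, stanford_events, umass_events):
    """Label an event in the stacked output by which base model(s) proposed it."""
    in_stanford = event in stanford_events
    in_umass = event in umass_events
    if in_stanford and in_umass:
        return "both"
    if in_stanford:
        return "stanford_only"
    if in_umass:
        return "umass_only"
    return "novel"  # proposed by neither base model

def tally_origins(stacked_events, stanford_events, umass_events, gold_events):
    """Count stacked events per (event type, origin, correct?) cell."""
    counts = Counter()
    for event in stacked_events:
        event_type = event[0]  # assumed layout: (type, trigger, argument)
        origin = event_origin(event, stanford_events, umass_events)
        is_correct = event in gold_events
        counts[(event_type, origin, is_correct)] += 1
    return counts

if __name__ == "__main__":
    # Made-up toy events, for illustration only.
    stanford = {("Gene_expression", "T1", "P1"), ("Binding", "T2", "P2")}
    umass = {("Gene_expression", "T1", "P1"), ("Phosphorylation", "T3", "P3")}
    gold = {("Gene_expression", "T1", "P1"), ("Phosphorylation", "T3", "P3")}
    stacked = [
        ("Gene_expression", "T1", "P1"),
        ("Binding", "T2", "P2"),
        ("Phosphorylation", "T3", "P3"),
        ("Binding", "T4", "P4"),
    ]
    for cell, n in sorted(tally_origins(stacked, stanford, umass, gold).items()):
        print(cell, n)
```

Each key of the resulting counter corresponds to one bar segment in the figure: the event type selects the group, the origin selects the hatching, and the correctness flag selects the color.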