Differences on projective and non-projective documents. Document-wise differences in performance (number of predicted events matched with gold) between two models. Positive differences indicate that the first model outperformed the second. Each chart includes one boxplot of these differences for projective documents (i.e., documents containing no non-projective arcs) and one for non-projective documents. Each boxplot shows the median (solid red line), the 25th and 75th percentiles (blue lines), whiskers spanning up to 1.5 times the interquartile range (the distance between the 25th and 75th percentiles; solid black lines), and outliers (blue plus symbols). In general, larger improvements occur on non-projective documents. The FAUST model performs comparably to UMass on projective documents (the 25th, 50th, and 75th percentiles here are all 0). On non-projective documents, however, the differences are generally positive, indicating that non-projective documents are one class of documents that stacking improves. These experiments were performed on the development section of Genia, which contains 205 projective documents and 54 documents with at least one non-projective arc.
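The statistics summarized in each boxplot can be sketched in a few lines of Python. This is only an illustration, not the code used to produce the figure: the function names, the dictionary layout, and the toy counts are all hypothetical, and the whisker rule (whiskers end at the most extreme data point within 1.5 times the interquartile range of the quartiles) is one common convention that we assume here. Quartile interpolation also varies between plotting tools, so the exact values may differ slightly from those drawn in the charts.

```python
from statistics import quantiles


def differences_by_class(matched_a, matched_b, is_projective):
    """Split per-document differences (model A minus model B) by projectivity."""
    groups = {"projective": [], "non-projective": []}
    for doc_id, count_a in matched_a.items():
        key = "projective" if is_projective[doc_id] else "non-projective"
        groups[key].append(count_a - matched_b[doc_id])
    return groups


def boxplot_summary(values):
    """Quartiles, whisker ends, and outliers under the assumed 1.5 * IQR rule."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inliers = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inliers),
        "whisker_high": max(inliers),
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }


# Toy counts of matched events per document, invented for illustration only.
matched_a = {"d1": 5, "d2": 3, "d3": 7, "d4": 4, "d5": 9, "d6": 2}
matched_b = {"d1": 5, "d2": 3, "d3": 5, "d4": 4, "d5": 6, "d6": 2}
is_projective = {"d1": True, "d2": True, "d3": False,
                 "d4": True, "d5": False, "d6": True}

groups = differences_by_class(matched_a, matched_b, is_projective)
summaries = {name: boxplot_summary(diffs) for name, diffs in groups.items()}
```

With these toy counts the projective group has all three quartiles at 0, mirroring the pattern described for FAUST versus UMass, while the non-projective differences are positive.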