Fig. 4

From: Investigation of improving the pre-training and fine-tuning of BERT model for biomedical relation extraction

Probing classifier architecture. We freeze the parameters of the BERT model during the training of the probing classifier. Through the learned weights \(\alpha \), we can measure the relevance of each layer to the task. In addition, by building a series of probing classifiers \(\{P_\tau ^l\}_{l=1}^L\), we can tell in which layer the knowledge for a specific instance is learned. For the relation extraction instance "RFX5 interacts with histone deacetylase 2", if the probing classifier \(P_\tau ^l\) correctly predicts the interacting relationship between the proteins "RFX5" and "histone deacetylase 2" using the information of the first \(l\) layers, but \(P_\tau ^{l-1}\) does not using the information of the first \(l-1\) layers, we can say that the knowledge about this instance is learned in the \(l\)-th layer
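To make the mechanism concrete, below is a minimal PyTorch sketch of such a probe, assuming a Hugging Face transformers BERT. The class name ProbingClassifier, the softmax-normalized scalar mix over \(\alpha \), and the use of the [CLS] representation are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class ProbingClassifier(nn.Module):
    """Probe P_tau^l: a classifier over the first l layers of a frozen BERT."""

    def __init__(self, model_name="bert-base-uncased", num_labels=2, num_layers_used=None):
        super().__init__()
        self.bert = AutoModel.from_pretrained(model_name, output_hidden_states=True)
        for p in self.bert.parameters():
            p.requires_grad = False  # BERT is frozen; only alpha and the classifier train
        self.l = num_layers_used or self.bert.config.num_hidden_layers
        self.alpha = nn.Parameter(torch.zeros(self.l))  # learned per-layer relevance
        self.classifier = nn.Linear(self.bert.config.hidden_size, num_labels)

    def forward(self, input_ids, attention_mask):
        with torch.no_grad():
            out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        # hidden_states = (embeddings, layer 1, ..., layer L); keep layers 1..l
        layers = torch.stack(out.hidden_states[1:self.l + 1], dim=0)   # (l, B, T, H)
        weights = torch.softmax(self.alpha, dim=0).view(-1, 1, 1, 1)   # mix weights
        mixed = (weights * layers).sum(dim=0)                          # (B, T, H)
        return self.classifier(mixed[:, 0])                           # [CLS] -> logits
```

A hypothetical usage for the instance from the caption, probing with the first 6 layers:

```python
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
batch = tok(["RFX5 interacts with histone deacetylase 2"], return_tensors="pt")
probe = ProbingClassifier(num_layers_used=6)  # P_tau^6
logits = probe(batch["input_ids"], batch["attention_mask"])
```

Training one such probe per \(l\) and finding the smallest \(l\) at which the prediction becomes correct would localize, in the sense of the caption, the layer where the knowledge about the instance is learned.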
