 Research article
 Open access
3-Dimensional facial expression recognition in human using multi-points warping
BMC Bioinformatics volume 20, Article number: 619 (2019)
Abstract
Background
Expression in H. sapiens plays a remarkable role in social communication. Humans identify expressions with relative ease and accuracy, but achieving the same result in 3D by machine remains a challenge in computer vision. This is due to the current challenges facing facial data acquisition in 3D, such as lack of homology and complex mathematical analysis for facial point digitization. This study proposes facial expression recognition in humans by applying Multi-points Warping to 3D facial landmarking, building a template mesh as a reference object. The template mesh is then applied to each target mesh in the Stirling/ESRC and Bosphorus datasets. The semi-landmarks are allowed to slide along tangents to the curves and surfaces until the bending energy between a template and a target form is minimal, and localization error is assessed using Procrustes ANOVA. Principal Component Analysis (PCA) is used for feature selection, and classification is done using Linear Discriminant Analysis (LDA).
Result
The localization error is validated on the two datasets with superior performance over the state-of-the-art methods, and variation in the expression is visualized using Principal Components (PCs). The deformations show various expression regions in the faces. The results indicate that the Sad expression has the lowest recognition accuracy on both datasets. The classifier achieved a recognition accuracy of 99.58% and 99.32% on Stirling/ESRC and Bosphorus, respectively.
Conclusion
The results demonstrate that the method is robust and in agreement with the state-of-the-art results.
Background
Emotions in the human face play a remarkable role in social communication. The identification of expressions by human beings is relatively easy and accurate. However, achieving the same result by machine remains a challenge in computer vision. The human face hosts the most crucial sensory organs. It also acts as the central interface for appearance, communication, expression and identification [1]. Therefore, acquiring its information digitally is important to researchers. This makes landmark-based geometric morphometric methods for facial expression a source of new insight into patterns of biological emotion variation [2]. Many advances have been proposed in the acquisition of facial landmarks, but several challenges remain, especially for three-dimensional models. One challenge is the insufficient acquisition of 3D facial landmarks. Another is the lack of homology due to manual annotation. In addition, complex mathematical analysis has made many works on 3D facial landmark acquisition difficult to reproduce.
The use of three-dimensional face images in morphometrics not only covers a wider area of the human facial region but also retains all the geometric information of the object descriptors [3, 4]. In a modality comparison, the 3D face has a higher detection rate than the 2D face due to its higher intensity modality [5]. Furthermore, in the experiment performed in [6], in which faces were subjected to systematically increasing pitch and yaw rotation, expression recognition performance dropped in 2D while that of 3D remained constant. This is a result of occlusion effects and substantial distortion in out-of-plane rotations. Moreover, in feature transformation and classification, the 3D modality shows a small improvement with higher confidence over 2D. In terms of depth features, however, both show the same performance, and the processing cost of a 3D model is higher than that of 2D [5].
The main contributions of this work are summarized below:

1) We developed an approach for 3D facial landmarking using multi-points warping. This approach extends the computational deformation processing in [7] to improve annotation performance using a less complex pipeline. We used six iterations and an exponentially decaying sliding step from 100% to 5% to ensure convergence and optimum smoothness.

2) Because the nose tip is easy to detect, supports pose correction [8] and is invariant to facial expression [9], Pronasale was selected as the most robust and prominent landmark point; the nose tip area can also be approximated as a semi-sphere of the human face. This landmark determines the location where the sliding points begin to spread across the facial surface.

3) We tested the method on two public 3D face databases (Stirling/ESRC and Bosphorus) to validate the precision of the landmark annotation against the state-of-the-art methods.

4) We validated the usability of our approach through its application to soft-tissue facial expression recognition in 3D. Using PCA for feature selection, we classify six expressions on both datasets. To the best of our knowledge, the sliding semi-landmark approach to facial landmarking has not previously been applied to soft-tissue facial expression recognition in 3D.
Section one of this study focuses on the introduction, and section two discusses the related studies. In section three, the implementation of the methodology is presented, with supporting references where only a short explanation is provided. Section four discusses the results of the implementations. In section five, a more detailed discussion is presented to clarify the results and compare them with state-of-the-art methods. The last section concludes the study and presents the limitations and future direction. Figure 1 shows the architectural diagram of the application of multi-points warping to the analysis of facial expression recognition in 3D.
Literature review
The term “Morphometrics” was coined more than 50 years ago by Robert E. Blackith, who applied multivariate statistical methods to the basic carapace morphology of grasshoppers [10]. Morphometrics is the study of shape variation and its covariation with other variables [7, 11]. According to DC Adams, et al. [12], morphometrics was traditionally the application of multivariate statistical analyses to sets of quantitative variables such as length, width, height and angle. Advances in morphometrics have since shifted the focus to the Cartesian coordinates of anatomical points that might be used to define more traditional measurements. Morphometrics examines shape variation, group differences in shape, the central tendency of shape, and associations of shape with extrinsic factors [13]. It is based directly on the digitized x, y, (z) coordinate positions of landmarks, points representing the spatial positions of putatively homologous structures in two or three dimensions, whereas conventional morphometric studies use distances as variables [7, 11, 14]. A landmark was described in LF Marcus, et al. [15] as a point in a bi- or three-dimensional space that corresponds to the position of a particular trait in an object. This set of points, one on each form, is operationally defined on an individual by local anatomical features and must be consistent with some hypothesis of biological homology. The formal landmark definitions, however, were provided by the anthropometric studies in [16]. This work by LG Farkas [16] has served as the standard for head and face landmark definitions through the study of thousands of subjects from different races, and has led to a large number of anthropometric studies of the head and face regions.
A flexible and mathematically rigorous interpolation technique for D’Arcy Thompson’s transformation grids [17], called the Thin-Plate Spline (TPS), was brought into morphometrics. This ensures that the corresponding points of the starting and target forms appear precisely in corresponding positions in relation to the transformed and untransformed grids [18]. With the application of Iterative Closest Point (ICP), landmark correspondence can be registered iteratively in the vicinity of a landmark with a re-weighted error function. Several morphometric studies have computed localization errors of facial landmarks on the Bosphorus dataset. A novel 3D Constrained Local Models (CLM) approach to facial landmark detection in 3D images is proposed in [19], which capitalizes on the properties of Independent Component Analysis (ICA) to define an appropriate face Point Distribution Model (PDM) tailored to the mesh manifold modality. Each sample contains 24 manually annotated facial landmarks, while the PDM includes 33 landmarks, 14 of which are part of the ground truth set tested on the Bosphorus database. An automatic method for facial landmark localization relying on geometrical properties of the 3D facial surface was proposed in [20], working on complete faces displaying different emotions and in the presence of occlusions. The method extracts the landmarks one by one. While the geometrical condition remains unchanged, the method double-checks whether pronasale, nasion and alare are correctly localized; otherwise the process starts afresh. The method is deterministic and is backed by a thresholding technique designed by studying the behavior of each geometrical descriptor in correspondence to the locus of each landmark, and was tested on the Bosphorus database.
Facial landmarks are known to be specific points with an anatomical meaning, as described in Table 1. However, a considerable amount of biological variability cannot be assessed using only anatomical landmarks [21]. To quantify complex shapes, sliding semi-landmarks have therefore been developed, which can be placed on surfaces [22] or curves [7, 22]. This approach generates landmarks that are spatially homologous after sliding [23], and the sliding may be optimized by minimizing bending energy [24, 25] or Procrustes distance [26, 27]. Since sliding semi-landmarks have not been implemented for analysing soft-tissue facial expression in 3D, we decided to investigate expression recognition using the multi-points warping approach.
Emotion or expression recognition using facial analysis has been a current trend in computer vision, but the diversity of human facial expression has made emotion recognition somewhat difficult [28]. Moreover, aside from unidentifiable lighting challenges, the fairly significant differences in age, skin colour and appearance between individuals place an additional burden on machine learning. Once face subjects are transformed into feature vectors, any classifier can be used for expression recognition, such as neural networks, support vector machines, random forests or linear discriminant analysis. What distinguishes the methods is how they apply facial image information [29]. Because static 2D images are sensitive to changes in head posture and illumination, they are unstable for expression recognition. The use of 3D is not only more robust to illumination and pose change but also enables the use of more image information. This is because facial expressions are generated by facial muscle contractions, which result in temporary facial deformations in both texture and facial geometry that are detectable in 3D and 4D [30]. The successes achieved in 3D face recognition could naturally be adopted for expression recognition [31]. According to M Pantic and LJ Rothkrantz [32] on the facial expression analyser, facial expression analysis follows the general steps for solving computer vision problems: face detection, landmark localisation, and recognition or classification. As 3D databases become more widely available in the computer vision community, different methods are being proposed to tackle the challenges facing facial expression recognition. Most of these studies are based on six fundamental expression classes or fewer: anger, fear, disgust, sadness, happiness, and surprise [33]. Many also focus on the use of local features, which retrieve the topological and geometrical properties of the facial expression [29, 34].
Linear discriminant analysis and many other classifiers have been used for classification in facial expression recognition. A method that learns sparse features from spatio-temporal local cuboids extracted from the human face was proposed in [35], using conditional random field classifiers for training and testing the model. In H Tang and TS Huang [36], a similar distance feature was explored using an automatic feature selection technique. This was done by maximizing the average relative entropy of marginalized class-conditional feature distributions. Using 83 landmarks, fewer than 30 features were selected. The distance features of the neutral scan are subtracted from those of the expressive scan, and the result is classified by Naive Bayes, Neural Network and Linear Discriminant Analysis on the BU-3DFE dataset. To approximate the continuous surface at each vertex of an input mesh, YL Wang Jun, Wei Xiaozhou, Sun Yi [6] proposed cubic-order polynomial functions. The coefficients estimated at a particular vertex form the Weingarten matrix for the local surface patch. The eigenvectors and eigenvalues of the matrix could be derived by the normal direction along the gradient magnitude. The facial region was described using 64 landmarks to overcome the lack of correspondence between the meshes. Their best performance was obtained using LDA; no rigid transformation is required due to the geometrical invariance of curvature-based features. To deal with the deformation of facial geometry that results from expression changes, C Li and A Barreto [37] proposed a framework composed of three subsystems: an expressional face recognition system, a neutral face recognition system and an expression recognition system. This was tested on 30 subjects and classified using LDA, but used only two expression groups.
H Li, et al. [38] proposed a novel method using fine-grained matching of 3D keypoint descriptors by extending the SIFT-like matching framework to mesh data. To account for the average reconstruction error of probe face descriptors, a multi-task sparse representation algorithm was used. The approach was evaluated on the Bosphorus database for expression recognition, pose invariance and occlusion. A comprehensive comparative evaluation was performed on Gavab, UND/FRGC, and Bosphorus in [39] using a local shape descriptor. The method captured distinguishing traits on the face by extracting 3D keypoints. Expression similarity between faces was evaluated by comparing local shape descriptors across inlier pairs of matching keypoints between gallery and probe scans. Using Keypoint-based Multiple Triangle Statistics (KMTS) with a Two-Phase Weighted Collaborative Representation Classification (TPWCRC), a method robust to partial data, large facial expressions and pose variations was proposed in [40]. The method was tested on six databases including Bosphorus and achieved promising results on occlusions, pose variation and expressions. A 3D face augmentation technique was proposed in [41], which synthesizes a number of different facial expressions from a single 3D face scan. The method showed excellent performance on the BU-3DFE, 3D-TEC, and Bosphorus datasets without the use of hand-crafted features. A novel geometric framework for analysing 3D faces was proposed in [42] with the goals of comparing, matching and averaging face shapes. The method represented facial surfaces by radial curves emanating from the nose tip and was tested on FRGCv2, GavabDB, and Bosphorus.
Furthermore, to address the limitations of the 2D counterpart and to handle the large intra-class and inter-class variability of human facial expression, W Hariri, et al. [43] proposed the use of covariance matrices of descriptors rather than the descriptors themselves. Their work focused on manifold-based classification, which was tested on the BU-3DFE and Bosphorus databases. Extended local binary patterns were proposed in [44] for facial expression recognition from 3D depth map images, where the results on Bosphorus showed better performance from the combination of 3D and 3D curvature.
Experiment results
After the step-by-step methods in facial surface deformation of semi-landmarks, the error assessment, analysis, visualisation and classification of the experiment were performed using MorphoJ 1.06d [45], PAST 3.0 [46] and R 5.1 [47].
Landmarks significance
Landmarks are used to locate biological or anatomical features on human faces. Their validity is drawn from morphometric analysis, which depends on the biological justification for the designation of the landmarks, as stated in [3]. However, not every facial anatomical landmark indicates a meaningful, significant measure. On the Stirling dataset, the overall landmarks were tested using one-way ANOVA to assess the significance of the variation in each expression group, each group having the same degrees of freedom (df = 1499). Angry: F = 133.9, p < 0.00001; Disgust: F = 120.9, p < 0.00001; Fear: F = 132.9, p < 0.00001; Sad: F = 130.2, p < 0.00001; Happy: F = 184.3, p < 0.00001; and Surprise: F = 117, p < 0.00001. The same test was then computed for Bosphorus on each expression group, each group again having the same degrees of freedom (df = 1499). Angry: F = 2507, p < 0.00001; Disgust: F = 1552, p < 0.00001; Fear: F = 3899, p < 0.00001; Sad: F = 2543, p < 0.00001; Happy: F = 2435, p < 0.00001; and Surprise: F = 1582, p < 0.00001. Furthermore, we conducted PERMANOVA (Non-Parametric MANOVA), a non-parametric test of the significance of the difference between the expression groups based on a distance measure [48], with F = 7.76 and P = 0.0001 for the Stirling dataset and F = 115.5 and P = 0.0001 for the Bosphorus dataset. The large positive F value indicates that there is a significant difference between the expression groups.
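As a minimal sketch of the one-way ANOVA reported above, the F statistic can be computed as the ratio of the between-group to the within-group mean square. The function name `one_way_anova_f` and the toy data are illustrative assumptions, not the PAST routine used in the study. The reported df = 1499 would be consistent with treating the 1500 landmark coordinates (500 points × 3 axes) as groups, but that reading is also an assumption.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between-group sum of squares: spread of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy data (hypothetical): three groups of three observations each.
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_stat, df_b, df_w)
```

A large F means the group means differ by much more than the scatter inside each group would explain, which is the sense in which the text reads its F values.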
Procrustes ANOVA
To assess the localization errors of the landmarks, the deviation of each landmark is obtained by calculating its displacement from the average position computed over all digitizations; under Procrustes ANOVA, this variation accounts for the smallest portion of the total variation. The localization errors account for only 0.041 and 0.095 of the total variation for Stirling and Bosphorus, respectively (Table 1).
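The idea of measuring displacement from the average digitized position can be sketched as follows. This is a simplified illustration, not the Procrustes ANOVA computed in MorphoJ: it assumes the repeated digitizations are already Procrustes-aligned, and the function name and array layout are hypothetical.

```python
import numpy as np


def digitization_error_fraction(coords):
    """coords: array of shape (n_faces, n_repeats, n_landmarks, 3), already aligned.

    Returns the RMS displacement of each landmark from its per-face mean
    position, and the share of total variation due to digitization error.
    """
    coords = np.asarray(coords, dtype=float)
    # Average position of every landmark over the repeated digitizations of a face.
    face_mean = coords.mean(axis=1, keepdims=True)
    deviation = coords - face_mean
    per_landmark_rms = np.sqrt((deviation ** 2).sum(axis=3).mean(axis=(0, 1)))

    ss_error = (deviation ** 2).sum()
    grand_mean = coords.mean(axis=(0, 1), keepdims=True)
    ss_total = ((coords - grand_mean) ** 2).sum()
    return per_landmark_rms, ss_error / ss_total


# Toy data (hypothetical): 2 faces, 2 repeats, 1 landmark moving only along x.
toy = np.zeros((2, 2, 1, 3))
toy[0, :, 0, 0] = [-0.1, 0.1]
toy[1, :, 0, 0] = [9.9, 10.1]
rms, fraction = digitization_error_fraction(toy)
print(rms, fraction)
```

A small fraction indicates that repeated placements of the same landmark scatter far less than the faces differ from one another.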
PCA results
The PCA of the total Stirling sample yielded 239 PCs, while Bosphorus yielded 179 PCs. When each expression group was computed separately, each yielded 39 PCs for Stirling and 29 PCs for Bosphorus, all with non-zero variability. Using a broken-stick approach to PC selection [17], the first 2 PCs alone in each group accounted for more than 58% of the shape variation for Stirling and for more than 70% for Bosphorus. For Stirling: Angry (PC1: 39.11%, PC2: 18.99%), Disgust (PC1: 38.50%, PC2: 15.65%), Fear (PC1: 41.52%, PC2: 17.51%), Sad (PC1: 41.78%, PC2: 17.51%), Surprise (PC1: 43.71%, PC2: 15.69%), and Happy (PC1: 42.13%, PC2: 16.96%). For Bosphorus: Angry (PC1: 76.41%, PC2: 8.96%), Disgust (PC1: 55.82%, PC2: 19.88%), Fear (PC1: 56.73%, PC2: 13.28%), Sad (PC1: 75.71%, PC2: 8.99%), Surprise (PC1: 77.54%, PC2: 7.95%), and Happy (PC1: 66.42%, PC2: 9.09%). For visualisation, we present only the deformations of the first PC, which accounted for the largest variation after Procrustes fit in each expression group (Fig. 2), as 3D vectors away from the mean configuration [11].
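The broken-stick criterion mentioned above can be sketched in a few lines: a component is retained while its observed share of variance exceeds the share expected if the total variance were split at random into as many pieces as there are components. The function names and the variance shares below are hypothetical, chosen only to illustrate the rule.

```python
def broken_stick(p):
    """Expected variance shares for p components under the broken-stick model."""
    return [sum(1.0 / i for i in range(k, p + 1)) / p for k in range(1, p + 1)]


def n_components_broken_stick(explained):
    """Count the leading components whose observed share beats the expected share."""
    expected = broken_stick(len(explained))
    count = 0
    for observed, threshold in zip(explained, expected):
        if observed <= threshold:
            break
        count += 1
    return count


# Hypothetical variance shares for a four-component PCA.
shares = [0.55, 0.30, 0.10, 0.05]
n_kept = n_components_broken_stick(shares)
print(broken_stick(4), n_kept)
```

Because the rule stops at the first component that falls below its expected share, it tends to keep only a few leading PCs, which suits visualisation better than classification.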
Classification
We used LDA to classify the expression variations of 240 sample faces in six classes using 135 selected PCs in the Stirling dataset, and of 180 sample faces in six classes using 98 selected PCs in the Bosphorus dataset. LDA was chosen because it is easy to implement, requires no tuning parameters or adjustment, and has been applied successfully in many previous studies [49, 50]. Using leave-one-out cross-validation, the data was learned with 70% training and 30% testing. A call to LDA returned the prior probability of each expression class, the group means for each covariate, the coefficients of each linear discriminant (for the six classes, there are five linear discriminants) and the singular values that give the ratio of the within-class and between-class standard deviations on the linear discriminant variables. The first two LDs explained the following proportions of the variance: Stirling (LD1 = 36.23%, LD2 = 29.57%) and Bosphorus (LD1 = 36.23%, LD2 = 29.57%) (Fig. 3). The confusion matrices for Stirling and Bosphorus are given in Table 2 and Table 3, respectively. These indicate that only the Sad expression is slightly misclassified: 2.44% as Fear in the Stirling dataset and 4.55% as Surprise in the Bosphorus dataset.
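To make the classification step concrete, the following is a from-scratch NumPy sketch of LDA with leave-one-out evaluation. It is not the routine called in the study; the function names, the pooled-covariance formulation and the synthetic data standing in for PC scores are all assumptions for illustration.

```python
import numpy as np


def lda_fit(X, y):
    """Estimate class means, priors and the inverse pooled within-class covariance."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    priors = np.array([(y == c).mean() for c in classes])
    centered = X - means[np.searchsorted(classes, y)]
    pooled_cov = centered.T @ centered / (len(X) - len(classes))
    return classes, means, priors, np.linalg.pinv(pooled_cov)


def lda_predict(model, X):
    """Assign each row of X to the class with the highest linear discriminant score."""
    classes, means, priors, cov_inv = model
    weights = means @ cov_inv  # one weight vector per class
    bias = -0.5 * np.sum(weights * means, axis=1) + np.log(priors)
    return classes[np.argmax(X @ weights.T + bias, axis=1)]


def leave_one_out_accuracy(X, y):
    """Hold out each sample once, train on the rest, and average the hits."""
    hits = 0
    for i in range(len(X)):
        keep = np.arange(len(X)) != i
        model = lda_fit(X[keep], y[keep])
        hits += int(lda_predict(model, X[i:i + 1])[0] == y[i])
    return hits / len(X)


# Synthetic stand-in for PC scores: six well-separated classes, five features.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(6), 10)
centers = np.zeros((6, 5))
centers[:, 0] = 10.0 * np.arange(6)
pc_scores = centers[labels] + rng.normal(scale=0.5, size=(60, 5))
accuracy = leave_one_out_accuracy(pc_scores, labels)
print(accuracy)
```

With six classes the class means span at most five dimensions, which is why the text reports five linear discriminants.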
LDA model performance
In this scheme, the dataset was divided into 70% training and 30% testing for both Stirling and Bosphorus. The performance of the scheme was measured using precision, recall (sensitivity), specificity and accuracy:

$$\text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP}, \qquad \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

where TP is the number of true positives, TN the true negatives, FP the false positives and FN the false negatives. Accuracy shows the overall prediction performance; sensitivity is the capacity of the features to correctly recognise an expression when it is present, while specificity is their capacity to correctly reject an expression when it is absent. The classifier produced a precision, sensitivity, specificity and accuracy of 99.70%, 99.60%, 99.90% and 99.58%, respectively, for the Stirling dataset and 99.20%, 99.30%, 99.90% and 99.32%, respectively, for the Bosphorus dataset. The performance metrics are displayed in Table 4, showing precision, recall and specificity.
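As an illustration of how such figures are derived, the sketch below computes per-class precision, recall and specificity, plus overall accuracy, from a multi-class confusion matrix using the standard one-versus-rest counts. The function name and the 3 × 3 matrix are hypothetical and are not the values in Tables 2 to 4.

```python
def per_class_metrics(confusion):
    """confusion[i][j] = number of class-i samples predicted as class j."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    metrics = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp  # class k predicted as something else
        fp = sum(confusion[r][k] for r in range(n)) - tp  # others predicted as k
        tn = total - tp - fn - fp
        metrics.append({
            "precision": tp / (tp + fp) if tp + fp else 0.0,
            "recall": tp / (tp + fn) if tp + fn else 0.0,
            "specificity": tn / (tn + fp) if tn + fp else 0.0,
        })
    accuracy = sum(confusion[k][k] for k in range(n)) / total
    return metrics, accuracy


# Hypothetical confusion matrix: one sample of class 0 is predicted as class 1.
toy_confusion = [[9, 1, 0],
                 [0, 10, 0],
                 [0, 0, 10]]
metrics, accuracy = per_class_metrics(toy_confusion)
print(metrics, accuracy)
```

A single misclassified sample lowers the recall of its true class and the precision and specificity of the class it was assigned to, which matches the pattern described for the Sad expression.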
Discussions
The Procrustes ANOVA suggests a modest but appreciable variation in facial shape. Shape differences are statistically significant even after averaging faces within each expression. The small localization errors for both datasets show that the landmarks can be annotated with precision using the proposed method. Table 5 demonstrates the superiority of our method on localization error when compared with state-of-the-art methods. Many approaches are available for addressing measurement error; discussing them at length is beyond the scope of this study, and more extended details can be found in [51]. The expression recognition accuracy also demonstrated superiority when compared with state-of-the-art methods (Table 6 and Table 7).
Our work is in agreement and consistent with most of the state-of-the-art studies that carried out similar work on the Bosphorus dataset using different methods. In [52], a differential-evolution-based optimization was presented by first transforming 3D faces into the 2D plane using conformal mapping and selecting optimal features using Speeded-Up Robust Features (SURF). The method was tested on the Bosphorus dataset with six basic expressions and classified by SVM. The results indicated that the Sad expression has the lowest recognition accuracy, at 67.50%. The use of covariance matrices of descriptors proposed in [43], tested on the Bosphorus dataset, indicated that the Sad expression has the lowest recognition rate, at 79.75%. Both results are in agreement with our study, yet our method performed better on the Sad expression, with a recognition rate of 95.45% on the Bosphorus dataset.
The scatter plot of the expressions along the first two linear discriminants produced maximal separation between all groups; these linear discriminants are linear combinations of the original variables, as in principal component analysis, and the plot indicates the amount of variation explained by them. The classifier classified the expression groups with an accuracy of 99.58% and 99.32% for Stirling and Bosphorus, respectively. Some Sad faces were, however, misclassified as Fear faces in the Stirling dataset, which indicates that a Sad expression can be mistaken for a Fear expression. In the Bosphorus dataset, Sad faces were misclassified as Surprise, which likewise indicates that a Sad expression can be mistaken for a Surprise expression. In the visualization of the expressions using PCs, the deformations show various expression regions in the faces. In the Stirling dataset, Surprise shows more expression in the mouth region, Happy shows more expression in the cheek region, and Angry and Disgust show more expression in both the mouth and eye regions. Only Sad seems to be very close to the neutral expression, showing slight expression across the whole facial region. In the Bosphorus dataset, by contrast, Surprise shows more expression in the cheek region, Sad and Fear show more expression in both the mouth and eye regions, and Angry shows more expression in the cheek region, while Happy and Disgust show more expression across the whole facial region.
To the best of our knowledge, no facial landmark annotation analysis or expression recognition has previously been performed using the Stirling/ESRC dataset. Therefore, this is the first facial expression study using the Stirling/ESRC dataset. T Fang, et al. [29, 53] reported that additional 3D datasets for expression recognition with different modalities, plus some examples of spontaneous and natural behaviour captured in 3D, are needed for researchers to evaluate their methods. We believe that this dataset will in future be used for many research benchmarks, especially in the field of facial expression in 3D.
We strongly advise against relying on a broken-stick or scree-plot decision on PCA when it comes to classification or machine learning; further data wrangling must be performed. Note also that the features were not standardised during learning, because the data had already been Procrustes-fitted in PAST software and the covariance matrix is always affected when such standardisation is applied. In contrast, mean centering has no effect on the covariance matrix.
Conclusions
This method combines pragmatic solutions to configure an optimized pipeline for high-throughput multi-points facial signatures in three dimensions. Only the reference surfaces and curves were warped to each sample face using an automatic warping approach, and the errors were assessed using Procrustes ANOVA. The result was then used to select features for classification using PCA, and LDA was used to classify the expressions. Such high-throughput and accurate phenotypic facial data is valuable not only for facial expression recognition but also for forensic studies of human facial morphology, sexual dimorphism, anthropology, disease diagnosis and prediction, statistical shape or image analysis, face recognition and age estimation. In the future, the method can be further improved by automatically applying the reference model to all the targets at once rather than to each target one after the other. Furthermore, ViewBox 4.0 does not work well in annotating the eyeball when the eyes are open. This does not affect the annotation and measurement of endocanthion and exocanthion, as they lie at the tissue edges of the eyeballs; the issue will be addressed in future studies.
Methods
Dataset & Description
The first dataset was acquired from the Stirling/ESRC 3D face database, captured by a Di3D camera system [54]. The image format used for this study is the Wavefront OBJ file, and the set contains 240 faces randomly selected from different expression positions: Angry (40), Disgust (40), Fear (40), Happy (40), Sad (40), and Surprise (40). The database is intended to facilitate research in sexual dimorphism, face recognition, facial expression recognition and perception. It is also being used as a test set for a competition on 3D face reconstruction from 2D images at an IEEE conference, with the 3D scans acting as ‘ground truth’. The second dataset is the Bosphorus database, which was intended for research on 3D and 2D human face processing tasks. A total of 180 subjects were randomly selected for this study: Angry (30), Disgust (30), Fear (30), Happy (30), Sad (30), and Surprise (30). The dataset was acquired using a structured-light based 3D system. The subjects were instructed to sit at a distance of 1.5 m, and the sensor resolution in x, y and z (depth) is 0.3 mm, 0.3 mm and 0.4 mm, respectively, with a high-resolution color texture [5, 55].
Creating template mesh
The template mesh was created by manually locating sixteen anatomical points, called fixed points, on a 3D face with neutral expression (Fig. 4) according to the facial landmark standard [56] (details in Table 8). These anchor landmarks were not subjected to sliding but were used to establish the warping fields used for minimizing the bending energy. Because the nose tip is easy to detect, supports pose correction [8] and is invariant to facial expression [9], Pronasale was selected as the most robust and prominent landmark point; the nose tip area can also be approximated as a semi-sphere of the human face. This is where the sliding points begin to spread across the facial surface. Using this anchor point (Pronasale), 484 semi-landmarks were automatically generated, overlapping on Pronasale and shown in blue. These were first randomly placed on the facial mesh before being uniformly distributed on the selected facial surface, using the positions of the anchor anatomical points with a 1.5 mm radius to accommodate all 500 points (see Additional file 1: Table S1 and Additional file 2: Table S2) (Fig. 5). To quantify the morphological data of a complex, three-dimensional trait of both reference and target shapes, we used geometric morphometric tools based on the landmark-based methodology in [57, 58, 59, 60, 61], and the landmark acquisition process was fully implemented in ViewBox 4.0 [61].
Multipoints warping
The geometry of curves and surfaces is easier in 2D or 3D, but it is not so easy to define semi-landmarks for non-planar surfaces in 3D [62]. This is because they are not guaranteed to be homologous after first placement. Homology can alternatively be achieved by subjecting the semi-landmarks to sliding in the direction that reduces shape variance, which positions the points closely on the same locations in 3D space. The sliding step is important, as it places the landmarks in positions where they correspond better to each other across individuals [26]. The semi-landmarks were allowed to slide on the curves and surface mesh of each target using TPS warping of the template. This positions the reference points on the target facial mesh by minimizing the bending energy.
According to FL Bookstein [18], a physical steel plate takes a bending form with a small displacement. This is because the function (x, y, z) is the configuration of lowest physical bending energy consistent with the given constraints. In this 3D face deformation, the TPS transformation was done mathematically by interpolating a smooth mapping h from ℝ^{3} → ℝ^{3}, given a selected set of corresponding points {Ρ_{Ri}, Ρ_{Ti}}, i = 1, …, N, on the reference object (template) and target (subject) faces, minimizing the bending energy function Ε(h) under the following interpolation conditions [7, 18, 63]:
where Ρ_{Ti} and Ρ_{Ri} are the corresponding points on the target and reference objects, respectively, and h is the mapping that minimizes the non-negative integral bending norm, or integral quadratic variation, Ε(h). TPS now decomposes each component into affine and non-affine parts such that,
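The TPS warping of the template onto a target can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the ViewBox implementation: it uses the distance kernel U(r) = r that the text uses for Ψ, places the control points on the template, handles only the interpolating case (smoothing term β = 0), and stores the affine part as a 4 × 3 block rather than a full homogeneous matrix. The function names are hypothetical.

```python
import itertools

import numpy as np


def tps_fit(template_pts, target_pts):
    """Solve for the non-affine and affine TPS parameters mapping template to target."""
    src = np.asarray(template_pts, dtype=float)
    dst = np.asarray(target_pts, dtype=float)
    m = len(src)
    # Pairwise distances between control points act as the kernel U(r) = r.
    kernel = np.linalg.norm(src[:, None, :] - src[None, :, :], axis=2)
    homogeneous = np.hstack([np.ones((m, 1)), src])  # rows are (1, x, y, z)

    system = np.zeros((m + 4, m + 4))
    system[:m, :m] = kernel
    system[:m, m:] = homogeneous
    system[m:, :m] = homogeneous.T  # side conditions on the non-affine part
    rhs = np.zeros((m + 4, 3))
    rhs[:m] = dst

    params = np.linalg.solve(system, rhs)
    return src, params[:m], params[m:]


def tps_apply(model, points):
    """Warp arbitrary 3D points with a fitted TPS model."""
    control, nonaffine, affine = model
    pts = np.asarray(points, dtype=float)
    kernel = np.linalg.norm(pts[:, None, :] - control[None, :, :], axis=2)
    homogeneous = np.hstack([np.ones((len(pts), 1)), pts])
    return kernel @ nonaffine + homogeneous @ affine


# Toy landmarks (hypothetical): cube corners as the template, one corner displaced.
template = np.array(list(itertools.product([0.0, 1.0], repeat=3)))
target = template.copy()
target[7] += [0.2, -0.1, 0.3]
model = tps_fit(template, target)
warped = tps_apply(model, template)
print(np.abs(warped - target).max())
```

The fitted map reproduces the target exactly at the control points and moves every other point smoothly, which is what allows the semi-landmarks of the template to be carried onto each target face. The control points must be distinct and not all coplanar for the linear system to be solvable.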
where Ρ_{h} is the homogeneous coordinate points on the target 3D face, and Ψ(Ρ_{h}) = (Ψ_{1}(Ρ_{h}), Ψ_{2}(Ρ_{h}), …, Ψ_{M}(Ρ_{h})) is a 1 × M kernel vector of TPS with the form:
while Κ is an M × 4 non-affine warping coefficient matrix, and Γ is a 4 × 4 homogeneous affine transformation matrix. The energy function is minimized to find the optimum solution in (4) if the interpolation condition in (1) is no longer necessary.
The interpolation conditions in (1) are satisfied if the smoothing regularization term β is zero; Γ and Κ are TPS parameters obtained by solving the linear equation:
Ψ is an M × M matrix with components Ψ_{wl} = ∥ Ρ_{Tw} − Ρ_{Tl} ∥, and Ρ_{R} is an M × 4 matrix whose rows are the homogeneous coordinates of the points Ρ_{Ri}, i = 1, …, M. Using (2), the target facial mesh Ρ_{Ti} is deformed to the reference mesh Ρ_{Ri}. Driven by the bending energy, the process was iterated for a specified number of cycles (six) to obtain optimum sliding of the points on the facial surface, which leaves the points relaxed. Over a complete run, the bending energy changed from its initial value E_{i} to its final value E_{f}. The semilandmarks can then be treated in the same way as homologous landmarks in downstream analyses. Because the warping may produce points that do not lie exactly on the facial surface of the target mesh, the transferred points were projected onto the closest point on the mesh surface. This was done using the Iterative Closest Point (ICP) method [8], which iteratively minimizes the mean square error between two point sets. If the distance between two points is within the acceptable threshold, the closest point is taken as the corresponding point [64]. The homologous landmark warping H_{KΓ} after six complete iterations is, therefore:
where
is the linear TPS equation obtained while deforming the surface of the target mesh to the reference mesh before convergence was finally reached, and E_{f − i} = E_{f} − E_{i} is the change in bending energy over six complete iterations. The first iteration showed a partial distribution of sliding points on the target surface mesh (Fig. 6). The procedure was repeated automatically until an optimum homologous result was achieved, using an exponentially decaying sliding step from 100% to 5%. During the relaxation of the spline, the semilandmarks slid along the tangent structures of the surface and the curves, and not on the surfaces or the curves themselves, which reduced the computational effort. This makes the minimization problem linear, but sliding along the tangents lets the semilandmarks slip off the data [22], which is why the points were projected back onto the surface. The points on the target surface mesh are now treated as homologous (Fig. 7). Note that we did not build a new deformable mathematical equation from scratch but extended the standard deformable method that has been established in [7].
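The warping step described above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the authors' implementation. It covers only the interpolating case β = 0, uses the kernel Ψ_{wl} = ∥Ρ_{Tw} − Ρ_{Tl}∥, replaces the ICP-based projection with a simple nearest-vertex lookup, and assumes one particular form for the exponentially decaying step. All function names are hypothetical.

```python
import numpy as np

def _homogeneous(points):
    """Append a column of ones: (n, 3) -> (n, 4)."""
    return np.hstack([points, np.ones((points.shape[0], 1))])

def fit_tps(target_pts, reference_pts):
    """Solve the TPS linear system (beta = 0) for K (M x 4) and Gamma (4 x 4)."""
    T = np.asarray(target_pts, dtype=float)
    R = np.asarray(reference_pts, dtype=float)
    M = T.shape[0]
    Th, Rh = _homogeneous(T), _homogeneous(R)
    # Kernel matrix of pairwise distances between target points.
    Psi = np.linalg.norm(T[:, None, :] - T[None, :, :], axis=2)
    A = np.zeros((M + 4, M + 4))
    A[:M, :M] = Psi
    A[:M, M:] = Th
    A[M:, :M] = Th.T
    b = np.zeros((M + 4, 4))
    b[:M] = Rh
    sol = np.linalg.solve(A, b)   # requires distinct, non-coplanar points
    return sol[:M], sol[M:]

def warp_tps(points, target_pts, K, Gamma):
    """Apply h(P) = Psi(P) K + P Gamma and return 3D coordinates."""
    P = np.asarray(points, dtype=float)
    T = np.asarray(target_pts, dtype=float)
    Psi = np.linalg.norm(P[:, None, :] - T[None, :, :], axis=2)   # (n, M)
    return (Psi @ K + _homogeneous(P) @ Gamma)[:, :3]

def project_to_closest_vertex(points, mesh_vertices):
    """Snap each point to its nearest mesh vertex (simplified stand-in for ICP)."""
    P = np.asarray(points, dtype=float)
    V = np.asarray(mesh_vertices, dtype=float)
    dist = np.linalg.norm(P[:, None, :] - V[None, :, :], axis=2)
    return V[np.argmin(dist, axis=1)]

def sliding_steps(n_iter=6, start=1.0, end=0.05):
    """Assumed geometric decay of the sliding step from `start` to `end`."""
    return start * (end / start) ** (np.arange(n_iter) / (n_iter - 1))

# Small synthetic check: the fitted warp maps the target points onto the reference points.
rng = np.random.default_rng(0)
target = rng.normal(size=(10, 3))
reference = target + 0.1 * rng.normal(size=(10, 3))
K, Gamma = fit_tps(target, reference)
warped = warp_tps(target, target, K, Gamma)
steps = sliding_steps()
```

With β = 0 the warp reproduces the reference points exactly at the landmarks, which is the interpolation condition (1). A full implementation would project onto the mesh faces rather than the vertices.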
To assess error, 18 subjects (three from each expression) were randomly selected from each dataset, each belonging to a different individual and distinct from the template subject. Each was digitized twice following the same method to account for digitization error. The results were analyzed using Procrustes ANOVA [65, 66], as implemented in MorphoJ for the analysis of measurement error in morphometrics [67,68,69]. Procrustes superimposition minimizes the sum of squared distances between all objects and the consensus configuration [51].
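The superimposition underlying this error assessment can be illustrated for a single pair of repeated digitizations. The sketch below is not Procrustes ANOVA itself, which partitions variance among individuals and repeated measurements. It only computes the partial Procrustes distance between two landmark configurations after removing location, scale and rotation. The function name is an assumption.

```python
import numpy as np

def partial_procrustes_distance(config_a, config_b):
    """Distance between two (k, 3) landmark sets after centering, unit scaling and rotation."""
    A = np.asarray(config_a, dtype=float)
    B = np.asarray(config_b, dtype=float)
    A = A - A.mean(axis=0)                 # remove location
    B = B - B.mean(axis=0)
    A = A / np.linalg.norm(A)              # scale to unit centroid size
    B = B / np.linalg.norm(B)
    # Optimal rotation of B onto A from the SVD of the cross-product matrix.
    U, _, Vt = np.linalg.svd(B.T @ A)
    d = np.sign(np.linalg.det(U @ Vt))     # -1 would indicate a reflection
    rotation = U @ np.diag([1.0, 1.0, d]) @ Vt
    return float(np.sqrt(np.sum((A - B @ rotation) ** 2)))

# Two "digitizations" that differ only by rotation, scale and translation
# should be at (numerically) zero distance.
rng = np.random.default_rng(1)
first = rng.normal(size=(8, 3))
angle = 0.7
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle),  np.cos(angle), 0.0],
                  [0.0, 0.0, 1.0]])
second = 2.5 * first @ rot_z + np.array([3.0, -1.0, 4.0])
same_shape = partial_procrustes_distance(first, second)
noisy = partial_procrustes_distance(first, second + 0.05 * rng.normal(size=(8, 3)))
```

In a repeated-digitization design, small distances between the two digitizations of the same face relative to the distances between different faces indicate low digitization error.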
Feature selection with PCA
The features were selected by dimensionality reduction using Principal Components Analysis (PCA). Here, the data is represented as a matrix M = [m_{1}, m_{2}, …m_{n}], where m_{i} is the ith column vector representing the ith training sample. With the covariance matrix K = cov (M) = MM^{T}, we carried out an eigenvalue decomposition of K to produce the highest-ranking eigenvectors, known as Principal Components (PCs), ranked by their corresponding eigenvalues. We chose the n eigenvectors (p_{1}, p_{2}, …, p_{n}) that best described the data and projected each data vector m onto the space spanned by these vectors such that \( X={\left[{p}_1,{p}_2,\dots, {p}_n\right]}^{T}m \), where X is the n-dimensional vector used as features during the training and classification process. In total, 239 PCs and 179 PCs were computed during the reduction process for Stirling and Bosphorus, respectively (see Additional file 3: Table S3 and Additional file 4: Table S4). Among these, only the 135 PCs from Stirling and the 98 PCs from Bosphorus with the highest-ranking eigenvectors were selected for classification, using Bartlett’s test for the first principal component method [70, 71]. In order to establish the number of PCs that expressed meaningful variation in each expression group, the broken-stick model was used [70, 72]. This is based on the eigenvalues expected from random data.
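A minimal NumPy sketch of this reduction and of the broken-stick rule is given below. It is an illustration only. Samples are stored in rows rather than columns, the covariance is computed from centered data, and the retention rule (keep leading PCs while the observed variance proportion exceeds the broken-stick expectation) is the commonly used form of the criterion, assumed here rather than taken from the study. Function names are hypothetical.

```python
import numpy as np

def pca(data):
    """Eigen-decomposition PCA. `data` has samples in rows, variables in columns."""
    X = np.asarray(data, dtype=float)
    Xc = X - X.mean(axis=0)                          # center each variable
    cov = Xc.T @ Xc / (X.shape[0] - 1)               # sample covariance matrix
    evals, evecs = np.linalg.eigh(cov)               # ascending eigenvalues
    order = np.argsort(evals)[::-1]                  # sort descending
    evals = np.clip(evals[order], 0.0, None)
    evecs = evecs[:, order]
    scores = Xc @ evecs                              # PC scores used as features
    return evals, evecs, scores

def broken_stick(p):
    """Expected variance proportions for p components under the broken-stick model."""
    return np.array([np.sum(1.0 / np.arange(k, p + 1)) / p for k in range(1, p + 1)])

def n_components_broken_stick(evals):
    """Count leading PCs whose variance proportion exceeds the broken-stick value."""
    observed = evals / evals.sum()
    expected = broken_stick(len(evals))
    keep = 0
    for obs, exp in zip(observed, expected):
        if obs <= exp:
            break
        keep += 1
    return keep

# Synthetic data with one dominant direction of variation plus isotropic noise.
rng = np.random.default_rng(0)
direction = np.ones(5) / np.sqrt(5.0)
data = 10.0 * rng.normal(size=(50, 1)) * direction + rng.normal(size=(50, 5))
evals, evecs, scores = pca(data)
kept = n_components_broken_stick(evals)
```

On this synthetic example only the first component carries more variance than expected by chance, so the rule retains a single PC.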
Linear discriminant analysis (LDA)
The method used multi-class LDA to classify the features. LDA is a supervised learning method for classification. It operates by maximizing the ratio of between-class variance to within-class variance in a dataset, thereby guaranteeing maximum separability. It has been widely applied in areas such as microarray data classification [73], face recognition [74], and image retrieval [75]. Classical LDA suffers from a singularity problem [76], which has given rise to many extensions such as regularized LDA [77], pseudo-inverse LDA [78], and subspace LDA [79]. In order to overcome the singularity issue of classical LDA, PCA was applied as an intermediate dimensionality reduction step.
Computing LDA for the multi-class case is slightly different from the two-class case. The multi-class case requires the application of multiple discriminant analysis [80], in which the ratio of between-class scatter to within-class scatter is maximized among the competing classes [81]. Multi-class LDA can also be called Canonical Variates Analysis (CVA), but the major assumption for LDA is that the variance–covariance matrices are all equal [82]. To simplify the computational process, we first computed the within-class matrix for n classes (n = 6 for this study) such that:
followed by the between-class matrix, given by:
where m_{i} is the number of samples in class i, m is the total number of samples, \( {\overline{X}}_i \) is the mean vector of class i and \( \overline{X} \) is the overall mean vector computed as \( \overline{X}=\frac{1}{m}{\sum}_{i=1}^n{m}_i{\overline{X}}_i. \)
Having obtained the within-class and between-class matrices (\( {\hat{\Sigma}}_w \) and \( {\hat{\Sigma}}_b \)), we then obtained the transformation Φ by solving the generalized eigenvalue problem:
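Written out with the symbols defined above, the two scatter matrices and the eigenvalue problem take the following standard form for multi-class discriminant analysis. This is a reconstruction under assumptions, not a verbatim copy of the original displays. Samples x are treated as row vectors (consistent with the product \( \mathbbm{z}\Phi \) used for classification), \( X_i \) denotes the set of samples in class i, and both matrices are normalized by the total sample count m.

```latex
\begin{align}
  \hat{\Sigma}_w &= \frac{1}{m} \sum_{i=1}^{n} \sum_{x \in X_i}
      \left(x - \overline{X}_i\right)^{\mathsf{T}} \left(x - \overline{X}_i\right) \\
  \hat{\Sigma}_b &= \frac{1}{m} \sum_{i=1}^{n} m_i
      \left(\overline{X}_i - \overline{X}\right)^{\mathsf{T}} \left(\overline{X}_i - \overline{X}\right) \\
  \hat{\Sigma}_b\, \Phi &= \lambda\, \hat{\Sigma}_w\, \Phi
\end{align}
```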
Once the transformation Φ is solved, the classification is then performed based on distance metrics in transformed space. Here, Euclidean distance is applied such that:
and cosine measure
A new instance \( \mathbbm{z} \) is then classified into the class \( \arg\min_k d\left(\mathbbm{z}\Phi, {\overline{X}}_k\Phi \right) \), where \( {\overline{X}}_k \) is the centroid of the kth class. The advantage of multiple discriminant analysis over single discriminant analysis is that it produces an elegant classification with the use of discriminant features [81].
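The classification pipeline of this subsection can be sketched as follows. It is a minimal NumPy illustration, not the authors' code. It assumes the within-class matrix is non-singular (which the preceding PCA step is meant to ensure), normalizes both scatter matrices by the total sample count, and turns the cosine measure into a distance as one minus the cosine similarity so that it can be used with argmin. All function names are hypothetical.

```python
import numpy as np

def fit_lda(X, y):
    """Return the projection Phi, the class labels and the projected class centroids."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    m, p = X.shape
    overall_mean = X.mean(axis=0)
    Sw = np.zeros((p, p))                            # within-class scatter
    Sb = np.zeros((p, p))                            # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        centered = Xc - mean_c
        Sw += centered.T @ centered
        diff = (mean_c - overall_mean)[:, None]
        Sb += Xc.shape[0] * (diff @ diff.T)
    Sw /= m
    Sb /= m
    # Generalized eigenproblem Sb Phi = lambda Sw Phi, solved via Sw^{-1} Sb.
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(evals.real)[::-1][: len(classes) - 1]
    Phi = evecs[:, order].real
    centroids = np.array([X[y == c].mean(axis=0) @ Phi for c in classes])
    return Phi, classes, centroids

def euclidean_distance(a, b):
    return np.sqrt(np.sum((a - b) ** 2, axis=-1))

def cosine_distance(a, b):
    """One minus cosine similarity (assumed conversion of the cosine measure)."""
    sim = np.sum(a * b, axis=-1) / (np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1))
    return 1.0 - sim

def predict(Z, Phi, classes, centroids, metric=euclidean_distance):
    """Assign each row of Z to the class with the nearest projected centroid."""
    projected = np.atleast_2d(np.asarray(Z, dtype=float)) @ Phi
    dists = np.array([metric(projected, c) for c in centroids])   # (n_classes, n_samples)
    return classes[np.argmin(dists, axis=0)]

# Three well-separated synthetic classes in four dimensions.
rng = np.random.default_rng(0)
means = np.array([[0.0, 0.0, 0.0, 0.0], [10.0, 0.0, 0.0, 0.0], [0.0, 10.0, 0.0, 0.0]])
X = np.vstack([mu + rng.normal(size=(20, 4)) for mu in means])
y = np.repeat(np.arange(3), 20)
Phi, classes, centroids = fit_lda(X, y)
pred = predict(X, Phi, classes, centroids)
```

With n classes the projection has at most n − 1 useful discriminant directions, which is why only the leading `len(classes) - 1` eigenvectors are retained.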
Availability of data and materials
Raw three-dimensional digitized data for each expression group and principal component analysis scores for all subjects are provided as additional files. Tables S1 and S2: three-dimensional raw data for each expression group, in separate sheets, for Stirling and Bosphorus, respectively. Tables S3 and S4: PC scores for all subjects used in the experiment for Stirling and Bosphorus, respectively. Note that, under the license agreement, the 3D face datasets may not be shared by third parties.
References
Peng S, Tan J, Hu S, Zhou H, Guo J, Jin L, Tang K. Detecting genetic association of common human facial morphological variation using high density 3D image registration. PLoS Comput Biol. 2013;9(12):e1003375.
Anies OS, Torres MAJ, Manting MM, Demayo CG. Landmark-based geometric morphometrics in describing facial shape of the Sama-Banguingui tribe from the Philippines. J Med Bioengineering. 2013;2(2):131–6.
Bookstein FL. Morphometric tools for landmark data: geometry and biology: Cambridge University Press; 1997. https://onlinelibrary.wiley.com/doi/abs/10.1002/bimj.4710350416.
Dean D. Three-dimensional data capture and visualization. In: Advances in morphometrics: Springer; 1996. p. 53–69. https://link.springer.com/chapter/10.1007/9781489900920_12.
Savran A, Sankur B, Bilge MT. Comparative evaluation of 3D vs. 2D modality for automatic detection of facial action units. Pattern Recogn. 2012;45(2):767–82.
Wang J, Yin L, Wei X, Sun Y. 3D facial expression recognition based on primitive surface feature distribution. In: 2006 IEEE computer society conference on computer vision and pattern recognition (CVPR'06): IEEE; 2006. p. 1399–406. http://www.cs.binghamton.edu/~lijun/Research/3DFE/Yin_cvpr06.pdf.
Bookstein FL. Landmark methods for forms without landmarks: morphometrics of group differences in outline shape. Med Image Anal. 1997;1(3):225–43.
Creusot C, Pears N, Austin J. 3D face landmark labelling. In: Proceedings of the ACM workshop on 3D object retrieval: ACM; 2010. p. 27–32. https://dl.acm.org/citation.cfm?id=1877815.
Colombo A, Cusano C, Schettini R. 3D face detection using curvature analysis. Pattern Recogn. 2006;39(3):444–55.
Elewa AM. Morphometrics for nonmorphometricians, vol. 124: Springer; 2010. https://www.springer.com/gp/book/9783540958529.
Dryden IL, Mardia KV. Statistical shape analysis, vol. 4. Chichester: Wiley; 1998.
Adams DC, Rohlf FJ, Slice DE. Geometric morphometrics: ten years of progress following the ‘revolution’. Italian J Zoology. 2004;71(1):5–16.
Slice DE. Geometric Morphometrics. Annu Rev Anthropol. 2007;36(1):261–81.
Rohlf FJ. Relative warp analysis and an example of its application to mosquito. In: Contributions to morphometrics, vol. 8; 1993. p. 131.
Marcus LF, Bello E, García-Valdecasas A, Museo Nacional de Ciencias N. Contributions to morphometrics: Consejo Superior de Investigaciones Científicas; 1993.
Farkas LG. Anthropometry of the head and face: Raven Pr; 1994. https://www.sciencedirect.com/science/article/pii/0278239195902082?via%3Dihub.
Klingenberg CP. Visualizations in geometric morphometrics: how to read and how to make graphs showing shape changes. Hystrix Italian J Mammalogy. 2013;24(1):15–24.
Bookstein FL. Principal warps: thinplate splines and the decomposition of deformations. IEEE Trans Pattern Anal Mach Intell. 1989;11(6):567–85.
El Rai MC, Tortorici C, Al-Muhairi H, Werghi N, Linguraru M. Facial landmarks detection using 3D constrained local model on mesh manifold. In: Circuits and Systems (MWSCAS), 2016 IEEE 59th International Midwest Symposium on: IEEE; 2016. p. 1–4. https://ieeexplore.ieee.org/document/7869954.
Vezzetti E, Marcolin F, Tornincasa S, Ulrich L, Dagnes N. 3D geometry-based automatic landmark localization in presence of facial occlusions. Multimed Tools Appl. 2017:1–29.
Botton-Divet L, Houssaye A, Herrel A, Fabre AC, Cornette R. Tools for quantitative form description; an evaluation of different software packages for semi-landmark analysis. PeerJ. 2015;3:e1417.
Gunz P, Mitteroecker P, Bookstein FL. Semilandmarks in three dimensions. In: Modern morphometrics in physical anthropology: Springer; 2005. p. 73–98. https://link.springer.com/chapter/10.1007/0387276149_3.
Parr W, Wroe S, Chamoli U, Richards H, McCurry M, Clausen P, McHenry C. Toward integration of geometric morphometrics and computational biomechanics: new methods for 3D virtual reconstruction and quantitative analysis of finite element models. J Theor Biol. 2012;301:1–14.
Cornette R, Baylac M, Souter T, Herrel A. Does shape covariation between the skull and the mandible have functional consequences? A 3D approach for a 3D problem. J Anat. 2013;223(4):329–36.
Fabre AC, Goswami A, Peigné S, Cornette R. Morphological integration in the forelimb of musteloid carnivorans. J Anat. 2014;225(1):19–30.
Mitteroecker P, Gunz P, Windhager S, Schaefer K. A brief review of shape, form, and allometry in geometric morphometrics, with applications to human facial morphology. Hystrix Italian J Mammalogy. 2013;24(1):59–66.
Perez SI, Bernal V, Gonzalez PN. Differences between sliding semilandmark methods in geometric morphometrics, with an application to human craniofacial and dental variation. J Anat. 2006;208(6):769–84.
Li H, Wen G. Sample awareness-based personalized facial expression recognition. Appl Intell. 2019:1–14.
Fang T, Zhao X, Ocegueda O, Shah SK, Kakadiaris IA. 3D facial expression recognition: A perspective on promises and challenges. In: Face and Gesture 2011: IEEE; 2011. p. 603–10. https://ieeexplore.ieee.org/abstract/document/5771466.
Zhen Q, Huang D, Wang Y, Chen L. Muscular movement model-based automatic 3D/4D facial expression recognition. IEEE Trans Multimedia. 2016;18(7):1438–50.
Kakadiaris IA, Passalis G, Toderici G, Murtuza MN, Lu Y, Karampatziakis N, Theoharis T. Three-dimensional face recognition in the presence of facial expressions: an annotated deformable model approach. IEEE Trans Pattern Anal Mach Intell. 2007;29(4):640–9.
Pantic M, Rothkrantz LJ. Automatic analysis of facial expressions: the state of the art. IEEE Trans Pattern Anal Mach Intell. 2000;22(12):1424–45.
Ekman P, Friesen WV. Constants across cultures in the face and emotion. J Pers Soc Psychol. 1971;17(2):124.
Tabia H, Daoudi M, Vandeborre JP, Colot O. A new 3D-matching method of non-rigid and partially similar models using curve analysis. IEEE Trans Pattern Anal Mach Intell. 2011;33(4):852–8.
Shao J, Gori I, Wan S, Aggarwal J. 3D dynamic facial expression recognition using low-resolution videos. Pattern Recogn Lett. 2015;65:157–62.
Tang H, Huang TS. 3D facial expression recognition based on automatically selected features. In: 2008 IEEE computer society conference on computer vision and pattern recognition workshops: IEEE; 2008. p. 1–8. https://ieeexplore.ieee.org/document/4563052.
Li C, Barreto A. An integrated 3D face-expression recognition approach. In: 2006 IEEE international conference on acoustics speech and signal processing proceedings: 2006: IEEE. p. III. https://ieeexplore.ieee.org/document/1660858.
Li H, Huang D, Morvan JM, Wang Y, Chen L. Towards 3D face recognition in the real: a registration-free approach using fine-grained matching of 3D keypoint descriptors. Int J Comput Vis. 2015;113(2):128–42.
Berretti S, Werghi N, Del Bimbo A, Pala P. Matching 3D face scans using interest points and local histogram descriptors. Comput Graph. 2013;37(5):509–25.
Lei Y, Guo Y, Hayat M, Bennamoun M, Zhou X. A two-phase weighted collaborative representation for 3D partial face recognition with single sample. Pattern Recogn. 2016;52:218–37.
Kim D, Hernandez M, Choi J, Medioni G. Deep 3D face identification. In: 2017 IEEE International Joint Conference on Biometrics (IJCB): IEEE; 2017. p. 133–42. https://ieeexplore.ieee.org/document/8272691.
Drira H, Amor BB, Srivastava A, Daoudi M, Slama R. 3D face recognition under expressions, occlusions, and pose variations. IEEE Trans Pattern Anal Mach Intell. 2013;35(9):2270–83.
Hariri W, Tabia H, Farah N, Benouareth A, Declercq D. 3D facial expression recognition using kernel methods on Riemannian manifold. Eng Appl Artif Intell. 2017;64:25–32.
Chun SY, Lee CS, Lee SH. Facial expression recognition using extended local binary patterns of 3D curvature. In: Multimedia and Ubiquitous Engineering: Springer; 2013. p. 1005–12. https://link.springer.com/chapter/10.1007/9789400767386_124.
Klingenberg CP. MorphoJ: an integrated software package for geometric morphometrics. Mol Ecol Resour. 2011;11:353–7.
Hammer Ø, Harper D, Ryan P. Paleontological statistics software: package for education and data analysis. Palaeontol Electron. 2001;4.
R Core Team. R: a language and environment for statistical computing. 2013.
Anderson MJ. A new method for nonparametric multivariate analysis of variance. Austral Ecol. 2001;26(1):32–46.
Gilani SZ, Rooney K, Shafait F, Walters M, Mian A. Geometric facial gender scoring: objectivity of perception. PLoS One. 2014;9(6):e99483.
Bekios-Calfa J, Buenaposada JM, Baumela L. Revisiting linear discriminant techniques in gender recognition. IEEE Trans Pattern Anal Mach Intell. 2011;33(4):858–64.
Fruciano C. Measurement error in geometric morphometrics. Dev Genes Evol. 2016;226(3):139–58.
Azazi A, Lutfi SL, Venkat I, Fernández-Martínez F. Towards a robust affect recognition: automatic facial expression recognition in 3D faces. Expert Syst Appl. 2015;42(6):3056–66.
Sandbach G, Zafeiriou S, Pantic M, Yin L. Static and dynamic 3D facial expression recognition: a comprehensive survey. Image Vis Comput. 2012;30(10):683–97.
StirlingESRC 3D Face Database [http://pics.stir.ac.uk/ESRC/3d_images.htm].
Savran A, Alyüz N, Dibeklioğlu H, Çeliktutan O, Gökberk B, Sankur B, Akarun L. Bosphorus database for 3D face analysis. In: European Workshop on Biometrics and Identity Management: Springer; 2008. p. 47–56.
Caple J, Stephan CN. A standardized nomenclature for craniofacial and facial anthropometry. Int J Legal Med. 2016;130(3):863–79. https://link.springer.com/article/10.1007%2Fs0041401512921.
Zelditch ML, Swiderski DL, Sheets HD. Geometric morphometrics for biologists: a primer: Academic Press; 2012.
Klingenberg CP, Zaklan SD. Morphological integration between developmental compartments in the Drosophila wing. Evolution. 2000;54(4):1273–85.
Kouli A, Papagiannis A, Konstantoni N, Halazonetis DJ, Konstantonis D. A geometric morphometric evaluation of hard and soft tissue profile changes in borderline extraction versus nonextraction patients. Eur J Orthod. 2018;41(3):264–72.
Yong R, Ranjitkar S, Lekkas D, Halazonetis D, Evans A, Brook A, Townsend G. Three-dimensional (3D) geometric morphometric analysis of human premolars to assess sexual dimorphism and biological ancestry in Australian populations. Am J Phys Anthropol. 2018;166(2):373–85.
Viewbox 4 - Cephalometric Software [http://dhal.com/viewboxindex.htm].
Huanca Ghislanzoni L, Lione R, Cozza P, Franchi L. Measuring 3D shape in orthodontics through geometric morphometrics. Prog Orthod. 2017;18(1):38.
Corner BD, Lele S, Richtsmeier JT. Measuring precision of three-dimensional landmark data. J Quant Anthropol. 1992;3(4):347–59.
Mian AS, Bennamoun M, Owens R. Keypoint detection and local feature matching for textured 3D face recognition. Int J Comput Vis. 2008;79(1):1–12.
Klingenberg CP, McIntyre GS. Geometric morphometrics of developmental instability: analyzing patterns of fluctuating asymmetry with Procrustes methods. Evolution. 1998;52(5):1363–75.
Klingenberg CP, Barluenga M, Meyer A. Shape analysis of symmetric structures: quantifying variation among individuals and asymmetry. Evolution. 2002;56(10):1909–20.
Leamy LJ, Klingenberg CP, Sherratt E, Wolf JB, Cheverud JM. The genetic architecture of fluctuating asymmetry of mandible size and shape in a population of mice: another look. Symmetry. 2015;7(1):146–63.
Singh N, Harvati K, Hublin JJ, Klingenberg CP. Morphological evolution through integration: a quantitative study of cranial integration in Homo, Pan, Gorilla and Pongo. J Hum Evol. 2012;62(1):155–64.
Klingenberg C, Wetherill L, Rogers J, Moore E, Ward R, Autti-Rämö I, Fagerlund Å, Jacobson S, Robinson L, Hoyme H. Prenatal alcohol exposure alters the patterns of facial asymmetry. Alcohol. 2010;44(7–8):649–57.
Peres-Neto PR, Jackson DA, Somers KM. How many principal components? Stopping rules for determining the number of non-trivial axes revisited. Comp Stat Data Analysis. 2005;49(4):974–97.
Bartlett M. A further note on the multiplying factors for various χ² approximations in factor analysis. J R Stat Soc. 1954;16:296–8.
Jackson DA. Stopping rules in principal components analysis: a comparison of heuristical and statistical approaches. Ecology. 1993;74(8):2204–14.
Dudoit S, Fridlyand J, Speed TP. Comparison of discrimination methods for the classification of tumors using gene expression data. J Am Stat Assoc. 2002;97(457):77–87.
Belhumeur PN, Hespanha JP, Kriegman DJ. Eigenfaces vs. fisherfaces: recognition using class specific linear projection. IEEE Trans Pattern Anal Mach Intell. 1997;19(7):711–20.
Swets DL, Weng JJ. Using discriminant eigenfeatures for image retrieval. IEEE Trans Pattern Anal Mach Intell. 1996;18(8):831–6.
Ye J, Janardan R, Li Q. Two-dimensional linear discriminant analysis. In: Advances in neural information processing systems; 2005. p. 1569–76.
Guo Y, Hastie T, Tibshirani R. Regularized linear discriminant analysis and its application in microarrays. Biostatistics. 2006;8(1):86–100.
Krzanowski W, Jonathan P, McCarthy W, Thomas M. Discriminant analysis with singular covariance matrices: methods and applications to spectroscopic data. J R Stat Soc: Ser C: Appl Stat. 1995;44(1):101–15.
Zhao W, Chellappa R, Phillips PJ. Subspace linear discriminant analysis for face recognition: Citeseer; 1999.
Johnson RA, Wichern DW. Applied multivariate statistical analysis, vol. 5. NJ: Prentice hall Upper Saddle River; 2002. https://searchworks.stanford.edu/view/6804286.
Li T, Zhu S, Ogihara M. Using discriminant analysis for multiclass classification: an experimental investigation. Knowl Inf Syst. 2006;10(4):453–72.
Peterson LE, Coleman MA. Machine learning-based receiver operating characteristic (ROC) curves for crisp and fuzzy classification of DNA microarrays in cancer research. Int J Approx Reason. 2008;47(1):17–36.
Acknowledgments
We acknowledge Stirling/ESRC (University of Stirling) and Bosphorus (Bogazici University) for prompt agreement to use their dataset. Furthermore, credit goes to the Computer Laboratory of the Faculty of Computer Science & Information Technology, Universiti Putra Malaysia.
Funding
This work was supported by Putra Grant Scheme, Malaysia – Code: 9538100 and Fundamental Research Grant Scheme, Malaysia – Code: 5524959. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Author information
Contributions
OA and AN conceived the idea and acquired the data; OA developed the methodology; AAG prepared the original draft; YKC reviewed and edited the manuscript; OA and AN prepared the visualization and performed the analysis; RY supervised the entire work and approved the manuscript. We also declare that all authors have read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
The use of human subjects was approved by the committee in charge of the Stirling database, headed by Peter Hancock (p.j.b.hancock@stir.ac.uk), and of the Bosphorus database, headed by Prof. Bulent Sankur (Bulent.sankur@boun.edu.tr). Written, signed informed consent to participate was obtained.
Consent for publication
For the purpose of method demonstration, only the subject with ID F1002 from the Stirling dataset was visualized in the study; she gave written, signed consent for publication. No image from the Bosphorus dataset was visualized.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Additional file 1: Table S1.
Raw threedimensional digitized data for Stirling dataset expression group.
Additional file 2: Table S2.
Raw threedimensional digitized data for Bosphorus dataset expression group.
Additional file 3: Table S3.
PCs scores for all subjects used in the Stirling dataset.
Additional file 4: Table S4.
PCs scores for all subjects used in the Bosphorus dataset.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Agbolade, O., Nazri, A., Yaakob, R. et al. 3Dimensional facial expression recognition in human using multipoints warping. BMC Bioinformatics 20, 619 (2019). https://doi.org/10.1186/s1285901931532
DOI: https://doi.org/10.1186/s1285901931532