9D Rotation Representation-SVD Fusion with Deep Learning for Unconstrained Head Pose Estimation
About this article
Published Online: Sep 30, 2024
Page range: 62 - 68
DOI: https://doi.org/10.2478/ijanmc-2024-0028
© 2024 Jiaqi Lyu et al., published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Comparison of the MAE (in degrees) between L2 loss and geodesic loss
Loss function | AFLW2000 MAE | BIWI MAE | 70/30 BIWI MAE |
---|---|---|---|
L2 Loss | 3.90 | 3.92 | 2.71 |
Geodesic Loss | 3.85 | 3.73 | 2.50 |
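
The two losses compared above can be illustrated with a minimal NumPy sketch. It assumes the commonly used SVD projection of a raw 9D output onto a rotation matrix and the standard geodesic distance on SO(3); the article's exact formulation is not reproduced on this page, so the function names (`svd_to_rotation`, `geodesic_loss`, `l2_loss`, `rot_z`) and the example input are illustrative only.

```python
import numpy as np

def svd_to_rotation(m9):
    """Project a raw 9-vector onto the nearest rotation matrix via SVD."""
    m = np.asarray(m9, dtype=float).reshape(3, 3)
    u, _, vt = np.linalg.svd(m)
    # Flip the last singular direction if needed so that det(R) = +1.
    d = np.sign(np.linalg.det(u @ vt))
    return u @ np.diag([1.0, 1.0, d]) @ vt

def geodesic_loss(r_pred, r_true):
    """Rotation angle in radians between two rotation matrices."""
    cos = (np.trace(r_pred @ r_true.T) - 1.0) / 2.0
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def l2_loss(r_pred, r_true):
    """Mean squared element-wise difference between two matrices."""
    return float(np.mean((r_pred - r_true) ** 2))

def rot_z(theta):
    """Rotation by theta radians about the z axis (helper for the demo)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Hypothetical network output: a 30-degree rotation scaled by 1.7,
# so the raw 3x3 matrix is not orthonormal before the SVD step.
raw_output = (1.7 * rot_z(np.pi / 6)).ravel()
r_pred = svd_to_rotation(raw_output)
r_true = rot_z(0.0)

angle = geodesic_loss(r_pred, r_true)
print(np.degrees(angle), l2_loss(r_pred, r_true))
```

The geodesic loss measures the angle of the relative rotation directly, whereas the L2 loss compares matrix entries and has no direct angular interpretation.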
Comparisons with state-of-the-art methods on the AFLW2000 and BIWI datasets (errors in degrees)
Models | AFLW2000 Yaw | AFLW2000 Pitch | AFLW2000 Roll | AFLW2000 MAE | BIWI Yaw | BIWI Pitch | BIWI Roll | BIWI MAE |
---|---|---|---|---|---|---|---|---|
HopeNet | 6.40 | 6.53 | 5.39 | 6.11 | 4.54 | 5.15 | 3.37 | 4.36 |
FSA-Net | 4.50 | 6.08 | 4.64 | 5.07 | 4.64 | 5.61 | 3.57 | 4.61 |
HPE | 4.80 | 6.18 | 4.87 | 5.28 | 3.12 | 5.18 | 4.57 | 4.29 |
QuatNet | 3.97 | 5.62 | 3.92 | 4.50 | 2.94 | 5.49 | 4.01 | 4.15 |
WHENet | 5.11 | 6.24 | 4.92 | 5.42 | 3.99 | 4.39 | 3.06 | 3.81 |
TriNet | 4.04 | 5.77 | 4.20 | 4.67 | 4.11 | 4.76 | 3.05 | 3.97 |
FDN | 3.78 | 5.61 | 3.88 | 4.42 | 4.52 | 4.70 | 2.56 | 3.93 |
6DRepNet | 3.63 | 4.91 | 3.37 | 3.97 | 3.24 | 4.48 | 2.68 | 3.47 |
9D-EfficientNet | 3.57 | 4.69 | 3.28 | 3.85 | 4.08 | 4.17 | 2.94 | 3.73 |
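
As a quick sanity check on how the MAE column relates to the per-angle errors, the sketch below assumes the MAE is the plain average of the yaw, pitch, and roll errors, which is consistent with the 9D-EfficientNet row. The helper name `mean_angle_error` is hypothetical.

```python
def mean_angle_error(yaw, pitch, roll):
    """Average of the three Euler-angle errors (degrees)."""
    return (yaw + pitch + roll) / 3.0

# 9D-EfficientNet values copied from the table: (yaw, pitch, roll, reported MAE).
rows = {
    "AFLW2000": (3.57, 4.69, 3.28, 3.85),
    "BIWI": (4.08, 4.17, 2.94, 3.73),
}

for dataset, (yaw, pitch, roll, reported) in rows.items():
    computed = mean_angle_error(yaw, pitch, roll)
    print(f"{dataset}: computed {computed:.2f}, reported {reported:.2f}")
```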
Comparison of the MAE (in degrees) between the ResNet and EfficientNetV2 backbone networks
Models | AFLW2000 MAE | BIWI MAE | 70/30 BIWI MAE |
---|---|---|---|
ResNet18 | 4.37 | 3.70 | 2.64 |
EfficientNetV2-S | 3.85 | 3.73 | 2.50 |
EfficientNetV2-S architecture
Stage | Operation | Stride | #Channels | #Layers |
---|---|---|---|---|
0 | Conv3x3 | 2 | 24 | 1 |
1 | Fused-MBConv1,3x3 | 1 | 24 | 2 |
2 | Fused-MBConv4,3x3 | 2 | 48 | 4 |
3 | Fused-MBConv4,3x3 | 2 | 64 | 4 |
4 | MBConv4,3x3,SE0.25 | 2 | 128 | 6 |
5 | MBConv6,3x3,SE0.25 | 1 | 160 | 9 |
6 | MBConv6,3x3,SE0.25 | 2 | 256 | 15 |
7 | Conv1x1 & Pooling & FC | - | 1280 | 1 |
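
To make the stage table easier to reason about, the following sketch encodes it as plain data and derives two summary figures: the overall downsampling factor of the convolutional stages and the total layer count. The 224-pixel input size is an assumption chosen for illustration and is not stated in the table.

```python
from math import prod

# (stage, operation, stride, channels, layers), copied from the table above.
# Stage 7 lists no stride, which is recorded here as None.
STAGES = [
    (0, "Conv3x3", 2, 24, 1),
    (1, "Fused-MBConv1,3x3", 1, 24, 2),
    (2, "Fused-MBConv4,3x3", 2, 48, 4),
    (3, "Fused-MBConv4,3x3", 2, 64, 4),
    (4, "MBConv4,3x3,SE0.25", 2, 128, 6),
    (5, "MBConv6,3x3,SE0.25", 1, 160, 9),
    (6, "MBConv6,3x3,SE0.25", 2, 256, 15),
    (7, "Conv1x1 & Pooling & FC", None, 1280, 1),
]

def overall_stride(stages):
    """Product of all listed strides (stages without a stride are skipped)."""
    return prod(s for _, _, s, _, _ in stages if s is not None)

def total_layers(stages):
    """Sum of the #Layers column."""
    return sum(n for *_, n in stages)

stride = overall_stride(STAGES)
layers = total_layers(STAGES)

input_size = 224  # assumed input resolution, for illustration only
feature_size = input_size // stride

print(stride, layers, feature_size)
```

Under that assumed input size, the feature map entering the final stage would be `feature_size` pixels on each side.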
Euler error comparisons (in degrees) with state-of-the-art methods on the 70/30 BIWI dataset
Models | Yaw | Pitch | Roll | MAE |
---|---|---|---|---|
HopeNet | 3.29 | 3.39 | 3.00 | 3.23 |
FSA-Net | 2.89 | 4.29 | 3.60 | 3.60 |
TriNet | 2.93 | 3.04 | 2.44 | 2.80 |
FDN | 3.00 | 3.98 | 2.88 | 3.29 |
MDFNet | 2.99 | 3.68 | 2.99 | 3.22 |
DDD-Pose | 3.04 | 2.94 | 2.43 | 2.80 |
6DRepNet | 2.69 | 2.92 | 2.36 | 2.66 |
9D-EfficientNet | 2.62 | 2.36 | 2.51 | 2.50 |
Comparison of parameters and FLOPs between 6DRepNet and our method
Models | Params | FLOPs |
---|---|---|
6DRepNet | 43.752M | 9.844G |
9D-EfficientNet | 20.189M | 2.901G |
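
The size of the savings in the table above can be worked out with simple arithmetic. The sketch below uses only the four reported figures; the helper name `reduction_percent` is hypothetical.

```python
def reduction_percent(baseline, ours):
    """Relative reduction of `ours` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - ours) / baseline

# Figures copied from the table: parameters in millions, FLOPs in billions.
params_6drepnet, params_9d = 43.752, 20.189
flops_6drepnet, flops_9d = 9.844, 2.901

param_cut = reduction_percent(params_6drepnet, params_9d)
flop_cut = reduction_percent(flops_6drepnet, flops_9d)

print(f"Parameters reduced by {param_cut:.1f}%")
print(f"FLOPs reduced by {flop_cut:.1f}%")
```

By these figures, 9D-EfficientNet uses a little under half the parameters and under a third of the FLOPs of 6DRepNet.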