Shin S, Li Z, Halilaj E. Markerless Motion Tracking With Noisy Video and IMU Data. IEEE Trans Biomed Eng 2023; 70:3082-3092. [PMID: 37171931 DOI: 10.1109/tbme.2023.3275775]
[Indexed: 05/14/2023]
Abstract
OBJECTIVE
Marker-based motion capture, considered the gold standard in human motion analysis, is expensive and requires trained personnel. Advances in inertial sensing and computer vision offer new opportunities to obtain research-grade assessments in clinics and natural environments. A challenge that discourages clinical adoption, however, is the need for careful sensor-to-body alignment, which slows the data collection process in clinics and is prone to errors when patients take the sensors home.
METHODS
We propose deep learning models to estimate human movement with noisy data from videos (VideoNet), inertial sensors (IMUNet), and a combination of the two (FusionNet), obviating the need for careful calibration. The video and inertial sensing data used to train the models were generated synthetically from a marker-based motion capture dataset of a broad range of activities and augmented to account for sensor-misplacement and camera-occlusion errors. The models were tested using real data that included walking, jogging, squatting, sit-to-stand, and other activities.
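The sensor-misplacement augmentation mentioned above can be pictured as applying a fixed random rotation to a sensor's signals for the duration of a trial. The following is a minimal sketch of that idea, not the authors' implementation: the function names, the uniform sampling of the rotation angle, and the 30° default are assumptions made for illustration only.

```python
import math
import random


def random_rotation(max_angle_deg, rng):
    """Return a 3x3 rotation matrix about a random axis.

    The angle is drawn uniformly from [0, max_angle_deg]; this sampling
    scheme is an assumption, not taken from the paper.
    """
    # Random unit axis from normalized Gaussian samples.
    axis = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(a * a for a in axis))
    x, y, z = (a / norm for a in axis)

    theta = math.radians(rng.uniform(0.0, max_angle_deg))
    c, s = math.cos(theta), math.sin(theta)
    t = 1.0 - c

    # Rodrigues' rotation formula written out as a matrix.
    return [
        [t * x * x + c, t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, t * y * y + c, t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, t * z * z + c],
    ]


def misplace_sensor(samples, max_angle_deg=30.0, seed=None):
    """Rotate every 3-axis sample of one trial by the same random rotation.

    `samples` is a list of [x, y, z] readings (hypothetical layout), for
    example gyroscope or accelerometer data expressed in the sensor frame.
    """
    rng = random.Random(seed)
    rot = random_rotation(max_angle_deg, rng)
    return [
        [sum(rot[i][j] * v[j] for j in range(3)) for i in range(3)]
        for v in samples
    ]
```

Because a single rotation is shared by all samples in a trial, the perturbation behaves like a constant sensor-to-body alignment error rather than per-sample noise.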
RESULTS
On calibrated data, IMUNet was as accurate as state-of-the-art models, while VideoNet and FusionNet reduced mean ± std root-mean-squared errors by 7.6 ± 5.4° and 5.9 ± 3.3°, respectively. Importantly, all the newly proposed models were less sensitive to noise than existing approaches, reducing errors by up to 14.0 ± 5.3° for sensor-misplacement errors of up to 30.0 ± 13.7° and by up to 7.4 ± 5.5° for joint-center-estimation errors of up to 101.1 ± 11.2 mm, across joints.
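For readers unfamiliar with the reporting convention, the sketch below shows one way a "mean ± std root-mean-squared error across joints" figure could be computed from predicted and reference joint-angle series. The function names and data layout are hypothetical, and the use of the sample standard deviation is an assumption; the abstract does not say which estimator the authors used.

```python
import math
import statistics


def rmse(predicted, reference):
    """Root-mean-squared error between two equal-length angle series (degrees)."""
    if len(predicted) != len(reference):
        raise ValueError("series must have the same length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def summarize_across_joints(predicted_by_joint, reference_by_joint):
    """Return (mean, std) of per-joint RMSE values.

    Each argument is a list with one angle series per joint, in matching
    order. The std here is the sample standard deviation (an assumption).
    """
    errors = [rmse(p, r) for p, r in zip(predicted_by_joint, reference_by_joint)]
    return statistics.mean(errors), statistics.stdev(errors)
```

A reduction such as those quoted above would then be the difference between two such summaries, one for the baseline model and one for the proposed model.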
CONCLUSION
These tools offer clinicians and patients the opportunity to estimate movement with research-grade accuracy, without the need for time-consuming calibration steps or the high costs associated with commercial products such as Theia3D or Xsens, helping democratize the diagnosis, prognosis, and treatment of neuromusculoskeletal conditions.