Using egocentric vision to improve the IMU-based prediction of future knee and ankle joint angles in complex, out-of-the-lab environments.
Paper: https://ieeexplore.ieee.org/abstract/document/9729197
Here we fuse motion capture data with egocentric video to improve joint angle prediction performance in complex, uncontrolled environments such as public classrooms, atriums, and stairwells. Optical flow features are generated from the raw images by PWC-Net, trained on the synthetic MPI-Sintel dataset, and processed by an LSTM before being fused with the joint kinematics stream.
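A minimal PyTorch sketch of this kind of two-stream fusion is shown below. The `KinematicsVisionFusion` class, layer sizes, sequence lengths, and prediction horizon are illustrative assumptions, not the exact architecture or code used in the paper.

```python
# Minimal sketch of kinematics + vision fusion for future joint angle prediction.
# All dimensions and the prediction horizon are illustrative assumptions.
import torch
import torch.nn as nn


class KinematicsVisionFusion(nn.Module):
    """Fuse an optical-flow (vision) stream with a joint-kinematics stream
    to predict a window of future knee and ankle joint angles."""

    def __init__(self, kin_dim=4, flow_dim=256, hidden_dim=128,
                 pred_horizon=10, n_joints=4):
        super().__init__()
        # LSTM over per-frame optical-flow features (e.g. pooled PWC-Net output).
        self.vision_lstm = nn.LSTM(flow_dim, hidden_dim, batch_first=True)
        # LSTM over the joint-kinematics history (knee/ankle angles).
        self.kin_lstm = nn.LSTM(kin_dim, hidden_dim, batch_first=True)
        # Fusion head maps the concatenated stream encodings to future angles.
        self.head = nn.Linear(2 * hidden_dim, pred_horizon * n_joints)
        self.pred_horizon, self.n_joints = pred_horizon, n_joints

    def forward(self, flow_feats, kin_seq):
        # flow_feats: (batch, T, flow_dim), kin_seq: (batch, T, kin_dim)
        _, (h_vis, _) = self.vision_lstm(flow_feats)
        _, (h_kin, _) = self.kin_lstm(kin_seq)
        fused = torch.cat([h_vis[-1], h_kin[-1]], dim=-1)
        out = self.head(fused)
        return out.view(-1, self.pred_horizon, self.n_joints)


# Example forward pass with random stand-in inputs.
model = KinematicsVisionFusion()
flow_feats = torch.randn(8, 30, 256)   # 8 sequences, 30 frames of flow features
kin_seq = torch.randn(8, 30, 4)        # matching knee/ankle angle history
future_angles = model(flow_feats, kin_seq)   # shape: (8, 10, 4)
```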
In the following video, we can see that information about the subject's future movements is available in their visual field, both in terms of what lies ahead of them (e.g., stairs or chairs) and in how they move their head and eyes for path planning. Thus, vision acts as a "window into the future".
a_window_into_the_future.mp4
The following videos (with frames dropped to shorten them) and the corresponding figures show example maneuvers and the improvement achieved by fusing kinematics and vision inputs (green line) over kinematics-only inputs (red line).
go_around_the_podium.mp4
enter_the_classroom.mp4
In the figure below, we compare the performance improvement due to vision as the amount of data per subject increases (left) and as the number of subjects increases (right). In both cases, including vision yields better performance than the no-vision condition. The rate of improvement also declines more slowly for the vision condition than for the no-vision condition, indicating that even better performance could be achieved with more data.
The dataset used in the paper can be accessed through the following repository: https://github.com/abs711/The-way-of-the-future . A detailed description of the dataset is available in the following publication: https://doi.org/10.1038/s41597-023-01932-7
Run 'torchVision/MainPain/main.py' to start training a model. The models used in the paper are defined in 'UtilX/Vision4Prosthetics_modules.py'.
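For orientation, a generic training loop for such a fusion model might look like the sketch below. This is not the contents of 'torchVision/MainPain/main.py'; the batch shapes, loss, and hyperparameters are assumptions, and the model class is the illustrative one sketched earlier.

```python
# Generic sketch of training a kinematics + vision fusion model to predict
# future joint angles. Data shapes, loss, and hyperparameters are assumptions.
import torch
import torch.nn as nn

model = KinematicsVisionFusion()                 # illustrative model from the sketch above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

# Random stand-in batches: optical-flow features, past kinematics,
# and the future knee/ankle angles to be predicted.
flow_feats = torch.randn(8, 30, 256)
kin_seq = torch.randn(8, 30, 4)
target_angles = torch.randn(8, 10, 4)

for epoch in range(5):
    optimizer.zero_grad()
    pred = model(flow_feats, kin_seq)
    loss = criterion(pred, target_angles)
    loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: MSE = {loss.item():.4f}")
```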