Vision-based activity, gesture or pose recognition
Classify different activities, gestures, or poses using either raw video from the camera or pose information estimated with an existing model such as PoseNet. Your system should support a minimum of four new activities or gestures. We recommend starting with existing pretrained models that are trained for these applications.
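As a starting point, here is a minimal sketch of a gesture classifier built on top of PoseNet output. It assumes you have already run PoseNet per frame and flattened its 17 (x, y) keypoints into a 34-dimensional feature vector; the random training arrays and the four-class setup are placeholders for your own labeled data.

```python
# Minimal sketch: classify gestures from flattened PoseNet keypoints.
# The training data below is random placeholder data; replace it with
# keypoint vectors extracted from your own labeled video frames.
import numpy as np
import tensorflow as tf

NUM_KEYPOINTS = 17   # PoseNet returns 17 body keypoints per frame
NUM_CLASSES = 4      # the assignment requires at least four gestures

# Placeholder data: (num_examples, 34) keypoint vectors and integer labels.
X_train = np.random.rand(500, NUM_KEYPOINTS * 2).astype("float32")
y_train = np.random.randint(0, NUM_CLASSES, size=500)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(NUM_KEYPOINTS * 2,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X_train, y_train, epochs=10, batch_size=32)
```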
Sound-based activity recognition or keystroke detection
Use the smartphone’s microphone to classify sounds for applications like activity recognition or keystroke detection. For activity recognition, the system should support a minimum of four new activities. For keystroke detection, you could classify among the ten digits from 0 to 9. For this project you could apply transfer learning using existing pretrained models such as VGGish.
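As a hedged sketch of the VGGish transfer-learning route, the snippet below loads the published VGGish module from TensorFlow Hub, averages its per-frame 128-dimensional embeddings into one clip-level feature, and trains a small classifier head on top. The random waveforms and labels are placeholders for your own recorded 16 kHz mono audio.

```python
# Sketch: transfer learning on frozen VGGish embeddings.
# Placeholder clips of random noise stand in for real labeled audio.
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub

vggish = hub.load("https://tfhub.dev/google/vggish/1")

def embed(waveform_16khz):
    # VGGish maps a mono 16 kHz waveform to one 128-d embedding per
    # ~0.96 s frame; average frames to get one clip-level vector.
    emb = vggish(waveform_16khz)          # shape: (num_frames, 128)
    return tf.reduce_mean(emb, axis=0)

# Placeholder data: 200 one-second clips with random labels (4 classes).
clips = [np.random.uniform(-1, 1, 16000).astype("float32") for _ in range(200)]
X = np.stack([embed(c).numpy() for c in clips])
y = np.random.randint(0, 4, size=len(clips))

# Small trainable head on top of the frozen VGGish embeddings.
head = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(128,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(4, activation="softmax"),
])
head.compile(optimizer="adam",
             loss="sparse_categorical_crossentropy",
             metrics=["accuracy"])
head.fit(X, y, epochs=10, batch_size=32)
```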
If you would like to complete a project that is not on this list, please email the course staff with a description of the idea and we will get back to you about whether the project is suitable.
You can train your ML models offline on a desktop or laptop using a machine learning library. Once your model has been trained, you can export or convert it to do real-time inference on a smartphone.
For iOS users, Turi Create can be used to build ML models that can be exported to Core ML and imported into an iOS app.
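For illustration, a typical Turi Create workflow might look like the sketch below; the data/ folder layout (one subfolder per class) and the .mlmodel filename are assumptions, not part of the assignment.

```python
# Sketch: train an image classifier with Turi Create and export it
# as a Core ML model for use in an Xcode project.
import turicreate as tc

# Load labeled images; assumes one subfolder per class under data/.
data = tc.image_analysis.load_images("data/", with_path=True)
data["label"] = data["path"].apply(lambda p: p.split("/")[-2])

train, test = data.random_split(0.8)
model = tc.image_classifier.create(train, target="label")
print(model.evaluate(test))

# Export to Core ML so the model can be dropped into an iOS app.
model.export_coreml("ActivityClassifier.mlmodel")
```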
For Android users, if you use Keras or TensorFlow, you can save your model directly as a TensorFlow Lite file, which can be used on Android. If you use other libraries such as PyTorch, you can convert your models for use on Android.
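Converting a trained Keras model to TensorFlow Lite takes only a few lines; the tiny placeholder model below stands in for whatever model you trained offline.

```python
# Sketch: convert a trained Keras model to TensorFlow Lite for
# on-device inference on Android.
import tensorflow as tf

# Placeholder model; substitute the model you trained offline.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(34,)),
    tf.keras.layers.Dense(4, activation="softmax"),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_bytes = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_bytes)
```

For PyTorch, common routes include tracing the model with torch.jit.trace for PyTorch Mobile, or exporting to ONNX and converting from there.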