Resources

This page contains a non-exhaustive list of libraries, APIs, SDKs, hardware, and multimodal interaction resources that may be useful for the course projects.

Feel free to suggest additional resources that could benefit future students.


Modality Recognition Libraries & APIs

| Modality | Name | Language(s) | Description |
|---|---|---|---|
| Vision | MediaPipe | Python, Web, Mobile | Real-time hand, pose, face, gesture, and object recognition |
| Vision | Ultralytics (YOLO) | Python | Real-time object detection and segmentation |
| Vision | RF-DETR | Python | Transformer-based object detection |
| Speech | Web Speech API | JavaScript | Browser-based speech recognition and synthesis |
| Speech | Whisper | Python | Speech recognition |
| Speech | Vosk | Python | Offline speech recognition |
| Speech | PaddleSpeech | Python | Speech recognition and synthesis |
| Speech | FunASR | Python | Speech recognition toolkit |
| Gaze (Eye Tracking) | WebGazer.js | JavaScript | Webcam eye tracking in the browser |
| Emotion | DeepFace | Python | Facial expression and emotion recognition |
| Emotion | face-api.js | JavaScript | Face detection and emotion recognition |
| Motion Tracking | BoxMOT | Python | Multi-object tracking |
| Sonification | Tone.js | JavaScript | Audio synthesis and sonification |
| Sonification | SoniPy | Python | Sonification toolkit |
| Touch / Gestures | React Native Gesture Handler | TypeScript | Mobile touch and gesture interactions |
| Mouse / Gestures | Interact.js | JavaScript | Drag-and-drop and gesture interactions |
| Mouse / Gestures | MagicMouse.js | JavaScript | Mouse gesture interactions |

Multimodal Interaction Toolkits & Examples

| Name | Description |
|---|---|
| Geno | Tool for authoring multimodal interactions on existing web applications |
| Tangible Engine SDK | SDK for tabletop and tangible interaction systems |
| MRTK | Mixed Reality Toolkit for HoloLens |
| XR Interaction Toolkit | Unity toolkit for XR interactions |
| VRTK | VR interaction toolkit for Unity |

Game Engines, Simulation & XR Development

| Name | Description |
|---|---|
| Unity | Game engine commonly used for XR and multimodal interaction |
| Unreal Engine | Real-time 3D engine |
| Godot | Open-source game engine |
| GameMaker | 2D game engine |
| MuJoCo | Physics simulator |
| Gymnasium | Reinforcement learning environments |
| Bullet Physics | Physics engine |
| Android Studio | Android development environment |
| Vuforia | AR SDK |
| ARCore | Mobile AR SDK |
| Meta XR SDK | XR development SDK |
| OpenAI Gym | Reinforcement learning toolkit (legacy; succeeded by Gymnasium) |

Large Language Models (LLMs)

| Name | Description |
|---|---|
| vLLM | High-performance LLM inference and serving engine |
| Gemini API | Multimodal large language model API |
| OpenAI API | API for LLMs and multimodal AI models |

Available Hardware

| Hardware | Notes |
|---|---|
| Tobii Eye Tracker 4C | 6 available |
| Tobii Pro Glasses 3 | On demand |
| HoloLens 2 | On demand |
| Meta Quest 2 | VR development |
| SenseGlove Nova | Haptic gloves |
| Muse S (Gen 2) | EEG headset |
| OpenBCI Biosensing Starter Bundle | EEG / EMG biosensing |
| Interactive Tabletop | Tangible interaction experiments |
| Smartwatches | Sensor-rich wearable interactions |
| Dobot Magician E6 | Serial robot arm |
| PMB2 Robot | Mobile robot platform |
| TurtleBot3 Burger | ROS-compatible mobile robot |

Additional Useful Resources

| Name | Description |
|---|---|
| Sonification Design Examples | Examples of sonification projects |
| W3C Multimodal Interaction | W3C multimodal interaction architecture |
| Web Speech API Demo | Browser speech recognition demo |