GPTAlly: a safety-oriented system for human-robot collaboration based on foundation models
Files
Bastin_22541900_2024.pdf
Embargoed access until 2025-07-01 - Adobe PDF - 13.23 MB
Details
- Abstract
- Society 5.0 aims to improve workplace quality of life through AI and robotics. However, current robots lack human-like situational understanding and often rely on pre-programmed tasks or supervised learning. In addition, safety metrics are needed that account for users' subjective perceptions of safety. This thesis introduces GPTAlly, a system for safe human-robot collaboration built on Large Language Models (LLMs) and Visual Language Models (VLMs). LLMs infer users' subjective safety perceptions in collaborative tasks, and these inferences feed a Safety Index algorithm that adjusts its safety evaluations accordingly. The system ensures that robots stop to prevent harmful collisions, and it uses an LLM-based coding paradigm to determine subsequent actions, either autonomously or according to user preferences. The actions are implemented by an LLM, which shapes robotic arm trajectories by interpreting the user's natural language instructions to suggest 3D poses. A user study compares the safety perception scaling factors estimated by GPT-4 with participants' estimates and evaluates user satisfaction with the resulting changes in robot behavior. The accuracy of the streamlined coding paradigm is evaluated through contextual experiments that vary the number of conditions processed by the LLM and paraphrase those conditions. Satisfaction with the trajectories shaped from 3D poses is assessed in a second user study. The results indicate that LLMs effectively integrate human safety perceptions: GPT-4's estimates of the scaling factors closely match the user responses, and participants express satisfaction with the behavior changes. However, the coding paradigm's contextual accuracy can fall below 50%. Finally, the trajectory study found that users preferred trajectories shaped by their natural language inputs over uninfluenced ones. Codebase available at: https://axtiop.github.io/GPTAlly