This paper proposes a method, called Relational Keypoint Constraints (ReKep), to mapping a set of 3D keypoints in the environment to a numerical cost (formulate the constraints) by:
- detecting keypoints of images using DINOv2
- generating python cost functions by using GPT-4o
- solving the sense actions of end-effector
The keypoints idea is actually not new, while the proposed ReKep cost is interesting. That makes the proposed method can handle various constraints in the environments, such as:
- versatile to diverse tasks
- free of manual labeling
- optimizable by off-the-shelf solvers to produce robot actions in real-time