Purpose
This paper presents a new approach for robot programming by demonstration, which generates robot programs by tracking the six-dimensional (6D) pose of the demonstrator's hand using a single red-green-blue (RGB) camera, without requiring any additional sensors.

Design/methodology/approach
The proposed method learns robot grasps and trajectories directly from a single human demonstration by tracking the movements of both the human hand and the manipulated objects. To recover the 6D pose of an object from a single RGB image, a deep learning-based method first detects the object's keypoints and then solves a perspective-n-point (PnP) problem. This method is then extended to estimate the 6D pose of the nonrigid hand by separating the fingers into multiple rigid bones linked by hand joints. Accurate robot grasps are generated from the relative positions between the hand and the object in two-dimensional (2D) space. Robot end-effector trajectories are generated from the hand movements and then refined using the objects' start and end positions.

Findings
Experiments are conducted on a FANUC LR Mate 200iD robot to verify the proposed approach. The results show the feasibility of generating robot programs from a single observation of a human demonstration with a single RGB camera.

Originality/value
The proposed approach provides an efficient and low-cost robot programming method using a single RGB camera. A new 6D hand pose estimation approach, which is used to generate robot grasps and trajectories, is developed.
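To illustrate the keypoints-plus-PnP pose recovery described above, the sketch below solves a perspective-n-point problem with a basic Direct Linear Transform (DLT). The abstract does not specify a particular solver, and all numbers here (intrinsics, keypoints, ground-truth pose) are synthetic placeholders, not data from the paper.

```python
import numpy as np

# Calibrated pinhole intrinsics (assumed known; illustrative values).
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

# Six non-coplanar 3D keypoints in the object frame (metres).
X = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 0.1, 0.0],
              [0.0, 0.0, 0.1], [0.1, 0.1, 0.0], [0.1, 0.0, 0.1]])

# Synthetic ground-truth pose used to generate the 2D "detections" that a
# keypoint network would normally provide.
th = 0.1
R_true = np.array([[np.cos(th), -np.sin(th), 0.0],
                   [np.sin(th),  np.cos(th), 0.0],
                   [0.0,         0.0,        1.0]])
t_true = np.array([0.05, -0.02, 0.5])

cam = (R_true @ X.T).T + t_true      # keypoints in the camera frame
uv = (K @ cam.T).T
uv = uv[:, :2] / uv[:, 2:3]          # perspective division -> pixel coords

# DLT: two linear constraints per 3D-2D correspondence, then solve A p = 0.
rows = []
for (x, y, z), (u, v) in zip(X, uv):
    rows.append([x, y, z, 1, 0, 0, 0, 0, -u*x, -u*y, -u*z, -u])
    rows.append([0, 0, 0, 0, x, y, z, 1, -v*x, -v*y, -v*z, -v])
A = np.array(rows)
P = np.linalg.svd(A)[2][-1].reshape(3, 4)   # projection matrix, up to scale

# Recover [R | t] from P = s * K [R | t].
M = np.linalg.inv(K) @ P
s = np.cbrt(np.linalg.det(M[:, :3]))        # scale; cbrt keeps the sign
M /= s
U, _, Vt = np.linalg.svd(M[:, :3])
R_est = U @ Vt                              # nearest rotation matrix
t_est = M[:, 3]

print("max rotation error:", np.abs(R_est - R_true).max())
print("max translation error:", np.abs(t_est - t_true).max())
```

With noise-free correspondences the recovered pose matches the ground truth up to numerical precision; in practice, the 2D keypoints come from a detector and a robust PnP solver with outlier rejection would be used instead.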