Scheduling of a robotic flow shop, in which a dual-gripper robot transports parts between machines, is addressed with the objective of minimizing the makespan. Most previous studies on robotic flow shop scheduling have focused on cyclic scheduling, where the robot repeats a fixed task sequence to process identical parts. Recently, however, noncyclic scheduling of robotic flow shops has become essential for producing customized products and handling smaller order sizes efficiently. This study therefore considers noncyclic scheduling of a dual-gripper robotic flow shop with a given part sequence and proposes a novel solution approach, look-ahead based reinforcement learning (LARL). The LARL consists of deep Q-learning, which trains a Q-network on a given set of instances, and a look-ahead search used when testing on new instances. The look-ahead search in the LARL is particularly efficient for robotic flow shop scheduling, where future state information can be used to determine the current robot task. Experimental results comparing the LARL with an optimal algorithm and a well-known robot task sequence demonstrate the effectiveness of the LARL.
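
The abstract only outlines the approach, so as a rough illustration the sketch below shows one way a look-ahead search could combine a trained Q-network with simulated future states when selecting the next robot task. It is a minimal sketch, not the authors' implementation; the interface names (look_ahead_select, feasible_tasks, simulate_step, q_network) and the depth-limited rollout are assumptions for illustration only.

```python
# A hypothetical sketch of look-ahead task selection with a trained Q-network.
# All callables below are placeholders, not the paper's actual components.
from typing import Any, Callable, Dict, Iterable, Tuple

State = Any  # assumed encoding of gripper/machine/part status
Task = Any   # assumed encoding of a robot task (e.g., load/unload a machine)


def look_ahead_select(
    state: State,
    feasible_tasks: Callable[[State], Iterable[Task]],
    simulate_step: Callable[[State, Task], Tuple[State, float]],  # -> (next state, elapsed time)
    q_network: Callable[[State, Task], float],  # -> estimated remaining makespan after the task
    depth: int = 2,
) -> Task:
    """Pick the robot task minimizing simulated short-term time plus the Q-estimate."""

    def rollout_cost(s: State, d: int) -> float:
        tasks = list(feasible_tasks(s))
        if not tasks:
            return 0.0
        if d == 0:
            # At the look-ahead horizon, fall back to the Q-network's estimate.
            return min(q_network(s, t) for t in tasks)
        best = float("inf")
        for t in tasks:
            nxt, elapsed = simulate_step(s, t)
            best = min(best, elapsed + rollout_cost(nxt, d - 1))
        return best

    def total_cost(t: Task) -> float:
        # One simulated step for the candidate task, then a shallow look-ahead.
        nxt, elapsed = simulate_step(state, t)
        return elapsed + rollout_cost(nxt, depth - 1)

    return min(feasible_tasks(state), key=total_cost)
```

Because the part sequence is given, a simulator of the flow shop can expose the future states reachable from the current one, which is what makes this kind of shallow search a natural complement to the learned Q-network at test time.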