The realism that ray tracing brings to rendered images is becoming increasingly apparent in computer vision and industrial applications. However, designing efficient ray tracing hardware is challenging because of memory access issues, divergent branches, and high computational intensity. This article presents a novel architecture, the RT engine (Ray Tracing engine), that accelerates ray tracing. First, we set up multiple stacks that store information for each ray so that the RT engine can process many rays in parallel; keeping this per-ray information in the stacks improves system performance. Second, we use a three-phase break method in the triangle intersection test, which allows the loop to exit earlier. Third, the reciprocal unit uses an approximation method that combines Parabolic Synthesis and second-degree interpolation. Combining these strategies, we implement our system at the RTL level using agile chip development. Simulation and experimental results show that our architecture achieves a performance per area 2.4× greater than the best reported results for ray tracing on dedicated hardware.
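
To illustrate how early exits shorten a ray-triangle intersection test, here is a minimal software sketch. The abstract does not define the three phases of its break method, so this sketch assumes the standard Möller–Trumbore test, which has three natural exit points (parallel ray, barycentric `u` out of range, barycentric `v` out of range). It is not the RT engine's actual RTL, and the function and helper names are hypothetical.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def intersect(origin: Vec3, direction: Vec3,
              v0: Vec3, v1: Vec3, v2: Vec3,
              eps: float = 1e-9) -> Optional[float]:
    """Moller-Trumbore test (assumed stand-in for the paper's method).

    Returns the ray parameter t of the hit, or None on a miss.
    """
    e1 = sub(v1, v0)
    e2 = sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)

    # Exit 1: ray is parallel to the triangle plane.
    if abs(det) < eps:
        return None

    # The only division in the test; in hardware this is the kind of
    # operation an approximate reciprocal unit could serve (assumption).
    inv_det = 1.0 / det

    t_vec = sub(origin, v0)
    u = dot(t_vec, p) * inv_det
    # Exit 2: first barycentric coordinate outside the triangle.
    if u < 0.0 or u > 1.0:
        return None

    q = cross(t_vec, e1)
    v = dot(direction, q) * inv_det
    # Exit 3: second barycentric coordinate outside the triangle.
    if v < 0.0 or u + v > 1.0:
        return None

    t = dot(e2, q) * inv_det
    return t if t > eps else None  # hits behind the origin count as misses
```

Each early return skips the remaining dot and cross products, which is why rejecting a miss as soon as possible matters when most ray-triangle tests fail.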