Nacre is known for combining high toughness with low weight. Its structure consists of soft proteins and stiff calcium carbonate, an arrangement that deflects cracks which would otherwise propagate in straight lines and thereby increases energy dissipation. However, nacre microstructures are challenging to mimic because the number of possible material combinations in the design space is intractably large. We therefore propose a reinforcement learning (RL) framework to efficiently design high-toughness nacre-like structures. Focusing on the local structure at the crack tip, we couple reinforcement learning with finite element simulation and optimize the structure by exchanging soft and stiff materials within the design space. Starting from an initial unit cell that consists mostly of soft material, our method gradually improves the cell by rearranging stiff and soft materials to achieve higher toughness. The optimized designs exhibit crack-insensitive behavior and excellent crack resistance in both finite element simulation and experimental testing. This design framework can be applied to synthetic systems that require rapid structural rearrangement, such as biomaterials and unexposed substructures, to increase their mechanical performance.
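The optimization loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the 3x3 grid size, the all-soft starting cell, the tabular Q-learning agent, the hyperparameters, and in particular `surrogate_toughness`, a toy score standing in for the finite element evaluation, are all assumptions made for illustration. The sketch only shows the general pattern of an agent that replaces one material at a time and is rewarded by the resulting change in a toughness measure.

```python
import random

ROWS, COLS = 3, 3          # toy unit cell; the size is an assumption
SOFT, STIFF = 0, 1


def surrogate_toughness(cell):
    """Hypothetical stand-in for a finite element toughness evaluation.

    Rewards stiff-stiff neighbours within a row and soft/stiff alternation
    between rows. This is a toy score, not a physical model.
    """
    score = 0
    for r in range(ROWS):
        for c in range(COLS):
            if c + 1 < COLS and cell[r][c] == STIFF and cell[r][c + 1] == STIFF:
                score += 1
            if r + 1 < ROWS and cell[r][c] != cell[r + 1][c]:
                score += 1
    return score


def flip(cell, action):
    """Return a new cell with the material at flat index `action` replaced."""
    r, c = divmod(action, COLS)
    new = [list(row) for row in cell]
    new[r][c] = STIFF if new[r][c] == SOFT else SOFT
    return tuple(tuple(row) for row in new)


def optimize(episodes=200, steps=12, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning over material replacements (illustrative only)."""
    rng = random.Random(seed)
    # Start from an all-soft cell (assumed; the text says "mostly soft").
    start = tuple(tuple(SOFT for _ in range(COLS)) for _ in range(ROWS))
    q = {}  # Q-table: (state, action) -> estimated return
    best, best_score = start, surrogate_toughness(start)
    n_actions = ROWS * COLS
    for _ in range(episodes):
        state = start
        for _ in range(steps):
            # Epsilon-greedy choice of which cell to replace.
            if rng.random() < eps:
                action = rng.randrange(n_actions)
            else:
                action = max(range(n_actions),
                             key=lambda a: q.get((state, a), 0.0))
            nxt = flip(state, action)
            # Reward is the change in the surrogate toughness score.
            reward = surrogate_toughness(nxt) - surrogate_toughness(state)
            target = reward + gamma * max(q.get((nxt, a), 0.0)
                                          for a in range(n_actions))
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (target - old)
            state = nxt
            # Keep the best design seen so far.
            if surrogate_toughness(state) > best_score:
                best, best_score = state, surrogate_toughness(state)
    return best, best_score


if __name__ == "__main__":
    design, score = optimize()
    for row in design:
        print("".join("#" if v == STIFF else "." for v in row))
    print("surrogate score:", score)
```

In a realistic setting the surrogate would be replaced by a call to a finite element solver, and the tabular agent by a function approximator, since the number of states grows exponentially with the number of cells.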