Network-on-chip (NoC) architectures are being widely adopted as the number of cores on a chip grows and as power/energy consumption and latency continue to rise. A well-chosen NoC architecture can therefore contribute significantly to the performance of these networks. Accordingly, this paper proposes a new architecture that decreases the network diameter. A new topological structure is presented, based on a node-layer clustering algorithm (NLCA) together with several rules for selecting the main node as cluster-head (CH). A deadlock-free routing algorithm is then proposed for this topology. To examine the effect of the architecture on algorithm speed, the Scalable Universal Matrix Multiplication Algorithm (SUMMA) is implemented and evaluated. With the reduced network diameter, the simulation results indicate a 10% improvement in energy consumption, a 5.3% growth in network latency, and a 20% enhancement in throughput when the proposed architecture is used. Moreover, when SUMMA is run on the proposed architecture, it achieves a better cost than on its counterparts.