Parametrizable neural network accelerators enable the deployment of targeted hardware for specialized environments. Finding the best architecture configuration for a given specification, however, is challenging: a large number of hardware configurations must be considered, and for each hardware instance an efficient software execution plan must be found, which leads to a vast search space. Prior work has tackled this problem by dividing the search into subproblems for individual layers of a network. There is no guarantee, however, that the hardware configuration delivering the best end-to-end performance across the entire network is among the best configurations for individual layers. This work presents SENNA, a unified hardware/software space exploration framework for parametrizable neural network accelerators. To guide the exploration towards the overall best configuration, SENNA employs a multi-objective genetic algorithm with a novel design space representation that encodes hardware and software parameters in a single chromosome. Using the Parallel Island Model (PIM), SENNA represents each layer by one or more islands, each containing a separate population, so that the search for the best configuration proceeds simultaneously across the entire network. A tailored gene migration technique enables the exchange of genes between the populations of different islands. SENNA is evaluated with three parametrizable architectures and four neural networks. The evaluation demonstrates that SENNA achieves up to 1.92x improvement in energy-delay product (EDP) compared to the state of the art. With equivalent evaluation budgets, SENNA shows a 2.5x to 9.3x speedup compared to an Oracle scheme and the state of the art.
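The island-based search described above can be illustrated with a minimal Python sketch. It is not SENNA's implementation. The parameter spaces (`HW_SPACE`, `SW_SPACE`), the cost function `toy_cost`, the ring-shaped migration that copies only hardware genes, and the final selection step `best_shared_hardware` are all hypothetical choices made for illustration. For brevity, the sketch also replaces the multi-objective selection with a single scalar, EDP-like cost.

```python
import random

# Hypothetical parameter spaces; real accelerators expose different knobs.
HW_SPACE = {"pe_rows": [4, 8, 16, 32], "pe_cols": [4, 8, 16, 32],
            "buffer_kb": [32, 64, 128, 256]}
SW_SPACE = {"tile": [1, 2, 4, 8, 16], "loop_order": [0, 1, 2]}
SPACE = {**HW_SPACE, **SW_SPACE}  # one chromosome holds both gene groups


def random_chromosome(rng):
    return {gene: rng.choice(values) for gene, values in SPACE.items()}


def toy_cost(chrom, layer_work):
    """Made-up stand-in for an EDP estimate (latency times energy)."""
    pes = chrom["pe_rows"] * chrom["pe_cols"]
    latency = (layer_work / (pes * chrom["tile"])
               + 0.5 * chrom["tile"] + chrom["loop_order"])
    energy = 0.1 * pes + 0.01 * chrom["buffer_kb"] + layer_work / chrom["buffer_kb"]
    return latency * energy


def evolve(pop, layer_work, rng, mut_rate=0.2):
    """One generation: keep the better half, refill by crossover and mutation."""
    ranked = sorted(pop, key=lambda c: toy_cost(c, layer_work))
    elite = ranked[: len(ranked) // 2]
    children = []
    while len(elite) + len(children) < len(pop):
        a, b = rng.sample(elite, 2)
        child = {g: rng.choice((a[g], b[g])) for g in SPACE}  # uniform crossover
        for g in SPACE:
            if rng.random() < mut_rate:
                child[g] = rng.choice(SPACE[g])
        children.append(child)
    return elite + children


def migrate(islands, layer_works):
    """Assumed policy: copy the hardware genes of each island's best
    individual into the worst individual of the next island (ring)."""
    bests = [min(pop, key=lambda c: toy_cost(c, w))
             for pop, w in zip(islands, layer_works)]
    for i, pop in enumerate(islands):
        worst = max(pop, key=lambda c: toy_cost(c, layer_works[i]))
        for g in HW_SPACE:
            worst[g] = bests[i - 1][g]


def best_shared_hardware(islands, layer_works):
    """Pick the hardware whose summed cost over all layers is lowest,
    pairing it with the best software genes found on each island."""
    candidates = sorted({tuple(c[g] for g in HW_SPACE)
                         for pop in islands for c in pop})

    def total(hw):
        hw_genes = dict(zip(HW_SPACE, hw))
        return sum(min(toy_cost({**c, **hw_genes}, w) for c in pop)
                   for pop, w in zip(islands, layer_works))

    best = min(candidates, key=total)
    return dict(zip(HW_SPACE, best)), total(best)


def search(layer_works, pop_size=8, generations=20, migrate_every=5, seed=0):
    rng = random.Random(seed)
    islands = [[random_chromosome(rng) for _ in range(pop_size)]
               for _ in layer_works]  # one island per layer
    for gen in range(1, generations + 1):
        islands = [evolve(pop, w, rng) for pop, w in zip(islands, layer_works)]
        if gen % migrate_every == 0:
            migrate(islands, layer_works)
    return best_shared_hardware(islands, layer_works)


if __name__ == "__main__":
    hw, cost = search([1e4, 5e4, 2e4])  # three layers with made-up workloads
    print(hw, cost)
```

In this sketch, migration moves only the hardware genes, because the hardware must be shared by every layer while the software mapping may differ per layer. Whether SENNA's tailored migration works this way is an assumption.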