Abstract

Deep learning is a branch of machine learning that focuses on learning hierarchical feature representations of data using neural networks, with each successive layer representing information at a progressively more abstract level than the previous one. Research progress in deep learning has greatly accelerated over the past decade, with tremendous advances leading to state-of-the-art performance on a wide range of tasks, well beyond conventional machine learning methods. For example, deep neural networks have excelled at visual perception tasks, leading to an increase in top-5 accuracy from 74.2% in 2011 (before deep learning) to 97.6% in 2019, which is beyond human performance. Similar state-of-the-art results surpassing past machine learning methods have also been achieved in audio perception [1], natural language processing [2], game playing [3], scientific discovery [4], and content generation [5]. Given these recent successes, there has been not only tremendous interest and focus within the academic research community, but also significant investment by industry in the widespread adoption of deep learning for solving complex real-world problems such as autonomous driving, smart cities, manufacturing, and finance. Despite the advances in accuracy and performance gained via deep learning, one of the biggest challenges to widespread ‘operational’ adoption is the sheer complexity of the high-performing deep neural networks created by the research community. Because much of the focus has been on modelling accuracy and performance, many of the created deep neural networks have highly complex architectures that are intractable from both computational and memory perspectives in real-world operational scenarios.
This complexity issue is particularly challenging in edge and mobile scenarios, where on-device processing is highly desirable (and often necessary) for privacy and latency/bandwidth reasons, yet embedded chips have greatly restricted computational, memory, and energy resources. The ability to design and create deep neural networks that are not only high-performing but also operate well on low-power embedded devices has therefore become a crucial path towards breaking the barrier to real-world, operational use and deployment of deep learning in industrial scenarios. Aiming to break this barrier, one of the key focuses of our research group at the Department of Systems Design Engineering, University of Waterloo, Canada, has been novel computer-assisted design strategies for creating highly efficient, high-performing deep neural networks for edge and mobile usage on low-power devices. Given the tremendous need to bring deep learning from theory to operational practice, several strategies have been introduced in recent years for achieving compact neural networks, ranging from design principles [6, 7] and precision reduction [8] to model compression [9] and automated network architecture search [10, 11]. Much current research focuses either on purely manual principled design strategies, which are highly time consuming for designers and by nature require significant trial and error, or on fully automatic methods, which are potentially more flexible but extremely computationally expensive and intractable for rapid design. The challenge we wish to address is therefore to devise a strategy that offers great flexibility to create custom-tailored neural networks for specific tasks and requirements, yet through a design process that is rapid from both a human and a computational labour perspective.
Motivated by this, what sets our research direction apart from other strategies is our focus on the notion of human-machine collaborative design, where we combine human experience and creativity with the meticulousness and raw speed of machines to enable rapid design of compact neural networks tailored to human-specified operational requirements and constraints for a given real-world scenario. (Figure: Overview of the generative synthesis (GenSynth) process for teaching generative machines to automatically generate deep neural networks with efficient network architectures.) To realise this concept, we explore a very different research question: can we teach generative machines to automatically generate deep neural networks with efficient network architectures? We mathematically formulate this question as a constrained optimisation problem, where the goal is to learn a generative machine that synthesises deep neural networks maximising a universal performance function while satisfying human-specified design and operational requirements and constraints. Because such a constrained optimisation problem is intractable to solve directly, due to the enormity of the feasible solution region, we instead introduce a new method for finding an approximate solution, generative synthesis, which is premised on the intricate interplay between a generator-inquisitor pair that work in tandem to garner insights and learn to generate highly efficient deep neural networks that best satisfy operational requirements. At a high level (see figure), an inquisitor probes a neural network synthesised by a generator to learn, at a foundational level, about the architectural efficiencies of the network based on its reactionary responses. The generator is then updated based on the knowledge and insights gained by the inquisitor, enabling it to learn to synthesise better, more compact neural networks.
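The constrained optimisation described above can be sketched in notation as follows; this is an assumed, simplified rendering for illustration rather than the authors' exact formulation, with $\mathcal{G}$ the generator mapping seeds $s \in S$ to network architectures, $\mathcal{U}$ the universal performance function, and $1_r$ an indicator that equals 1 only when the human-specified requirements $r$ are satisfied:

```latex
% Learn the generator that maximises universal performance
% while satisfying the operational requirements r for every seed s
\mathcal{G}^{*} = \arg\max_{\mathcal{G}} \; \mathcal{U}\bigl(\mathcal{G}(s)\bigr)
\quad \text{subject to} \quad 1_{r}\bigl(\mathcal{G}(s)\bigr) = 1, \quad \forall s \in S
```

Since the feasible region over all possible architectures is enormous, generative synthesis seeks only an approximate solution to this problem, as described above.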
This entire process is repeated over cycles, progressively improving the generator's ability to synthesise ever more compact deep neural networks that satisfy operational requirements. What is most interesting is that, once the generator has been taught through generative synthesis, it can be used to generate not just one but a large variety of unique, highly efficient deep neural networks that satisfy the operational requirements. The result of this generative synthesis strategy is the automatic generation of highly efficient deep neural networks (which we nickname FermiNets) that outperform networks produced using state-of-the-art human-driven principled design strategies and machine-driven automatic network architecture search strategies in terms of information density, computational cost, and accuracy, by more than an order of magnitude in some cases. Furthermore, we demonstrated significant energy efficiency improvements (by more than 4× in one case) on mobile processors, further illustrating the effectiveness of the proposed generative synthesis strategy for the rapid design of compact deep neural networks catered for real-world, practical usage. The biggest impact of this work, along with related work in our research group on AI-assisted design of compact deep neural networks for operational, real-world use via human-machine collaborative design, is that it is a major enabler for the widespread adoption of deep learning on current low-power, low-cost hardware, and it is already being adopted by industry across sectors ranging from aerospace to consumer electronics to create intelligent AI-powered systems on the edge.
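The generator-inquisitor cycle described above can be illustrated with a deliberately toy sketch. Everything here is an assumption for illustration only: a 'network' is reduced to a list of layer widths, `perf` stands in for the universal performance function, and `feasible` for the operational constraints; the actual GenSynth machinery is far more sophisticated.

```python
import random

def generative_synthesis(seed, performance_fn, meets_requirements,
                         cycles=20, probes=8, rng=None):
    """Toy sketch of the generator-inquisitor cycle (illustrative only).

    Each cycle, the 'generator' synthesises candidate networks around the
    current best design; the 'inquisitor' probes them against the
    requirements, and the insight gained (the best feasible candidate)
    updates the generator's starting point for the next cycle.
    """
    rng = rng or random.Random(0)
    best = list(seed)
    for _ in range(cycles):
        # Generator: synthesise candidate networks by perturbing layer widths.
        candidates = [
            [max(1, w + rng.choice([-8, -4, 0, 4])) for w in best]
            for _ in range(probes)
        ]
        # Inquisitor: probe candidates, keeping only those meeting requirements.
        feasible_nets = [c for c in candidates if meets_requirements(c)]
        if not feasible_nets:
            continue
        # Update the generator's state with the best feasible insight.
        top = max(feasible_nets, key=performance_fn)
        if performance_fn(top) > performance_fn(best):
            best = top
    return best

def perf(net):
    # Hypothetical 'universal performance': accuracy proxy minus a size penalty.
    total = sum(net)
    return min(total, 200) / 200 - 0.001 * total

def feasible(net):
    # Hypothetical operational constraint: a total parameter budget.
    return sum(net) <= 400

net = generative_synthesis([128, 128, 128], perf, feasible)
```

Over the cycles, the accepted designs shrink the network toward the smallest total width that still scores well, mirroring (in miniature) how repeated generator updates yield progressively more compact networks.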
Since the results reported in this study, we have made further advances in the types of unique architecture topologies that can be generated via generative synthesis, both to push the limits on the diversity of tasks it can handle (e.g., audio perception, natural language processing) and to better cater to the underlying hardware architecture (e.g., CPU, GPU) for greater efficiency and acceleration. This is a very exciting time for operationalising deep learning, which is crucial for moving this powerful machine learning concept from theory to widespread practice. As this area has only garnered interest in the past few years and is very much in its infancy, there are numerous unexplored research directions and avenues to explore given the limitations that still exist for AI-assisted design of compact deep neural networks (e.g., design exploration speed, optimisation of the design exploration space for high-quality deep neural network architectures, and assessment of the quality of deep neural network designs against performance targets and criteria). There will be significant advances and developments in this area in the next decade and beyond, and our research group is very happy to help push the boundaries to enable deep learning for everyone.
