Abstract

Recently, deep neural networks (DNNs) have been widely studied and have achieved tremendous success in a variety of real-world applications, such as computer vision, natural language processing, medical diagnosis, and machine translation. Deep reinforcement learning (DRL), an emerging and powerful deep learning technique, combines DNNs with reinforcement learning in an interactive system that enables an agent to learn, through interaction with its environment, the policy that maximizes its reward. DRL opens up many new applications in domains such as healthcare, robotics, and smart grids. With the rapid evolution of IT infrastructure, cloud computing has become the most common computing paradigm, and its underlying infrastructure relies on a large number of data centers. With the fast development of communication technologies such as 5G, cloud computing is now extending into a complementary paradigm, edge computing, which serves emerging applications, especially in the Internet of Things (IoT). Energy efficiency on both the "cloud" and the "edge" is therefore becoming increasingly critical and calls for more attention. To address these real-world energy efficiency problems, we take advantage of deep learning and deep reinforcement learning techniques to efficiently model both "edge" and "cloud" applications.

On the "edge" side, wearable and IoT devices built on embedded systems are becoming increasingly popular in both personal and commercial applications. Non-volatile processors (NVPs) have been proposed to cope with intermittent power supplies, ensuring instant on/off operation for embedded systems under unstable power. We propose neural network-based prediction techniques to derive the best power extraction policy for NVPs, together with a converter parameter optimization technique.

On the "cloud" side, we formulate a novel optimal power management problem for data centers that minimizes total cost, proposing a long short-term memory (LSTM) neural network-based method to predict future data center power consumption, and we further present a novel DRL-based hierarchical framework that solves the overall resource allocation and power management problem in cloud computing systems. Furthermore, we explore the effectiveness of the DRL-based framework on a problem with high-dimensional state and action spaces, dynamic treatment regimes, modeling the real-life complexity of heterogeneous disease progression and treatment choices to provide doctors and patients with data-driven, personalized decision recommendations.

Last but not least, we explore the efficiency of deep neural networks themselves. Structured weight pruning is a representative DNN model compression technique that reduces both storage and computation requirements while accelerating inference; because it involves a large number of flexible hyperparameters, an automatic hyperparameter determination process is necessary. We propose an automatic structured pruning framework to compress DNNs. Our framework outperforms prior work on automatic model compression by up to 33× in pruning rate (a 120× reduction in the actual parameter count) at the same accuracy, and actual measurements on a smartphone show significant inference speedup from the proposed framework.
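
To make the structured weight pruning idea concrete, below is a minimal Python/PyTorch sketch of magnitude-based filter (channel) pruning for a single convolutional layer. It illustrates the general technique only, not the thesis framework: the prune_filters helper, the 50% keep ratio, and the layer sizes are illustrative assumptions, and the automatic per-layer hyperparameter determination described above is not reproduced.

import torch
import torch.nn as nn

def prune_filters(conv: nn.Conv2d, keep_ratio: float = 0.5) -> nn.Conv2d:
    """Keep the output filters with the largest L2 norms; drop the rest."""
    weight = conv.weight.data                      # shape: (out_ch, in_ch, kH, kW)
    norms = weight.flatten(1).norm(dim=1)          # one L2 norm per output filter
    n_keep = max(1, int(keep_ratio * weight.size(0)))
    keep = norms.topk(n_keep).indices.sort().values

    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.data = weight[keep].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep].clone()
    return pruned

conv = nn.Conv2d(16, 32, kernel_size=3, padding=1)
smaller = prune_filters(conv, keep_ratio=0.5)      # 16 of 32 filters retained

In a full network, layers that consume the pruned output would also need their input channels reduced accordingly; choosing the pruning rate per layer is precisely the hyperparameter search that the proposed framework automates.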
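
Likewise, the LSTM-based prediction of future data center power consumption mentioned above can be illustrated with a minimal sketch; the PowerLSTM model, the 48-step input window, and all hyperparameters below are illustrative assumptions rather than the configuration used in the thesis.

import torch
import torch.nn as nn

class PowerLSTM(nn.Module):
    """Forecast the next power reading(s) from a window of past readings."""
    def __init__(self, input_size=1, hidden_size=64, horizon=1):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, horizon)

    def forward(self, x):
        out, _ = self.lstm(x)            # x: (batch, seq_len, 1) past power readings
        return self.head(out[:, -1, :])  # forecast from the last hidden state

model = PowerLSTM()
window = torch.randn(8, 48, 1)   # placeholder batch: 8 windows of 48 past readings
forecast = model(window)         # shape: (8, 1), the predicted next reading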
