An efficient and uncertainty-aware reinforcement learning framework for quality assurance in extrusion additive manufacturing
33
- 10.1109/access.2019.2905264
- Jan 1, 2019
- IEEE Access
34
- 10.1016/j.jmsy.2024.04.013
- Apr 22, 2024
- Journal of Manufacturing Systems
78
- 10.1021/acsbiomaterials.0c01761
- Apr 22, 2021
- ACS Biomaterials Science & Engineering
43
- 10.1109/tii.2019.2920661
- Jun 14, 2019
- IEEE Transactions on Industrial Informatics
126
- 10.1038/s41467-022-32126-1
- Aug 23, 2022
- Nature Communications
71
- 10.3390/jcs6070202
- Jul 8, 2022
- Journal of Composites Science
4
- 10.1108/wje-08-2020-0347
- Apr 6, 2021
- World Journal of Engineering
39
- 10.3390/inventions5030025
- Jul 1, 2020
- Inventions
455
- 10.1016/j.compchemeng.2020.106886
- Apr 28, 2020
- Computers & Chemical Engineering
53
- 10.1089/3dp.2021.0231
- Jan 5, 2022
- 3D Printing and Additive Manufacturing
- Research Article
2
- 10.1016/j.measurement.2016.05.082
- May 30, 2016
- Measurement
Indirect measurement of high grid strip densities over Nyquist sampling rate based on the moiré pattern analysis for quality assurance in grid manufacturing
- Research Article
- 10.1002/ente.202500916
- Oct 12, 2025
- Energy Technology
An efficient deep reinforcement learning (DRL) energy management strategy for parallel hybrid electric vehicles is proposed in this paper. Firstly, a comprehensive model of the vehicle's powertrain system is established, and the energy management problem is briefly described. Subsequently, an efficient learning framework based on the deep Q-network (DQN) algorithm is constructed. The framework additionally incorporates a thermostat-based rule-aiding system that guides training and, in conjunction with an enhanced prioritized experience replay, accelerates training and improves the optimality of the learned policy while avoiding cold-start in RL. In simulation, the proposed model is compared with methods based on DQN and rule-DQN. The results demonstrate that the newly developed DRL framework achieves higher training efficiency and optimality. Furthermore, this research examines how changes in the initial values of the rule adoption rate and exploration rate influence the control performance of the model. Finally, the model, trained using a 520-second driving cycle, is evaluated on the worldwide light-duty test cycle (WLTC), validating its high adaptability.
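The rule-aided DQN described above can be illustrated with a minimal sketch: a thermostat-style rule proposes an action with some probability (the rule adoption rate), otherwise the agent explores or follows its Q-values. Everything here (the rule thresholds, the linear stand-in for the Q-network, the decay schedules) is an assumption for illustration, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def thermostat_rule(soc, low=0.4, high=0.6):
    """Hypothetical thermostat rule: charge the battery when state of charge
    is low, let the motor assist when it is high, otherwise stay neutral."""
    if soc < low:
        return 0          # engine charges battery
    if soc > high:
        return 2          # electric motor assists
    return 1              # maintain current power split

def q_values(state, weights):
    """Stand-in for a DQN forward pass (a linear Q-function here)."""
    return weights @ state

def select_action(state, weights, rule_rate, epsilon, n_actions=3):
    """Blend the guiding rule, random exploration, and the greedy Q action."""
    soc = state[0]
    if rng.random() < rule_rate:          # follow the rule-aiding system
        return thermostat_rule(soc)
    if rng.random() < epsilon:            # explore
        return int(rng.integers(n_actions))
    return int(np.argmax(q_values(state, weights)))  # exploit

# Toy rollout: rule_rate and epsilon decay as training proceeds.
weights = rng.normal(size=(3, 4))
rule_rate, epsilon = 0.8, 0.3
for step in range(5):
    state = rng.random(4)
    a = select_action(state, weights, rule_rate, epsilon)
    rule_rate *= 0.99
    epsilon *= 0.995
    print(step, a, round(rule_rate, 3))
```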
- Research Article
104
- 10.1109/lra.2020.3005126
- Jun 25, 2020
- IEEE Robotics and Automation Letters
In the past decades, we have witnessed significant progress in the domain of autonomous driving. Advanced techniques based on optimization and reinforcement learning become increasingly powerful when solving the forward problem: given designed reward/cost functions, how we should optimize them and obtain driving policies that interact with the environment safely and efficiently. Such progress has raised another equally important question: what should we optimize? Instead of manually specifying the reward functions, it is desired that we can extract what human drivers try to optimize from real traffic data and assign that to autonomous vehicles to enable more naturalistic and transparent interaction between humans and intelligent agents. To address this issue, we present an efficient sampling-based maximum-entropy inverse reinforcement learning (IRL) algorithm in this letter. Different from existing IRL algorithms, by introducing an efficient continuous-domain trajectory sampler, the proposed algorithm can directly learn the reward functions in the continuous domain while considering the uncertainties in demonstrated trajectories from human drivers. We evaluate the proposed algorithm via real-world driving data, including both non-interactive and interactive scenarios. The experimental results show that the proposed algorithm achieves more accurate prediction performance with faster convergence speed and better generalization compared to other baseline IRL algorithms.
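As a rough illustration of sampling-based maximum-entropy IRL (not the letter's exact algorithm), the sketch below learns a linear reward over trajectory features by gradient ascent, using sampled trajectories to approximate the partition function over the continuous trajectory space; the feature definition and data are placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)

def trajectory_features(traj):
    """Hypothetical features: column-wise means of a (T, 3) trajectory array
    (e.g. speed, acceleration, lateral jerk)."""
    return traj.mean(axis=0)

def maxent_irl(demo_trajs, sampled_trajs, lr=0.1, iters=200):
    """Gradient ascent on the max-entropy IRL objective with a linear reward
    r(traj) = theta . phi(traj); the sampled trajectories approximate the
    partition function."""
    phi_demo = np.mean([trajectory_features(t) for t in demo_trajs], axis=0)
    phi_samp = np.stack([trajectory_features(t) for t in sampled_trajs])
    theta = np.zeros(phi_demo.shape[0])
    for _ in range(iters):
        logits = phi_samp @ theta
        w = np.exp(logits - logits.max())
        w /= w.sum()                        # Boltzmann weights over samples
        grad = phi_demo - w @ phi_samp      # demo minus expected features
        theta += lr * grad
    return theta

demos = [rng.normal(size=(50, 3)) + [1.0, 0.0, 0.0] for _ in range(10)]
samples = [rng.normal(size=(50, 3)) for _ in range(100)]
print(maxent_irl(demos, samples))
```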
- Research Article
4
- 10.1109/mm.2022.3199686
- Nov 1, 2022
- IEEE Micro
In this article, we propose an energy-efficient architecture designed to receive both image and text inputs as a step toward reinforcement learning agents that can understand human language and act in real-world environments. We evaluate the proposed method in three different software environments and on a low-power drone named Crazyflie, navigating toward specified goals and avoiding obstacles successfully. To find the most efficient language-guided reinforcement learning model, we implemented the model with various configurations of image input sizes and text instruction sizes on the Crazyflie drone's GAP8, which consists of eight RISC-V cores. The task completion success rate and the onboard power consumption, latency, and memory usage of the GAP8 are measured and compared with a Jetson TX2 ARM central processing unit and a Raspberry Pi 4. The results show that by decreasing the input image size by 20% we achieve up to 78% energy improvement while maintaining an 82% task completion success rate.
- Research Article
3
- 10.1177/09544089231175207
- Jun 20, 2023
- Proceedings of the Institution of Mechanical Engineers, Part E: Journal of Process Mechanical Engineering
Among all the additive manufacturing (AM) technologies, material extrusion-based AM is the most popular and is widely used for prototyping, rapid tooling, and direct part production. In material extrusion AM (MEAM), a solid form of material is converted to a semi-molten form and extruded out of a nozzle to deposit material in a layer-by-layer manner. MEAM three-dimensional (3D) printers are now popular across many application sectors, which has motivated the development of various commercial and open-source entry-level 3D printers, including filament extruder printers, pellet extruder printers, and combinations of both. However, to develop or operate these printers, preliminary knowledge of the various extrusion parameters is required to obtain the desired quality in the fabricated parts. Therefore, an initial attempt has been made to provide information on the various extrusion parameters, including their definitions, working ranges, and effects on performance measures. Studies available in the literature have investigated the impact of process parameters on the performance of either filament extrusion AM or pellet/granule extrusion AM, whereas this study provides a comprehensive review of the effects of process parameters on performance measures for both types of extrusion-based AM processes. Along with a detailed review of previous works, a study of the various process parameters, modeling, and methodology is provided. A statistical modeling approach with histogram and normal plots is presented, along with means, standard deviations, and percentages, to decide the optimum process parameters based on data available in the existing literature.
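The statistical summarization described above can be sketched as follows; the parameter values are placeholders rather than figures taken from the literature, and the working-range heuristic (mean plus or minus one standard deviation) is an assumption.

```python
import numpy as np

# Hypothetical nozzle-temperature values (deg C) collected from different
# studies; the figures below are placeholders, not literature data.
nozzle_temp = np.array([190, 200, 210, 205, 195, 215, 200, 208, 198, 212])

mean = nozzle_temp.mean()
std = nozzle_temp.std(ddof=1)
counts, edges = np.histogram(nozzle_temp, bins=5)

print(f"mean = {mean:.1f} C, std = {std:.1f} C")
print(f"suggested working range: {mean - std:.0f}-{mean + std:.0f} C")
for c, lo, hi in zip(counts, edges[:-1], edges[1:]):
    print(f"{lo:5.1f}-{hi:5.1f} C: {'#' * int(c)}")
```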
- Research Article
- 10.1609/aaai.v39i27.35123
- Apr 11, 2025
- Proceedings of the AAAI Conference on Artificial Intelligence
Reinforcement Learning (RL) has emerged as a powerful paradigm for sequential decision-making with numerous real-world applications. However, in practical environments such as recommender systems, search engines, and LLMs, RL algorithms must efficiently learn from biased human feedback that may be subject to corruption. In this talk, I will present our recent efforts in developing robust RL algorithms that can provably and effectively handle such challenging scenarios. First, I will introduce our works on reinforcement learning from biased click feedback in ranking. While previous approaches typically relied on strong assumptions about human click behavior (formalized as click models) and required specialized debiasing methods for different models, we propose a novel unified framework that formulates the ranking process under general click models as a Markov Decision Process, enabling the development of a click model-agnostic RL algorithm. Second, I will introduce the fundamental vulnerability of bandits and reinforcement learning under corrupted feedback. Our theoretical analysis provides complete necessity and sufficiency characterizations of the attackability of linear bandits and linear RL, revealing their intrinsic robustness and limitations. Lastly, I will discuss our recent works on improving RL finetuning for LLMs, including sample-efficient off-policy RLHF and solving the gradient entanglement issue in margin-based alignment methods.
- Book Chapter
7
- 10.1016/b978-0-12-818411-0.00021-5
- Jan 1, 2021
- Additive Manufacturing
Chapter 6 - Polymer and composites additive manufacturing: material extrusion processes
- Conference Article
22
- 10.1109/icra.2017.7989383
- May 1, 2017
Autonomous learning of robotic skills can allow general-purpose robots to learn wide behavioral repertoires without extensive manual engineering. However, robotic skill learning must typically make trade-offs to enable practical real-world learning, such as requiring manually designed policy or value function representations, initialization from human demonstrations, instrumentation of the training environment, or extremely long training times. We propose a new reinforcement learning algorithm that can train general-purpose neural network policies with minimal human engineering, while still allowing for fast, efficient learning in stochastic environments. We build on the guided policy search (GPS) algorithm, which transforms the reinforcement learning problem into supervised learning from a computational teacher (without human demonstrations). In contrast to prior GPS methods, which require a consistent set of initial states to which the system must be reset after each episode, our approach can handle random initial states, allowing it to be used even when deterministic resets are impossible. We compare our method to existing policy search algorithms in simulation, showing that it can train high-dimensional neural network policies with the same sample efficiency as prior GPS methods, and can learn policies directly from image pixels. We also present real-world robot results that show that our method can learn manipulation policies with visual features and random initial states.
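A minimal sketch of the guided-policy-search idea named above, assuming a hand-written linear teacher controller and least-squares regression in place of trajectory optimization and a neural network policy: the computational teacher is rolled out from random initial states, and the global policy is fit by supervised learning on the collected state-action pairs.

```python
import numpy as np

rng = np.random.default_rng(2)

def teacher_controller(x):
    """Stand-in for a local optimal controller (e.g. from trajectory
    optimization): a simple linear feedback law toward the origin."""
    return -0.8 * x

def collect_teacher_data(n_rollouts=20, horizon=30):
    """Roll out the teacher from random initial states (no fixed resets)."""
    states, actions = [], []
    for _ in range(n_rollouts):
        x = rng.uniform(-1, 1, size=2)     # random initial state
        for _ in range(horizon):
            u = teacher_controller(x)
            states.append(x.copy())
            actions.append(u.copy())
            x = x + 0.1 * u + 0.01 * rng.normal(size=2)  # noisy dynamics
    return np.array(states), np.array(actions)

# Supervised "policy" fit: least-squares regression from states to actions,
# playing the role of the neural network policy in GPS.
X, U = collect_teacher_data()
W, *_ = np.linalg.lstsq(X, U, rcond=None)
print("learned policy gain:\n", W)
```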
- Conference Article
4
- 10.1145/3380446.3430640
- Nov 16, 2020
In the era of Artificial Intelligence and Internet of Things (AIoT), battery-powered mobile devices are required to perform more sophisticated tasks featuring fast-varying workloads under a constrained power supply, demanding more efficient run-time power management. In this paper, we propose a deep reinforcement learning framework for dynamic power and thermal co-management. We build several machine learning models that incorporate the physical details of an ARM Cortex-A72, with on average 3% and 1% error for power and temperature predictions, respectively. We then build an efficient deep reinforcement learning controller that incorporates the machine learning models and facilitates run-time dynamic voltage and frequency scaling (DVFS) strategy selection based on the predicted power, workload, and temperature. We evaluate the proposed framework and compare its performance with existing management methods. The results suggest that our proposed framework can achieve a 6.8% performance improvement compared with the alternatives.
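The selection step can be sketched as below, assuming toy power and temperature models in place of the learned predictors and a greedy constrained choice in place of the DQN policy; the DVFS operating points and the thermal cap are made-up values.

```python
# Hypothetical DVFS operating points for a Cortex-A72-class core:
# (frequency in GHz, relative performance). The power and temperature models
# below are toy stand-ins for the learned predictors described in the entry.
dvfs_levels = [(0.6, 0.4), (1.0, 0.65), (1.4, 0.85), (1.8, 1.0)]

def predict_power(freq, workload):
    return 0.5 + 1.2 * workload * freq ** 2               # assumed model (W)

def predict_temp(freq, workload, ambient=45.0):
    return ambient + 12.0 * predict_power(freq, workload)  # assumed (deg C)

def choose_level(workload, temp_cap=80.0):
    """Greedy stand-in for the DRL policy: maximize predicted performance
    among levels whose predicted temperature stays under the cap."""
    best, best_perf = None, -1.0
    for freq, perf in dvfs_levels:
        if predict_temp(freq, workload) <= temp_cap and perf > best_perf:
            best, best_perf = (freq, perf), perf
    return best or dvfs_levels[0]          # fall back to the lowest level

for load in (0.2, 0.5, 0.9):
    print(load, choose_level(load))
```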
- Research Article
4
- 10.1109/tvt.2023.3268822
- Sep 1, 2023
- IEEE Transactions on Vehicular Technology
In order to provide customized services for the future sixth-generation (6G) mass business, we propose a two-timescale intelligent radio access network (RAN) slicing scheme under the architecture of cell-free distributed massive multiple-input multiple-output (MIMO) systems. Cell-free distributed massive MIMO systems have powerful macro diversity gain and multi-user interference suppression capabilities to improve the performance of different slices, and are more flexible in terms of service types based on network slicing. In the proposed scheme, we utilize the long-term and short-term trends of the network to achieve adaptive resource allocation at different timescales, so as to utilize resources more effectively and meet performance requirements in parallel. Moreover, multi-connectivity is utilized to improve link reliability and further improve system performance, and user clustering reduces the impact of pilot contamination on system performance. In order to implement the RAN control strategy effectively, an efficient two-level deep reinforcement learning framework is proposed and the multi-agent reinforcement learning algorithm is used to realize efficient network resource interaction in multi-device scenarios. Simulation results further verify the effectiveness of the proposed intelligent RAN slicing scheme.
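A minimal sketch of the two-timescale idea, with heuristic allocators standing in for the two levels of the deep reinforcement learning framework: a slow controller re-partitions resource blocks between slices every few steps, and a fast controller serves instantaneous demand within each slice at every step. The slice demands, resource-block budget, and update period are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n_slices, total_rbs, slow_period = 2, 100, 10

def slow_allocate(avg_demand):
    """Slow timescale: split resource blocks proportionally to the long-term
    demand of each slice (stand-in for the upper-level agent)."""
    share = avg_demand / avg_demand.sum()
    return np.round(share * total_rbs).astype(int)

def fast_schedule(slice_rbs, instant_demand):
    """Fast timescale: within each slice, serve up to the allocated blocks
    (stand-in for the lower-level multi-agent schedulers)."""
    return np.minimum(instant_demand, slice_rbs)

demand_hist = np.ones((slow_period, n_slices))
slice_rbs = slow_allocate(demand_hist.mean(axis=0))
for t in range(30):
    demand = rng.poisson([30, 60])            # per-slice instantaneous demand
    served = fast_schedule(slice_rbs, demand)
    demand_hist[t % slow_period] = demand
    if (t + 1) % slow_period == 0:            # slow-timescale update
        slice_rbs = slow_allocate(demand_hist.mean(axis=0))
    print(t, slice_rbs, served)
```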
- Research Article
3
- 10.1109/tnnls.2023.3296642
- Nov 1, 2024
- IEEE Transactions on Neural Networks and Learning Systems
Deep reinforcement learning (RL) typically requires a tremendous number of training samples, which are not practical in many applications. State abstraction and world models are two promising approaches for improving sample efficiency in deep RL. However, both state abstraction and world models may degrade the learning performance. In this article, we propose an abstracted model-based policy learning (AMPL) algorithm, which improves the sample efficiency of deep RL. In AMPL, a novel state abstraction method via multistep bisimulation is first developed to learn task-related latent state spaces. Hence, the original Markov decision processes (MDPs) are compressed into abstracted MDPs. Then, a causal transformer model predictor (CTMP) is designed to approximate the abstracted MDPs and generate long-horizon simulated trajectories with a smaller multistep prediction error. Policies are efficiently learned through these trajectories within the abstracted MDPs via a modified multistep soft actor-critic algorithm with a λ-target. Moreover, theoretical analysis shows that the AMPL algorithm can improve sample efficiency during the training process. On Atari games and the DeepMind Control (DMControl) suite, AMPL surpasses current state-of-the-art deep RL algorithms in terms of sample efficiency. Furthermore, DMControl tasks with moving noises are conducted, and the results demonstrate that AMPL is robust to task-irrelevant observational distractors and significantly outperforms the existing approaches.
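The multistep λ-target used by the modified soft actor-critic can be written down directly; the sketch below computes such targets over a simulated trajectory with the standard backward recursion (the reward and value numbers are placeholders, not outputs of the paper's model).

```python
import numpy as np

def lambda_targets(rewards, values, gamma=0.99, lam=0.95):
    """Compute TD(lambda)-style targets over a simulated trajectory.

    rewards: r_0..r_{T-1}; values: V(s_0)..V(s_T) (bootstrap value included).
    Returns targets G_0..G_{T-1} from the backward recursion
    G_t = r_t + gamma * ((1 - lam) * V(s_{t+1}) + lam * G_{t+1}).
    """
    T = len(rewards)
    targets = np.zeros(T)
    next_target = values[T]                 # bootstrap from the final value
    for t in reversed(range(T)):
        targets[t] = rewards[t] + gamma * (
            (1 - lam) * values[t + 1] + lam * next_target)
        next_target = targets[t]
    return targets

rewards = np.array([1.0, 0.0, 0.5, 1.0])
values = np.array([0.8, 0.7, 0.9, 0.6, 0.5])   # V(s_0)..V(s_4)
print(lambda_targets(rewards, values))
```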
- Research Article
2
- 10.1613/jair.1.12270
- Jan 20, 2021
- Journal of Artificial Intelligence Research
Solving multi-objective optimization problems is important in various applications where users are interested in obtaining optimal policies subject to multiple (yet often conflicting) objectives. A typical approach to obtain the optimal policies is to first construct a loss function based on the scalarization of individual objectives and then derive optimal policies that minimize the scalarized loss function. Albeit simple and efficient, the typical approach provides no insights/mechanisms on the optimization of multiple objectives due to the lack of ability to quantify the inter-objective relationship. To address the issue, we propose to develop a new efficient gradient-based multi-objective reinforcement learning approach that seeks to iteratively uncover the quantitative inter-objective relationship via finding a minimum-norm point in the convex hull of the set of multiple policy gradients when the impact of one objective on others is unknown a priori. In particular, we first propose a new PAOLS algorithm that integrates pruning and approximate optimistic linear support algorithm to efficiently discover the weight-vector sets of multiple gradients that quantify the inter-objective relationship. Then we construct an actor and a multi-objective critic that can co-learn the policy and the multi-objective vector value function. Finally, the weight discovery process and the policy and vector value function learning process can be iteratively executed to yield stable weight-vector sets and policies. To validate the effectiveness of the proposed approach, we present a quantitative evaluation of the approach based on three case studies.
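The core geometric step named above, finding a minimum-norm point in the convex hull of policy gradients, has a closed form in the two-gradient case; the sketch below shows that case only and is not the paper's PAOLS algorithm.

```python
import numpy as np

def min_norm_two(g1, g2):
    """Minimum-norm point in the convex hull of two gradients g1, g2.

    Solves min_a || a*g1 + (1-a)*g2 ||^2 for a in [0, 1]; the closed-form
    optimum is a = ((g2 - g1) . g2) / ||g1 - g2||^2, clipped to [0, 1].
    """
    diff = g1 - g2
    denom = diff @ diff
    if denom == 0.0:
        return g1, 0.5
    a = float(np.clip((g2 - g1) @ g2 / denom, 0.0, 1.0))
    return a * g1 + (1 - a) * g2, a

g1 = np.array([1.0, 0.0])
g2 = np.array([0.0, 1.0])
d, a = min_norm_two(g1, g2)
print("common descent direction:", d, "weight on g1:", a)
```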
- Research Article
3
- 10.1023/a:1018004120707
- Jan 1, 1996
- Machine Learning
This article presents a new reinforcement learning method called SANE (Symbiotic, Adaptive Neuro-Evolution), which evolves a population of neurons through genetic algorithms to form a neural network capable of performing a task. Symbiotic evolution promotes both cooperation and specialization, which results in a fast, efficient genetic search and discourages convergence to suboptimal solutions. In the inverted pendulum problem, SANE formed effective networks 9 to 16 times faster than the Adaptive Heuristic Critic and 2 times faster than Q-learning and the GENITOR neuro-evolution approach without loss of generalization. Such efficient learning, combined with few domain assumptions, make SANE a promising approach to a broad range of reinforcement learning problems, including many real-world applications.
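One generation of the symbiotic evaluation scheme can be sketched as follows: networks are assembled from random subsets of a neuron population, and each neuron's fitness is the average fitness of the networks it joined. The toy task and the population sizes are assumptions; the original work evaluated on tasks such as the inverted pendulum.

```python
import numpy as np

rng = np.random.default_rng(4)
pop_size, hidden, n_in, n_out, n_nets = 60, 8, 4, 1, 100

# Each "neuron" is a row holding its input weights and output weights.
population = rng.normal(size=(pop_size, n_in + n_out))

def evaluate(net_neurons):
    """Toy fitness: how well the assembled network approximates the sum of
    its inputs (a placeholder for the inverted-pendulum evaluation)."""
    x = rng.normal(size=(32, n_in))
    w_in = net_neurons[:, :n_in]            # (hidden, n_in)
    w_out = net_neurons[:, n_in:]           # (hidden, n_out)
    y = np.tanh(x @ w_in.T) @ w_out
    return -np.mean((y[:, 0] - x.sum(axis=1)) ** 2)

fitness_sum = np.zeros(pop_size)
counts = np.zeros(pop_size)
for _ in range(n_nets):                     # build networks from neuron subsets
    idx = rng.choice(pop_size, size=hidden, replace=False)
    f = evaluate(population[idx])
    fitness_sum[idx] += f
    counts[idx] += 1

neuron_fitness = fitness_sum / np.maximum(counts, 1)
print("best neuron fitness:", neuron_fitness.max())
```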
- Research Article
254
- 10.1007/bf00114722
- Jan 1, 1996
- Machine Learning
This article presents a novel reinforcement learning method called SANE (Symbiotic, Adaptive Neuro-Evolution), which evolves a population of neurons through genetic algorithms to form a neural network capable of performing a task. Symbiotic evolution promotes both cooperation and specialization, which results in a fast, efficient genetic search and prevents convergence to suboptimal solutions. In the inverted pendulum problem, SANE formed effective networks 9 to 16 times faster than the Adaptive Heuristic Critic and 2 times faster than the GENITOR neuro-evolution approach without loss of generalization. Such efficient learning, combined with few domain assumptions, make SANE a promising approach to a broad range of reinforcement learning problems, including many real-world applications.
- Conference Article
1
- 10.1115/detc2023-114848
- Aug 20, 2023
This paper discusses a Digital Twin framework for quality assurance in mould manufacturing consisting of the physical and virtual manufacturing process connected by a data lake. To this end, we present the DT framework and demonstrate its applicability at the example of ensuring the quality of milling features on a mould tool part. For the demonstration we build two cascaded models, a tool wear state model and an insert part quality model. The tool wear state model assigns a label corresponding to the tool wear state using cutting force measurements. The part quality model then uses the tool state and engineering data to classify quality of individual milling features on the mould tool part. To develop the cascaded models, we conducted a case study which experimentally collected machine controller data and cutting forces using a Kistler dynamometer for machining a test part on a Makino V33i three-axis vertical milling machine. The test part contains mould tool specific features and is made of hardened tool steel (46-52 HRC). Three milling tools were repeatedly used to machine test parts and gather data at different tool wear states. For validating the part quality model, we further collected metrology data from a coordinate measurement machine. Results show that the developed cascaded models are able to monitor the tool wear stage and to classify quality deviations with a weighted F1 measure of 89.0%.
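The cascaded structure can be sketched with two off-the-shelf classifiers (scikit-learn assumed available); the synthetic force, engineering, and quality data below are placeholders for the dynamometer and coordinate-measurement data described in the entry, not the study's dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(5)

# Synthetic stand-ins: cutting-force features -> tool wear state (0/1/2),
# then (wear state + engineering features) -> feature quality (ok/deviation).
n = 300
force_feats = rng.normal(size=(n, 4))
wear_state = ((force_feats[:, 0] + rng.normal(scale=0.3, size=n) > 0).astype(int)
              + (force_feats[:, 1] > 1.0).astype(int))
eng_feats = rng.normal(size=(n, 3))
quality = ((wear_state == 2) | (eng_feats[:, 0] > 1.2)).astype(int)

# First stage: tool wear state model from cutting-force features.
wear_model = RandomForestClassifier(n_estimators=50, random_state=0)
wear_model.fit(force_feats, wear_state)

# Second stage: part quality model from predicted wear state + engineering data.
pred_wear = wear_model.predict(force_feats).reshape(-1, 1)
quality_model = RandomForestClassifier(n_estimators=50, random_state=0)
quality_model.fit(np.hstack([pred_wear, eng_feats]), quality)

# Cascaded inference on a new milling feature.
new_force = rng.normal(size=(1, 4))
new_eng = rng.normal(size=(1, 3))
stage1 = wear_model.predict(new_force).reshape(-1, 1)
print("predicted quality:", quality_model.predict(np.hstack([stage1, new_eng])))
```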