Offline reinforcement learning methods for real-world problems
- Research Article
- 10.20473/jpk.v13.isi2.2025.56-65
- Aug 4, 2025
- Jurnal Promkes
Background: Quality learning is one aspect of improving the quality of education. Learning can occur offline (face-to-face) or online. When offline learning occurs simultaneously with online learning, students experience stress due to numerous burdens and a denser schedule in navigating blended learning methods during the pandemic. Stress can interfere with student resilience. Resilience is the process, capacity, or outcome of individuals adapting successfully to challenges or threatening circumstances. This study aimed to analyze the relationship between stress levels and student resilience in managing blended offline and online learning methods during the COVID-19 pandemic at the UCB Nursing Study Program. The research method was quantitative, employing a correlational analytic research design with a cross-sectional approach. The sample consisted of 153 respondents selected using the total sampling technique. Data were collected by distributing questionnaires online through the Google Forms platform with informed consent and analyzed using the Chi-Square test. The results indicated a relationship between stress levels and student resilience in managing blended offline and online learning methods during the COVID-19 pandemic at the Nursing Study Program, Faculty of Health, University of Citra Bangsa, with a P-value of 0.001. Conclusion: Students can overcome psychological burdens in the form of stress in both online and offline learning, making this resilience crucial for helping students manage stress in their studies.
- Research Article
2
- 10.1299/kikaic.57.2256
- Jan 1, 1991
- TRANSACTIONS OF THE JAPAN SOCIETY OF MECHANICAL ENGINEERS Series C
When neural networks are applied to servo systems, the computation time of the control algorithms cannot be neglected, and the time period of on-line learning is expected to be shortened, considering the life span of the plant and the saving of electric energy. In this paper, we first proposed a design method of digital control systems using neural networks. In this method, the future information of the desired value is used as the input of the neural network and the computation time of the control algorithm is taken into account. Secondly, we presented an off-line learning method based on an approximate model of the servo system. The effectiveness of the proposed control systems and the off-line learning method is demonstrated by experiments and simulations on the control of a parallelogram link robot manipulator of 2 degrees of freedom.
- Research Article
1
- 10.30780/ijtrs.v08.i08.001
- Aug 25, 2023
- International Journal of Technical Research & Science
The education sector is witnessing a paradigm shift with rapid and ongoing technological advancements. The online, offline, and blended modes of learning continue to evolve with time. The purpose of this survey is to collect students’ responses to understand their perspectives on the different modes of learning. The advantages, challenges, and requirements for conducting classes through online, offline, and blended learning methods are discussed. A questionnaire was designed, and a survey was conducted among undergraduate and graduate students. The questions are carefully planned to understand the choice of students while selecting different modes of learning, various activities and tools, and the reasons for their preferences. 200 students took part in the survey and shared their feedback. The advantages and disadvantages of online and offline learning are presented. A chi-square test was conducted, and the association between the two questions is shown to be significant. Suggestions for enhancing teaching and learning based on the findings of the survey help faculty members to plan the teaching methodology to suit the requirements of students.
- Research Article
- 10.30693/smj.2024.13.8.58
- Aug 30, 2024
- Korean Institute of Smart Media
Nitrogen oxides (NOx) from coal-fired power plants are significant contributors to air pollution, influencing the formation of ozone and fine particulate matter and thereby adversely affecting health. Therefore, accurate prediction of NOx emissions is essential. Existing research has mainly relied on off-line learning methods, leading to poor prediction performance when the training dataset is limited. This paper proposes an online learning model, online support vector regression, to predict NOx emissions from coal-fired power plants. The online learning model, which updates itself whenever new observations arrive, demonstrates high prediction accuracy even when initial data is scarce. The experimental results showed that online learning prediction performed better than existing off-line learning methods. The results indicate that the online learning method is a valuable tool for predicting NOx emissions, especially in situations where initial data is limited and data is continuously updated in real time.
- Research Article
- 10.33394/j-ps.v11i3.8097
- Jul 3, 2023
- Prisma Sains : Jurnal Pengkajian Ilmu dan Pembelajaran Matematika dan IPA IKIP Mataram
The survival rate for out-of-hospital cardiac arrest is still low, public knowledge about CPR is still limited, and there is a critical time limit for performing CPR. Therefore, the role of lay people, including nursing students, in CPR is important. The knowledge and skills of nursing students in CPR can be improved through effective learning methods, including offline and online methods. This study aims to compare offline and online learning methods in increasing the knowledge and skills of cardiopulmonary resuscitation (CPR) among students of STIKes Hutama Abdi Husada Tulungagung. This is quasi-experimental research with a posttest-only control design. The research sample consisted of 56 students divided into two groups. The research instruments were questionnaires and observation sheets, and the statistical test was the Mann-Whitney test. The results showed that both offline and online learning methods could significantly increase respondents' knowledge about CPR (p=0.002) but did not show a significant difference in respondents' skills in performing CPR (p=0.052). Both learning methods have their respective effectiveness, where the online method is more effective in increasing students' knowledge and skills in CPR.
- Conference Article
3
- 10.1109/pic.2016.7949511
- Dec 1, 2016
Face detection technology has been a hot topic in recent years and has matured in many practical applications. However, driver face detection is still an open problem. In this paper, we propose an improved method to raise the face detection rate and apply it to images from monitoring videos. The first step is to detect the car in the images using an off-line learning method. Then an additional off-line learning method is used as the front level for the skin color feature in order to correctly detect the driver's face. The proposed system is implemented in various complicated road environments. The results show that the proposed method improves the efficiency of driver face detection and is strongly robust to glasses, driver head rotation, and lighting changes.
- Conference Article
1
- 10.1109/mlccim55934.2022.00073
- Aug 1, 2022
Tracking and controlling drifting objects on the sea faces many problems, such as a complex control environment, low controller stability, and a large amount of algorithmic computation. To achieve accurate control of drifting objects on the sea, a course controller based on consciousness neural network technology is proposed. First, combined with Nash equilibrium guidance and based on the trajectory characteristics and control requirements of drifting objects, the tracking problem is modeled, and its state space, action space, and reward function are designed. Then, the consciousness neural network algorithm is used to implement the controller, and an off-line learning method is used to train it. Finally, the trained controller is compared with the original BP-PID controller to analyze the control effect. The simulation results show that the designed controller based on the consciousness neural network converges rapidly during training and meets the control requirements. Compared with the BP-PID controller, the trained neural network offers faster tracking and smaller error. The research results can provide a reference for the path tracking control of drifting objects on the sea.
- Research Article
- 10.30863/palakka.v3i1.2529
- Jun 30, 2022
- Palakka : Media and Islamic Communication
This study focuses on comparing the effectiveness of online and offline learning methods for the 2020 class of the Islamic Communication and Broadcasting study program, which served as a trial class in 2021. This study uses the Uses and Gratification method and was conducted using the quantitative comparative method. The study found that the offline learning method is more effective than the online learning method for KPI students of the class of 2020 as a trial class in 2021. This is evidenced by the researchers' comparative quantitative calculations, in which variable y shows a higher value than variable x. This research is limited to one class in the study program at IAIN Parepare; further research is therefore recommended to expand the scope of the subject.
- 10.11517/pjsai.jsai2024.0_3xin246
- Jan 1, 2024
- Proceedings of the Annual Conference of JSAI
Robustness Evaluation of Offline Reinforcement Learning Methods to Perturbations in Joint Torque Signals
- Book Chapter
9
- 10.1007/978-0-387-34887-2_9
- Jan 1, 1995
The objectives of this paper are to investigate applicability of neural network techniques for single and multiple frame video traffic prediction. In the single and multiple frame traffic prediction problems, the information of previous frame sizes is used to predict either the following or several following frame sizes respectively. Accurate traffic prediction can be used to optimally smooth delay sensitive traffic [Ott et al., 1992] and increase multiplexing gain in asynchronous transfer mode (ATM) networks. Neural network models for both single and multiple frame traffic prediction problems are proposed. Two important types of video sequences are considered - video teleconferencing and entertainment video. An off-line learning method is suggested for simple traffic and an on-line learning method for complex one. Simulation studies of cell losses in an ATM multiplexer using recorded variable-bit-rate coded video teleconference data indicate reasonably good predictions for buffer delays between 0.5 and 5 ms.
Keywords: Radial Basis Function Network; Frame Size; Asynchronous Transfer Mode; Video Traffic; Scene Change
- Research Article
- 10.3795/ksme-a.2002.26.6.1092
- Jun 1, 2002
- Transactions of the Korean Society of Mechanical Engineers A
This paper discusses the combination of reinforcement learning theory, which is applied in real-time learning, and the evolutionary strategy, which has proven its superiority in finding the optimal solution in off-line learning. The number of individuals is reduced so that the evolutionary strategy can learn in real time, and a new method that guarantees the convergence of evolutionary mutations is proposed. This makes it possible to control a control object that varies over time. As the state value of the control object is generated, the evolutionary strategy is applied at each sampling time, because the learning process of estimation, selection, and mutation runs in real time. These algorithms can be applied by people who do not have knowledge of the technical tuning of dynamic systems to design a controller, or to problems in which the characteristics of the system dynamics vary slightly over time. Future studies are needed to prove the theory through experiments and to consider robustness against outside disturbances.
- Conference Article
1
- 10.1109/isie.2001.931996
- Jun 12, 2001
This paper discusses the composition of the theory of reinforcement learning, which is applied in real-time learning, and evolutionary strategy, which proves its superiority in finding the optimal solution in the off-line learning method. The individuals are reduced in order to learn the evolutionary strategy in real time, and a new method that guarantees the convergence of evolutionary mutations is proposed. It is possible to control a control object that varies over time. As the state value of the control object is generated, the evolutionary strategy is applied at each sampling time because the learning process of estimation, selection, and mutation runs in real time. These algorithms can be applied by people who do not have knowledge about the technical tuning of dynamic systems to design the controller, or to problems in which the characteristics of the system dynamics vary slightly over time. In the future, studies are needed on the proof of the theory through experiments and on the robustness against outside disturbances.
- Conference Article
11
- 10.1109/icip.2015.7351493
- Sep 1, 2015
The problem of behavior assessment in video surveillance is approached using trajectory classification. Lagrangian state dynamics are used for probabilistic modeling of trajectory patterns, and an off-line parameter learning method for the model is proposed. For classification, an on-line sequential maximum a posteriori trajectory classifier based on a particle filter is introduced. Finally, the performance of this method is evaluated using a traffic video data set.
- Conference Article
9
- 10.1109/aici.2009.482
- Jan 1, 2009
The number of 3D models is increasing greatly, and they are used in many different fields. The need to retrieve 3D models is constantly growing. In particular, how to reduce the 'semantic gap' between low-level features and high-level semantics has become one of the hottest topics. This paper gives an in-depth survey of the state of the art in semantic processing for content-based 3D model retrieval. First, a framework for a content-based 3D model retrieval system integrated with high-level semantics is presented. Second, the paper summarizes existing research and divides high-level semantic processing into three main categories: (1) using relevance feedback based on on-line learning to effectively integrate users' high-level semantic knowledge; (2) using off-line machine learning methods to narrow the gap between high-level semantic knowledge and low-level object representation; (3) using object ontology to define high-level concepts. Finally, the paper outlines some challenges in this field.
- Research Article
2
- 10.1177/0959651820903549
- Mar 4, 2020
- Proceedings of the Institution of Mechanical Engineers, Part I: Journal of Systems and Control Engineering
Energy consumption and temperature rise are challenging in traditional high-power hydraulic pump/motor loading systems. This study develops a novel loading system configuration where energy can be regenerated by mechanical compensation. A compound loading method is proposed to implement static loading with displacement control and dynamic loading with valve control, respectively. This method combines the high energy-regeneration capability of the displacement control and the fast dynamic response of the valve control. Pressure controllers are designed based on system modeling and characteristics analysis. For static loading, a proportional–integral–derivative controller is employed, and the variation of rotational speed is taken into consideration to reduce overshoot and oscillation. For dynamic loading, an off-line learning method is developed by using least square calibration to keep the relief flow rate at a low level without flow meters. An experimental setup of a 300-kW-class loading system is manufactured, and experiments in various loading modes are performed. The results show that the designed system and controllers have accurate pressure tracking performance and low energy consumption.