AI-driven Java Performance Testing: Balancing Result Quality with Testing Time

Abstract

Performance testing aims at uncovering efficiency issues of software systems. In order to be both effective and practical, the design of a performance test must achieve a reasonable trade-off between result quality and testing time. This becomes particularly challenging in the Java context, where the software undergoes a warm-up phase of execution due to just-in-time compilation. During this phase, performance measurements are subject to severe fluctuations, which may adversely affect the quality of performance test results. Existing approaches to mitigating this issue often provide suboptimal estimates of the warm-up phase, resulting in either insufficient or excessive warm-up iterations, which may degrade result quality or increase testing time. There is still a lack of consensus on how to properly address this problem. Here, we propose and study an AI-based framework to dynamically halt warm-up iterations at runtime. Specifically, our framework leverages recent advances in AI for Time Series Classification (TSC) to predict the end of the warm-up phase during test execution. We conduct experiments by training three different TSC models on half a million measurement segments obtained from JMH microbenchmark executions. We find that our framework significantly improves the accuracy of the warm-up estimates provided by state-of-practice and state-of-the-art methods. This higher estimation accuracy results in a net improvement in either result quality or testing time for up to +35.3% of the microbenchmarks. Our study highlights that integrating AI to dynamically estimate the end of the warm-up phase can enhance the cost-effectiveness of Java performance testing.
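
The idea of halting warm-up iterations at runtime can be illustrated with a minimal Python sketch. It is not the framework from the paper: the abstract does not describe the TSC models, so the classifier here is a placeholder heuristic (a coefficient-of-variation threshold), and the names `looks_steady`, `warmup_iterations`, `window`, `max_cv`, and `max_iters` are all hypothetical. The sketch only shows the control flow: feed the most recent segment of measurements to a classifier after each iteration and stop warming up as soon as it reports a steady state, with a fixed cap as a fallback.

```python
from statistics import mean, pstdev
from typing import Callable, Iterable, List


def looks_steady(segment: List[float], max_cv: float = 0.02) -> bool:
    """Placeholder classifier (assumption, NOT a TSC model).

    Treats a segment as steady when its coefficient of variation
    (population std dev / mean) is at most max_cv.
    """
    avg = mean(segment)
    return avg > 0 and pstdev(segment) / avg <= max_cv


def warmup_iterations(
    times: Iterable[float],
    window: int = 5,
    is_steady: Callable[[List[float]], bool] = looks_steady,
    max_iters: int = 100,
) -> int:
    """Return how many iterations ran before warm-up was halted.

    Warm-up stops when the classifier accepts the last `window`
    measurements, when `max_iters` is reached, or when `times` ends.
    """
    segment: List[float] = []
    count = 0
    for t in times:
        count += 1
        segment.append(t)
        segment = segment[-window:]  # keep only the most recent window
        if len(segment) == window and is_steady(segment):
            break  # classifier predicts the warm-up phase has ended
        if count >= max_iters:
            break  # fallback: fixed cap, as in a static configuration
    return count


if __name__ == "__main__":
    # Synthetic iteration times (ms): slow at first, then stable.
    timings = [10.0, 8.0, 6.0, 5.0, 5.01, 5.0, 4.99, 5.0, 5.01, 5.0]
    print(warmup_iterations(timings))
```

Because the classifier is passed in as `is_steady`, a trained model could replace the heuristic without changing the loop; the cap bounds testing time when no steady state is detected.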

Similar Papers
  • Research Article
  • Citations: 2
  • 10.1007/s10470-014-0461-3
Designing nonlinearity characterization for mixed-signal circuits in system-on-chip
  • Nov 28, 2014
  • Analog Integrated Circuits and Signal Processing
  • Byoung‐Ho Kim + 1 more

Long test times and the use of conventional automatic test equipment (ATE) make conventional mixed-signal linearity performance testing costly. Reducing linearity test time significantly lowers system-on-a-chip production test costs and, therefore, total product manufacturing costs. Several low-cost linearity test methods have addressed this issue for single-ended mixed-signal circuit testing. On the other hand, a low-cost test approach has rarely been proposed for differential mixed-signal circuits, due to a new class of test obstacles from differential circuits that are widely employed for high-speed I/O products. This paper presents a cost-effective self-test methodology to characterize the linearity performance of differential mixed-signal circuits in loopback mode. The proposed method precisely predicts the device-under-test (DUT) linearity specifications by building accurate DUT nonlinear polynomial models using spectral specifications from recent work. The test cost is significantly reduced by replacing conventional ATE with the proposed self-test platform and by reducing test time to a fraction of conventional testing time. Hardware measurement results validated the test performance of the proposed test scheme.

  • Research Article
  • Citations: 192
  • 10.1016/s0378-7753(02)00210-0
Calendar- and cycle-life studies of advanced technology development program generation 1 lithium-ion batteries
  • May 7, 2002
  • Journal of Power Sources
  • R.B Wright + 14 more


  • Book Chapter
  • 10.1007/978-3-540-27009-6_144
Time requirements for scramjet performance study with fuel of kerosene
  • Jan 1, 2005
  • Shock Waves
  • J L Le + 3 more

Progress in airbreathing propulsion is seriously constrained by the capability of ground test facilities, and test time is one of the critical technical issues for the development of its concepts and test techniques. Airbreathing propulsion test objectives are typically classified into three categories: performance, operability and durability tests. Usually, the performance test time criterion is the time needed to establish steady combustion flow, the operability test time criterion is the time needed to operate moving components of the propulsion system through its operating range, and the durability test time criterion is the time to reach equilibrium temperature along a trajectory or the time of a long cruise. For operability and durability research, blow-down wind tunnels (run duration of tens of seconds) and continuous wind tunnels (run duration of more than several minutes) are regularly used. However, for performance research, run duration is still often cited as an impediment to the use of pulse facilities, like shock tunnels. The generally accepted test time criterion for steady combustion flow establishment is the ratio of the product of test time and flow velocity (often referred to as “slung length”) to the model’s length, and there are many reviews of the time requirement for performance research on model scramjet engines under laminar or separated turbulent flow conditions [1]. Nevertheless, using a shock tunnel, for a model engine length of 3.6 m with gaseous hydrogen fuel, steady combustion flow is fully established within 2 ms from flow initiation through the combustor, and important sets of data have been successfully obtained for duplicating flight conditions of Mach 12 with a run duration of 8 ms (this represents 8 model flow lengths) [2].

  • Conference Article
  • Citations: 3
  • 10.1109/wgec.2009.61
Test Scheduling of SOC with Power Constraint Based on Particle Swarm Optimization Algorithm
  • Oct 1, 2009
  • Chuan-Pei Xu + 2 more

The rapid development of modern VLSI technology allows incorporating a complete system on a single chip using the system-on-chip (SOC) methodology. Test scheduling for SOC embedded IP cores is a very complex problem. It is necessary to test these cores in parallel to reduce test time. This paper presents an efficient approach based on the particle swarm optimization (PSO) algorithm for the test scheduling problem of core-based SOCs with a power constraint. The PSO algorithm is modified so that it can be used to optimize SOC testing. The cores are assigned to test access mechanisms (TAMs) of given widths such that the total test time is minimized. Experimental results for the ITC’02 benchmarks demonstrate that the method has better performance and lower testing time compared to other heuristic algorithms in test scheduling of SOCs.

  • Research Article
  • Citations: 11
  • 10.1007/s11219-020-09532-z
An autonomous performance testing framework using self-adaptive fuzzy reinforcement learning
  • Mar 10, 2021
  • Software Quality Journal
  • Mahshid Helali Moghadam + 4 more

Test automation brings the potential to reduce costs and human effort, but several aspects of software testing remain challenging to automate. One such example is automated performance testing to find performance breaking points. Current approaches to tackle automated generation of performance test cases mainly involve using source code or system model analysis or use-case-based techniques. However, source code and system models might not always be available at testing time. On the other hand, if the optimal performance testing policy for the intended objective in a testing process instead could be learned by the testing system, then test automation without advanced performance models could be possible. Furthermore, the learned policy could later be reused for similar software systems under test, thus leading to higher test efficiency. We propose SaFReL, a self-adaptive fuzzy reinforcement learning-based performance testing framework. SaFReL learns the optimal policy to generate performance test cases through an initial learning phase, then reuses it during a transfer learning phase, while keeping the learning running and updating the policy in the long term. Through multiple experiments in a simulated performance testing setup, we demonstrate that our approach generates the target performance test cases for different programs more efficiently than a typical testing process and performs adaptively without access to source code and performance models.

  • Research Article
  • Citations: 3
  • 10.1093/ptj/pzad122
Floor-to-Stand Performance Among People Following Stroke.
  • Sep 10, 2023
  • Physical therapy
  • Angela F Davis + 3 more

Studies have examined floor-to-stand performance in varied adult populations both quantitatively and qualitatively. Despite an elevated risk of falls and inability to independently return to stand after a fall, few have examined the ability to stand from the floor in patients recovering from stroke. There were 2 objectives of the study: to identify the relationships between floor-to-stand performance using a timed supine-to-stand test (TSS) and physical performance measures of gait, balance, and balance confidence among persons in the subacute phase after stroke; and to analyze descriptive strategies used in the completion of the TSS. A cross-sectional design was implemented. Fifty-eight adults (mean age = 59.2 [standard deviation (SD) = 13.9] years; 34 [58.6%] men) who were in the subacute phase after ischemic or hemorrhagic stroke and who could stand from the floor with no more than supervision completed the TSS and physical performance assessments. The median time to complete the TSS in our sample was 13.0 (interquartile range = 15.5) seconds. TSS time was significantly correlated with physical performance tests, including the Timed "Up & Go" Test (ρ = 0.70), gait speed (ρ = -0.67), Dynamic Gait Index (ρ = -0.52), and Activities-Specific Balance Confidence Scale (ρ = -0.43). Thirty-two percent of the variance in TSS time was attributed to Timed "Up & Go" Test time and the use of the quadruped position to transition to standing. Participants who used a gait device were more likely to use a chair during rise to stand. The TSS demonstrates concurrent validity with physical performance measures. Findings serve to improve functional mobility examination after stroke and to formulate effective treatment interventions to improve floor-to-stand performance.

  • Conference Article
  • Citations: 1
  • 10.1109/icitbs49701.2020.00090
Analysis on Anti-Jamming Performance Test Method of Communication System in New Energy Power Plant
  • Jan 1, 2020
  • Hao Wang + 3 more

Traditional anti-interference performance test methods for the communication systems of new energy power plants suffer from long test times. To address this, an anti-interference performance test method for the communication system of a new energy power plant is proposed. The method determines the interference type of the new energy communication system, establishes the anti-interference performance index of the communication system, and then calculates the differential mode rejection ratio, common mode rejection ratio and bit error rate. The test time is compared with that of two traditional anti-interference performance test methods; the experimental results show that the proposed method requires a shorter test time, indicating that it has a higher test efficiency.

  • Research Article
  • Citations: 33
  • 10.1186/2192-1962-3-22
The art of software systems development: Reliability, Availability, Maintainability, Performance (RAMP)
  • Dec 1, 2013
  • Human-centric Computing and Information Sciences
  • Mohammad Isam Malkawi

The production of software systems with specific demand on reliability, availability, maintenance, and performance (RAMP) is one of the greatest challenges facing software engineers at all levels of the development cycle. Most requirements specification tools are more suited for functional requirements than for non-functional RAMP requirements. RAMP requirements are left unspecified, specified at a later stage, or at best vaguely specified, which makes requirements specifications more of an art than a science. Furthermore, the cost of testing for RAMP requirements is quite often prohibitive. In many cases, it is difficult to test for some of the RAMP specifications such as maintainability, reliability, and high availability. Even the test for performance is quite often workload dependent and as such the performance numbers provided at test time or at system commissioning time may not be achievable during actual system workload. What makes the subject matter more difficult is the absence of a clear set of rules or practices, which, if followed closely, produce a system with acceptable RAMP specifications. As such, and until the design of RAMP software systems becomes a well understood theme, the development of such systems will be a fine art, where the tools and capabilities of developing such systems will depend on the particular system to be developed, the environment in which it will run, and the level of expertise and knowledge deployed. Just like no two pieces of art produced by the same artist are the same, no two software systems will have the same RAMP characteristics. This paper will focus on the paradigms involved in the production of RAMP software systems through several case studies. The purpose is to promote the interest of researchers to develop more specific guidelines for the production of SW systems with well defined RAMP qualities.

  • Conference Article
  • 10.2991/nceece-15.2016.221
Evaluation on Thermal Performance of Solar Water Heating System of High-Rise House in Jinan City
  • Jan 1, 2016
  • Jinglei Shi + 2 more

The paper takes the distributed solar water heating system of a high-rise house in Jinan City as its research object. A field performance test and an energy efficiency assessment are carried out to obtain the annual solar guarantee rate and the conventional energy substitution quantity of the building's solar water heating system. An economic benefit assessment is made, the energy saving effect and economic benefit of the system are analyzed, and an energy saving index and an economic index are given. The results show that the distributed solar water heating system of the high-rise building has a good energy saving effect and economic benefit.

  • Research Article
  • Citations: 29
  • 10.1007/s002280050178
Single oral doses of amisulpride do not enhance the effects of alcohol on the performance and memory of healthy subjects.
  • Oct 14, 1996
  • European Journal of Clinical Pharmacology
  • M J Mattila + 6 more

Amisulpride is a benzamide antipsychotic that binds selectively to dopamine D2- and D3-receptors, preferentially in limbic and hippocampal structures. Since other substituted benzamides have a limited or negligible interaction with alcohol on human performance, amisulpride was studied for this potential. In a randomised double-blind crossover study, 18 young, non-smoking men took single oral doses of placebo and amisulpride 50 mg and 200 mg, without and with ethanol (0.8 g·kg⁻¹) taken 30 min later. Objective performance tests and self-ratings were done at baseline and 1.5, 3.5 and 6.5 h after drug intake. Memory (immediate and delayed recall) was tested 2 h after dosing. Breath ethanol and the plasma concentrations of amisulpride and prolactin were measured. Three-way ANOVA and Newman-Keuls tests were used for statistical analyses; interactions were confirmed by factorial contrast ANOVA. Mean blood ethanol was 0.94, 0.62 and 0.26 g·l⁻¹ at the three test times. It produced significant impairment in all performance tests (symbol digit substitution, simulated driving, body sway, flicker fusion, tapping, nystagmus), reduced both immediate and delayed recall in memory tests, and caused subjective clumsiness, muzziness and mental slowness, mainly between 1.5 and 4.5 h after dosing. Amisulpride 50 and 200 mg elevated plasma prolactin but had minimal or no effect on performance, attention and memory. The decreases in immediate free recall after the 50 mg dose and in delayed free recall after the 200 mg dose were slight. Amisulpride neither modified blood ethanol concentrations nor enhanced the detrimental effect of ethanol on skilled and cognitive performance; it slightly antagonised ethanol in the digit copying test. Ethanol did not modify the effect of amisulpride on plasma prolactin, and the plasma concentrations of amisulpride were little changed by ethanol.
Amisulpride in single oral doses of 50 and 200 mg did not interact significantly with the effects of high, moderate or low concentrations of ethanol on human skilled and cognitive performance. The drugs did not interact pharmacokinetically.

  • Conference Article
  • Citations: 4
  • 10.1145/3053600.3053614
In-Test Adaptation of Workload in Enterprise Application Performance Testing
  • Apr 18, 2017
  • Maciej Kaczmarski + 3 more

Performance testing is used to assess if an enterprise application can fulfil its expected Service Level Agreements. However, since some performance issues depend on the input workloads, it is common to use time-consuming and complex iterative test methods, which heavily rely on human expertise. This paper presents an automated approach to dynamically adapt the workload so that issues (e.g. bottlenecks) can be identified more quickly as well as with less effort and expertise. We present promising results from an initial validation prototype indicating an 18-fold decrease in the test time without compromising the accuracy of the test results, while only introducing a marginal overhead in the system.

  • Research Article
  • Citations: 1
  • 10.1080/10671188.1961.10762073
Rate and Pattern of Recuperation from the Effects of Ethyl Alcohol on Man as Measured by Selected Gross Motor Skills
  • Mar 1, 1961
  • Research Quarterly. American Association for Health, Physical Education and Recreation
  • Dale O Nelson

The purpose of this investigation was to study the rate and pattern of recuperation from the effects of ethyl alcohol on man as measured by selected gross motor tests. One level of alcohol (2 oz.) and one testing time, with graduated time intervals between consumption of alcohol and the performance test, were used in the experiment. There is a clear pattern of recovery from the effects of alcohol on man as measured by the selected gross motor tests.

  • Supplementary Content
  • 10.25904/1912/524
The Effects of Intermittent Hypoxic Exposure (IHE) on Haemorheology of Elite Middle Distance Runners
  • Jan 1, 2005
  • Griffith Research Online (Griffith University, Queensland, Australia)
  • Matthew Jamieson


  • Conference Article
  • Citations: 6
  • 10.1145/3021460.3021479
Setting Realistic Think Times in Performance Testing
  • Feb 5, 2017
  • Raghu Ramakrishnan + 2 more

Think time in performance testing is the time spent by users between their interactions with a software system such as viewing responses, navigating on the screen, entering data etc. The value of think time has a significant effect on the results of the performance test in terms of the system load which is generated during the test. However, the use of realistic think time values has not been explored in detail either by researchers or performance testing practitioners in IT industry. Practitioners either use subjective think time values or values generated using uniform distribution. Our work provides an algorithm for extracting think time values by processing web server logs. The steps for using the extracted think time values are described for load testing tools such as Rational Performance Tester, JMeter and LoadRunner. The algorithm is used to extract think times from web server logs of a software system servicing a large number of users. The real-world think times are compared with the think times being used in the performance testing of the application. This work will help practitioners in the IT industry to generate more accurate and reliable performance testing results by moving away from the use of experience or intuition based think times to incorporating representative user delays in their tests. The proposed approach requires minimal effort and can be seamlessly made part of the performance testing process.
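
As a rough illustration of deriving think times from server logs, the Python sketch below computes, per client, the gap between the end of one request and the start of the next. It is not the algorithm from this paper, whose details the abstract does not give. The log layout (client id, request start, service time), the function name `extract_think_times`, and the `session_timeout` cut-off used to discard gaps between separate visits are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical parsed log record:
# (client id, request start in seconds, service time in seconds)
LogEntry = Tuple[str, float, float]


def extract_think_times(
    entries: Iterable[LogEntry],
    session_timeout: float = 1800.0,
) -> List[float]:
    """Return think times (seconds) between consecutive requests per client.

    Think time = next request start - (previous start + service time).
    Gaps that are negative (overlapping requests) or longer than
    session_timeout (assumed to be a new session) are discarded.
    """
    by_client: Dict[str, List[Tuple[float, float]]] = defaultdict(list)
    for client, start, service in entries:
        by_client[client].append((start, service))

    think_times: List[float] = []
    for requests in by_client.values():
        requests.sort()  # order each client's requests by start time
        for (start, service), (next_start, _) in zip(requests, requests[1:]):
            gap = next_start - (start + service)
            if 0 <= gap <= session_timeout:
                think_times.append(gap)
    return think_times


if __name__ == "__main__":
    log = [
        ("a", 0.0, 1.0),
        ("b", 5.0, 0.5),
        ("a", 4.0, 1.0),
        ("a", 4000.0, 1.0),  # far later: treated as a new session
        ("b", 7.5, 0.5),
    ]
    print(sorted(extract_think_times(log)))
```

The resulting values could then be summarized (for example as a distribution) and configured in a load testing tool in place of subjective or uniformly distributed think times.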

  • Conference Article
  • Citations: 3
  • 10.1109/isie.2009.5222622
Design and implementation of an integrated remote test system for mobile phones
  • Jul 1, 2009
  • Ying-Wen Bai + 1 more

In this paper we design three programs: an automatic network RF signal measurement program (ANRSM), an automatic RF module and baseband hardware device self-diagnosis program (SmartMobileTest), and an RF remote control test program (RRCTP). The greatest advantages of our ANRSM and our SmartMobileTest are that they are small and portable, and that they reduce both the test time and the complexity of RF test programs. Our ANRSM is written in C++. This program uses the embedded 3GPP RF Spec TS 51.010 and, by means of the Virtual Instrument Software Architecture (VISA), controls the General Purpose Interface Bus (GPIB), which is used to control the RF base station simulator. Moreover, we use LabVIEW and Web publishing technology to develop RRCTP to implement the remote control and command exchange. The SmartMobileTest program is created on a Windows Mobile OS to test the function of the RF and baseband module. Both our ANRSM and our SmartMobileTest support RF automatic testing and detection for the hardware performance of all the RF and baseband modules in a smartphone. By combining these two programs we are able to identify the effect of both RF sensitivity and noise quickly.
