An Efficient Interaction Protocol Inference Scheme for Incompatible Updates in IoT Environments
Incompatible updates of IoT systems and protocols give rise to interoperability problems. Although various protocol adaptation and unknown-protocol inference schemes have been proposed, they either do not work when the updated protocol specifications are not given or suffer from inefficiency. In this work, we present an efficient protocol inference scheme for incompatible updates in IoT environments. The scheme refines an active automata learning algorithm, L*, by incorporating a knowledge base of legacy protocol behavior into its membership query selection procedure for inferring updated protocol behavior. It also infers protocol syntax based on our previous work, which computes the most probable message field updates and adapts the legacy protocol message accordingly. We evaluate the proposed scheme in two case studies with the most popular IoT protocols and show that it infers updated protocols efficiently while improving the L* algorithm's performance in resolving the incompatibility.
- Abstract
1
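The abstract above describes routing L* membership queries through a knowledge base of legacy protocol behavior so that only genuinely new behavior is queried on the device. A minimal, self-contained sketch of that idea follows; the protocol commands, the knowledge base, and the simulated "updated device" are all invented for illustration and are not taken from the paper.

```python
# Hypothetical sketch: answer L*-style membership queries from a legacy
# knowledge base first, and query the (simulated) updated device only
# when the legacy model cannot predict the outcome.

# Legacy protocol model: command sequences known to be accepted.
LEGACY_ACCEPTED = {("CONNECT",), ("CONNECT", "PUBLISH"), ("CONNECT", "DISCONNECT")}
# Commands whose semantics changed in the update; legacy answers are unreliable.
UPDATED_COMMANDS = {"SUBSCRIBE"}

device_queries = 0  # count of expensive queries sent to the real device

def query_device(seq):
    """Stand-in for the updated device: it also accepts CONNECT -> SUBSCRIBE."""
    global device_queries
    device_queries += 1
    return seq in LEGACY_ACCEPTED | {("CONNECT", "SUBSCRIBE")}

def membership(seq):
    """Answer from the knowledge base when possible, else ask the device."""
    if not any(cmd in UPDATED_COMMANDS for cmd in seq):
        return seq in LEGACY_ACCEPTED   # cheap: legacy behavior unchanged
    return query_device(seq)            # expensive: genuinely new behavior

queries = [("CONNECT",), ("CONNECT", "PUBLISH"),
           ("CONNECT", "SUBSCRIBE"), ("PUBLISH",)]
answers = [membership(q) for q in queries]
print(answers, device_queries)  # only one of the four queries hits the device
```

The point of the filter is the query count: three of the four answers come from the legacy model for free, which is the kind of saving the paper's refinement of L* targets.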
- 10.1093/cdn/nzz051.p04-162-19
- Jun 1, 2019
- Current Developments in Nutrition
Integrating Maternal Infant & Young Child Nutrition (MIYCN) in Undergraduate Medical Teaching Curriculum and Service Delivery in Two States of India (P04-162-19)
- Research Article
14
- 10.1007/s00453-014-9954-9
- Nov 11, 2014
- Algorithmica
We describe a framework for designing efficient active learning algorithms that are tolerant to random classification noise and are differentially-private. The framework is based on active learning algorithms that are statistical in the sense that they rely on estimates of expectations of functions of filtered random examples. It builds on the powerful statistical query framework of Kearns (JACM 45(6):983–1006, 1998). We show that any efficient active statistical learning algorithm can be automatically converted to an efficient active learning algorithm that is tolerant to random classification noise as well as other forms of uncorrelated noise. The complexity of the resulting algorithms has information-theoretically optimal quadratic dependence on $1/(1-2\eta)$, where $\eta$ is the noise rate. We show that commonly studied concept classes including thresholds, rectangles, and linear separators can be efficiently actively learned in our framework. These results, combined with our generic conversion, lead to the first computationally-efficient algorithms for actively learning some of these concept classes in the presence of random classification noise that provide exponential improvement in the dependence on the error $\epsilon$ over their passive counterparts. In addition, we show that our algorithms can be automatically converted to efficient active differentially-private algorithms. This leads to the first differentially-private active learning algorithms with exponential label savings over the passive case.
- Abstract
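The noise-tolerance claim above rests on a simple property of statistical queries: under random classification noise of rate η, a correlation E[f(x)·y] shrinks by exactly the factor (1 − 2η), so a noisy estimate can be de-biased by dividing it back out, at the price of the quadratic sample blow-up in 1/(1 − 2η) mentioned in the abstract. A toy simulation of this correction (an illustration of the principle, not the paper's construction):

```python
# Toy illustration: de-biasing a statistical query under random
# classification noise. Flipping each label with probability eta scales
# the target correlation E[f(x)*y] by (1 - 2*eta).
import random

random.seed(0)
eta = 0.2            # known noise rate
n = 200_000          # number of noisy examples
true_corr = 0.5      # the clean correlation E[f(x)*y] we want to recover

noisy_sum = 0
for _ in range(n):
    # Draw z = f(x)*y in {+1, -1} with mean true_corr ...
    z = 1 if random.random() < (1 + true_corr) / 2 else -1
    # ... then noise flips the label, hence the sign of z, w.p. eta.
    if random.random() < eta:
        z = -z
    noisy_sum += z

noisy_est = noisy_sum / n              # concentrates near (1 - 2*eta) * true_corr
corrected = noisy_est / (1 - 2 * eta)  # de-biased estimate of true_corr
print(noisy_est, corrected)
```

As η approaches 1/2 the divisor (1 − 2η) vanishes, so ever more samples are needed for the corrected estimate to be accurate, which is where the 1/(1 − 2η)² dependence comes from.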
- 10.1136/archdischild-2024-rcpch.642
- Jul 30, 2024
- Archives of Disease in Childhood
Objectives: Magnetic resonance imaging (MRI) is widely used in paediatrics as it provides high-quality imaging without exposing patients to harmful radiation. However, its use is restricted due to poor patient...
- Research Article
19
- 10.3233/jad-160143
- May 6, 2016
- Journal of Alzheimer’s Disease
Recently, a Korean research group suggested a consensus protocol, based on the Alzheimer's Disease Neuroimaging Initiative study protocol but with modifications to minimize confounding factors, for the evaluation of cerebrospinal fluid (CSF) biomarkers. Here, we analyzed fluid and imaging biomarkers of Alzheimer's disease (AD) in a Korean population. We used the updated protocol to propose a more accurate CSF biomarker value for the diagnosis of AD. Twenty-seven patients with AD and 30 cognitively normal controls (NC) were enrolled. CSF was collected from 55 subjects (patients with AD = 26, NC = 29) following the Korean consensus protocol. CSF biomarkers were measured using the INNO-BIA AlzBio3 immunoassay, and Pittsburgh compound B (PIB) positron emission tomography (PET) scans were also performed. The cutoff values of CSF amyloid beta 1-42 (Aβ42), total tau (t-Tau), and phosphorylated tau (p-Tau) proteins were 357.1 pg/ml, 83.35 pg/ml, and 38.00 pg/ml, respectively. The cutoff values of the CSF t-Tau/Aβ42 and p-Tau/Aβ42 ratios were 0.210 (sensitivity 100%, specificity 86.21%) and 0.1350 (sensitivity 88.46%, specificity 92.86%). The concordance rate with PIB-PET was higher using the CSF t-Tau/Aβ42 ratio (κ = 0.849, CI 0.71-0.99) than CSF Aβ42 alone (κ = 0.703, CI 0.51-0.89). Here, we resolved controversial factors associated with the previous CSF study protocol and suggested new cutoff values for the diagnosis of AD. Our results showed good diagnostic performance for the differentiation of AD. Thus, we expect our findings could be a cornerstone in the establishment and clinical application of biomarkers for AD diagnosis.
- Research Article
4
- 10.1007/978-1-0716-1875-2_8
- Jan 1, 2022
- Methods in molecular biology (Clifton, N.J.)
The availability of protocols for virus-induced gene silencing (VIGS) in rice has opened up an important channel for the elucidation of gene functions in this important crop plant. Here, we present an updated protocol of a VIGS system based on Rice tungro bacilliform virus (RTBV) for gene silencing in rice. We present complete updated protocols for VIGS in rice, compare the system with other existing ones for monocots, identify some of the challenges faced by this system and discuss ways in which the vector could be improved for better silencing efficiency.
- Research Article
26
- 10.17877/de290r-4817
- Jun 26, 2012
- Technische Universität Dortmund Eldorado (Technische Universität Dortmund)
Computer systems today are no longer monolithic programs; instead they usually comprise multiple interacting programs. With the continuous growth of these systems and with their integration into systems of systems, interoperability becomes a fundamental issue. Integration of systems is more complex and occurs more frequently than ever before. One solution to this problem could be the automated model-based synthesis of mediators at runtime. However, this approach has strong prerequisites. It requires the existence of adequate models of the systems to be connected. Many systems encountered in practice, on the other hand, do not come with models. In such cases models have to be constructed ex post (at runtime). Furthermore, adequate models must capture control as well as data aspects of a system. In most protocols, for instance, data parameters (e.g., session identifiers or sequence numbers) can influence system behavior. Models of such systems can be thought of as interface programs: Rather than covering only the control behavior, they describe explicitly which data values are relevant to the communication and have to be remembered and reused. This thesis addresses the problem of inferring interface programs of systems at runtime using active automata learning techniques. Active automata learning uses a test-based and counterexample-driven approach to inferring models of black-box systems. The method has originally been introduced for finite automata (the popular L∗ algorithm). Extending active learning to interface programs requires research in three directions: First, the efficiency of active learning algorithms has to be optimized to scale when dealing with data parameters. Second, techniques are needed for finding counterexamples driving the learning process in practice. Third, active learning has to be extended to richer models than Mealy machines or DFAs, capable of expressing interface programs. 
The work presented in this thesis improves the state of the art in all three directions. More concretely, the contributions of this thesis are the following: first, an efficient active learning algorithm for DFAs and Mealy machines that combines the ideas of several known active learning algorithms in a non-trivial way; second, a framework for finding counterexamples in black-box scenarios, leveraging the incremental and monotonic evolution of hypothetical models characteristic of active automata learning; third, and most importantly, the technically involved extension of the partition/refinement-based approach of active learning to interface programs. The impact of extending active learning to interface programs becomes apparent already for small systems. We inferred a simple data structure (a nested stack of overall capacity 16) as an interface program in no more than 20 seconds, using less than 45,000 tests and only 9 counterexamples. The corresponding Mealy machine model, on the other hand, would have more than 10⁹ states already in the case of a very small finite data domain of size 4 and require significantly more than 10⁹ tests when being inferred using the classic L∗ algorithm.
- Research Article
6
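The partition/refinement approach the thesis builds on can be illustrated by the simplest ingredient of L∗: the closedness check on an observation table, which grows the set of access sequences until every one-letter extension behaves like a known row. The sketch below is a generic toy over a two-letter alphabet, not the thesis's register-automaton extension; the black box and all names are illustrative.

```python
# Minimal sketch of one L*-style ingredient: closing an observation table.
# Each prefix gets a "row" of membership answers over distinguishing
# suffixes; the table is closed when every one-letter extension of a
# prefix has a row already represented among the prefixes.

ALPHABET = ["a", "b"]

def member(word):
    """Toy black box: accept words with an even number of 'a's."""
    return word.count("a") % 2 == 0

def row(prefix, suffixes):
    return tuple(member(prefix + s) for s in suffixes)

prefixes = [""]         # S: access sequences discovered so far
suffixes = ["", "a"]    # E: distinguishing suffixes

changed = True
while changed:
    changed = False
    rows = {row(p, suffixes) for p in prefixes}
    for p in list(prefixes):
        for c in ALPHABET:
            r = row(p + c, suffixes)
            if r not in rows:           # unrepresented behavior found:
                prefixes.append(p + c)  # promote the extension to a prefix
                rows.add(r)
                changed = True
    # (A full learner would also check consistency and process
    # counterexamples; this sketch stops at closedness.)

print(sorted(prefixes))  # distinct rows correspond to hypothesis states
```

For this black box the table closes with two access sequences, matching the two states (even/odd number of 'a's) of the minimal automaton.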
- 10.1155/2015/124601
- Jan 1, 2015
- Mathematical Problems in Engineering
Sea ice is one of the most serious marine hazards, especially in polar and high-latitude regions. Hyperspectral imagery is well suited to monitoring sea ice, as it contains continuous spectral information and offers better target-recognition ability. The principal bottleneck for the classification of hyperspectral images is the large number of labeled training samples required, and the collection of labeled samples is time-consuming and costly. To address this problem, we apply active learning (AL) to hyperspectral sea ice detection, selecting the most informative samples. Moreover, we propose a novel AL algorithm based on the evaluation of two criteria: uncertainty and diversity. The uncertainty criterion is based on the difference between the probabilities of the two classes with the highest estimated probabilities, while the diversity criterion is based on a kernel k-means clustering technique. In experiments on Baffin Bay in northwest Greenland on April 12, 2014, the proposed AL algorithm achieves the highest classification accuracy (89.327%) compared with other AL algorithms and random sampling, and it requires a lower labeling cost to reach the same classification accuracy.
- Research Article
6
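The two criteria described in the abstract above can be sketched on toy data: "uncertainty" as the margin between the two highest class probabilities, and "diversity" approximated by skipping candidates too close to an already-chosen one (a crude stand-in for the paper's kernel k-means step). All numbers are made up for illustration.

```python
# Margin-based uncertainty sampling with a simple diversity filter.
# Predicted class probabilities for 6 unlabeled pixels (3 classes).
probs = [
    [0.50, 0.45, 0.05],   # small margin -> uncertain
    [0.90, 0.05, 0.05],   # large margin -> confident
    [0.40, 0.35, 0.25],
    [0.80, 0.15, 0.05],
    [0.48, 0.47, 0.05],   # smallest margin -> most uncertain
    [0.32, 0.33, 0.35],   # also very uncertain
]

def margin(p):
    a, b = sorted(p, reverse=True)[:2]
    return a - b          # smaller margin = more informative

def dist(p, q):
    return sum((x - y) ** 2 for x, y in zip(p, q)) ** 0.5

# Rank by ascending margin, then keep a diverse subset by skipping any
# candidate within distance 0.1 of an already-selected one.
ranked = sorted(range(len(probs)), key=lambda i: margin(probs[i]))
chosen = []
for i in ranked:
    if all(dist(probs[i], probs[j]) > 0.1 for j in chosen):
        chosen.append(i)
    if len(chosen) == 3:
        break

print(chosen)
```

Note how pixel 0 is highly uncertain but gets skipped: its probability vector nearly duplicates pixel 4, which is exactly the redundancy the diversity criterion is meant to avoid.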
- 10.1186/s13063-015-0985-6
- Oct 14, 2015
- Trials
Background: SUN(^_^)D, the Strategic Use of New generation antidepressants for Depression, is an assessor-blinded, parallel-group, multicenter pragmatic mega-trial to examine the optimum treatment strategy for the first- and second-line treatments of unipolar major depressive episodes. The trial has three steps and two randomizations. Step I randomization compares the minimum and the maximum dosing strategy for the first-line antidepressant. Step II randomization compares the continuation, augmentation or switching strategy for the second-line antidepressant treatment. Step III is a naturalistic continuation phase. The original protocol was published in 2011, and we hereby report its updated protocol including the statistical analysis plan. Results: We implemented two important changes to the original protocol. One concerns the required sample size, reflecting the smaller number of dropouts than had been expected. The other concerns the organization of the primary and secondary outcomes, in order to make the report of the main trial results as pertinent and interpretable as possible for clinical practice. Due to the complexity of the trial, we plan to report the main results in two separate reports, and this updated protocol and the statistical analysis plan have laid out the respective primary and secondary outcomes and their analyses. We will convene the blind interpretation committee before the randomization code is broken. Conclusion: This paper presents the updated protocol and the detailed statistical analysis plan for the SUN(^_^)D trial in order to avoid reporting bias and data-driven results. Trial registration: ClinicalTrials.gov NCT01109693 (registered on 21 April 2010).
- Research Article
16
- 10.1007/s00449-010-0462-2
- Aug 14, 2010
- Bioprocess and Biosystems Engineering
The purpose of this paper is to refine the BIOMATH calibration protocol for SBR systems, in particular to develop a pragmatic calibration protocol that takes advantage of SBR information-rich data, defines a simulation strategy to obtain proper initial conditions for model calibration and provides statistical evaluation of the calibration outcome. The updated calibration protocol is then evaluated on a case study to obtain a thoroughly validated model for testing the flexibility of an N-removing SBR to adapt the operating conditions to the changing influent wastewater load. The performance of reference operation using fixed phase length and dissolved oxygen set points and two real-time control strategies is compared to find optimal operation under dynamic conditions. The results show that a validated model of high quality is obtained using the updated protocol and that the optimization of the system's performance can be achieved in different manners by implementing the proposed control strategies.
- Research Article
2
- 10.1007/s10044-020-00894-5
- Jun 25, 2020
- Pattern Analysis and Applications
In image classification, acquiring image labels is often expensive and time-consuming. To reduce this labeling cost, active learning has been introduced into the field. Although some active learning algorithms have been proposed, they either use a single sampling strategy or apply multiple sampling strategies simultaneously (i.e., correlation, uncertainty and label-based measures), without considering the relationship between sub-step sampling strategies. To this end, we designed a new active learning scheme called substep active deep learning (SADL) for image classification. In SADL, samples are first selected by a correlation strategy and then screened by uncertainty and label-based measurements before being fed to CNN model training. Experiments were performed on three data sets (i.e., MNIST, Fashion-MNIST and CIFAR-10) against state-of-the-art active learning algorithms, and the results verify that substep active deep learning is rational and effective.
- Research Article
7
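The sub-step idea described above, where one strategy shortlists candidates and a second strategy decides among them, can be sketched with toy numbers; the scores and thresholds below are invented for illustration, not taken from the paper's model.

```python
# Illustrative sub-step selection pipeline: a correlation step first
# shortlists novel candidates, then an uncertainty step decides which
# of the shortlisted samples to send to the oracle for labeling.

# (sample_id, correlation_with_labeled_set, predictive_entropy)
pool = [
    (0, 0.95, 0.10),   # highly correlated with labeled data: redundant
    (1, 0.20, 0.90),   # novel and uncertain: ideal to label
    (2, 0.30, 0.15),   # novel but the model is already confident
    (3, 0.10, 0.80),
    (4, 0.85, 0.95),   # uncertain, but near-duplicate of labeled data
]

# Sub-step 1: keep only samples weakly correlated with what we have.
shortlist = [s for s in pool if s[1] < 0.5]

# Sub-step 2: among those, pick the most uncertain ones for labeling.
to_label = sorted(shortlist, key=lambda s: -s[2])[:2]

print([s[0] for s in to_label])   # ids to be labeled, then used for CNN training
```

Running the steps in sequence rather than as one combined score is the point: sample 4 is very uncertain but never reaches the uncertainty step, because the correlation step already judged it redundant.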
- 10.1016/j.neucom.2011.07.012
- Aug 22, 2011
- Neurocomputing
Column subset selection for active learning in image classification
- Research Article
4
- 10.1609/aaai.v30i1.10233
- Feb 21, 2016
- Proceedings of the AAAI Conference on Artificial Intelligence
We study the robustness of active learning (AL) algorithms against prior misspecification: whether an algorithm achieves similar performance using a perturbed prior as compared to using the true prior. In both the average and worst cases of the maximum coverage setting, we prove that all alpha-approximate algorithms are robust (i.e., near alpha-approximate) if the utility is Lipschitz continuous in the prior. We further show that robustness may not be achieved if the utility is non-Lipschitz. This suggests we should use a Lipschitz utility for AL if robustness is required. For the minimum cost setting, we can also obtain a robustness result for approximate AL algorithms. Our results imply that many commonly used AL algorithms are robust against perturbed priors. We then propose the use of a mixture prior to alleviate the problem of prior misspecification. We analyze the robustness of the uniform mixture prior and show experimentally that it performs reasonably well in practice.
- Research Article
3
- 10.1016/j.future.2019.02.007
- Feb 13, 2019
- Future Generation Computer Systems
Towards interactive networking: Runtime message inference approach for incompatible protocol updates in IoT environments
- Research Article
16
- 10.1109/access.2019.2914263
- Jan 1, 2019
- IEEE Access
Active learning selects the most critical instances and obtains their labels through interaction with an oracle. Selecting either only informative or only representative unlabeled instances may result in sampling bias or cluster dependency. In this paper, we propose a multi-standard optimization active learning (MSAL) algorithm that considers the informativeness, representativeness, and diversity of instances. Informativeness is measured by the softmax predictive entropy, whereas representativeness is measured by a probability density function obtained by non-parametric estimation. The product of the two is used as an optimization objective to reduce model uncertainty and explore the distribution of unlabeled data. Diversity is measured by the difference between the selected critical instances and is used as a constraint to prevent the selection of instances that are too similar. Learning experiments were performed with 12 datasets from various domains. The results of significance tests verify the effectiveness of MSAL and its superiority over state-of-the-art active learning algorithms.
- Research Article
2
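The combined objective described above, entropy multiplied by an estimated density, can be sketched with made-up candidates; the numbers are illustrative, and the diversity constraint (which in the full algorithm filters instances that are too similar to each other) is only noted in a comment.

```python
# Toy sketch of an MSAL-style score: informativeness (softmax entropy)
# times representativeness (a density estimate at the point).
import math

def entropy(p):
    return -sum(x * math.log(x) for x in p if x > 0)

# (predicted class probabilities, kernel-density estimate at the point)
candidates = [
    ([0.5, 0.5], 0.9),    # maximally uncertain AND in a dense region
    ([0.5, 0.5], 0.1),    # just as uncertain, but an outlier
    ([0.95, 0.05], 0.9),  # dense region, but the model is confident
    ([0.6, 0.4], 0.8),
]

scores = [entropy(p) * d for p, d in candidates]
best = max(range(len(candidates)), key=lambda i: scores[i])
# (The full MSAL additionally constrains the selected batch to be
# mutually dissimilar; a single-pick sketch has no batch to filter.)
print(best)
```

The product penalizes both failure modes named in the abstract: candidate 1 (pure uncertainty sampling would take it) loses for being unrepresentative, and candidate 2 (pure density sampling would take it) loses for being uninformative.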
- 10.1021/acs.jcim.3c01430
- Oct 19, 2023
- Journal of Chemical Information and Modeling
We introduce an exploratory active learning (AL) algorithm using Gaussian process regression and a marginalized graph kernel (GPR-MGK) to sample chemical compound space (CCS) at minimal cost. Targeting 251,728 enumerated alkane molecules with 4-19 carbon atoms, we applied the AL algorithm to select a diverse and representative set of molecules and then conducted high-throughput molecular simulations on these selected molecules. To demonstrate the power of the AL algorithm, we built directed message-passing neural networks (D-MPNN) using simulation data as the training set to predict liquid densities, heat capacities, and vaporization enthalpies across the CCS. Validations show that D-MPNN models built on the smallest training set considered in this work, which consists of 313 molecules or 0.124% of the original CCS, predict the properties with R² > 0.99 against the computational data and R² > 0.94 against the experimental data. The advantage of the presented AL algorithm is that the predicted uncertainty of GPR depends only on the molecular structures, which renders it compatible with high-throughput data generation.
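The closing observation above, that GPR's predicted uncertainty depends only on the inputs, is what makes the exploration label-free: posterior variance can be computed before any property is measured. A minimal one-dimensional sketch of variance-driven selection follows; the paper uses a marginalized graph kernel over molecules, whereas this toy uses a plain RBF kernel on numbers purely for illustration.

```python
# Minimal 1-D Gaussian-process sketch: posterior variance needs no labels,
# so the candidate farthest from the training inputs is selected first.
import math

def k(x, y, ell=1.0):
    """RBF kernel (stand-in for the paper's graph kernel)."""
    return math.exp(-0.5 * ((x - y) / ell) ** 2)

train = [0.0, 1.0]   # inputs already simulated; their labels are never used
noise = 1e-6

# 2x2 kernel matrix and its inverse (closed form for the 2x2 case).
K = [[k(a, b) + (noise if i == j else 0.0)
      for j, b in enumerate(train)] for i, a in enumerate(train)]
det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
Kinv = [[K[1][1] / det, -K[0][1] / det],
        [-K[1][0] / det, K[0][0] / det]]

def posterior_var(x):
    """GP posterior variance: k(x,x) - k_x^T K^{-1} k_x."""
    kx = [k(x, t) for t in train]
    quad = sum(kx[i] * Kinv[i][j] * kx[j] for i in range(2) for j in range(2))
    return k(x, x) - quad

candidates = [0.5, 1.2, 4.0]
chosen = max(candidates, key=posterior_var)
print(chosen)   # the point farthest from the training data wins
```

Because the selection rule never touches labels, a batch of such maximally uncertain candidates can be chosen up front and then simulated in one high-throughput pass, which is the workflow the abstract describes.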