Client Scheduling and Resource Management for Efficient Training in Heterogeneous IoT-Edge Federated Learning
Federated learning (FL) offers a promising paradigm that empowers numerous Internet of Things (IoT) devices to implement distributed learning on the premise of ensuring user privacy and data security. However, since FL adopts a synchronous distributed training mode, the heterogeneity of participating IoT devices and limited communication resources make FL encounter serious issues of low training efficiency in actual deployment. In this article, we propose an excellent FL policy for the heterogeneous IoT-edge FL system to improve distributed training efficiency. Specifically, first, by borrowing the idea of clustering, we explore an iterative self-organizing data analysis techniques algorithm (ISODATA)-based heterogeneous-aware client scheduling strategy to alleviate the issue of low training efficiency incurred by the heterogeneity of clients. Subsequently, to tackle the challenge of limited communication resources in FL, we first analyze the characteristics of the optimal resource block allocation solution theoretically and then introduce a mixed-integer linear programming (MILP)-based strategy to judiciously allocate resource blocks for scheduled clients. Comprehensive experimental results demonstrate that, compared with benchmarking strategies, our proposed FL policy can achieve up to 55.22% accuracy improvement in a relaxed time scenario, and attain up to $3.62\times$ acceleration for reaching the specific expected accuracy.
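The clustering idea behind heterogeneity-aware scheduling can be illustrated with a small sketch. The abstract does not specify the clustering features, thresholds, or scheduling rule, so everything here is an assumption: each client is described by a single hypothetical per-round latency, `isodata_1d` is a heavily simplified ISODATA-style loop (assign, split wide clusters, merge close centers), and `max_std` and `min_gap` are illustrative parameters rather than values from the paper.

```python
import numpy as np

def isodata_1d(x, k_init=2, max_std=1.0, min_gap=1.0, iters=10):
    """Simplified ISODATA-style clustering of 1-D values (illustrative only)."""
    x = np.asarray(x, dtype=float)
    # Initial centers: evenly spaced interior quantiles of the data.
    centers = np.quantile(x, np.linspace(0, 1, k_init + 2)[1:-1])
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        new_centers = []
        for j in range(len(centers)):
            members = x[labels == j]
            if len(members) == 0:
                continue  # drop empty clusters
            mu, sd = members.mean(), members.std()
            if sd > max_std and len(members) >= 2:
                # Split step: a cluster that is too spread out becomes two.
                new_centers += [mu - sd, mu + sd]
            else:
                new_centers.append(mu)
        new_centers.sort()
        # Merge step: fuse neighboring centers closer than min_gap.
        merged = [new_centers[0]]
        for c in new_centers[1:]:
            if c - merged[-1] < min_gap:
                merged[-1] = (merged[-1] + c) / 2
            else:
                merged.append(c)
        centers = np.array(merged)
    labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
    return centers, labels

def round_time(latencies, scheduled):
    """In synchronous FL, a round lasts as long as its slowest scheduled client."""
    return max(latencies[i] for i in scheduled)

# Hypothetical per-round latencies (seconds) for ten clients.
latencies = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 9.8, 10.1, 10.0]
centers, labels = isodata_1d(latencies)
fast_group = [i for i, lab in enumerate(labels) if lab == 0]
```

Under these assumptions, scheduling clients from one latency cluster per round keeps fast devices from idling while they wait for stragglers: `round_time(latencies, fast_group)` is far smaller than the round time when all ten clients are scheduled together.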
- # Federated Learning
- # Iterative Self-organizing Data Analysis Techniques Algorithm
- # Self-organizing Data Analysis Techniques Algorithm
- # Iterative Self-organizing Data Analysis Techniques
- # Federated Learning System
- # Limited Communication Resources
- # Client Scheduling
- # Benchmarking Strategies
- # Resource Block
- # Actual Deployment
- Conference Article
- 10.1109/dicta51227.2020.9363419
- Nov 29, 2020
Satellites have proven to be a technology that can help in a variety of environmental and human development contexts. However, at times some pixels in the satellite images are not captured. These uncaptured pixels are called missing pixels. Having these missing pixels means that important data for research and satellite imagery-based applications is lost. Therefore, people have developed pixel synthesis methods. This paper presents a new pixel synthesis method called the Iterative Self-Organizing Data Analysis Techniques Algorithm - Integration of Geostatistical and Temporal Missing Pixels' Properties (ISODATA-IGTMPP). The method is built upon the Integration of Geostatistical and Temporal Missing Pixels' Properties (IGTMPP) method and adds a seminal clustering technique called the Iterative Self-Organizing Data Analysis Techniques Algorithm (ISODATA). The clustering technique allows a new way of predicting missing pixels from their environmental class with the benefit of the spatial and temporal properties. Here, the ISODATA-IGTMPP method was tested on the Spatial-Temporal Change in the Environment Context (STCEC) dataset and was compared with the results of four missing pixel prediction methods. The method shows the best results and performs very well across different environment types.
- Conference Article
- 10.1109/powercon.2014.6993582
- Oct 1, 2014
Dissolved Gas Analysis (DGA) has already gained popularity in fault diagnosis for oil-immersed transformers. However, owing to the fuzziness and uncertainty between failure phenomena and failure mechanisms, the causes of power equipment failure are very complicated and the accuracy of existing algorithms is low. Diagnostic methods based on artificial intelligence are commonly introduced in this field. Because of the complexity of the network, the convergence speed of the algorithm is badly affected. Limited by manual guidance and expertise, the current algorithms lack self-learning ability. That is why no common diagnostic program can be formed. For this reason, the Iterative Self-Organizing Data Analysis Techniques Algorithm (ISODATA) based on DGA is proposed. First, the feasibility of the transformer fault diagnosis method based on ISODATA and DGA is analyzed, as well as its limitations. In order to improve the efficiency of the algorithm, a genetic algorithm is introduced to optimize the transformer fault diagnosis model by reducing the dependence of the DGA-based ISODATA on the initial clustering. Through these methods, the accuracy and efficiency of the diagnosis are improved. With the analysis of the principles of the fuzzy ISODATA algorithm and the genetic algorithm, and the optimization of the initial cluster centers of the ISODATA algorithm, the feasibility of the optimized transformer fault diagnosis program is proved. Finally, a specific case is programmed, and indicators before and after the improvement are compared to demonstrate its accuracy and efficiency. The experimental comparison shows that, after the improvement, fewer iterations are needed for the same precision and the operating speed is higher with less error.
The results showed that the fuzzy ISODATA algorithm optimized by the genetic algorithms is more in line with actual needs by largely overcoming the dependence on initial cluster center and can be easily applied to oil-immersed transformer fault diagnosis.
- Conference Article
- 10.1109/iske.2017.8258822
- Nov 1, 2017
Two main challenges in current voice conversion are the dependence on parallel training data and the trade-off between speaker similarity and speech quality. To tackle the latter problem, this paper proposes a novel conversion method based on the Iterative Self-organizing DATA Analysis Techniques Algorithm (ISODATA) clustering algorithm. Specifically, we use ISODATA during the training of the Gaussian mixture model (GMM). The optimized mixture number can guarantee the validity and accuracy of the GMM, which can effectively capture the speaker's identity and thus the speaker similarity between the original target speech and the converted speech. Next, we combine the improved GMM and bilinear frequency warping for the conversion stage, which achieves a good balance between speaker similarity and speech quality. Theoretical analysis and experimental results demonstrate that the proposed algorithm can achieve higher quality and similarity compared with two other methods.
- Conference Article
2
- 10.1109/upinlbs.2018.8559789
- Mar 1, 2018
With the development of the smart city, indoor localization has received much attention. In this paper, a novel received signal strength (RSS) based fingerprint localization algorithm is proposed by utilizing the iterative self-organizing data analysis techniques algorithm (ISODATA) and the multiple kernel extreme learning machine (MK-ELM) technique. In the offline phase, the measurement label of each RSS training measurement is assigned by ISODATA clustering. Then the measurement-label training set and the measurement-position training subsets can be formed. Next, using the MK-ELM algorithm, the measurement classification function and the position regression sub-functions are learned from the measurement-label training set and the measurement-position training subsets, respectively. In the online phase, the obtained RSS measurements are classified first. Then the corresponding regression function is chosen for the final position estimation. The experimental results illustrate its performance with respect to position estimation and computational complexity.
- Research Article
15
- 10.1109/access.2021.3101871
- Jan 1, 2021
- IEEE Access
Federated learning (FL) is the up-to-date approach for privacy-constrained Internet of Things (IoT) applications in next-generation mobile network (NGMN), 5th generation (5G), and 6th generation (6G) systems. Because 5G/6G is based on new radio (NR) technology, multiple-input and multiple-output (MIMO) radio services for heterogeneous IoT devices have been performed. The autonomous resource allocation and the intelligent quality of service class identity (IQCI) in mobile networks based on FL systems must meet the privacy requirements of IoT applications. In massive FL communications, the heterogeneous local devices propagate their local models and parameters over 5G/6G networks to the aggregation servers in edge cloud areas. Therefore, assuring network reliability is necessary to facilitate end-to-end (E2E) reliability of FL communications and to provide satisfactory model decisions. This paper proposes an intelligent lightweight scheme based on the reference software-defined networking (SDN) architecture to handle the massive FL communications between clients and aggregators and meet these requirements. The handling method adjusts the model parameters and batch size of each individual client to reflect the apparent network conditions classified by the k-nearest neighbor (KNN) algorithm. The proposed system showed notable experimental results on metrics including E2E FL communication latency, throughput, system reliability, and model accuracy.
- Research Article
90
- 10.1109/jiot.2022.3151193
- Sep 1, 2022
- IEEE Internet of Things Journal
To leverage massive distributed data and computation resources, machine learning in the network edge is considered to be a promising technique, especially for large-scale model training. Federated learning (FL), as a paradigm of collaborative learning techniques, has obtained increasing research attention with the benefits of communication efficiency and improved data privacy. Due to the lossy communication channels and limited communication resources (e.g., bandwidth and power), it is of interest to investigate fast responding and accurate FL schemes over wireless systems. Hence, we investigate the problem of jointly optimized communication efficiency and resources for FL over wireless Internet of Things (IoT) networks. To reduce complexity, we divide the overall optimization problem into two subproblems, i.e., the client scheduling problem and the resource allocation problem. To reduce the communication costs for FL in wireless IoT networks, a new client scheduling policy is proposed by reusing stale local model parameters. To maximize successful information exchange over networks, a Lagrange multiplier method is first leveraged by decoupling variables, including power variables, bandwidth variables, and transmission indicators. Then, a linear-search-based power and bandwidth allocation method is developed. Given appropriate hyperparameters, we show that the proposed communication-efficient FL (CEFL) framework converges at a strong linear rate. Through extensive experiments, it is revealed that the proposed CEFL framework substantially boosts both the communication efficiency and learning performance of both training loss and test accuracy for FL over wireless IoT networks compared to a basic FL approach with uniform resource allocation.
- Research Article
16
- 10.1007/s11053-021-09865-x
- Apr 10, 2021
- Natural Resources Research
In this paper, we explore unsupervised cluster analysis to aid mineral prospectivity mapping (MPM) in two aspects: (1) to cluster geochemical data for MPM based on detailed analysis of evidence maps and (2) to explore coherence of spatial signatures at/around mineralized locations as well as outliers of geochemical data. To do so, a systematic procedure is proposed based on the Iterative Self-organizing Data Analysis Techniques Algorithm (ISODATA). Through this procedure, the detailed analysis of evidence maps in Hezuo–Meiwu district, Gansu Province, China, which portray five selected geochemical elements, showed that clusters with and without mineralized locations provide insight to weighing of each evidence. Finally, through the integration of evidence maps, the favorability score map yielded high AUC (> 0.80) for delineating various mineralized locations in the study area, which proves the efficacy of unsupervised cluster analysis as an aid to MPM. Moreover, the coherence of spatial signatures of known mineralized locations, which comprise a training dataset, is vital to data-driven MPM. Groupings of mineralized locations based on the ISODATA and visual inspection supported by PCs from principal component analysis imply that different deposit types may share the same or similar spatial signature and outliers in geochemical data may be potential training samples used for data-driven MPM. Mineralized locations of the same deposit type may show significant dissimilarity. However, this provides insights into selecting mineralized/non-mineralized locations for creation of training datasets. Interestingly, in the study area, major mineralized locations in zones divided by regional fault are clustered separately into two groups. 
This result not only proves that cluster analysis is effective for exploring the coherence of spatial signatures at/around mineralized locations, but it also justified our previous study, whereby we performed MPM by zones using machine learning algorithms.
- Research Article
8
- 10.1016/j.gexplo.2022.107126
- Nov 25, 2022
- Journal of Geochemical Exploration
Identification of geochemical anomalies related to mineralization: A case study from porphyry copper deposits in the Qulong-Jiama mining district of Tibet, China
- Research Article
3
- 10.12733/jics20101678
- May 1, 2013
- Journal of Information and Computational Science
The Fuzzy C-means (FCM) clustering algorithm is a well-known tool for pattern and image classification, but FCM shows unstable behavior with different fuzzy indexes. The New Weighted Fuzzy C-means (NW-FCM) algorithm, based on the weighted mean concept, was proposed to improve the performance of FCM; however, NW-FCM needs man-machine interaction and cannot classify by itself. In this paper, an Adaptive Weighted Fuzzy Clustering Algorithm (AWFCM), which integrates the weighted mean concept with the mechanism of splitting and lumping from the Iterative Self-organizing Data Analysis Techniques Algorithm (ISODATA), is proposed for remote sensing image classification. At the same time, a statistical method is introduced into the algorithm. The AWFCM does not need a priori knowledge about the number of clusters and their centers, and can automatically estimate an initial number of clusters and their centers, as well as the optimum final number of clusters. Experimental results demonstrate that the AWFCM performs better than the K-means, ISODATA, and FCM algorithms.
- Research Article
1
- 10.1080/22797254.2023.2289616
- Dec 11, 2023
- European Journal of Remote Sensing
This paper takes landslides as a special research object. For the problems of landslide detection in remote sensing images, a deep learning and playback method is adopted. Using the You Only Look Once v5 network (YOLOv5) in combination with the Gabor filter, its edge detection, detection anchor frame and small object detection scale are improved and optimized. The YOLOv5(ISODATA) model was finally established for landslide image detection by incorporating the edge control factor and four clustering algorithms (K-means, K-means++, k-medoid, and the Iterative Self-Organizing Data Analysis Techniques Algorithm (ISODATA)) to evaluate the accuracy of the detection anchor frame and add small target large-scale sampling. Three target identification models – YOLOv5, Region Convolution Neural Network (R-CNN), and Fast R-CNN – are experimentally compared in order to assess the effectiveness of the proposed method. According to the results of experiments, the proposed method outperforms the other three detection models with an AUC of 0.921, a recall of 86.14%, and an MCC of 0.887. It further demonstrates the method’s positive impact on landslide remote sensing image recognition and its ability to solve related issues.
- Conference Article
16
- 10.1109/cisp.2014.7003861
- Oct 1, 2014
Hyperspectral image classification is an important part of hyperspectral remote sensing information processing. The Iterative Self-organizing Data Analysis Techniques Algorithm (ISODATA), an unsupervised clustering algorithm, is considered an effective tool for processing hyperspectral images. In this paper, an improved ISODATA algorithm is proposed for hyperspectral image classification. The algorithm takes the maximum and minimum spectrum of the image into consideration and accurately determines the initial cluster centers by the stepped construction of the spectrum. The classification experiment results show that the improved ISODATA algorithm can determine the initial cluster number adaptively. In comparison with the SAM (Spectral Angle Mapper) algorithm and the original ISODATA algorithm, the proposed ISODATA method shows better performance in the results.
- Research Article
8
- 10.1016/j.ijrmms.2020.104249
- Feb 24, 2020
- International Journal of Rock Mechanics and Mining Sciences
Automated demarcation of the homogeneous domains of trace distribution within a rock mass based on GLCM and ISODATA
- Research Article
30
- 10.3390/electronics11101624
- May 19, 2022
- Electronics
Privacy and data security have become the new hot topic for regulators in recent years. As a result, Federated Learning (FL) (also called collaborative learning) has emerged as a new training paradigm that allows multiple, geographically distributed nodes to learn a Deep Learning (DL) model together without sharing their data. Blockchain is becoming a new trend as data protection and privacy are concerns in many sectors. Technology is leading the world and transforming into a global village where everything is accessible and transparent. We have presented a blockchain enabled security model using FL that can generate an enhanced DL model without sharing data and improve privacy through higher security and access rights to data. However, existing FL approaches also have unique security vulnerabilities that malicious actors can exploit and compromise the trained model. The FL method is compared to the other known approaches. Users are more likely to choose the latter option, i.e., providing local but private data to the server and using ML apps, performing ML operations on the devices without benefiting from other users’ data, and preventing direct access to raw data and local training of ML models. FL protects data privacy and reduces data transfer overhead by storing raw data on devices and combining locally computed model updates. We have investigated the feasibility of data and model poisoning attacks under a blockchain-enabled FL system built alongside the Ethereum network and the traditional FL system (without blockchain). This work fills a knowledge gap by proposing a transparent incentive mechanism that can encourage good behavior among participating decentralized nodes and avoid common problems and provides knowledge for the FL security literature by investigating current FL systems.
- Research Article
6
- 10.1016/j.eswa.2023.123006
- Dec 22, 2023
- Expert Systems with Applications
T-FedHA: A Trusted Hierarchical Asynchronous Federated Learning Framework for Internet of Things
- Conference Article
10
- 10.1109/globecom48099.2022.10000743
- Dec 4, 2022
Federated Learning (FL) is considered the key approach for privacy-preserving, distributed machine learning (ML) systems. However, due to the transmission of large ML models from users to the server in each iteration of FL, communication on resource-constrained networks is currently a fundamental bottleneck in FL, restricting the ML model complexity and user participation. One of the notable trends to reduce the communication cost of FL systems is gradient compression, in which techniques in the form of sparsification or quantization are utilized. However, these methods are pre-fixed and do not capture the redundant, correlated information across parameters of the ML models, user devices' data, and iterations of FL. Further, these methods do not fully take advantage of the error-correcting capability of the FL process. In this paper, we propose the Federated Learning with Autoencoder Compression (FLAC) approach that utilizes the redundant information and error-correcting capability of FL to compress user devices' models for uplink transmission. FLAC trains an autoencoder to encode and decode users' models at the server in the Training State, and then sends the autoencoder to user devices for compressing local models for future iterations during the Compression State. To guarantee the convergence of the FL, FLAC dynamically controls the autoencoder error by switching between the Training State and Compression State to adjust its autoencoder and its compression rate based on the error tolerance of the FL system. We theoretically prove that FLAC converges for FL systems with strongly convex ML models and non-i.i.d. data distribution. Our extensive experimental results over three datasets with different network architectures show that FLAC can achieve compression rates ranging from 83x to 875x while staying near 7 percent of the accuracy of the non-compressed FL systems.
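For context, the fixed sparsification baseline that this abstract contrasts FLAC with can be sketched in a few lines. This is not FLAC and not code from the paper: `top_k_sparsify` and `densify` are hypothetical helper names, and top-k magnitude selection is only one common form of sparsification, assumed here for illustration.

```python
import numpy as np

def top_k_sparsify(update, k):
    """Keep the k largest-magnitude entries of a flat model update."""
    update = np.asarray(update, dtype=float)
    # argpartition places the k largest |values| in the last k positions.
    idx = np.argpartition(np.abs(update), -k)[-k:]
    idx.sort()
    return idx, update[idx]

def densify(idx, values, size):
    """Rebuild a dense update on the server; dropped entries become zero."""
    out = np.zeros(size)
    out[idx] = values
    return out

# Hypothetical local update with five parameters; the client sends only two.
update = np.array([0.1, -2.0, 0.05, 1.5, -0.2])
idx, values = top_k_sparsify(update, k=2)
restored = densify(idx, values, update.size)
```

The rule is fixed in advance: it always keeps `k` entries regardless of how redundant the update is, which is the limitation the abstract attributes to such methods.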