Spatio-temporal predictive coding for weather forecasting and Governance: a plug-and-play module for causal discovery and data augmentation
- Research Article
7
- 10.1016/j.apacoust.2021.108382
- Sep 14, 2021
- Applied Acoustics
Robust children’s speech recognition in zero resource condition
- Conference Article
1
- 10.1109/spcom55316.2022.9840767
- Jul 11, 2022
Developing an automatic speech recognition (ASR) system for children’s speech is extremely challenging due to the unavailability of data from the child domain for the majority of the languages. Consequently, in such zero-resource scenarios, we are forced to develop an ASR system using adults’ speech for transcribing data from child speakers. However, differences in formant frequencies and speaking-rate between the two groups of speakers degrade recognition performance. To reduce the said mismatch, out-of-domain data augmentation approaches based on formant and duration modification are proposed in this work. For that purpose, formant frequencies of adults’ speech training data are up-scaled using warping of linear predictive coding coefficients. Next, the speaking-rate of adults’ data is also increased through time-scale modification. Due to simultaneous altering of formant frequencies and duration of adults’ speech and then pooling the modified data into training, the acoustic mismatch due to the aforementioned factors gets reduced. This, in turn, enhances the recognition performance significantly. Additional improvement is obtained by combining the recently reported voice-conversion-based data augmentation technique with the proposed ones. On combining the proposed and voice-conversion-based data augmentation techniques, a relative reduction of nearly 32.3% in word error rate over the baseline is obtained.
- Research Article
- 10.1142/s0219876224500257
- Aug 6, 2024
- International Journal of Computational Methods
Weather forecasting is an effort by meteorologists to predict weather states at a few prospective times and the weather conditions that might be estimated. With modern technology, present weather forecasting methods are highly precise. To achieve high accuracy, the methods developed for weather forecasting have become much more complicated owing to many factors. Here, time series data are used for weather forecasting by the devised Long-Short Term Memory fused Convolutional neural network (LSTMFCNN). At first, the input time series data are acquired from the specific dataset. In the feature extraction step, technical features are extracted from the input time series data. Then, the feature extraction is done utilizing the Rider Optimization Algorithm-Based Neural Network (RideNN) with the Soergel metric. RideNN is the integration of the Rider Optimization Algorithm (ROA) with the Neural Network (NN) classifier. Thus, the feature fusion step reduces the complexity and improves the accuracy. Thereafter, an oversampling technique is utilized for the data augmentation (DA) process. Finally, weather forecasting is done utilizing the newly designed LSTMFCNN, which is obtained by the integration of a Convolutional Neural Network (CNN) and a Deep Long-Short Term Memory (DLSTM).
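The Soergel metric named in the abstract above is commonly defined, for non-negative feature vectors, as the sum of absolute differences divided by the sum of element-wise maxima. The following is a minimal sketch under that assumption. The function name and the sample vectors are illustrative, and the abstract does not say how the metric is combined with RideNN.

```python
def soergel_distance(x, y):
    """Soergel distance between two equal-length, non-negative vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    numerator = sum(abs(a - b) for a, b in zip(x, y))
    denominator = sum(max(a, b) for a, b in zip(x, y))
    # Two all-zero vectors are identical, so their distance is defined as 0.
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    # Hypothetical feature vectors, used only to exercise the function.
    print(soergel_distance([2.0, 4.0], [1.0, 2.0]))
```

The value lies between 0 (identical vectors) and 1 (no overlap), which makes it convenient as a bounded similarity score when ranking or fusing features.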
- Preprint Article
- 10.5194/egusphere-egu25-12110
- Mar 18, 2025
Accurately forecasting streamflow is essential for effectively managing water resources. High-quality operational forecasts allow us to prepare for extreme weather events, optimize hydropower generation, and minimize the impact of human development on the natural environment. However, streamflow forecasts are inherently limited by the quality and availability of upstream weather sources. The weather forecasts that drive hydrological modeling vary in their temporal resolutions and are prone to outages, such as the ECMWF data outage in November of 2023. Here, we present HydroForecast Short Term 3 (ST-3), a state-of-the-art probabilistic deep learning model for medium-term (10-day) streamflow forecasts. ST-3 combines a long short-term memory architecture with Boolean tensors representing data availability and dense embeddings for processing the information in these tensors. This architecture allows for a training routine that implements data augmentation to synthesize varying amounts of availability of weather inputs. The result is a model that 1) makes accurate forecasts even in the case of an upstream data outage, 2) achieves higher accuracy by leveraging data of varying temporal resolutions, including regional weather inputs with shorter lead times than the most common medium-term weather inputs, and 3) generates individual forecast traces for each individual weather source, facilitating inference across regions where weather data availability is limited. Initial results across CAMELS sites in North America indicate that the incorporation of near-term high-resolution weather data increases early horizon forecast KGE by nearly 0.25, with meaningful improvements in metrics seen across our customers’ operational sites. Validation metrics across individual weather sources, as well as model interrogation through integrated gradients, highlight a high level of fidelity in the model’s learned physical relationships across forecast scenarios.
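The availability-based augmentation described above can be illustrated with a small sketch: during training, whole weather sources are hidden at random and a Boolean mask records which ones remain. This is an assumption-laden illustration, not the ST-3 training routine. The function name, the `drop_prob` parameter, and the zero fill value are all hypothetical.

```python
import random


def mask_sources(sample, drop_prob=0.3, rng=None):
    """Randomly hide whole weather sources; return masked values and a Boolean mask.

    sample: dict mapping a source name to its list of forecast values.
    """
    rng = rng or random.Random()
    names = list(sample)
    keep = {name: rng.random() >= drop_prob for name in names}
    # Always keep at least one source so the model still has some input.
    if not any(keep.values()):
        keep[rng.choice(names)] = True
    # Hidden sources are zero-filled; the mask tells the model they are missing.
    values = {
        name: (list(sample[name]) if keep[name] else [0.0] * len(sample[name]))
        for name in names
    }
    return values, keep
```

Feeding the mask alongside the values lets a model distinguish "this source reported zero" from "this source was unavailable", which is the role the abstract assigns to the Boolean tensors.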
- Conference Article
2
- 10.1109/isai-nlp54397.2021.9678178
- Dec 21, 2021
Kale is a popular ingredient in Thai cuisine and can be grown year-round. However, kale requires particular care, especially against pests. Therefore, this study applies the Internet of Things to propose KaleCare, a smart farm management system for kale with four main functions: automatic watering based on weather forecasting, automatic fertilizing, reporting, and pest detection for cutworms and aphids. There are three processes to create the pest classification models for the pest detection function. Firstly, GrabCut was applied to the raw images to remove the background. Secondly, data augmentation was applied to generate images due to the small amount of raw data. Finally, a modified GoogLeNet, which reduces the original GoogLeNet structure, is proposed to classify both types of pests. The experimental results show that the proposed model achieves 0.8903 and 0.7959 in average classification rate and 0.886 and 0.7965 in average F1-score for classifying cutworm and aphid, respectively.
- Research Article
28
- 10.15837/ijccc.2022.2.4356
- Feb 18, 2022
- INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL
Agriculture is the primary source of livelihood for about 60% of the world's total population according to the Food and Agricultural Organization (FAO). The economy of the developing countries is solely dependent on agriculture commodities. As the world population is increasing at a faster pace, the demand for food is also escalating tremendously. In recent days, agriculture is experiencing an automation revolution. Hence the introduction of disruptive technologies like Artificial Intelligence plays a major role in increasing agricultural productivity. AI-enabled approaches would help in overcoming the traditional challenges faced in agriculture practices by automating various agriculture-related tasks. Nowadays, farmers adopt precision farming, which uses AI techniques for crop health monitoring, weed detection, plant disease identification and detection, and forecasting of weather and commodity prices to increase the yield. As there is a scarcity of manpower in the agriculture sector, AI-based equipment like bots and drones is used widely. Crop diseases are a major threat to food security, and the manual identification of the diseases with the help of experts incurs more cost and time, especially for larger farms. Machine-vision-based techniques provide image-based automatic process control, inspection, and robot guidance for pest and disease control. They provide automated processes in agriculture, paving the way for improved efficiency and profitability. Various factors contribute to plant diseases, including soil health, climatic conditions, species and pests. The proposed chapter elaborates on the use of deep learning techniques in the leaf disease detection of Cassava plants. The chapter initially describes the evolution of various neural network techniques used in classification and prediction. It describes the significance of using a Convolutional Neural Network (CNN) over deep neural networks.
The chapter focuses on classification of leaf disease in Cassava plants using images acquired in real time and from a Kaggle dataset. In the final part of the chapter, the results of the models with original and augmented data are illustrated, considering accuracy as the performance metric.
- Research Article
52
- 10.1609/aaai.v37i3.25378
- Jun 26, 2023
- Proceedings of the AAAI Conference on Artificial Intelligence
Standard multi-modal models assume the use of the same modalities in training and inference stages. However, in practice, the environment in which multi-modal models operate may not satisfy such an assumption. As such, their performance degrades drastically if any modality is missing in the inference stage. We ask: how can we train a model that is robust to missing modalities? This paper seeks a set of good practices for multi-modal action recognition, with a particular interest in circumstances where some modalities are not available at inference time. First, we show how to effectively regularize the model during training (e.g., data augmentation). Second, we investigate fusion methods for robustness to missing modalities: we find that transformer-based fusion shows better robustness to missing modalities than summation or concatenation. Third, we propose a simple modular network, ActionMAE, which learns missing modality predictive coding by randomly dropping modality features and trying to reconstruct them with the remaining modality features. Coupling these good practices, we build a model that is not only effective in multi-modal action recognition but also robust to missing modalities. Our model achieves state-of-the-art results on multiple benchmarks and maintains competitive performance even in missing-modality scenarios.
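The drop-and-reconstruct idea described for ActionMAE can be sketched as a single training step: pick one modality at random, hide it, predict it from the rest, and score the prediction with a mean squared error. This is only an illustration of the objective. The function names are hypothetical, and `mean_of_remaining` is a deliberately trivial placeholder standing in for the learned network, which the abstract does not specify.

```python
import random


def drop_and_reconstruct(features, reconstruct, rng=None):
    """Drop one modality at random and score its reconstruction with MSE.

    features: dict mapping a modality name to its feature vector (list of floats).
    reconstruct: callable that maps the remaining features to a predicted vector.
    """
    rng = rng or random.Random()
    dropped = rng.choice(sorted(features))
    remaining = {m: f for m, f in features.items() if m != dropped}
    predicted = reconstruct(remaining)
    target = features[dropped]
    loss = sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)
    return dropped, loss


def mean_of_remaining(remaining):
    """Placeholder reconstructor: element-wise mean of the remaining features."""
    vectors = list(remaining.values())
    return [sum(column) / len(column) for column in zip(*vectors)]
```

In a real model the placeholder would be replaced by a trainable module and the loss would be minimized jointly with the recognition loss.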
- Conference Article
108
- 10.1109/slt48900.2021.9383605
- Jan 19, 2021
Contrastive Predictive Coding (CPC), based on predicting future segments of speech from past segments, is emerging as a powerful algorithm for representation learning of speech signals. However, it still under-performs compared to other methods on unsupervised evaluation benchmarks. Here, we introduce WavAugment, a time-domain data augmentation library which we adapt and optimize for the specificities of CPC (raw waveform input, contrastive loss, past versus future structure). We find that applying augmentation only to the segments from which the CPC prediction is performed yields better results than applying it also to future segments from which the samples (both positive and negative) of the contrastive loss are drawn. After selecting the best combination of pitch modification, additive noise and reverberation on unsupervised metrics on LibriSpeech (with a gain of 18-22% relative on the ABX score), we apply this combination without any change to three new datasets in the Zero Resource Speech Benchmark 2017 and beat the state-of-the-art using out-of-domain training data. Finally, we show that the data-augmented pretrained features improve a downstream phone recognition task in the Libri-light semi-supervised setting (10 min, 1 h or 10 h of labelled data), reducing the PER by 15% relative.
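The finding that augmentation works best when applied only to the past segments can be illustrated with a short sketch. It does not use the WavAugment API. It is plain Python with additive Gaussian noise as a stand-in augmentation, and the function name, `split` index, and `noise_std` value are assumptions chosen for illustration.

```python
import random


def augment_past_only(waveform, split, noise_std=0.01, rng=None):
    """Add noise to the past part of a waveform and leave the future part clean.

    waveform: list of float samples.
    split: index separating the past context from the future segments.
    """
    rng = rng or random.Random()
    # The past context (input to the predictor) is perturbed.
    past = [sample + rng.gauss(0.0, noise_std) for sample in waveform[:split]]
    # The future segments (source of positive and negative samples) stay untouched.
    future = list(waveform[split:])
    return past, future
```

Keeping the future segments clean means the contrastive targets are drawn from unmodified audio, which matches the setting the abstract reports as most effective.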
- Book Chapter
- 10.9734/bpi/mcsru/v9/6825
- Mar 2, 2026
Numerous problems in agriculture, including unpredictable crop yields, disease susceptibility, and the consequences of weather variability, put nutrition and farmer livelihoods at risk. In order to increase agricultural yields, detect diseases early, and provide valuable insights on the Crop Yield Prediction Dataset and Plant Village Dataset, this research provides an AI-powered solution to these issues by integrating deep learning, sophisticated machine learning algorithms, and instantaneous data analysis. The system employs a sophisticated methodology that forecasts temperature, humidity, and conditions for the next five days using the PyOWM API; detects crop diseases using data augmentation and deep learning models such as CNN (accuracy 99.14%), DenseNet-201 (accuracy 99.04%), and Visual Geometry Group-VGG19 (accuracy 97%); and predicts crop yield using models such as Multi-Layer Perceptron-MLP (R2 Score: 0.8242), MLP + Regressor, and Random Forest Regressor achieves the highest R2 Score (0.1789). An AI chatbot that provides farmers with recommendations, disease control methods, and personalised suggestions is part of the technology's real-time help. In order to provide an AI-driven system for weather forecasting, disease detection, yield prediction, and real-time assistance via a chatbot, this project integrates models with high accuracy rates. The user-friendly Streamlit UI is available in Telugu, Hindi, and English, and SQLite handles the secure login and registration procedure.
- Conference Article
- 10.1109/icaitech66481.2025.11387243
- Nov 20, 2025
Crop yields are strongly affected by weather factors including temperature, humidity, rainfall, and wind, which play critical roles in plant development, irrigation management, and harvest outcomes. Reliable weather forecasting is thus crucial for enabling data-driven horticultural decisions and minimizing production risks. Conventional prediction techniques often struggle with non-stationary and multivariate time-series data and typically rely on a single preprocessing or feature engineering method. This study proposes a comprehensive framework integrating multiscenario preprocessing, scaling, and data augmentation with Bidirectional Long Short-Term Memory (BiLSTM) modeling to predict daily weather parameters. Preprocessing strategies include interpolation, mean, median, and k-nearest neighbors (KNN) imputation, while scaling methods involve Min-Max and Standard normalization. Data augmentation techniques such as sliding windows, Gaussian noise injections, and seasonal encoding are applied to enhance temporal pattern learning and model generalization. The framework evaluates multiple combinations of imputation, scaling, and augmentation strategies to identify optimal pipelines for each weather parameter. Results show that the best scenario, which combines Median imputation, Min-Max scaling, and Seasonal augmentation with a window size of 7, achieves the lowest RMSE of 0.0833 for temperature and an $R^2$ of 0.5151, indicating substantial variance explained. Wind speed also shows robust performance under similar preprocessing, whereas rainfall and relative humidity remain challenging to forecast due to high variability. The findings highlight the importance of systematic preprocessing and augmentation for improving predictive accuracy. This framework provides a robust and adaptive tool for supporting precision agriculture, enabling more informed management decisions amid varying weather conditions.
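The preprocessing and augmentation steps listed above (Min-Max scaling, sliding windows, Gaussian noise injection, seasonal encoding) can be sketched in plain Python. This is a minimal illustration, not the study's pipeline. The function names, the sine/cosine form of the seasonal encoding, and the default parameters other than the window size of 7 are assumptions.

```python
import math
import random


def min_max_scale(series):
    """Scale values linearly into [0, 1]; a constant series maps to zeros."""
    low, high = min(series), max(series)
    span = high - low
    return [(value - low) / span if span else 0.0 for value in series]


def sliding_windows(series, window=7):
    """Return (inputs, target) pairs: `window` past values predict the next one."""
    return [
        (series[i:i + window], series[i + window])
        for i in range(len(series) - window)
    ]


def seasonal_encoding(day_of_year, period=365.25):
    """Encode the position in the yearly cycle as a (sin, cos) pair."""
    angle = 2 * math.pi * day_of_year / period
    return math.sin(angle), math.cos(angle)


def add_gaussian_noise(values, std=0.01, rng=None):
    """Return a noisy copy of the values for augmentation."""
    rng = rng or random.Random()
    return [value + rng.gauss(0.0, std) for value in values]
```

A typical order would be to impute missing values first, scale, build windows, and then append noisy copies of the training windows together with the seasonal features of each target day.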
- Conference Article
3
- 10.1109/ccis57298.2022.10016374
- Nov 26, 2022
Visibility prediction in coastal areas has always been an important issue affecting the safety of residents and the efficiency of urban transportation. The visibility prediction methods currently used by meteorological centers are mainly based on statistical forecasting, with relatively low prediction accuracy and high computational complexity. These methods cannot work well with large amounts of data. However, with the rapid development of deep learning technology, the use of deep learning has become a primary trend. In this paper, we propose a visibility prediction model based on a Long Short-Term Memory (LSTM) network and a self-attention mechanism. The model takes medium-range forecast data from the European Centre for Medium-Range Weather Forecasts (ECMWF), referred to as EC data for simplicity, together with observatory visibility data as input, and uses the LSTM network as the backbone to extract time series information. We also use the self-attention mechanism to process the input data before it is fed to the model, so that the model better focuses on the information valuable for prediction. Compared with the predicted visibility in EC data, our proposed method improved the 3-hour prediction accuracy by 20%, 1.5 times, and 8 times for high-range, medium-range, and low-range visibility, respectively. We also find that data imbalance greatly affects the prediction accuracy for low-visibility data, and we use a weighted loss and a mix-up data augmentation strategy in our model training. We improved the accuracy on low-visibility data by 1.2 times while the prediction results for high-visibility and medium-visibility data remained almost the same. In addition, we conduct several experiments to verify the effectiveness of our model design and the rationality of data augmentation.
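The mix-up augmentation mentioned above is usually formulated as a convex combination of two training examples with a mixing weight drawn from a Beta distribution. The sketch below follows that common formulation. The function name and the `alpha` value are assumptions, and the abstract does not state which variant or hyperparameters the authors used.

```python
import random


def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Blend two (features, target) examples with a Beta-distributed weight."""
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)
    # Features and targets are mixed with the same weight.
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = lam * y1 + (1 - lam) * y2
    return x, y, lam
```

For an imbalanced regression target such as low visibility, pairing rare examples with common ones creates additional training points near the under-represented range.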
- Abstract
- 10.1093/schbul/sby014.119
- Apr 1, 2018
- Schizophrenia Bulletin
Background: Our sense of embodied self depends on continuous spatiotemporal integration and predictive coding of multisensory signals to yield a stable internal landscape. However, schizophrenia is characterized by inconsistent mapping of the physical and parasomatic body space, autoscopic hallucinations and flexible body self boundary. We aimed to elucidate the specific roles of exteroceptive, proprioceptive and interoceptive systems in generating self disturbances. Lastly, if schizophrenia represents one end of the spectrum of bodily self disorders, it is also important to understand what lies at the other extreme end, represented by those whose prediction coding is honed to perfection from years of training (athletes), to gain insight into potential remediation strategies.
Methods: In Study 1, components of bodily self-disturbances were examined in individuals with schizophrenia (SZ), matched controls (CO) and prodromal participants (P) with tasks that assessed tactile perception (2-point discrimination task), susceptibility to proprioceptive-tactile illusions, multisensory integration, visual body mapping of emotions (emBODY), and interoceptive awareness (heartbeat detection task). Phenomenological dissociative experiences were captured with a novel picture-based inventory (BODI). In Study 2, we recruited healthy participants with extraordinary expertise to coordinate interoceptive, proprioceptive and exteroceptive signals to perform physical tasks (athletes), and compared their embodiment of emotions with that of matched controls and individuals with schizophrenia.
Results: Individuals with schizophrenia and prodromal participants were impaired in interoceptive awareness, exteroceptive tactile discrimination, and audio-visual integration compared with matched control groups. SZ and P also showed increased sensitivity to proprioceptive illusions, which was associated with increased dissociative experiences and positive syndromes. Bodily sensations associated with emotions were reduced in SZ and P compared to CO. Importantly, the spatial locations of embodied emotions were different in SZ compared with CO. Interestingly, athletes showed highly precise localization of embodied emotions compared with matched controls. Self-disturbances were exacerbated by social isolation regardless of diagnosis.
Discussion: These results suggest that mapping of internal signals to the experience of the external world is inconsistent or incoherent, contributing to fragmented and discontinuous self experience in persons with schizophrenia. More specifically, proprioceptive prediction errors seem to contribute to an abnormally flexible self boundary. Diminished access to interoceptive signals may lead to reduced mapping of bodily sensations. Embodied emotions were reduced in SZ and P compared to CO. Athletes seemed to have much more precisely tuned awareness of embodied emotions. These results are consistent with the framework of increased internal neural noise in schizophrenia, which could lead to both weakened and poorly integrated interoceptive, proprioceptive and exteroceptive signaling, and a fragmented sense of self. Athletes' data suggest that it may be possible to remediate bodily self disturbances via physical training. These findings underscore the importance of bringing back the body to psychiatry.
- Research Article
2
- 10.3390/make2030017
- Aug 25, 2020
- Machine Learning and Knowledge Extraction
In spatio-temporal predictive coding problems, like next-frame prediction in video, determining the content of plausible future frames is primarily based on the image dynamics of previous frames. We establish an alternative approach based on their underlying semantic information when considering data that do not necessarily incorporate a temporal aspect, but instead they comply with some form of associative ordering. In this work, we introduce the notion of semantic predictive coding by proposing a novel generative adversarial modeling framework which incorporates the arbiter classifier as a new component. While the generator is primarily tasked with the anticipation of possible next frames, the arbiter’s principal role is the assessment of their credibility. Taking into account that the denotative meaning of each forthcoming element can be encapsulated in a generic label descriptive of its content, a classification loss is introduced along with the adversarial loss. As supported by our experimental findings in a next-digit and a next-letter scenario, the utilization of the arbiter not only results in an enhanced GAN performance, but it also broadens the network’s creative capabilities in terms of the diversity of the generated symbols.
- Research Article
24
- 10.3390/technologies11010028
- Feb 7, 2023
- Technologies
Tour planning has become both challenging and time-consuming due to the huge amount of information available online and the variety of options to choose from. This is more so as each traveler has a unique set of interests and location preferences in addition to other tour-based constraints such as vaccination status and pandemic travel restrictions. Several travel planning companies and agencies have emerged with more sophisticated online services to capitalize on global tourism effectively by using technology for making suitable recommendations to travel seekers. However, such systems predominantly adopt a destination-based recommendation approach and often come as bundled packages with limited customization options for incorporating each traveler’s preferences. To address these limitations, “thematic travel planning” has emerged as a recent alternative with researchers adopting text-based data mining for achieving value-added online tourism services. Understanding the need for a more holistic theme approach in this domain, our aim is to propose an augmented model to integrate analytics of a variety of big data (both static and dynamic). Our unique inclusive model covers text mining and data mining of destination images, reviews on tourist activities, weather forecasts, and recent events via social media for generating more user-centric and location-based thematic recommendations efficiently. In this paper, we describe an implementation of our proposed inclusive hybrid recommendation model that uses data of multimodal ranking of user preferences. Furthermore, in this study, we present an experimental evaluation of our model’s effectiveness. We present the details of our improvised model that employs various statistical and machine learning techniques on existing data available online, such as travel forums and social media reviews, in order to arrive at the most relevant and suitable travel recommendations.
Our hybrid recommender built using various Spark models such as naïve Bayes classifier, trigonometric functions, deep learning convolutional neural network (CNN), time series, and NLP with sentiment scores using AFINN (sentiment analysis developed by Finn Årup Nielsen) shows promising results in the directions of benefit for an individual model’s complementary advantages. Overall, our proposed hybrid recommendation algorithm serves as an active learner of user preferences and ranking by collecting explicit information via the system and uses such rich information to make personalized augmented recommendations according to the unique preferences of travelers.
- Conference Article
3
- 10.1109/icassp.2011.5946590
- May 1, 2011
Compressed video bitstreams are very sensitive to transmission errors. If packets are lost or received with errors during transmission, not only will the current frame be corrupted, but the error will also propagate to succeeding frames due to the spatiotemporal predictive coding structure. Error detection and concealment is a good approach to reduce the negative influence on the reconstructed visual quality. To increase concealment efficiency, a more accurate error detection algorithm is needed. In this paper, we present a new error detection scheme based on a fragile watermarking algorithm to increase the error detection ratio. We verified through computer simulation with the H.324M mobile simulation set that the proposed algorithm performs well in PSNR and objective visual quality.