Deep Image Deblurring for Non-Uniform Blur: A Comparative Study of Restormer and BANet

Image blur is one of the most common forms of image degradation. The blur in captured images is often non-uniform, with different levels of blur in different areas of the image. In recent years, most deblurring methods have been deep learning-based. These methods model deblurring as an image-to-image translation problem, treating images globally, which may result in poor performance on non-uniform blur. Therefore, this paper compares two state-of-the-art supervised deep learning methods for deblurring and restoration, namely BANet and Restormer, with a special focus on non-uniform blur. The GOPRO training dataset, which various studies also use as a benchmark, was used to train the models. The trained models were then tested on the GOPRO testing set, on the HIDE testing set for cross-dataset evaluation, and on GOPRO-NU, a set of specifically selected non-uniformly blurred images from the GOPRO testing set, for the non-uniform deblurring test. On the GOPRO testing set, Restormer achieved an SSIM of 0.891 and a PSNR of 27.66, while BANet obtained an SSIM of 0.926 and a PSNR of 34.90. On the HIDE dataset, Restormer achieved an SSIM of 0.907 and a PSNR of 27.93, while BANet obtained an SSIM of 0.908 and a PSNR of 34.52. Finally, on GOPRO-NU, Restormer achieved an SSIM of 0.911 and a PSNR of 29.48, while BANet obtained an SSIM of 0.935 and a PSNR of 35.47. Overall, BANet handles non-uniform blur best, with a significant improvement over Restormer.
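
For reproducibility, the sketch below shows how SSIM and PSNR figures like those above are typically computed with scikit-image; the file paths are hypothetical placeholders, as the abstract does not specify the evaluation script.

# Hedged sketch: scoring one restored frame against its sharp ground truth.
from skimage import io
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

sharp = io.imread("gopro/sharp/000001.png")        # hypothetical path
restored = io.imread("gopro/restored/000001.png")  # hypothetical path

psnr = peak_signal_noise_ratio(sharp, restored, data_range=255)
ssim = structural_similarity(sharp, restored, channel_axis=2, data_range=255)
print(f"PSNR: {psnr:.2f} dB, SSIM: {ssim:.3f}")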

Application of Q-learning Method for Disaster Evacuation Route Design Case Study: Digital Center Building UNNES

The Digital Center (DC) building at UNNES is a new building on campus that currently lacks evacuation routes. It is therefore necessary to create an evacuation route plan in accordance with the Minister of Health Regulation Number 48 of 2016. Creating an evacuation route plan manually can be inefficient and error-prone, especially for large buildings with complex interiors. To address this issue, reinforcement learning (RL) techniques can be applied. In this study, Q-learning is utilized to find the shortest path to the exit doors from 11 rooms on the first floor of the DC building. The study uses the floor plan of the DC building, the locations of the exit doors, and the distances required to reach them. The results demonstrate that Q-learning successfully identifies the shortest evacuation routes for the first floor of the DC building; the routes it generates are identical to the manually created shortest paths. Even when additional obstacles are introduced into the environment, Q-learning still finds the shortest routes. On average, convergence in both environments requires fewer than 1000 episodes, and the average computation times for the two environments are 0.54 seconds and 0.76 seconds, respectively.
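
To illustrate the method, the sketch below runs tabular Q-learning on a toy room-adjacency graph; the graph, rewards, and hyperparameters are assumptions for illustration, not the actual DC building floor plan or the study's settings.

# Minimal tabular Q-learning for shortest-path evacuation routing.
import random

graph = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: []}  # room adjacency
EXIT = 4                                   # hypothetical exit-door node
Q = {(s, a): 0.0 for s in graph for a in graph[s]}
alpha, gamma, eps = 0.1, 0.9, 0.2          # learning rate, discount, exploration

for episode in range(1000):
    s = random.choice([0, 1, 2, 3])
    while s != EXIT:
        actions = graph[s]
        if random.random() < eps:
            a = random.choice(actions)                  # explore
        else:
            a = max(actions, key=lambda x: Q[(s, x)])   # exploit
        reward = 100.0 if a == EXIT else -1.0   # step cost favors short routes
        future = max((Q[(a, n)] for n in graph[a]), default=0.0)
        Q[(s, a)] += alpha * (reward + gamma * future - Q[(s, a)])
        s = a

s, path = 0, [0]                           # greedy rollout from room 0
while s != EXIT:
    s = max(graph[s], key=lambda x: Q[(s, x)])
    path.append(s)
print("learned evacuation route:", path)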

Classification of Coffee Fruit Maturity Level based on Multispectral Image Using Naïve Bayes Method

Current research on classifying coffee fruit ripeness from multispectral images has used the Convolutional Neural Network (CNN) method to extract patterns from high-dimensional multispectral images. The high complexity of CNNs allows the model to capture complex features but requires more time and computational resources for training and testing. Therefore, in this study, classification is performed with a simpler method, Naïve Bayes, whose complexity depends only on the number of features and samples. Because the method treats each feature independently, it is fast and does not require extensive computational resources. Naïve Bayes is applied to color and texture features extracted from multispectral images of coffee fruit: 300 features in total, consisting of 60 color features and 240 texture features. Experiments were conducted with different training/testing splits and with each feature set used separately and in combination. The combination of color and texture features performed better than either feature set alone, with the highest accuracy reaching 91.01%. In conclusion, Naïve Bayes is still reasonably effective at classifying the ripeness of coffee fruit from multispectral images.
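
The sketch below shows the general shape of such a pipeline with scikit-learn's GaussianNB; the random matrix stands in for the 300 extracted color and texture features, and the three classes are assumed ripeness levels rather than the study's actual labels.

# Hedged sketch: Gaussian Naive Bayes over pre-extracted feature vectors.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(450, 300))    # placeholder for 60 color + 240 texture features
y = rng.integers(0, 3, size=450)   # assumed classes: unripe / half-ripe / ripe

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = GaussianNB().fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, model.predict(X_te)))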

Temporal Action Segmentation in Sign Language System for Bahasa Indonesia (SIBI) Videos Using Optical Flow-Based Approach

Sign language (SL) is vital in fostering communication for the deaf and hard-of-hearing communities. Continuous Sign Language Translation (CSLT) is the task of translating sign language into spoken language; it works by segmenting continuous gestures into isolated signs. Segmenting morpheme signs from phrase signs poses several challenges, such as the limited availability of annotated datasets and the complexity of continuous gesture movements. The Indonesian Sign Language System (SIBI) follows Indonesian grammatical norms, including word formation, unlike natural sign languages whose rules are not derived from the surrounding spoken language. In SIBI, a word can consist of a root word and an affix. Temporal action segmentation in SIBI is therefore important for reconstructing the translation of each sign into spoken Indonesian sentences. This research uses an optical flow approach to segment temporal actions in SIBI videos. Optical flow methods, which compute intensity changes between adjacent frames, can detect when a sign movement occurs or, conversely, detect the pauses between sign movements: the absence of intensity differences between two frames indicates a boundary between sign gestures. This study tested dense optical flow on videos of SIBI sentences recorded from 3 signers. Several parameters of the dense optical flow algorithm, such as the threshold size, PyrScale, and WinSize, were evaluated to obtain the best accuracy. The results show that the optical flow algorithm successfully performs segmentation, as measured by Perf and F1r; the highest Perf and F1r scores were 0.8298 and 0.8524, respectively.
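
The pause-detection idea can be sketched with OpenCV's Farneback dense optical flow, thresholding the mean flow magnitude between adjacent frames; the threshold, pyr_scale, and winsize values below are illustrative assumptions, not the tuned parameters reported in the paper.

# Hedged sketch: low-motion frames as candidate boundaries between signs.
import cv2
import numpy as np

cap = cv2.VideoCapture("sibi_sentence.mp4")   # hypothetical input video
ok, prev = cap.read()
assert ok, "could not read the video"
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
boundaries, idx = [], 0

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    flow = cv2.calcOpticalFlowFarneback(
        prev_gray, gray, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
    motion = np.linalg.norm(flow, axis=2).mean()  # mean flow magnitude
    if motion < 0.3:            # assumed threshold: near-zero motion = pause
        boundaries.append(idx)
    prev_gray, idx = gray, idx + 1

print("candidate sign boundaries at frames:", boundaries)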

Hand Sign Interpretation through Virtual Reality Data Processing

This research lays the groundwork for further advancements in VR technology, aiming to develop devices capable of interpreting sign language into speech via intelligent systems. The uniqueness of this study lies in using the Meta Quest 2 VR device to gather primary hand sign data, which are subsequently classified using machine learning techniques to evaluate the device's ability to interpret hand signs. The initial stages focused on collecting hand sign data from the VR device and processing the data to understand sign patterns and characteristics. 1021 data points, comprising ten distinct hand sign gestures, were collected using a simple application developed with the Unity Editor. Each data point contained 14 parameters from both hands, aligned with the headset so that body rotation did not affect the recorded hand movements and the data accurately reflected the user's facing direction. Data processing used padding techniques to standardize the varied data lengths that resulted from differing recording durations. The interpretation algorithm was developed using Recurrent Neural Networks tailored to the characteristics of the data. Evaluation metrics comprised accuracy, validation accuracy, loss, validation loss, and the confusion matrix. Over 15 epochs, validation accuracy stabilized at 0.9951, showing consistent performance on unseen data. This research serves as a foundation for further studies on VR devices and other wearable gadgets that can function as sign language interpreters.
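
A minimal Keras sketch of the described pipeline appears below: variable-length recordings of 14 hand parameters per frame are zero-padded and fed to a recurrent classifier for ten gestures; the synthetic data, network size, and hyperparameters are assumptions, not the authors' exact setup.

# Hedged sketch: pad variable-length gesture recordings, classify with an RNN.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

rng = np.random.default_rng(0)
# 1021 recordings of varying length; each frame holds 14 hand parameters
sequences = [rng.normal(size=(rng.integers(30, 120), 14)) for _ in range(1021)]
labels = rng.integers(0, 10, size=1021)

X = tf.keras.preprocessing.sequence.pad_sequences(
    sequences, dtype="float32", padding="post")   # zero-pad to the longest length
y = tf.keras.utils.to_categorical(labels, num_classes=10)

model = models.Sequential([
    layers.Input(shape=(X.shape[1], 14)),
    layers.Masking(mask_value=0.0),   # ignore the zero padding
    layers.LSTM(64),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, validation_split=0.2, epochs=15, batch_size=32)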
