Gastric Anatomical Sites Recognition in Gastroscopic Images Based on Dual-branch Perception and Multi-scale Semantic Aggregation
- Research Article
- 10.1109/access.2022.3213675
- Jan 1, 2022
- IEEE Access
In existing image recognition algorithms, the position and sequence of image pixels are key factors affecting recognition accuracy. The topological invariance of complex networks therefore suggests that applying them to image analysis can greatly reduce the loss of classification accuracy when images are rotated, translated, or scaled. However, most studies on image classification with complex networks have focused on a single network, without modeling the dynamic evolution among networks. In this paper, we propose a new classification method that combines complex networks with convolutional neural networks (CNN) and trains the classifier using deep learning. We show that the method achieves high classification accuracy, yields distinct network features, and compares favorably with single-complex-network approaches. In addition, to make the degree histogram of the image more uniform and concentrated, the original formula for calculating the power value was optimized to reduce the influence of the radius parameter on the power value.
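A minimal sketch of the general complex-network image descriptor this line of work builds on: pixels become graph nodes, an edge links pixels that are spatially close (within a radius parameter) and similar in intensity, and the normalized degree histogram serves as the feature vector. The radius and threshold values here are illustrative, not the paper's.

```python
import numpy as np

def degree_histogram(img, radius=2.0, intensity_thresh=0.1, bins=8):
    """Pixel-graph descriptor: nodes are pixels; an edge joins two pixels
    whose Euclidean distance is <= radius and whose intensity difference
    is below intensity_thresh. The normalized degree histogram is returned."""
    h, w = img.shape
    coords = np.argwhere(np.ones((h, w), dtype=bool)).astype(float)  # row-major (y, x)
    vals = img.reshape(-1).astype(float)                             # matching order
    n = h * w
    degree = np.zeros(n, dtype=int)
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        neighbor = (d > 0) & (d <= radius) & (np.abs(vals - vals[i]) < intensity_thresh)
        degree[i] = neighbor.sum()
    hist, _ = np.histogram(degree, bins=bins, range=(0, degree.max() + 1))
    return hist / hist.sum()
```

Because a 90-degree rotation preserves all pairwise pixel distances and intensity differences, the resulting graph is isomorphic and the descriptor is unchanged, which is the topological invariance the abstract refers to.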
- Research Article
- 10.1016/j.bspc.2021.103167
- Sep 22, 2021
- Biomedical Signal Processing and Control
Channel separation-based network for the automatic anatomical site recognition using endoscopic images
- Front Matter
- 10.1111/exsy.12946
- Feb 24, 2022
- Expert systems
COVID-19 special issue: Intelligent solutions for computer communication-assisted infectious disease diagnosis.
- Conference Article
- 10.1109/icip.1999.819576
- Oct 24, 1999
An integral scheme providing a global eigen approach to face recognition in still images was presented by Lorente and Torres (1998). The scheme represents face images using so-called eigenfaces, generated by performing a principal component analysis (PCA). The database used was designed for still-image recognition, and the corresponding images were highly controlled: test images had controlled expression, orientation, and lighting variations. Preliminary results were shown using only one frontal-view image per person in the training set. In this paper, we present our first results on face recognition in video sequences. To that end, we have modified our original scheme so that it can cope with the varying face conditions present in a video sequence. The main and final objective is to develop a tool for the MPEG-7 standardization effort to support video indexing activities. The system is not yet fully automatic, but automatic facial point location is under development. Good results have been obtained on the video test sequences used by the MPEG-7 evaluation group.
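The eigenface representation described above can be sketched in a few lines: center the flattened face images, take the principal axes via SVD, and match faces by distance in the projected space. The random data and dimensions below are placeholders, not the authors' dataset.

```python
import numpy as np

def eigenfaces(faces, k):
    """faces: (n_images, n_pixels) matrix of flattened face images.
    Returns the mean face and the top-k eigenfaces (principal axes)."""
    mean = faces.mean(axis=0)
    centered = faces - mean
    # SVD of the centered data; rows of vt are the principal directions
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]

def project(face, mean, components):
    """Coordinates of a face in eigenface space."""
    return components @ (face - mean)

def nearest_identity(face, gallery, mean, components):
    """Match by Euclidean distance in eigenface space."""
    q = project(face, mean, components)
    dists = [np.linalg.norm(project(g, mean, components) - q) for g in gallery]
    return int(np.argmin(dists))
```

Extending this to video, as the paper does, amounts to projecting every frame and aggregating the per-frame matches.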
- Research Article
- 10.13201/j.issn.2096-7993.2023.01.002
- Jan 1, 2023
- Lin chuang er bi yan hou tou jing wai ke za zhi = Journal of clinical otorhinolaryngology, head, and neck surgery
Objective: To explore the automatic recognition and classification of 20 anatomical sites in laryngoscopy by an artificial intelligence (AI) quality control system using a convolutional neural network (CNN). Methods: Laryngoscopic image data archived from laryngoscopy examinations at the Department of Endoscopy, Cancer Hospital, Chinese Academy of Medical Sciences from January to December 2018 were collected retrospectively, and a CNN model was constructed using Inception-ResNet-V2 + SENet. Using 14,000 electronic laryngoscope images as the training set, the images were classified into 20 specific anatomical sites covering the whole head and neck, and performance was tested on 2,000 laryngoscope images and 10 laryngoscope videos. Results: The trained CNN model recognized each laryngoscopic image in (20.59 ± 1.55) ms on average, and the overall accuracy for the 20 anatomical sites was 97.75% (1955/2000), with average sensitivity, specificity, positive predictive value, and negative predictive value of 100%, 99.88%, 97.76%, and 99.88%, respectively. The model had an accuracy of ≥ 99% for identifying the 20 anatomical sites in laryngoscopic videos. Conclusion: This study confirms that the CNN-based AI system can accurately and quickly classify and identify anatomical sites in laryngoscopic pictures and videos; it can be used for quality control of photo documentation in laryngoscopy and shows potential for monitoring the performance of laryngoscopy.
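The per-class sensitivity, specificity, PPV, and NPV reported in studies like this are computed one-vs-rest from a multi-class confusion matrix; a small sketch (the toy counts below are illustrative, not the study's data):

```python
import numpy as np

def one_vs_rest_metrics(cm, cls):
    """Per-class metrics from a multi-class confusion matrix where
    cm[i, j] = count of samples of true class i predicted as class j."""
    tp = cm[cls, cls]
    fn = cm[cls].sum() - tp          # true cls, predicted elsewhere
    fp = cm[:, cls].sum() - tp       # other classes predicted as cls
    tn = cm.sum() - tp - fn - fp     # everything else
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }
```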
- Research Article
- 10.1038/s41598-021-01520-y
- Nov 26, 2021
- Scientific Reports
Sequence recognition of natural scene images has always been an important research topic in the field of computer vision. CRNN has proven to be a popular end-to-end character sequence recognition network. However, the problem of wide characters is not considered in the original CRNN, which is less effective at recognizing long, dense sequences of small characters. To address these shortcomings, we propose an improved CRNN network, named CRNN-RES, based on BiLSTM and multiple receptive fields. Specifically, on the one hand, CRNN-RES uses a dual pooling core to enhance the CNN's feature-extraction ability. On the other hand, the last RNN layer is improved by replacing the BiLSTM with a shared-parameter BiLSTM using recursive residuals, which reduces the number of network parameters and improves accuracy. In addition, we designed a structure that can flexibly configure the length of the input data sequence in the RNN layer, called the CRFC layer. Comparing the proposed CRNN-RES network with the original CRNN, extensive experiments show that when recognizing English characters and numbers, CRNN-RES has 8,197,549 parameters, 133,752 fewer than CRNN. On the public datasets ICDAR 2003 (IC03), ICDAR 2013 (IC13), IIIT 5k-word (IIIT5k), and Street View Text (SVT), CRNN-RES obtains accuracies of 96.90%, 89.85%, 83.63%, and 82.96%, higher than CRNN by 1.40%, 3.15%, 5.43%, and 2.16%, respectively.
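The parameter savings from a shared-parameter BiLSTM come from reusing one set of LSTM weights for both directions instead of holding two independent sets; a quick arithmetic sketch (the layer sizes are illustrative, not the paper's exact architecture):

```python
def lstm_params(n_in, n_hidden):
    """Parameters of one LSTM direction: four gates, each with
    input weights, recurrent weights, and a bias vector."""
    return 4 * (n_in * n_hidden + n_hidden * n_hidden + n_hidden)

def bilstm_params(n_in, n_hidden, shared=False):
    """A standard BiLSTM holds two independent directions;
    a shared-parameter BiLSTM reuses one weight set for both."""
    per_direction = lstm_params(n_in, n_hidden)
    return per_direction if shared else 2 * per_direction

# Sharing halves the RNN-layer parameter count at these sizes.
standard = bilstm_params(256, 256)
shared = bilstm_params(256, 256, shared=True)
```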
- Research Article
- 10.1002/ecjc.20106
- Apr 26, 2004
- Electronics and Communications in Japan (Part III: Fundamental Electronic Science)
This paper considers the recognition of moving images and proposes a new framework in which local features, global structure, and motion information are handled comprehensively. The method is applied to the extraction of facial expressions and its effectiveness is demonstrated. In most conventional moving-image recognition methods, recognition is performed on the basis of the sequence of recognition results for individual frames. However, the authors believe that motion information should be utilized more actively in the recognition of the individual frames. When time-series data are combined into the recognition of a still image, however, the amount of information is tremendous, making the processing complicated. The method proposed in this paper extends the concept of the labeled graph matching method, which has been used for still images, to moving images. The proposed method handles sparse graphs and can prevent an increase in the amount of computation. By dynamically adjusting the features to be handled according to the stage of processing, complex processing in dynamic image recognition can be integrated in a simple and straightforward way. As practical examples, the human head and parts of the head are extracted, indicating the effectiveness of the proposed method.
- Book Chapter
- 10.1007/978-1-4757-3186-6_3
- Jan 1, 2002
Recognizing objects and interpreting the meaning of static or mobile configurations of objects is a problem that must be solved in many applications, such as medical diagnosis, autonomous mobile systems, or remote sensing. Any system for the recognition and interpretation of images, image sequences, or sensor signals in general implicitly or explicitly makes use of a priori knowledge about the origin and properties of the image; about the objects, scenes, and events visible in an image; about the requirements of the user or the application; and about actions and conclusions that may be inferred from the image content. In a model-based approach to image recognition and/or interpretation, all or at least a significant amount of this a priori knowledge is represented explicitly in a model. The model may contain declarative knowledge, that is, knowledge about structural properties, and procedural knowledge, that is, procedures computing certain attributes of the structural components. In the most general case, an algorithm is provided which computes an interpretation based on the input image and the available model. Basically, this algorithm specifies which procedural knowledge to activate and which intermediate results to use for further processing. In this sense the algorithm computes a processing strategy, or controls the interpretation process, and hence will be referred to as the control algorithm. Two approaches to modeling, semantic models and statistical models, are treated in Section 3.2 and Section 3.3, respectively.
- Book Chapter
- 10.1007/978-3-642-25085-9_1
- Jan 1, 2011
The patterns in collections of real-world objects are often not based on a limited set of isolated properties such as features. Instead, the totality of their appearance constitutes the basis of human pattern recognition. Structural pattern recognition aims to find explicit procedures that mimic the learning and classification performed by human experts in well-defined and restricted areas of application. This is often done by defining dissimilarity measures between objects and measuring them between training examples and new objects to be recognized. The dissimilarity representation offers the possibility of applying tools developed in machine learning and statistical pattern recognition to learn from structural object representations such as graphs and strings. These procedures also apply to the recognition of histograms, spectra, images, and time sequences, taking into account the connectivity of samples (bins, wavelengths, pixels, or time samples). The topic of dissimilarity representation is related to the field of non-Mercer kernels in machine learning, but it covers a wider set of classifiers and applications. Recently much progress has been made in this area, and many interesting applications have been studied in medical diagnosis, seismic and hyperspectral imaging, chemometrics, and computer vision. This review paper offers an introduction to the field and presents a number of real-world applications. Keywords: Pattern Recognition; Dynamic Time Warping; Dissimilarity Measure; Dissimilarity Matrix; Linear Support Vector Machine.
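A minimal sketch of the dissimilarity-representation idea for strings: embed each object as its edit distances to a fixed set of prototypes, then classify with an ordinary nearest-neighbor rule in that vector space. The prototypes and toy labels below are invented for illustration.

```python
import numpy as np

def edit_distance(a, b):
    """Classic Levenshtein dynamic program."""
    d = np.zeros((len(a) + 1, len(b) + 1), dtype=int)
    d[:, 0] = np.arange(len(a) + 1)
    d[0, :] = np.arange(len(b) + 1)
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d[i, j] = min(d[i - 1, j] + 1,          # deletion
                          d[i, j - 1] + 1,          # insertion
                          d[i - 1, j - 1] + (a[i - 1] != b[j - 1]))  # substitution
    return int(d[-1, -1])

def dissimilarity_vector(x, prototypes):
    """Embed a structural object as its distances to fixed prototypes."""
    return np.array([edit_distance(x, p) for p in prototypes])

def classify_1nn(x, train, labels, prototypes):
    """1-NN in the dissimilarity space: any vector-space classifier works here."""
    v = dissimilarity_vector(x, prototypes)
    vecs = np.array([dissimilarity_vector(t, prototypes) for t in train])
    return labels[int(np.argmin(np.linalg.norm(vecs - v, axis=1)))]
```

Once objects live in the dissimilarity space, standard statistical classifiers (linear SVMs, for instance) can be trained on graphs or strings without requiring a Mercer kernel.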
- Conference Article
- 10.1109/cbmsys.1990.109378
- Jun 3, 1990
An approach to automatic prediction and detection of ovulation is described. It is based on the application of image processing techniques to the cervical mucus fern test, a popular clinical diagnostic method. The sequence of histogram equalization, filtering, edge detection, binarization, labeling, thinning, Hough transform, and automatic pattern recognition in a feature space is applied to microscopic images of the ferning patterns. This method permits decisions to be made based on quantitative data instead of the subjective evaluations that are presently used.
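The first stage of the pipeline described above, histogram equalization, can be sketched in a few lines of NumPy (the later filtering, thinning, and Hough stages are omitted; the lookup-table formulation here is the standard textbook one, not necessarily the paper's exact implementation):

```python
import numpy as np

def equalize_histogram(img):
    """Spread an 8-bit grayscale image over the full 0-255 range by
    mapping each gray level through the normalized cumulative histogram."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # first nonzero CDF value
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255).astype(np.uint8)
    return lut[img]                    # apply the lookup table pixel-wise
```

Equalizing first makes the subsequent edge detection and binarization thresholds far less sensitive to the microscope's illumination.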
- Research Article
- 10.1016/j.bpc.2007.10.016
- Nov 12, 2007
- Biophysical Chemistry
Site localization of membrane-bound proteins on whole cell level using atomic force microscopy
- Research Article
- 10.1088/1742-6596/100/5/052040
- Mar 1, 2008
- Journal of Physics: Conference Series
In this study we present a molecular recognition method based on force spectroscopy analysis of biological markers at the whole-cell level. The method allows recognition of specific cell-surface proteins and receptor sites with nanometer-level accuracy. Here we demonstrate specific recognition of membrane-bound osteopontin (OPN) sites over a whole preosteogenic cell membrane. By merging the specific force-detection map of the proteins with a topography image of the cell, we create a new image (the recognition image) that shows the exact locations of the proteins relative to the cell membrane. The recognition results indicate a strong affinity between the modified tip and the target molecules, which enables the use of AFM as a remarkable nanoscale tracking tool at the whole-cell level.
- Conference Article
- 10.1109/cme55444.2022.10063301
- Nov 4, 2022
In medical image diagnosis, the effectiveness of pattern recognition methods is an important factor in the accurate diagnosis and evaluation of diseases. In recent years image acquisition has improved significantly, with devices acquiring data at faster rates and higher resolutions, but without a sufficiently accurate image diagnosis method for gastric diseases, the requirements of clinical applications cannot be met. This paper therefore proposes using particle swarm optimization (PSO) to improve the extreme learning machine (ELM) to achieve effective recognition of gastroscopic images, providing an intelligent diagnosis method for medical disease diagnosis. In our experiments, PSO-ELM obtains better classification performance: classification accuracy improved from 85% to 95%, and the precision rate from 86.67% to 95.83%, offering an intelligent method to effectively improve the diagnosis of gastric diseases.
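A minimal sketch of the basic ELM the method builds on: the hidden layer is random and fixed, and only the output weights are solved in closed form by least squares. In PSO-ELM the swarm would search over the hidden-layer weights instead of leaving them random; that outer loop is omitted here, and the toy two-cluster data is a stand-in for gastroscopic image features.

```python
import numpy as np

class ELM:
    """Extreme Learning Machine: random fixed hidden layer,
    output weights fitted by least squares (pseudo-inverse)."""
    def __init__(self, n_hidden=50, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def fit(self, X, y):
        n_features = X.shape[1]
        # random input weights and biases -- PSO would optimize these
        self.W = self.rng.standard_normal((n_features, self.n_hidden))
        self.b = self.rng.standard_normal(self.n_hidden)
        H = np.tanh(X @ self.W + self.b)         # hidden activations
        T = np.eye(int(y.max()) + 1)[y]          # one-hot targets
        self.beta = np.linalg.pinv(H) @ T        # closed-form output weights
        return self

    def predict(self, X):
        H = np.tanh(X @ self.W + self.b)
        return np.argmax(H @ self.beta, axis=1)
```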
- Conference Article
- 10.1109/cisp.2012.6469678
- Oct 1, 2012
Video background estimation reconstructs the original background image from a video image sequence and is the foundation of video sequence analysis and recognition. A new belief propagation (BP) algorithm for background estimation based on local maximum-weight matching is proposed. The algorithm performs correlation matching on the pixels to obtain locally optimal message-passing paths, passes messages along those paths, computes per-pixel information with the BP algorithm, and selects the pixels with minimal information as background pixels, achieving efficient estimation of the video background. The algorithm decreases the computational complexity of BP and effectively avoids the problem of over-smoothing. It can also track illumination changes in a timely manner to update the background. The experimental results verify the effectiveness of the algorithm.
- Conference Article
- 10.1145/3338840.3355648
- Sep 24, 2019
For construction-site image understanding, object detection and recognition are the most important tasks. On construction sites with electrical equipment, the scene needs to be monitored carefully to avoid accidents. In our work, an anomaly detection method based on cloud computation is proposed. The method consists of a one-stage deep learning object detection model and one-class classification: the one-stage detector detects and recognizes the objects in the scene, and a one-class SVM then flags the abnormal regions. The proposed algorithm has been tested on several scenes from real construction sites and achieves good results in practice.
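The one-class SVM stage described above can be sketched with scikit-learn: fit on features of normal scenes only, then `predict` returns -1 for regions that look abnormal. The random vectors below are hypothetical stand-ins for whatever features the detector would emit, not the paper's data.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Hypothetical detector features: vectors describing detected regions.
rng = np.random.default_rng(0)
normal = rng.standard_normal((200, 4))   # features from normal scenes
novel = normal + 8.0                     # far-off "abnormal" region features

# nu bounds the fraction of training points treated as outliers.
clf = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(normal)
pred = clf.predict(novel)                # -1 flags anomalies, +1 is normal
```

Training on normal scenes only is what makes the approach practical here: abnormal events on a construction site are too rare and varied to collect as a labeled class.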