HMAX Model Research Articles

HMAX is a well-known computational model of visual recognition in cortex consisting of just two computational operations – a “template match” and non-linear pooling – alternating in a feedforward hierarchy in which receptive fields exhibit increasing specificity and invariance [1]. Interestingly, auditory recognition problems (such as speech recognition) share similar computational requirements, and recent work in auditory neuroscience suggests that auditory and visual cortex share similar anatomical and functional organization. Based on these similarities, we tested whether HMAX could support an auditory recognition task (specifically, word spotting). To test HMAX on word spotting, recorded speech samples from the TIMIT corpus [2] were first converted into time-frequency spectrograms using a computational model of the auditory periphery [3]. These spectrograms were then split into 750 ms frames and input to a standard HMAX model [4]. Based on observed similarities between the receptive fields in primary auditory cortex (spectro-temporal receptive fields, or STRFs) and primary visual cortex (typically modeled as oriented Gabor filters), we used S1 filters identical to those used in vision [4]. Similarly, S2 “patches” were randomly selected from C1 representations of speech sounds drawn from an independent speech corpus. One-vs.-all linear support vector machines (SVMs) were then trained to discriminate frames that contained a target word from those that did not. These SVMs were then tested on a novel set of test sentences using a sliding-frame approach (750 ms frame size, 20 ms step size). For each frame in a sentence, the SVM produced a distance from the hyperplane, and a threshold value was applied to produce a binary classification of whether or not the target word was present in the sentence.
When tested on target words that appeared in a fixed context (i.e., SA sentences in TIMIT), performance was highly robust, with ROC areas consistently above 0.9. When tested on target words that appeared in variable contexts (i.e., SI sentences in TIMIT), performance decreased somewhat, with ROC areas around 0.8. This decrease is likely due to the inclusion of “clutter” (i.e., target-irrelevant features) within the frame, an effect also commonly observed when HMAX is applied to visual object recognition tasks [1]. These results are novel in that they support the hypothesis that the simple computational framework implemented in HMAX – a feedforward hierarchy of only two alternating computational operations – may generalize beyond vision to support auditory recognition as well. It is possible that such a representation could give rise to stable neural encodings that are invariant to behaviorally irrelevant characteristics, as seen in higher-order visual and auditory cortices [5,6]. While this auditory version of the HMAX model would likely benefit from more auditory-specific filters based on STRF models [7], the Gabor features used here are largely compatible with previous computational models based on STRFs up to the level of primary auditory cortex [8]. Additional benefit may also be gained by learning sparse representations from natural sounds at both the S1 and S2 levels [9].
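
The sliding-frame test described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the names `frame_starts`, `spot_word`, and `toy_score` are hypothetical, the spectrogram is assumed to have one column per millisecond, and the sentence-level decision is assumed to be "the largest per-frame score exceeds the threshold", which the abstract does not spell out. The toy scorer stands in for the HMAX features plus the SVM distance from the hyperplane.

```python
from typing import Callable, List, Sequence, Tuple

Frame = Sequence[Sequence[float]]  # list of time columns, each a list of frequency bins


def frame_starts(n_columns: int, frame_len: int, step: int) -> List[int]:
    """Start index of every full frame that fits inside the signal."""
    if n_columns < frame_len:
        return []
    return list(range(0, n_columns - frame_len + 1, step))


def spot_word(
    columns: Frame,
    score_frame: Callable[[Frame], float],
    frame_len: int,
    step: int,
    threshold: float,
) -> Tuple[bool, List[float]]:
    """Score each frame, then flag the sentence if any score passes the threshold.

    ASSUMPTION: taking the maximum over frames is one plausible way to turn
    per-frame scores into a sentence-level decision.
    """
    starts = frame_starts(len(columns), frame_len, step)
    scores = [score_frame(columns[s:s + frame_len]) for s in starts]
    detected = bool(scores) and max(scores) > threshold
    return detected, scores


def toy_score(frame: Frame) -> float:
    """Hypothetical stand-in for the SVM distance: mean of the first bin minus 0.5."""
    return sum(col[0] for col in frame) / len(frame) - 0.5


# A 2 s toy "sentence" (assumed 1 column per ms) with a 750 ms burst at 600 ms.
sentence = [[1.0] if 600 <= t < 1350 else [0.0] for t in range(2000)]
detected, scores = spot_word(sentence, toy_score, frame_len=750, step=20, threshold=0.4)
print(detected, len(scores))
```

Sweeping `threshold` over a range of values and recording hit and false-alarm rates across sentences is what would trace out the ROC curves the abstract reports.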

This paper applies an improved HMAX algorithm to vehicle type (make and model) recognition. The main innovation is the use of the Itti saliency model to select salient points in vehicle images when building the image library. By combining sensitivity to color, intensity, and orientation, the method constructs a more representative template library, which improves the recognition rate. In addition, redundant templates are eliminated by computing the variance of the response values between each template and the different images, which reduces recognition time. Analysis and experiments show that the improved HMAX algorithm performs vehicle type recognition effectively: its recognition rate is 1% to 2% higher than that of the original HMAX model and about 5% to 10% higher than that of other existing vehicle type recognition methods, approaching 95% when enough features are extracted. With the template screening method added, the improved algorithm keeps the recognition rate essentially unchanged relative to the original HMAX model while reducing recognition time to about one quarter of the original. In the optimal configuration selected by the benefit-value evaluation, the recognition rate is about 92% and the recognition time is 0.6 s per image, both improvements over the original method.
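
The variance-based template screening mentioned in this abstract might look roughly like the sketch below. The abstract does not state the selection rule, so this assumes that templates whose responses barely change across images are the redundant ones and keeps the highest-variance fraction. The function name `prune_templates` and the `keep_fraction` parameter are hypothetical, and the response values are made-up toy numbers.

```python
from statistics import pvariance
from typing import List, Sequence, Tuple


def prune_templates(
    responses: Sequence[Sequence[float]], keep_fraction: float
) -> Tuple[List[int], List[float]]:
    """Keep the templates whose responses vary most across images.

    responses[i][j] is the response of template j to image i.
    ASSUMPTION: low variance across images marks a template as redundant.
    """
    n_templates = len(responses[0])
    # Variance of each template's response, taken over all images.
    variances = [
        pvariance([row[j] for row in responses]) for j in range(n_templates)
    ]
    n_keep = max(1, round(keep_fraction * n_templates))
    # Rank template indices from most to least variable, keep the top n_keep.
    ranked = sorted(range(n_templates), key=lambda j: variances[j], reverse=True)
    kept = sorted(ranked[:n_keep])
    return kept, variances


# Toy data: 3 images x 4 templates.
toy_responses = [
    [0.5, 0.1, 0.4, 0.0],
    [0.5, 0.9, 0.5, 1.0],
    [0.5, 0.5, 0.6, 0.0],
]
kept, variances = prune_templates(toy_responses, keep_fraction=0.5)
print(kept)
```

Fewer templates means fewer template matches per image at recognition time, which is the source of the speedup the abstract describes.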

Articles published on HMAX Model

There Is a "U" in Clutter: Evidence for Robust Sparse Codes Underlying Clutter Tolerance in Human Vision.

Enhanced HMAX model with feedforward feature learning for multiclass categorization.

Modeling object recognition in visual cortex using multiple firing k-means and non-negative sparse coding

Combining two visual cortex models for robust face recognition

Visual dictionaries as intermediate features in the human brain.

A hierarchical model of vision (HMAX) can also recognize speech

Classification and identification of vehicle type and make by cortex-like image descriptor HMAX

Introducing Memory and Association Mechanism Into a Biologically Inspired Visual Model

Modeling guidance and recognition in categorical search: bridging human and computer object detection.

Fast neuromimetic object recognition using FPGA outperforms GPU implementations.

C-HMAX: Artificial cognitive model inspired by the color vision mechanism of the human brain

Exploiting temporal continuity of views to learn visual object invariance.

Notice of retraction: 'The emergence of orthographic word representations in the brain: evaluating a neural shape-based framework using fMRI and the HMAX model' by Wouter Braet, Jonas Kubilius, Johan Wagemans and Hans P. Op de Beeck. doi:10.1093/Cercor/bhs355, published online November 16, 2012.

Extended Coding and Pooling in the HMAX Model

Evaluating a neural shape-based framework for the emergence of visual word form representations using fMRI and the HMAX model

A stable biologically motivated learning mechanism for visual feature extraction to handle facial categorization.

Texture Feature Extraction Inspired By Natural Vision System And Hmax Algorithm

2P1-J04 A brain-like visual system for scale- and position-invariant object recognition (Robot Vision (2))

Application of vehicle type recognition based on an improved HMAX algorithm
