Rapid biasing effect of prior auditory contexts on bistable tritone perception

Abstract

The tritone paradox is a bistable auditory phenomenon in which a pair of Shepard tones can be interpreted as either ascending or descending. Previous studies have demonstrated that preceding auditory context can bias the direction of tritone perception. Here, we systematically manipulated both the number (from 1 to 10) and the type (higher than, lower than, or the same as the first target tone, or silence) of context tones presented before a target tritone pair. We found that the contextual biasing effect can emerge with as few as one or two context tones and plateaus quickly within this small window. Notably, low-frequency context tones produced a more pronounced and immediate bias than high-frequency tones. Together, these results demonstrate a narrow window for the auditory context effect, in which minimal contextual cues suffice to guide the perceptual interpretation of ambiguous auditory stimuli. The findings pave the way for more detailed investigations into the cognitive mechanisms of auditory perception, emphasizing the swift influence of immediate auditory contexts on perceptual outcomes.
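For readers unfamiliar with the stimuli: a Shepard tone can be synthesized as a sum of octave-spaced sinusoids whose amplitudes follow a bell-shaped envelope over log-frequency, and a tritone pair is two such tones six semitones apart. A minimal pure-Python sketch follows; all parameter values are illustrative assumptions, not those used in the study:

```python
import math

def shepard_tone(pitch_class, sr=22050, dur=0.25, center=261.0, sigma=1.0):
    """One Shepard tone: octave-spaced partials whose amplitudes follow a
    Gaussian envelope over log2 frequency, centered on `center` Hz.
    All parameter defaults are illustrative, not taken from the study."""
    base = 27.5 * 2 ** (pitch_class / 12.0)   # lowest partial for this chroma
    n_samples = int(sr * dur)
    samples = [0.0] * n_samples
    f = base
    while f < sr / 2:                          # add partials up to Nyquist
        amp = math.exp(-(math.log2(f / center)) ** 2 / (2 * sigma ** 2))
        for n in range(n_samples):
            samples[n] += amp * math.sin(2 * math.pi * f * n / sr)
        f *= 2
    peak = max(abs(s) for s in samples)
    return [s / peak for s in samples]

# A tritone pair: two Shepard tones six semitones apart (pitch classes 0 and 6),
# the ambiguous stimulus used in tritone-paradox experiments.
pair = [shepard_tone(0), shepard_tone(6)]
```

Because the envelope fixes overall pitch height while the partials repeat every octave, the second tone of the pair has no unambiguous "up" or "down" relation to the first, which is what makes the percept bistable.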

Similar Papers
  • Book Chapter
  • 10.1007/978-3-319-03731-8_56
Auditory Context Recognition Combining Discriminative and Generative Models
  • Jan 1, 2013
  • Feng Su + 1 more

The paper considers the task of recognizing the category of the context surrounding an audio sensor. Owing to the unstructured and diverse nature of auditory contexts and their constituent environmental sounds, which differ from structured audio data such as speech or music, auditory context recognition faces many difficulties, and relatively few studies have addressed it. In this paper, we propose an ensemble recognition scheme based on the Hough forest framework for unstructured auditory contexts, which combines discriminative and generative modeling of the context. We learn an effective audio feature representation for environmental sounds in the context with the LDB algorithm and recognize the context using a Hough forest based ensemble classifier, which aggregates both segmental and contextual probabilistic votes on the context category cast by the segments of the auditory context. The experimental results demonstrate the effectiveness of the proposed approach for auditory context recognition.

  • Research Article
  • Cited by 12
  • 10.3758/s13414-013-0527-9
Effects of temporal asynchrony and stimulus magnitude on competitive audio–visual binding
  • Aug 17, 2013
  • Attention, Perception, & Psychophysics
  • Jonathan M P Wilbiks + 1 more

When making decisions as to whether or not to bind auditory and visual information, temporal and stimulus factors both contribute to the presumption of multimodal unity. In order to study the interaction between these factors, we conducted an experiment in which auditory and visual stimuli were placed in competitive binding scenarios, whereby an auditory stimulus was assigned to either a primary or a secondary anchor in a visual context (VAV) or a visual stimulus was assigned to either a primary or secondary anchor in an auditory context (AVA). Temporal factors were manipulated by varying the onset of the to-be-bound stimulus in relation to the two anchors. Stimulus factors were manipulated by varying the magnitudes of the visual (size) and auditory (intensity) signals. The results supported the dominance of temporal factors in auditory contexts, in that effects of time were stronger in AVA than in VAV contexts, and stimulus factors in visual contexts, in that effects of magnitude were stronger in VAV than in AVA contexts. These findings indicate the precedence for temporal factors, with particular reliance on stimulus factors when the to-be-assigned stimulus was temporally ambiguous. Stimulus factors seem to be driven by high-magnitude presentation rather than cross-modal congruency. The interactions between temporal and stimulus factors, modality weighting, discriminability, and object representation highlight some of the factors that contribute to audio-visual binding.

  • Conference Article
  • Cited by 9
  • 10.1109/icassp.2012.6288386
Auditory context classification using random forests
  • Mar 1, 2012
  • Li Yang + 1 more

High-level semantic information can be extracted from audio materials to facilitate various content-based analysis and context-awareness applications. In this paper, we propose a novel automatic auditory context classification method, which combines the characterization of audio events and the inference of auditory context category in a single ensemble analysis framework. In the proposed framework, key audio events in the context are characterized by composite features from discriminative representation models (local discriminant bases, pseudo-semantic and bag-of-audio-words) learned from samples. A random forest based ensemble learning and classification model is employed for auditory contexts, in which individual segments of audio stream are classified and aggregated by Hough voting or bagging to form the final context category. The effectiveness of the proposed approach is demonstrated by the experimental results.

  • Research Article
  • 10.1121/1.4744755
Sample discrimination of target-tone duration with variable-duration context tones
  • May 1, 2001
  • The Journal of the Acoustical Society of America
  • Donna L Neff + 1 more

This study examined the ability of three normal-hearing listeners to discriminate duration differences in target tones with and without variable-duration context tones in frequency regions remote from the target. Single target tones at 500 ("low"), 1260 ("middle"), or 3176 Hz ("high") were combined with one or two context tones in nontarget regions (e.g., a low target paired with a high context). Across intervals in the 2AFC task, the target tone was drawn from Gaussian distributions having a mean duration of 100 or 120 ms. Listeners were to select the target drawn from the distribution with the longer mean duration. Context-tone duration was sampled from this "correct" distribution, so duration alone did not cue the targets. Overall performance and perceptual weights for individual tones were examined. Frequency had no effect on performance for targets alone; however, there were interactive effects of target and context frequency. The worst performance was observed for middle targets with flanking context tones. Flanking context tones were effective in combination but relatively ineffective individually. For low and high targets, the context tone nearer the target frequency was most effective, but both influenced performance. Comparisons to similar conditions with frequency sample discrimination will be discussed. [Work supported by NIH.]

  • Research Article
  • Cited by 2
  • 10.3389/fpsyg.2018.01590
Pitch Class and Envelope Effects in the Tritone Paradox Are Mediated by Differently Pronounced Frequency Preference Regions
  • Sep 28, 2018
  • Frontiers in Psychology
  • Stephanie Malek

Shepard tones (octave complex tones) are well defined in pitch chroma but are ambiguous in pitch height. Pitch direction judgments of Shepard tones depend on the clockwise distance of the pitch classes on the pitch class circle, indicating the proximity principle in auditory perception. The tritone paradox emerges when two Shepard tones that form a tritone interval are presented successively. In this case, no proximity cue is available, and judgments depend on the first tone and vary from person to person. A common explanation for the tritone paradox is the assumption of a specific pitch class comparison mechanism based on a pitch class template that is oriented differently from person to person. In contrast, psychoacoustic approaches (e.g., Terhardt's virtual pitch theory) explain it with common pitch-processing mechanisms. The present paper proposes a probabilistic threshold model, which estimates Shepard tone pitch height by probabilistic fundamental frequency extraction. In the first processing stage, only those frequency components whose amplitudes exceed randomly distributed threshold values, whose expected values are determined by a threshold function, are selected for further processing. The lowest of these unfiltered components determines the pitch height. The model is designed for tone pairs and provides occurrence probabilities for descending judgments. In a pitch-matching pretest, 12 Shepard tones (generated under a cosine envelope centered at 261 Hz) were compared to pure tones whose frequencies were adjusted by an up-down staircase method. Matched frequencies corresponded to frequency components but were ambiguous in octave position. To test the model, Shepard tones were generated under six cosine envelopes centered over a wide frequency range (65.41, 261, 370, 440, 523.25, 1244.51 Hz). The model predicted pitch class effects and envelope effects: steep threshold functions caused pronounced pitch class effects, whereas flat threshold functions caused pronounced envelope effects. The model provides an alternative explanation to the pitch class template theory and serves as a psychoacoustic framework for the perception of Shepard tones.
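The probabilistic threshold model summarized in this abstract lends itself to a short Monte Carlo sketch: each partial survives only if its amplitude exceeds a random threshold whose expected value comes from a threshold function, and the lowest survivor sets the pitch height. The envelope, threshold function, and noise spread below are illustrative assumptions, not the paper's fitted values:

```python
import math, random

def surviving_fundamental(partials, thresh_fn, rng, spread=0.1):
    """partials: list of (freq_hz, amplitude). A partial survives if its
    amplitude exceeds a Gaussian random threshold with mean thresh_fn(freq);
    the lowest survivor determines the perceived pitch height."""
    kept = [f for f, a in partials if a > rng.gauss(thresh_fn(f), spread)]
    return min(kept) if kept else None

def p_descending(tone1, tone2, thresh_fn, trials=2000, seed=1):
    """Monte Carlo estimate of the probability that tone2 is judged
    lower than tone1 (a 'descending' response)."""
    rng = random.Random(seed)
    down = valid = 0
    for _ in range(trials):
        f1 = surviving_fundamental(tone1, thresh_fn, rng)
        f2 = surviving_fundamental(tone2, thresh_fn, rng)
        if f1 is not None and f2 is not None:
            valid += 1
            down += f2 < f1
    return down / valid if valid else 0.0

# Illustrative octave-complex tones under a log-Gaussian envelope at 261 Hz.
env = lambda f: math.exp(-(math.log2(f / 261.0)) ** 2 / 2.0)
tone = lambda base: [(base * 2 ** k, env(base * 2 ** k)) for k in range(8)]
# Assumed threshold function: thresholds rise toward low frequencies,
# so low partials are filtered out more often.
thresh = lambda f: 0.6 - 0.05 * math.log2(f / 65.0)

p = p_descending(tone(65.41), tone(92.5), thresh)  # a tritone-apart pair
```

Varying the slope of `thresh` is the knob the model description points at: a steep function makes which partial survives depend mainly on pitch class, while a flat one lets the spectral envelope dominate.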

  • Research Article
  • 10.1121/1.427342
Factors accounting for performance in a frequency-sample-discrimination task with distracters
  • Oct 1, 1999
  • The Journal of the Acoustical Society of America
  • Donna L Neff + 3 more

This paper presents results of fitting several linear models, including the CoRE model of Lutfi [R. A. Lutfi and K. A. Doherty, J. Acoust. Soc. Am. 96, 3443–3450 (1994)] to data from a study of the interaction of target and distracter tones in a sample-discrimination task. In this 2IFC task, ten listeners judged which of two pairs of target tones were drawn from the higher of two overlapping frequency distributions. After training with targets alone, two distracter tones were presented simultaneously with the target tones, with each distracter placed in a frequency region remote from the target region. The important variables were the frequency regions and degree of frequency variability of targets versus distracters. Main features of the data included a large range of individual differences, a dominance of the effects of lower-frequency context tones over the higher-frequency context tones in many conditions, and differences in performance in the baseline, no-context conditions. When stimulus variability, perceptual weights across frequency, and performance in the no-context conditions are included as factors in the analysis, both mean and individual performance can be described reasonably well. [Work supported by NIDCD.]

  • Research Article
  • 10.1121/1.422517
Interactions of pairs of target and context tones based on relative frequency relation and degree of frequency uncertainty
  • May 1, 1998
  • The Journal of the Acoustical Society of America
  • Donna L Neff + 1 more

Four normal-hearing listeners completed 2IFC sample-discrimination tasks for frequency, in which they judged which of two tone pairs was drawn from the higher of two Gaussian frequency distributions. The center frequencies of the two target distributions were placed at low (400/500 Hz), middle (1128/1410 Hz), or high (3200/4000 Hz) frequency regions. All distributions were equally spaced and equivariant on a logarithmic frequency scale. Pairs of extraneous (to-be-ignored) context tones were added above, flanking, and below the low-, middle-, and high-frequency target regions, respectively. Across conditions, target and context tones used the same frequency regions, but never occupied the same frequency region within condition. Context tones were fixed in frequency, varied between two known frequencies, or were Gaussian distributed. A level jitter was added to derive perceptual weighting functions across stimuli. The results showed little detrimental effect of fixed-frequency context tones, but large individual differences in the effects of known pairs of tones or tones with Gaussian variation. Weighting functions and patterns of frequency interactions among stimuli will be discussed. [Work supported by NIDCD.]

  • Research Article
  • Cited by 2
  • 10.3389/fpsyg.2018.00677
Aftereffects of Spectrally Similar and Dissimilar Spectral Motion Adaptors in the Tritone Paradox.
  • May 8, 2018
  • Frontiers in Psychology
  • Stephanie Malek + 1 more

Shepard tones consist of octave-spaced components, whose amplitudes are generated under a fixed bell-shaped spectral envelope. They are well defined in pitch chroma, but generate octave confusions that in turn can produce ambiguities in the perceived relative pitch heights when their chromas are exactly a tritone apart (the tritone paradox). This study examined the effects of tonal context on relative pitch height judgments using adaptor sequences followed by target sequences (pairs of Shepard tones of different chromas separated by a tritone). Listeners judged whether the second target Shepard tone was higher or lower than the first. Adaptor sequences consisted of rising or falling scales (43 s at the beginning of each block, 4 s before each target sequence). Two sets of Shepard tones were used for adaptors and targets, generated under spectral envelopes centered at either A3 (220 Hz) or C6 (1,046 Hz). Pitch direction judgments (rising vs. falling) to spectrally consistent (A3–A3, C6–C6) and inconsistent (A3–C6, C6–A3) adaptor-target combinations were studied. Large significant contrastive aftereffects (0.08–0.21 change in fraction of pitch direction responses) were found only for the Shepard tones that were judged as higher in the control condition (judgments about the target sequences without adaptor sequences) for the consistent adaptor-target conditions (A3–A3, C6–C6). The experiments rule out explanations based on non-sensory decision making processes. Possible explanations in terms of perceptual aftereffects caused by adaptation in central auditory frequency-motion detectors are discussed.

  • Research Article
  • Cited by 14
  • 10.1016/j.heares.2013.03.003
School-age children's environmental object identification in natural auditory scenes: Effects of masking and contextual congruence
  • Mar 18, 2013
  • Hearing Research
  • Saloni Krishnan + 3 more


  • Research Article
  • Cited by 3
  • 10.1515/psych-2020-0101
Sounds in the classroom: Auditory context-dependent memory in school-aged children
  • Jun 8, 2020
  • Open Psychology
  • Anna L Ostendorf + 2 more

Context-dependent memory (CDM) is the effect whereby information is retrieved more accurately in the presence of the contextual information that was present during encoding than in the absence of that contextual information. Most previous CDM experiments have focused on spatial location, but contexts such as sights, smells, and sounds have also been shown to be effective mnemonic cues, although the research is more limited. In relation to auditory contexts, much of the previous research has focused on music and on adults. We were interested in determining whether auditory CDM effects could be found in a classroom setting in school-aged children using background noises. Across two experiments we found that the reinstatement of the auditory context improved memory performance for 2nd, 3rd, and 5th grade students. Sounds, not just musical pieces, are stored in memory and can be effective contextual mnemonic cues. Further, (auditory) CDM effects can be found in young children. Teachers should be aware of the influence of contextual auditory cues in the classroom setting, and how this information is stored along with the focus of the teaching lesson.

  • Research Article
  • Cited by 4
  • 10.1037/xhp0001093
The influence of event segmentation by context on stimulus-response binding.
  • Mar 1, 2023
  • Journal of Experimental Psychology: Human Perception and Performance
  • Ruyi Qiu + 4 more

A core characteristic of auditory stimuli is that they develop over time. Referring to event segmentation theory, we assume that the onset and offset of a contextual sound indicate the start and end of an event. As a consequence, stimuli and responses appearing within a common auditory context may be more likely, or more strongly, integrated into so-called event files than those appearing in different auditory contexts. In two experiments, this hypothesis was tested using the negative priming paradigm and the distractor-response binding paradigm. In prime-probe presentations, participants identified target sounds via keypresses while ignoring distractor sounds. Additional sine tones acted as the context in the prime, whereas the probe context was silence. In the common context condition, the context started with the prime sounds and ended with the prime response. In the changing context condition, the context started with the prime sounds but changed to another tone after the offset of the prime sounds. Results from both experiments revealed a larger stimulus-response binding effect in the common than in the changing context condition. We conducted a control experiment to test the alternative account of contextual similarity between the prime and the probe. Together, our results suggest that a common context can temporally segment stimuli and responses into event files, providing evidence for common context as a binding principle.

  • Research Article
  • Cited by 52
  • 10.1016/j.cortex.2019.06.010
Reduced prediction error responses in high-as compared to low-uncertainty musical contexts
  • Jun 28, 2019
  • Cortex
  • David R Quiroga-Martinez + 5 more


  • Conference Article
  • Cited by 6
  • 10.1109/mdm.2009.74
An Implementation of Auditory Context Recognition for Mobile Devices
  • May 1, 2009
  • Mikko Perttunen + 3 more

Auditory contexts are recognized from mixtures of sounds from mobile users' everyday environments. We describe our implementation of auditory context recognition for mobile devices. In our system, we use a set of support vector machine classifiers to implement the recognizer. Moreover, the static and runtime resource consumption of the system is measured and reported.

  • Research Article
  • Cited by 2
  • 10.1177/1747021820921096
Challenging voices: Mixed evidence for context-specific control adjustments in the auditory domain.
  • Jun 2, 2020
  • Quarterly Journal of Experimental Psychology
  • Anja Berger + 3 more

The flexible adjustment to changing demands is an astonishing human ability. One related phenomenon is the context-specific proportion congruency (CSPC) effect. Regarding response conflict, the CSPC effect refers to reduced response interference in contexts with a high proportion of conflict trials as opposed to contexts with a low proportion. Building on previous research showing CSPC effects in the visual domain, here we investigate whether human voices (male vs. female) serving as auditory contexts trigger control adjustments. To this end, we used a numerical judgement task with number words spoken by a male or female voice. We created response conflict by presenting the words either to the left or right ear (Experiment 1), and we created different levels of processing fluency by presenting them clearly or with background noise (Experiment 2). For a given participant, either the female or the male voice was associated with a high proportion of incongruent/disfluent trials and a low proportion of congruent/fluent trials, respectively. Extending previous findings from the visual modality, we found that the frequency of challenging information within one auditory context (i.e., the voice) can lead to typical CSPC patterns. In two further experiments, using frequency-biased and unbiased items, we found evidence for the contribution of associative learning. Limitations of context–control associations will be discussed.

  • Conference Article
  • Cited by 8
  • 10.1109/ubicomm.2008.21
Auditory Context Recognition Using SVMs
  • Sep 1, 2008
  • Mikko Perttunen + 3 more

We study auditory context recognition for context-aware mobile computing systems. Auditory contexts are recordings of a mixture of sounds, or ambient audio, from mobile users' everyday environments. For training a classifier, a set of recordings from different environments is segmented and labeled. The segments are windowed into overlapping frames for feature extraction. Whereas previous work in auditory context recognition has often treated the problem as a sequence classification task and used HMM-based classifiers to recognize a sequence of consecutive per-frame MFCCs, we compute an averaged Mel-spectrum over the segments and train an SVM-based classifier. Our scheme outperforms a previously reported HMM-based scheme on the same dataset. We also show that the feature sets used in previous work are often affected by attenuation, limiting their applicability in practice. Furthermore, we study the impact of segment duration on recognition accuracy.
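The averaged mel-spectrum feature this abstract describes rests on the standard mel scale and a triangular filterbank. A self-contained sketch of that front end follows; the filter count, FFT size, and sample rate are illustrative assumptions, and the resulting segment vector would then be handed to an SVM classifier (e.g. scikit-learn's `sklearn.svm.SVC`) as the abstract outlines:

```python
import math

def hz_to_mel(f):
    # HTK-style mel scale conversion.
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular mel filters, one row per filter, over FFT bins 0..n_fft//2."""
    lo, hi = hz_to_mel(0.0), hz_to_mel(sr / 2.0)
    mels = [lo + i * (hi - lo) / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sr) for m in mels]
    fbank = []
    for i in range(1, n_filters + 1):
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(bins[i - 1], bins[i]):        # rising slope
            row[k] = (k - bins[i - 1]) / (bins[i] - bins[i - 1])
        for k in range(bins[i], bins[i + 1]):        # falling slope
            row[k] = (bins[i + 1] - k) / (bins[i + 1] - bins[i])
        fbank.append(row)
    return fbank

def segment_feature(power_frames, fbank):
    """Per-frame mel energies, averaged over all frames of a segment."""
    mel_frames = [[sum(w * p for w, p in zip(row, frame)) for row in fbank]
                  for frame in power_frames]
    n = len(mel_frames)
    return [sum(m[j] for m in mel_frames) / n for j in range(len(fbank))]
```

Averaging over the segment is what distinguishes this scheme from the HMM approach: it discards frame order, so a single fixed-length vector per segment suffices for the SVM.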
