In a complex acoustic environment, several sound sources may simultaneously change their loudness, location, timbre, and pitch. Yet humans, like many other animals, effortlessly integrate the multitude of cues arriving at their ears and derive coherent percepts and judgments about the different attributes of each source. This facility for analyzing an auditory scene is conceptually based on a multi-stage process: sound is first analyzed in terms of a relatively small number of perceptually significant attributes (the alphabet of auditory perception), after which higher-level integrative processes organize and group the extracted attributes according to specific context-sensitive rules (the syntax of auditory perception) [1]. The sound received at the two ears is processed for attributes including source location, acoustic ambience, and source attributes such as tone and pitch, timbre, and intensity.

Decades of physiological and psychoacoustical studies [2,3] have revealed elegant strategies at various stages of the mammalian auditory system for representing the signal cues underlying auditory perception. This information has facilitated the development of biophysical models, mathematical abstractions, and computational algorithms of the early and central auditory stages, with the aim of capturing the functionality, robustness, and enormous versatility of the auditory system [4]. Numerous groups have implemented such algorithms in software and hardware, and have evaluated them by comparing their performance to human performance and against a range of robustness and flexibility requirements.
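The first stage of this process, analysis of sound into perceptually significant attributes, is commonly modeled as a bank of cochlear-like bandpass filters followed by rectification and envelope smoothing. The following is only a minimal illustrative sketch of that idea, not any particular published model: the center-frequency spacing loosely follows the ERB-rate scale of Glasberg and Moore, and all parameter values (channel count, filter orders, cutoff frequencies) are illustrative assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def erb_space(low=100.0, high=4000.0, n=16):
    """Center frequencies spaced roughly like the cochlear frequency map
    (ERB-rate scale, Glasberg & Moore constants)."""
    ear_q, min_bw = 9.26449, 24.7
    lo = np.log(low + ear_q * min_bw)
    hi = np.log(high + ear_q * min_bw)
    return np.exp(np.linspace(lo, hi, n)) - ear_q * min_bw

def auditory_spectrogram(x, fs, n_channels=16):
    """Crude 'auditory spectrogram': bandpass each cochlear-like channel,
    half-wave rectify, and low-pass to extract the channel envelope."""
    cfs = erb_space(n=n_channels)
    env = np.empty((n_channels, len(x)))
    lp = butter(2, 50.0, btype="low", fs=fs, output="sos")  # envelope smoother
    for i, cf in enumerate(cfs):
        bw = 24.7 + cf / 9.26449                 # approximate ERB bandwidth
        lo, hi = max(cf - bw, 1.0), min(cf + bw, fs / 2 - 1)
        sos = butter(2, [lo, hi], btype="band", fs=fs, output="sos")
        rectified = np.maximum(sosfilt(sos, x), 0.0)  # half-wave rectification
        env[i] = sosfilt(lp, rectified)
    return cfs, env

# Illustration: a 1 kHz tone excites the channels tuned near 1 kHz.
fs = 16000
t = np.arange(0, 0.5, 1 / fs)
tone = np.sin(2 * np.pi * 1000 * t)
cfs, env = auditory_spectrogram(tone, fs)
peak = cfs[np.argmax(env.mean(axis=1))]          # best-responding channel
```

Driving this filterbank with a pure tone concentrates energy in the channels whose passbands cover the tone frequency, a crude analogue of the tonotopic representation carried by the auditory nerve; later grouping stages would then operate on such channel-wise attribute maps.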
Furthermore, these auditory-inspired processing strategies have been applied in a wide range of domains, including acoustic diagnostic monitoring of machines and manufacturing processes, battlefield acoustic signal analysis, sound analysis and recognition systems, robust detection and recognition of multiple interacting faults, and detection and recognition of underwater transients and weak signals at low signal-to-noise ratio (SNR) in acoustically cluttered environments [5,6].

We shall briefly review the auditory encoding of various sound attributes to illustrate the above ideas. We shall focus specifically on the percept of sound timbre: What acoustic cues are most intimately correlated with it? How are they represented at various stages of the auditory pathway? And how can the abstracted auditory signal processing algorithms and representations be applied to measure speech intelligibility, to describe musical timbre, and to analyze complex auditory scenes?