Sunday, February 12, 2012


Reality and the Senses

It is no accident that man perceives the 'reality' of this world through his senses, or rather through the manner in which it is interpreted by his senses. Prime among the senses are vision and hearing. No doubt, the requirements of evolutionary survival and mastery of his environment subsequently have fine-honed these sensory faculties and endowed them with capabilities that at times defy fuller comprehension, or at any rate invite our wonder and disbelief.

The visual system --the eyes and the brain-- is a devilishly fast and accurate 3-D signal processor that has the added merits of a built-in peripheral vision sub-system that is tuned to simultaneously sense even the faintest of movements in the lowest of light. No system man has devised has approached the complexity and perfection of our visual system. Period.

It is not difficult to surmise that the sense of hearing too is likely to have evolved to a fine degree of complexity and sophistication (so far as input signal processing and interpretation are concerned) once we see that hearing has been more than a capable partner to vision in ensuring the evolutionary march forward of man in a difficult and dangerous environment. I have commented on the basic aspects of the topic in my earlier posts here. The acuity and accuracy of the ear-brain combo, an accepted fact that has very many proofs in everyday situations even at the present time when man's survival has come to depend less and less on his senses, have also been demonstrated by many researchers in the field.

Considering the aspect of 'reproduced reality' as an exercise in the re-creation of the original scene in both its visual and auditory aspects brings up another interesting side-comment. Ever since the first day of the demonstration of moving pictures, man has been dreaming of 'canning' events that pass into the oblivion of the past. Over the past decades, that attempt has taken huge strides forward. Perhaps one of the most immersive of such 'realistic' experiences is the IMAX movie system. We have at present a plethora of systems that claim to reproduce 3-D visual reality and a truly immersive visual experience. But the question is, do we really take that re-creation of reality as the real thing itself? Or, in other words, are the eyes truly and fully 'deceived' into accepting it as reality? Hardly. I am not forgetting the degree of realism that is provided by many audio-visual simulation systems powered by heaps of computing power. But the honest answer is, though they come close, sometimes wonderfully close, to the 'real thing', nothing like a full-scale 'deception' is staged. That being the state of affairs, won't it be prudent to aim for and accept a similar situation in the auditory field too?

Raising the 'Fi'

Ever since the invention of that tinfoil contraption that wheezed out a passable imitation of "Mary had a little lamb...", man has been trying his best to 'can' acoustic events with all their aspects of reality intact. To the contemporaries of Edison, his phonograph was 'life-like' and that was a sure-fire jaw-dropper. It is interesting to watch how every development, every bit of improvement was hailed as another notch up in 'fidelity'. In the electrical/electronic era, the definitive term was 'hi-fi', which has stuck with us till the present and it looks like it is not going away anytime soon. However, the 'height' that 'fidelity' has achieved till date is somewhat questionable, though its aims were honourable!

Fidelity is a simple word that could mean "..the exactness of the copy to the original..", and in the case of the re-creation of an acoustic event, the triggering of the feeling/s that one is at the original venue. One moment you are in your living room, and the next, perhaps in the front row of a concert hall... or in the immersive rhythmic din of the live rock concert ... or in the intimate company of jazz musicians in a club ... you ARE there!

Over the ten decades that made up the last century, it was one thing or the other which the audio questor energetically hunted after in his search for 'nirvana' in audio --a 'full' frequency response that surpassed the limits of audibility, the absence of noise and distortions of all shades and types, virtually unlimited power on tap, and along with that tremendous slew rates in amplifier specs, a whole lot of 'classes' of amplification, vacuum tubes, bipolar transistors, MOSFETs and other exotica...a mind-boggling variety of speaker designs ... each with its promise of the Holy Grail of a perfectly realistic reproduction. The 20th C saw wonderful advances in Digital Signal Processing and it seemed the possibilities now were truly endless, and everybody thought digital was going to be what finally delivered the elusive Grail to man thirsting for true fidelity in reproduction.

The task of recording and reproducing the simplest of acoustic events with a virtually foolproof "deception coefficient" has, however, eluded the technical wizards of our era. Period. I am not forgetting the astonishing level of verisimilitude that some systems have been able to project, but one has to be reminded that there were a lot of 'conditions' for that to happen. To achieve even a modicum of 'reality' with possibly the 'best' conventional stereo reproduction system, you had to sit in the 'sweet spot', rigid as a dummy, with hair parted in the middle... and even then, wonder of all wonders, NOTHING that you could throw at it really deceived the ears to any degree, except trigger occasional bouts of fleeting illusion at best.

The Natural Way

The jury is still out on the debate whether it is good or bad that Henry Ford did not imitate Nature, or its corollary, if it was good or bad Nature did not imitate Henry Ford. So far as we know, Mr Ford is the papa of assembly-line mass production. It is educative to think why Nature, with her requirements of producing the millions of clones of just two different models, did not opt for a convenient and efficient conveyor belt system. We then would have had no duds or non-conformists in this world! Sadly that was not how She chose to work, and therein lies a clue to her methods.

A little 'side-step' at this moment brings us to some interesting questions, and not many answers. Man, with his logical and scientific mind, launches into 'production' of what s/he wishes to make once the prototype has been perfected, and the tolerances set. Quite the thing to do if you want to have predictable results and repeatability. Or so we think. I had occasion to be with two friends as they signed on the dotted lines and became the proud owners of the same model of car with near enough chassis numbers. Naturally they were 'identical' copies in every conceivable way. But sadly, they 'drove' differently. Ditto experience with another set of audiophile acquaintances. The guy had just one listen to an amp-speaker combination at his friend's place, and fell in love with the 'sweetness' of the system (true, something more or less similar happens in the case of falling in love/getting married too! ...why do they call it "falling" in love...???), and promptly ordered an 'identical' set from the dealer. But then came big disappointment upon audition at home. He then carted the whole system to the friend's place for a side-by-side comparison with the 'original', and discovered that though the situation was a lot better, still "something was missing". In the case of amplifiers and speakers, such behaviour brought on by manufacturing 'spreads', though still within 'tolerance', is not totally unknown to many.

That perhaps ought to remind us how lucky we are that Nature (and/or God...) chose to eschew the assembly line concept and instead opted for the more fuzzy methodology. Had she gone the way of Papa Ford and most others in 'assembly-line production', we would have ultimately ended up with millions of 'models' with wide tolerances in hearing, but still "within specs" when it came to (let us keep things simple, and for the moment forget about the other faculties) hearing accuracy. The result would be that some of us out on a hunt won't be home for that late evening session around the campfire, thanks chiefly to our 'spec-accurate' ears having told us that a faint threatening growl was coming not from right behind us, but from somewhere far behind and off to one side... and again, the chorus of the crickets was particularly loud that evening!!

The auditory system, though it has come into much disuse as a result of man's metro-morphosis these days, has a record of astounding accuracy in imaging. These faculties were no doubt developed and fine-honed over the millennia of evolution and survival. A corpus of scientific research work exists that has tried to study and understand the finer aspects of hearing. It is easy enough to come to the conclusion that there are primarily three overlapping mechanisms of hearing that contribute to accuracy of imaging and 'realism'. The first, applicable over the low octaves, relies on Inter-aural Time/phase Differences (ITD) to clue the brain as regards the direction of sources of sound. In the next few octaves at the low-to-mid level, the Inter-aural Level/intensity Differences (ILD) rule the roost. But when you get into the higher octaves above the mid frequencies, a complex 'filtering' done by the external ear or pinna supplies the directional clues to the max. Our interest in stereophony has resulted in voluminous studies as regards ITD and ILD. But despite many revealing studies, the pinna and its supreme contribution to the mechanism of hearing have largely not been given the attention they surely merit.
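To get a feel for how tiny the time differences are that the brain works with, here is a minimal sketch. It uses Woodworth's classic spherical-head approximation, which is my choice for illustration and not something drawn from the studies mentioned above; the head radius and the speed of sound are assumed round figures.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s in air at about 20 °C (assumed)
HEAD_RADIUS = 0.0875     # m, an assumed "average" head; real heads differ

def itd_seconds(azimuth_deg, head_radius=HEAD_RADIUS, c=SPEED_OF_SOUND):
    """Woodworth spherical-head estimate of the inter-aural time difference.

    azimuth_deg: angle of a distant source from straight ahead, 0 to 90 degrees.
    """
    theta = math.radians(azimuth_deg)
    return (head_radius / c) * (theta + math.sin(theta))

for angle in (0, 30, 60, 90):
    print(f"{angle:2d} deg -> {itd_seconds(angle) * 1e6:6.1f} microseconds")
```

Even for a source directly to one side, the difference stays well under a millisecond, which says something about the precision of the ear-brain combo.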

The evolutionary needs of hearing/directional accuracy have resulted in a unique 'natural' approach to the mechanism of directional analysis. This aspect assumes extraordinary importance when you consider the fact that there is no 'standard' human being, or for that matter, no 'standard' ear either. For a complex living being that assumes its ultimate shape and functionality after a mind-boggling set of millions and millions of cell divisions after the seminal fusion of just a couple of cells, it is not surprising that the human system is not a 'Xerox-copy' of a master blueprint. Rather, it is a highly unique and individualistic system built according to some master specs. As we all know, the fingers and eyes of the human being carry unique individual patterns. Nobody has found an identical set of fingerprints from two different humans--yet. And again, the unique retinal patterns are accepted as identifying markers. When it comes to the ear, the individual ridges and convolutions, the size and shape, the channels and protuberances of the external ear, or the pinna again, have no 'equals'. It would be interesting to take photographs of your own left and right pinnae and study them side-by-side, and be astounded to discover subtle differences--they are NOT cast in 'standard' moulds. Yet, in spite of these marked and 'serious' variations, the individual's sense of aural accuracy remains astoundingly 'standard'.

The Secret of Hearing

Looking into the 'secret' of hearing accuracy leads one to certain conclusions. Nature in its infinite wisdom has, instead of relying on 'standardized' solutions, taken recourse to the ploy of embedding these individual 'filter characteristics' within the brain after a long period of 'training'. No doubt the shape and patterns of the external ear or pinna change subtly over the years as the individual grows up. But one "wears the same set of ears" from one's birth till the end of the journey, and as we live within an ocean of auditory events, it would be an easy matter to 'train' the ear/brain combo, with visual verification/confirmation of the direction and distance of sound clues. In a short while the individual masters the complex filtering of his pinnae, and as a result, is able to interpret the directional clues based on the 'pinna transforms'.

Studies conducted about the nature of pinna transforms (Moller, Hofman, Van Opstal and others) and the ear's ability to accurately interpret front/back, up/down etc clues have confirmed these observations. Studies repeated after modifying the pinna's physical contours have confirmed that the ear/brain combo is able to 're-learn and adapt' to the new/modified filter characteristics after a short while, and as a result, the ability of the volunteer subjects to accurately interpret directional clues, which had plummeted initially post-modification, came back to 'normal' levels.

The Individual Pinna Signature

However, the one aspect as regards pinna transforms that has not yet got established in current theory and practice is that the pinna IS the "decisive factor" in hearing and our perception of realism, however you might define that. This is particularly so when considering the recording and reproduction of acoustic events. It is common knowledge that a surface understanding of the functions of the pinna has launched investigations into ways of "tricking" the ears into believing that "you are there" at the original venue of the recorded acoustic event. These technical legerdemains have to a great extent been successful also, though to differing degrees. But whatever "sleight of hand" one might do, it remains a fact that the ears will ALWAYS detect that a reproduction is not the real thing, unless and until you factor in the pinna and its individualized filtering into the final equation. This is so because the brain looks for the individual's own 'pinna signature' in the signal/s being fed in and analysed, and when it cannot detect any trace of that, it concludes that the signals being heard are not "heard properly", and as a result they are NOT real!

The 'Atma of Hearing'

The ability of the ear/brain to detect vague 'copies' as not being originals is foolproof. However, while researching ways to re-create reality, we often ignore the absolutely unique and individual nature of the pinna transforms and their key position in the hearing mechanism, which can only lead us astray. This is where I would like to introduce what I call the 'Atma of Hearing' concept. My contention is that only when we succeed in "splicing in" the individual 'atma of hearing' signature into a recorded event, will our ear/brain accept as 'real' a reproduced acoustic event, because then only will we appear to be "hearing" that acoustic event "with our own pinnae" and not with "another's ears".

The Indian Vedic philosophy proclaims 'Ayam Atma Brahma' --"This soul is God". The Atma (the soul) or Atman, which is but a minuscule fraction of God's Divine Spirit, is central to the individual existence. And it is UNIQUE. Atma also means the 'essence', here the essence of human individuality. One is able to understand the concept when one considers the jaw-dropping complexity of the human body and its systems, and also reminds oneself of the fact that without the 'essence' or 'atma', it is nothing but a collection of dead, lifeless tissue.

In the mechanism of hearing also, the "soul of hearing" is nothing but the individual pinna transforms, without which everything loses its 'natural' purity and realism. Like the individual soul, the 'atma' of hearing too is an absolutely unique 'essence' that 'lives' within the brain of the individual as a result of the life-long integration of that signature into everything that the individual hears. Anything that bypasses this "signature filter" is interpreted as being not REAL.

So perhaps it is time we re-channelled our largely misdirected energies questing after the various artefacts of hearing like frequency response, distortion elements, and such like stuff (not that they are worthless; but rather, they do not form the core, and so merit only secondary attention) into a whole new avenue, that could in all likelihood lead us nearer to the Holy Grail of hearing. Our efforts will be crowned with success if we are ready to accept the fact that within each individual there reposes a unique 'Atma of hearing', and the integration of that into the chain of reproduction holds the key to the re-creation of reality from an auditory perspective.

More than ever, the concept is perhaps closer to realization in the digital age as we now have advanced tools like DSP that in all likelihood will be able to reward our efforts to "measure and codify" the individual 'atma' signature and "splice that in real time" into the digital data that represents the three dimensional sound field captured at the original venue. This then will be the one sure way to transport the listener 'there', as s/he will be once again "hearing it all with her/his own beloved pinnae".
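What might "splicing in" look like in DSP terms? One plausible reading, and it is only my assumption, is to filter the recorded signal with a pair of impulse responses measured at the listener's own ears, which amounts to a convolution. The sketch below uses made-up three- and four-sample responses purely as stand-ins; real measured responses would be far longer and unique to each person, and `splice_signature` is a hypothetical name.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

# Hypothetical, invented impulse responses standing in for one listener's
# measured left-ear and right-ear filters (source assumed off to the left).
LEFT_EAR_IR = [1.0, 0.3, -0.1]
RIGHT_EAR_IR = [0.0, 0.6, 0.2, -0.05]   # arrives later and quieter

def splice_signature(mono, left_ir=LEFT_EAR_IR, right_ir=RIGHT_EAR_IR):
    """Return (left, right) channels filtered with the listener's own responses."""
    return convolve(mono, left_ir), convolve(mono, right_ir)

left, right = splice_signature([1.0, 0.0, 0.0, 0.5])
print(left)
print(right)
```

The nested loop is chosen for readability; a real-time system would need a much faster method, and the hard part, measuring each individual's signature, is not addressed here at all.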

How to achieve that should occupy us in the days to come if we are serious about 'reproduced reality'.

                                  * * * * * * * * * * * *

Saturday, January 28, 2012


A degree of simplification is often the key to grasping a concept. It is perhaps a lot like the argument about learning to swim. Swimming, experts will tell you, is the art of learning and mastering the many strokes. But the knowledgeable will also add that the first step is to learn to be comfortable in the water and then learn to float. Jumping in at the deep end and launching into the standard beginner's stroke is likely to earn you a stomach-ful of water and a holy terror of swimming.

Audio is at best tricky waters, and learning to maintain your balance in a strange new medium is the key to expertise and enjoyment of the many delights of sound reproduction. May I urge the 'pundits' to close their eyes, with an indulgent smile on their faces, as the newbies are invited to a frolic in the shallows of audio in the following discussion.

Sound propagation -- just 're-imagine' the concentric circles as concentric spherical waves in 3-D space

Anybody who, as a child, has thrown a stone into the still waters of a pond knows how sound travels in the air in ripples or wavelets. (If you haven't yet done that, it is time you hurried to do that experiment!) On a rainy day, the many raindrops that cause a series of ripples on the surface of the water demonstrate the interaction of the wavelets. If you have disturbed the surface of water in a large container, then you are sure to have seen the reflection of the wavelets from the edges of the container, and a complex interaction with the original wave. It is a fascinating sight.

Now perhaps the first reminder for the sound enthusiast is that sound is not confined to the two dimensions of the surface of water or that of a sheet of paper on which the wavelets could be drawn. From the source it spreads all around in the three dimensional space in a spherical pattern. Imagine soap bubbles, one within the other, expanding as they are blown, when you think of the expanding wavelets of sound around the source. The main thing here is to imagine that this is what happens when a bird tweets, when somebody speaks/sings, when a musical instrument is played, when a cracker is burst etc.
( the loudspeaker here is idealized as a source )

This spherical spreading out of the wavelets of sound occurs because most often the source of the sound has smaller dimensions than the wavelengths it produces. However, in a real room or hall, the propagation is more hemi-spherical as is illustrated with this idealized speaker output. (Idealized because in real life speakers do 'beam' a lot, and do not always launch the sound equally well around them.)
(idealized source)

Even a grand piano, when played, can be heard with equal fidelity from any angle around it, though musicians tell us that concert pianos are meant to be played with the 'lid' up and the audience facing the reflecting lid. However, inside a room or a hall, the immediate reflections also contribute to the feeling of hearing it 'fully' from any angle.

The second reminder for the audio enthusiast, and especially the one who is more interested in the reproduction of 'canned' (recorded) acoustic events, is the fact that reflections are many even in a large auditorium, to say nothing of the average living room. It is lucky that we do not get confused in the midst of a medley of reflections, thanks to the ear-brain combo 'processing engine'. A sobering demonstration would be to push your finger into one of the ears while listening, and immediately the 'clarity' of the sound image collapses into a confusing and irritating mayhem. Now be careful to not jump to the conclusion that this is the magic of having two ears and so stereo (stereophonic sound reproduction) is the answer to all the ills of perception. Sadly it is not, and it is a complex issue that has not been fully tackled. Just understand that reflections are a part of the real auditory scene, as we know the sound wavelets propagate all around the source; the question is which are the 'needed' reflections and which are the unwanted ones, and whether we have any control over them while recording and reproducing acoustical events. Your understanding will grow as you progress with many aspects of the audio art.

The third set of reminders has to do with frequency, tone and timbre and perceived realism. Though elephants are known to communicate over vast distances using infrasonic frequencies, the human range of hearing, by common consent of experts, is limited to 20 Hz to 20,000 Hz. The unit is the Hertz and it is one cycle per second, and a cycle is a complete vibration from the rest position to and fro, initiating a compression and rarefaction of the medium (air, in the case of speakers, which are easier to understand). Here are representations of a tuning fork vibrating to create the compressions and rarefactions of the spreading sound waves.

Tuning forks and sound wave representations

When a speaker cone reproduces say, 50 Hz, it is vibrating back and forth fifty times a second. It will be educative viewing the cone in the light of a neon lamp or a fluorescent light with some flicker and varying the frequency up and down a bit. Often say, when reproducing a drumbeat, when the loudspeaker cone jumps out at the first transient, many think that this is what produces the 'thump'. Yes and no. The jumping forward of the cone creates the high amplitude sound wave, but the frequency and character of the sound per se is created by the vibration of the cone which is not visible to the eye. Remember, movies 'move' because of persistence of vision, and anything more than 10 Hz (no, not the sound, but the movement of the speaker cone!) is difficult to see.
Frequency and wavelength relation
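For those who like to see the numbers, the relation is simply wavelength = speed of sound / frequency. A minimal sketch, assuming a round figure of 343 m/s for the speed of sound in room-temperature air:

```python
SPEED_OF_SOUND = 343.0  # m/s in air at about 20 °C (assumed round figure)

def wavelength_m(frequency_hz, c=SPEED_OF_SOUND):
    """Wavelength in metres of a sound wave of the given frequency."""
    return c / frequency_hz

for f in (20, 50, 1000, 20000):
    print(f"{f:6d} Hz -> {wavelength_m(f):8.4f} m")
```

The lowest audible tones come out many metres long, while the highest are under two centimetres, which is one reason small sources spread low frequencies all around but 'beam' the highs.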

Now speaking of the frequency range of human hearing, don't be in a hurry to swallow all that about 20-20,000 in a simplistic manner, and try to listen for the extremes of the range. Extreme low frequencies may be aplenty in the home theatre stuff, but they are very rare in real life, except perhaps when there is a thunderstorm or when you are near a huge waterfall or when the sea is stormy, or maybe near a fireworks display! As for the high frequencies, they are again only the modicum of 'garnishing' that gives character to the real world sounds. An extreme dose of HF, as is put out by a modern loudspeaker driven by an amplifier with crazily set tone controls, playing an 'unnaturally recorded' track, easily brings on listening fatigue and ear damage in a short while. So much to discourage you from joining the 'tish-boom' brigade. The key here, as it is always in audio reproduction, is to try and get as close to 'natural' as possible.

It is time for us to familiarize ourselves with the frequencies--and how they sound. The Web has today given the layman many advanced tools and facilities to advance the knowledge of his hobby. Here is one of the many links ( ) that lets you download free samples of audio files at various frequencies; download and play them in the computer itself or write them onto a CD, and remember NOT to convert them into low-fi .mp3 files, but to preserve them in the original .wav format itself. It would really "open your ears" as you start listening and discovering many new things for yourself. For example, how difficult it is for you to hear very much above 10,000 Hz as you become older. So take all that you read about with a large pinch of salt and listen and learn for yourself. Remember not to drive the amplifier and speakers with high volume levels of extreme low and high frequencies as it is not good for their health and also for that of your ears. Also remember that though you might think yourself to be familiar with the various frequencies, it is not easy to 'remember' them and compare them with the component frequencies of real life sounds--it would take years of practice to have a discerning ear like a trained musician. You might often have seen and heard musicians using pitch pipes and sometimes tuning forks while tuning their instruments. But it is not practically possible to hear even one pure note in isolation in natural sounds.
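If you would rather make your own test tones than download them, a few lines of Python will do. This is only a sketch under my own assumptions: mono, 16-bit, 44,100 samples per second, and a deliberately modest level of a quarter of full scale to be kind to speakers and ears. The function names are mine.

```python
import math
import struct
import wave

def sine_samples(freq_hz, seconds, rate=44100, level=0.25):
    """16-bit samples of a pure tone; level is a fraction of full scale."""
    count = int(round(seconds * rate))
    peak = int(32767 * level)
    return [int(round(peak * math.sin(2 * math.pi * freq_hz * n / rate)))
            for n in range(count)]

def write_wav(path, samples, rate=44100):
    """Write mono 16-bit PCM samples to an uncompressed .wav file."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)      # mono
        wav_file.setsampwidth(2)      # 2 bytes = 16 bits per sample
        wav_file.setframerate(rate)
        wav_file.writeframes(struct.pack(f"<{len(samples)}h", *samples))
```

Calling `write_wav("tone_1000Hz.wav", sine_samples(1000, 2.0))` would give a two-second 1,000 Hz tone; change the first argument of `sine_samples` to explore other frequencies, and keep the volume knob low at the extremes.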

Take a look at the frequency distribution of the various musical instruments and the human voice, the male and the female.
The audible sound range

The pipe organ, the King of instruments or the Master instrument, covers the full range of audibility (please also note that the larger organs can go way below 20 Hz!), while the concert piano comes a close second. The male and female voices have a very limited range. Researchers of the early 20th C, while studying telephone circuits, concluded that in order to have intelligible communication, the frequency range needed is even smaller. The sounds that you encounter in Nature too have a somewhat limited range. Today, after considering many factors, one could say that a reasonable degree of fidelity could be had within the range of say 60 Hz to about 15,000 or 16,000 Hz. Fidelity, as you will soon discover, depends not on frequency response alone.
A sample frequency response curve

That brings us to frequency response, a term bandied about by audio enthusiasts. Is it just the range of frequencies that your amplifier or tape recorder or CD player or speaker is capable of handling? Hardly. To give it its full name, it is actually frequency/amplitude response. And when you say that your amplifier has a flat frequency response, what you mean is that it is capable of handling the specified range of frequencies without altering the relative levels of the frequencies. Suppose it is fed a signal with a frequency at 200 Hz of an arbitrary level, along with another at 2,000 Hz half as loud, and also a third at 8,000 Hz with say one-fourth the level of the first. Though the amplifier might be called on to raise the overall levels to drive a speaker, the relative levels of the three signals would remain the same, provided the amplifier has a flat response. In other words, an amplifier or other audio component should preserve the loudness relationships between various instruments and voices in the input signal and should not over- or under-emphasize any frequency or tone. This then is known as flat frequency response.
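The three-tone example can be tried out in a few lines. This is only an illustrative sketch: the gain figures are invented, and "half as loud" is treated simply as half the signal level.

```python
# Input levels from the example: 200 Hz at 1.0, 2,000 Hz at half, 8,000 Hz at a quarter.
INPUT_LEVELS = {200: 1.0, 2000: 0.5, 8000: 0.25}

def amplify(levels, gain_at):
    """Scale each component by the amplifier's gain at that frequency."""
    return {f: level * gain_at[f] for f, level in levels.items()}

def relative_levels(levels, reference_hz=200):
    """Each component's level as a fraction of the reference component."""
    ref = levels[reference_hz]
    return {f: level / ref for f, level in levels.items()}

FLAT_GAIN = {200: 10.0, 2000: 10.0, 8000: 10.0}    # same gain at every frequency
UNEVEN_GAIN = {200: 10.0, 2000: 14.0, 8000: 7.0}   # hypothetical non-flat amplifier

print(relative_levels(amplify(INPUT_LEVELS, FLAT_GAIN)))
print(relative_levels(amplify(INPUT_LEVELS, UNEVEN_GAIN)))
```

With the flat gain the 1 : 1/2 : 1/4 relationship comes out untouched, only louder; with the uneven gain the middle tone is pushed forward and the top one held back, so the balance of the original is no longer preserved.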

But then, remember, there is no ideal amplifier capable of doing such a precise job and amplifiers are usually rated to have a "flat frequency response" within say plus and minus a small figure, usually expressed in Decibels (dB). Decibels indicate ratios of voltages and powers, and it is not easy for the layman to have a non-math understanding of the unit. The reader is sure to be familiar with units like the Volt, Ohm, Ampere, Tesla etc (each honouring a great scientist), and the Bel is a unit of measurement that honours Alexander Graham Bell. It is a large unit and one-tenth of that is the decibel, called a "deebee" and written as dB. You are likely to find the dB a lot in the specifications of audio equipment and it is easy to remember a few things about the dB.

The sensation of loudness is detected as a logarithmic function of the sound pressure levels at our ears, and the dB scale indicates that easily. A difference of 1 dB is taken as the minimum change in volume/loudness detectable by ear, while 3 dB is a moderate change. A difference of 10 dB means a doubling of volume or loudness. By convention, 0 dB is the threshold of hearing. Other examples include a soft whisper at about 15-25 dB, general background noise at about 35 dB, noise levels inside a home or office at around 40-60 dB, and a normal speaking voice that goes up to 65-70 dB. The climax of a Western orchestra is known to reach about 105 dB, while rock bands easily top 120 dB. There is the onset of pain and loss of hearing from then onwards, and jet aircraft are known to be as loud as 140-180 dB, with an "unhealthy mix" of frequencies at the upper and lower registers. And while on the topic of loudness levels, a good reminder is that a 4 W amplifier is only 3 dB, a moderate step, louder than a 2 W one, other things being equal, while it would take a 100 W amplifier to sound twice as loud as a 10 W one, and as you move up the ratio of loudness, the figures soon become ridiculous and dangerous to your ears!
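The arithmetic behind amplifier power and loudness is easy to check for yourself. A minimal sketch: the change in dB between two powers is 10 times the base-10 logarithm of their ratio, and the "twice as loud per 10 dB" figure is the rule of thumb quoted above, not an exact law of perception.

```python
import math

def power_ratio_to_db(p_new, p_old):
    """Change in level, in dB, when power goes from p_old to p_new."""
    return 10 * math.log10(p_new / p_old)

def loudness_factor(db_change):
    """Rule of thumb: every +10 dB sounds about twice as loud."""
    return 2 ** (db_change / 10)

for p_old, p_new in ((2, 4), (10, 100), (10, 1000)):
    db = power_ratio_to_db(p_new, p_old)
    print(f"{p_old} W -> {p_new} W: {db:+.1f} dB, "
          f"about {loudness_factor(db):.2f}x as loud")
```

Doubling the power buys only about 3 dB, ten times the power buys 10 dB, and a hundred times the power is needed for 20 dB, which is why chasing loudness with watts soon becomes ridiculous.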

The average "hi-fi" amplifier claims to have a frequency response that is flat to within plus/minus 3 dB. That is, the response may swing by as much as 6 dB between its highest peak and its lowest dip, and that is not a moderate amount by any measure. To understand why this kind of imprecise response could play havoc with fidelity, one has to take a look at real world sounds. The natural world presents us with hardly any pure tones. Every natural 'tone' is a mix of a basic frequency, and its 'overtones' that are multiples of the base frequency. Take the same musical note coming from two different violins or from the mouths of two trained singers able to precisely vocalize the same musical note, and your analysis will tell you that the ratio of the overtones and their nature will be slightly different. This is what gives 'character' or 'timbre' to the sound in real life. Musicians can easily distinguish the sounds of particular instruments of the same kind, as their ears are trained to recognize the subtleties of the timbral differences relating to the overtones and their levels and ratios. Please note that timbre is defined as the 'quality' of a sound that distinguishes it from other sounds of the same pitch and volume.

Imagine an amplifier that has a wildly fluctuating frequency response curve (very common in the real world!), though it is still within the +/- 3 dB range in its specification, and so qualifies as a moderately good hi-fi instrument. When this amp is fed with a real life sound, there is every chance that, particularly if the vagaries of response are in the critical middle frequencies (approx: 2 kHz to 5 kHz) where the ear is most sensitive, the output that emerges from it will have altered the timbral quality; that is to say, the relative levels of the overtones are altered, and the signal sounds like "something else"-- and fidelity is lost. To sum up, what matters is not a 'correct' specification, which can hide much of the 'truth', but rather measurements that certify that an amp has an over-all smoothness of FR; that is what is more important from the angle of fidelity. A timely reminder here is that the same criteria could be applied to every component in the audio chain.
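To see how a response that stays "within spec" can still rearrange the overtones, here is a small sketch. Everything in it is invented for illustration: the note, the overtone levels, and the amplifier's response at each frequency (all kept within +/- 3 dB).

```python
import math

def db_to_gain(db):
    """Decibels to a voltage-like gain factor."""
    return 10 ** (db / 20)

def gain_to_db(gain):
    """Voltage-like gain factor to decibels."""
    return 20 * math.log10(gain)

# A made-up note: 440 Hz fundamental plus overtones at whole multiples,
# each with an assumed level relative to the fundamental.
NOTE = {440: 1.0, 880: 0.5, 1320: 0.35, 1760: 0.2, 2200: 0.1}

# Hypothetical amplifier response in dB at those frequencies, all within +/- 3 dB.
RESPONSE_DB = {440: -3.0, 880: 1.0, 1320: 3.0, 1760: -2.0, 2200: 2.5}

def played_back(note, response_db):
    """Levels of each component after passing through the amplifier."""
    return {f: level * db_to_gain(response_db[f]) for f, level in note.items()}

out = played_back(NOTE, RESPONSE_DB)
for f in NOTE:
    before = gain_to_db(NOTE[f] / NOTE[440])
    after = gain_to_db(out[f] / out[440])
    print(f"{f:5d} Hz: {before:6.1f} dB -> {after:6.1f} dB relative to the fundamental")
```

In this invented case the third harmonic ends up a full 6 dB stronger relative to the fundamental than it was in the original note, even though no single frequency strays outside the quoted tolerance.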

[ More to follow ]