With a good-natured request, let me persuade my dear, gentle reader to explore the subtleties of the title I have chosen for my attempt to sound the depths of sound, the audiophile's first and last preoccupation. Then let us, together, apply the many caveats of sense in arriving at some sort of acceptable re-definition of what we have all along been trying to enjoy: an attempt to 'can' and then reproduce an acoustic event without mutilating its 'life' in any manner. Have we been successful in doing that, at least to a large extent, though with some minimal losses, as must be the case with any imperfect scheme in this imperfect world? The answer is moot, as any sensible person would admit and accept.
Sense and the Senses
Sound happens to be only one of the many stimuli that impinge upon man's senses, making his life more enjoyable or more miserable, depending on variables that are as uncertain as the weather or, as some are fond of pointing out, as fickle as a woman's mind. So before this 20% of our sensory input (we have five senses, as has been accepted!) is examined in some detail, it would be instructive to look at the larger share of the sensory inputs and see how they are dealt with in our day-to-day lives.
Sight, smell, taste and touch make up the other 80% of what comes barrelling at our senses. Have these received a measurement-oriented, scientific 'assessment' of their 'correctness' before being 'accepted' more or less completely? Hardly...
Painting, before photography made the enjoyment of visual stimuli more universally accessible, reigned uncrowned, yet with its own rules in a world of its own. The entire history of the enjoyment of purely visual stimuli (for convenience, here confined to the world of painting) is without one instance of that enjoyment being the result of a measurement of chromatic or luminance values so as to meet some sort of a 'conformity' to certain arbitrary standards. History is totally silent when it comes to the question of whether the great masters insisted on somebody's knowledge of the 'science' of light and colour before he was accepted as an understudy. In painting, there are times when visual 'verisimilitude' has been given the 'go by' as the masters explored the as yet unseen 'other sides' of the visual idiom.
Moving forward into the 'modern age' with its preoccupation with, and its possession of the tools of, measurement of most natural phenomena, we are still greeted by a lacuna: there is no enjoyment of the visual that depends on a set of 'measurements' of a narrow, specialized sort. The stratospheric prices that are offered for canvases of the old masters are not arrived at after a photometric or colorimetric analysis--so far as most of us are aware.
In the visual field the ultimate tools of judgement have remained the same--the same old pair of irreverently inaccurate (aren't they deceived by optical illusions?) eyeballs that man possesses. They have served man over the millennia as we evolved from God alone knows what primeval state to the present when, without batting an eyelid, we write out cheques totalling millions after the auctioneer slams his gavel with the loud proclamation 'Sold'--all for a piece of canvas with vague 'impressions' created by brushstrokes and paint. It is astonishing to note that NOT ONCE has anybody felt a dire need to re-examine the 'decision' made by the pair of eyeballs in the light of more dependable and accurate instruments devised by man's ingenuity. It just doesn't make sense--one way OR the other!!
Now let us for a moment turn to the world of wines, a multi-million-pound industry. It again is a unique world of taste and smell and maybe a bit of hype. Wine tasters/blenders are some of the highest paid individuals. Their nose and their palate are fine-tuned 'instruments' of measurement. But please pause to ponder if those 'traditional instruments' have ever been upstaged by some new-fangled gizmos. No way!
One of my old friends is a tea-taster. While we ordinary mortals pay to drink mediocre tea, he gets paid--and that too grandly--to taste any number of prized teas! When I checked last, no machine had been good enough to displace him. Today there is no dearth of analytic instruments, and some of them are truly astounding pieces of hardware. But they are just that--hardware. Would the distilleries in Scotland and elsewhere, who pride themselves on the 'undefinable' flavour of their produce, ever go for a machine of impeccably accurate 'taste'? I wonder...
And what about that industry behemoth that has the entire world by its nose--the perfumers? Theirs is a "black art" that has not yet chosen to go the way of "pure, unalloyed and objective" measurements. The nose, with its two 'inlets' and a single cavity, reigns supreme.
While on the olfactory plane, think of all those sniffer dogs that do duty alongside an array of sophisticated scanners and detectors at every airport. When it comes to results, often the dogs are better and surer. The latest wrinkle on the nose (pun intended!) is that Japanese scientists have trained dogs to detect cancers at a very rudimentary stage--much before they come to register on our 'sophisticated' scanners.
The senses have been evolving with man and his complex environment. Even when we look at periods considerably shorter than millennia, like an average person's lifetime, we could easily see that our senses evolve quickly and in a practical manner to serve a need. The increased aural acuity of a person whose visual faculty has deteriorated is a case in point. An aurally deficient person has heightened visual acuity as is evidenced from the observation of such persons. By extension, it can be surmised that millennia of evolutionary preferences have equipped us with 'sharp enough' senses to let us make "full meaning" of the plethora of stimuli that flood us every moment.
While not denigrating measurements as counter-intuitive or 'clinical', we must see them for what they really are. It is something like a "chicken-and-egg" situation. That a phenomenon could have a subtler aspect starts as an educated premise, and only then do we think of a manner and methodology of identifying that aspect by measurement. That means measurement is only an 'extension' of, and not a substitute for, the primary senses. It is only with an acceptance of that fact that one must get into the world of sensory inputs and their validity, and the essential (yet confusing, if not properly understood and deployed) aspect of measurement. What to measure, how to measure and how to correct are questions that ought to be suggested and validated primarily by the sensory inputs, and not in a 'scientific clean room' environment removed from the reality of the world of the senses.
Sound and Sense
Leaving out the little-understood (and even less 'measured'!) sense of touch (which perhaps is made good use of by the visually challenged and certain sighted others), let us now move on to 'audiophilia'. Often there is a clear division here between musicians and non-musicians who are in a way equally enthused about music. I have among my friends a pianist and a bass guitarist. While listening to a recorded piece of music, it is interesting to note the observations of both. The pianist is often concerned about the general tonality and rhythm, while the bass player primarily notes the peculiar manner in which the bass lines interact with the rest of the music. A lay 'audio' enthusiast, on the other hand, notices what is often termed the 'fidelity' of the recording, and often misses the nuances of the musical techniques.
But before all else, let us ask ourselves a question. Why is it that an undue importance is accorded to measurement in sound, an importance that is lacking almost totally as regards the other sensory inputs? Are we justified in stressing certain aspects of the auditory stream for measurement? Is our understanding complete and 'correct'? Why don't we listen to our ears as readily as we trust the other senses?
By the way, what do our ears tell us? Over the millennia these funny-looking protuberances on both sides of our head have, along with our peripheral vision and our reflexes, stood us in good stead as survivors on a dangerous planet. Localization and spatial clues, clues about the 'character' of the venue of the acoustic event, and rock-solid 'mental imaging' even when everything, including the source and the receiver, is moving like crazy, are some of the important data that we glean from the auditory inputs. And to add to the complexity, the 'audio' ranges from a barely audible 20 Hz at one end to another barely audible 20 kHz at the other end, according to pundits. Then there are those on the lunatic fringe who claim that nothing less than 30 or 40 kHz would 'really' make music preserve its 'life'.
It is comparatively easy to plan and conduct tests which could show that what the ears perceive about the spatial and other clues mentioned above would 'hold together' pretty well even when the range is limited very much. Without going into involved arguments, it is easy to see that, just as with the other senses of sight, smell and taste, the subtleties of hearing too have not been understood well enough to be measured 'correctly' and interpreted. The situation is not far from the classic encounter of the five blind men and an elephant. The 'part-interpretations' made by the individuals about the huge animal could very well be true from their perspective, but unfortunately the sum of the interpretations reveals at best only a caricature of the elephant.
Sound Tales and the Fallacy of Stereo
Stereophonic sound originated as an attempt to record and reproduce acoustic events using two channels (or more, as is popular practice today) in such a manner that it could mimic the original event. Has that attempt succeeded to a satisfactory level?
Here permit me a slight detour, if only to answer the many 'injunctions' from the binaural brigade, by introducing the first of the sound 'tales'. Just prior to starting on a jungle trek long back, I received a few binaural recordings. I do not know how they were made, but I wish to believe that they were made with a 'dummy head mic'. I had at the time a rather good quality Sony Walkman with built-in recording mics, and promptly transcribed the binaural pieces to Ferri-Chrome cassettes using my 3-head home deck. We camped near a river, and early next morning I took the Walkman and my favourite pair of open headphones, which had 'on-edge' transducers that radiated front and back and sat with their edges on the pinna. Sitting on a rock near the river, the surroundings echoing lightly with the gurgle of the river and bird sounds at a distance, I switched on. The recorded nature sounds, played at a moderate volume level, soon had me transported to somewhere far away. The sound was uncanny, with bees buzzing realistically and bird sounds making me look for them left and right and often behind me. Switching to some music tracks (sorry, only Western classical) at a slightly higher volume level, I could close my eyes, forget about the actual surroundings and immerse myself in the music and its realism. Perhaps it was as close to real as a recording has sounded to me in all these years. But the illusion, though very good, suffered from one drawback--I was getting a stiff neck from remaining motionless and tensed. A slight movement, and the image moving with me inside my head would leave me confused. It had happened earlier too with the natural sounds, when I turned to look for the bird. The effect was very good while it lasted, but all it took to undo the castle of cards erected by the auditory image was the slightest of head movements.
I have no wish to take conventional stereo speakers to that wilderness setting in order to see how they would reproduce the "canned reality" in the cassettes. I am prepared to be in my average-sized room, and in deference to the 'speaker brigade' I am prepared to follow their directions to the full. So the speakers are set up as per the recommendations of their designer and I am sitting down, again with a stiff neck and an upright posture, and, as one of my less serious friends suggests, with the hair parted in the middle to maintain symmetry of the HRTF (the head-related transfer function)--that is, IF you have hair! The stereo signal, when played through a 'good enough' chain of equipment, is able to re-create the "recorded reality"--but only when one's ears are precisely at the "sweet spot", a mythical 'ideal' listening position.
A millimetric shift/rotation of the head plays absolute havoc with the painfully recreated image. If you turn your head (or, horrors, walk about!) your brain goes into overdrive trying to make sense of signals that have gone "out of whack", as they say. With a small percentage shift in the 'listening position' (I shall attempt to put my finger on its mythical nature as we go into the further tales), we are no longer listening to 'stereo'; nor, for that matter, to mono either. Any day, honest-to-goodness mono is far better to audition than a muddied left or right channel with some frequency- and level-dependent cross-interference from the other channel! Your ears never had it so bad...
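For the reader who likes to put a number on such things, here is a rough sketch in Python of how the arrival times from the two speakers drift apart as the listener moves off the centre line. Everything in it is my own illustrative assumption and not a measurement: two point-source speakers 2 m apart, a listener 2 m back from the line joining them, a free field with no room reflections, and the hypothetical function name arrival_time_difference. It treats the head as a single point, so it says nothing about the HRTF or about rotation of the head.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second in air at about 20 degrees C (assumed)


def arrival_time_difference(offset_m, speaker_spacing_m=2.0, listener_distance_m=2.0):
    """Left-minus-right arrival time (seconds) at a point listener.

    offset_m is the sideways shift of the listener from the centre line,
    positive towards the right speaker. The geometry (spacing, distance)
    is an illustrative assumption, not a recommended set-up.
    """
    half_spacing = speaker_spacing_m / 2.0
    # Straight-line path lengths from each speaker to the listener.
    d_left = math.hypot(offset_m + half_spacing, listener_distance_m)
    d_right = math.hypot(offset_m - half_spacing, listener_distance_m)
    return (d_left - d_right) / SPEED_OF_SOUND


if __name__ == "__main__":
    # Print the timing skew for a few sideways shifts of the listener.
    for offset in (0.0, 0.001, 0.01, 0.1, 0.5):
        skew_us = arrival_time_difference(offset) * 1e6
        print(f"shift {offset * 100:6.1f} cm -> skew {skew_us:8.1f} microseconds")
```

Run it with your own room dimensions in place of the assumed ones. The sketch only covers the timing between the two channels; the level changes and the frequency-dependent cross-interference mentioned above are left out, and so is everything the room adds.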
Now for some timeless tales; timeless because experiences like these are sure to have occurred to most of the readers/listeners. I was riding home one late afternoon on my noisy motorcycle and, as I rolled into our front yard, I could hear my little nephew beginning his violin lessons. The more 'serious' notes of the teacher and the 'scrapings' of the disciple were distinct. On many subsequent occasions through the later weeks, the experience recurred. I noticed that as I rode in, I could identify a few things: were the duo seated in our front hall, or were they in the smaller side room, or at times, were they in the open verandah? The clues about the 'venue' were very strong even in that rather noisy setting. The second thing I noticed was that as I walked in and went towards the inside room, the 'clarity' of the sound improved, but its basic 'quality' remained unchanged. As I continued walking and turning corners, the 'image' as I perceived it remained 'solid', continuing to reinforce my spatial and other clues. Movement only strengthened the realistic aspects rather than detracted from them. If they were in the open verandah, as I went in, the deterioration in 'quality' was more abrupt, but the other clues, even with a considerably weaker 'signal', got stronger and confirmed their location and the nature of that location.
To amuse myself I often imagine a pair of excellent stereo mics capturing all the sound clues that the purists have identified and measured and 'understood'. I would feed them subsequently through a 'good enough' reproduction chain. THEN would I choose to walk in. I should be happy if the system could give me 25% of the realism that I was used to over the months; forget about the quality and the 'signature' of the venue and all that. But the moment I started walking, the whole thing would "skew", driving me nuts if I were serious about 'interpreting' what I heard. Frankly I would be more comfortable with a mono recording-reproduction chain; at least the image would not skew and confuse my brain as I moved, coming as it would from a single, stationary speaker.
That should tell us one thing--having two of everything does not always make a thing better! It is a simple experiment within the capabilities of any lay enthusiast. Try it and be confounded.
The second of these "stereo listening position" tales, again, presents a scenario likely to be familiar to many. I am fond of visiting temples. Large temples with huge corridors and high ceilings, all usually carved from granite, have astounding acoustics. Cathedrals come close to the acoustic experience, though the temples are quite unique in their 'signature', believe me. I have often walked towards temples I hadn't visited earlier, and have wondered at how close an 'image' of that acoustic venue could be gathered by my brain as it 'listened' to the chanting/singing that was going on inside. The moment you walk in and are in the proximity of the 'event', a veil is lifted to reveal absolute clarity. But the artefacts that had suggested to you the 'layout' of the venue and its acoustic signature a moment ago get stronger and confirm the image. And then you IGNORE that image, including the decays, reflections etc.; perhaps it is the Haas effect that enables you to hear only what you wish to hear, with a clarity and 'involvement' that has to be experienced to be known.
As you walk away, again, many things shift, and the signals grow weaker, but the 'clues' remain strong as ever giving you 'solid' images. The experience is enriched when a group of musicians circumambulate the temple corridors, and from where you are standing --and listening!-- you continue to receive "updates" about the passages and corridors and the open areas through which the team passes. It sure is a complex data-flow that manages all that in real time!
Is there a way to "capture" all that, or even a small part of that? And reproduce that with a modicum of realism? What stereo deployment of mics will do duty here? What about the speakers? And what 'listening position' will be advocated?
The above are acoustically extremely complex situations. We cannot yet imagine 'tackling' them--certainly not with our current technology and with our current "understanding" of the art and science of sound. We haven't even learned to "capture" a non-professional duo like my nephew and his teacher sitting in a corner of the front hall of my home. What we pretend to do with what we conveniently push as a solution to all sound ills--stereo--is at best a false caricature of the real. The ideal listening position is a myth, there one moment and gone the next. In real listening there is no such sacrosanct thing as a 'listening position', though conventional stereo cannot do without one.
An understanding of what we are doing wrong, or not so right, is the first step in seeking out the proper direction. What our present technology cannot hope to do --ever-- in its present state is the one realization that can push us onto more productive pathways of exploration.
What are the aspects that need to be captured to preserve at least the chief characteristics of a real acoustic event? How to code and decode that? And, more importantly, how best to reproduce that so that the "canned reality" is reconstituted into a semblance of the original? These are serious issues that call for serious, fresh approaches.
Such thinking would make sound sense!
* * * * * * * * * * * *