It’s crazy because here we are, brought up with classical music, spending years developing a sound and a voice, being cognizant of acoustics, instruments, halls, and the minutest details of sound quality, only to be romanced by Spotify. It’s not about being a snob. This is an aesthetic thing, and music has so many levels of enjoyment built into the performance.
Is this what we’ve come to—after internalizing sound, tone, timbre, and phrasing, merging it viscerally with emotion, gesture, and meaning?
Music allows us to unify our entire being around the process—and that is how we roll.
We are at once the living Brahms, Chopin, Schubert, or whoever the composer of the masterwork may be. And all of that intensity we bring is, in the end, filtered back out through a substandard rendering.
Think about it. It took well over a hundred years of audio technology to get to this point. During that time, we have carried the tradition of classical music forward with all the great composers and teachers behind us—a long, long tunnel stretching back to Bach and beyond. That is us, carrying all this beautiful work from one generation to the next, continuing what has been passed down to us. All the while, audio was progressing—meeting new standards and always trying to capture the experience of being in the room.
It’s funny, really, that the modern way of streaming has lowered the standard so much that we see the LP coming back—not as fashion, but because it actually is better. Maybe more trouble, sure—but if you know, you know.
The technology is there, but someone decided cheaper was better—that instead of giving us the best possible sound experience to match our giving everything to be part of the tradition, they gave us the minimum. Why did they do that? Why did they make the least effort?
The standard answer is: because they could. They got away with it. Everyone was so busy getting music for free that when Spotify made all those deals with the labels, it was going to be $0.0028 or nothing.
(The stream is currently sitting at $0.004—after Spotify takes their 30%, it’s $0.0028—and that’s to the label.)
It’s not only insulting and demeaning to all that giving of one’s soul; it’s also destructive to the millions of ears that are losing out on what is real and true.
It’s a rip-off. We are giving in to a service that’s destroying the industry from the inside out.
A real bait and switch—give us your hard-won, hard-worked-for art, and we’ll water it down for public consumption.
Really, no one will notice. You’ll see. The benefits of what we do outweigh the losses.
And in the end, the artist always pays.
So what are we giving up? What are we losing in the process?
I don’t mean to spoil your fun. If you get something from Spotify, good. But at least know the difference. Think not just about what it is ultimately costing the art by diminishing it, but also about what it is replacing and how it might relate to the lack of support for live music: the live music experience that is being sucked into the black hole of Spotify.
If something so substandard is replacing the live music experience, what have we sold out to? And why?
There are so many deeply gifted sound engineers, sensitive to gear setups, rooms, and the many things that go into creating the sound of a recording. This is for them too. We trust them and work closely with them to make art a lasting legacy—something that duplicates the artist and, yes, creates product and value.
So why are we selling ourselves—and everything we do—short for what amounts to a low-end experience? Who exactly is benefitting from that? You can have the best high-end earbuds, headphones, and speakers—but then you go and listen to music through Spotify, the junk food of streaming.
The worst part is that it is damaging to almost every aspect of the field. The more we subscribe to this level of listening, the less we have an educated audience, and the less we have an audience at all. What is Spotify giving to the arts? Are they supporting artists, arts organizations, or orchestras?
I’m wondering if these streaming services like Spotify are just parasites living off the artists they claim to support, only to exploit them for their own gain. And if this is the case, then what can we actually do?
It’s important to acknowledge that the way streaming works has created an impossible business model for both the label and the artist. They are being squeezed beyond any sane limit, with the streaming service being the only beneficiary of this modus operandi.
The original equation was set up for downloads in a pre-streaming environment. At that time, the service took 30%, and the spoils were divided in various ways between the label and the artist, depending on the contract.
That same equation still exists—except the original $0.99 download is now $0.004 per stream. Of that, 30% goes to the service, with the rest divided between the label and artist.
Put another way, a $0.99 download brings in 247.5 times the revenue of a $0.004 stream, a 24,650% difference per song.
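(For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It uses only the figures quoted above: the $0.99 download, the $0.004 stream, and the 30% service share. The variable and function names are mine, chosen for illustration, and the per-stream rate is the rough figure cited here rather than an official number.)

```python
import math

# Figures quoted in the article; real per-stream payouts vary.
DOWNLOAD_PRICE = 0.99   # dollars per download
STREAM_PRICE = 0.004    # dollars per stream (approximate, as cited above)
SERVICE_SHARE = 0.30    # portion kept by the service


def after_service_share(gross, share=SERVICE_SHARE):
    """Amount left for the label and artist once the service takes its share."""
    return gross * (1 - share)


# How many times more a download pays than a single stream.
download_to_stream_ratio = DOWNLOAD_PRICE / STREAM_PRICE

# The same gap expressed as a percentage of the per-stream amount.
percent_difference = (DOWNLOAD_PRICE - STREAM_PRICE) / STREAM_PRICE * 100

# Whole streams needed to bring in at least what one download did.
streams_to_match_download = math.ceil(download_to_stream_ratio)

print(f"Left per stream after the service share: ${after_service_share(STREAM_PRICE):.4f}")
print(f"Left per download after the service share: ${after_service_share(DOWNLOAD_PRICE):.3f}")
print(f"Download-to-stream ratio: {download_to_stream_ratio:.1f}x")
print(f"Percent difference: {percent_difference:,.0f}%")
print(f"Streams needed to match one download: {streams_to_match_download}")
```

Because the 30% share comes off both prices, the ratio between a download and a stream is the same before and after the service takes its cut.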
And yes, click farms are out there—places that artificially inflate streams—but this doesn’t solve anything. It just highlights how the system rewards quantity over quality, prioritizing numbers over real engagement.
We can’t go on like this; we need an industry overhaul. AI is also coming in to usurp another portion of the business, and it’s being introduced by the streaming service itself—where they become the sole beneficiary. Get it?
We as consumers have a choice, and maybe it’s time to go back to basics and make live music a priority. Live music needs to be more accessible too. It’s something to think about as we watch what is happening slowly but surely to the ecosystem of classical music.
(A deeper look at the distribution model of physical recordings—and how we transitioned to digital distribution—would provide a more in-depth view of the downgrade this has caused. Let’s save that for another time.)
But anyway, we are talking about playback. That is what we want to find out about here. What are we actually hearing when we listen to a recording—even perhaps one you may have participated in?
It is our good fortune to be able to have a conversation about all of this with one of the masters of playback, Tim Marutani.
I’m not at liberty to reveal the names of his clients or where they live (they live all over the world, I think I can say that!), but suffice it to say, you have heard of some of them.
And let’s just agree that for every ear-minded person, this is something to think about and add to your listening evolution. It may be a luxury to care about what happens to all that recorded sound, but since our lives are indissolubly associated with recorded media, we might as well get enlightened, just in case.
Kathy Geisler I wanted to start by talking to you about what you do, so that we can understand where you fit into this puzzle of music.
Tim Marutani The majority of my life has been to focus on the playback side of the recording—the passion for refining the playback side in an end user’s home. This has led me to meet people on the recording side. But my experiences are more vast on the playback side.
KG Can you talk about playback? How can we understand the process?
TM Playback is like eating a meal. The performance is like cooking. There are all different types of palates—some people like their fish well done, others prefer it raw. In the same way, listeners often adjust the sound during playback to suit their personal taste, rather than experiencing the recording as it was originally intended.
Ideally, playback should reference the recording itself—true to the original source. But that’s not often the case, since many consumers make changes based on their own preferences.
Some listeners aim to follow the guidelines of being true to the source: this is the proper timbre of the violin we used, this is the proper pitch that was played. Others have their own interpretations, and they create their own sound—and a lot of that is due to the fact that there are flaws; there is noise coming in on the playback side, just as there is noise coming in on the recording side.
Once the recording is made, that’s the reference. If you take that into the home it might not sound very good to the owner. So what do they do? They massage their system to get the sound they want. That’s what goes on most of the time.
The problem with that is the more filters one puts in the playback side, the less able it is to play all genres of music. It’s going to be more frequency selective so that maybe my system sounds great with chamber music, but when I want to listen to a full symphonic sound it doesn’t sound so good.
Or say you want to listen to a male voice, but then the female voice doesn’t sound so great. That’s usually due to a lot of non-linearities in the playback side—distortions that cause certain frequencies to be reproduced unevenly.
But what I’m sharing with you are issues that were not necessarily taught to me in a formal way, but rather learned over decades. Some things I picked up from people on the recording side, others from those on the design side—the people who design the recording electronics and the engineers who design the playback systems in electronics.
When it comes to playback, the listening environment is much less controlled than in a fixed venue like Carnegie Hall. There the acoustics are designed for optimal sound. But in playback, people listen in their living rooms or bedrooms, where the acoustics vary depending on the space and how much effort they put into optimizing the sound.
In the consumer market, only a minority—less than one-half of one percent of people who listen to music at home—will even think about the acoustics. Whatever setup they have at home, in the space where they listen, becomes the individual’s reference for how music should sound.
It’s all about references. I tend to believe the best reference is true to life.
The more a triangle played back in the home sounds like a real triangle, the better the illusion can be. Because recordings—whether in Dolby Atmos, stereo, or mono format—are all an engineered illusion of the event and what it originally sounded like in the room where it happened.
Does that make any sense to you? I mean, you’ve recorded a lot, and you know that when you record a pianist, by the time you take it home, you try to create an illusion of the event. It can’t even come close to the real event—it’s just an illusion of the event.
My career has been about trying to create a better illusion, and the formula that I use is true to life.
The truer the sound is to life, the less the mind has to process it to understand the sound. The more one has to think about what they’re listening to, the more distracting that is to the message from the recording artist and the producer.
KG It sounds like this is all geared toward an audience of one, in a sense—their own personal sound field, wherever they happen to listen.
I wanted to ask you about the way most people are listening, since the majority are using earbuds on streaming services. I don’t want to say it’s come a long way since the beginning of streaming, but where do you think things stand? Not necessarily in relation to what you do, but considering that 99% of listening today happens through earbuds and streaming?
TM Streaming is one topic; headphone devices and earbuds are another topic. But the vast audience that listens using earbuds is one of the main markets for surround: Apple Music and Dolby Atmos.
What they are doing with audio surround is trying to create additional space and cues between the two ears. They are trying to enhance the illusion at the engineering cost of a lower resolution. That’s like altering a meal to fulfill the palate of an average person, which is fine (you have a larger audience, a bigger market), but the sacrifice could very well be the quality of the ingredients that go into that meal.
KG So it’s lower resolution to use Atmos?
TM Atmos is a lower-resolving medium. It brings out a lot of cues of space—vertical space, high to low. Creating that space is a lower-resolving signal than when recording a piano, where you might use various high-resolution formats to achieve a resolution that gets closer to the sound of the analog signal (such as 24-bit/176.4kHz, 24-bit/192kHz, or DST—a lossless compression method that preserves audio quality).
But Dolby Atmos, the surround sound format, relies on spatial cues to create a sense of depth and immersion. These cues help generate the perception of space, making it easier to achieve a surround effect when one has five or six loudspeakers in a room. The signals that create that space are lower in resolution than the signals used in a classic two-channel recording. That’s a fact. But it’s also a fact that there are a lot of home theaters and surround sound systems in the market, and these systems can play multi-channel music.
One has to remember that 40 or 50 years ago, we had four-channel records. We had quad—where did that go?
(Quadraphonic sound, as it was formally known, was an early form of surround audio that used four discrete channels to create a more immersive listening experience.)
The earbud audience can well appreciate the Atmos sound—somebody who really enjoys getting involved with listening to recorded material. I don’t want to invalidate one market over another, but I do want to differentiate the engineering quality of one over the other.
Surround is a lower-resolution signal. Period.
KG What is the resolution of surround sound?
TM It can vary. Right now, only a minute number of recordings are done at 24-bit/192kHz. I would think that 99% or more of surround recordings are done at 24-bit/48kHz, and that’s because of bandwidth limitations. When they’re recording, they just don’t have enough space to capture that much data.
KG But Atmos doesn’t limit you—right?
TM That’s correct. Atmos does not limit. But high-resolution tracks come at a cost: more hard disk drive space, more processing. A lot of Atmos recordings, and a lot of surround recordings today, are generated from the movie industry. And the movie industry captures at a 48kHz sample rate.
KG Right.
TM Because they just can’t take a rig out into the field and record 24, 32, or 64 channels at high-resolution sampling rates. Most movies are recorded at a 48kHz sampling rate. The decisions are made right out of the gate.
Even if we look at how digital audio recordings were going up to 96kHz years ago, that is already twice the sampling rate of what most surround is today. And, as well, multi-channel SACD has greater resolution.
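(To give a sense of the scale involved, here is a rough sketch in Python of how much data uncompressed audio generates at the resolutions and channel counts mentioned above. It assumes plain, uncompressed PCM, where the data rate is simply sample rate × bit depth × number of channels. The function names are hypothetical, and actual recording rigs and delivery formats involve more than this one calculation.)

```python
def pcm_bytes_per_second(sample_rate_hz, bit_depth, channels):
    """Raw data rate of uncompressed PCM audio, in bytes per second."""
    return sample_rate_hz * bit_depth * channels // 8


def pcm_megabytes_per_minute(sample_rate_hz, bit_depth, channels):
    """The same rate expressed in megabytes (10**6 bytes) per minute."""
    return pcm_bytes_per_second(sample_rate_hz, bit_depth, channels) * 60 / 1_000_000


# Two-channel references: CD quality and a high-resolution stereo file.
cd_stereo = pcm_megabytes_per_minute(44_100, 16, 2)
hires_stereo = pcm_megabytes_per_minute(192_000, 24, 2)

# A 64-channel session at the 48kHz rate, and the same session at 192kHz.
session_48k = pcm_megabytes_per_minute(48_000, 24, 64)
session_192k = pcm_megabytes_per_minute(192_000, 24, 64)

print(f"CD-quality stereo (16-bit/44.1kHz): {cd_stereo:.2f} MB per minute")
print(f"High-res stereo (24-bit/192kHz): {hires_stereo:.2f} MB per minute")
print(f"64 channels at 24-bit/48kHz: {session_48k:.2f} MB per minute")
print(f"64 channels at 24-bit/192kHz: {session_192k:.2f} MB per minute")
print(f"Cost of 192kHz over 48kHz: {session_192k / session_48k:.0f}x the data")
```

Quadrupling the sample rate quadruples the data on every channel, so the storage and processing cost of high resolution grows quickly as the channel count goes up.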
But the whole point of surround is that one can create an engaging experience. It’s all about the experience, isn’t it?
KG Yes, absolutely. That’s where this has all been going—I guess. So is that what this new Apple Classical is about—creating a surround listening field?
TM For earbuds, yes. Think of all the AirPods they are selling.
KG But you have to have these special new AirPods that do that—the newer AirPods Pro that give you the forward-and-back experience.
TM And Sony has their version. They call it 360 Spatial Sound / 360 Reality Audio. Dolby is the gorilla among them all.
KG Why do you call it a gorilla? Because it’s the leader?
TM The big name. Yes. Hard to push a gorilla over.
KG Yes. So let’s talk a little bit about some of the things you do that may or may not be accessible to most people, but that they can still learn from in some way.
When you design a listening system, it’s all about the acoustics of the room, the listener, and what kind of music they listen to. Is it mostly like that? The narrower you can get, the more refined an experience you can have—right?
TM No, not the narrower. A good playback system should be able to play anything from A to Z—whether it be Bad Bunny, ZZ Top, or Strauss. The better the system, the more linear it is, the more it will translate all that music.
Playback levels and the volume of the room play roles, but once again, it’s about creating an illusion in the house. What is unique about the role I play today is basically setting up the infrastructure that a system goes into.
So, for example, start with the room—that’s your starting point—and how to deal with the inherent issues of that space.
The enemy to both the recording side and the playback side is noise. That noise could be environmental—sound coming from the outside to the inside. But the silent enemy is the noise created by the electrical grid.
In today’s world—including your house, Kathy—if you walk around, think of all the items: LED light bulbs, appliances with built-in computers, the computer you’re talking on right now. All around you, it’s like poison ivy vines—you’re trying to walk through the house without getting a rash. All of those items create havoc—let alone the noise coming in from the power grid and your neighbor’s home.
The more one can isolate playback system electronics from that noise, the more one will hear what the electronics can really do. It’s like looking out of a picture window instead of stepping outside. Or like going outside with rose-colored glasses versus neutral glasses. Rose-colored glasses will tint the world—that’s what noise does.
And that noise is one of the reasons why people trying to improve their sound keep changing things—because they’re constantly trying to filter it out.
What the retail consumer side doesn’t teach the end user is this: Don’t filter out the noise—focus on eliminating it. Go to the root source.
Going to the root source is not an exotic way to sell equipment, which is what the retail market tries to do. That’s what they do—they sell. And if one product a customer buys doesn’t have the right sound, why give them advice on how to correct it? Just sell them something different. Or sell them some add-ons.
It’s like when you don’t like the taste of a meal—just keep adding more spices. I don’t think that works too well.
Is what I’m saying making any sense to you from your background?
KG Yes, absolutely.
So, I want to switch gears for a second because there’s something I wanted to ask you about: the music delivery systems we’re using now. That includes LPs and CDs, and then all these higher-end formats. I don’t even know what to call all of them, but what are you working with in that realm for music delivery systems? What do you use? I mean, is it HDtracks?
TM Oh, as a commercial tool I’m playing vinyl. I’m playing tape. I’m playing digital files which are both in high resolution and normal CD resolution. When I say files from a CD, we could take the music information off of a CD and we can convert that into files for playback, which is the physical version of streaming.
But the commercial streaming tool I use would be Qobuz. And lately, every once in a while, I go to Dolby Atmos in a theater-room and I play surround by Dolby Atmos. I find my customers like that because with Apple TV and streaming they can pull up and read the lyrics as they’re listening to the music. Sometimes they are identifying the words in the lyrics that they’ve never understood before in songs that they have listened to all their life. That’s another version of entertainment.
KG So, it projects the lyrics on a screen when you’re listening to audio?
TM Yes. In the home theater market, that person already has a multi-channel system for surround sound movies, so that room can also be used for streaming multi-channel music.
KG Interesting. So, when you say Qobuz, what’s the resolution that you’re streaming in?
TM Anywhere from 24-bit/192kHz down to CD quality (16-bit/44.1kHz). Let me say this—I do not have any material that is compressed MP3, lower resolution. I have none of that.
KG So Spotify is out?
TM I think their highest resolution might be CD quality in today’s world.
KG So in general, they would be at the bottom of your list.
TM Not in general. They are at the bottom of my list.
KG And the top of your list would be Qobuz, and then under that would be Apple for the Atmos theater listening experience?
TM Well, look, let’s just say this—for the top of my list, I’m not streaming. I’m playing back half-inch analog tapes. And you know what a tape machine is?
KG Are those one-off recordings?
TM These are copies from the running master. And the running masters are the material that may have been used to produce the vinyl.
KG Were these analog recordings?
TM Yes. Analog. But once again, that’s not for most people because they are extremely expensive and difficult to acquire. And playing them back requires specific equipment. But that’s high up there in the food chain.
KG That’s the pinnacle of the playback experience to be had.
TM Yes.
KG The thing that’s closest to the real thing. Maybe.
TM Well, a large part of the audience will say that there’s nothing better than vinyl—except they have no idea that what the tape does is eliminate four or five replication stages in the process. And so, a properly done tape will exceed the quality of vinyl.
KG Is it a master, or is it just a raw recording?
TM It’s all of them. Some of them could be a raw recording, some of them could be stereo mixes from multi-channel music—all the way from Miles Davis to the Beatles, the Rolling Stones, Berlioz, Mahler, or Heifetz.
KG And these are from recording sessions?
TM These are from the record labels. They would send tapes out to the mastering engineers to make a record. These are copies. A lot of them are copies from those running masters.
KG They are pre-masters.
TM What’s the definition of a pre-master?
KG Well, if it’s the tape that was sent from the recording session, whether it’s mixed or not. If it hasn’t been mastered yet.
TM It’s the tape from the mastering engineers sent out to make a record. So then it goes to the disc cutter.
KG Oh, so it is a master.
TM Yes.
KG Okay, so it’s a master before duplication and production. And in this case, in the food chain, it’s a duplicate of the master.
TM Okay. But you know they’re sent all over the world to different mastering engineers to be duplicated for various markets.
KG How rare are they?
TM They are rare.
KG Do you buy them from a person or a store?
TM You can buy them from companies. One company would be Analog Productions—that’s in the Midwest. Another is a San Francisco company from Paul Stubblebine called The Tape Project. But it’s a smaller segment of the market. It’s also a very serious segment of the market. It’s extreme. It’s the most expensive medium.
KG How much does something like that cost?
TM One recording easily costs five hundred dollars.
KG $500. That’s what I was going to guess.
TM Because a blank tape could be a couple hundred dollars.
KG Right. And then, what’s the lifetime of something like that? How many times can you even play it back before it starts to degrade?
TM It all depends on the care one takes. And in answer to your question—you’ll get 20 different answers. Some people say decades. And think about it—how old is a Heifetz recording that I may have? How many decades old?
KG But also, you have to store tape in a special way. It’s like having a wine cellar for audio recordings.
TM Correct. That’s correct. You can’t have anything magnetic near it.
KG Yes. It has to be isolated from electric currents.
TM But the isolation can be as short as six, seven, eight, or ten inches, depending on how strong the magnetic field is.
KG You don’t put it in a lead box or anything like that.
TM No. And you don’t put it on top of a loudspeaker because of the magnets in the speaker.
KG Then there’s another market for live one-off recordings—a non-edited recording. I guess that would be the same as a raw recording.
TM There’s another engineer in the Midwest, in Chicago—Jonathan Horwich. He’s very well known. He’ll go out maybe twice a year and do a live recording. He then duplicates that live recording for his tape audience, and record labels will sometimes reach out to him and authorize him to duplicate previously recorded albums for distribution.
The big boy is Analog Productions, a division of Acoustic Sounds.
KG These are the people that cater to this high-end audiophile audience. And how big is that? Would you say 500, 5,000? Like that? Somewhere in there—10,000, maybe?
TM The audience is growing. And the businesses catering to that audience are growing. There are at least 10,000 people who are buying audio tape recordings at anywhere from $250 to beyond $500.
I have a half-inch tape machine. Half-inch tape costs a lot more than quarter-inch. Prices for new 15 ips (inches per second), 2-track tapes are between $200 and $1,050 per reel, and some albums require two reels—so the cost can be as high as $2,100.
The key for the classical audience is, you know, you have the components of timbre, pitch, and intensity in classical recordings. The better the playback system, the more one can hear that timbre.
I’ll give you an example. I’ve had industry reviewers visit me, and we would play some recordings they were familiar with—one in particular of a guitar trio. And for the first time in their lives, they realized the timbre of the three different guitars in the recording. It was an epiphany for them.
You’ve got three guitars in a mix, and the industry person never knew the guitars sounded different from one another. It’s no different than looking outside your house—you remove all the dirt on the window that builds up over a year. It’s like having a cataract on an eye. When people get new lenses on old eyes, the world becomes brighter, more vibrant. Same thing with music.
KG So, when you said three guitars, you mean a recording with three guitars playing simultaneously?
TM Yes, and for the first time, he realized the sound differences between guitars one, two, and three.
KG Wow. Okay, let’s bring this back now. I want to ask you about the average listener, a person who has access to consumer materials. They can get Qobuz and listen to it over good earbuds and have a better experience.
How can you take what the average consumer has and elevate their experience with things that are accessible to them? How can they improve their experience?
TM Let me give you an analogy to your question. How do you teach somebody the difference between store-bought bottled mayonnaise and homemade fresh mayonnaise? How do you identify for someone that tomato ketchup is a whole different taste than a vine-ripened tomato? Or that a vine-ripened tomato is a different taste than the tomatoes one buys at the grocery store?
Exposure. Experience. It doesn’t necessarily mean that the individual will like the taste of fresh mayonnaise, but they have no clue.
I’ve given fresh mayonnaise to friends. “You know what this is?” “Hmm, it tastes kind of familiar.” They couldn’t identify it as fresh mayonnaise. And then I said, “No, that’s fresh mayonnaise—fresh.” The reference all their life has been bottled mayonnaise.
KG Well, look, in the case of musicians—let’s say, or people who work with musicians—they’re used to hearing the real thing up close, in an acoustic setting within the room. So they do have a reference point that is valuable.
TM Okay. That’s a different reference if we’re talking about playback. I can respond to your specific issue. I’m working with a talented design engineer who’s also a musician. He was a professional guitarist—they took him out of high school so he could play with a band in Europe. That’s how much passion he had for music.
He had to relearn how to listen to music from a playback standpoint. As a guitarist, he was listening in mono. He had to retrain himself to listen to stereo in playback.
So, where I’m going here is that it’s valid with a musician’s ears, but it’s a different perspective. It’s a whole different perspective from what the pianist experiences sitting on his chair versus what the microphones pick up—which are positioned differently from where the pianist’s ears are and also capture the acoustics of the room.
And then, add another layer of artistic work—once the microphones pick up that information, it goes to the tracking engineer, the mixing engineer, and then the mastering engineer. And then you have your final product. And that’s true whether it’s Qobuz, Atmos, or a producer’s work.
What the musician has is a reference, but it’s a different perspective from an audience listening to that musician.
KG Okay. But is there something that you could suggest? Is Qobuz that much better of a resolution experience that it would be an easy upgrade?
TM That’s certainly a valid suggestion. And the reason I phrased it that way instead of calling it an upgrade is that you can expose somebody to a really good vine-ripened tomato—it doesn’t mean they’re going to like it. But it’s a very good suggestion.
KG And then do you get into personal listening devices, such as earbuds or anything like that?
TM The only time I listen to personal earbuds is when I’m on the phone. When I’m listening to music, occasionally I’ll put on a great set of headphones. And 99.9% of the time, I’ll turn on a playback system.
Another suggestion I have is that musicians get a really economical stereo system, add a turntable, and listen to some vinyl—which brings out more emotion.
Because why are we listening to music? People I know in the music industry cannot just sit down and have music in the background. When they sit down, they’re really listening to music and appreciating it. They don’t necessarily listen to music while cooking in the kitchen.
But with some of the engineering suggestions I make, one could be cooking in the kitchen and still hear the improvements in the sound.
A lot of the equipment that people buy involves a phenomenal amount of money—sometimes more is spent on how the product looks than on the actual product itself. Not how it sounds.
What’s difficult is that we’re trying to condense experiences that I’ve learned over the course of 50 years from a lot of brilliant engineers—a lot of experiences. Whether they be Stephen Jarvis or Keith Johnson of Reference Recordings—when one tries to condense 50 years of learning, of making mistakes, into one conversation, it becomes difficult to organize.
Everything I’m sharing with you comes from memorable experiences, so it’s important.
Some things took me a career to learn (or shouldn’t have taken me a career to learn) because engineers had been telling me about this for 30 years. Understanding what they were telling me, however, is another story.
KG That’s true for most musicians. I think our ears are always evolving, and our understanding of sound and music is always evolving.
TM The retail side of my industry doesn’t teach it, and very few practice it—because their main motive isn’t the result, it’s the sale. It’s like the difference between buying an instrument and really going out and buying the right instrument.
One could sell a piano to somebody, but that piano might be too large for the room it’s going in. And you don’t know how many times I’ve walked in and the consumer didn’t understand what I said: I think that piano that you’re going to get is too large.
At the end of the day, it’s about understanding how all the parts work together—the space, the technology, and the people using it. And because I’m always learning—from the people I work with, from new technology, and from real-world situations—that path has led me into some surprising and transformative projects.
As for what else I’m doing, 512 Engineering has a website that shows a collaboration I have with an engineer, a former musician. He also developed a medical device that gets implanted into the human heart.
How’s that for bringing together tech and music?