File-based audio: Remote working for the new and unpredictable future webinar replay
Monday, 06 April 2020
Transcript
Cindy:
Welcome, everybody, to the Emotion Systems webinar on file-based audio, remote working for the new and unpredictable future.
So over the last few weeks, work practices have changed, and our customers have been asking us to work with them in new ways. And we’re going to get into that. And I’m here with MC Patel.
MC Patel:
Hi everybody. Thank you for attending this webinar. It’s the first one for me. So, you might find a bit of humming and ahhing, but hopefully, I’ll impart some useful information.
Cindy:
So today MC, let me just tee up what we’re going to talk about, and then I’m going to throw it over to you, of course.
So, we’re going to look at three things today. First, how you can deal with short-term scaling; MC has a really great story about a facility we worked with and how they coped with their lack of live content.
Secondly, we’re going to look at on-prem SaaS licensing, and how you can use that for processing archival material and other things as well.
And thirdly, we’re going to get into cloud, because you’ve got to talk about cloud, right? How it can help you launch products quickly, and how you might use it.
So MC, let’s get right into it and talk about short-term scaling.
MC Patel:
So, basically, the short-term scaling sounds dramatic in a way. But essentially, what happened is, when the lockdown started occurring in various parts of Europe, we were chatting to our customers, saying, “Hey guys,” first of all just to ask, “Are you okay? How are you coping?” and so on. And we had a major playout centre in France. They were playing out 16 channels, and they already have an engine.
And one of their issues was that some of those channels were live. And they said, “We have no more live content at the moment, so we are being asked to process files.” The engine that they had got from us was man enough to do the files they needed on a regular basis. But with literally twice the amount of content to process through, they needed a solution.
And obviously, when the call came, they needed a solution, like, now. Now, we always have the ability to offer trial licenses, which didn’t help in this instance, because the server that they had was the production server. Setting up an additional workstation wouldn’t have helped them either, because they would then have had to staff and run it differently. So, they needed to scale up the existing engine.
They had the capacity to process three files at a time. They wanted to go to six files at a time. Now, normally, and this is all commercial and licensing stuff and so on, we can actually do that quickly, but there was a commercial arrangement that needed to come together, because they weren’t sure how long they’d need it for. So what we did, basically, is we went ahead and just said, “Right, we’re going to double the processing capacity. In the meantime, we’ll do a bit of arm wrestling on the commercials, and you go and chase your purchasing department and get the purchase order through.”
Now, the drivers for this were obviously that the customer had an urgent need. We, as a business, pride ourselves on being small and agile at a technical level. Normally you’d get a call like this saying, “I have a file that doesn’t work,” or “I have a requirement that I can’t meet,” and we would respond. What we did in this instance was respond on the commercial side. We got them up and running literally the day we had the conversation. They started processing files. A few days later, the purchase order came through, and they’re all happy.
Now, that really is the short-term scalability: because it’s a software product, the licensing is flexible. It’s not a hardware dongle or anything; at the press of the go button, all these things are possible. Clearly, the licenses are designed for perpetual use, but in this instance, we scaled up. And when they said, “Right, we’ve done our stuff, we’re back to normal again,” they went back to the original license. So it’s a simple process, but really an example of saying, the flexibility exists, certainly in our organization, to be able to help customers.
And really, Cindy, that was the scaling and the short-term nature of it.
Cindy:
I like that story about the scaling. And I’m just looking through the folks on the call; some of you are familiar with Emotion Systems, and I see some new names as well. So maybe just give us a little background about Emotion Systems, MC, before we dig into the second topic.
MC Patel:
Yes. So it’s interesting: why would we be able to help if your live content goes away? That’s because we process file-based content. So anything that’s preplanned or pre-produced that requires formatting or processing, for example loudness compliance, prior to you putting it to air or online, we do that processing. And what we do that is different comes in two parts.
The first one is, we will process the audio in a media file. We will take the audio out, we’ll process it, put it back in again, and give you a new file. So there’s no “I’ve broken the file.”
That’s my timer.
The other thing that we do is, we provide this in an automated, or automatable, manner. So we can have watch folders: you just drop the files in the watch folder, a workflow is assigned to that folder, and it processes them. Or we can have a Telestream Vantage connector, or we can have a MAM, or we can have all of them at once, as some customers have.
So basically, it’s file-based processing. What we address, effectively, is that there’s an awful lot of manual tasks: boring, repetitive tasks that take up editorial time, take up people’s time. We offer to do those in an automated manner. So if you need to swap stereo pairs one and two, Dolby encode something, decode something, upmix something, downmix something: these are all tools that we offer. And we have a little UI that allows you to program a workflow, give it a name, and call it from a multitude of devices.
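The watch-folder pattern MC describes can be sketched in a few lines of Python. This is a minimal sketch, not the Engine’s actual API: the folder names and the `process` callback stand in for whichever workflow is assigned to the folder.

```python
import shutil
from pathlib import Path

# Hypothetical folder names; in the real product a named workflow is
# assigned to each watch folder through the UI.
WATCH = Path("watch/loudness_r128")
DONE = Path("processed")

def poll_watch_folder(process):
    """One polling pass: hand each new media file to a workflow, then move it
    out of the watch folder so it is not processed twice."""
    DONE.mkdir(parents=True, exist_ok=True)
    for f in sorted(WATCH.glob("*.mxf")):
        process(f)                              # run the assigned workflow
        shutil.move(str(f), str(DONE / f.name))  # clear it from the folder
```

In practice a caller would run this pass on a timer or with a filesystem watcher; the point is only that dropping a file into the folder is the entire operator interaction.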
Cindy:
Nice. So I guess, to summarize what you’re saying about the short-term scaling, as we all get back into the swing of work, the business model-
MC Patel:
Now, we have been thinking about this for some time, but this example showed that customer needs are changing. A good way to look at it is that most of our larger customers have a five-year or seven-year capex cycle. So there’s a big project: we’re going to build a big building for file-based processing, we expect to spend a lot of capital expenditure on that, and we see it taking us through three, five, seven years. Long-term planning.
And this was a classic example of the opposite; it wasn’t long-term planning at all. It was, “Hey, I need to do this right now.” Now, over the last few months, what we’ve also observed is that long-term planning suited something like a playout centre, where we’re saying we’re going to take X number of channels to air; the requirement is relatively fixed. But now we’ve got new platforms coming on: Netflix, iTunes. Each one of them has slightly different requirements, especially for audio; loudness compliance across all these platforms is slightly, irritatingly, different.
And there is certainly the case that if you look at the standards for the traditional broadcasters in America and Europe, there are some minor differences. Netflix has got a completely different standard; they went off a few years ago and created another standard. And then Amazon and Apple and the others all have slightly different variations.
And so we need to provide tools that automate, but I digress. The real issue is, as these requirements change, there’s a technical change needed, which is what I said: we are very agile. When the Netflix spec came out, we set that up within, literally, minutes, and we had it Netflix validated. But now we’re saying, what about the business need? It’s one thing to say, “I need to comply with Netflix or somebody else.” What the business need says is, “I now need to turn on a sixpence,” as we say in England, “and change my business model.”
So what we’re doing is, we’re fortunate enough that our technology is licensable. So we’re now changing our technology, adapting it, and offering it in a different way. For example, you could come along and say, “I need…” It has happened recently: a customer had many thousands of hours of content that they needed to process quickly. It was a migration from one location to another, from one type of electronic set-up to another.
And so they needed to do this migration. And they said, “Look, in order for us to do this migration over 90 days, there is no way in hell we can afford to pay you the capex cost of it, because we just need to process as fast as we can. So, and everyone talks about it, this is your SaaS model: we will tell you how many hours of content we want to process. Could you please give us a quote for how much per minute, per hour, you will charge us?”
And in this particular instance, it was a requirement that was going to be staged in the cloud with a partner company. So we essentially allowed the partner company to launch as many instances of the engine as required in order to satisfy the need. So that really is an example of the dynamic. Now, this particular thing was, again, an archive migration, and the new format that they wanted, and I’ll just talk about the audio, because there were other requirements, but this is about audio, was multi-language audio handled in a particular manner. The sources were an MXF file with no audio and a bunch of WAV files, and the target was a multichannel, transmission-ready audio-video file.
So the workflows were created to do that. In this particular instance, they needed to dynamically create the workflows, because of the different targets and different requirements, so we modified our API and extended it to do that. As I said, we’re matching our technical agility now by providing business agility. Both of the instances I’ve described were spontaneous, ad hoc projects.
But as a business, what we’re really saying is that we will now offer you an on-premise solution, where you can buy an on-prem license with a certain duration in time: three months, six months, one year. And you can also add to it a caveat, or an additional requirement, saying, “Within that, I want to process X number of hours or minutes.”
Now, why the two are important is this: there’s maybe someone who comes along and says, “I want to do 600 hours a month. I know that come Christmas, I’ll want to go to 1,200 hours, just for the pre-Christmas season, and I want to be able to scale it up like that.” Or someone may come along and say, “Well, I don’t know when I’ll need to use this. It’s incidental. I want to do 20 files a month, but sometimes it might be 30 files.” So for the smaller client, having bought 100 hours of processing, which you could use up over 12 months, is a great way to do this. And the tools we’ve developed are very simple in concept: it’s a license that has got time variability.
Within that, because we have 16 modules, you can license individual features. So, “I want the loudness module for 12 months, but another module for just two months,” things like that. And then there’s a meter that runs inside the engine that tells you how much usage you have, and that can then be capped at 100 hours, 200 hours, and so on. So this is the on-prem SaaS offering. I’ve given you the example of the archive as one.
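The licensing idea MC outlines, a duration in time plus a capped usage meter, can be sketched as a small class. This is an illustration of the concept only; the names and the assumption that the meter counts minutes of audio processed are mine, not the Engine’s actual licensing interface.

```python
from datetime import date

class MeteredLicense:
    """Sketch of a time-limited, usage-capped license: valid until an expiry
    date, and within that window capped at a number of processing minutes."""

    def __init__(self, expires: date, cap_minutes: float):
        self.expires = expires
        self.cap = cap_minutes
        self.used = 0.0

    def can_process(self, today: date, minutes: float) -> bool:
        """True if the job fits inside both the time window and the cap."""
        return today <= self.expires and self.used + minutes <= self.cap

    def record(self, minutes: float) -> None:
        """Advance the usage meter after a job completes."""
        self.used += minutes

    def remaining(self) -> float:
        return self.cap - self.used
```

A 100-hour, year-long license would then be `MeteredLicense(date(2020, 12, 31), 100 * 60)`, with each feature module holding its own instance.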
We have another client who came up recently with a very unusual requirement, but an interesting one. We have a pitch-shifting algorithm which allows you to take 24 fps content and give you 25 fps. The video speeds up to play out faster; the audio speeds up too, and the pitch changes. We speed up the audio but maintain the pitch.
Now, in this particular instance, they wouldn’t share the exact application, but they wanted to be able to apply a variable percentage of pitch change. And so the pitch shifter comes on and does that, but they wanted to do this as a six-month rental. The idea there is, “We want to do this as a project, and then we will scale.” So that’s a different style of agility that we’re demonstrating.
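The 24-to-25 fps conversion MC mentions is a fixed speed-up ratio, and the numbers behind it are easy to check; a small sketch of the arithmetic (illustrative only, not the pitch-shifter itself):

```python
import math

def film_to_pal_speedup(src_fps: float = 24.0, dst_fps: float = 25.0):
    """Speeding 24 fps film up to 25 fps shortens the audio by the same
    ratio; without correction the pitch would rise by that ratio too. A
    pitch-preserving time-stretch keeps the duration change but undoes
    the pitch rise."""
    ratio = dst_fps / src_fps          # ~1.0417, i.e. about 4.2% faster
    semitones = 12 * math.log2(ratio)  # uncorrected pitch rise, ~0.71 st
    return ratio, semitones
```

That uncorrected shift of roughly two-thirds of a semitone is audible on voices and music, which is why the pitch-preserving version of the algorithm matters.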
Cindy:
Nice. I like how you got into capex and opex there. And that makes total sense: if someone’s got a few files to process this month, and they’ve got a big project that comes up next month, this is a good way to address it.
MC Patel:
But the key thing is… It may not be the same person, but the variation that we’ve noticed in the customers is, we had a guy in Singapore who says, “All I want to do is four files a month of Dolby E encoding. I don’t want to buy stuff.”
Now, that requirement versus the 70,000 or 100,000 hours in a few months are poles apart. And the challenge we set ourselves is, how do we use our ability to provide a solution that suits both ends of the scale?
Cindy:
And now, does cloud fit into that somewhere? Because that’s the third thing we were going to talk about. You just got into on-prem SaaS; where does cloud fit into this mix?
MC Patel:
Sure. Basically, with an on-prem solution, you provide the hardware resources and we license it. In the cloud, you rent the resources, and then you launch it. As a company, most of our processing is… well, all of our processing is audio, and we do a little bit of metadata. What’s important to remember is, we deal with the media files. So SD video is 50 megabits, HD is 100 megabits, UHD is 400 megabits.
Now, if I were to offer a cloud service and say, “Send me the files,” the cost of egress for large-volume stuff is prohibitive. It doesn’t make sense for us to offer that. So while everyone’s shouting, “We are in the cloud, we are in the cloud,” we said, “Well, we have a cloud solution, but it’s a solution with partners.”
The partners can be our customers themselves, who stage content in the cloud. They’re not talking about egress; they’re saying it’s already there, they want to use it, and they want Emotion to be part of their ecosystem. And so, we have two companies there: Honeycomb, who are now called Peach, and the other will come to me. Honeycomb basically are using us in the cloud, because their business was short-form delivery of files.
So basically, they get short-form material, they validate it, they QC it, they loudness-check it, that’s what they were using us for, and then they deliver it to the client. What they’re saying is, “Before you deliver the file, we will make sure it’s compliant, and if it isn’t, we’ll do something about it.” So that was a classic use of the customer using the technology.
We’re working with another large broadcaster, whom I can’t name as we’re under NDA, who have an on-premise solution but multiple sites: a dozen around the world, and half a dozen places with our solution. And one of the things they’re trying to do is to say, “We want to leave our on-premise solutions alone, but we want to stage something in the cloud.”
Now, that workflow will really be: they will build a system in the cloud of their own making, with their own MAM, their own transcoders, and everything else, and we fit into that. That’s much closer to the on-premise solution; the difference is we will give them the ability to launch multiple instances, and the billing will be based on consumption, per minute of media processed.
And then the third solution, which is really where we see quite a lot of growth, is being part of a partner solution. So we are working with STVI, and we are providing the audio solutions for STVI. In that instance, they act as the people who manage the media, stage the media, send it to us, and ask us to do different things; they will monitor the usage. And really, in our relationship with STVI, because we’re a consultative business, we’ll talk to the end-user, understand their audio needs, and give them a solution that will be fulfilled by STVI.
So with the cloud, as I said, we support clouds created by the customers, and we support clouds through partners. And there is a trend in the industry where an awful lot of new companies in broadcast are coming along and saying, “We will offer you orchestration in the cloud. We will do X, Y, Z for you. You give us the media and the workflows, and we’ll manage them for you.” We will become an audio component for them.
Cindy:
Yeah. And so it seems like what you’re saying is, we can provide a mix of on-prem and cloud, and that will help people-
MC Patel:
And perpetual licenses. I think one of the things we get carried away with is thinking the whole world’s going to change and nothing will remain the same again. It never happens that fast, and I would say, thank God for that. People have been talking about cloud for donkey’s years, and it makes sense in some areas.
Now, one of the first migrations to the cloud is in control: the little bits of information. Because, as I said, we deal with media files, the heavy lifting; that’s coming of age now. It will happen. But yes, we offer a perpetual license model, a pay-per-use model on-prem, or a pay-per-use model in the cloud, with partners or with the customers themselves.
So, in essence, as I said: business agility. And hopefully what this means is, certainly with us, customers will be able to behave in a manner that suits their immediate requirements.
Cindy:
Yeah, I like it. So you’re saying that we can work flexibly with everyone to meet their technical needs, and help them with any changes they have in their audio processing, whether it’s a Dolby encode or loudness or whatever processing.
MC Patel:
Yeah.
Cindy:
Yeah.
MC Patel:
That’s correct. Yeah.
Cindy:
Nice. Well, so today we looked at short-term scaling, on-prem SaaS, and cloud, and how all of those can work together for you.
What questions do you have for MC? I see we’ve got some in the chat; go ahead and put them in here. And I did want to mention: MC, you mentioned Netflix before. We do have a Netflix guide, so if you’d like to get that, just put a comment in the chat, and we’ll get that guide over to you right away as well.
So MC, is it okay if I bring a couple of questions to you?
MC Patel:
Sure, of course. Yeah.
Cindy:
All right. So let’s see. One of the questions we have is, well, you kind of got into this, but let me just ask it anyway: if I have a project come up with a lot of hours of processing, what do you do? How can you help me?
MC Patel:
So, there are two elements. The easy bit is, we say, “How many hours do you want to process?” Generally, they may say, “It’s X, but it could be a little less, or a little more.” And then we say, “What do you want to do?” And we have, as I said, 16 different processing modules, and we can come up with a number fairly quickly on a per-minute basis. It’s a little bit like your phone plans: if you want X thousand minutes, it would be this much, with this flexibility.
Probably the key thing that I would say though is the process whereby we say to the client, “Let us understand your audio needs. Let’s do some trials. You give us some files. You tell us what you want to do with them. We’ll process them for you. We’ll give them back to you to make sure that this is what you want.”
This is an important part because, in a big five-year planning cycle, there are solution architects on-premises to do this. They’re not there very often nowadays, so we help them. We’ve always been that company where audio… I think most of the people we deal with are video engineers, and they don’t love audio. They might have to use it, and all that.
So we help in constructing that. And in doing that, we will identify the requirements, give them a solution, and then we’ll price it on a per-hour-processed basis. Or, if the thing is going to be recurring, we could do it as a subscription: we can just agree a monthly subscription, where you have a certain limit of hours you can process.
Cindy:
Got it.
MC Patel:
There is one bonus to this, moving on from the capex side, which may be of interest. And that is, when you have a choice, as you have with these 16 modules, generally what happens is the customer comes along and says, “Yeah, I like three of these, because I’m going to use them a lot.” And the others? “I don’t know. I might use them a few hours a month, or whatever.”
Well, this SaaS model is basically saying, you pay per minute; we don’t care what you use. So generally, we let the entire toolbox become available. And that means that if the pitch requirement, for example, is an occasional need, the tool exists, and you’re just paying per minute.
Now, it took a little bit of soul-searching on our part as to whether we should do this. But having spoken to our customers, we recognized that actually, this is a good thing, because you have to switch your mind from “I bought something, and I own it” to “I pay for what I use.” We had to learn that and mentally accept it.
And I think the customers will welcome it, because they’re saying, “Well, if occasionally I need to do an upmix, I don’t have to go out and spend the money; that tool exists.” Now, you may need to create the workflow, but the tool exists.
Cindy:
Okay, good. I like that; that was super helpful. And we have another question, in a little bit of a different direction, asking: do you provide some kind of integration with Avid Media Composer, like audio plugins for loudness correction?
MC Patel:
Okay. So this is a big distinction, and it’s a good question. The answer’s no. The key thing is that we wanted to build a standalone application.
There were two reasons for it. The first one is, when you build a plugin, if the supplier of the main kit that you’re plugging into changes, you have to change with it, and sometimes you’re not the most important thing on their mind. I’ll give an example: this happened as we were building our first product, when Apple decided that FCP 7 was no more. There were a lot of people in the broadcast industry that didn’t like that.
And I know a couple of people who had built an entire business around it. So, there are others who do plugins, and good luck to them; we don’t do that. Our value-add, really, is when the video and audio are married, and you have to do something with the audio. That’s the first thing.
The second one is when you want to automate that process. So our idea is that the plugin may be cheap, but the person who uses the plugin isn’t, and the suite isn’t. So we have chosen to leave that business to other people.
Cindy:
Is there more to say around loudness processing, or did that kind of answer it? I just wondered, because of the whole idea of loudness processing, not using the suite, and doing it another way-
MC Patel:
Sure. There’s a lot to it. First, the world in our industry grew up around peak-based audio measurement and correction: thou shalt not exceed a certain level. The notion of loudness is really asking, what is the average value of the audio? Now, averages are hard for people to conceptualize in a creative manner: if you’ve used up your loud bits at the beginning of the program, what have I got left in the kitty?
A dubbing mixer knows that; they plan the audio, and they do that. But now you have to deliver to multiple loudness standards, all of them slightly different averages and slightly different peak values. You don’t want to make that into a creative process. You can, and there are people who do, but it’s very expensive and very time-consuming, especially if you have multiple deliverables.
I had someone who said, “Can you give me loudness that complies with European values, with American values, with Netflix, and with online?” And online was left vague; in fact, it was called social media. What we’re really talking about is four different settings in the product: four different workflows, or one workflow where you say stereo one is for broadcast to Europe, stereo two is for the US, stereo three is for Apple, et cetera. Each one of them is a tweak on the loudness adjustment. I’m simplifying this, Cindy, because I could talk about this for six hours, and we would have nobody left.
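The “four different settings” MC describes amount to a small table of per-platform targets. A sketch of the idea follows; the specific numbers are illustrative assumptions (each platform publishes its own delivery spec, and the exact targets change over time), not Emotion Systems’ configuration.

```python
# Illustrative targets only; always check each platform's current delivery
# spec, since, as MC says, they are slightly, irritatingly, different.
DELIVERY_SPECS = {
    "EBU R 128 (Europe)": {"integrated_lufs": -23.0, "true_peak_dbtp": -1.0},
    "ATSC A/85 (US)":     {"integrated_lufs": -24.0, "true_peak_dbtp": -2.0},
    "Netflix":            {"integrated_lufs": -27.0, "true_peak_dbtp": -2.0},
}

def correction_gain(measured_lufs: float, platform: str) -> float:
    """Gain in dB needed to bring a clip's measured integrated loudness
    onto the chosen platform's target."""
    return DELIVERY_SPECS[platform]["integrated_lufs"] - measured_lufs
```

One workflow per entry, or one workflow with a per-stereo-pair entry, is then just a matter of which row each output consults.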
Cindy:
Well, I think we’ll do a separate session, maybe, on loudness, because this is super interesting. So, not to steal the thunder from that, but I just-
MC Patel:
Yeah, I’d be happy to. We can have a similar thing around loudness, and we can share the experiences of different customers.
Cindy:
Yeah.
MC Patel:
We have a bucketload of experience in that area.
Finally, there’s the listening environment, again for loudness. You have a theatre, where it’s quiet and the hall is big, so you have the theatrical mixes, which is generally how most movies are done.
Then you have your living room, with the occasional bit of chatter going on; in the UK we also call it the kettle boiling away, et cetera.
And then you have somebody sitting, well, they wouldn’t be sitting on a plane at the moment, but in a noisy environment, with their iPad or whatever. Each one of these requires a slightly different audio treatment to get the best experience, and a loudness tool can do that for you.
Cindy:
Nice. Okay. So let’s just recap: we talked about scaling up in a flexible manner and why we want to automate, we talked about on-prem SaaS, and we talked about cloud. So MC, if someone’s interested in working with us further, where should they take it? What should they do next?
MC Patel:
Obviously, contact us. There are several things, really. We will follow up with an email to thank people, but we can also do a trial for you, which you can download from our website; any one of the pages on the website has got a big blue button labelled Try or Trial. So you can click on that and download the product. Or you can call us or email us with specific requirements; we like to discuss them, to understand what you guys are trying to do and how the engine can work for you.
Cindy:
All right. That’s perfect. Thank you very much, everyone. And have a beautiful day. Thank you, MC.
MC Patel:
Thank you. And I’d like, obviously, to thank everyone for taking the time; there are different time zones and everything, so thank you very much for attending. And I think I’ve been talked into doing another one, so I hope you can join us for that as well.
Cindy:
Yeah. Well, we’ll keep you posted. I think one on loudness will be perfect. So let’s do that. Yeah. All right. Thank you. Bye.
MC Patel:
Yeah. Thanks.
Webinar – Loudness for News and Promos
Monday, 04 May 2020
Do you ever see one of your news stories or promos go out and realize loudness wasn’t checked or corrected properly?
Even if you have hardware-based audio compliance sitting on your output, issues with loudness can still arise. Watch this webinar replay “Loudness for News and Promos,” from Emotion Systems to discover how to implement more reliable and consistent loudness monitoring and correction.
Rich Hajdu:
Welcome to the webinar on loudness for news and promos. It’s a big issue, because a lot of people cut news segments whose audio goes out to the playout server without really being loudness controlled.
I’m Rich Hajdu with Media Technology Group. MC Patel, the CEO of Emotion Systems, is going to be doing most of the presentation. We invite you to use the chat to send us questions and we’ll answer them.
MC, the issue is that loudness for news and promos is not regulated in many instances, because people don’t have time to check the loudness when the news clip is edited. So it goes to the playout server, and then it goes directly to playout; when it’s time, it goes through a playlist and all that. Now, there is a loudness corrector at the output of the station. Why isn’t that enough?
MC Patel:
The best thing is for me to quickly explain loudness. Most people will be familiar with it, but I’ll explain it briefly and succinctly, and that will help you understand why we do file-based loudness correction. Historically, ever since television has been around, we have used peak-based measurement. What we basically ask is, “What is the loudest or highest peak in the audio?” That’s measured with different time constants in different countries: in the US, you guys use VU meters; in Europe, we use PPMs. But fundamentally, what we say is that the sound shall not exceed a certain peak. In the old days, the reason for that was that in NTSC, the audio carrier sat right at the end of the chroma subcarrier, so if the sound got too loud, it interfered in the transmitter and distorted the color.
That’s where the peaking came about. As you know, in the seventies, people discovered that whilst you weren’t supposed to exceed the peak, if you stayed very close to it, your commercials could be very, very loud, which is how the loud commercials came about. Over the years, people got fed up with it and said, “We want some balance in the content; we want dynamics, but we also want to make sure there aren’t sustained bursts of loudness.” So the new standard, called “program loudness,” came about. In America the legislation is the CALM Act, and the standard is ATSC A/85. It says that the average level of a piece of content may not exceed a certain amount in LUFS: minus 24 in the case of the US. And the second parameter that’s important is the true peak, which may not exceed minus one or minus two; that varies depending on countries and standards.
When people used to mix to peaks, it was easy. They could see a peak on their PPM meters, and they came along and said, “Yeah, I’ve got my audio right.” When you have to measure to average level, that’s kind of hard because you don’t know what the average is till you get to the end. So when all this came about, there were a bunch of hardware correctors that you could put on the output of your master control and say, “All the content that I’ve been producing for all these years is great. And the hardware corrector will take care of the loudness.” Now, remembering that the average is what matters, the hardware corrector doesn’t know when a program has begun and when a program has ended.
So the hardware corrector is saying, “Over the last few seconds, whatever I’ve seen, I’ll keep it very close to minus 24.” So if you have a few seconds of silence, it says, “Oh, dear, my average is going to fall below minus 24.” It will raise the gain of that. And if you have a lot of very loud noises, it lowers it. Now that’s modulating the audio, and that can cause problems.
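The behaviour MC describes, a real-time corrector chasing a short rolling average with no knowledge of programme boundaries, can be modelled in a few lines to show why quiet passages get pumped up. This is a deliberately crude toy model, not any vendor’s actual algorithm; the window length and level track are arbitrary.

```python
from collections import deque

def rolling_corrector_gain(levels_db, target_db=-24.0, window=5):
    """Toy model of a hardware corrector: at each step it computes the gain
    that would pull the recent average onto the target. Silence drags the
    average down, so the corrector's gain rises during quiet passages."""
    recent = deque(maxlen=window)
    gains = []
    for level in levels_db:
        recent.append(level)
        avg = sum(recent) / len(recent)
        gains.append(target_db - avg)  # positive gain = corrector pushing up
    return gains
```

Feed it a stretch of on-target audio followed by near-silence and the gain climbs sharply, which is exactly the modulation of intentional quiet that causes problems.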
If we go back to the promos and the news, the important thing that Rich said is that for a promo, in a local edit suite, you may not have loudness measurement capability, or people still mix with their ears for promos and everything. And one of the nice things is, if you mix with your ears and you’ve been trained, you will naturally mix to around minus 24, or in Europe, where we think it should be minus 23, thereabouts. So the mix will be good.
And that’s the other important thing: if you spend time creating a mix, you don’t want to destroy it in the correction process. For news, unfortunately, as we know, the footage comes in and you’ve got some background noise; if you’re by a freeway, it will be a really noisy background, and if you’re in a quiet area, it could be really quiet. Then you have the people talking, and you’re not sure what the levels are going to be when it comes in and gets edited. And as Rich so rightly said, there isn’t time to take care of that, so what you’re relying on is the hardware processor to take care of it. Am I making sense so far?
Rich Hajdu:
Yeah. So the hardware processor isn’t really set up to do that. The question is, I’m editing news and it’s fast-paced, I don’t have time to manually intervene and check audio levels and all that. What’s an elegant solution to do that, that is settable, repeatable, and reliable? What’s the solution?
MC Patel:
There are a number of approaches. You can use a plugin in the edit suite, but as we already said, we don’t want to do that because the focus should be on getting the right piece of the video there and out. When we approached this problem, our initial focus was commercials. A lot of people came up and said, “Hey, we don’t want to be fined.” And the creative mix was important. A lot of the post houses said, “We don’t touch the audio because if we mess up the audio, our clients will get upset.” Bearing in mind that what we’re interested in is the average, what we say is we measure the average. Now in a file, you know the beginning, you know the end. So that would be applicable to a news clip or a promo clip, right? So what we do is we do two passes.
The first pass is we measure, then we say, “Oh, it should be minus 24. I am at minus 26.” So what we say is, “If you apply a gain of two dB across the whole clip, you’ll get to minus 24.” It’s a very simple algorithm. It’s volume control. Now, if you lifted the audio, then you may have a peak error because your peak’s not supposed to go above minus one. So what we do when we’re measuring it is identify where the peaks are, we create a gate around them, and then the software attenuates inside the gate to make sure it doesn’t exceed the minus one. And then it does a little mix in, mix out with the main content. So it’s a very simple concept.
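As a rough sketch, the two-pass idea described here reduces to a couple of lines of arithmetic. This is illustrative only: the function names are mine, and the real product measures integrated loudness from the audio itself rather than taking a number as input.

```python
def required_gain_db(measured_lufs: float, target_lufs: float = -24.0) -> float:
    """Pass 1: the gain, in dB, needed to move the measured average to the target."""
    return target_lufs - measured_lufs


def peak_exceeds_ceiling(true_peak_dbtp: float, gain_db: float,
                         ceiling_dbtp: float = -1.0) -> bool:
    """Would the loudest true peak exceed the ceiling after that gain is applied?

    When this is True, the approach described above gates the offending region
    and attenuates inside the gate, with a short mix in/out at the edges.
    """
    return true_peak_dbtp + gain_db > ceiling_dbtp
```

For the example in the text: a clip measured at minus 26 needs a gain of plus 2 dB; if its highest true peak were at minus 2.5 dBTP, that lift would push it to minus 0.5 dBTP, above a minus 1 ceiling, so that region would need gated attenuation.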
So you ask, “How do I do this?” Well, once we’ve established the parameters that the correction should be at minus 24, and the peak should not exceed minus two or minus three, different countries have slightly different standards, you set those up and then the software just takes care of it. You do a measurement pass, you do the correction pass.
In order to do this quickly, it doesn’t sit in the edit suite. The product can sit in there, but what we’ve got is a watch folder, or a hot folder as you might call them. We say the output of the edit suite gets posted into the hot folder, our product is looking at the hot folder. As soon as it receives the file, it measures it, it corrects it, and it puts the corrected file into a folder of your choice. So that could be your transmission server folder.
Rich Hajdu:
And how long does that take?
MC Patel:
Okay, so take a 30-second file. Typically the software runs at three to five times faster than real time. So at five times, you’re talking six seconds.
Rich Hajdu:
Right.
MC Patel:
And it’ll take us slightly longer because we watch the file to make sure that it has stopped growing while it’s being copied into the watch folder. But other than that, it’s faster than real time. And depending on your setup, you can do this. So in the newsroom environment, it’s entirely feasible to make this happen and not worry about the audio side.
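A minimal sketch of that watch-folder pattern, including the file-growth check mentioned here. The folder paths and the `process` callback are placeholders for illustration, not the product's actual implementation.

```python
import os
import time


def is_stable(path: str, interval: float = 1.0) -> bool:
    """Treat a file as fully copied once its size is unchanged over one interval."""
    size = os.path.getsize(path)
    time.sleep(interval)
    return os.path.getsize(path) == size


def watch(hot_folder: str, out_folder: str, process, poll: float = 2.0) -> None:
    """Poll the hot folder; hand each newly completed file to `process`."""
    done = set()
    while True:
        for name in os.listdir(hot_folder):
            src = os.path.join(hot_folder, name)
            if src in done or not os.path.isfile(src):
                continue
            if is_stable(src):
                # e.g. measure, correct, then write to the transmission folder
                process(src, os.path.join(out_folder, name))
                done.add(src)
        time.sleep(poll)
```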
Rich Hajdu:
Yeah, because those clips are typically only 20 seconds, 30 seconds, a minute. In the US, if it’s a two minute clip, that’s a long clip. So that’s not going to add much time to it. Right?
MC Patel:
And you would have some time for the promos. So you could either do them manually, or again, you can have another watch folder or hot folder and do it that way.
Rich Hajdu:
So, okay, I’m going to do this. What is the actual physical implementation? In other words, I need a PC… what do I need from a hardware and software standpoint?
MC Patel:
The software’s cross-platform, first of all, and that’s important. So we can run on a Mac, on Linux or on Windows. For example, it can run on the same PC or workstation as your edit software; the watch folder may be somewhere else, but the software can run right there. So you basically say, “Oh, we’re a Mac place. Can you run on Mac?” Yes, you can run on Mac. And it’s a standalone piece of software. It’s very easy to install. The installer comes with it. You download it from our website and it guides you through the installation.
We let people try this out, because if people have a concern that this may not be fast enough, or we’re not sure about the quality, what we say is, “Download it from our website and play with it.” If you need a little bit of help, we’re happy to help you set it up and so on. The correction profile for the US is a part of a library of profiles that are supplied with the application. So you don’t even need to set that up. You just select it and you’re done.
Rich Hajdu:
You talked about running your Emotion Eff software on the same PC that’s used for editing, whether it’s Adobe, Grass Valley, Avid, whatever it is. What if I have eight edit stations? Would I run that software on each one? Or would the watch folder be located on a central PC somewhere else?
MC Patel:
We have clients who do both. Obviously it costs money. If you’re doing a news spot, you would be doing half a dozen spots or whatever, right? So there’s no reason for everyone to have one of these. We’d love them to have it, but from a practical point of view, you can have a dedicated workstation and have a watch folder in there. And that takes care of it.
Now the computer doesn’t need a very beefy spec either because we’re only processing audio. Now we do read the file with the video in it, and we do write out a file with video in it. We do all the extraction, everything inside the software. So if you have it on a fast network, the network’s important, but in the lab, we have $500 PCs that take care of this.
Rich Hajdu:
So it’s a simple setup. Everybody in news is editing fast, they’re doing this and that. Once you set it up and it’s working, it’s a set-it-and-forget-it type of situation. Is that correct?
MC Patel:
Yes, it is. To give you some idea of the robustness of the product, we have two products really. The focus today is on what we call our desktop product, Eff, and we have another product, the Engine, that runs pretty much identical code but is designed to offer really, really high throughput. So we have clients processing 10,000 hours a month of content for retransmission. So all the episodic stuff, the movies and things that go out. If you have a playout center that says, “I have 50, 60, 100 channels, and all the content needs to be loudness compliant,” we have that going through the system. And that works day in, day out. It’s a very stable piece of software, and you wouldn’t have to have concerns about that.
Rich Hajdu:
If I install this software, I downloaded it, I installed it. Are you there to help me through the installation, answer questions, all of that?
MC Patel:
Absolutely. Now, whilst we are in England, if you give us TeamViewer access, we can take care of the stuff. We can do online training in the same way. Iain, my partner, does all the support. This morning Red Bee France was doing a software upgrade, which they did themselves. They just wanted a bit of a housekeeping check, so they gave us TeamViewer access. Those guys are doing 2,000 hours a month. We have a call tomorrow in India where they want to set up the same idea. They are a sports channel, or they have lots of stuff, but they’re using us to correct the commercials that they put in between the sports. Obviously there isn’t a lot of live sport at the moment, but all the commercials and promos there go through this. So we’re helping with their setup because they have instances where they’re trying to deal with 5.1 and Dolby E for transmission, which may not be the case in the US.
Rich Hajdu:
Right. And what kind of analytics are available?
MC Patel:
When we did this automated piece of software, people said, “How do I know what you’ve done? And give me some feedback.” If you’re doing it without the watch folder, if you’re doing a one-in, one-out, which I can show you in a second, the software draws a graph for you, and it gives you all the measurements. If you are doing it within the watch folder, you could set it to give you a PDF report. And within the report, it will tell you whether the file passed or failed, what it measured, what the graphs were, and so on. It’s very comprehensive in that. Now in your promos and things, if you have bars and tone, we can literally detect the tone and ignore it. So there’s a fair degree of flexibility.
Rich Hajdu:
So the system is easy to install, it doesn’t require a lot of hardware, it’ll work on a network, and it doesn’t require a costly PC. It runs in the background, it has analytics, and it can be expanded. This is an important point: if you have another five edit bays, or you want to do more promos, or if you got into production, you could upgrade from Eff to Engine. So it’s a scalable system.
MC Patel:
Yes, it’s scalable.
I forgot to give you some background. We are Emotion Systems and we’ve been in this business for about 10 years. We used to sell some QC software whilst we were doing some other things. And a lot of our customers kept saying to us, “This QC stuff is really good. It tells me what’s wrong with a file, but it gives me no way to fix it, so what do I do about it?”
At the same time we’re doing this, people started talking to us about loudness. We thought, “Ah, this is interesting because there isn’t an easy solution of how to fix it loudness-wise.” So our first product was to build the measurement of loudness and correction of it. And the algorithm I described a few minutes ago is what we use. We’ve refined it a little bit, but fundamentally, that’s what we do. And then the company evolved to say, “Well, if you can do loudness and you’ve got the audio out, you can do more with it.” And that’s how the company’s grown. So the flexibility and the scalability came about, because once we did this, Dolby came to us and said, “What happens if the file’s got Dolby E? Can you measure the loudness in Dolby E and correct it?” And we licensed the stuff from them and made that.
And then Red Bee Media, about six or seven years ago, came to us and said, “We’re doing a playout center. We want to do loudness correction. We want to do channel mapping because our tracks are in the wrong place, etc.” So we built a system for them that could process 2,000 hours a month. But the desktop product has been around. It runs the same loudness code as the main Engine. So people often say we have the Eff, but now our needs have grown, and we’ll do an upgrade of the Eff into Engine. We do a trade-in, basically.
Rich Hajdu:
Right. And you have customers all over the world, and you have US customers also, correct?
MC Patel:
Yes, we do. We have some call letter stations that use Eff. We have a PBS station that does that. I won’t name them because I don’t have permission. There are very large content-repurposing houses, some of whom I can name, who, basically when Hollywood sells content to the rest of the world and locally, they want to create multiple versions of the same content. So we have companies like Vubiquity, Premiere Digital. Viacom’s a very big customer of ours. Viacom in New York has a massive system where a lot of their content goes through our system for loudness correction, pitch shifting, and a whole lot of other things.
Rich Hajdu:
MC, we don’t want to take too long here, but can we do a quick demo so we can see what this looks like?
MC Patel:
So what we have here is the product Eff. I’m going to show you the manual version. Let’s do a quick file measurement run.
We won’t go through the settings because I’ve already described most of it. All you’re doing is asking, “What is the correction I want?” So this is our CALM profile. I’ve got a Deluxe profile from when I was doing a demo for them in Australia, and, Rich, one from when I did the demo for you. I have a Netflix profile, right? Now this is my demo system. When you get it, you actually get more than that, but we want to do CALM. I select a source. So this is a file that’s already been corrected.
Rich Hajdu:
Okay. Yeah. We just need to see the basics of what Eff does.
MC Patel:
So there you go. So what it’s done is it says the audio duration was 25 seconds at the top. And it measured it 19 times faster than real time. The analysis time was 1.3 seconds. Now this is a corrected file. So it says this group does not require any correction.
If it did require correction, I just press the correct button there and we’re done. And you can see that I’ve got a graph which says I have silence for the first seven seconds. And then the program loudness goes up like that. I can look at the true peak. It shows me where my true peaks are, and I can look at the log of these as well. So I can turn the logs on and it gives me a list where my peaks are.
Rich Hajdu:
So in a news editing situation, would the files that have been corrected be there?
MC Patel:
We’ve got a source and we can say, “Where do I want to put the destination?” Right. So I select the destination. Now I’ve actually just shown a corrected file, but it’s really as straightforward as that.
Rich Hajdu:
And so when I set that up and then as long as the destination doesn’t change, I set this up once.
MC Patel:
Correct. So I am doing this manually up here. That’s the manual operation. There is a watch folder operation. If you select the watch folder, then you can post files into a watch folder right from the system. Have I covered enough, Rich?
Rich Hajdu:
Yeah. I think you have. The point is it’s not a complicated, complex setup that takes days to set up and days to learn. Somebody downloads this software, you guys help them, and they can be up and running right away.
MC Patel:
The idea here is, this is an audio product that is designed for a video environment. So we assume that if the person doesn’t understand video, we’d take care of it.
You need a bit of trust. So we want you to try it out and get your audio people and the golden ears to listen to it and say, “Is this okay?” And then you’re good to go.
Rich Hajdu:
That’s the key, because an audiophile is different from somebody who’s concentrated on news. And really audio is just a secondary element that happens.
MC Patel:
That’s correct. So what we’re saying is, you focus on the creative stuff, do your job, and leave us to do the compliance.
Rich Hajdu:
Right. And they can download this file from your website, or they can go to my website and access the download and they can contact us. And it’s that easy. Just download it, try it out, and see if it works. We don’t want to get into pricing exactly. But let’s say I’ve got eight edit suites and this is not a $100,000 system, right?
MC Patel:
No, it’s not. It’s a few thousand dollars, depending on the configuration.
Rich Hajdu:
It’s in that price range. So it’s not going to be a budget buster either. And it is going to give somebody the continuity of audio without the highs and the lows and without all the other things that interfere with the viewer response.
MC Patel:
That’s correct. So we have some questions coming along. Cristian asked, what about live transmission? Basically, this product is not designed for live transmission. This product is really designed for file-based content. As I mentioned earlier for live content, we don’t know where the beginning of the program is and the end of the program. So we’ve left that part alone. We see so much content that is file-based that we saw an opportunity to make a product specifically for file-based content.
Rich Hajdu:
Cindy has posted in the chat box how you can download a demo version. And again, you can contact either one of us, me in the States and MC in the rest of the world, and we’ll be glad to answer any questions. Try the demo and make sure that you feel comfortable with the solution to an ongoing problem.
MC Patel:
We have a question from Europe, which is that you may want to preserve the archive as was originally mixed and then that archive may get distributed a number of times. A lot of our customers will do, I wouldn’t necessarily call it the correction, but the compliance to whatever the client needs. So for example, if you have something in your archive that has… I know we’re digressing slightly from the news and promo…
Rich Hajdu:
That’s okay. We’ve covered that.
MC Patel:
If you have something from the archive and you wanted to send it to Netflix, for example, Netflix has a completely different standard from the broadcasters. So this product is very often used for producing content for Netflix or for Amazon Alexa.
Yesterday I was talking to some people who said, “We have a completely different audio requirement for Alexa. If I already have a hardware processor, why do I need this?” And as I mentioned earlier, what’s happening in most hardware processors is they’re programmed to make sure that long-form content is compliant. So the time constants they react to are tuned to long-term changes. And that’s also true for live. So for live, you have quiet moments, really loud moments, quiet moments, really loud moments. And it could be a three-hour, well, if you watch cricket, it’s an eight-hour show, but like a baseball game, I guess it’s many hours, right?
So the processors are really set for that. When you have a 30-second, 45-second promo, you really want the promo to be bang on in the range so the hardware doesn’t try to overcorrect it or undercorrect it.
Another question is API integration. The product that I briefly showed you, Eff, is a desktop product. It’s got a little bit of automation in it, but it doesn’t really go beyond that. If you need API and so on, then the product to look at is the Engine which has 16 different audio processing modules. It can process up to eight files at a time. It’s a very, very comprehensive system designed for scaling and automation, and we have a REST API for it. And what happens with a REST API is that you could basically say, “Take this file, apply this workflow to it and send it to this position or this destination.”
It’s as simple as that. And once it’s running, you can query the status of the file in terms of how much of it has been processed and so on. So there’s a question on latency and delay. In this instance, there is no delay or latency because it’s software: what we’re doing is taking the audio out of the file, measuring it, correcting it, and putting it exactly back where it was. So there are no video-to-audio timing constraints. In hardware, there may be a different approach to that.
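As an illustration of how such a REST workflow submission might look from a client's side. The endpoint paths, field names, and host below are invented for this sketch; they are not Engine's actual API.

```python
import json
import urllib.request

ENGINE_URL = "http://engine.example.local:8080"  # placeholder host


def build_job(source: str, workflow: str, destination: str) -> dict:
    """'Take this file, apply this workflow, and send it to this destination.'"""
    return {"source": source, "workflow": workflow, "destination": destination}


def submit_job(job: dict) -> dict:
    """POST the job description and return the server's JSON reply (e.g. a job id)."""
    req = urllib.request.Request(
        f"{ENGINE_URL}/jobs",
        data=json.dumps(job).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def job_status(job_id: str) -> dict:
    """Poll the job for progress, as described above."""
    with urllib.request.urlopen(f"{ENGINE_URL}/jobs/{job_id}") as resp:
        return json.load(resp)
```

A MAM or script would call `submit_job(build_job(...))` per file and then poll `job_status` until the result lands in the destination folder.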
Rich Hajdu:
In the chat box, there’s also an area where you can download the Loudness Factbook, which is really, really excellent. I’m not an audio expert, I’ve spent my time in television. And the Loudness Factbook is really good because it gives you a great primer on loudness in all its attributes and all of that.
MC Patel:
We have a question which says, “If someone is doing a poor job with the audio track, can you fix it?” Now what we have to ask ourselves is, what is a poor job? I often describe this as, if you think about music, I could say the mix on this music is bad, but the artist created it. So we try and preserve that. But if what you call bad is the audio levels are too low, I can’t hear the dialogue, etc., when you normalize the file, because it’s applying that overall gain, you can actually do a few things to it. In fact, that’s what we’re saying for the newsroom. Generally, the problem is people do know how to talk, but sometimes they have to shout because they have to get above the ambient level. And we can normalize that so that is better, but it is literally a gain control.
I have a question from Bogdan. Does it correct loudness on Dolby E streams? And the answer is if you buy the option, it does. Bogdan, I assume you’re in Europe at the moment because we use Dolby E quite a lot. The Red Bee example I use for playout does have Dolby E. So some of the files come transmission-ready. They already have Dolby E and if you want to say, “Is the loudness in that stream correct?”, we decode it, measure it, correct it, and resend it.
This is also applicable in the US if you have to deliver content for someone like DirecTV. You may deliver it as a transport stream with Dolby Digital encoded audio, or MP3 in the case of Comcast. And then you say, “I’ve sent it, but is it correct after the encoding process?”
We have done a system like this for a large content creator, where we take the file and we decode the Dolby Digital, loudness-measure it, correct it, recode it, and give it back. Now the important thing here is if you’re the guy who has to send the file out, you may not have the skills to do this and you sure as hell won’t have the kit to do it. So this is like, “I have to send it out and now to make sure it’s okay.” We provide that capability.
Rich, have I covered everything for the news?
Rich Hajdu:
Yes, MC, very productive. Because again, we needed to know about the simplicity, how it works, the repeatability, the reliability, and basically what we’ve discovered in this webinar is that it’s a simple system to install. It’s a simple system to use. It’s repeatable. Set it and forget it, and it’s not too expensive. So I think that’s everything we need to know. If anybody has any questions, they can address MC or me and we’ll be glad to answer them.
MC Patel:
Once again, as you mentioned, the Loudness Factbook is something we wrote to help people understand loudness and all the issues around it. There is a discussion in there about hardware versus software, where each is applicable, and why you should do one or the other. We are always happy to discuss specific needs. Just drop us an email on . And you can go onto the website at emotion-systems.com. Every single page on our website has a big button that says “Download a trial”; click on it and you can download the software. We’d love to work with you.
Rich Hajdu:
Great. If that’s it, then we’ll close it for now. And thanks for everyone’s participation. It’s been very enjoyable.
MC Patel:
Yeah. Thank you for your time, gentlemen. Thank you.
Webinar – Automated Audio Processing for One Master, Many Deliverables
Monday, 25 May 2020Find out about the problems and challenges of delivering high-quality audio for a wide range of platforms in an automated and cost-effective manner. Get the inside scoop on real use case studies that show how Emotion have worked with customers to provide solutions that integrate into their workflows.
You can download a free trial of Engine here.
Webinar transcript:
Cindy:
Welcome everyone. I’m so glad you’re here for our webinar today on one master, multiple deliverables. And I have MC Patel, CEO at Emotion Systems. Hey MC, how are you?
MC:
I’m good, Cindy.
Cindy:
Hey, let’s dig right in and talk about one master, many deliverables. What do you mean? Tell us all about it.
MC:
So let me just go back a little bit in history. If you made a piece of content, let’s say a movie, what you would have done 20 years ago is run off a number of film prints. And then they would go on a jumbo and get distributed around to all the theaters. And similarly, if you made an episodic series, the tape would go to the network and it would be transmitted.
MC:
Now, obviously, the means of distribution have expanded dramatically. Film has all but disappeared. So now there is a great interest in saying every time you generate a piece of content, you want to monetize it as quickly as possible in as many markets as possible. And so what the industry is looking for, and has to do, is create a number of masters out of that one original.
MC:
Now, there are plenty of solutions for video that allow you to do this. And for cost-effectiveness, these solutions are automated. There aren’t many solutions for audio. And so what we mean by one master many deliverables is really that we get one piece of audio and we regenerate audio that’s suitable for a number of platforms, a number of countries and so on. So that’s one master, many deliverables.
Cindy:
Got it. And can you tell us a little bit about how Emotion Systems got into this space originally?
MC:
Sure. So we’re 10 years old and we started off with an idea, which was basically that we wanted to make software modules. We recognized that the industry was moving away from tape and into files. And so we wanted to create software modules that would solve specific problems in the file domain. And we started off by examining the market and asking, what sort of things are people looking for?
MC:
And talking to a number of people, they said that they use their edit suites, which really are for creative purposes, to do a lot of mundane things. And we had a range of ideas: inserting bars and tone, clocks and so on. But at that time, what was very topical was loudness. The industry was changing from peak-based measurement to program loudness. And a lot of people didn’t understand what loudness was.
MC:
So what happened is people came to us and said, “Tell us about loudness. How do we solve that?” Now, instead of trying to solve an old problem in a different way, we saw the opportunity to solve a new problem and use our technology. So we developed a product that would measure loudness in a file. But the real challenge people had is that they were very familiar with what to do about audio that was wrong under the old rules. They weren’t with this.
MC:
So they said, “Well, this is great, you telling me what’s wrong. What do I do about it?” And so that was the germ of the idea that started Emotion off. And in essence, what we recognized is that once you’ve married the video and audio together in a file, it’s very difficult to operate on one and do something with it. Normally you’d go into an edit suite, then you have access to the video or the audio or the metadata, and you do something with it.
MC:
So we wrote a piece of software that allowed us to read the file, take the audio out, do something with it, copy the file, copy modified audio back in, and we wouldn’t have touched the video or the metadata. So Emotion started off doing loudness in this manner with a manual solution. And very quickly the market came back to us and said, “This is really interesting.”
MC:
But what they said is, “If you can take the audio out, you can do more than loudness.” Our friends at Dolby threw a challenge to us and said, “Imagine we had a file with Dolby E,” so encoded for carrying more channels. “What if I needed to do loudness measurement and correction in that?” So we licensed the encoders and the decoders, and came up with an algorithm which said, “If I detect Dolby E, I will decode it, I will measure the loudness and correct it. I will modify the metadata to reflect the changes I have made. I’ll encode that and put it back into the file.”
MC:
And the operator now doesn’t know that we’ve just carried out a measurement and correction inside the Dolby E in the file. Now, clearly they do want to know what we’ve done, so we generate a report. When we do the analysis, we write a report. We say, “When we opened the file, we found this was the loudness. These were the parameters that failed. These were the parameters that were okay, and these are the ones we corrected.”
MC:
So that was the process some six years ago or so. And then along came a company with a slightly inverse problem to the one we’re describing, but I’m just telling you the story now. The company said, “Well, we get from our clients video files with 14 variations of audio.” And I’ll tell you about that in a second. But what they said is, “We need to loudness correct. We need to Dolby encode. And some of the tracks are not in the right place, so we need to map the tracks. And there are 14 different workflows.”
MC:
So for example, their normal expectation would be stereo and 5.1. And what they would do is they would loudness correct both of those, Dolby encode the 5.1. And then the output would be stereo Dolby E. And the stereo would be replicated as would the Dolby E and the stereo three and four. And they could then play that out for their transmission service. So we recognized that this is doable. We can encode, we can decode, we can do the track manipulation.
MC:
But they said, “14 workflows, we need to automate it.” So we wrote an API for the product so that you could address it from an external device like a MAM. And we then allowed you, with a user interface, to program the workflow. So if my file has just stereo, do this; if it just has 5.1, do that, et cetera. And then the MAM says, “For this file, apply this workflow and place the result in this location.”
MC:
So it’s a REST API. It’s very, very straightforward. Now, when we built this, it became the product we call the Engine. And this is where the story of one master, many deliverables really begins. Because now people are coming to us and saying, “Well, actually, for this deliverable we need this, for that deliverable we need that.” Now, as I said, in this particular instance, it was many masters, if you could say it that way, into one deliverable, because you’re delivering to a broadcast house.
MC:
Many deliverables meant that they had more channels. They had 13, 14 channels. So the challenge there is to say, well, actually, I need to process, say, 2,000 hours a month. And so I can’t have just one file at a time. So one of the things we built into Engine was the ability to process more than one file at a time. So you license the product by picking and choosing your modules: I don’t need Dolby E, I need loudness, I need track mapping, et cetera.
MC:
And then you say, “I need to do so many hours, therefore I need to process three files at a time, four files at a time.” Or generally the content that needs to be transmitted next week arrives on a Friday and you have 48 hours to process it. So there are all those variations that you have to put in. So that’s who we are. And one of the things we did is we decided that we would focus our effort and energy on being the best at audio.
MC:
There’s many people who do video, and we recognized that not many people did audio. So what that meant is really we went back to the customers and basically said, “Whatever your audio problems are, please talk to us and we will either talk to you about how we go about solving them and where possible and appropriate we will provide the solution for you.”
Cindy:
Nice. So we’re going to get into some examples here in just a minute. But just to recap what you said, if I heard all the important points there, it’s really interesting how you started out by dealing with audio and looking at a file, being able to pull that audio out and do something to it, and then put the audio back in. And then you end up working with people who need 14 variations of files in terms of their deliverables.
Cindy:
I like how you’re talking about the different modules and so people can kind of have a menu of the different pieces that they need. You kind of alluded to this. You’re going to talk about some examples and samples. Tell me more about the challenges that come up around the multiple deliverables.
MC:
So basically, the challenges exist in this industry worldwide. You would think we’d have standards. I mean, we spend a lot of time on standards and deliverables. But standards move slowly and not everyone follows them, and so on. So the first challenge is loudness. Even though it’s been around for 10 years, we still have conversations with customers about, “Tell us about loudness. We have this scenario, that scenario.”
MC:
And every country has a slightly different loudness standard. Now, that’s terrible news when you distribute internationally, because there’s a standard in the UK, there’s a standard in Australia, there’s a standard in the US, there’s a standard in Japan. So with us specializing in it, the first challenge is, can we do the different standards worldwide? And three years or so back, Netflix came up with a variation of their own.
MC:
They did some analysis and they said the standards that are there for broadcasters aren’t the standards we want. The basis is still the same. And so Engine has a loudness configuration menu which allows you to set up any standard, legacy or new. We put all the meters in when we built this thing. So you can create a standard, give it a name, and then use that in any of your workflows.
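The idea of named loudness profiles can be pictured as a small table of parameters that workflows refer to by name. The targets below are the commonly cited values for each standard; the field names are illustrative, not Engine's actual configuration, and the current delivery spec should always be checked.

```python
# Commonly cited targets: EBU R128 (-23 LUFS, -1 dBTP max true peak),
# ATSC A/85 / CALM (-24 LKFS, -2 dBTP), Netflix (-27 LKFS dialogue-gated,
# -2 dBTP). Values here are for illustration only.
LOUDNESS_PROFILES = {
    "EBU R128":  {"target": -23.0, "true_peak": -1.0},
    "ATSC A/85": {"target": -24.0, "true_peak": -2.0},
    "Netflix":   {"target": -27.0, "true_peak": -2.0},
}


def profile(name: str) -> dict:
    """Look up a named profile, as a workflow would."""
    return LOUDNESS_PROFILES[name]
```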
MC:
So loudness is one of them. The other thing is signal processing tools. So you pull out your favorite episode from your archive. I like to use Fawlty Towers. It was shot in the ’70s, mixed in stereo. You want to do a high definition deliverable, you need 5.1. So can we have an upmix please? Or you have a modern program that has been delivered to you in a country that still only transmits stereo.
MC:
So now you have to take your 5.1 into a downmix. The other thing that happens is you may get a master file from Hollywood, where they give you a 24 channel audio file, where you have the program stereo in one and two, the 5.1 in three to eight. Then they’ll give you a second language in the next eight channels. And then they’ll give you music and effects in the other eight.
MC:
And that’s all good, because what they’re saying is, when we give you this, we’re giving you all the tools to create dubbed versions. But your deliverables will say it has to be eight channels. It has to be stereo in one and two and 5.1 in three to eight. So that’s a challenge. Now, even for a simple process like that, where you say I want to do the English, which is already done in one and two and three to eight, you still have to go into an edit suite or use a tool to take the other channels off.
MC:
So we do loudness. We do signal processing, upmix, downmix. We do another form of signal processing where the master may have been shot at 24 frames. And very often for European deliveries, you’ll just speed that up to 25 frames. Of course, when you do that, the program duration contracts, because you’ve speeded it up.
MC:
And the audio will be speeded up, but the pitch changes. So you need to correct the pitch and change the duration at the same time. So these are signal processing modules that we’ve developed, again, because customers ask for them. And then we have what we call file manipulation. Now, file manipulation is track mapping, adding channels, if your archive is two channels and your deliverable has to be eight channels, or removing channels.
MC:
As I said, if you had 24 channels coming in and eight need to go out. When you do all these manipulations, you have to label the tracks as left and right and the various surround components. And you also have to do language tagging. So in some playout platforms, you may have multiple languages, so you need to tag each of the languages. So you can see that the challenges of making these multiple masters are not insignificant.
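The file-manipulation step just described can be pictured in a few lines. This is only an illustrative sketch: the function names, channel labels, and layout here are assumptions for the example, not Engine’s actual configuration format.

```python
# Minimal sketch of a track-mapping step: keep selected channels of a
# 24-channel master for an 8-channel deliverable, then attach channel
# labels and a language tag. Names, labels, and layout are illustrative.
def remap(channels, keep):
    """Return the 1-based channel numbers in `keep`, in order."""
    return [channels[i - 1] for i in keep]

master = [f"ch{i}" for i in range(1, 25)]   # 24-channel master file
deliverable = remap(master, range(1, 9))    # e.g. stereo in 1-2, 5.1 in 3-8
labels = ["L", "R", "FL", "FR", "C", "LFE", "Ls", "Rs"]
tagged = {"language": "eng", "tracks": list(zip(deliverable, labels))}
print([track for track, _ in tagged["tracks"]])
```

The point of the sketch is that the mapping itself is trivial; the operational cost is in doing it reliably, with correct labels and tags, across thousands of files.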
MC:
And because the bulk of the budget and attention goes into the creative process, this all has to be done on a budget and at great speed, without errors. So us providing an automated or automatable platform is a very key part of it, and us providing a scalable solution is a very key part of it. So I hope I’ve given you a flavor of the challenges.
Cindy:
Yeah. I love the challenges. And Fawlty Towers, that’s perfect. 5.1 for Fawlty Towers. That’s great. And so you were talking about loudness and dealing with that, of course, and signal processing. And then looking at encoding and all of those are really clear and make sense. Those are challenges I can totally understand. Can you talk a little bit about some examples where those get used in real life?
MC:
Sure. So the first one I spoke about was the playout center. We have a lot of playout centers that use us for this. And the playout center has basically got… I mean, I can name a couple. Red Bee in France are big users. They do the playout for CANAL+. And we have a similar setup in the UK. We’ve got the national playout center in Australia, which is playing out 60-plus channels.
MC:
And until recently, we had DMC in Holland that was doing 96 channels. And each one of them had a slightly different requirement because of the content they were getting from their suppliers, and also the internal arrangements as to how the architecture of the playout facility was done. So there was one that had an integration with TDL, another one where they did their own integration. So they worked with us.
MC:
Another one who works with just watch folders. They say, “We do not have an automation system. We just want to give you watch folders.” We also have people who are working with Telestream Vantage, and we have a customer in Viacom who uses all of the different methods. So Viacom have a MAM. They have Aspera Orchestrator triggering workflows. They have Vantage.
MC:
Their post-production use our e-client technology, which means every editor gets access to the engine and can post a job into it. So there’s a wide variety of this for playout and operations. And then we have places in Los Angeles who are essentially saying, “We will take a master from the Hollywood studios and then create deliverables for every country or as many countries as you like for theatrical, for broadcast and for online.” So iTunes, Netflix. They would do everything.
MC:
Now, for them, again, they have the challenge of saying, “Can we do loudness for Japan, for Australia, for Germany, for Netflix, for Apple iTunes?” And they may also want encoding, or they may want stereo or 5.1, et cetera. So the idea here is that it sounds very complex, but you set up a workflow for a particular deliverable, give it a name, and then over the API you can say, “This file needs this workflow applied to it.”
MC:
The workflow can be very simple. It could be just one step, loudness only, or it could be six or seven steps. For six or seven steps, I’ll give you an example. I have stereo: loudness correct the stereo. Do an upmix. Loudness correct the upmix. Dolby encode the upmix. Track map it, channel label it, and output it. That’s a multistage workflow, but you say this is the setup for Tx channel one, and the MAM calls the workflow for Tx channel one and away you go.
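The idea of a named, preconfigured workflow can be sketched as an ordered list of steps keyed by deliverable name. The step names, the `WORKFLOWS` table, and the runner function are hypothetical illustrations, not Engine’s real configuration syntax.

```python
# Illustrative sketch of named multistage workflows keyed by deliverable.
# Step names and structure are hypothetical, not Engine's actual format.
WORKFLOWS = {
    "tx-channel-1": [
        "loudness_correct_stereo",
        "upmix_to_5_1",
        "loudness_correct_upmix",
        "dolby_encode_upmix",
        "track_map",
        "channel_label",
        "output",
    ],
    "loudness-only": ["loudness_correct_stereo"],
}

def run_workflow(name, input_file):
    """Apply each step of the named workflow to the file, in order."""
    result = input_file
    for step in WORKFLOWS[name]:
        # In a real system each step would invoke a signal-processing module;
        # here we just record the chain of operations applied.
        result = f"{step}({result})"
    return result

print(run_workflow("loudness-only", "master.wav"))
```

A control system then only needs to name the workflow and the file; everything else is fixed at setup time, which is what makes the process automatable.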
Cindy:
And so how would, for example, the facility in Los Angeles, maybe if they’re using a Telestream Vantage and a MAM system. What would that look like if they were actually doing that on a day to day basis?
MC:
So you don’t see it is the first thing, because it’s a machine talking to a machine. But what I can do is give you some scenarios. It’s a good question. Recently, as we all have been working from home, we’ve had to… People are sitting there saying, “Oh my God, how do I do my job?” And we’ve had our customers calling us and saying, “Well, you know what? I can see what’s going on with your engine,” because I bring up the e-client and it’s monitoring all the jobs and so on.
MC:
I can go away into TeamViewer or pcAnywhere and initiate things and so on. But in a versatile setup where you have clients posting from multiple devices, what happens inside engine is there is a job manager. We call this e-flow. And e-flow says, “I will receive jobs from Vantage, from Orchestrator, from the MAM system, from the watch folders, from the e-client. I will put them in a queue, and then I will look around and see what resources I have and how many.”
MC:
So if I have the ability to run four processors, and we call each processor an ESP, an Emotion Signal Processor, it says, number one is busy, number two is free, so I will open number two and post a job into it. The next one that becomes free may be number four. And it will do that. So it basically opens and shuts these processors as it does the jobs and posts them.
MC:
The jobs get queued inside e-flow on a first-come, first-served basis. So anybody can post a job and it just joins the queue. And then someone comes along and says, “I have a priority job.” So we give you the option of assigning a priority to the job. What it will do is jump the queue, and the next ESP that’s available will be used to process that job. So that gives you some idea of what the setup can look like.
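The queueing behaviour just described, first-come-first-served with a priority override, can be sketched with a small heap-based queue. The class and method names are illustrative only; e-flow’s internals are not published.

```python
import heapq
import itertools

# Sketch of a FIFO job queue with a priority override, in the spirit of
# the e-flow job manager described above. Names are illustrative.
class JobQueue:
    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker preserves FIFO order

    def post(self, job, priority=False):
        # Priority jobs sort ahead of normal ones; within a priority level,
        # the counter keeps first-come, first-served ordering.
        rank = 0 if priority else 1
        heapq.heappush(self._heap, (rank, next(self._counter), job))

    def next_job(self):
        """Hand the next job to whichever ESP becomes free."""
        return heapq.heappop(self._heap)[2] if self._heap else None

q = JobQueue()
q.post("episode-1")
q.post("episode-2")
q.post("urgent-promo", priority=True)  # jumps the queue
print(q.next_job())  # → urgent-promo
```

The heap makes the “jump the queue” behaviour automatic: the next free processor simply pops the head of the queue, and priority jobs are always at the head.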
Cindy:
Okay. So if somebody’s got a master and they need to get it out to multiple, I don’t know, stations, playout, channels, that’s the way to do it. And I like hearing about all the different ways really how you put the workflow together. And like you said, it’s just a machine talking to a machine. So once it’s set up, it’s off and going. And your examples are great, because you’re talking about deliverables for broadcast and for online platforms. We had a question earlier about social media or online distribution. Could you talk about that?
MC:
So this is a new thing. The broadcast world has bodies that meet and discuss. They take a long time. CMT does it. The EBU do it. And then they agree some standards, to say, in order to get the best user experience, this is the standard that we want you to adhere to. The social media platforms are essentially about anybody posting anything in any form. And they have some QC or they have some guidelines, but they’re not hard and fast.
MC:
And so if a professional is trying to deliver to a social media platform, they have concerns. They have spent a lot of money to make it sound good and look good. So they want to deliver to some level of quality and consistency. And so we often get asked, “What do we do if we want to do something for YouTube, or for…” Well, iTunes is a platform as opposed to Apple TV, if you like. But the guidelines are quite vague.
MC:
Sometimes there’s a spec that is available which says, this is what we would like you to do. The key thing is it is always different to the way you make the master. So what we do is we are a very consultative company where we like to discuss the issues with our customers and then propose solutions and then see how they get on. And then we may iterate a couple of times to get it right.
MC:
Because at the end of the day, the customer, as I said, this expensively made content needs to be given the best treatment possible. So for social media, I’ll probably give you an example as a challenge that was sent to us. We were in Soho talking to an audio company that was just getting into feature films. And his issue was that he would have a 7.1 master.
MC:
Because it was a European or British company, the feature in question was shot at 25. So what he said to me is, “Look, for my deliverables, I’m going to start with this 7.1. And then what I want to do is deliver a 5.1 that’s loudness corrected, as a wave file,” because it was an audio house. And the first wave file is a single file that has all six audio channels interleaved.
MC:
Then I also want six mono files that represent the 5.1. I now want to do a downmix and get a stereo. I want to loudness correct that and give me the stereo as an interleaved file and as two monos. So if you’re following this, we’ve got interleaved 5.1, six monos, a stereo, two monos. And then I want to do the same at 24 frames a second so that I can do distribution to those countries.
MC:
So I said that’s great. We can do this for the European loudness standard, the U.S. loudness standard, if you like. And I said, we can also do the same for social media. We can create the same set, but we would apply a Netflix setting, et cetera. Now, engine is capable of handling up to 64 channels. So at the end of it, we had the ability to deliver, I think it turned out to be 12 or 15 or some number of deliverables, in one hit.
MC:
That was a challenge, and not necessarily a practical example of how you do it, because everywhere there will be subtle variations. But the idea is that you can do it, if there’s discipline and if there is a way you want to do this. The way the 64 channels become useful and interesting is in a playout center, where you may say my video master is common to a number of languages.
MC:
So what I really want to do is, if I do Dolby E encoding, I’ll say pair one is stereo, and my 5.1 is encoded as Dolby E in pair two. That’s English. The next two pairs could be French. The next two pairs could be German, et cetera. And you could end up supporting 16 different languages in a 64-channel package.
Cindy:
And that works for social media? That applies to social media users?
MC:
No. Social media, it doesn’t work in that parallelism. The important thing with social media is to say it’s different to broadcast, and it’s a moving target. So you need to be able to program it. Now, the other thing is it needs to sound right. What you have to think about is when you’re sitting in the cinema, the environment is darker and quieter, apart from some noisy popcorn eaters, compared to your living room, where you may have a smaller room and a different ambient noise level.
MC:
And then you have to think about sitting on a plane with your iPad and listening to that. If you think about it, these are three different listening environments. And ideally you want the audio to be presented as three choices, so that you can say, here’s my cinema audio, here’s my living room audio, here’s my mobile audio. And that really is what the market is now waking up to. There is interest in saying, if I’m a commuter, and I guess at the present time there are not that many commuters, we want a good experience for all of them.
Cindy:
So it’s the flexibility you’re talking about in terms of no matter what the deliverable is, you’ve got a way to meet the needs…
MC:
Yes. And that applies to evolving standards as well. The EBU are always discussing how to make audio better. And so we give you the flexibility and the programmability to give you whichever version you want. But the key thing is to keep it automated so that this isn’t costing as much as other solutions might.
Cindy:
Right. So we do have some questions. And also, if you have questions for us, you can always send them our way. And we’ll put the contact details in the show notes so you can do that. And one of the questions we have is, what about Atmos? How does it tie into all of this?
MC:
Now, we’d like to think of ourselves as an innovative company, but actually we ride behind the curve. What we say is, innovation for us is what can we do that is useful to the end user, as opposed to a gee-whiz feature. Now, the reason why I say that is Atmos has been around for a while and is very successful. And if you’ve ever had the pleasure of watching an Atmos movie, it’s absolutely stunning.
MC:
So our first brush with Atmos was a couple of years ago, when a large satellite broadcaster in the UK came to us and said, “We do an awful lot of real time Atmos and we have a workflow set up for that, but we are trying to do file-based Atmos. And the difficulty we’re having is our servers won’t play it out.” And so we analyzed the thing. And what had happened is that the Atmos mix had been encoded into a format called ED2, which is Dolby E twice.
MC:
Because Atmos for transmission is about 16 channels of audio. And so the timing of the Dolby ED2 wasn’t right. We analyzed it. We found out what the problem was. We came up with a fix and we gave them a solution and it played out and they got what they wanted. Now, that’s probably not the answer people want to hear about Atmos. And the reality is what we do is very boring.
MC:
The creative aspects of Atmos are kept creative in the edit suite. We come into play when you want to distribute it, and for whatever reason, the Atmos that you’ve been given won’t let you transmit, or won’t let you archive or store. That’s the other area where people are interested, in packaging it. And so we are talking to a number of people. We’re literally talking to them and saying, “What is the problem? What is the challenge that you have?”
MC:
And right now the workflows are simple enough and infrequent enough to be solved in post, or to be solved by a narrow line of workflows. Now, I can’t name names. And also, we’re relying on two things: we’re relying on a partnership with Dolby, where we have a very good relationship. And we’re relying on the customer saying, “I have so much content that I can’t do this manually, and I have these different workflows.”
MC:
So basically, if we go back again to the discipline of when we had the movie and we made the prints, or when we had a piece of content, we did very little with it. We restricted our workflows so it became an efficient operation. The challenge we have now is the efficiency is expected, but the discipline in the workflows has gone. This is where the issue is.
MC:
And in Atmos, we’re still in that early stage where we’re being very careful about how we produce it, how we transmit it. But that lid is about to come off. And I’ll give you a little trailer. I don’t want to make out that we’ve got something magic around the corner, but we’re monitoring the challenges as they occur, and we will be providing the solutions that work.
MC:
So one of the issues, as an example, is that you get three files for Atmos: one for the beginning titles, one for the program, and one for the end credits. And you have to stitch these together. At the moment, it’s done in a very tedious manner, in an edit suite and so on. And we’ve been asked to look at that. We are still looking at it, because the requirement hasn’t crystallized enough to be a pain for the customer to say, “I can’t bear this anymore.”
MC:
It’s a careful market. If they can get away with it and do it, they don’t want to spend the money. It’s when people come along and say, “When I get this one feature a week, I can deal with it. But if I have to do 15 features, then I’m going to scream.” It is a volume thing. As I said, we do best when we automate; where jobs are not frequent enough, careful pruning and tweaking will take care of it, and it’s still best done manually.
Cindy:
That’s right. It’s all about the mundane tasks and when that repetitive work hits that tipping point.
MC:
Yeah. And then the mundane tasks still need to be done really, really well so that you haven’t ended up ruining the quality.
Cindy:
Nice. All right. More questions here in our webinar on one master, multiple deliverables. Here’s a question. Is there an API? And if so, how does it work?
MC:
Okay. So yes, there is an API. It’s a REST API, and REST APIs are very simple in their construction. What we say is, you preplan the workflows and program them in the engine. We have a very good tool for it. It’s something that really wants to be done with care and attention. And then what you’re asking your control system to do is talk over the API.
MC:
And all you have to say is, “This is my file. This is the workflow. This is where I want you to place the file, with this name,” if you like. And then post the job. And then you can request progress from it. So you can say, “Where am I with this?” And we will say 60%, 70%, et cetera. We’ve had people integrate with us in a matter of days. It’s not a very complicated process.
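The shape of that exchange can be sketched as follows. Note that the base URL, endpoint paths, and JSON field names here are invented for illustration; they are not Engine’s documented API, only the general pattern of a REST job API as described: post a job naming a file, a workflow, and an output location, then poll for progress.

```python
import json

# Hypothetical sketch of a control system driving a REST job API.
# Endpoint paths and field names are illustrative assumptions.
BASE = "http://engine.example.com/api"

def build_job_request(input_file, workflow, output_path):
    """Build the URL and JSON body a control system might POST."""
    body = json.dumps({
        "input": input_file,
        "workflow": workflow,   # a preconfigured, named workflow
        "output": output_path,
    })
    return f"{BASE}/jobs", body

def progress_url(job_id):
    """URL a control system might poll for percent-complete."""
    return f"{BASE}/jobs/{job_id}/progress"

url, body = build_job_request("master.mxf", "tx-channel-1", "/out/master_fixed.mxf")
print(url)
print(progress_url(42))
```

The simplicity is the point MC makes: because the workflows are preplanned inside the engine, the integration surface is just “post a job, poll progress,” which is why integrations can be done in days.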
Cindy:
Great. So we know we have people right now using the API every day and obviously it’s a huge part of the solution here. Another question. This is good. Can I just do all of this in my Pro Tools room? Over to you MC on that one.
MC:
Okay. I think if you’re doing a few hours a day or a week, absolutely. The challenge really comes when you have to do this day in, day out with hundreds of hours. We recently quoted on a system where the requirement was for 200 hours a day. For that, you need a lot of people, and I think the client won’t pay for that. Now, again, in our lockdown mode, we can’t always get access to the edit suite.
MC:
The suite is there for a good reason. Again, if I go back to why we started the company 10 years ago, it’s because people were saying, we have these suites and they’re great and they’re booked up, but we can’t take an hour of the suite time to do loudness correction. We should be using it for more creative, more billable work. And also a suite requires talent, it requires a room, and it requires a huge amount of equipment. So if you were able to bill 10,000 hours a month, I’m sure a lot of people would love to do that, but they can’t.
Cindy:
That’s so good. I’ve had the pleasure of working with some people who tried out engine themselves, and it was super fun demoing and working with… MC, let me just throw this one over to you. If someone wants to try out all the cool things you’ve been talking about for making multiple deliverables, what steps should they take next?
MC:
Oh, okay. So you go to our website. On every single page of our website there’s a little blue button which says Trial. Click on it and fill in a simple form. You can download the engine and we give you a 10 day trial. The product works on Windows, Mac and Linux. And so that’s how people start. We encourage them to engage with us. We have also got some useful guides. There’s a loudness factbook, a Dolby factbook. We have a Netflix one.
MC:
So we also encourage them to talk to us about any issues they have. As I said, when we focused on audio, one of our missions was to be the best in audio file based processing. And one way for us to do that is to engage with the customers and actually share either what we’ve learned from our experiences or listen to what their pain is. And sometimes shrug our shoulders and say, “We can’t help you,” Or we can look at it and say, “Hey, we’ll take a look at this for you.”
MC:
In some instances, we may charge you for it. In most instances, we’ll at least do the investigation and come back to you with some ideas and solutions. We like to work with our customers closely and work as a team. We are small and flexible and agile and keen to be the best. So that’s the way we’re going to deal with it.
Cindy:
All right. I’ll make sure all those links that you just mentioned, MC, are in the show notes, so you can take advantage of those resources. And thank you so much. If you’ve got one master and need to make multiple deliverables, we look forward to working with you. Thank you.
MC:
Thank you very much.
Webinar – New Loudness Challenges: Broadcast, Online, Cinema
Wednesday, 24 June 2020
The media industry is changing quickly, and it can be hard to keep up with changes in standards, regulations, and practices surrounding loudness.
Get up to speed quickly! Watch the webinar replay to learn about:
- Must-know specifications and requirements
- Loudness in different playback environments
- Lesser-known specifications around online delivery
- How to implement loudness processing to optimize audio quality
- What you need to know and ensure compliance for global distribution
Cindy:
Welcome everybody. We’re so glad you’re here. And today we’re looking at new loudness challenges in broadcast, online, and cinema. So today you’re going to hear about loudness challenges and the history of loudness a little bit, and the challenges we’ve come up against in the current day. And we’ll look at the delivery process and also what’s happening with loudness for online and yeah, it’s going to be a great day. So hi MC.
MC:
Oh, hi Cindy. So yeah, thank you for hosting this. And this is our fourth webinar in the summer and we hope to do some more. This particular one, we’re going to focus on loudness. A number of our customers gave us feedback that, whilst it’s great that we do all these different types of audio processing, they wanted to hear more about loudness itself. So this is, I wouldn’t say necessarily back to fundamentals, but it is talking much more about loudness.
MC:
So I’m going to start by just talking about, just an introduction to loudness, what is loudness? How it came about and so on. And then we’ll move on to some other things. So basically a bit of history, when television started and we obviously had video and audio, we used to use a technique where we measured the peak audio level to determine what was the highest peak you could get. And there were a number of reasons for it. It was a very crude, but very effective way of measuring how loud the loudest bit would be. And from the analog transmission days, you didn’t want the level to go above a certain amount because we used frequency modulation for the audio, and if you deviated the frequency too much, it interfered with the video. So that’s why it was limited.
MC:
And that practice worked very well. The whole industry evolved and developed around it. Of course, there were different standards around the world as to what the peaks were. In the UK we had the BBC standard. There were several EU versions, there was a Nordic one, and in America we had the volume unit meter, and so on. But in the 80s, people realized, commercials people especially, that whilst you weren’t supposed to exceed this peak, you could stay pretty close to it, and the result was your commercial was a lot louder than the rest of the programming. Everyone who watched television knows how irritating that was. The people making them thought it was creative, but it was really irritating.
Cindy:
It’s true. It would be so tough. You’re watching and then just bam.
MC:
Bang. Yeah. And also there was no control when you switched channels, which wasn’t a problem in the old days, but when you have 200 channels, as you switch between them, if the audio levels are not quite right, you get a sudden jump in audio and not a good experience. And the other thing was that when people made this constantly loud audio, it meant that we lost the dynamics in the audio. Everything was just loud and compressed. So the audio engineers in the television world had been working on a loudness standard for a number of years. And it was actually about 10 years ago that they came up with the specification. That spec has been evolving since then, and it was 10 years ago that we got into the business.
MC:
So very briefly, for program loudness, or loudness, there are five measurements that we look at and care about. The first one is true peak, which is the highest value that the waveform can achieve. And that’s important because you don’t want the true peak to go into clipping in a system. And if you apply compression to the audio downstream, then you want to limit the amount of peaking, because when you compress it, you can create clipping as a result of the compression process. So -3 dB is a good value for true peak.
MC:
Then the basis of loudness measurement comes out of a block of integration that you do on the audio. There’s a 400 millisecond block, and one 400 millisecond block forms the basis of momentary loudness. So that tells you, over a short time, what the highest audio level is. If you take that average over three seconds, that gives you what we call short term loudness. And then if you take it for the duration of the program, you get program loudness.
MC:
Now, program loudness is slightly more complicated than just the total average, because in order to get a representative value, they wanted to take out the periods of absolute silence. So if the audio goes to total silence, that measurement block is ignored. And there are a few more complicated versions of it, which people can read about in our loudness fact book that you can get off our website.
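The block structure just described can be sketched numerically. This is only an illustration of the idea: real BS.1770-style measurement gates on energy with absolute and relative thresholds rather than averaging dB values directly, so the numbers and the -70 LUFS silence gate here are simplifying assumptions.

```python
# Simplified sketch of block-based loudness: 400 ms blocks give momentary
# loudness, a ~3 s window of blocks gives short-term loudness, and the
# gated average over the whole program gives program loudness.
SILENCE_GATE = -70.0  # treat blocks below this (LUFS) as silence

def short_term(blocks, i, window=8):
    """Average of roughly 3 s of 400 ms blocks ending at index i."""
    segment = blocks[max(0, i - window + 1): i + 1]
    return sum(segment) / len(segment)

def program_loudness(blocks):
    """Average over the program, ignoring silent blocks."""
    voiced = [b for b in blocks if b > SILENCE_GATE]
    return sum(voiced) / len(voiced)

blocks = [-24.0, -22.0, -90.0, -23.0]  # per-block momentary loudness, LUFS
print(round(program_loudness(blocks), 1))  # the -90 silence block is ignored
```

Note how the gated average (-23.0) differs from the naive average of all four blocks: dropping the silent block is what makes the measurement representative of what the viewer actually hears.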
Cindy:
That’s true-
MC:
But those are the… Sorry, go ahead.
Cindy:
Oh, I was going to let you clear your throat there or grab a water for a second and just say that, yes, we do have the loudness fact book for you. And there’s a link for you in the chat. But back to you MC on your five parameters. Yeah.
MC:
Yeah. So, the five parameters are true peak, momentary, short term, program loudness, and then we have loudness range. And loudness range is an attempt to describe the dynamic range in a program. Now, you don’t want to look at the absolute values, because there will always be silence and there will always be a true peak. So you take the measurements that you’ve done as a part of the integration. And then if you do a histogram and you only measure between the bottom 10th percentile and the top 95th percentile, then the range of audio that you get is described as loudness range.
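The percentile idea can be sketched in a few lines. This is only the core of the idea: the real EBU loudness range algorithm (Tech 3342) also applies gating before taking the percentiles, which this sketch omits, and the sample values are invented for illustration.

```python
# Sketch of loudness range: sort the short-term measurements and take the
# spread between the 10th and 95th percentiles, discarding the extremes
# (silence at the bottom, isolated peaks at the top).
def loudness_range(short_term_values):
    s = sorted(short_term_values)
    lo = s[int(0.10 * (len(s) - 1))]   # bottom 10th percentile
    hi = s[int(0.95 * (len(s) - 1))]   # top 95th percentile
    return hi - lo

# Invented short-term loudness values (LU) for a fairly dynamic program:
values = [-40, -33, -30, -28, -26, -24, -23, -22, -21, -15, -12, -10]
print(loudness_range(values))
```

Trimming the extremes is what makes LRA useful: the quietest and loudest outliers are always there, so the interesting number is the spread of the bulk of the program.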
MC:
Now, to give you some examples, a really wide dynamic range would be something like Mission Impossible. It’s a very noisy film, lots of bullets, guns, explosions, fast moving, and so on. And then you have others where it may be a lot less. So if you have a mostly dialogue film or a general TV series, like, say, Friends, there’s not a lot of dynamics in it. Friends, I’d describe as dialogue, a door slam, and a bit of piano music. So not a lot of dynamics.
MC:
But if you look at some of the more recent episodics, such as CSI, or, I don’t watch those things, but they have a lot more dynamics, with 5.1 and so on. So loudness range is a measure of that dynamics. And why is that important? It’s important because it helps you understand how that audio will sound in different listening environments. In a cinema, where you’re sitting for 90 minutes or something, you can enjoy that wide dynamic range and live with it. If Mission Impossible was in your living room for eight hours a day, that would be kind of painful and not so enjoyable. And then, as we’ll talk about later, if you have online delivery, where your listening environment is noisier than normal, then the audio has to be a little bit louder and a little bit less dynamic so that you can actually have a good listening experience. So that, in a nutshell, is program loudness.
Cindy:
Got it. So the peak measurement system of old has really changed. And so if I got this right, the parameters that now matter are true peak and program loudness, short term loudness, momentary loudness, and then you were talking about LRA, the loudness range. So changed from days of old.
MC:
That’s correct. Yeah. Now, why so many measurements? So if we talk now about the production process: when you’re doing a production, the audio engineer has what we call a sound budget. The budget goes along the lines of, if I’m mixing a 90 minute movie, then I want to make sure that I am at a program loudness level, I manage the LRA based on the environment I’m trying to deliver to, and then I also want to know what my short term and my momentary peaks are, just so that you get an idea. Now, this is a purely mental process. You can’t see it across the whole program, but you use these tools to plan certain things. If you want to make an impact, you may look more at the momentary loudness. If you want to have a short burst of something, you will look at the short term loudness, and so on.
MC:
So in production, these tools are used, but they’re used in your edit suite. So there are meters inside the edit suite. So as you’re mixing, you can see what’s going on. Now, program loudness itself is the average of the whole program. So it’s not very easy to monitor. Typically, the loudness meter, as it’s called, would give a running program loudness. So as you do the mix, you know what the average is up to that point. You don’t know it until you get to the end. Now you may say, how on earth do you mix to it if it’s running along? And therein is a problem, but actually most well-trained audio mixers have an inherent sense of balance. And the reason why they came up with -23 in Europe and -24 in the US is because that’s what you naturally mixed to; you’re trained for it. Now, you may not hit the number exactly, you’ll be slightly off, but you’ll get pretty close to it.
MC:
So when you’ve finished your production, you will have used these five meters to create the sound that you want, and then you want to deliver it. Now, when we go to delivery, there are only three parameters that really matter. And I will change my story in a little while. On the production side, as I said, you use all five. In delivery, generally speaking, if you’re doing a broadcast delivery, program loudness and true peak are the bits that matter. The LRA would have been taken care of if you’re doing an episodic or a full broadcast, because that will be within the guideline given to the production company. So program loudness and true peak are the ones that are most measured when you’re making a delivery for broadcast.
MC:
Now, the thing that matters for broadcast then is, if you’re making content and you want to monetize it all over the world, you have to deliver all over the world. And sadly, most broadcasters have a slightly different audio spec to the recommended. Now, the recommended is a guideline; the EBU is a guideline, and in the US the ATSC A/85 defines it. It’s a fairly tight spec, but if you go around the world, they may say, “We want the true peak to be -3 or -1 or -2. We want the program loudness to be -23 in Europe, -24 in America,” but they have different tolerances. So in France, it’s got to be bang on -23; you’re not allowed any deviation. So that means that as a content delivery house, you have to measure these things as they come in to you, and as you’re repurposing them, you have to ensure you meet the spec. If you have a theatrical mix coming from a Hollywood studio, for example, like your Mission Impossible, that is an LRA of 30 or thereabouts. And most broadcasters will give you a guideline that the LRA mustn’t be greater than 16 or 18. So you need to reduce it for that.
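The per-broadcaster variation MC describes can be pictured as a set of delivery profiles checked against a measured file. The profile names, numbers, and tolerances below are illustrative only; real broadcaster specs differ:

```python
# Hypothetical delivery profiles -- every broadcaster has its own flavor,
# as MC notes; these values are examples drawn from the discussion.
PROFILES = {
    "ebu_default":  {"program": -23.0, "tolerance": 1.0, "true_peak": -1.0, "max_lra": 18.0},
    "france_tight": {"program": -23.0, "tolerance": 0.0, "true_peak": -3.0, "max_lra": 16.0},
    "us_atsc":      {"program": -24.0, "tolerance": 2.0, "true_peak": -2.0, "max_lra": 18.0},
}

def check_compliance(measured, profile):
    """Return a list of failures for a measured file against a profile."""
    failures = []
    if abs(measured["program"] - profile["program"]) > profile["tolerance"]:
        failures.append("program loudness out of tolerance")
    if measured["true_peak"] > profile["true_peak"]:
        failures.append("true peak above ceiling")
    if measured["lra"] > profile["max_lra"]:
        failures.append("loudness range too wide")
    return failures
```

A theatrical mix with an LRA of 30, measured against any of these, would fail the `max_lra` check and need dynamic range reduction before delivery.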
MC:
And then for social media, there are guidelines, rather than standards, but the big guideline is it’s got to be loud enough. So typically people are talking about -18, -16, as the program loudness. Now, ironically, a content creator may do this, but commercials may be mixed for -23. It may be the one instance where the commercial doesn’t sound as loud as the program, if that spec is met.
MC:
So it’s a lot of fun and games in terms of what you have to do for delivery. The theatrical to broadcast makes it a very, very challenging process, because where you have lots of audio dynamics, it’s very hard to predict whether the sound’s going to go from quiet to loud to quiet to loud, or stay consistently loud. So the processing needs to be very, very carefully done. Well, in fact, loudness processing in general needs to be carefully done, because we want to preserve what the audio dubbing mixer wanted to transmit and show to people.
Cindy:
Okay. So in the production process, what it sounds like then, and the delivery process, I guess, really the challenge you’re talking about is you don’t have any idea of what parameters were used when the production took place, right? And then the mixing process might have been designed for a different deliverable because you could be delivering to a German broadcaster, a US broadcaster, Netflix, and you don’t know how that all fits into it. And so that’s definitely the problem. Do you have more to talk about around online? Are you just going to jump right into the solution around that?
MC:
Just a couple of things for the challenges. So, as I said, in production you use the meters to design the sound, and by and large, you’ll be close to the spec for one broadcaster, your primary broadcaster, somebody who employs you for example, but it needs a variation. The other one is you may have content where loudness wasn’t a factor at all, an archive, content done 15, 20 years ago, before the loudness spec came out. That’s an issue. And the other one that’s a big issue also is the theatrical content. The theatrical content is built for the cinema, because that’s where you make the billion dollars. The broadcast is a long tail.
MC:
So we have spoken to a number of studios and they say, “We only do one mix. We don’t always provide what they call a near field mix.” Or if they provide it, you may not have it. And then the other things that you clearly have, which are not subject of this, but I’ll just mention, is you may have a stereo only mix, but you need to deliver 5.1 and you may have a 5.1, and you need to deliver stereo. Now why this is important is, if you do an upmix or a downmix, you will change the loudness. So the loudness has to be… You have to ensure that the loudness meets the spec if a downmix or an upmix process has occurred.
Cindy:
Got it, got it. Okay. So it means that people… What I’m really hearing is that people who are already using our solutions right now in broadcast and for other types of delivery are now starting to talk to you about online delivery and that’s theatrical mix as well. So a lot of choices there.
MC:
Yeah. So a couple of things I would say is, whilst I’ve described the delivery challenge, if you like, what we do is we have designed a set of algorithms that allow us to meet any deliverable spec, and there are plenty of them. So the loudness specs for broadcast are relatively straightforward: true peak and program loudness with slightly different variations. Those we handle easily, because program loudness is a very simple thing; it’s the average audio level in a program. So if the audio is slightly low, you apply gain. If it’s slightly high, you apply attenuation. Now, if you’re applying gain, you may screw up the true peaks. So we tend to measure all the peaks in the file, put a gate around them, and then we will locally attenuate the peaks and do a little mix in, mix out, exactly as you would do it in a manual process.
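A heavily simplified sketch of that correction idea: apply a global gain to hit the target program loudness, then locally attenuate anything still above the true-peak ceiling. This is an illustration of the principle only, not Emotion’s actual algorithm, which mixes in and out around each peak rather than limiting per sample:

```python
def normalise(samples, measured_lufs, target_lufs, peak_ceiling_db):
    """Sketch: global gain moves the average onto the target; any sample
    that would then exceed the true-peak ceiling is attenuated locally.
    (Per-sample limiting is a crude stand-in for a gated mix-in/mix-out.)"""
    gain = 10 ** ((target_lufs - measured_lufs) / 20)   # loudness is an average, so gain fixes it
    ceiling = 10 ** (peak_ceiling_db / 20)              # dBFS ceiling as linear amplitude
    out = []
    for s in samples:
        s *= gain
        if abs(s) > ceiling:                            # attenuate the offending peak
            s = ceiling if s > 0 else -ceiling
        out.append(s)
    return out
```

The point of the gate in the real process is to treat each peak as a region, fading the attenuation in and out so the correction is inaudible.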
MC:
So the big question is, so why not do it manually? And the reason is for one piece of content, you may need to make 30, 40 deliverables if you’re a studio delivering to multiple clients. And so that becomes almost impossible to do manually in a cost effective manner. So by creating the algorithms that we have, we then have created an automation layer on top of it and that automation layer allows you to do this thing unsupervised and we will deliver to any spec and it’s guaranteed.
MC:
And so the tool, we have two tools. There’s a desktop tool for low volume stuff, where you basically say, for every deliverable I have, I create a profile. So I say that I’m delivering to a German broadcaster, so it’s -23 program loudness, the true peak is -3, the LRA has to be 16. And we don’t worry about the short term and the momentary for that. That spec defines that deliverable, and we will measure the audio to say, “Does it meet that spec?” If it doesn’t, we’ll make the necessary adjustment. Now, if you do it in the desktop, it literally is pick a file, pick a profile, press correct, and it does it. It generates a nice report to tell you what it’s done so that you’re accountable: “It was okay leaving me.” And then if you need to make changes to it, you can edit and rename that spec. Now, you can have as many of these profiles as you like. And what this means is that the operator doesn’t have to be skilled in audio to do this deliverable.
MC:
The other form of course is, if you’re doing more than a few files a day, you don’t want to employ lots of people to do this, you want to automate the process. So now, you could have our product Engine, which has a number of automation strategies. You could have watch folders, very simple. If you have 16 clients or 16 deliverables, you have 16 watch folders; throw the files in the relevant watch folder and out they pop, corrected. If you have a Telestream Vantage or an Aspera Orchestrator, you can have a plugin to that. So whilst you’re preparing your video deliverable in your Vantage, you could have the audio taken care of by Engine, and it will do the loudness correction for that.
MC:
Obviously that gives you more than 16 profiles if you need it, because it’s as many as you like. And you could also have this driven by a MAM system. So the MAM system can do this. We publish a REST API, so your in-house automation system can drive it. So there are many possibilities. So what we’re trying to do is say that we can do the deliverable for any standard, and we can also do a deliverable in an automated fashion through any automation system that you have. But I haven’t spoken about online.
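Driving such a system over a REST API typically means POSTing a small job description. The endpoint and field names below are hypothetical, sketched only to show the shape of the integration; the real Engine API schema is documented by Emotion Systems:

```python
import json

def build_job(input_path, profile, priority="normal"):
    """Assemble a job request for a loudness-correction REST API.
    All field names here are illustrative assumptions, not the real schema."""
    return {"input": input_path, "profile": profile, "priority": priority}

# An in-house automation or MAM system would POST the JSON body, e.g.:
#   POST /api/jobs   (hypothetical endpoint)
#   body: json.dumps(build_job("/media/show.mxf", "ebu_default"))
```

The profile name in the job ties back to the same stored deliverable specs used by the watch folders, so every automation route hits identical settings.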
Cindy:
That’s okay. Before you speak about online, I just want to see if I got your key points there. I feel like what you’re saying is with all the different processes that need to be done, they can all be done in an automated way and that means that people can go off and be creative and do what they do best. And then automation makes it repeatable. And with the watch folders, that really makes it super easy and straightforward. And of course, as you mentioned, MAMs and Aspera and Telestream, if you want that integration and the rest API as well. So does that sound like what you were talking about around the automation part?
MC:
Mm-hmm (affirmative). Yeah. So, easy to use, flexible, scalable. That’s the idea. Now, the online, as I said, it is basically… Let me talk a little bit about online. So as I said, you’re in an environment, if you’re on a plane and you’ve downloaded a movie on your iPad and you want to watch it, the ambient environment is noisy, it could be on a plane, on a train. Why you’d want to watch Mission Impossible in that environment, I don’t know, but hey, people do. I watch it on planes and I’m quite often annoyed about the audio. So people are now coming to us and saying, “Hey, we’ve got a… We do the broadcast because we have to meet a specification. We may get fined. We may get our content rejected. Now we’ve got these online platforms and we need to do that because we want our audio to be heard.”
MC:
And so we have two approaches to this. One, we say, “Tell us the spec and we can deliver it.” The other one is where they say, “Hey, we are not sure what we really want.” And so we do this very frequently: we’ll just interact with the supplier, give them some files, give them some ideas. And a classic one here is not just increasing the volume, it’s increasing the volume and managing the compression, or the dynamics, the LRA. And so the LRA processing, which we originally used for taking a theatrical mix and making it broadcast ready, we can now use to take it even further down. And now what we’re doing is we’re restricting the dynamic range and we’re making it louder, so that if the environment isn’t friendly because it’s noisy, you can still hear the dialogue. And so that’s really where we are with online.
MC:
Now, nothing changes as far as the product itself is concerned. You don’t need to do anything different, you just change the settings. And probably the one I have to mention is Netflix, who have their own ideas about how they want their audio to be presented. So they’ve taken a variation of loudness, which actually is an older version; the loudness spec is evolving. And it is something that threw a lot of people, because it was a technical change.
MC:
Now, when we had the peak program meter in the UK, we had BBC PPM6. So you just didn’t exceed PPM6 and you were in good shape. Here we have four or five parameters, and I haven’t mentioned some of the other subtleties in it. And so when Netflix came up with its change, it was something that non-audio people found challenging. But now remember, at delivery, you don’t have a lot of audio people. You have machine operators, I guess, and so on. So those people are sitting there saying, “How do we do this?” So what I was saying is, we have a service as part of what we do: come and talk to us about your audio needs and we’ll either set it up for you, or help you set it up, or interact with you to say, “This didn’t work out so well.” So we’ll say, “Okay, let’s try this variation and that variation.”
Cindy:
And does that work for any size company? Do I need to be a big company to do that? Or do you work with smaller companies as well?
MC:
No. We are a small company. We deal a lot with the larger companies and we deal a lot with the smaller companies. So the desktop tool will do everything that the main Engine does, it just doesn’t scale. So if you need… we’ve had customers… We had a customer who was basically making their first entry into movie mixing. They were a big noise in commercials mixing and they said, “We want to mix movies.” So they know their audio, they know things, but they have a deliverables problem. They said, “We have a 7.1 mix, we want to deliver 5.1 in 25 frames. Then for US delivery,” this was a UK company, “we want to time compress the audio and pitch shift it so that it sounds right. And we want 5.1, we want stereo.” And so I said, “So, what would you do?” He said, “Well, we’d sit in the Pro Tools suite and knock these out one at a time.” So I said, “Well, what we can do is create a workflow which will allow you to do this in a single pass.”
MC:
So for them, what they’re really saying is, what will take me a day to do, I can do in an hour. So they look at a saving in a very different manner to somebody who says, “I need 10,000 hours a month for processing.”
Cindy:
Nice. Nice.
MC:
And it’s really that scalability. The algorithm came first, then came the nuts and bolts to sit there and say, “How do we make it scalable?” And then came the flexibility of what else can you do with the audio? Because I want to do more than loudness.
Cindy:
Got it, got it. We are going to go to questions in a minute because we do have a couple of questions, but before we do that, I just wanted to recap. We looked at loudness and the history of loudness, and then really what’s changed and how we’ve moved into those parameters that you talked about. And then talked about delivery challenges and online challenges, which you just hit on. And then of course, as you’re saying, the solution and the automation around it, really no matter what size your facility is. So yeah, that’s where we’re at. So we’d like to take your questions. What questions do you guys have? And the first one, oh yeah, so I’ve had a couple of people ask about examples. And so MC, if you could give some examples, that would be great. What customers are doing this now?
MC:
So I already mentioned a little bit about that little audio house that was going from commercials mixing to film mixing. A lot of post companies do it when they deliver commercials. So we have Smoke and Mirrors in the UK, and The Mill. Smoke and Mirrors have offices all over the world, as do The Mill. So they use them. We have a little post house in the US, Leo Ticheli in Midwest… I’ve forgotten now.
Cindy:
I think they’re in the Midwest, I’m just trying to remember where Leo Ticheli is as well. Yeah.
MC:
Yeah. Yeah. And they have a couple of edit suites that they use it for. And then at the other end of the scale, we have Viacom in the UK and in the US with very large installations, because all their content needs to be loudness processed. And in the case of Viacom, they are archiving all their content in 24p, but some of their localizations are done at 25. So they’re bouncing between 24 and 25 multiple times, and still doing that.
MC:
We have a number of playout centres in Australia. There’s a very large one doing the Channel Nine and Channel Seven playout operations. They have two engines and they’re doing all their loudness with us, but they’re also doing all their other audio processing: Dolby encoding, upmixing, downmixing, track mapping. As I said, once we get the audio onto the video file, there’s many things we can do with it. And more and more people are coming to us. Comedy Central actually were the ones who came to us with the online; they said they wanted it, primary playout was Alexa for them.
Cindy:
Yeah. I love Comedy Channel. I’m all in.
MC:
Yeah. So that was an example of interactive stuff, because the guideline wasn’t clear from the platform, they came to us and said, “Hey, what would work?” And we interacted with them. And the way the interaction typically works is we have a chat about it, we make some suggestions, they try it out on their own content. If there are issues with it, then they come back to us and they may send us the file over, we’ll do some analysis, we’ll readjust the settings and give them back.
MC:
Now, the reason for this is, it is actually a very exact science, but as with all audio, it’s extremely subjective. So people have ideas about how they want to present their audio and what’s important and what isn’t important. We also have ZTV in India, and we worked together for a couple of years for Z to be able to deliver globally to the world. And we had some very interesting interactions on how we wanted to do this. So, yeah, it really is something that we find pretty much every broadcaster has a need and use for. And more so with the present lockdown and things, there’s a huge demand for file-based content because live content is nonexistent or very low in volume. So people are digging out archives for presentation. So it’s a big, big requirement.
Cindy:
That’s true. You were telling me the other day about the new resurgence around archives. And I liked some of those examples you gave from Comedy Central to Leo Ticheli to Z. You’ve got a nice range, which ties into the next question, which somebody posted and you kind of touched on this. The examples you gave seem like big companies, but I have a small facility. How can this help me in a small facility?
MC:
So yeah, as a small facility, as I said, Leo Ticheli do commercials, so they don’t do many, and they got a rejection from one of the people that they were delivering to. The rejection was by way of a report from us, because those guys had our measurement tool. So they talked to us and they bought a couple of desktop products called Eff. Now we have customers who say, “Well, all I want to do is make sure I comply and not worry about whether I’m going to make adjustments or not.” And so we have a product that starts at $1,000. And it does the measurement and gives you a report telling you what’s wrong, or not wrong if it’s good. And you can start at that point. So we have a solution really for the very small customers as well.
Cindy:
Nice. And whoever you are, whatever size facility you have, we have a trial for you that you can download and try yourself and see what it does for you. So, there’s the link in-
MC:
Yes. If you go on our website and click the try button, which is on every page of our website really, it will take you to a place where you’ll give us your details and we’ll give you a 10 day evaluation license. And you will get some helpful emails about loudness. And at that point, this is our standard way of working, call us, email us with questions. We are happy to help people get going with this.
Cindy:
Nice. Well MC, it’s been a wonderful discussion on this topic and I think we are wrapping up here. Any closing words?
MC:
Well, no. All I can say is, thank you to everyone who attended. We will do some more of these. And actually, if you want to do a download, if you want the fact book, please visit our website. If you have specific questions that you want us to answer, you can email me, , or support and you’ll get our experts helping you with questions on loudness.
Cindy:
Perfect. All right. Well, thank you everybody and have a beautiful day. Thank you MC. See you later.
Webinar Replay – Audio Processing for A Better Home Movie Experience
Wednesday, 08 July 2020
Cindy Zuelsdorf:
Let’s get started. Welcome, everybody. I’m so glad you’re here today. And hey, M.C. How’s your day going?
M.C:
My day is good, busy and busy, but no, very good.
Cindy Zuelsdorf:
Nice. I’m Cindy Zuelsdorf here with M.C and thank you all for being here for Audio Processing for a Better Home Movie Experience. M.C and I were chatting about this a few weeks back and he had some great things to talk about. And so we’re going to get into why the theatrical mix isn’t suitable for a home viewing environment, how the cinema and home mixes differ from one another, what you can do to achieve an optimal mix without investing a great deal of time and money, and how other people are doing it right now, out in the field. So over to you, M.C.
M.C:
Okay. Thank you, Cindy. So what we’re going to do today is first of all, just a little bit of background. Because cinemas aren’t open, it is holding up the timely release of movies. I’ve been dying to see the Bond movie, and it’s not going to be released; it got delayed in the cinemas because of COVID, and so on. And theatrical release is a massive source of revenue for the studios. And so there is some talk about saying, we need to release this onto online platforms.
M.C:
We have been doing some work because there are people who are delivering content as broadcasters and as online platforms, who have theatrical mixes to contend with. And so I thought it’d be a good idea to have a chat about it and have a webinar explaining the issues and so on. I’m going to keep this largely non-technical. For once I’m going to use a short PowerPoint, but I will pare it down to two or three minutes, and I will try and show a clip of the processing before and after.
M.C:
So what I’m going to do is I’m going to start by talking about audio mixing for the cinema, and why is it different, or what are the constraints, etc. So you’ve got to remember, a cinema is a calibrated environment. If you want to present content in a cinema, cinemas sometimes get certified by Dolby or by DTS. And what they do is they look at the acoustics of the cinema (you don’t want echoes inside), the speaker layout, the quality of the speakers and so on. So the point is that you create a cinema that is suitable for theatrical mix delivery, and then the person who’s mixing the audio knows that it’s going to be exhibited or presented in a controlled environment. So he has a lot of freedom when it comes to doing the audio mixing, to create a good experience for you. And obviously, if you’re watching action movies, the good experience is guns and planes and noises. And also, if you’re watching a period drama, then there may be moods and music and dialogue and so on.
M.C:
And the cinema, as I said, acoustically, it’s really well-prepared for this sort of delivery. The second thing is the attention span. We go there for a 90 minute experience or a two hour experience, or if it’s Bollywood, a three hour experience, but with a three hour experience they give you a 10 minute interval. So what it means is your eyes and ears and everything can adjust and can deal with it. Now, what this also means is that because the creative audio guys and the directors want to create a certain impact, there are some standards for delivery, but by and large, it’s a bit of a free for all as to how you do it.
M.C:
My favorite story is Christopher Nolan, who, for one of his movies, they put up a billboard outside the theater which says, you may not hear the dialogue and that’s intentional, which wouldn’t be so good if you’re doing Hamlet or something. But anyway, that’s a creative license that exists. These guys are billion-dollar box office movie makers, so they get what they want, generally.
M.C:
Now, when you come into a broadcast environment, first of all, if you look at your living room, it is probably the epitome of a non-acoustic environment. The telly’s shoved in a corner somewhere. The speakers are… In the old days, we used to have a nice three inch speaker, but now with the sets getting thinner, the speaker quality is variable, and there is a lot of noise, and we watch TV for a long time. And I’m probably speaking as an old guy who watches TV, as opposed to the kids who watch it on the laptop. So the audio challenge is different. And what the broadcasters have had for a long time is some standards that say, “When we deliver audio, we want to have a certain audio level or approach that is suitable for that environment.” And we call it program loudness.
M.C:
In program loudness, what you’re really saying is that the average level of audio has to meet a certain amount. So you can have loud bits, you can have quiet bits, but in a program, the average must always be consistent. Again, we won’t go into too much detail. I’m talking about these concepts because later on, I’m going to talk about what the challenge is as you go from one environment to the other. The other key thing with program loudness is there are, broadly speaking, two standards: the EBU for the Europeans, and the ATSC A/85 for the Americans.
M.C:
And then there are variations of that adopted by all other countries, but there are two numbers. The average for the EBU is minus 23; the average for the US is minus 24. Then there are true peak levels that really are designed so that you don’t drive your peaks into distortion, and they could be minus one, minus two, minus three. The good news or the bad news is, even with these small numbers, every individual broadcaster has their own flavor of this. So when you’re delivering content, it’s important to meet these standards and requirements. And what this means is you’re going to change whatever mix has been provided, whether it’s a theatrical mix or an episodic mix, and change it to meet a broadcast spec. Now, it’s a technical requirement, not a creative requirement. However, you need to make sure that when you try and meet the technical environment, you keep the creative intent as true as possible.
M.C:
So that’s your broadcast environment. If we now talk about the online environment, the online environment has fewer rules, fewer standards, and they’re kind of ad hoc. Netflix, if you can call them an online platform, do have a very specific requirement. They went away and thought about it and they came up with a requirement which isn’t the broadcast requirement. It’s a standard of their own. Again, I won’t go into details other than say it’s different, and why that matters will follow when we start processing. So Apple have a guideline, Amazon have a guideline. Everyone has a slightly different guideline, but the general rule of thumb is that the environment in which you’re watching some of those things is very different to the living room. You most likely are listening to it on your headphones. Very likely, if you’re outside, you’re also in a noisy environment, on a train, on a plane, so there’s consumption there.
M.C:
Now, obviously there’s also consumption in the house. So there is… I don’t watch serious content on a… Well, I do watch it on a plane, but then it’s the mix that they’ve done on the plane itself. But I wouldn’t think about watching it on a train or anything, but a lot of people do. I watch a lot of that content in the house, on my TV. So it’s important that the audio mix for that is of high quality so that it’s as good an experience as a satellite or terrestrial broadcaster, if not better. So here we are, we’ve got these three scenarios, the cinema, the broadcast and the online delivery. So I spoke about program loudness as the average.
M.C:
There is one other parameter for this particular discussion that’s really important. And that’s the thing we call loudness range. Now, loudness range is a way to describe the dynamic content, how dynamic the content in a movie or in a program is. And it isn’t the loudest bit or the quietest bit, because if you did that, everything would have high dynamics. There’s always one peak you could find that’s very loud, and there’s always silence in programs. When you measure program loudness, you integrate 400 millisecond blocks of audio, and then you treat those integrated numbers in different ways to get different values. Again, I’m trying to keep this simple.
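The block-and-percentile idea behind loudness range can be sketched in a few lines. This is a simplification of the full EBU measure (Tech 3342), which works on 3-second short-term loudness windows and also gates out silence and very quiet material before trimming:

```python
def loudness_range(short_term_lufs):
    """Simplified LRA: sort the short-term loudness values, trim the quiet
    tail at the 10th percentile and the loud tail at the 95th, and take the
    difference of what remains. The real EBU measure gates first."""
    values = sorted(short_term_lufs)
    lo = values[int(0.10 * (len(values) - 1))]
    hi = values[int(0.95 * (len(values) - 1))]
    return hi - lo
```

Trimming the tails is what stops a single gunshot or a stretch of silence from making every program look highly dynamic.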
M.C:
If anybody’s interested in a very technical explanation of this, we will be inviting people who want this. We have a PowerPoint that we can send, which has got a lot of detail, which we’re happy to share with you. We also like to work with our customers by interacting with them in terms of understanding their needs, and then offering some of our expertise and experiences. So if you do want to do that, we’d encourage it. We’ll also allow you to download our licenses and actually try it out on your content with your people, making the subjective quality assessment. Now, this is important because of the creative intent.
M.C:
Loudness range. So let me try and give a hand-waving explanation of why it’s important and what you have to do. So the first thing with loudness range, as I said, it’s an attempt to understand the dynamics. And the way we do that, and I will put up a slide a bit later and go into detail then, but the way we do that is, as I said, we have these integrated blocks. And as those blocks appear along the timeline, we note them, and then we do a statistical model of how often a particular level value comes across. And then we draw a histogram of it, and then we chop off a percentage of it on the quiet side and a percentage of it on the loud side. And the middle block is what we call loudness range.
M.C:
Now, this is a good measure of saying how dynamic the movie or the content is. Now, because we’re talking about movies and because I said there isn’t a regulation, broadcasters like to have a range. They like to say the loudness range shall not exceed 16, or it’s between 16 and 18, or something along those lines. The online platforms, again, don’t have a rule, but our experience from working with some of our customers has suggested a rule that works well for online delivery. And again, I will tell a couple of stories about how we work with customers to understand this process. Loudness range is really important for two reasons. The first one obviously, is if you have an extremely large loudness range, the living room experience becomes difficult.
M.C:
A couple of examples: I was watching The Accountant on Netflix a while back, and Netflix for some time had wanted to present the theatrical mix. And the house came down and started yelling at me, because when the guns started firing, it was extremely loud. And in that environment, if you then say, because it’s so loud, I’ll lower the volume, your dialogue disappears, because the dialogue is suppressed. So you need to keep that loudness range in such a manner, and process the audio in such a manner, that the dialogue is kept at an intelligible level. And again, I will mention a couple of strategies for how one goes about doing it.
M.C:
So we talked about program loudness and loudness range. Program loudness, because it’s just the average, you could do a very good correction of it by simply changing the global gain in your content. If your program loudness is minus 20 and you need to deliver to minus 23, apply three dB of attenuation and you meet the spec. So it’s very straightforward. Now, when you go through extremes, again, it gets challenging, but by and large, that’s the formula. For loudness range, you have some issues. And the issues are to do with the fact that if you have a dynamic range reduction strategy, a simple way to do that is to simply put it through a nonlinear transform. So what you say is, “I’m going to have an S curve on the thing. So I’m going to suppress the quiet bits, compress them. I’m going to compress the loud bits, and I’ll keep the middle bits linear. And that will be my compression.” Lots of compressors do this. It’s a simplistic way.
M.C:
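The global-gain arithmetic and the simplistic S-curve characteristic described above can be put in a few lines. The knee and ratio values are arbitrary examples; this is the naive approach being warned about, not the licensed processor:

```python
def correction_gain_db(measured_lufs, target_lufs):
    """Global gain that moves program loudness onto the target, e.g.
    measured -20 LUFS with a -23 LUFS target needs -3 dB."""
    return target_lufs - measured_lufs

def s_curve(level_db, low_knee=-50.0, high_knee=-20.0, ratio=2.0):
    """Toy static S-curve: linear between the two knees, slope reduced by
    `ratio` outside them, so quiet and loud extremes are both squeezed
    toward the middle. Knee/ratio values are illustrative assumptions."""
    if level_db > high_knee:
        return high_knee + (level_db - high_knee) / ratio
    if level_db < low_knee:
        return low_knee + (level_db - low_knee) / ratio
    return level_db
```

Applied statically like this, levels hovering around a knee flip between the linear and compressed regions, which is exactly the thresholding and pumping problem described next.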
The challenge you have is where you have a thresholding effect, where the noise is going in and out of a particular threshold. And so you get what they call sound pumping. It’s not very nice, and if you’re watching a nice movie that’s been well-mixed, it’s not a good experience. So you want to avoid those sorts of artifacts. There are various strategies for avoiding it, but when we started doing this, we started looking at a process. And before an automated process was available, what happened is every station used to have a sound mixer. So when the movies came in, they would go to the audio department, they would listen to the movie and they would remix it so that it gave you a nice loudness range and everything. Human beings are still the best operators to do this.
M.C:
However, in the modern day, if you didn’t have an in-house service, a lot of the stations started outsourcing this to specialists. So what would happen is a lot of our customers basically take the content from the studio and create different versions for distribution to different clients. Why? I just said it when I spoke about the various delivery requirements. So although we have minus 23 and minus 24, there are little variants around this. So every customer needs a slightly unique mix. It’s very, very small, but if you’re trying to do this manually, it would be very, very expensive. So program loudness, you may get away with. Loudness range needs a lot of attention. It needs attention to the whole movie, and so it becomes a very, very expensive process.
M.C:
So we got a call from a major Hollywood studio when we had just started the company, about seven, eight years ago, and they said, “Oh, you guys are doing this automated loudness. We want to talk to you about the theatrical experience and the problem we have.” And so we listened to them. They outsourced all this, so they weren’t even going to buy; they were simply advising us that this is a problem. So we listened to them. We analyzed the problems. We came up with a bunch of ideas, and we came up with a processor that solved the problem. And it was a reasonably good processor.
As time went on… Actually, over the last few years, audio’s become more important to broadcasters, in terms of them paying more attention to it. It isn’t good enough just to meet the spec, because people are saying the content has been modified creatively and it’s not a good experience. So we started talking to people, and Canal+ took an interest in us. They looked at our processor, and a few weeks later we got a phone call from them saying, “Hey, M.C, we want to show you something.” And they said, “This is your processor and here’s another processor.” And the other processor sounded really, really good. And we had a long chat and we ended up licensing the process. Two reasons for it: our business is in automating it and solving a range of audio problems. And if we can get the best solution for our customers, if we can guarantee that their ability to automate is 100% achievable, then it’s worth our while doing the licensing.
So we’ve licensed this process. It works really well. It was particularly nice that a discerning customer came to us and actually pointed out the issues. A couple of years later, we had a similar experience with ZTV in India, where they came to us with the same idea: “We are listening to this stuff. We want to really put your product through the wringer to make sure it’s suitable for our needs.” We had a bunch of iterative conversations, and now all their movies and all their content go through our processor. And the reason I mention this, really, is that this interaction between customers and us is very important. We encourage it. It’s how we get the best results. Every customer has a slightly different focus and a slightly different need. And the more discerning they are, the more complex their requirement is. But having that dialogue allows us to provide for their needs. And because this is really for the whole movie, we’re focusing on that, but there are other audio issues that we talk about. We do more than just loudness.
M.C:
So anyway, loudness range is this iterative process, and what Anton Hurtado, who did the algorithm development, did is analyze the audio and then come up with a set of metadata that allows us to create any loudness range we want, without some of these artifacts. He’s a big audio buff. He’s also very mathematically orientated, and he’s come up with this algorithm. So now, with that one analysis, we can do a broadcast delivery or an online delivery. It’s an iterative process, it’s a complex process, but the end result is that, surprisingly, the dynamics are preserved while the loudness range is reduced.
M.C:
If we add to this process Dolby’s dialogue intelligence, where we measure the dialogue as we do this and incorporate that in the process as well, we get the best of all worlds. So we get a mix that complies with a standard. We get a mix where the dialogue is intelligible, and we get good dynamics. So really, that’s the secret sauce that we have for broadcasters, and then we just apply that same secret sauce with a different set of numbers to get you the online mix. So I’ve been rambling on for quite some time. So I’m going to very quickly just do the PowerPoint bit and play the clip, and then I’m going to come back and say how hard or how easy it is to do this.
M.C:
So I thought I’d start with this little slide. So these are the integrated measurements for The Matrix, the movie. This is a slide I got from the EBU Tech paper. And these are the two thresholds for the loudest bit and the quietest bit. And you can see that The Matrix has a loudness range of 25. Now, the best way to show all this is to walk you through some of the other slides. So here’s the loudness range. At the bottom there’s a list of movies. So Hamlet is at this end. So you can see that the loudness range for broadcast is this one and the loudness range for online delivery is this one, and you can see the divergence. There are movies that go all the way to 25 and there are movies that go all the way down to five.
M.C:
And if you’re looking for a loudness range of 15 or 16 and it’s actually six, there’s no need to take it the other way. It needs to be less than the target, rather than exactly within that range. But it gives you an idea. If we now look at the program loudness, you can see that the program loudness also varies. So here are some of the specials like GoldenEye, Zulu and Man of Steel, with a program loudness of minus 12 or 13. And at the other end, you’ve got Hamlet and Sabrina. So this is a bunch of my DVDs that we did some analysis on. And what we’re trying to do is get that target to minus 23 here, and for online presentation, minus 16 or 18.
M.C:
So you’re trying to meet both these challenges. And if we put them side by side, you can see that there’s a diverse range of processing that needs to be done. This is a short clip; watch out towards the end where you’ve got the dialogue.
Speaker 3:
Are you hurt?
Speaker 5:
What?
Speaker 3:
Are you hurt? Are you bleeding?
Speaker 5:
I don’t think so. Are they following us?
Speaker 3:
No. Just calm down. We’re going to be fine.
Speaker 5:
I’m not going to be fine. They just tried to kill-
M.C:
Okay. If you were listening carefully, when we switched between the original and the processed version in the dialogue, the dialogue was lifted. That was an automated process that did that. It’s a short clip, but obviously we do it to the whole movie. Now, clearly this is an attempt at a demonstration. What we normally do when people are interested in this is say: download the engine, try it out, work with us. If you find a problem, generally people are very happy to FTP the file to us. We’ll analyze it and we’ll either say, “It’s a problem with a setting. Here, try this,” or we will make a change if required.
M.C:
We’ve done this through several iterations, and an awful lot of what I call the golden ears have listened to this with us. And so we think we have a good process now, but there’s always room for improvement, and we encourage people to download it. So I said the next thing I was going to show is, if we were to do this, how would you set up the engine? So this is the analysis screen. We normally analyze the file and then we process it. So we analyze it. In this case, this is set up for the US operation, ATSC A/85. We’re going to analyze for program loudness, a loudness range of 15 and a true peak of minus two. So what will happen is Engine will analyze the file, measure it, and then it will come up with the criteria to correct to.
M.C:
So if we now look at the next slide, here we have: what are we going to correct? This is a correction profile. So we have a couple of little tricks. If you get tone in your file, which for God knows what reason we still deliver files with, we will detect it and ignore it. But here it says: correct for program loudness, correct for true peak, correct for loudness range. If the program loudness doesn’t meet minus 23, or minus 24 as we had it, it will correct it to that. It will correct the true peak so that it doesn’t exceed minus two. And it will make sure that the loudness range doesn’t go above the specified threshold of 15.
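The correction profile on that screen amounts to three checks. The sketch below uses my own field names, not the product’s, and shows the decision being made, including the point that loudness range is an upper bound rather than an exact target.

```python
from dataclasses import dataclass

@dataclass
class CorrectionProfile:
    # Illustrative targets matching the on-screen example
    program_lufs: float = -23.0   # integrated loudness target
    true_peak_dbtp: float = -2.0  # maximum true peak
    max_lra: float = 15.0         # loudness range upper bound

def needs_correction(measured_lufs, measured_tp, measured_lra,
                     p: CorrectionProfile, tolerance=0.5):
    """Return which measurements fall outside the profile.
    LRA is a ceiling: content narrower than max_lra passes as-is."""
    issues = []
    if abs(measured_lufs - p.program_lufs) > tolerance:
        issues.append("program_loudness")
    if measured_tp > p.true_peak_dbtp:
        issues.append("true_peak")
    if measured_lra > p.max_lra:
        issues.append("loudness_range")
    return issues
```

A file measuring minus 20 LUFS with a true peak of minus 1 and an LRA of 25 fails all three checks; one at minus 23 LUFS, true peak minus 3 and LRA 6 passes untouched.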
M.C:
Now, there is one thing that I forgot to talk about, so I’m going to go back again, dialogue intelligence. So what it’s saying is if you click this box, when it analyzes the file it will look for dialogue, measure it, and then apply that additional correction in the process. If you set this up, you could create a watch folder for this, and you can put 100 movies into it and Engine will quietly go in and correct them. So you’d set one of these up for a broadcast delivery. You’d set another one for an online delivery, and it will go in and crank out the results. I can talk about this forever, but this might be a good time to say, do we have any questions?
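A watch folder of the kind described here reduces to a polling loop. This is a minimal sketch, not the product’s implementation; a real watch folder would also wait for each file to stop growing before handing it to the processor.

```python
import os

def scan_watch_folder(folder: str, seen: set) -> list:
    """Return files that have appeared since the last poll.
    'seen' carries state between scans."""
    current = set(os.listdir(folder))
    new_files = sorted(current - seen)
    seen |= current
    return new_files
```

You would run one of these per delivery profile, so dropping 100 movies into the broadcast folder and the online folder quietly produces both sets of corrected outputs.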
Cindy Zuelsdorf:
We do have a couple of questions that came in. If you have questions, go ahead and put them in the chat. To start with, somebody asked, can you talk more about the difference between program loudness and loudness range?
M.C:
Oh, okay. So I will talk about three things. Traditionally, what we used to do is we used to measure the peak level of the audio. And that was an old measurement, PPMs we called them. And they worked really well, they were designed to solve a technical problem, and that was that if you had too much audio, it interfered with the color information when we did the transmission. So you needed to limit the amount of audio. And then in the eighties, the commercials people discovered that although you’re not allowed to exceed it, there’s nothing to stop us from staying very close to it. And so they did, and that’s how we got very loud commercials.
M.C:
And as a result, one of the things that you have when you stay close to it is you can’t preserve the dynamics of the audio, or you don’t, you lose them because you just created what they call a sausage, everything is constant level. And actually, if you go back to some of the older ads in the seventies and things, they used to have a lot of classical music in them, and they had dynamics, but the impact, buy my car tomorrow at 10,99 or whatever, it dominated, and it was extremely irritating.
M.C:
The reason I’m harping on about this is that a few years ago, the broadcast bodies got together and said, “We need to find a way to change this. What we’re really trying to do is have that experience where we encourage people to find the dynamics again.” And so we said, “How do we do that? How do we allow people to have dynamics and yet maintain a standard?” So they came up with this idea of the average loudness, or average audio level, in a program. It’s a little bit cleverer and more complicated than just an average, because they have a thing called gating. Our PowerPoint, which we’ll send to you, explains this in detail, but that’s the average level. So think of a commercial: it could have no dynamics, but it would still have an average level. But if you have dynamics, you can have an average and you can have dynamics. So loudness range is really, in some ways you could say, a measure of how much the program content deviates from the average, whereas program loudness is the average.
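The gating alluded to here works roughly like this simplified sketch of the BS.1770-style measurement, operating on per-block loudness values: an absolute gate throws away near-silence, then a relative gate throws away anything more than a set margin below the provisional average, so pauses and quiet passages don’t drag the number down.

```python
import math

def gated_loudness(block_lufs, abs_gate=-70.0, rel_gate_lu=10.0):
    """Simplified BS.1770-style gated average over per-block loudness
    values (LUFS). Averaging happens in the energy domain, not in dB."""
    def energy_mean(blocks):
        return 10 * math.log10(sum(10 ** (b / 10) for b in blocks) / len(blocks))
    # Absolute gate: drop blocks below -70 LUFS (effectively silence).
    stage1 = [b for b in block_lufs if b > abs_gate]
    provisional = energy_mean(stage1)
    # Relative gate: drop blocks more than 10 LU below the provisional mean.
    stage2 = [b for b in stage1 if b > provisional - rel_gate_lu] or stage1
    return energy_mean(stage2)
```

For a programme of uniform minus 23 LUFS blocks the result is minus 23, and adding a stretch of minus 80 LUFS silence leaves it unchanged, which is the whole point of the gate.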
Cindy Zuelsdorf:
Another question to follow on that is, is the emotion dialogue intelligence similar to Dolby’s?
M.C:
It is. It is exactly Dolby’s. We have licensed that technology from Dolby. And so yes, it is Dolby’s dialogue intelligence. Now, there are a couple of reasons why we did it. One is it does help with the dialogue. The other one is a famous broadcaster in LA said to me, “Don’t knock on my door until you have dialogue intelligence in your product.” And we always listen to our customers. It is a choice that you have. In Europe, people did not like the notion of dialogue intelligence, which is why it’s not included in any of the broadcast specs, but we give you the option to use it. So as a product maker, although we follow the specs and so on, we also follow the flavors that our customers like.
Cindy Zuelsdorf:
We do have a question here asking why is the audio online different to audio for broadcast? And could you explain a little bit more about those differences?
M.C:
There’s a group in Europe called PLOUD that discusses this at great length, but the simple explanation is that the broadcast environment, from a listening point of view, is a little bit more controlled. As I said, it’s your living room; there is ambient noise around it, but it’s not always persistent, and it’s not very high. So you’re not competing to hear the audio content. For the online requirement, as I said, if you’re in a noisy environment like trains and planes, then the audio level needs to be higher. Now, there are people who argue that it shouldn’t be the case. You should be able to get away with the broadcast one. And if you can’t hear it, just crank up the volume on your phone or your iPad or whatever device you’re listening to.
M.C:
However, the platforms have argued that they want a higher level of audio. So again, there is no reason why the broadcast mix wouldn’t work, other than that on your phones and on your iPads, I think there are limiters on how loud they’ll let you go, because you’ve got headphones on, so you don’t go deaf or damage yourself. So there are some limiters there. But a little experiment, or something I observed really, is that I was switching between my regular television and Amazon Prime on my TV, and Amazon Prime is louder. And that’s how it’s set up. Now, again, in your home environment, if you get it louder, you’ll have to turn it down, but you can still maintain quite a lot of the dynamics, even though you’ve lifted that level. Obviously your peak-to-average level will be reduced, but I think that you will get a good experience.
Cindy Zuelsdorf:
That’s great. Now I’ve got two questions that in my mind go together, so I’m going to throw them in there together. And one of them is, how do I find out how good the processing really is? And somebody else asked, can you tell me about the pricing of the software?
M.C:
Okay. So the only way you find out how good it is, is to try it. It is a very subjective thing. Now, we like to think that although it’s subjective, the processing has been carefully tuned, but we just sit there and say, “Don’t take my word for it. Try it.” Now, to give you a couple of examples of this, we have lots of them. We love people trying things out because it’s our way of saying we have nothing to hide. Please listen to it with your material, in your environment, with your ears, because if we can win that, then we have a sale.
M.C:
So the important thing is that test. We have a number of customers who are constantly doing these tests. Our job then is to help and assist to make sure that the test goes well. So we go online, we will do TeamViewer sessions. If something goes wrong, we’ll encourage them to actually send us the file for analysis. I have been sent some amazing content that way. We will sign NDAs. We promise to delete all the content as we get it, et cetera. We respect the copyrights and the nature of it, but it’s the way we work together. So I think the answer is: try it, listen to it. If you don’t find it meets your requirements, talk to us about it. And maybe there are some limits that we can help with. Nine times out of 10, we help with some adjustments.
M.C:
And to give you a measure of our perseverance, and this is no complaint against my customer, ZTV took a good 18 months of evaluation. Now, that wasn’t because they were just taking 18 months; that was because they needed to have the right people at the right time. And their interest grew progressively in what we did, until such a time as the purchase requirement came about, but we were happy to do it. We’re very glad we did it because we got a great result and a very happy customer.
M.C:
Now, in terms of price, price is a… We have a very modular system. Very often people don’t buy just the loudness and the loudness range processor, they buy a bigger system. But it will start at around $10,000 and work upwards. Now, we have customers who are processing 10,000 hours of content a month, and their system runs in five figures and upwards, and we have customers who are sitting there saying, “We just want to do a few files at a time.” So the system scales, is really what I’m saying.
M.C:
And the one thing I didn’t mention is, I spoke about automation. We have an API for the product, so you can integrate it into your own control system. We have some MAM integrations. We integrate with Telestream Vantage, [Espera 00:36:35], Orchestrator, but we also have watch folders and clients that allow you to automate the process without hooking up to a mainstream MAM.
Cindy Zuelsdorf:
So we have a question. Do you have a lot of presets for reference inside of EBU, et cetera?
M.C:
Oh yeah. So what we do there is the settings that I showed you, you can set them up and give them a name. So, for example, although, as I said, the standards are slightly different, our first customers were actually big post houses in the UK that were delivering commercials all over the world. And so every time they had to deliver to a broadcaster in Brazil or Spain or Portugal or India or wherever, there would be a broadcast spec that came along. And what we did is we created a system in the product, which is just called a profile, and you give the profile a name. So this one’s for Brazil, this one’s for Fred, this one’s for Bloggs. And then the operator simply says, “We’re delivering to Fred, so we’ll use the Fred profile.” And that will store the measurement criteria, the correction criteria and also the channel layout, if you require it.
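In code terms a profile is just a named bundle of delivery targets, looked up by the name the operator picks. The structure below is my own illustration, not the product’s format; R 128 and A/85 do set the loudness and true-peak targets shown, but the loudness range limits here are example values only.

```python
# Hypothetical named-profile lookup, mirroring the "Fred profile" idea.
# max_lra values are illustrative, not taken from the standards.
PROFILES = {
    "EBU R128":  {"program_lufs": -23.0, "true_peak_dbtp": -1.0, "max_lra": 15.0},
    "ATSC A/85": {"program_lufs": -24.0, "true_peak_dbtp": -2.0, "max_lra": 15.0},
    "Online":    {"program_lufs": -16.0, "true_peak_dbtp": -1.0, "max_lra": 8.0},
}

def profile_for(delivery_name: str) -> dict:
    """Operator picks a delivery name; the stored targets come back."""
    return PROFILES[delivery_name]
```

Adding a new broadcaster’s spec is then just a new entry in the table, which is how post houses end up with 100-plus profiles.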
M.C:
The system I showed was from our desktop product, but we have a more capable product, Engine, which allows you to set up a workflow and set up and store individual profiles. You can have as many of them as you like. We have customers with upwards of 100 profiles.
Cindy Zuelsdorf:
One more question. I have an existing media management system. How can your system be integrated with it?
M.C:
Okay. So, as I said, we have a REST API. Now, when you have an existing MAM system, it needs cooperation from both sides. Sometimes we’ve already done it, but if we haven’t, we would share the API with the MAM provider. We would loan them an engine so that they could do the integration work, and then we’d work with the client. Client pressure helps in these instances, because people won’t do this speculatively, but we have managed to do it. And it really is two or three days of work. It’s not very difficult.
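A MAM-side integration against a REST API of this kind might look like the sketch below. The host, endpoint path and JSON field names are all hypothetical, not Emotion Systems’ actual interface; the point is only that submitting a file for processing reduces to one small JSON POST.

```python
import json
import urllib.request

BASE_URL = "http://engine.example.local:8080"  # hypothetical host

def build_job(input_path: str, profile_name: str) -> str:
    """Build the JSON body a MAM might POST to submit a file.
    Field names are hypothetical."""
    return json.dumps({"input": input_path, "profile": profile_name})

def submit_job(input_path: str, profile_name: str):
    """POST the job to a hypothetical /jobs endpoint."""
    req = urllib.request.Request(
        BASE_URL + "/jobs",
        data=build_job(input_path, profile_name).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    return urllib.request.urlopen(req)  # returns the HTTP response
```

The MAM would then poll a job-status endpoint or subscribe to a callback; either way, the two- or three-day integration effort is mostly mapping the MAM’s own job model onto calls like these.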
Cindy Zuelsdorf:
Thank you everybody for being here. I’m so glad you joined us for Audio Processing for a Better Home Movie Experience. And it was super helpful, M.C, to hear about loudness range and program loudness and all about the cinema at home and what it means for playout and broadcast and post. All of you watching the replay and all of you here live, again, let us know if you want to get the technical PDF with all of the details, and by all means, sign up for a trial. Thank you, M.C.
M.C:
Thank you very much for giving us your time. And as Cindy says, please feel free to contact us. We have the PowerPoint, we can let you have the download and we’re happy to work with you.