Audio Levels for YouTube - Everyone is Doing it Wrong (Well, Most of Us!)

Kevin Muldoon

I spoke about the subject of audio levels the other day in my thread "What Volume Level for Background Music?". This is an incredibly important subject, therefore I wanted to give it a thread of its own (not least because activity has died in the last thread).

I have spent the last day or so reading dozens of articles on the subject of audio levels for YouTube; both for vocals and for background music. It is something I want to get right so that I use the right settings for all of my videos moving forward.

Please bear in mind when reading this post that I am not an audio expert. I am simply a YouTuber who wants to get audio right and has spent a lot of time reading the advice of others to come up with my own conclusions on what settings I should use for my audio. By and large, this thread was created to allow me to share what I have learned and to speak with fellow YouTubers about this issue so that we can all get a better understanding of audio levels in our videos.

* Note: The forum software stops me from sharing any references to back up what I say in this article. If a member can add these references for me, it would be much appreciated :)

Audio Level of Background Music

With regard to background music, my thread the other day highlighted that everyone is handling it differently. For example, @Andrew Flint noted that at University he was advised to keep all sound peaks around 10 db lower than vocals, while @JTMulcahy has his background music set at -12 db. I myself tried different background levels from around -20 db to -35 db below my vocals.

I later found out that according to the World Wide Web Consortium (W3C), we should all ensure that "non-speech sounds are at least 20 decibels lower than the speech audio content".

The W3C note that:

The objective of this technique is to allow authors to include sound behind speech without making it too hard for people with hearing problems to understand the speech. Making sure that the foreground speech is 20 db louder than the background sound makes the speech 4 times louder than the background audio. For information on Decibels (dB), refer to About Decibels.

Looking back at my background music thread, @Andy J. Valer seems to have gotten closest to this by setting vocals at -9 db and background music at -28 db.

In my last video, I set vocals at -9 db and background music at -30 db. To me, it still doesn't sound right. But that is perhaps because I don't have a good ear for these things. Or because there are other factors to consider, such as the song I was using etc.

So moving forward, if you are adding background music to your videos remember that the W3C, who suggests standards for the internet, advises setting background music to at least 20 db lower than your vocals.
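The arithmetic behind the W3C quote can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the "twice as loud for every 10 db" figure is a common rule of thumb for perceived loudness, assumed here because it matches the "4 times louder" wording in the quote.

```python
def db_to_amplitude_ratio(db):
    """Signal amplitude (voltage) ratio for a level difference in db."""
    return 10 ** (db / 20)


def db_to_power_ratio(db):
    """Signal power ratio for a level difference in db."""
    return 10 ** (db / 10)


def approx_loudness_ratio(db):
    """Rough perceived loudness ratio.

    Assumption: every 10 db is heard as about twice as loud.
    This is a rule of thumb, not an exact law.
    """
    return 2 ** (db / 10)


gap = 20  # speech is 20 db above the background
print(db_to_amplitude_ratio(gap))  # 10.0
print(db_to_power_ratio(gap))      # 100.0
print(approx_loudness_ratio(gap))  # 4.0
```

So a 20 db gap is a factor of 10 in amplitude and 100 in power, but only about 4 times louder to the ear under that rule of thumb.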

Audio Level of Vocals

We also need to talk about vocals.

I have been normalising my audio to -9 db for all of my latest videos. The reason I did this is that TV and DVD rules around the world recommend audio levels for vocals in the range of -6 db to -12 db. It's different for each country, though I believe most countries are closer to the -9 db mark (in my other thread, @Andrew Flint noted that he was advised on his University course to keep vocals at this level too).

Setting a max peak of -9 db means that most of my vocals are around the -11 to -12db mark.
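For readers unsure what "normalising to a max peak" actually does, here is a minimal Python sketch. It assumes the audio is a list of float samples where 1.0 is full scale (0 db on an editor's peak meter) and that the db figures in this thread are peak levels relative to full scale. The function names are hypothetical, and this is not how any particular editor implements it.

```python
import math


def peak_db(samples):
    """Peak level in db relative to full scale (1.0 == 0 db)."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")  # pure silence has no meaningful level
    return 20 * math.log10(peak)


def normalise_peak(samples, target_db):
    """Scale every sample by one gain so the loudest peak hits target_db."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # nothing to scale
    gain = (10 ** (target_db / 20)) / peak
    return [s * gain for s in samples]


# Made-up samples purely for illustration
vocals = [0.10, -0.25, 0.20, -0.05]
louder = normalise_peak(vocals, -9.0)
print(round(peak_db(louder), 2))  # -9.0
```

Because a single gain is applied to the whole track, only the loudest peak lands on the target. Everything else sits below it, which is why most of the vocals end up a few db under the chosen peak.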

One of the problems of YouTube is that there is no agreed upon standard for video creators. In my other thread, @Andy J. Valer noted he uses -9 db for vocals and @Dino noted he uses between -15 db and -20db.

I have been reading dozens of articles and discussions on this issue. Many people advise 0 db and many advise -3 db. Others advise -6 db, -9 db, -12 db etc.

So who is right?

One of the best articles I read on the subject is entitled "Return of the Video Doctor: Simple Fixes for Online Video Errors".

The author, Jan Ozer, gives a great explanation about the situation YouTubers like ourselves face. At the end of the article he talks about the right target level for audio for YouTube.

Rather than paraphrase what he says, I have quoted it below:

"Finally, let me tackle the appropriate target decibel level for audio uploaded to YouTube or otherwise deployed on the web. I’ll start with a short story. I was consulting with a client in D.C. and the editor in charge of uploading video to the web related that they were having serious issues with audio volume on their web videos. He said that they sounded great in the studio, but remote viewers playing the videos over the Internet complained the audio was too low. He wondered if it was an audio compression issue.

I downloaded one of the compressed files, loaded it into my sound editor, and saw that volumes peaked at -12 dB. I said, “That’s the problem, the volume is too low.” He responded, “I worked in TV for years, and I’ve always set my peaks at -12 dB. It’s perfect and sounded great in the studio.” Interestingly, we were both right.

In the broadcast world, most channels recommend a max volume of -12 dB; everything you watch on the TV is set to these levels. For this reason, audio at -12 dB sounds normal. On the web, virtually all producers target 0 dB, and web viewers are used to this higher volume. My client’s videos, set to -12 dB, had much lower volume than the average video on the web; hence the complaints.

I always normalize my audio to 0 dB before uploading to YouTube or otherwise deploying. As you'll learn if you watch this video, normalization pushes the maximum peak in the audio file to 0 dB, so it never causes distortion. You can argue the technical merits of targeting -12 dB, but your volume will be lower than most other audio on the web, and they’ll suspect that you’re out of step, not the other way around."

Jan advises normalising peaks to 0 db. However, I read advice from many other audio engineers who recommend normalising audio to -1 db or -3 db. The reason is that when a video service such as YouTube encodes video, the audio track needs a bit of room to go up and down. If it cannot go up, clipping, distortion and other problems can occur during the encoding process (a more technical explanation of this issue can be found online if you search for it; this is the way I understood the problem).

In other words, if you normalise at 0, your audio quality may be much worse than normalising at -1 db or -3 db. The author above, Jan Ozer, does not appear to believe that normalising audio below zero is necessary, but so many others do that it makes me doubt which is correct.

The first question that came into my mind after reading Jan's article was: Is what he is saying correct?

Well, the author clearly knows much more about audio than me. And I also believe that everything he says sounds as if it is true i.e. if we all reduce our decibel levels for YouTube, but the majority do not, then we are the ones out of touch, not them.

My concern is that by normalising peaks to -9 db and having my vocals in the -12 db range, the audio in my videos is too low. That's not a major problem for people who have headphones or good speakers. But a high percentage of people who view videos on YouTube hear audio through poor laptop speakers or terrible mobile phone or tablet speakers.

At this time, I am leaning towards normalising audio in my future videos to -1 db and setting background music at around -25 db. However, again, I do not consider myself an expert on this. The main reason I have created this thread is to share what I have learned and hear the opinion of fellow YouTubers on YTTalk.

Hopefully, this will not be an issue in the future. There are reports that YouTube is beginning to normalise all music videos so that the volume does not keep going up and down between songs.

Unfortunately, I have not seen anything to suggest that is happening with regular videos. So until that happens, we will need to pay close attention to the decibel level of audio in our videos.

I would love to hear the thoughts of fellow YTTalk members on this issue. This is something that is very important to me moving forward and I have no doubt it is important to all of you too.

Hopefully, we can get a good discussion going and all improve our knowledge on the subject.

:)

Kevin
 
I spoke about the subject of audio levels the other day in my thread "What Volume Level for Background Music?". This is an incredibly important subject, therefore I wanted to give it a thread of its own (not least because activity has died in the last thread).

[insert a metric f*** ton of solid, researched sound advice]

*slow clap* Absolutely excellent reply, man, thank you for the great detail and advice!

Edit: +1 to me for my inadvertent pun in that quote, lol
 
*slow clap* Absolutely excellent reply, man, thank you for the great detail and advice!

haha. You're welcome. :)

Only annoyance was that I could not share the references I referred to. I was going to upgrade in order to share the links, but there is no mention of upgrading allowing links to be shared.
 
I normalize to -1db and pull background audio in at about -15 to -20 depending on the source. This is just what I have found to sound the best. The reason I do -1db is in case I need any very very minor adjustment in Premiere afterwards. And the reason I waver between -15 and -20 for background audio is because video game audio is often very inconsistent, with quiet sections and loud battles. So in general, since I'm not a fan of doing too much editing over a 30 minute long video, I "ear-ball" it through the quietest and loudest to make sure it's ok.

The other thing that many people don't consider is to listen to their track with both speakers and headphones. Remember, your audience could be using either, and an audio mix is not necessarily going to sound the same on a set of speakers as it does with headphones.

I personally feel that background music at -25 is too low, but again you also need to consider the type of music. Heavy thrash metal at -25 will be very different from jazz at -25, not in volume but definitely in how obtrusive it is to the ear.
 
Thanks for responding Tarmack.

I am publishing a post on my blog in a day or two that goes into more detail on this issue. But long story short, I too will be normalising max peaks to -1 db (partly due to an article that recommended it and partly due to taking an average of other recommendations).

I will probably be setting background music to around -25 db as when I normalise audio to -1 db, my vocals are normally around -5 db. That leaves a 20 db gap, which adheres to the recommendation from W3C of keeping background audio at least 20 db lower than speech. However, I do agree that the type of song being used plays a big part in how it all fits together.
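A quick way to sanity-check a pair of settings against the W3C figure is to subtract one level from the other. The sketch below is a hypothetical helper (the names are made up), and it assumes the two numbers are comparable level readings taken from the same editor.

```python
W3C_MIN_GAP_DB = 20  # "at least 20 decibels lower than the speech audio content"


def gap_db(speech_db, background_db):
    """How many db the speech sits above the background."""
    return speech_db - background_db


def meets_w3c_gap(speech_db, background_db, min_gap=W3C_MIN_GAP_DB):
    """True if the background is at least min_gap db below the speech."""
    return gap_db(speech_db, background_db) >= min_gap


# Settings mentioned in this thread
print(meets_w3c_gap(-5, -25))  # True  (20 db gap)
print(meets_w3c_gap(-9, -28))  # False (19 db gap)
print(meets_w3c_gap(-9, -30))  # True  (21 db gap)
```

This only compares the numbers on the meters. As noted above, the type of music still changes how intrusive it sounds at the same level.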

You hit the nail on the head about listening to audio with headphones and speakers. The background music of my last video sounds a little too high at times with headphones on, and is barely audible with my laptop speakers. It's difficult to get the right balance.
 