Before releasing a video, its audio track needs to be mastered. We have already written about audio level normalization. Today we focus on the most common cases of applying an equalizer to audio.
The adeclip filter removes some artefacts created by clipping and prints clipping statistics. Always use it. It will not lower the volume below 0.0 dBFS; in fact, true peaks will be slightly raised.
If the volume is too hot, use the volume filter to lower true peaks below 0 dBFS. If you are not sure how much negative gain is needed, go for -6 dB. It can also be used to tune final loudness, combined with an ebur128 measurement, if you do not want to use the loudnorm filter. Sometimes loudnorm reduces dynamic range too much.
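When the true peak has already been measured, the required gain is simply the target minus the measured value. A minimal sketch of that arithmetic; the helper name `gain_for_peak`, the example reading of +1.3 dBTP and the file names are assumptions for illustration, not values from a real file:

```shell
# gain_for_peak MEASURED_DBTP TARGET_DBTP
# Prints the gain in dB that moves the measured true peak to the target.
gain_for_peak() {
  awk -v m="$1" -v t="$2" 'BEGIN { printf "%+.1f", t - m }'
}

# Hypothetical measurement: true peak at +1.3 dBTP, target -1.0 dBTP.
gain=$(gain_for_peak 1.3 -1.0)
af="adeclip,volume=${gain}dB"

# Print the command for review instead of running it.
echo "ffmpeg -hide_banner -i input.mp4 -af $af -vcodec copy output.mp4"
```

If no measurement is available, the -6 dB fallback from the text replaces the computed value.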
The asubcut filter cuts low frequencies. It is needed in most cases. There are several good cut points: one is about 26 Hz, the next is about 35 Hz, but sometimes you need to go much higher, to about 45 Hz. You need good speakers to hear the difference.
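One way to compare the cut points is to render one audio-only file per cutoff and listen to them side by side. The sketch below only prints the commands; the helper name `subcut_cmd` and the file names are hypothetical, and it assumes the low cut is done with the asubcut filter and its `cutoff` option:

```shell
# Prints an ffmpeg command that renders an audio-only file
# with the given sub-bass cutoff in Hz.
subcut_cmd() {
  printf 'ffmpeg -hide_banner -i input.mp4 -vn -af asubcut=cutoff=%s subcut-%s.wav\n' "$1" "$1"
}

# The three cut points mentioned above.
for hz in 26 35 45; do
  subcut_cmd "$hz"    # pipe the output to sh to actually run the commands
done
```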
Removes noise and acts as a slight de-esser, similar to what tube preamps do. It is not very aggressive, and the default values work best. If you are shooting outdoor audio, use it. It removes only slight high-pitched noise and cleans up the audio overall.
The alimiter filter limits peaks. It is not a hard limiter; peaks can and will go slightly over the specified level. The default timing is 5 ms attack and 50 ms release. I recommend using a 100 ms release time, as Spotify does.
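A short sketch of a limiter stage with the longer release, written with named options so that it is unambiguous which value is the ceiling and which are timings. The -1 dB ceiling and the file names are assumptions for illustration:

```shell
limit="-1dB"     # ceiling the limiter aims for (illustrative value)
attack=5         # ms, the default mentioned above
release=100      # ms, instead of the 50 ms default

af="alimiter=limit=${limit}:attack=${attack}:release=${release}"

# Print the command for review instead of running it.
echo "ffmpeg -hide_banner -i input.mp4 -af $af -vcodec copy output.mp4"
```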
I found a 6 dB/octave (p=1) lowpass filter to be optimal for taming excessive highs without removing all of them. To leave some highs, limit the filter bandwidth to 8-12 kHz (t=h:w=12k). I use a cutoff frequency above 4.1 kHz.
A tilt filter is useful for rebalancing audio while keeping the overall tonal balance. A negative slope cuts highs, a positive slope cuts lows. My favourite middle frequency is about 2200 Hz, with a bandwidth of 3000 to 4000 Hz. The order changes the steepness of the curve.
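The text does not name the filter, so the following is only a sketch under the assumption that it is ffmpeg's atilt filter, whose `freq`, `slope`, `width` and `order` options match the parameters described. The slope of -0.2 and the order of 5 are illustrative values, not recommendations from the text:

```shell
freq=2200      # middle frequency in Hz
slope=-0.2     # negative cuts highs (assumed example value)
width=3500     # bandwidth, inside the 3000 to 4000 range above
order=5        # higher order means a steeper curve (assumed example value)

af="atilt=freq=${freq}:slope=${slope}:width=${width}:order=${order}"

# Print the command for review instead of running it.
echo "ffmpeg -hide_banner -i input.mp4 -af $af -vcodec copy output.mp4"
```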
The loudnorm filter performs long-term volume normalization: it shifts volume levels according to the delivery platform's specification. I and TP are the targets for integrated loudness and true peak, respectively. How to find the correction offset is described here.
The aresample filter changes the sample rate of the audio and can use different dithering methods. It is needed after loudnorm because loudnorm does 4x oversampling to get accurate true peak detection. For video applications, the standard sample rate is 48 kHz.
The ebur128 filter prints integrated loudness, LRA and minimum/maximum short-term loudness.
The astats filter prints several audio-related statistics.
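Both measurement filters can be run without writing an output file by sending the result to the null muxer. A minimal sketch; the helper name `measure_cmd` and the input file name are hypothetical, and `peak=true` is added on the assumption that a true peak reading is wanted as well:

```shell
# Prints an ffmpeg command that only measures the audio of the given file.
# The file name must not contain spaces in this simple sketch.
measure_cmd() {
  printf 'ffmpeg -hide_banner -nostats -i %s -vn -af ebur128=peak=true,astats -f null -\n' "$1"
}

measure_cmd input.mp4    # pipe the output to sh to actually run it
```

The loudness summary and the statistics appear in ffmpeg's log output at the end of the run.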
The aformat filter changes the audio sample format. It is good to use a floating-point format during mastering because it is more accurate. Two floating-point formats are available: flt and fltp.
ffmpeg -hide_banner -i <input.mp4> -af adeclip,asubcut=35,lowpass=4.4k:t=h:w=12k:p=1,loudnorm=I=-14:TP=-1:print_format=summary:offset=-1.4,aresample=48k:dither_method=triangular_hp -acodec libfdk_aac -vcodec copy <output.mp4> -y
Using loudnorm inevitably lowers LRA. This is very often desired because it makes the sound louder. In cases where you want to mostly keep the original dynamics, use two limiters.
ffmpeg -i <zma.mp4> -af aformat=flt,volume=+12.99dB,alimiter=-4.5dB:attack=17:release=100,alimiter=-1.0dB:attack=5,ebur128,astats -vcodec copy -acodec libfdk_aac <zma-normalized.mp4> -y
Club-like compressed sound
Caused by the s (compression) parameter. Lower values compress more. Values that are too low, such as 8, add bass, which we cut to get a radio-friendly sound. A value of s=10 doesn’t add too much sub-bass, so there is no need to remove it. Highs always need to be cut, because this compressor creates too many artefacts. Running it at 96 kHz will make a cleaner, more “pro” sound.
with less compression: