Optimizing the loudness of audio content

November 22, 2024
7 minutes
Video Engineering

Audio normalization is a key process in audio engineering that systematically adjusts the amplitude of an audio file to achieve a consistent target level. Unlike manual volume adjustments, normalization employs algorithmic analysis to identify the loudest peak in the waveform. Once identified, the process scales the entire audio signal proportionally so that the peak aligns with the desired target level. This approach preserves the dynamic range of the audio while keeping the loudest point below the maximum threshold, avoiding digital clipping and distortion.

Audio wave representation before and after audio normalization.

In simple terms, normalization brings quiet recordings up and overly loud ones down to a common level, all while maintaining the integrity of the original sound. The key difference from manual volume adjustments is that normalization analyzes the entire audio track to hit a desired loudness target without causing distortion or clipping.

Imagine you’re watching a playlist of videos, and each time a new video starts, the volume shifts unexpectedly. With normalization, each track plays at a similar loudness, keeping the listener's experience smooth and consistent.

How does audio normalization work?

Audio normalization is a process that analyzes an audio file's amplitude to adjust its loudness to a predefined target level. The two primary techniques are peak normalization and loudness normalization, each serving distinct purposes:

  • Peak normalization: Identifies the maximum amplitude (peak) within the audio and scales the signal proportionally to ensure the peak reaches the target level without exceeding the digital limit.
  • Loudness normalization: Utilizes perceptual algorithms, such as ITU-R BS.1770, to calculate the average loudness (measured in LUFS) and adjusts the track to align with a target loudness value for a more consistent listening experience.

Understanding decibels (dB)

Decibels (dB) are the standard unit for measuring sound intensity in digital audio. A level of 0 dBFS (decibels full scale) represents the maximum amplitude permissible in digital systems. Any value exceeding this threshold results in clipping, producing harsh distortions. Normalization processes typically constrain levels to stay below 0 dBFS, ensuring optimal audio quality without introducing artifacts.
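As a quick illustration, here is a minimal Python sketch (NumPy only) that reports the peak level of a float signal in dBFS, assuming samples in the conventional -1.0 to 1.0 range:

import numpy as np

def peak_dbfs(samples):
    # Full scale (1.0) maps to 0 dBFS; everything below it is negative
    peak = np.max(np.abs(samples))
    return 20 * np.log10(peak)

# A half-scale peak sits roughly 6 dB below full scale
print(peak_dbfs(np.array([0.5, -0.25, 0.1])))  # about -6.02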

Peak levels vs. average loudness

  • Peak levels: Represent the highest instantaneous amplitude in the audio waveform. Peak normalization adjusts based on this value, ensuring no single point exceeds the defined threshold.
  • Average loudness (integrated loudness): Reflects the perceived loudness of the entire track, factoring in how human ears interpret sound over time. For instance, a recording with quiet passages and intermittent loud peaks may seem quieter overall despite its high peak levels.

By focusing on average loudness, loudness normalization achieves a perceptually consistent output across tracks or media, which is critical for scenarios like streaming platforms where user experience depends on uniform playback levels. Standards such as EBU R128, which measure loudness in LUFS, are typically employed for these applications.
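To see the difference in practice, the sketch below measures both the peak level and the integrated loudness of one file. It assumes the third-party soundfile and pyloudnorm packages, the latter being an open-source implementation of the ITU-R BS.1770 metering used by EBU R128:

import numpy as np
import soundfile as sf
import pyloudnorm as pyln

# Load float samples in the -1.0..1.0 range
data, rate = sf.read("track.wav")

# Peak level: the single loudest sample, in dBFS
peak = 20 * np.log10(np.max(np.abs(data)))

# Integrated loudness: gated average over the whole track, in LUFS
meter = pyln.Meter(rate)
lufs = meter.integrated_loudness(data)

print(f"Peak: {peak:.1f} dBFS, integrated loudness: {lufs:.1f} LUFS")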

Types of audio normalization

  1. Peak normalization

    Peak normalization adjusts an audio track's maximum amplitude (peak) to a predefined target level. For instance, if the loudest point in an audio file measures -3 dBFS and the target level is 0 dBFS, the entire track is amplified by 3 dB to meet the target. While this method ensures no distortion or clipping, it doesn’t account for the human perception of loudness, potentially leading to inconsistencies in perceived volume across tracks with differing dynamic ranges.

    Benefits and limitations:
  • Benefits:
    1. Prevents clipping by keeping peak levels below the 0 dBFS threshold.
    2. Increases overall amplitude without introducing distortion.
    3. Simple and computationally efficient to implement.
  • Limitations:
    1. Ignores perceived loudness, resulting in potential volume inconsistencies.
    2. Doesn’t adjust for dynamic range, which can affect user experience.
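Putting the arithmetic above into code, a minimal NumPy sketch of peak normalization could look like this (float samples in -1.0..1.0 assumed):

import numpy as np

def peak_normalize(samples, target_dbfs=-1.0):
    # Find the loudest instantaneous sample
    peak = np.max(np.abs(samples))
    if peak == 0:
        return samples  # silent input: nothing to scale
    # Convert the dBFS target to a linear amplitude (e.g. -1 dBFS is about 0.891)
    target_linear = 10 ** (target_dbfs / 20)
    # A single proportional gain preserves the dynamic range
    return samples * (target_linear / peak)

Because every sample is multiplied by the same factor, the dynamic range is untouched; only the absolute level changes.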
  2. Loudness normalization

    Loudness normalization adjusts the overall volume of an audio file to match a target average loudness, typically measured in LUFS (loudness units full scale). This method is designed to align with how humans perceive sound, making it highly suitable for applications like streaming platforms, where consistent loudness across content is critical.

Key differences from peak normalization:

  • Peak normalization: Focuses exclusively on the highest amplitude in the track, ensuring peaks don’t exceed a set threshold but ignoring how loud the audio "feels."
  • Loudness normalization: Analyzes the average loudness over time, ensuring consistent playback volume even when tracks have varying peak levels and dynamic ranges.

For example, a loudness-normalized track will sound balanced alongside others, even if it has softer sections or occasional high peaks, as the adjustments consider human hearing sensitivities and loudness perception.
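For a hands-on version, the sketch below measures and then re-targets a file using the third-party pyloudnorm package; the -14 LUFS target is only an example and should match your destination platform:

import soundfile as sf
import pyloudnorm as pyln

data, rate = sf.read("input.wav")

# Measure integrated loudness with a BS.1770 meter
meter = pyln.Meter(rate)
loudness = meter.integrated_loudness(data)

# Apply the single gain that moves the track to the target loudness
normalized = pyln.normalize.loudness(data, loudness, -14.0)
sf.write("output_loudnorm.wav", normalized, rate)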

Top audio normalization tools and software for professionals

  1. Audacity

    Audacity is a free, open-source tool with user-friendly audio normalization capabilities, making it an excellent choice for beginners. Its intuitive interface allows users to set target volumes, normalize peaks, and even reduce background noise efficiently. While it’s perfect for basic tasks, it may lack the advanced features required for professional-grade audio production.
  2. Adobe Audition
    Adobe Audition is a robust, professional-grade audio editing suite that supports both peak and loudness normalization. With advanced tools and customizable settings, it enables precise control over audio levels, making it ideal for professionals in music production, podcasting, and broadcasting. Its seamless integration with other Adobe Creative Cloud applications further enhances its utility for complex workflows.
  3. Pro Tools
    Pro Tools is widely regarded as the gold standard in audio production, particularly in music, film, and TV industries. It offers highly sophisticated normalization options alongside powerful compression and mastering features. However, its steep learning curve and premium pricing make it better suited for experienced audio engineers and professionals handling large-scale projects.

Comparison of audio tools

  • Audacity: Best for simple projects, casual users, and those looking for a cost-free solution. Offers straightforward normalization features with limited advanced capabilities.
  • Adobe Audition: Balances user-friendliness and advanced control, making it suitable for intermediate to advanced users who need precise audio adjustments.
  • Pro Tools: The go-to tool for high-end audio production, offering unmatched customization and feature depth but requiring a significant learning investment.

For quick edits or budget-friendly solutions, Audacity is the top choice. However, for professional-grade audio workflows requiring precision and scalability, Adobe Audition or Pro Tools are more suitable.

Using Python and JavaScript for audio normalization

Python: Using Librosa and Pydub

Python has powerful audio libraries like Librosa and Pydub, which make normalization straightforward.

Using Librosa

import librosa
import numpy as np
import soundfile as sf

def normalize_audio_librosa(input_file, output_file, target_db=-20.0):
    # Load the audio file at its native sample rate
    audio, sample_rate = librosa.load(input_file, sr=None)

    # Measure the current RMS level in dBFS (guard against log(0) on silence)
    rms = np.sqrt(np.mean(audio**2))
    current_db = 20 * np.log10(max(rms, 1e-9))

    # Apply the gain needed to reach the target level
    gain = target_db - current_db
    audio_normalized = audio * (10 ** (gain / 20))

    # Clamp to the valid float range so boosted peaks cannot clip on export
    audio_normalized = np.clip(audio_normalized, -1.0, 1.0)

    # Save the normalized audio
    sf.write(output_file, audio_normalized, sample_rate)

# Example usage
normalize_audio_librosa('input.wav', 'output_normalized.wav', target_db=-20.0)

Using Pydub

from pydub import AudioSegment

def normalize_audio_pydub(input_file, output_file, target_db=-20.0):
    # Load audio file
    audio = AudioSegment.from_file(input_file)

    # Calculate the current loudness and adjust
    change_in_dBFS = target_db - audio.dBFS
    normalized_audio = audio.apply_gain(change_in_dBFS)

    # Export normalized audio
    normalized_audio.export(output_file, format="wav")

# Example usage
normalize_audio_pydub('input.wav', 'output_normalized.wav', target_db=-20.0)

JavaScript: Using the Web Audio API

function normalizeAudio(inputAudioContext, audioBuffer, targetGain = 0.9) {
    // Create a new audio buffer to hold the normalized data
    const normalizedBuffer = inputAudioContext.createBuffer(
        audioBuffer.numberOfChannels,
        audioBuffer.length,
        audioBuffer.sampleRate
    );

    for (let channel = 0; channel < audioBuffer.numberOfChannels; channel++) {
        const inputData = audioBuffer.getChannelData(channel);
        const outputData = normalizedBuffer.getChannelData(channel);

        // Find the peak with a plain loop: spreading a long Float32Array
        // into Math.max can overflow the call stack
        let max = 0;
        for (let i = 0; i < inputData.length; i++) {
            const abs = Math.abs(inputData[i]);
            if (abs > max) max = abs;
        }

        // Scale every sample so the peak lands at targetGain; note that
        // normalizing channels independently can shift the stereo balance
        const gain = max > 0 ? targetGain / max : 1;
        for (let i = 0; i < inputData.length; i++) {
            outputData[i] = inputData[i] * gain;
        }
    }
    return normalizedBuffer;
}

// Example usage (assuming you have an AudioContext and a decodable audio file)
const audioContext = new (window.AudioContext || window.webkitAudioContext)();
fetch('audio.mp3')
    .then(response => response.arrayBuffer())
    .then(data => audioContext.decodeAudioData(data))
    .then(buffer => normalizeAudio(audioContext, buffer, 0.9))
    .then(normalizedBuffer => {
        const source = audioContext.createBufferSource();
        source.buffer = normalizedBuffer;
        source.connect(audioContext.destination);
        source.start();
    });

Optimizing loudness with FastPix API: Easy integration steps

Enabling audio optimization for on-demand content

Audio optimization can only be applied during the media creation process for on-demand content and is not available for live streams. Follow these steps to activate audio optimization:

Step-by-step guide

  1. Access the API endpoint:

Use the create asset API endpoint to initiate the creation of a video asset.

  2. Set the optimization key:

In your API request payload, include the optimizeAudio parameter and set its value to true to enable audio optimization.

{
  "corsOrigin": "*",
  "pushMediaSettings": {
    "metadata": {
      "key1": "value1"
    },
    "accessPolicy": "public",
    "maxResolution": "1080p",
    "optimizeAudio": true
  }
}

For more details, please refer to our guide on optimizing the loudness of audio.
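For illustration only, a request to enable this from Python might look like the sketch below. The endpoint URL and credentials are hypothetical placeholders, not the documented FastPix values, so consult the API reference before using them:

import requests

payload = {
    "corsOrigin": "*",
    "pushMediaSettings": {
        "metadata": {"key1": "value1"},
        "accessPolicy": "public",
        "maxResolution": "1080p",
        "optimizeAudio": True,  # enables audio optimization for this asset
    },
}

response = requests.post(
    "https://api.fastpix.io/v1/on-demand",   # hypothetical endpoint, see docs
    json=payload,
    auth=("ACCESS_TOKEN_ID", "SECRET_KEY"),  # hypothetical credentials
)
response.raise_for_status()
print(response.json())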

The role of AI and machine learning in audio normalization

  • AI-based normalization techniques: Machine learning models are now trained to recognize patterns in loudness preferences across different types of content. For example, Netflix has developed custom algorithms that “learn” viewer loudness preferences and apply subtle normalization based on these insights.
  • Noise suppression and contextual loudness adjustment: AI-based normalization can detect background noise and adjust the audio signal accordingly, which is especially useful in environments with unpredictable audio elements, like live news or sporting events.

Applications of audio normalization across industries

Music production and mastering

In music production, normalization ensures that each track in an album or playlist plays back at a consistent volume. It prevents quieter songs from getting drowned out and louder songs from being overwhelming. Streaming platforms, like Spotify and Apple Music, use loudness normalization to make transitions between tracks smooth for listeners, ensuring a uniform playback experience.

Podcasting and voice recordings

For podcasts, normalization balances the volume between different speakers and segments, creating an even, comfortable listening experience. It helps to avoid volume discrepancies between hosts and guests and minimizes the need for listeners to adjust volume levels frequently.

Broadcasting and video production

In live broadcasts and video production, normalization ensures audio consistency across scenes, preventing jarring audio shifts that disrupt viewer engagement. It’s also vital in post-production to make sure voiceovers, sound effects, and background music are balanced correctly.

Streaming platforms and media services

Platforms like YouTube, Spotify, and Netflix apply normalization standards to maintain volume consistency across videos, songs, and episodes. This enhances user experience, as listeners don’t need to manually adjust their volume between tracks. Streaming services use industry-specific loudness levels, like -14 LUFS for Spotify and -23 LUFS for broadcast, to match their platform’s playback environment.

Platform-specific standards and “hidden” normalization on streaming services

  • Normalization standards on different platforms: Streaming platforms like Spotify, YouTube, and Apple Music each have their own normalization standards. Spotify, for example, targets around -14 LUFS (Loudness Units relative to Full Scale) and will automatically adjust tracks to this level. However, these levels vary across platforms, meaning a track optimized for Spotify might sound different on YouTube.
  • Automatic loudness adjustment: Many platforms apply their own normalization regardless of the uploaded file's volume. Understanding these “hidden” adjustments helps audio engineers optimize their work, since a track that sounds right on Spotify may translate differently on a platform with a different target loudness, such as Apple Music; the sketch below shows the gain arithmetic involved.
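Since retargeting between platforms is plain LUFS arithmetic, a small sketch makes the offsets concrete; the target values below are commonly cited figures and may change over time:

# Commonly cited loudness targets; platforms revise these over time
PLATFORM_TARGETS_LUFS = {
    "spotify": -14.0,
    "youtube": -14.0,
    "broadcast_ebu_r128": -23.0,
}

def retarget_gain_db(measured_lufs, platform):
    # Gain (dB) that moves a track from its measured loudness to the target
    return PLATFORM_TARGETS_LUFS[platform] - measured_lufs

# A mix measured at -16 LUFS needs +2 dB for Spotify, -7 dB for EBU broadcast
print(retarget_gain_db(-16.0, "spotify"))             # 2.0
print(retarget_gain_db(-16.0, "broadcast_ebu_r128"))  # -7.0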

Conclusion

Audio normalization is key to providing clear and balanced sound for your content. Modern tools and technologies, like AI and machine learning, help make the process smarter and more efficient.

For developers, using APIs like FastPix or coding solutions can make it easy to normalize audio levels across your content. As audio standards continue to evolve, embracing these tools will ensure your content sounds great on all platforms, offering a better experience for viewers.

Sign up for free today!

FAQs

What is the role of loudness units (LU) in audio optimization?


Loudness Units (LU) are used to measure perceived loudness in audio content. Unlike simple peak levels, LU accounts for the human ear's sensitivity to different frequencies, offering a more accurate representation of how loud a track feels. When optimizing loudness, maintaining consistent LU levels helps avoid abrupt volume changes during playback.

How can I ensure consistent loudness across different devices and environments?


To ensure consistent loudness, it's essential to use loudness normalization standards like LUFS (Loudness Units Full Scale) during production. Additionally, testing on various playback devices (headphones, speakers, mobile phones, etc.) and adjusting to match the ideal loudness range (e.g., -23 LUFS for broadcast) can improve the listener’s experience across all environments.

What is the target loudness for podcast audio?


Podcasts typically aim for an integrated loudness of around -16 LUFS for stereo audio, as this is considered optimal for clarity and comfort. However, it may vary slightly depending on your target platform. Ensuring this loudness range helps to maintain consistent volume levels without distortion or excessive compression.

How does loudness normalization affect dynamic range?


Loudness normalization adjusts the overall loudness of an audio track while maintaining the dynamic range—the difference between the quietest and loudest parts. Because it applies a single gain adjustment rather than compressing the signal, quieter moments remain distinguishable while the overall volume stays consistent from track to track.
