How an Audio Engineer Sees Your Sound
An overview of the visual tools used by audio engineers today.
How does an audio engineer see sound? That's probably not a question that comes to mind very often, if ever. But it's a fascinating question nonetheless. Audio professionals rely heavily on their well-trained ears, of course, but we also have some incredible visual tools at our disposal that are truly captivating. The world of audio engineering that I immerse myself in is a mystery to most, which is why I'm excited to showcase some of the ways in which we see sound.
This article gives a basic overview of the visual tools I use while working on audio projects, in the order in which they typically appear. If you wish to learn more about a specific visual tool, see Resources and Additional Learning Materials at the end of this article or reach out to me directly via the comment section or my contact page.
WAVEFORM - VU METER - PEAK METER - RMS METER - PITCH MAPPING - SPECTRUM ANALYZER - FREQUENCY ANALYZER - SPECTROGRAM - 3D SPECTROGRAM - VECTORSCOPE - STEREO IMAGING - RESOURCES AND ADDITIONAL LEARNING MATERIALS
The first step when starting any project is to import the audio files into a digital audio workstation, or DAW. My DAW of choice for editing and mixing is Pro Tools. Once I import the files into Pro Tools, a waveform view of each file is created automatically. These waveforms are graphs that display amplitude/volume over time. The waveform view provides a great deal of information to a trained eye. The first thing I look for when examining waveforms is clipped audio, which is the issue I encounter most often that causes major problems.
The waveform image above the title of this section is ideally how each audio file should look. Without listening I can tell there's not a great deal of excess noise, pops, or clicks. Every transient, or audio peak, is fully present. If I open up a session and see that all the waveforms look similar to that above image, I know it's going to be a good day.
Now take a look at the clipped audio waveform image. It looks as though someone took a pair of scissors and cut off the top and bottom of the waveform. When the peaks of the waveform are clipped, it means that the mic/line input gain was set too high during recording. Overdriving an audio input in the digital realm causes digital distortion, which is extremely unpleasant to the ears and very difficult, often impossible, to remove.
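For readers who like to tinker, here is a minimal sketch of how clipping could be spotted in code rather than by eye. It assumes the audio has already been loaded as a list of floats normalized to the range -1.0 to 1.0, and it flags any stretch where several consecutive samples sit at the ceiling. The function name, the threshold values, and the test signals are my own illustrative choices, not how Pro Tools or any other DAW does it.

```python
import math

def find_clipped_runs(samples, ceiling=1.0, min_run=3, tolerance=1e-4):
    """Return (start_index, length) for each run of flat-topped samples.

    Assumes samples are floats normalized to -1.0..1.0. A run is a stretch
    of at least `min_run` consecutive samples at (or within `tolerance` of)
    the ceiling, which is what a "cut off with scissors" peak looks like.
    """
    runs = []
    start = None
    for i, s in enumerate(samples):
        at_ceiling = abs(s) >= ceiling - tolerance
        if at_ceiling and start is None:
            start = i                      # a flat top begins here
        elif not at_ceiling and start is not None:
            if i - start >= min_run:
                runs.append((start, i - start))
            start = None
    # Handle a run that continues to the very end of the file.
    if start is not None and len(samples) - start >= min_run:
        runs.append((start, len(samples) - start))
    return runs

def hard_clip(x, ceiling=1.0):
    """Simulate an overdriven input by chopping off anything past the ceiling."""
    return max(-ceiling, min(ceiling, x))

# Five cycles of a sine wave: one recorded at a healthy level, one too hot.
clean = [0.5 * math.sin(2 * math.pi * 5 * n / 1000) for n in range(1000)]
hot = [hard_clip(1.5 * math.sin(2 * math.pi * 5 * n / 1000)) for n in range(1000)]

print("clean runs:", len(find_clipped_runs(clean)))
print("hot runs:", len(find_clipped_runs(hot)))
```

The clean signal produces no runs, while the overdriven one produces a run on every positive and negative peak. A real detector would also need to catch clipping that happened below full scale, for example in an earlier gain stage.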
If you'd like to learn more about waveforms, SWPhonetics provides an in-depth explanation on their site. Click HERE to jump to their article.
(VU Meter - Peak Program Meter (PPM) - Peak Meter - RMS Meter - Loudness Meter)
Numerical meters are used throughout every stage of an audio project. They assist with setting proper levels during recording, balancing different instruments against each other during mixing, and checking the final audio levels during mastering. They are arguably the most important visual tools in both the analog and digital domains.
There are two main styles of metering: VU meters and Peak/RMS meters. VU meters, or volume unit meters (pictured below), are commonly found on analog audio equipment such as tape machines, large format recording consoles, and compressors. You're unlikely to see these meters outside of a professional recording studio, so we'll skip the details for now.
Peak and RMS meters are digital metering tools. RMS is an abbreviation for root mean square, which is a way of measuring the average level of an audio signal. A peak meter displays the loudest value of an audio signal. Peak and RMS meters are usually combined, as in the image below.
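To make the difference between peak and RMS concrete, here is a small sketch that computes both for a test tone. It assumes samples normalized to -1.0 to 1.0 and reports levels in dB relative to full scale (dBFS), where 0 is the loudest value the system can represent. Real meters work on short sliding windows and may apply their own calibration, so treat this as an illustration of the math only.

```python
import math

def peak_level(samples):
    """Largest absolute sample value: what a peak meter reacts to."""
    return max(abs(s) for s in samples)

def rms_level(samples):
    """Root mean square: square every sample, average, then take the root."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def to_dbfs(level, floor=-120.0):
    """Convert a linear level (1.0 = full scale) to decibels."""
    if level <= 0:
        return floor  # silence has no finite dB value, so report a floor
    return 20 * math.log10(level)

# Ten cycles of a sine wave at half of full scale.
tone = [0.5 * math.sin(2 * math.pi * 10 * n / 1000) for n in range(1000)]

print(f"peak: {to_dbfs(peak_level(tone)):.1f} dBFS")  # peak: -6.0 dBFS
print(f"rms:  {to_dbfs(rms_level(tone)):.1f} dBFS")   # rms:  -9.0 dBFS
```

For a pure sine wave the RMS reading always sits about 3 dB below the peak. Speech and music have a much larger gap between the two, which is why both numbers are worth watching.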
Quick Note About the Colors
Since we can comprehend colors much faster than numbers, digital meters display a different color based on the level of the audio. This makes a session of 20+ tracks much more manageable when trying to focus on the meters. Just to keep things simple: green is good, yellow is good, but red is bad. This is true in the digital domain.
Looking back to the image of the VU meters that I captured from my tape machine, you can see that up to 0 on the lower "normal" number display is black and 1 - 3 is red. In the analog domain, red is not necessarily bad. Now this may be confusing for those outside of the audio world. It basically comes down to digital vs analog distortion. Digital distortion, as discussed previously, is always terrible. Analog distortion is sometimes desirable. Additionally, the 0 on the VU meter is not the same level as the 0 on the peak/RMS meter. Further details are best left for another article.
When using a digital meter to set the gain level on your microphone, instrument, or any other audio source, keep the loudest peaks around -6 dB and your average level around -18 dB.
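As a quick worked example of that guideline, the sketch below takes a measured peak level and works out how much gain change would bring it to a target. The -6 dB target comes from the guideline above; the function names and the sample reading of -1.5 dB are hypothetical. On a real recording chain you would make this change at the preamp before recording, not by scaling samples afterward.

```python
def gain_to_target(measured_db, target_db):
    """dB of gain change needed to move a measured level to a target level."""
    return target_db - measured_db

def db_to_linear(db):
    """Convert a gain in dB to the factor each sample would be multiplied by."""
    return 10 ** (db / 20)

measured_peak_db = -1.5   # hypothetical meter reading: too hot
target_peak_db = -6.0     # the peak target suggested above

change = gain_to_target(measured_peak_db, target_peak_db)
print(f"turn the input down by {abs(change):.1f} dB")
print(f"equivalent linear factor: {db_to_linear(change):.3f}")
```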
After examining all the waveforms and organizing the session to my liking, I'll have a quick listen to each file to determine what editing, if any, needs to be performed. If the audio files were recorded in a home studio or a public area, there are likely to be background noises that will need to be reduced. There are limits to what can be removed. Sometimes removing a noise problem can sound worse than leaving it in. Sounds like a car honking or the hum of an air conditioner are common problems that can be dealt with without too much trouble.
In order to combat any noise problems, we need a detailed view of the audio file. That's where the spectrogram comes in. The layout of a spectrogram allows unwanted background noises to be visually spotted within the frequency spectrum of the audio file. Then, in most cases, the background sounds can be isolated and reduced. Incredible, right?
I use an audio repair program called iZotope RX. The spectrogram view in this program can be customized to show extreme detail. Take a look at the below image of a vocal phrase I imported into iZotope RX. Time is displayed on the x-axis, frequency is displayed on the y-axis, and amplitude/volume is displayed via the colors. The color scale is located at the far right of the below image for reference. Basically, the louder a frequency, the brighter it will be displayed.
I've highlighted a few areas of this vocal phrase to give you a better understanding of what it is we're looking at. The sound in box "1" is a breath taken before the phrase begins. Box "2" contains the word "life". The fuzzy/static area highlighted in circle "3" is the background noise captured in the recording.
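If you're curious what sits behind a spectrogram, here is a minimal sketch under some simplifying assumptions. It chops the audio into short overlapping frames and measures how much of each frequency is present in every frame, giving a grid with time along one axis and frequency along the other. I've used a slow, naive DFT so the example needs nothing beyond Python's standard library; real tools use a fast Fourier transform and far larger frames. The frame size, hop size, and test signal are all illustrative choices and have nothing to do with how iZotope RX is implemented.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitude of each frequency bin from 0 Hz up to Nyquist (naive DFT)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc) / n)
    return mags

def spectrogram(samples, frame_size=64, hop=32):
    """Return a list of rows: one row per time frame, one column per bin."""
    # A Hann window fades each frame in and out to reduce spectral smearing.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / frame_size)
              for t in range(frame_size)]
    rows = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + t] * window[t] for t in range(frame_size)]
        rows.append(dft_magnitudes(frame))
    return rows

def loudest_frequency(row, sample_rate, frame_size):
    """Frequency (Hz) of the brightest bin in one time frame."""
    k = max(range(len(row)), key=row.__getitem__)
    return k * sample_rate / frame_size

# Test signal: a 500 Hz tone followed by a 2000 Hz tone.
sample_rate = 8000
signal = [math.sin(2 * math.pi * (500 if n < 256 else 2000) * n / sample_rate)
          for n in range(512)]

rows = spectrogram(signal)
print("first frame:", loudest_frequency(rows[0], sample_rate, 64), "Hz")
print("last frame: ", loudest_frequency(rows[-1], sample_rate, 64), "Hz")
```

Each value in `rows` is what a spectrogram display would turn into a color: the larger the number, the brighter the pixel at that time and frequency.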
There's a great article on iZotope's website about spectrograms and the RX program. Click HERE to check it out.
Taking spectrograms a step further: instead of using only color to show intensity, the 3D spectrogram adds an additional axis to display the intensity of each frequency as little mountain peaks. This graph is also customizable, and we can zoom in on a particular frequency range to see it in great detail if needed.
Watch the below video to see what a drum kit looks like running through a 3D spectrogram. You'll be able to see the kick drum and snare creating large peaks in the lower end of the frequency spectrum.
Nearly all music projects require some form of tuning. Even projects that feature an incredible singer can benefit from a bit of tuning, whether it's a gentle nudge or that classic Auto-Tune sound. And it's not just vocals that can be tuned. Horns, bass, or a lead guitar are commonly tuned as well.
A popular program used for vocal and instrument tuning is Melodyne by Celemony. The below video shows a vocal line imported into Melodyne. The grey background is a grid showing pitch or music notes on the vertical axis and time on the horizontal axis. Every orange "blob", which is what they're actually called in the program, is either a breath or a word. The bigger the blob, the louder the word. You can see that each blob is positioned on the grid at its closest relative pitch. The line in the center of each blob shows the fine pitch detail of each word. Engineers can move these blobs and adjust the finer detail of each word so the singer or instrument is more in tune.
How it Works
Pitch mapping software such as Melodyne analyzes the audio to find the fundamental frequencies being played or sung and then converts them to musical notes. Every music note or pitch corresponds to a specific frequency. A well-known example of this concept is the standard concert tuning of A = 440 Hz. To give you a bit more perspective: the fundamental frequency of a low E string on a bass guitar is all the way down at 41 Hz. The full range of human hearing has been found to be limited to between 20 Hz and 20 kHz (20,000 Hz). So that low E on a bass is just about down to the limit of our hearing!
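The last step, turning a frequency into a note name, can be sketched in a few lines. This assumes equal temperament with A = 440 Hz and the common convention that the A above middle C is written A4. It also reports how many cents (hundredths of a semitone) the frequency sits from the nearest note, which is roughly what a tuning tool would correct. Finding the fundamental frequency in the first place is the hard part, and nothing here reflects how Melodyne actually does it.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def frequency_to_note(freq_hz, a4_hz=440.0):
    """Return (note_name, cents_off) for the nearest equal-tempered note.

    Uses the MIDI numbering convention where A4 is note 69 and each
    semitone is one step; a negative cents value means the pitch is flat.
    """
    midi = 69 + 12 * math.log2(freq_hz / a4_hz)
    nearest = round(midi)
    cents = (midi - nearest) * 100
    name = NOTE_NAMES[nearest % 12] + str(nearest // 12 - 1)
    return name, cents

for freq in (440.0, 41.0, 450.0):
    name, cents = frequency_to_note(freq)
    print(f"{freq:7.1f} Hz -> {name} ({cents:+.1f} cents)")
```

Running it shows 440 Hz landing exactly on A4 and 41 Hz landing on E1, the low E of a bass guitar, a few cents flat because the exact value is closer to 41.2 Hz.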
I found a chart on Sonicbids that goes through the whole music note and frequency spectrum. Click HERE to check it out.
(Spectrum Analyzer - Real-Time Analyzer (RTA))
Now that all editing is complete, we move on to the mixing stage. One of the best visual tools to use during this stage is a frequency analyzer. This visual tool graphs the intensity of each frequency of an audio file in real time. This tool becomes extremely beneficial when there's a problem with a resonant or ringing frequency. In a live sound environment, these resonant frequencies are what typically cause feedback.
Looking at the below images, you'll see that a 20 Hz - 20 kHz frequency range is spread across the x-axis. The level of each frequency is graphed on the y-axis. The graph can be customized to display precise details like the below image, or it can be set up to display broad intervals such as the 1/3 octave chart pictured below. I use the fine-detail setting while working on individual instruments and instrument groups during the mixing stage. I'll use a 1/3 octave graph in mastering to get a better sense of the overall EQ curve of the entire project.
Analyzer setup in 1/3 octave intervals
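To show where those 1/3 octave bars come from, here is a rough sketch. It assumes the common scheme where band centers are spaced a third of an octave apart around 1000 Hz, and each band spans a sixth of an octave on either side of its center. The exact centers computed here differ slightly from the rounded labels printed on analyzers (20, 25, 31.5, and so on). The function names and the tiny example spectrum are made up for illustration.

```python
def third_octave_centers(low_hz=20.0, high_hz=20000.0):
    """Exact center frequencies, 1000 * 2**(n/3), covering low_hz..high_hz."""
    centers = []
    for n in range(-30, 31):
        fc = 1000.0 * 2 ** (n / 3)
        # Keep a band if its center is within a sixth of an octave of the range.
        if low_hz * 2 ** (-1 / 6) <= fc <= high_hz * 2 ** (1 / 6):
            centers.append(fc)
    return centers

def band_powers(bin_freqs, bin_powers, centers):
    """Add up the power of every fine-detail bin that falls inside each band."""
    totals = [0.0] * len(centers)
    for f, p in zip(bin_freqs, bin_powers):
        for i, fc in enumerate(centers):
            if fc * 2 ** (-1 / 6) <= f < fc * 2 ** (1 / 6):
                totals[i] += p
                break
    return totals

centers = third_octave_centers()
print(len(centers), "bands from", round(centers[0], 1), "to", round(centers[-1]), "Hz")

# A made-up fine-detail spectrum: three bins near 1 kHz and one near 5 kHz.
freqs = [900.0, 1000.0, 1100.0, 5000.0]
powers = [1.0, 2.0, 3.0, 4.0]
totals = band_powers(freqs, powers, centers)
print("1 kHz band total:", totals[17])
```

The three bins near 1 kHz collapse into a single bar, which is why the 1/3 octave view is better for judging an overall EQ curve and worse for hunting down one ringing frequency.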
(Stereo Imaging - StereoScope - PhaseScope - Sound Field - Stereo Field - Surround Scope)
A stereo vectorscope is a great tool to use towards the end of a mix or mastering session to check how wide or narrow the project is. Most modern music is produced to have instruments or sounds spread out across the stereo field to create the feeling that you're being surrounded by the music. An audiobook or a podcast is typically produced in mono unless there are music and sound effects to fill out the stereo image. Check out the below examples of stereo imaging in action.
- Example 1 - Bass Guitar (Mono) -
- Example 2 - Drum Kit (Stereo) -
- Comparing a mono and stereo signal -
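The difference between those mono and stereo examples can also be expressed as numbers. The sketch below assumes the left and right channels are available as two equal-length lists of samples. It computes a correlation value, where +1 means the channels are identical (mono), 0 means they are unrelated (wide), and -1 means they are out of phase. It also splits the signal into "mid" (what the channels share) and "side" (how they differ), which is a common way to think about stereo width. This is a simplified illustration and not the algorithm of any particular vectorscope plugin.

```python
import math

def phase_correlation(left, right):
    """Correlation between channels: +1 mono, 0 unrelated, -1 out of phase."""
    num = sum(l * r for l, r in zip(left, right))
    den = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return num / den if den else 0.0

def mid_side(left, right):
    """Split stereo into mid (shared content) and side (the difference)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

# Mono source: the same signal in both channels, like the bass guitar example.
bass = [math.sin(2 * math.pi * 4 * n / 1000) for n in range(1000)]

# Wide source: the two channels are a quarter cycle apart.
wide_left = [math.sin(2 * math.pi * 4 * n / 1000) for n in range(1000)]
wide_right = [math.cos(2 * math.pi * 4 * n / 1000) for n in range(1000)]

print(f"mono correlation: {phase_correlation(bass, bass):+.2f}")
print(f"wide correlation: {phase_correlation(wide_left, wide_right):+.2f}")

_, mono_side = mid_side(bass, bass)
print("mono side signal is silent:", max(abs(s) for s in mono_side) == 0.0)
```

On a vectorscope, the mono signal shows up as a single vertical line because there is no side content at all, while the wide signal spreads out into a rounder shape.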
I hope you found these forms of sound visualization as fascinating as I do. It's truly incredible how advanced some of these tools have become. Learning about spectrograms was a huge game changer for me and is by far my favorite way to see sound.
If you learned something new or found an idea particularly helpful, please like and share this post. Let me know in the comments below what visual tool you found the most interesting.
Don't forget to subscribe to this blog to receive updates on future articles.
- Sterling Skye
Visual Tools I Use:
Insight - iZotope: https://www.izotope.com/en/products/insight
Melodyne - Celemony: https://www.celemony.com/en/melodyne/what-is-melodyne
Resources and Additional Learning Materials:
Andrews, Alex. “Use This Handy Chart of Note Frequencies and Instruments to Eliminate Background Noise From Your Mixes.” Sonicbids, 6 Oct. 2015, https://blog.sonicbids.com/eq-tips-for-eliminating-background-noise-from-your-mixes-a-table-of-note-frequencies.
Hoffman, Charles. “6 Mastering Meters You Need to Learn How to Use.” Black Ghost Audio, 27 Jan. 2019, https://www.blackghostaudio.com/blog/6-mastering-meters-you-need-to-learn-how-to-use.
“Understanding Spectrograms.” iZotope, 3 Apr. 2020, https://www.izotope.com/en/learn/understanding-spectrograms.html.
“Understanding Waveforms.” SWPhonetics, https://swphonetics.com/praat/tutorials/understanding-waveforms/.
“What Are Waveforms And How Do They Work?” SoundBridge, 17 Apr. 2019, https://soundbridge.io/what-are-waveforms-how-they-work/.