Mixing and mastering audio terms can be confusing and scary.
We’ve collated the most frequently misunderstood terminology to help ease your vocabulary fears.
Use Ctrl + F to look for a specific term.
Ambiance refers to the environmental sounds and characteristics of a particular space. This includes background and other atmospheric sounds that contribute to the environment.
An amplifier is a device that increases the strength of an audio signal. It’s used to drive a speaker or other audio device and is an essential component of any sound system.
Amplitude refers to the strength or level of a sound wave or audio signal. It is typically expressed in decibels (dB), and it largely determines how loud or soft a sound is perceived to be.
Automation refers to making precise, controlled adjustments to the parameters of a plugin unit or audio signal over time. The recorded changes are then applied to the affected parameters automatically during playback.
An auxiliary is a channel or track in a digital audio workstation (DAW) that’s used to route audio signals. Think of it as a kind of “side road” that audio can be sent down to be processed and then returned to the main mix.
Balance describes the relative levels of different sounds or elements in a mix. A mix engineer strives for a balanced mix wherever possible.
Bandwidth refers to the range of frequencies that an audio signal occupies. Think of it as the width of the audio signal on a graph of the frequency spectrum. A signal with a wide bandwidth occupies a large range of frequencies.
Bit depth is a measure of the resolution of a digital audio signal, or how accurately its level is represented. Imagine an audio signal as a waveform on a graph. The bit depth determines how many distinct steps are available on the vertical (level) axis for each point of the waveform. The more steps there are, the more accurately the waveform can be drawn and the lower the noise floor of the audio.
Boomy is a term used to describe a mix that has too much low-frequency content, resulting in a muddied, uncontrolled, or “boomy” sound. Imagine a drum kit with too much bass drum or a bass guitar with too much low end. The sound would be difficult to distinguish from the other instruments and would lack clarity.
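As a rough illustration of what bit depth means in numbers, the sketch below counts the level steps available at a given bit depth and estimates the theoretical dynamic range. The function names are made up for this example, and the dynamic range figure is the idealized value (about 6 dB per bit), not a measurement of any real converter.

```python
import math

def quantization_levels(bit_depth: int) -> int:
    """Number of distinct level steps a sample can take at this bit depth."""
    return 2 ** bit_depth

def theoretical_dynamic_range_db(bit_depth: int) -> float:
    """Idealized dynamic range: ratio of the largest to the smallest step, in dB."""
    return 20.0 * math.log10(2 ** bit_depth)

print(quantization_levels(16))                      # 65536 steps for 16-bit audio
print(round(theoretical_dynamic_range_db(16), 1))   # roughly 96.3 dB
print(round(theoretical_dynamic_range_db(24), 1))   # roughly 144.5 dB
```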
Bounce or “bouncing” refers to the process of exporting or rendering an audio mix or group of stems from a DAW.
Boxy is a term used to describe a mix that has too much energy in the mid-frequency range, resulting in a hollow or muffled sound. Imagine a drum kit with too much snare. The sound would be difficult to distinguish from the other instruments and might lack definition.
BPM stands for beats per minute and is a measure of the tempo of a song or piece of music. Imagine a metronome ticking away at a certain speed, with each tick representing a beat. The BPM of a song is the number of these beats that occur in a minute. For example, a song with a BPM of 120 would have 120 beats in a minute.
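The BPM arithmetic is handy when setting delay times or fades by ear. Here is a minimal sketch that turns a tempo into the length of one beat; the function name is just an illustration.

```python
def beat_duration_seconds(bpm: float) -> float:
    """Length of one beat in seconds at the given tempo."""
    return 60.0 / bpm

print(beat_duration_seconds(120))         # 0.5 seconds per beat
print(beat_duration_seconds(120) * 1000)  # 500.0 milliseconds per beat
```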
Buffer size refers to the amount of audio data that a DAW or other audio software is able to process at one time. Imagine a bucket that’s used to hold and transport water. The size of the bucket determines how much water it can hold at one time. In the same way, the buffer size of a DAW determines how much audio data it can handle at one time.
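To put a number on the bucket analogy, the sketch below estimates how long one buffer takes to fill at a given sample rate. It only covers the delay of a single buffer; real-world latency also includes driver and converter delays, which are not modelled here. The function name is hypothetical.

```python
def buffer_latency_ms(buffer_size_samples: int, sample_rate_hz: int) -> float:
    """Time it takes to fill one buffer, in milliseconds."""
    return buffer_size_samples / sample_rate_hz * 1000.0

print(round(buffer_latency_ms(256, 48000), 2))   # about 5.33 ms
print(round(buffer_latency_ms(1024, 48000), 2))  # about 21.33 ms
```

A smaller buffer gives a shorter delay but asks more of the computer, which is why many people record with a small buffer and mix with a larger one.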
A bus is a type of audio routing point that’s used to send many audio signals to a common destination, such as a group of tracks to a plugin unit or a group output aux. Imagine a bus as a kind of “highway” that audio can travel down, with different lanes for different chosen categories of audio signals.
A channel refers to a single audio path or track in a DAW or other audio recording and mixing hardware. Imagine a channel as a kind of pipe that audio can flow through. Each channel in a DAW represents a separate audio signal used to record, edit, and process a single audio source.
A chorus is a type of audio effect that’s used to add depth and richness to a sound. It does this by creating a sense of doubling the original sound, often with a slight pitch or time delay between the duplicated sounds. It’s similar to a choir of singers all singing the same part, but with each singer starting the part slightly later than the one before them. The effect of the chorus can create a fuller, more harmonically rich sound.
Clipping refers to the state of an audio signal being too high for a particular device to handle, resulting in a distorted or “clipped” sound.
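A simple way to picture clipping is to treat samples as numbers and flatten anything that goes past the maximum level a device can handle. The sketch below assumes that ceiling is 1.0, a common convention for floating-point audio; the function name is illustrative.

```python
def hard_clip(samples, ceiling=1.0):
    """Flatten any sample that exceeds the ceiling, as an overloaded device would."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]

print(hard_clip([0.5, 1.5, -2.0]))  # [0.5, 1.0, -1.0]
```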
Comping refers to the process of selecting and compiling the best takes of an audio recording into a single, cohesive performance. Musicians record many takes of the same song or part, and then go back and listen to find the best parts. A mix engineer or producer might then “comp” these sections together, like a puzzle, to create the final, polished recording.
Compression is a type of audio processing that’s used to reduce the dynamic range of an audio signal. Dynamic range refers to the difference between the loudest and quietest parts of the signal. Imagine a singer who sings really quietly and really loudly within the same song. A compressor turns down the loudest moments so that the quiet and loud passages sit closer together in level.
A crossfade is a type of transition in which one sound or audio clip fades out while another fades in, creating a seamless blend between the two. Crossfades are often used to join two audio clips or sounds smoothly.
Crunchy is a term used to describe an audio signal or mix that has a rough, distorted, or “crunchy” sound. Think of a guitar being played with a lot of gain or distortion, or a drum sound that is heavily compressed and saturated. The sound would be rough and edgy, with a lot of harmonics and overtones.
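The level maths behind a basic compressor can be sketched in a few lines. This example assumes a simple "hard knee" design with a threshold and a ratio, and works on levels in dB only; attack, release, and makeup gain are left out, and the default values are arbitrary.

```python
def compress_level_db(input_db: float, threshold_db: float = -18.0, ratio: float = 4.0) -> float:
    """Output level of a hard-knee compressor for a given input level (all in dB)."""
    if input_db <= threshold_db:
        return input_db  # below the threshold the signal is left alone
    # Above the threshold, every `ratio` dB of input yields only 1 dB of output.
    return threshold_db + (input_db - threshold_db) / ratio

print(compress_level_db(-30.0))  # -30.0, a quiet passage is untouched
print(compress_level_db(-6.0))   # -15.0, a loud passage is pulled down by 9 dB
```

In this sketch the gap between the quiet and loud passages shrinks from 24 dB to 15 dB, which is the reduction in dynamic range described above.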
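Here is a minimal sketch of a crossfade on two short lists of samples, assuming the simplest possible shape: a straight-line (linear) fade. DAWs usually offer other curves as well; the function name and the linear shape are choices made for this example.

```python
def linear_crossfade(outgoing, incoming):
    """Blend two equal-length clips: the first fades out while the second fades in."""
    n = min(len(outgoing), len(incoming))
    mixed = []
    for i in range(n):
        t = i / (n - 1) if n > 1 else 1.0  # 0.0 at the start, 1.0 at the end
        mixed.append((1.0 - t) * outgoing[i] + t * incoming[i])
    return mixed

print(linear_crossfade([1.0, 1.0, 1.0], [0.0, 0.0, 0.0]))  # [1.0, 0.5, 0.0]
```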
Decay refers to the amount of time it takes for a sound to fade away or decrease in level after it’s played or triggered. For example, the initial sound of a drum is the attack, and the amount of time it takes for the sound to fade away is the decay.
Decibel (dB) is a unit of measurement used to describe the level or intensity of a sound or audio signal. The scale is logarithmic: each decibel value represents a ratio between two levels, rather than an absolute value.
A de-esser is a type of audio processor that reduces sibilant sounds, such as “s,” “sh,” or “t,” usually found in vocals. Sibilant sounds can be harsh, unpleasant, and especially problematic in vocals because they tend to stick out. A de-esser works by detecting the frequency range of sibilant sounds and reducing its level whenever they occur.
Delay is an audio effect that repeats a sound a short time after the original signal.
Depth describes the sense of space in a mix. Sounds with a lot of depth may sound “full” or “rich,” while sounds with less depth may sound “flat” or “thin.” Depth is affected by the use of effects such as reverb or delay, panning in the stereo field, and the balance of the different elements in the mix.
Distortion is a type of audio effect that adds harmonics and overtones to an audio signal. It can be used to add grit to a sound, and is often associated with rock and heavy metal.
Dithering is a technique used to improve the quality of quantization, the process of reducing the resolution of an audio signal. Dithering adds a small amount of random noise to a signal, which helps mask the distortion that can occur during quantization.
Doubling refers to the technique of layering two or more similar or identical sounds on top of each other. It can be achieved by recording multiple takes of the same part, or by using plugin units.
Dry describes an audio signal or mix that has little or no reverb. A dry sound is one that has a “direct” or “up-front” quality, with no sense of space or dimension around it.
Dynamics describe the range or variation in the level or intensity of an audio signal. An audio signal with a large dynamic range has a large difference between the loudest and quietest parts.
EQ (equalization) is a type of audio processing that’s used to adjust the balance of the different frequencies within an audio signal. A mix engineer or music producer can boost or cut specific frequencies, shaping and sculpting the sound to achieve the desired result.
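Because decibels describe ratios, it can help to see the conversion written out. The sketch below uses the standard formula for amplitude (signal level) ratios, 20 times the base-10 logarithm; the function names are illustrative.

```python
import math

def ratio_to_db(amplitude_ratio: float) -> float:
    """Convert an amplitude ratio (for example 2.0 for 'twice the level') to dB."""
    return 20.0 * math.log10(amplitude_ratio)

def db_to_ratio(db: float) -> float:
    """Convert a dB value back to an amplitude ratio."""
    return 10.0 ** (db / 20.0)

print(round(ratio_to_db(2.0), 2))   # about 6.02: doubling the level adds roughly 6 dB
print(round(db_to_ratio(-6.0), 3))  # about 0.501: -6 dB is roughly half the level
```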
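A single-repeat delay is easy to sketch on a list of samples: each output sample is the original plus a quieter copy from a little earlier. The function name, the `mix` parameter, and the choice of one repeat with no feedback are assumptions made for this example.

```python
def apply_delay(samples, delay_samples, mix=0.5):
    """Add one delayed copy of the signal, scaled by `mix`, to the original."""
    out = []
    for i, dry in enumerate(samples):
        # Before the delay time has elapsed there is nothing to repeat yet.
        delayed = samples[i - delay_samples] if i >= delay_samples else 0.0
        out.append(dry + mix * delayed)
    return out

# A single click followed by silence: the echo appears two samples later.
print(apply_delay([1.0, 0.0, 0.0, 0.0], delay_samples=2))  # [1.0, 0.0, 0.5, 0.0]
```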
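The idea can be sketched as "add a tiny bit of noise, then round". This example assumes samples in the range -1.0 to 1.0 and uses triangular (TPDF) noise, one common choice of dither; the function names and the noise type are assumptions for illustration, not a description of any particular plugin.

```python
import random

def quantize(sample: float, bits: int) -> float:
    """Round a sample to the nearest step available at the given bit depth."""
    steps = 2 ** (bits - 1)
    return round(sample * steps) / steps

def dither_and_quantize(sample: float, bits: int, rng: random.Random) -> float:
    """Add triangular noise of up to one step in size, then quantize."""
    step = 1.0 / 2 ** (bits - 1)
    noise = (rng.random() - rng.random()) * step  # difference of two uniforms is triangular
    return quantize(sample + noise, bits)

rng = random.Random(0)  # seeded so the example is repeatable
print(quantize(0.3, 8))                  # always lands on the same step
print(dither_and_quantize(0.3, 8, rng))  # lands on a nearby step, varying from call to call
```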
Fade describes a gradual increase or decrease in the level of an audio signal over time. A fade-in is a gradual increase in level, while a fade-out is a gradual decrease in level.
Fader is an element found on a mixing console used to adjust the level of an audio signal. A fader is usually a sliding control that allows a user to adjust the level of the signal up or down.
Feedback is created when a sound or audio signal is amplified and re-introduced into the same system, resulting in a loop. It’s often heard in live music, where an amplified guitar signal is picked up by a microphone on the stage, creating an unpleasant, growing high-pitched sound.
A flanger is a type of audio effect that creates a sweeping sound by mixing audio with a slightly delayed copy of itself. The delay time of the copy is continuously modulated to create the sweeping effect.
Frequency refers to the number of cycles that a sound wave or an electrical signal completes in a second. It is typically measured in Hertz (Hz), and correlates to the pitch of a note or sound.
Fuzz is a type of audio effect that creates a rough or distorted sound by overloading the input of an amplifier. Fuzz effects are used to create an overdriven sound, colored with harmonics and overtones.
Gain refers to the amount of amplification applied to an audio signal. Gain is typically measured in decibels (dB), and is used to adjust the level of a sound. Gain can be applied using the trim or gain controls on a mixing console, the input stage of an amplifier, or a plug-in. In a DAW, gain is typically adjusted using virtual faders or other controls.
Gain staging refers to the process of optimizing gain levels throughout the audio chain to achieve the desired balance. It’s important to ensure a signal is at the optimal level at each stage, which helps to avoid problems such as clipping whilst mixing.
Harmonics are additional frequency components of a sound that sit above the fundamental (root) pitch, at whole-number multiples of its frequency. Harmonics are an important element of tone: the balance of a sound’s harmonics is a large part of what makes it recognizable.
Harshness describes a rough or unpleasant quality in a sound. It usually refers to a sound containing too much high-frequency content or strong sibilance. Harshness is a problem as it is difficult to listen to, and can stand out in an undesirable way.
Headroom refers to the amount of level range available above the nominal level of an audio signal or mix. It’s an important consideration in mixing and mastering as it determines how much room is available for transients. Having optimal headroom can affect the clarity or punch of a mix.
Hertz (Hz) is a unit of measurement used to describe the frequency of an audio signal. Frequency refers to the number of cycles or oscillations per second of an audio waveform and is typically measured in Hz.
A high pass filter is a type of plugin unit that is used to remove or attenuate frequencies below a certain cutoff frequency. It’s used to remove low-frequency sounds like background noise or rumble.
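As a rough sketch of how such a filter behaves, the example below tracks the slow-moving (low-frequency) part of a signal and subtracts it, leaving the faster-moving content. It is a very gentle "one-pole" design chosen for brevity, with an approximate cutoff; real EQ plugins typically use steeper filters, and the function name is made up for this example.

```python
import math

def one_pole_high_pass(samples, cutoff_hz, sample_rate_hz):
    """Gentle high pass filter: subtract a smoothed (low-passed) copy from the input."""
    # Smoothing coefficient for the internal low-pass stage (approximate cutoff).
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)
    low = 0.0
    out = []
    for s in samples:
        low += alpha * (s - low)  # follows the slow-moving part of the signal
        out.append(s - low)       # what remains is the higher-frequency content
    return out

constant_offset = [1.0] * 2000            # the lowest "frequency" possible: no movement at all
fast_wiggle = [1.0, -1.0] * 1000          # the highest frequency a digital signal can hold
print(abs(one_pole_high_pass(constant_offset, 100.0, 48000)[-1]) < 0.001)  # True: removed
print(abs(one_pole_high_pass(fast_wiggle, 100.0, 48000)[-1]) > 0.9)        # True: passes through
```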
Imaging refers to the process of creating a sense of space, depth, and width in a mix or recording. It’s an important aspect of production as it helps make the different elements of a mix sound more cohesive and balanced.
Latency refers to the delay that occurs between the time an audio signal is generated and the time it is heard. It’s a problem in mixing as it means sounds are out of sync which can be disorienting for the listener.
Layering refers to the process of combining multiple sounds to create a deeper timbre. It’s a common production technique used to add depth and complexity to a mix.
A limiter is a type of audio processor that’s used to prevent an audio signal level from exceeding a threshold. It is often used to prevent distortion or clipping.
Listener fatigue refers to when a person’s ears become tired or bored of a mix after an extended period of time. It’s caused by a variety of factors, including monotonous sounds, aggressive frequencies, or a lack of dynamic contrast.
Looping refers to the process of repeating a section of music to create a continuous, seamless sound. It’s a technique used in arranging to allow for other musical elements to be introduced.
A low pass filter is a type of EQ unit that is used to allow low-frequency sounds to pass through while blocking higher frequencies. They’re often used to reach a smoother sound, and to reduce the harshness or brightness within a mix.
Makeup gain is the process of boosting an audio signal after its level has been reduced by a plugin unit, most commonly a compressor.
Masking occurs when one sound is difficult to hear because of another sound that is playing at the same time. It happens when one of them is significantly louder or has more energy in the same frequency range as the other sound.
MIDI stands for “musical instrument digital interface” and is a standard that lets electronic instruments, controllers, and computers communicate. Musicians use MIDI controllers, such as a keyboard or drum pad, which send performance data (rather than audio) to a device that can generate digital audio.
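To show what "performance data" looks like, here is a small sketch that builds a MIDI "note on" message: three bytes saying which channel, which note, and how hard it was played. The helper name is invented for this example, and it only constructs the bytes; sending them to a device would need a MIDI library, which is not covered here.

```python
def note_on(channel: int, note: int, velocity: int) -> bytes:
    """Build a 3-byte MIDI note-on message (channel 0-15, note and velocity 0-127)."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0-15")
    if not (0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("note and velocity must be 0-127")
    # Status byte: 0x90 means 'note on'; the low four bits carry the channel.
    return bytes([0x90 | channel, note, velocity])

message = note_on(channel=0, note=60, velocity=100)  # note 60 is middle C
print(message.hex())  # 903c64
```

Notice that the message contains no audio at all, only instructions for whatever instrument receives it.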
Monitors are speakers that are used to listen to and evaluate audio during the mixing and mastering process. They’re typically used in a studio and are designed to be as accurate and neutral as possible. This allows the listener to hear the true sound of the audio without any coloration or distortion.
Mono refers to a single channel of audio. A mono audio signal contains all of the information for a recording in a single channel. This is as opposed to a stereo signal, which contains separate left and right channels. Mixing in mono is a popular technique that engineers adopt to work towards a balanced mix.
Muddy is used to describe a mix that lacks clarity and definition due to too much low-frequency content. Muddy mixes can be difficult to listen to, as it can be hard to discern the individual elements of the mix.
The term “mute” refers to the action of silencing or turning off a track or a group of tracks. This is done using a mute button on a mixing console or within a DAW. It’s often used to temporarily silence an instrument to compare the mix with and without it.
A noise gate is a type of audio plugin that’s used to reduce or cut unwanted background noise from a recording. A threshold level is set: signal above the threshold is allowed through, while signal below it is silenced. Gates are also used more creatively throughout pop and R&B productions.
A null test is a method used to compare two audio signals to determine how similar or different they are. The polarity of one signal is inverted and the two are played together: anything identical cancels out, and whatever remains is the difference between them. It is used as a way to check the phase relationship between two tracks or to troubleshoot problems in a mix.
Overdrive is a type of effect that is used to produce a distorted or saturated sound. It is often used on guitar to create a warm and crunchy sound. It can also be used on vocals or other sources to add character or attitude.
A pan pot (short for “panning potentiometer”) is a control that is used to adjust the stereo balance of an audio signal. The mix engineer can specify how much of a signal is heard in the left channel, right channel, and in both channels.
Panning refers to the placement of a sound in the stereo field, that is, the balance of the sound between the left and right channels. It’s controlled using a pan pot on a mixing console or in a DAW.
Parallel processing is a mixing technique where a copy of an audio signal is routed through a second chain. Effects can then be applied to the copy without altering the original.
Phase describes the position of a waveform within its cycle at a given point in time. When two or more similar audio signals are played together, their phase relationship can negatively affect the sound: signals that are out of phase partially cancel each other out.
A phaser is a type of audio effect that creates a sweeping, whooshing sound. It splits the signal into two paths and passes one of them through a series of filters that shift its phase by different amounts at different frequencies. The phase-shifted signal is then mixed back with the original, creating notches in the frequency spectrum that sweep up and down as the phase shift is continually varied.
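At its simplest, a gate is a rule applied to every sample: keep it if it is loud enough, silence it if not. The sketch below assumes exactly that and nothing more; real gates smooth the opening and closing with attack, hold, and release times, which are left out here. The function name and default threshold are illustrative.

```python
def noise_gate(samples, threshold=0.05):
    """Pass samples at or above the threshold level and silence everything below it."""
    return [s if abs(s) >= threshold else 0.0 for s in samples]

# Loud hits pass through; the quiet hiss in between is removed.
print(noise_gate([0.5, 0.01, -0.3, -0.02]))  # [0.5, 0.0, -0.3, 0.0]
```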
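In practice a null test is usually done by flipping the polarity of one signal and summing it with the other: if they are identical, the result is silence. A minimal sketch on lists of samples, with hypothetical function names, might look like this.

```python
def null_test_residual(signal_a, signal_b):
    """Invert the polarity of signal_b and sum it with signal_a, sample by sample."""
    # zip stops at the shorter signal, so both are assumed to be the same length.
    return [a + (-b) for a, b in zip(signal_a, signal_b)]

def peak_level(samples):
    """Largest absolute sample value; 0.0 means complete silence."""
    return max((abs(s) for s in samples), default=0.0)

original = [0.1, -0.2, 0.3]
identical_copy = [0.1, -0.2, 0.3]
processed_copy = [0.1, -0.2, 0.25]

print(peak_level(null_test_residual(original, identical_copy)))      # 0.0: the signals null
print(peak_level(null_test_residual(original, processed_copy)) > 0)  # True: something differs
```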
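Behind a pan control is a pair of gains, one for each channel. The sketch below assumes a "constant power" pan law, a common choice in which a centered sound is about 3 dB quieter in each speaker so that it does not seem louder than a hard-panned one. DAWs differ in the pan law they use, so treat the numbers as one example; the function name is made up.

```python
import math

def constant_power_pan(pan: float):
    """Return (left_gain, right_gain) for pan from -1.0 (hard left) to 1.0 (hard right)."""
    angle = (pan + 1.0) * math.pi / 4.0  # maps -1..1 onto 0..90 degrees
    return math.cos(angle), math.sin(angle)

print(constant_power_pan(-1.0))  # (1.0, 0.0): everything in the left channel
left, right = constant_power_pan(0.0)
print(round(left, 3), round(right, 3))  # 0.707 0.707: equal in both channels
```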
Ping-pong describes a stereo effect in which a sound bounces back and forth between the left and right channels of a mix. This can be done by panning a sound across the stereo field or by using delay/reverb plugin units.
A pitch shifter is a plugin that can change the pitch of a sound without changing its duration. It alters the frequency of the sound and shifts it up or down by a defined interval, such as a half-step or a whole step. Pitch shifters are commonly used to correct the pitch of vocals or instruments or to create special vocal effects.
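The size of a shift can be worked out with a little arithmetic. This sketch assumes standard equal-tempered tuning, where each half-step multiplies the frequency by the twelfth root of two; the function names are illustrative, and it only computes target frequencies rather than processing audio.

```python
def semitone_ratio(semitones: float) -> float:
    """Frequency ratio for a shift of the given number of half-steps (equal temperament)."""
    return 2.0 ** (semitones / 12.0)

def shifted_frequency(frequency_hz: float, semitones: float) -> float:
    """Frequency of a note after shifting it up (positive) or down (negative)."""
    return frequency_hz * semitone_ratio(semitones)

print(round(semitone_ratio(1), 4))     # about 1.0595 for a half-step up
print(shifted_frequency(440.0, 12))    # 880.0: twelve half-steps is one octave up
print(shifted_frequency(440.0, -12))   # 220.0: one octave down
```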
Plosives are bursts of air that are produced when pronouncing certain consonants, such as “p” or “b.” Plosives can be a problem when recording vocals, as they can cause a sudden, unwanted increase in the level of the signal. This can result in a distorted or overdriven sound, and it can be difficult to fix whilst mixing.
A preamp is a device or circuit that is used to amplify a weak audio signal. It is often used in a chain to boost the recording level of a microphone or other low-level signals.
The proximity effect occurs when a microphone records a sound source that is close. As the distance between the source and the microphone decreases, the low-frequency content in the signal increases. This can result in a wanted or unwanted boost in the bass frequencies. It’s mostly heard with cardioid and hypercardioid mics, and is much less noticeable with omnidirectional mics.
Release refers to the amount of time that it takes for a dynamics processor to return to its normal state once the signal falls back below the threshold. It determines how a compressor reacts to level changes and impacts the sound of the processed signal.
Reverb is an effect that adds a sense of space and ambiance to a sound. It simulates the reflections that occur when a sound is produced in a physical space. It can be used to create a wide range of ambiance, from small, intimate rooms to large, open spaces. Different types of reverb are applied to create depth throughout a mix.
A sample is a short snippet of sound that is recorded and stored digitally. It can be played back and manipulated using an electronic device. Samples are taken from a wide range of sources, including music, speech, and even recordings of everyday objects.
Sample rate refers to the number of samples of audio that are taken per second when an audio signal is recorded. The sample rate is typically measured in kilohertz (kHz), and determines the highest frequency the digital audio can capture. The higher the sample rate, the more accurately high frequencies are represented, whilst a lower sample rate results in a lower-quality and less accurate signal.
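Two quick calculations follow from the sample rate: the highest frequency a recording can represent (half the sample rate, known as the Nyquist frequency) and how many samples a given length of audio contains. The function names below are illustrative.

```python
def nyquist_frequency_hz(sample_rate_hz: float) -> float:
    """Highest frequency that can be represented at this sample rate."""
    return sample_rate_hz / 2

def samples_for_duration(seconds: float, sample_rate_hz: int) -> int:
    """Number of samples needed per channel for the given length of audio."""
    return round(seconds * sample_rate_hz)

print(nyquist_frequency_hz(44100))       # 22050.0 Hz
print(samples_for_duration(2.0, 48000))  # 96000 samples
```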
Saturation is a type of distortion that occurs when an audio signal level is pushed beyond capacity. Saturation is often used to add warmth, character, and harmonics to a sound, and is a way to add depth and interest to a mix.
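One common way to imitate saturation digitally is to pass the signal through a smooth curve that squashes loud peaks gradually instead of chopping them off. The sketch below uses the hyperbolic tangent for that curve; this is an illustrative choice, not how any specific tape machine or plugin works, and the `drive` parameter is made up for the example.

```python
import math

def soft_saturate(samples, drive=2.0):
    """Push the signal into a smooth tanh curve; higher drive means more saturation."""
    return [math.tanh(drive * s) for s in samples]

# Quiet samples stay roughly proportional; loud ones are rounded off below 1.0.
print([round(v, 3) for v in soft_saturate([0.0, 0.1, 0.5, 1.0])])
```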
A send is a routing mechanism used to send a copy of an audio signal to a separate processing chain or bus. Sends are often used to apply FX to a track or a group of tracks without altering the original signal. The amount sent is set using a send level control on each channel of a mixing console or DAW. Sends are often used in conjunction with auxiliary channels or buses to create a variety of effects.
A shelf is a type of EQ filter that is used to boost or cut the level of all frequencies above (a high shelf) or below (a low shelf) a chosen frequency.
Sibilance is a harsh, hissing quality that is caused by prominent “s” and “sh” sounds in a vocal recording. It can make a recording sound harsh or unpleasant, and is difficult to fix when mixing vocals.
Sidechaining is a technique used to shape the dynamic response of one signal based on the level of another. It’s used to create ducking effects, in which a track or effect is attenuated whenever another track, such as a kick drum or a bass line, plays. Sidechaining can be achieved using a dynamics plugin such as a compressor. It’s used to create rhythmic effects, enhance the clarity of a mix, or add the perception of movement to a sound.
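A bare-bones version of ducking can be sketched as: whenever the trigger signal (for example a kick drum) is loud, turn the other signal down. The example below assumes an instant on/off reduction with made-up parameter names; a real sidechained compressor would apply the gain change smoothly using attack and release times.

```python
def duck(signal, sidechain, threshold=0.5, reduction=0.25):
    """Scale `signal` down by `reduction` wherever `sidechain` exceeds the threshold."""
    return [
        s * reduction if abs(k) > threshold else s
        for s, k in zip(signal, sidechain)
    ]

pad = [1.0, 1.0, 1.0, 1.0]
kick = [0.9, 0.1, 0.0, 0.8]  # the kick hits on the first and last samples
print(duck(pad, kick))       # [0.25, 1.0, 1.0, 0.25]
```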
A sine wave is a type of waveform that is characterized by a smooth, periodic oscillation around a central axis. It is a fundamental waveform, and it is the building block of many other waveforms.
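For the curious, a sine wave is simple to generate as a list of samples. The sketch below assumes samples in the range -1.0 to 1.0; the function name and default values are illustrative.

```python
import math

def sine_wave(frequency_hz, duration_s, sample_rate_hz=48000, amplitude=1.0):
    """Generate a sine wave as a list of samples."""
    count = round(duration_s * sample_rate_hz)
    return [
        amplitude * math.sin(2 * math.pi * frequency_hz * i / sample_rate_hz)
        for i in range(count)
    ]

tone = sine_wave(440.0, 1.0)  # one second of a 440 Hz tone (the note A)
print(len(tone))              # 48000 samples
```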
Slapback is an echo effect characterized by a single, short delay that repeats the sound once, very soon after the original. It’s used to add movement to a sound, and it can be particularly effective when used on vocals or percussion.
Smash describes a heavily compressed sound that has a lot of punch and impact. A smash effect is created by using a compressor or limiter with a high ratio and a fast attack time, resulting in a loud, aggressive sound.
The stereo field refers to the way that sounds are arranged within the left-to-right spread of a mix.
Sub refers to the presence of strong, prominent bass frequencies in a mix. A sound described as “subby” typically has a lot of low-frequency energy and a strong, powerful bass presence. Sub is often desired in music, such as EDM and hip hop, where a strong, powerful bass sound is a key part of the aesthetic.
Sustain refers to the length of time that a note is held after it’s played. Sustain is an important characteristic and often shapes the expressive qualities of a sound. For example, a piano has a naturally short sustain, while a guitar or a synthesizer can have a much longer sustain.
Sweetening describes the process of enhancing the sound of a recording or a mix. It often refers to adding clarity, definition, or color to a sound. It’s achieved using a variety of techniques, such as EQ, compression, and reverb.
Tape refers to analog tape, which is a medium that was used for recording and storing audio in the past. Analog tape was widely used for many years in professional recording studios. It was known for its warm, rich sound and its ability to add character and depth to a recording.
The threshold is the level at which a compressor or limiter begins to reduce the level of an audio signal. The threshold is set using a fader or a knob on the plugin unit.
Timbre describes the characteristic or tonal quality of a sound. It is what makes a sound unique, and it is determined by the harmonic content and the spectral envelope. It determines the character and identity of a sound, which will have a significant impact on the feel or emotion of a piece of music.
Tremolo is a type of effect created by rapidly modulating the level or the volume of a sound. It’s often used to add movement and is easily recognized on electric guitar. It’s created using analog or digital modulators, such as a low-frequency oscillator (LFO).
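Tremolo can be sketched as a volume control being turned up and down by a slow sine wave. The example below assumes that sine-shaped modulation, with hypothetical `rate_hz` and `depth` parameters (a depth of 0.0 leaves the sound untouched, 1.0 dips all the way to silence).

```python
import math

def tremolo(samples, rate_hz, depth, sample_rate_hz):
    """Modulate the level of the signal with a slow sine wave."""
    out = []
    for i, s in enumerate(samples):
        # Modulator swings between 0.0 and 1.0 at the tremolo rate.
        modulator = 0.5 * (1.0 + math.sin(2 * math.pi * rate_hz * i / sample_rate_hz))
        gain = 1.0 - depth * modulator
        out.append(s * gain)
    return out

steady_tone = [1.0] * 48000  # one second of a constant level at 48 kHz
wobbling = tremolo(steady_tone, rate_hz=5.0, depth=0.8, sample_rate_hz=48000)
print(round(min(wobbling), 2), round(max(wobbling), 2))  # level now swings between about 0.2 and 1.0
```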
A waveform is a graphical representation of an audio signal. It shows the amplitude (level) of the signal over time.
Warmth describes a sound quality that is often associated with a rich, smooth, and full tone. A warm sound is often perceived as being more pleasant, natural, and inviting. It’s often desired in productions to create an atmosphere or mood.
Width refers to the stereo image of a sound or a mix, often used to describe the perceived size or spread of the sound.
Wet describes a signal processed by an effect, such as reverb or delay. Whilst mixing, a wet signal is often mixed with the dry, unprocessed portion to create a desired balance of effect and direct sound.
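The wet/dry balance found on many effects comes down to a weighted sum of the two signals. Here is a minimal sketch, assuming a single `mix` value where 0.0 is fully dry and 1.0 is fully wet; the function name and this particular blending rule are illustrative, and individual plugins may blend differently.

```python
def wet_dry_mix(dry, wet, mix=0.3):
    """Blend the unprocessed (dry) and processed (wet) signals."""
    return [(1.0 - mix) * d + mix * w for d, w in zip(dry, wet)]

dry_signal = [1.0, 0.5]
wet_signal = [0.0, 1.0]  # stand-in for the output of a reverb or delay
print(wet_dry_mix(dry_signal, wet_signal, mix=0.25))  # [0.75, 0.625]
```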