Every podcast host has been there. You wrap up a great conversation, open the files in your editor, and immediately know that this is going to be a problem.
Device differences are one of the most common and least-discussed sources of audio inconsistency in podcast production. Your guest might be calling in from an iPhone on a quiet commute, a Windows laptop in a noisy open-plan office, or a MacBook in a reverberant kitchen. Each device captures and processes audio differently, and when those tracks land on the same timeline, the contrast between them can make an episode sound like two different shows stitched together.
This guide covers why that happens, how to get ahead of it before recording, and what to do in post-production when the inconsistencies are already baked into the recording and ruining your day. Whether you’re putting out your first episode or your hundredth, these podcast production tips will save you hours at the editing desk.
What This Guide Covers:
1. Why Device Differences Create Audio Headaches
2. Getting Guests Ready for Recording
3. Managing Audio During Recording
4. Fixing It in Post (As Usual)
5. Time to Re-Record, Unfortunately
1. Why Device Differences Create Audio Headaches
The core problem is that every device makes different decisions about how to capture and process sound.
➤ Microphones
- Laptop microphones are usually MEMS-based omnidirectional microphones designed for general voice capture. They pick up sound from all directions, which means they capture not just the speaker’s voice but also room reflections, keyboard noise, HVAC hum, and any background activity.
- iPhones use multiple microphones combined with beamforming and software processing to improve speech clarity. Instead of relying on a single directional mic, the system analyzes audio from multiple inputs and prioritizes the speaker’s voice.
- USB headsets vary widely. Some are tuned for voice calls and apply compression or noise reduction, while others are closer to broadcast quality. In many cases, they still sound more compressed and less natural than dedicated microphones due to proximity and onboard processing.
None of these sources sound identical. When placed side-by-side in editing, tonal and spatial differences are immediately noticeable. While EQ can reduce mismatches, it cannot fully transform a low-quality capture into a high-quality one.
Signal-to-noise ratio also varies significantly across device types. Built-in laptop microphones using MEMS capsules generally have higher self-noise than dedicated recording microphones, though exact figures depend on the device and design.
Dedicated dynamic and condenser microphones typically offer much higher signal-to-noise performance, often 70–80 dB or better. In practice, this means laptop-microphone recordings are more likely to contain background noise, room tone, or processing artifacts, and when levels are matched in post-production, those unwanted sounds stand out against cleaner sources.
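If you want to put a number on this, you can estimate a track’s effective signal-to-noise ratio directly from a short test clip. Below is a minimal Python sketch using numpy and soundfile; the filename is a placeholder, and it assumes the guest stays silent for the first two seconds of the clip before speaking.

```python
import numpy as np
import soundfile as sf

def rms_db(x):
    """RMS level of a signal in dB relative to full scale."""
    return 20 * np.log10(max(np.sqrt(np.mean(x ** 2)), 1e-10))

data, fs = sf.read("guest_test.wav")   # hypothetical test clip
if data.ndim > 1:
    data = data.mean(axis=1)           # fold stereo to mono for analysis

# Assumption: the guest stayed silent for the first 2 seconds, then spoke
noise = data[: 2 * fs]
speech = data[2 * fs :]

print(f"noise floor:   {rms_db(noise):6.1f} dBFS")
print(f"speech level:  {rms_db(speech):6.1f} dBFS")
print(f"effective SNR: {rms_db(speech) - rms_db(noise):6.1f} dB")
```

Run the same script on a clip from a laptop mic and one from a dedicated mic, and the gap described above stops being abstract.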
➤ So what’s the solution for microphones?
While it is sometimes possible to standardize microphone setups for recurring guests (for example, by sending out the same USB microphone or requesting a specific setup), this isn’t realistic for most one-off remote interviews. In those cases, consistency comes not from matching hardware but from controlling how whatever microphone is available is set up and operated.
Most of the variation between recordings comes less from the microphone type itself and more from factors like distance from the mouth, room acoustics, and whether the device is applying automatic processing. A basic microphone used correctly in a quiet space will often outperform a better microphone used poorly.
Because of this, most remote podcast workflows focus on controlling recording conditions rather than standardizing equipment: close mic placement, no speakerphone setups, and noise suppression or enhancement features disabled where possible.
➤ Automatic Processing
Every device applies some level of automatic processing before audio is recorded or transmitted.
- iPhones and Android devices apply noise suppression, echo cancellation, and automatic gain control (AGC)
- Windows systems may apply driver-level audio enhancements depending on configuration
- macOS applies system-level processing depending on the app and input device
- Video conferencing platforms (like Zoom) add additional compression, echo cancellation, and noise filtering
The most disruptive of these for podcast production is AGC. It monitors incoming audio in real time and constantly adjusts the recording level up or down to keep volume consistent. On a phone call, this is useful. In a podcast recording, it creates a pumping effect: during pauses in speech, AGC boosts the gain to compensate for the silence, which pulls up background noise.
When the guest starts speaking again, AGC drops the gain back down abruptly. The result is a track where the noise floor rises and falls in a way that is immediately obvious in editing and very difficult to fix cleanly in post, because the problem is baked into the dynamics of the recording rather than sitting as a consistent layer underneath the voice.
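One way to catch AGC damage before you start cutting is to track the noise floor over time: on a clean track it holds steady, while on an AGC-processed track it rises and falls with the speech. Here is a rough Python sketch of that check; the 50 ms frames, 5-second windows, and 6 dB threshold are arbitrary starting points, not standards.

```python
import numpy as np
import soundfile as sf

data, fs = sf.read("guest.wav")        # hypothetical guest track
if data.ndim > 1:
    data = data.mean(axis=1)

frame = int(0.050 * fs)                # 50 ms analysis frames
n = len(data) // frame
rms = np.sqrt(np.mean(data[: n * frame].reshape(n, frame) ** 2, axis=1))
rms_db = 20 * np.log10(np.maximum(rms, 1e-10))

# Estimate the noise floor in each 5-second window as the quietest
# 10% of frames; the pauses between words dominate that percentile.
win = int(5.0 / 0.050)
floors = [np.percentile(rms_db[i : i + win], 10)
          for i in range(0, n - win + 1, win)]

spread = max(floors) - min(floors)
print(f"noise floor per 5 s window (dBFS): {np.round(floors, 1)}")
print("possible AGC pumping" if spread > 6 else "noise floor looks stable")
```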
All of these systems are designed for intelligibility in real-time communication, not for clean multi-track podcast production. But don’t worry, this is fixable. Keep reading to see how.
2. Getting Guests Ready for Recording
The most effective podcast production happens before anyone hits record.
➤ Standardize the Recording App
The most useful step is ensuring guests use a consistent recording method that captures high-quality local audio, rather than relying on default system apps or call platforms.
- Dedicated recording apps or local DAWs (like Audacity) allow uncompressed recording at consistent settings
- Remote recording platforms that capture local audio per participant are often the most reliable option
For guests on iPhone, Voice Memos records in AAC by default. AAC is a lossy compressed format: it permanently discards parts of the audio signal to shrink file size, which limits how much detail you can recover or enhance in post-production, especially when you need to apply EQ, noise reduction, or match the track against higher-quality recordings from other guests.
The fix? Third-party apps such as Ferrite can record in lossless formats such as WAV or AIFF (depending on settings), which preserve more audio detail and give you greater flexibility in post-production.
For guests on Android, default recording apps vary by manufacturer and often use compressed formats designed for storage and compatibility rather than high-fidelity recording.
The fix? Avoid the default voice recorder and use a dedicated recording app such as Dolby On or Easy Voice Recorder Pro, set to record WAV (PCM) at 48 kHz, with any “enhancement,” “noise reduction,” or “clear voice” features turned off.
A quick check of the app and format before the session starts costs two minutes but saves two hours in editing time.
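That check is easy to script if the guest can send a ten-second test file ahead of time. Here is a sketch using the soundfile package; the filename is a placeholder, and the accepted subtypes and sample rates mirror the targets above.

```python
import soundfile as sf

# Hypothetical test clip received from the guest before the session
info = sf.info("guest_test.wav")
print(info.format, info.subtype, info.samplerate, info.channels)

ok = (
    info.format == "WAV"
    and info.subtype in ("PCM_16", "PCM_24")
    and info.samplerate in (44100, 48000)
)
print("format looks good" if ok else "ask the guest to re-check the app settings")
```

If soundfile refuses to open the file at all, that is a signal in itself: the underlying libsndfile library does not decode AAC, so a file it cannot read has usually already been through lossy compression.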
➤ Disable Automatic Audio Processing
Many processing layers can be reduced or disabled:
- Zoom: enabling “Original Sound for Musicians” disables most audio enhancement
- System audio enhancements on Windows can often be turned off in sound settings
- Mobile apps may have noise reduction or “enhanced voice” options that should be disabled when possible
➤ The Pre-Recording Sound Test
A short test recording before the session helps identify the following (a quick scripted gain check follows the list):
- background noise issues
- room reflections
- incorrect gain levels
- processing artifacts
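Gain problems in particular are easy to catch automatically from that test clip. A small Python sketch; the filename is a placeholder, and the -6 and -24 dBFS thresholds come from the targets discussed in the next section.

```python
import numpy as np
import soundfile as sf

data, fs = sf.read("test_clip.wav")    # hypothetical guest test clip
peak_db = 20 * np.log10(max(np.max(np.abs(data)), 1e-10))
clipped_pct = 100 * np.mean(np.abs(data) > 0.999)   # samples at full scale

print(f"peak: {peak_db:.1f} dBFS, clipped samples: {clipped_pct:.3f}%")
if clipped_pct > 0:
    print("clipping detected: lower the input gain before the session")
elif peak_db > -6:
    print("running hot: ask the guest to turn the input gain down")
elif peak_db < -24:
    print("very quiet: raise the gain or move closer to the mic")
else:
    print("gain looks reasonable")
```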
3. Managing Audio During Recording
Once the session is live, the focus shifts to monitoring levels and avoiding irreversible issues.
➤ Monitor Levels in Real Time
Common targets for spoken-word audio are:
- peaks around -12 dBFS to -6 dBFS
- average levels around -18 to -20 dBFS
Clipping (0 dBFS or above) cannot be repaired cleanly in post-production. Low levels can usually be raised later, but boosting them also raises the noise floor.
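If your recording tool’s meters are hard to read at a glance, a bare-bones console meter takes a dozen lines. Below is a sketch using the sounddevice package, printing peak and RMS for each half-second block against the targets above; the 48 kHz input and ten-second monitoring window are arbitrary choices.

```python
import numpy as np
import sounddevice as sd

def meter(indata, frames, time, status):
    """Print peak and RMS of each audio block in dBFS."""
    if status:
        print(status)
    peak = 20 * np.log10(max(np.max(np.abs(indata)), 1e-10))
    rms = 20 * np.log10(max(np.sqrt(np.mean(indata ** 2)), 1e-10))
    warn = "  CLIPPING" if peak > -0.5 else ("  too quiet" if peak < -24 else "")
    print(f"peak {peak:6.1f} dBFS | rms {rms:6.1f} dBFS{warn}")

# Half-second blocks at 48 kHz; monitor for ten seconds, extend as needed
with sd.InputStream(samplerate=48000, channels=1,
                    blocksize=24000, callback=meter):
    sd.sleep(10_000)
```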
➤ Note Timestamps for Problems
Keeping timestamps of interruptions or issues helps streamline editing and reduces time spent searching through long recordings.
4. Fixing It in Post (As Usual)
Once the recording is done, this is where you clean up inconsistencies, balance tracks, turn a rough session into something listenable, and pretend everything went according to plan.
➤ Normalize and Level-Match First
Normalization sets peak levels, but peak level does not determine perceived loudness. After normalization, use LUFS metering to match perceived loudness across tracks (a minimal matching sketch follows the targets below).
Typical podcast targets:
- around -16 LUFS (stereo)
- around -19 LUFS (mono)
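Here is what a minimal loudness-matching pass can look like, using the open-source pyloudnorm package (a BS.1770 loudness meter for Python). The filenames are placeholders, and the -16 LUFS figure is just the stereo target from the list above.

```python
import soundfile as sf
import pyloudnorm as pyln

data, rate = sf.read("guest.wav")      # hypothetical guest track
meter = pyln.Meter(rate)               # ITU-R BS.1770 loudness meter
loudness = meter.integrated_loudness(data)
print(f"measured: {loudness:.1f} LUFS")

# Apply a constant gain so integrated loudness lands at -16 LUFS
matched = pyln.normalize.loudness(data, loudness, -16.0)
sf.write("guest_matched.wav", matched, rate)
```

One caveat: loudness normalization is a plain gain change, so a quiet track pushed up to target can end up with peaks above 0 dBFS. Check peaks (or add a limiter) after matching.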
➤ EQ to Match Tonal Character
EQ is used to reduce tonal differences between sources:
- Reduce low-end rumble and muddiness (typically below ~80–100 Hz if needed)
- Reduce boxiness (often in the 200–400 Hz range)
- Add presence and clarity (often around 3–5 kHz depending on voice)
The goal is consistency between voices, not perfection of individual tracks.
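Those three moves translate directly into filters. Below is a sketch using scipy and the peaking filter from the standard audio-EQ-cookbook formulas; the corner frequencies, gains, and Q values are illustrative starting points drawn from the ranges above, not a preset.

```python
import numpy as np
import soundfile as sf
from scipy.signal import butter, lfilter

def peaking_eq(fs, f0, gain_db, q):
    """Biquad peaking-EQ coefficients (RBJ audio-EQ-cookbook)."""
    A = 10 ** (gain_db / 40)
    w0 = 2 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2 * q)
    b = np.array([1 + alpha * A, -2 * np.cos(w0), 1 - alpha * A])
    a = np.array([1 + alpha / A, -2 * np.cos(w0), 1 - alpha / A])
    return b / a[0], a / a[0]

data, fs = sf.read("guest.wav")        # hypothetical guest track

# 1. High-pass around 90 Hz to clear rumble and low-end mud
b, a = butter(2, 90, btype="highpass", fs=fs)
data = lfilter(b, a, data, axis=0)

# 2. Dip ~3 dB around 300 Hz to reduce boxiness
b, a = peaking_eq(fs, 300, -3.0, q=1.4)
data = lfilter(b, a, data, axis=0)

# 3. Gentle ~2 dB presence lift around 4 kHz
b, a = peaking_eq(fs, 4000, 2.0, q=1.0)
data = lfilter(b, a, data, axis=0)

sf.write("guest_eq.wav", data, fs)
```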
➤ Audio Repair Software for Serious Problems
Tools like iZotope RX can reduce noise, remove hum, and improve intelligibility in damaged recordings. AI-based tools like Adobe Podcast’s Enhance Speech can also improve clarity, but results vary with input quality, and heavily degraded audio may come back with artifacts.
These tools improve poor recordings but cannot fully restore clean audio from severely damaged sources.
➤ Noise Reduction: Less Is More
Noise reduction should be applied conservatively. Over-processing can introduce unnatural artifacts such as metallic or “underwater” sound.
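Conservative is a setting, not just an attitude: most tools expose a strength control, and keeping it well below maximum is the practical form of restraint. In the open-source noisereduce package, for example, that control is prop_decrease; the 0.75 value and filename below are illustrative, and the sketch assumes a mono voice track.

```python
import soundfile as sf
import noisereduce as nr

data, rate = sf.read("guest.wav")      # assumes a mono voice track

# Remove only ~75% of the estimated noise instead of all of it;
# leaving a little noise behind helps avoid metallic artifacts.
reduced = nr.reduce_noise(y=data, sr=rate, prop_decrease=0.75)

sf.write("guest_nr.wav", reduced, rate)
```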
How do you detect those artifacts?
Noise reduction artifacts often appear at lighter processing settings than editors expect, and they depend heavily on the playback system. Audio that sounds acceptable on studio monitors may reveal artifacts or tonal imbalances on consumer devices such as earbuds or phone speakers, which account for a significant share of podcast listening.
A useful check before finalizing any noise-reduced track is to listen back at a lower playback volume, which can make tonal imbalances and processing artifacts more noticeable. At lower volumes, human hearing becomes less sensitive to bass frequencies, which can expose issues in frequency balance that are less obvious at higher monitoring levels.
5. Time to Re-Record, Unfortunately
Some recordings cannot be fully recovered in post-production, particularly when:
● audio is heavily clipped throughout
● severe background noise dominates speech
● recordings are captured in highly reflective environments with no control
In these scenarios, re-recording is often more efficient than attempting restoration in post-production. Identify the specific sections that are unusable and re-record those alone. These targeted re-recordings, known as pickups, work best when the recording environment matches the original as closely as possible.
A pickup recorded in a noticeably quieter or more reverberant space than the original creates an acoustic discontinuity that can be just as distracting as the problem it replaced. Recording 30 seconds or so of room tone at the start of every session is a habit that pays off precisely in this situation. Laying that ambient bed under a pickup smooths the transition and reduces how obvious the edit sounds in the final mix.
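Laying that bed is mechanical enough to script. Here is a sketch using pydub; the filenames, the 6 dB attenuation, and the 150 ms fades are illustrative choices rather than fixed rules.

```python
from pydub import AudioSegment

pickup = AudioSegment.from_wav("pickup.wav")        # the re-recorded line
room_tone = AudioSegment.from_wav("room_tone.wav")  # from the original session

# Loop the room tone until it covers the pickup, then trim to length
bed = room_tone
while len(bed) < len(pickup):                       # pydub lengths are in ms
    bed += room_tone
bed = bed[: len(pickup)] - 6                        # tuck the bed 6 dB down

# Lay the ambience under the pickup and soften the joins with short fades
patched = pickup.overlay(bed).fade_in(150).fade_out(150)
patched.export("pickup_with_bed.wav", format="wav")
```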
It is also worth knowing when, during a session, to ask for a repeat rather than waiting for post. If a guest’s answer is interrupted by a sustained noise or a technical drop, asking them to repeat it on the spot, before moving on, keeps the re-recording in the same acoustic context as the rest of the session: the room sounds the same, and the guest’s energy matches.
Re-recordings captured days later in a different location on a different device are harder to integrate cleanly. The wider the difference between the two environments, the more obvious the join remains after every corrective step has been applied.
Wrapping Up
Podcast production with guests on different devices is manageable when setup is controlled before recording. The most important improvements come from:
● consistent recording methods
● disabling unnecessary processing
● monitoring levels during recording
As for post-production: it’s for refinement, not rescue. Clean inputs always reduce editing time and improve final quality.