Audiobook Audio Technical Requirements: How to Submit Files That Pass Platform Review

Most audiobook production guides walk you through recording, editing, and reviewing your audio. Fewer explain the specific technical bar your files have to clear before a platform will accept them. For independent authors submitting to ACX or a wide-distribution aggregator for the first time, a technical rejection can feel unexpected and hard to diagnose.

The good news is that platform audio requirements are consistent, well-documented, and entirely checkable before you upload. Understanding what each measurement means — and how to verify it in advance — turns a potential rejection into a routine pre-submission step.

Figure: Four measurements decide whether your audio files pass platform review: loudness between -18 and -23 dBRMS, noise floor below -60 dBFS, peaks below -3 dBFS, and MP3 at 192 kbps or higher. Catching a failure here takes minutes; catching it after a rejection takes days.

Why platforms set technical audio standards

Audiobook platforms like ACX — which distributes to Audible, Amazon, and Apple Books — publish specific technical requirements for two connected reasons. First, they want a consistent listening experience across their catalog. An audiobook that is significantly quieter or louder than others creates a jarring experience for listeners switching between titles. Second, they need files that encode reliably for delivery across devices, streaming conditions, and playback speeds.

When your files fall outside the required range, the platform review process flags them and returns them for correction. Most of these rejections have nothing to do with how good your narration sounds. They are triggered by measurable values that are too high, too low, or in the wrong format — all of which can be checked and corrected before you ever submit.

File format and encoding: the baseline requirements

Most platforms share the same core file format requirements. Confirming these before production ends prevents format problems from appearing only at the final submission step.

File format: MP3. ACX and most distribution platforms accept MP3 files for final submission. WAV files are generally used during editing but converted to MP3 for delivery. If your production workflow exports WAV by default, add an MP3 export step before submission.
Bit rate: 192 kbps or higher. ACX requires MP3 files encoded at 192 kbps or higher, at a constant bit rate (CBR). Some platforms accept 128 kbps for mono tracks; encoding everything at 192 kbps avoids ambiguity. Encoding at 256 or 320 kbps is acceptable but produces larger files without a meaningful quality improvement for voice-only content.
Sample rate: 44.1 kHz. This is the standard sample rate for consumer audio and the required rate for ACX submissions. If your recording software defaults to 48 kHz — common in video-production contexts — convert your export to 44.1 kHz before submission.
Stereo or mono: both are accepted. Most voice-only audiobooks are recorded or converted to mono before submission. Stereo offers no meaningful advantage for a single narrator and results in larger files at the same perceived quality.
One file per chapter or section. Each chapter or major section should be a separate file, not a single continuous file for the whole book. Consistent file naming across all chapters makes the submission process and any future revisions significantly easier.
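As a minimal sketch of the export step described above, the snippet below builds an ffmpeg command that converts one chapter master to a 44.1 kHz, 192 kbps mono MP3. It assumes ffmpeg is installed with the LAME encoder (libmp3lame); the function name, folder names, and the chapter_01.mp3 naming scheme are illustrative, not something any platform prescribes.

```python
from pathlib import Path


def build_ffmpeg_command(source: Path, chapter_number: int, out_dir: Path) -> list[str]:
    """Build an ffmpeg command that exports one chapter as a 44.1 kHz, 192 kbps mono MP3."""
    # Hypothetical naming scheme: chapter_01.mp3, chapter_02.mp3, ...
    target = out_dir / f"chapter_{chapter_number:02d}.mp3"
    return [
        "ffmpeg",
        "-i", str(source),         # input file, for example a 48 kHz WAV master
        "-ar", "44100",            # resample to 44.1 kHz
        "-ac", "1",                # downmix to a single (mono) channel
        "-codec:a", "libmp3lame",  # encode with the LAME MP3 encoder
        "-b:a", "192k",            # request a 192 kbps audio bit rate
        str(target),
    ]


cmd = build_ffmpeg_command(Path("masters/ch1.wav"), 1, Path("export"))
print(" ".join(cmd))
# To run the export for real, pass the list to subprocess.run(cmd, check=True).
```

Building the command as a list keeps every chapter on identical settings, which is the main point: one naming pattern and one set of encoding options for the whole book.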

Loudness: the most commonly failed requirement

Loudness is the technical measurement that causes the most submission failures for independent audiobook authors. It is measured in dBRMS (decibels, root mean square), which reflects the average signal level across a section of audio rather than its loudest moment. Most platforms specify a required range rather than a single target value.

ACX requires every audiobook file to measure between -23 dBRMS and -18 dBRMS. Files louder than -18 dBRMS exceed the upper loudness limit and are flagged as too loud. Files quieter than -23 dBRMS are flagged as too quiet. Both are rejected, and neither problem is reliably obvious from casual listening, especially if you are monitoring at a comfortable volume rather than measuring.

-18 dBRMS is the loudest acceptable level. Audio above this value sounds fine to most listeners but exceeds the limits platforms apply when normalizing playback volume across their catalog.
-23 dBRMS is the quietest acceptable level. Audio below this level sounds noticeably soft on most audiobook apps, which normalize playback at a fixed reference point. Platform normalization cannot compensate for audio that is too far below the required floor without introducing audible artifacts.
A target of around -20 dBRMS keeps you comfortably in range. Mixing or mastering to approximately -20 dBRMS gives you a buffer on both ends and avoids the need for last-minute adjustments before submission.
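To make the measurement concrete, here is a rough sketch of how an RMS level in dB can be computed from raw samples. It assumes the audio has already been decoded to floating-point samples between -1.0 and 1.0, and the function names are hypothetical. Dedicated meters may window or weight the signal differently, so treat the result as an approximation of what a tool like ACX Check reports.

```python
import math


def rms_db(samples: list[float]) -> float:
    """Return the RMS level of samples (floats in -1.0..1.0) in dB relative to full scale."""
    if not samples:
        raise ValueError("no samples to measure")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence has no measurable level
    # 10 * log10(mean square) equals 20 * log10(RMS amplitude)
    return 10 * math.log10(mean_square)


def loudness_in_range(level_db: float, quietest: float = -23.0, loudest: float = -18.0) -> bool:
    """Check a measured level against the -23 to -18 dBRMS window."""
    return quietest <= level_db <= loudest


# One second of a 220 Hz test tone whose RMS amplitude is close to 0.1 (about -20 dB).
sample_rate = 44100
tone = [0.1414 * math.sin(2 * math.pi * 220 * n / sample_rate) for n in range(sample_rate)]
level = rms_db(tone)
print(round(level, 2), loudness_in_range(level))
```

An RMS amplitude of 0.1 of full scale works out to 20 × log10(0.1) = -20 dB, which is why a target near -20 dBRMS sits in the middle of the accepted window.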

Noise floor: the quiet that isn't quiet enough

The noise floor is the level of background sound present in your audio when no narration is happening — the subtle hiss, hum, or room tone that exists beneath your voice. ACX requires a noise floor quieter than -60 dBFS in silent sections of each file.

A noise floor problem is almost always a recording problem, not an editing problem. If your room has audible hum from an appliance, HVAC system, or nearby electrical equipment, that noise is captured in every second of audio alongside your voice. Most editing tools can reduce noise floors to some degree, but heavy noise reduction also affects the narration quality itself. Getting the noise floor below -60 dBFS at the recording stage is far more reliable than trying to recover it in post-production.

Check your noise floor before each recording session. Record five seconds of silence at the start and review the level. If silence reads louder than -60 dBFS, identify and remove the source before recording your full chapter.
Common causes: running appliances (refrigerators, HVAC, desktop fans), electrical hum from power supplies or lighting fixtures, and building mechanical systems with a consistent low-frequency tone.
Noise reduction plugins can help with residual issues but should be used at low settings. Aggressive noise reduction creates an unnatural, processed quality in the narration that is often more distracting than mild background hiss.
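The pre-session silence check can be sketched in a few lines. This example estimates the noise floor as the RMS level of the quietest half-second window, which is one common way to approach it and an assumption here, not a statement of how any particular platform measures. The function names and the simulated room tone are illustrative only.

```python
import math


def window_rms_db(window: list[float]) -> float:
    """RMS level of one window of samples, in dB relative to full scale."""
    mean_square = sum(s * s for s in window) / len(window)
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")


def noise_floor_db(samples: list[float], sample_rate: int = 44100,
                   window_seconds: float = 0.5) -> float:
    """Estimate the noise floor as the level of the quietest full window."""
    size = int(sample_rate * window_seconds)
    if size <= 0 or len(samples) < size:
        raise ValueError("need at least one full window of audio")
    # Step through the audio in non-overlapping windows and keep the quietest one.
    starts = range(0, len(samples) - size + 1, size)
    return min(window_rms_db(samples[i:i + size]) for i in starts)


# Simulated take: two seconds of faint room tone followed by one second of a louder tone.
rate = 44100
room_tone = [0.0005 if n % 2 == 0 else -0.0005 for n in range(rate * 2)]
speech = [0.1 * math.sin(2 * math.pi * 220 * n / rate) for n in range(rate)]
floor = noise_floor_db(room_tone + speech, sample_rate=rate)
print(round(floor, 2), "passes" if floor < -60.0 else "too noisy")
```

Because the function keeps only the quietest window, narration elsewhere in the file does not raise the estimate; what matters is how quiet the room is when nothing is being said.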

Peak levels: the ceiling no sample should exceed

The peak level requirement governs the loudest individual sample in your audio file — the single loudest moment in any word, consonant, or breath. ACX requires that no sample exceed -3 dBFS.

Most carefully edited narration will not trigger a peak violation unless there is an uncorrected plosive burst (the burst of air from B and P sounds hitting a condenser microphone directly), an accidental knock or clap captured in the room, or a particularly loud passage that was not controlled during recording. A peak limiter set to a -3 dBFS ceiling — applied at the mastering stage — catches outlier peaks without affecting normal narration levels.
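For reference, a sample-peak check is the simplest of the measurements to sketch. The example assumes floating-point samples between -1.0 and 1.0 and uses hypothetical function names. It looks only at stored sample values, so it will not catch inter-sample ("true peak") overshoots that some meters report.

```python
import math


def peak_db(samples: list[float]) -> float:
    """Return the level of the single loudest sample in dB relative to full scale."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")


def exceeds_ceiling(samples: list[float], ceiling_db: float = -3.0) -> bool:
    """True when any sample is louder than the ceiling (default -3 dBFS)."""
    return peak_db(samples) > ceiling_db


# A -3 dBFS ceiling corresponds to a linear amplitude of roughly 0.708 of full scale.
print(round(10 ** (-3.0 / 20), 3))
print(exceeds_ceiling([0.2, -0.7, 0.4]))  # loudest sample is 0.7, just under the ceiling
print(exceeds_ceiling([0.2, -0.8, 0.4]))  # loudest sample is 0.8, over the ceiling
```

A single stray sample is enough to fail this check, which is why one uncorrected plosive or knock can reject an otherwise clean chapter.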

How to check your files before you submit

All four measurements — loudness, noise floor, peak level, and format — can be verified before submission using free or low-cost tools that are accessible to any author with a computer.

Audacity with the ACX Check plugin. Audacity is free and cross-platform. The ACX Check plugin runs all three ACX measurements at once — loudness, noise floor, and peak — and reports a clear pass or fail for each. For most independent authors, this is the simplest and most direct verification path available.
Adobe Audition or Logic Pro. Both applications include loudness meters and waveform analysis tools in their standard toolsets. For authors already working in these environments, the measurement workflow sits within the same application used for editing and mastering.
iZotope RX. Provides detailed noise floor analysis and loudness measurement in a single interface, along with correction tools. It is a paid application but offers useful analysis even in the Elements tier, which is priced for independent use.

The most practical approach for most authors is to run each exported chapter through ACX Check before adding it to the submission queue. Catching a failure file by file means correcting only the problem chapter, not reworking the full submission after a platform rejection.

What to do when a file fails a technical check

Finding a failure before submission is far preferable to discovering it after upload. The correction for each type of failure is straightforward and usually requires a single editing step.

Too loud (above -18 dBRMS): Apply a gain reduction to lower the overall level, then re-export. Most editing applications allow you to reduce gain by a specific number of dB in a single step across the entire file.
Too quiet (below -23 dBRMS): Apply a gain boost, taking care not to push any peaks above -3 dBFS in the process. Normalizing to -20 dBRMS is a reliable target that keeps you well within range on both sides.
Noise floor too high (above -60 dBFS): Apply gentle noise reduction and re-check. If the noise floor remains above -60 dBFS after reduction at conservative settings, the recording environment needs to be addressed before re-recording that chapter. A persistent noise floor problem is almost always a room problem, not an editing problem.
Peak too high (above -3 dBFS): Apply a limiter set to a -3 dBFS ceiling. This tames the offending peak with minimal effect on the surrounding narration. Re-measure loudness afterward to confirm the file is still in range.
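The arithmetic behind these corrections is simple enough to sketch. Assuming you already have a measured RMS level and peak level for a file, the hypothetical helper below works out how much gain reaches a -20 dBRMS target and whether that gain would push the peak past -3 dBFS. It treats gain as a uniform shift in dB, which holds for plain gain changes but not once a limiter or compressor is involved.

```python
def plan_gain(measured_rms_db: float, measured_peak_db: float,
              target_rms_db: float = -20.0,
              peak_ceiling_db: float = -3.0) -> tuple[float, float, bool]:
    """Return (gain to apply in dB, projected peak in dB, whether a limiter is needed)."""
    gain_db = target_rms_db - measured_rms_db
    # A uniform gain change moves the peak by the same number of dB as the RMS level.
    projected_peak_db = measured_peak_db + gain_db
    needs_limiter = projected_peak_db > peak_ceiling_db
    return gain_db, projected_peak_db, needs_limiter


# Too quiet: -26 dBRMS with a -7.5 dBFS peak needs +6 dB, which lifts the peak to -1.5 dBFS.
print(plan_gain(-26.0, -7.5))  # (6.0, -1.5, True)

# Too loud: -16 dBRMS with a -2 dBFS peak needs -4 dB, which also brings the peak to -6 dBFS.
print(plan_gain(-16.0, -2.0))  # (-4.0, -6.0, False)
```

The first case shows why a gain boost and a limiter often go together: fixing the loudness alone can create a new peak failure. After limiting, measure again, since the limiter can shift the RMS level slightly.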

A pre-submission technical checklist

Before uploading your final audio files to any distribution platform, confirm each file meets these requirements. Running this check per chapter — rather than per submission — catches problems before they become delays.

File format is MP3, not WAV, AIFF, or another format.
Bit rate is 192 kbps or higher; use 128 kbps only for mono files on a platform that explicitly accepts it.
Sample rate is 44.1 kHz — not 48 kHz.
One file per chapter, with consistent and clearly labeled file names across all chapters.
Loudness reads between -18 dBRMS and -23 dBRMS — target approximately -20 dBRMS.
Noise floor in silent sections is below -60 dBFS.
No sample peaks above -3 dBFS.
Files have been checked using ACX Check, a loudness meter, or equivalent measurement tool before upload.
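The checklist above can also be expressed as a small per-chapter check. This is a sketch under stated assumptions: the ChapterReport fields are hypothetical and would be filled in from whatever measurement tool you use, and the bit rate floor is set to 192 kbps for every file as the stricter reading of the requirement.

```python
from dataclasses import dataclass


@dataclass
class ChapterReport:
    """Measurements for one exported chapter (values supplied by your own tools)."""
    file_name: str
    file_format: str
    bit_rate_kbps: int
    sample_rate_hz: int
    rms_db: float
    noise_floor_db: float
    peak_db: float


def failed_checks(report: ChapterReport) -> list[str]:
    """Return a description of every checklist item the chapter fails."""
    failures = []
    if report.file_format.lower() != "mp3":
        failures.append("format is not MP3")
    if report.bit_rate_kbps < 192:
        failures.append("bit rate below 192 kbps")
    if report.sample_rate_hz != 44100:
        failures.append("sample rate is not 44.1 kHz")
    if not -23.0 <= report.rms_db <= -18.0:
        failures.append("loudness outside -23 to -18 dBRMS")
    if report.noise_floor_db > -60.0:
        failures.append("noise floor above -60 dBFS")
    if report.peak_db > -3.0:
        failures.append("peak above -3 dBFS")
    return failures


good = ChapterReport("chapter_01.mp3", "mp3", 192, 44100, -20.1, -64.0, -3.5)
bad = ChapterReport("chapter_02.mp3", "mp3", 192, 48000, -24.2, -57.0, -3.5)
print(failed_checks(good))
print(failed_checks(bad))
```

Returning a list of failures, rather than a single pass or fail, tells you which correction each chapter needs before it goes into the submission queue.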

Technical quality as part of the final product

Meeting platform technical requirements is not just about passing review. It is about delivering audio that sounds consistent and professional to every listener, on every device, at whatever playback speed they choose. An audiobook that meets these standards behaves predictably — it plays at the right volume relative to other titles, sounds clean through both speakers and headphones, and holds up at the faster playback speeds many audiobook listeners prefer.

For authors working with voice-cloning or AI-assisted production, many of these values are determined by the production workflow rather than the recording process. But the pre-submission check is just as important regardless of how the audio was produced. A clean, passing file is the right foundation for a clean launch.

Simply Voiced is designed to produce audiobook files that arrive at the submission stage already meeting technical review standards, so your final step before launch is uploading — not correcting and resubmitting. If you want a production path where audio quality is built in from the start rather than checked for afterward, that is exactly what a purpose-built audiobook workflow delivers.