The DeepTone™ SDK currently supports PCM audio only. For file processing, it supports the WAV file format. Ideally, the audio should be 16-bit PCM with a sample rate of 16 kHz. If a different sample rate is provided, the file will be up- or down-sampled accordingly. Be aware, though, that files with sample rates lower than the recommended 16 kHz may degrade the analysis results.
When processing files, the following file formats are supported*:
- WAV (.wav)
* Certain audio file formats can contain various audio codecs, so make sure the audio coding format of the file you are processing is supported as well. For example, a WAV file can contain PCM A-law or PCM mu-law audio, but the DeepTone™ SDK does not support these audio coding formats yet.
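The audio coding format of a WAV file is stored as a numeric format tag in its `fmt ` chunk, so it can be checked before processing. The following is a minimal sketch using only the Python standard library; it is not part of the DeepTone™ SDK, and the helper names `wav_format_tag` and `describe_wav_coding` are hypothetical.

```python
import struct

# Common WAVE format tags (values defined by the RIFF/WAVE format registry).
FORMAT_TAGS = {
    0x0001: "PCM",
    0x0003: "IEEE float",
    0x0006: "A-law",
    0x0007: "mu-law",
    0xFFFE: "extensible",
}


def wav_format_tag(stream):
    """Return the format tag from the 'fmt ' chunk of a binary WAV stream."""
    riff, _size, wave_id = struct.unpack("<4sI4s", stream.read(12))
    if riff != b"RIFF" or wave_id != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    while True:
        header = stream.read(8)
        if len(header) < 8:
            raise ValueError("no 'fmt ' chunk found")
        chunk_id, chunk_size = struct.unpack("<4sI", header)
        if chunk_id == b"fmt ":
            (tag,) = struct.unpack("<H", stream.read(2))
            return tag
        # Chunks are word-aligned: skip the payload plus a pad byte if the size is odd.
        stream.seek(chunk_size + (chunk_size & 1), 1)


def describe_wav_coding(path):
    """Return a readable name for the audio coding format of the WAV file at `path`."""
    with open(path, "rb") as f:
        tag = wav_format_tag(f)
    return FORMAT_TAGS.get(tag, f"unknown (0x{tag:04X})")
```

A result of `"A-law"` or `"mu-law"` indicates one of the coding formats named above as unsupported, so such a file would need to be converted before processing.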
Most PCM audio coding formats are supported. When processing audio data directly (not file processing) by passing a
numpy array, the numpy array needs to have one of the following data types. Additionally, the sample rate needs to be
specified in the
Check audio format
If you're not sure whether your audio files meet these criteria, you can verify them with the CLI tool SoX. Running `sox --i` on a file prints its header information, including the number of channels, sample rate, precision, and sample encoding.
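As an alternative check that does not depend on SoX, the same header properties can be read with Python's standard `wave` module. This is a minimal sketch, not an SDK feature; the function names and the dictionary keys are hypothetical, and the recommended values are taken from the text above.

```python
import wave

RECOMMENDED_RATE_HZ = 16_000  # recommended sample rate
RECOMMENDED_BITS = 16         # recommended bit depth


def inspect_wav(source):
    """Return header properties of a PCM WAV file (a path or a binary file object)."""
    with wave.open(source, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": wav.getframerate(),
            "bits_per_sample": wav.getsampwidth() * 8,
        }


def meets_recommendation(info):
    """Check the sample rate and bit depth against the recommended values."""
    return (info["sample_rate_hz"] == RECOMMENDED_RATE_HZ
            and info["bits_per_sample"] == RECOMMENDED_BITS)
```

Note that `wave.open` raises `wave.Error` for encodings it cannot read, such as A-law or mu-law, which is itself a sign that the file needs converting.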
SoX can also convert files that don't match our criteria to 16-bit PCM at 16 kHz.
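One way such a SoX conversion might look, wrapped in Python so it can be scripted over many files, is sketched below. The `-r`, `-b`, and `-e signed-integer` output options are standard SoX flags; the helper names and the file paths are hypothetical, and SoX is assumed to be installed and on the `PATH`.

```python
import subprocess


def build_sox_command(src, dst, rate_hz=16_000, bits=16):
    """Build a SoX command that writes `dst` as signed-integer PCM.

    Format options placed before the output file apply to the output,
    so SoX resamples to `rate_hz` and re-quantizes to `bits` as needed.
    """
    return [
        "sox", str(src),
        "-r", str(rate_hz),        # output sample rate in Hz
        "-b", str(bits),           # output bit depth
        "-e", "signed-integer",    # linear PCM encoding
        str(dst),
    ]


def convert_with_sox(src, dst):
    """Run the conversion; raises CalledProcessError if SoX reports an error."""
    subprocess.run(build_sox_command(src, dst), check=True)
```

For example, `convert_with_sox("input.wav", "output_16k.wav")` would run `sox input.wav -r 16000 -b 16 -e signed-integer output_16k.wav`.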
Alternatively, you can use ffmpeg to convert your audio to the right format.
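A comparable ffmpeg conversion could be sketched as follows. The `-ar` option sets the output sample rate and `-c:a pcm_s16le` selects 16-bit little-endian PCM, both standard ffmpeg options; the helper names and the file paths are hypothetical, and ffmpeg is assumed to be installed and on the `PATH`.

```python
import subprocess


def build_ffmpeg_command(src, dst, rate_hz=16_000):
    """Build an ffmpeg command that writes `dst` as 16-bit PCM at `rate_hz`."""
    return [
        "ffmpeg",
        "-i", str(src),           # input file (any format ffmpeg can decode)
        "-ar", str(rate_hz),      # output sample rate in Hz
        "-c:a", "pcm_s16le",      # 16-bit signed little-endian PCM
        str(dst),                 # a .wav extension selects the WAV container
    ]


def convert_with_ffmpeg(src, dst):
    """Run the conversion; raises CalledProcessError if ffmpeg reports an error."""
    subprocess.run(build_ffmpeg_command(src, dst), check=True)
```

Because ffmpeg decodes many input formats, this route also covers sources that are not WAV files to begin with, such as `convert_with_ffmpeg("recording.mp3", "recording_16k.wav")`.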