DeepTone™'s File Processing functionality allows you to extract insights from your audio files.
There are two ways to provide your audio files to the DeepTone™ Cloud API.
URL Method (Production Method)
With the URL method, the URL of the audio file you would like to process is sent in the JSON body of the POST request. The DeepTone™ Cloud API then downloads the file and processes it.
This is the recommended approach for production, for the following reasons:
- The file size limits are larger
- The files are never stored on OTO's infrastructure
- It allows the DeepTone™ Cloud API to scale optimally, which results in higher throughput
Direct File Upload Method (Testing only)
For testing purposes, the content of local audio files can be uploaded directly with the
For a code sample that shows both methods, go to Example Usage.
Working with stereo files
DeepTone™ processes each audio channel separately. If you provide a stereo file, you can specify a single channel to be processed; otherwise, all channels are processed separately.
File download request validation
When executing file processing, if a file URL is provided, the requests to download the file will be signed using ES512, and can be verified using this Public Key:
Public Key ID:
The requests contain an HTTP header, X-DeepTone-Request-Verification, which carries a JWT with the following fields:
- alg: Algorithm used for signing (should match ES512)
- kid: The ID of the Key Pair used to sign this request. This ID can be used to find which Public Key from the provided set matches the one used to sign this request.
- organization_id: The ID of your Organization. Should match your Organization's attributed ID
- project_id: The ID of the Project that initiated the call. Should match your Project's attributed ID
- job_id: The Job ID related to this request. Should match the Job ID returned when the file processing request was made
- url: The URL for the file to be downloaded. Should match the url of the file that was requested to be processed
- iat: Timestamp representing when this request was made
- jti: Unique download request identifier
- exp: Timestamp representing when this JWT expires
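The checks described in this list can be sketched in Python as follows. This is a minimal illustration, not an official client: the helper names `read_unverified` and `claim_problems` are hypothetical, and the sketch assumes that alg and kid sit in the JWT header (as is standard for JWTs) and that exp is a numeric timestamp in seconds. It does NOT verify the ES512 signature. That step needs a JWT library and the Public Key matching kid, and it must pass before any of these values are trusted.

```python
import base64
import json
import time
from typing import List, Optional, Tuple


def _b64url_decode(segment: str) -> bytes:
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def read_unverified(token: str) -> Tuple[dict, dict]:
    """Return (header, claims) WITHOUT checking the signature.

    Useful for reading `kid` to pick the right Public Key; never trust
    the claims until the signature has been verified separately.
    """
    header_b64, payload_b64, _signature = token.split(".")
    header = json.loads(_b64url_decode(header_b64))
    claims = json.loads(_b64url_decode(payload_b64))
    return header, claims


def claim_problems(header: dict, claims: dict, expected: dict,
                   now: Optional[float] = None) -> List[str]:
    """List mismatches between the token and the values you expect."""
    now = time.time() if now is None else now
    problems = []
    if header.get("alg") != "ES512":
        problems.append("alg is not ES512")
    # Each of these should equal the value you know from your own request.
    for name in ("organization_id", "project_id", "job_id", "url"):
        if claims.get(name) != expected.get(name):
            problems.append(name + " mismatch")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        problems.append("token expired or exp missing")
    return problems
```

A file server would call `read_unverified` on the header value, verify the signature with the key identified by kid, and serve the file only if `claim_problems` returns an empty list.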
Currently, processing of WAV files is supported. Ideally, the files should be 16-bit PCM with a sample rate of 16 kHz. If a different sample rate is provided, the file will be up- or down-sampled accordingly. Be aware, though, that files with a sample rate lower than recommended may lead to a deterioration of the analysis results.
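One way to check a file against these recommendations from Python is sketched below, using the standard-library `wave` module. The function names are hypothetical and not part of DeepTone™. Note that `wave` only reads uncompressed PCM WAV files and raises `wave.Error` for other encodings, which is itself a useful signal.

```python
import wave


def describe_wav(path: str) -> dict:
    """Read the basic format properties from a PCM WAV file header."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "bits_per_sample": wav.getsampwidth() * 8,  # sample width is in bytes
            "sample_rate_hz": wav.getframerate(),
        }


def matches_recommendation(info: dict) -> bool:
    """True when the file is 16-bit with a 16 kHz sample rate."""
    return info["bits_per_sample"] == 16 and info["sample_rate_hz"] == 16000
```

A file that fails `matches_recommendation` can still be processed, since other sample rates are resampled, but a rate below 16 kHz may reduce the quality of the results.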
If you're not sure whether your audio files meet these criteria, you can check them with the CLI tool SoX as follows:
The result will be something similar to:
SoX also allows you to convert your files in case they don't match our criteria by using the following command:
Currently, using the Cloud API for file processing is limited by the size of the file as follows:
- 15MB for direct file upload
- 125MB when providing a URL for the file
Usage examples that fit these use cases can be found below. If you would like to process files larger than what's mentioned above with the Cloud API we provide methods to do so in our Troubleshooting page.
These constraints also apply to on-premise deployments of the DeepTone™ API; there, however, the limits are configurable.
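A client can check the file size before submitting a job. The sketch below assumes decimal megabytes (1 MB = 1,000,000 bytes) and that a file exactly at the limit is accepted; neither detail is stated above, and the names are hypothetical.

```python
import os

MB = 1_000_000  # assumption: decimal megabytes

# Cloud API limits from the list above, keyed by submission method.
LIMITS_BYTES = {
    "upload": 15 * MB,   # direct file upload
    "url": 125 * MB,     # file provided by URL
}


def within_limit(size_bytes: int, method: str) -> bool:
    """True when a file of this size fits the limit for the given method."""
    return size_bytes <= LIMITS_BYTES[method]


def local_file_within_limit(path: str, method: str = "upload") -> bool:
    """Convenience wrapper that reads the size of a local file."""
    return within_limit(os.path.getsize(path), method)
```

For on-premise deployments, `LIMITS_BYTES` would need to reflect whatever limits have been configured.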
The size of the JSON result is currently limited to 30MB for both direct file upload and the URL Method. To reduce the result size you can:
- Increase the
- Create separate processing jobs for each model
If these measures are not an option for you, let us know at firstname.lastname@example.org.
Configuration options and outputs
There are different configuration parameters and types of outputs which can be requested.
Available configuration parameters
There are several possible parameters which can be passed to a POST request to the
- models: the list of model names to use for the audio analysis
- output_period: how often (in milliseconds, a multiple of 64) the output of the models should be returned
- channel: optionally, a channel to analyse; otherwise all channels will be analysed
- include_summary: optionally, whether the output should contain a summary of the analysis, defaults to False
- include_transitions: optionally, whether the output should contain transitions of the analysis, defaults to False
- include_raw_values: optionally, whether the result should contain raw model outputs, defaults to False
- volume_threshold: optionally, a volume threshold to use instead of the default (higher values will result in more of the data being treated as silence)
- callback: optionally, a callback URL that will be invoked once the results are ready. More info about this option here.
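The parameters above can be collected and sanity-checked before a request is sent. The following is a minimal sketch: `build_options` is a hypothetical helper, the model name in the usage line is a placeholder, and how the resulting values are attached to the request is not shown here.

```python
from typing import Any, Dict, List, Optional


def build_options(
    models: List[str],
    output_period: int,
    channel: Optional[int] = None,
    include_summary: bool = False,
    include_transitions: bool = False,
    include_raw_values: bool = False,
    volume_threshold: Optional[float] = None,
    callback: Optional[str] = None,
) -> Dict[str, Any]:
    """Collect the documented parameters, omitting unset optional ones."""
    if not models:
        raise ValueError("at least one model name is required")
    # output_period is in milliseconds and must be a multiple of 64.
    if output_period <= 0 or output_period % 64 != 0:
        raise ValueError("output_period must be a positive multiple of 64")
    options: Dict[str, Any] = {
        "models": list(models),
        "output_period": output_period,
        "include_summary": include_summary,
        "include_transitions": include_transitions,
        "include_raw_values": include_raw_values,
    }
    if channel is not None:
        options["channel"] = channel
    if volume_threshold is not None:
        options["volume_threshold"] = volume_threshold
    if callback is not None:
        options["callback"] = callback
    return options


# Placeholder model name; substitute a real one.
print(build_options(["<model-name>"], output_period=1024, include_summary=True))
```

Validating output_period on the client side avoids a round trip for a request that cannot succeed.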
There are four possible output types; all but the first depend on the parameters that are set to true on the request:
- a plain time series - default output type, returned always
- a plain time series with raw model outputs - raw values are appended when
- a summary - appended to the results when
- a simplified time series - appended to the results when
See below for examples of each of these outputs:
- plain time series (according to the specified
- plain time series with additional raw outputs:
- summary (showing fraction of each class across the entire file):
- simplified time series (indicating transition points between alternating results):
The callback parameter expects a valid URL that will be invoked when your job finishes processing. Once the results are ready, the API will invoke this endpoint using POST, with a body that matches the one returned by the GET request to the
If the invocation of the callback is successful, the API expects a 2XX status code. In case of an unsuccessful invocation (5XX status code), the API will retry invoking the endpoint up to 3 times. Because of this retry mechanism, the callback endpoint might receive multiple notifications for the same job, which the user should handle. Any other status codes will not trigger the retry mechanism. The callback endpoint should respond within 10 seconds of being invoked; otherwise, the request will time out.
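A callback endpoint that tolerates repeated notifications and answers quickly can be sketched with Python's standard library. This is an illustration under assumptions, not a reference implementation: it treats a repeated delivery as one with a byte-identical body (the body fields are not described here, so no job identifier is parsed), and the names `is_first_delivery`, `CallbackHandler` and `serve` are hypothetical.

```python
import hashlib
from http.server import BaseHTTPRequestHandler, HTTPServer

_seen_digests = set()  # in-memory only; use shared storage across processes


def is_first_delivery(body: bytes, seen: set) -> bool:
    """Return True the first time a given body is seen, False on repeats."""
    digest = hashlib.sha256(body).hexdigest()
    if digest in seen:
        return False
    seen.add(digest)
    return True


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        if is_first_delivery(body, _seen_digests):
            # Hand the body to a queue or background worker here, so the
            # response is sent well within the 10-second window.
            pass
        # Any 2XX status tells the API the notification was received.
        self.send_response(204)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Start the callback server (blocks until interrupted)."""
    HTTPServer(("0.0.0.0", port), CallbackHandler).serve_forever()
```

Responding with a 2XX status even for a repeated notification keeps the API from retrying again.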
- Shell + Curl
To process a file that is available at this URL:
To process a local file, use the following