Analysing audio only makes sense when the provided audio contains an actual signal. That is why all our models work together with a Volume Detector: it detects silence and near-silence so that only snippets containing sound are processed.
Volume detection depends heavily on the properties of the audio capture device. That is why you can set the sensitivity of the silence detection by adjusting volume_threshold in the audio processing functions. The threshold can vary between 0 (nothing is treated as silence) and 1 (everything is treated as silence). We suggest starting with 0.005, which is also the default value; it should exclude very quiet audio chunks from the analysis. If too much of the audio is tagged as silence, we suggest dividing the threshold by 10 repeatedly until a suitable silence threshold is found.
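The tuning procedure described above can be sketched as follows. This is only an illustration under stated assumptions: it assumes that volume is measured as the RMS level of samples normalised to the range [-1.0, 1.0], which may differ from what the Volume Detector actually does. The helper names (rms, silence_ratio, tune_threshold) and the max_silence and min_threshold parameters are hypothetical and not part of the audio processing functions.

```python
import math


def rms(samples):
    """Root-mean-square level of samples normalised to [-1.0, 1.0] (assumed measure)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def silence_ratio(chunks, volume_threshold=0.005):
    """Fraction of chunks whose level falls below the threshold."""
    silent = sum(1 for chunk in chunks if rms(chunk) < volume_threshold)
    return silent / len(chunks)


def tune_threshold(chunks, start=0.005, max_silence=0.5, min_threshold=1e-6):
    """Divide the threshold by 10 until no more than max_silence of the chunks are silent.

    max_silence and min_threshold are hypothetical knobs for this sketch only.
    """
    threshold = start
    while threshold > min_threshold and silence_ratio(chunks, threshold) > max_silence:
        threshold /= 10
    return threshold


if __name__ == "__main__":
    # Two very quiet chunks and one louder chunk from a low-gain microphone.
    recorded = [[0.001] * 100, [0.002] * 100, [0.2] * 100]
    print(tune_threshold(recorded))
```

What counts as "too much silence" depends on the recording, so in practice the stopping condition is a judgment made by listening to or inspecting the audio rather than a fixed ratio.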
Audio fragments that the Volume Detector treats as silence are marked as "silence", and the results of any other requested models are not returned for them.
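To make this gating behaviour concrete, here is a minimal sketch of how a silence check might short-circuit the other models. Everything in it is an assumption for illustration: the function names (is_silent, analyse_chunk), the RMS-based check, the shape of the returned dictionary, and the way models are passed in are hypothetical and do not describe the actual interface.

```python
import math


def is_silent(samples, volume_threshold=0.005):
    """Assumed check: RMS level of normalised samples is below the threshold."""
    if not samples:
        return True
    level = math.sqrt(sum(s * s for s in samples) / len(samples))
    return level < volume_threshold


def analyse_chunk(samples, models, volume_threshold=0.005):
    """Run the requested models only when the chunk is not silent.

    `models` maps a hypothetical model name to a callable taking the samples.
    """
    if is_silent(samples, volume_threshold):
        # Silent chunk: mark it and skip every other requested model.
        return {"label": "silence"}
    return {name: model(samples) for name, model in models.items()}


if __name__ == "__main__":
    # A stand-in "model" that reports the peak amplitude of the chunk.
    models = {"peak": lambda s: max(abs(x) for x in s)}
    print(analyse_chunk([0.0] * 10, models))
    print(analyse_chunk([0.5] * 10, models))
```

Code that consumes the results should therefore check for the "silence" marker first and not expect model outputs to be present for every fragment.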