Arousal Model

The arousal model classifies speech as "low", "neutral" or "high" arousal, which can be interpreted as the energy in the voice. Tired or depressed speakers tend to speak with low arousal, whereas angry or excited speakers show high arousal in their voice.

Because it only makes sense to apply this model to speech audio, it is combined with the speech and volume models to increase the reliability of the results.

The receptive field of this model is 2107 milliseconds.

Specification

Receptive Field	Result Type
2107 ms	result ∈ ["high", "low", "neutral", "no_speech", "silence"]

Time-series

The time-series result will be an iterable with elements that contain the following information:

{
  "timestamp": 0,
  "results":{
    "arousal": {
        "result": "high",  
        "confidence": 0.92
    }
  }
}
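
As a minimal sketch of consuming this structure, the snippet below walks a time-series and keeps only the elements whose confidence reaches a threshold. The function name `confident_labels`, the threshold value and the second sample element are hypothetical and used for illustration only. The sketch assumes each element is a plain dictionary shaped like the example above.

```python
from typing import Iterable


def confident_labels(series: Iterable[dict], min_confidence: float = 0.8) -> list:
    """Return (timestamp, label) pairs whose confidence meets the threshold."""
    selected = []
    for element in series:
        arousal = element["results"]["arousal"]
        if arousal["confidence"] >= min_confidence:
            selected.append((element["timestamp"], arousal["result"]))
    return selected


# Hypothetical sample data in the shape shown above
sample_series = [
    {"timestamp": 0, "results": {"arousal": {"result": "high", "confidence": 0.92}}},
    {"timestamp": 1024, "results": {"arousal": {"result": "neutral", "confidence": 0.55}}},
]

print(confident_labels(sample_series))  # [(0, 'high')]
```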

Time-series with raw values

If raw values were requested, they will be added to the time-series result:

{
  "timestamp": 0,
  "results":{
    "arousal": {
        "result": "high",  
        "confidence": 0.92
    }
  },
  "raw": {
    "arousal": {
      "high": 0.92,
      "low": 0.0073,
      "neutral": 0.0727
    }
  }
}
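
The raw values can be inspected directly, for example to find the highest-scoring class. The sketch below is only an illustration: `top_raw_class` is a hypothetical helper, and it assumes the `raw` entry is a dictionary of per-class scores as in the example above. When the reported result is "no_speech" or "silence", the top raw class presumably differs from it, since the example raw values only cover the three arousal classes.

```python
def top_raw_class(element: dict) -> tuple:
    """Return the (label, score) pair with the highest raw arousal score."""
    raw = element["raw"]["arousal"]
    label = max(raw, key=raw.get)
    return label, raw[label]


# The element from the example above
sample_element = {
    "timestamp": 0,
    "results": {"arousal": {"result": "high", "confidence": 0.92}},
    "raw": {"arousal": {"high": 0.92, "low": 0.0073, "neutral": 0.0727}},
}

print(top_raw_class(sample_element))  # ('high', 0.92)
```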

Summary

If a summary is requested, the following will be returned:

{
  "arousal": {
    "high_fraction": 0.30,
    "low_fraction": 0.60,
    "neutral_fraction": 0.05,
    "no_speech_fraction": 0.05,
    "silence_fraction": 0.0
  }
}

where x_fraction is the fraction of the input duration (a value between 0 and 1) for which class x was identified.
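
To illustrate how such fractions relate to per-step results, the sketch below computes them from a list of labels. It assumes one label per equally spaced time step, which is an assumption and not necessarily how the summary is computed internally. `summarize` and `CLASSES` are hypothetical names.

```python
from collections import Counter

# The result classes listed in the specification
CLASSES = ("high", "low", "neutral", "no_speech", "silence")


def summarize(labels: list) -> dict:
    """Compute the fraction of steps assigned to each class."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = len(labels)
    return {"arousal": {f"{name}_fraction": counts[name] / total for name in CLASSES}}


# Hypothetical per-step labels: 6 low, 3 high, 1 neutral
labels = ["low"] * 6 + ["high"] * 3 + ["neutral"]
summary = summarize(labels)
print(summary["arousal"]["low_fraction"])  # 0.6
```

Because every step is assigned exactly one class, the fractions sum to 1, as they do in the example above.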

Transitions

If transitions are requested, a time-series of transition elements like the ones shown below will be returned:

{
  "timestamp_start": 0,
  "timestamp_end": 1500,
  "result": "neutral",
  "confidence": 0.96
},
{
  "timestamp_start": 1500,
  "timestamp_end": 4000,
  "result": "high",
  "confidence": 0.88
}

The example above means that the first 1500ms of the audio snippet contained neutral speech, and between 1500ms and 4000ms DeepTone™ detected high arousal in the voice.
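
Transition elements lend themselves to simple aggregation. As a sketch, the snippet below adds up the time spent in each class. It assumes the transitions are available as a list of dictionaries shaped like the example above, with timestamps in milliseconds. `duration_per_class` is a hypothetical helper name.

```python
def duration_per_class(transitions: list) -> dict:
    """Sum the duration in milliseconds of all segments for each class."""
    totals = {}
    for segment in transitions:
        span = segment["timestamp_end"] - segment["timestamp_start"]
        totals[segment["result"]] = totals.get(segment["result"], 0) + span
    return totals


# The two transition elements from the example above
sample_transitions = [
    {"timestamp_start": 0, "timestamp_end": 1500, "result": "neutral", "confidence": 0.96},
    {"timestamp_start": 1500, "timestamp_end": 4000, "result": "high", "confidence": 0.88},
]

print(duration_per_class(sample_transitions))  # {'neutral': 1500, 'high': 2500}
```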
