LUFS Model

The LUFS model classifies audio as "very_faint", "faint", "moderate", "loud", or "painful" based on its perceived loudness relative to full scale:

very_faint - Sound that is barely above the threshold of hearing for humans (-50LUFS and below)
faint - Audible sound that is very low, e.g. a whisper (-50LUFS to -30LUFS)
moderate - Clear audible sounds, e.g. human conversation (-30LUFS to -10LUFS)
loud - Intensity level corresponding to very loud sounds, e.g. power tools, alarm clocks, loud music (-10LUFS to 0LUFS)
painful - Intensity levels that are uncomfortable and painful/dangerous, e.g. jet planes, fireworks, jackhammers (0LUFS)
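
The ranges above can be sketched as a simple threshold lookup. This is a minimal illustration, not the model's actual implementation: the function name `classify_lufs` is hypothetical, and the handling of values that fall exactly on the -30 and -10 boundaries is an assumption, since the ranges above do not say which side they belong to.

```python
def classify_lufs(level: float) -> str:
    """Map a loudness value in LUFS to one of the five class labels."""
    if level <= -50.0:   # "-50 LUFS and below", as stated above
        return "very_faint"
    if level < -30.0:    # assumption: -30 itself counts as moderate
        return "faint"
    if level < -10.0:    # assumption: -10 itself counts as loud
        return "moderate"
    if level < 0.0:
        return "loud"
    return "painful"     # 0 LUFS, as stated above


label = classify_lufs(-25.234)
```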

The confidence values that the model produces are always 1 since they are based on a deterministic calculation.

When trying to detect speech in a normal conversation, we recommend looking for intensity_levels in the moderate range.

Specification

Receptive Field	Result Type
`512ms`	result ∈ ["very_faint", "faint", "moderate", "loud", "painful"]

Time-series

The time-series result will be an iterable with elements that contain the following information:

{
  "timestamp": 0,
  "results": {
    "lufs": {
      "result": "moderate",
      "confidence": 1.0
    }
  }
}
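
As an illustration of the recommendation to look for the moderate range when detecting speech, the sketch below filters a time-series for elements classified as "moderate". It assumes the time-series has already been loaded into a list of dictionaries shaped like the element above; the function name `moderate_timestamps` and the sample data are hypothetical.

```python
def moderate_timestamps(time_series):
    """Return the timestamps of all elements classified as "moderate"."""
    return [
        element["timestamp"]
        for element in time_series
        if element["results"]["lufs"]["result"] == "moderate"
    ]


# Hypothetical sample data in the documented element shape.
sample_series = [
    {"timestamp": 0, "results": {"lufs": {"result": "moderate", "confidence": 1.0}}},
    {"timestamp": 256, "results": {"lufs": {"result": "faint", "confidence": 1.0}}},
    {"timestamp": 512, "results": {"lufs": {"result": "moderate", "confidence": 1.0}}},
]
speech_candidates = moderate_timestamps(sample_series)
```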

Time-series with raw values

If raw values are requested, they are added to the time-series results. For this model the only raw value is intensity_level, the calculated intensity level in LUFS.

{
  "timestamp": 0,
  "results": {
    "lufs": {
      "result": "moderate",
      "confidence": 1.0
    }
  },
  "raw": {
    "lufs": {
      "intensity_level": -25.234
    }
  }
}
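
A minimal sketch of reading the raw values, assuming the elements are available as Python dictionaries in the shape shown above. The function name `intensity_levels` and the sample data are hypothetical; elements without a "raw" key are skipped so the same code also works when raw values were not requested.

```python
def intensity_levels(time_series):
    """Collect (timestamp, intensity_level) pairs from elements that carry raw values."""
    pairs = []
    for element in time_series:
        raw = element.get("raw")
        if raw is not None:
            pairs.append((element["timestamp"], raw["lufs"]["intensity_level"]))
    return pairs


# Hypothetical sample data: one element with raw values, one without.
raw_series = [
    {
        "timestamp": 0,
        "results": {"lufs": {"result": "moderate", "confidence": 1.0}},
        "raw": {"lufs": {"intensity_level": -25.234}},
    },
    {
        "timestamp": 256,
        "results": {"lufs": {"result": "faint", "confidence": 1.0}},
    },
]
levels = intensity_levels(raw_series)
```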

Summary

If a summary is requested, the following is returned:

{
  "lufs": {
    "very_faint_fraction": 0.0,
    "faint_fraction": 0.2903,
    "moderate_fraction": 0.7097,
    "loud_fraction": 0.0,
    "painful_fraction": 0.0
  }
}

where x_fraction is the fraction of the input's duration (a value between 0 and 1) during which class x was identified.
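
To make the fractions concrete, the sketch below recomputes a summary from a time-series. It is an illustration only: it assumes every time-series element covers the same amount of time, which may not match how the service computes its summary, and the names `summarize` and `CLASSES` are hypothetical. Keys follow the x_fraction pattern described above.

```python
from collections import Counter

CLASSES = ["very_faint", "faint", "moderate", "loud", "painful"]


def summarize(time_series):
    """Return per-class fractions, assuming each element covers equal time."""
    counts = Counter(el["results"]["lufs"]["result"] for el in time_series)
    total = sum(counts.values())
    if total == 0:
        # No elements: report zero for every class instead of dividing by zero.
        return {"lufs": {f"{name}_fraction": 0.0 for name in CLASSES}}
    return {
        "lufs": {f"{name}_fraction": round(counts[name] / total, 4) for name in CLASSES}
    }


def _element(timestamp, result):
    # Helper that builds a hypothetical element in the documented shape.
    return {"timestamp": timestamp, "results": {"lufs": {"result": result, "confidence": 1.0}}}


# Hypothetical input: 9 faint elements followed by 22 moderate ones.
series = [_element(i, "faint") for i in range(9)]
series += [_element(9 + i, "moderate") for i in range(22)]
summary = summarize(series)
```

With 9 of 31 elements faint and 22 of 31 moderate, the fractions come out as 0.2903 and 0.7097, and all five fractions sum to 1.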

Transitions

If transitions are requested, a time-series of transition elements like the ones shown below is returned.

{
  "timestamp_start": 0,
  "timestamp_end": 256,
  "result": "moderate",
  "confidence": 1.0
},
{
  "timestamp_start": 256,
  "timestamp_end": 448,
  "result": "faint",
  "confidence": 1.0
}

The example above means that the first 256 ms of the audio snippet were classified as moderate (the intensity level of a normal human conversation), and the span between 256 ms and 448 ms was classified as faint.
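
Transition elements make it easy to total the time spent in each class. The sketch below is a minimal illustration that assumes the transitions are available as a list of Python dictionaries with millisecond timestamps, as in the example above; the function name `duration_per_class` is hypothetical.

```python
def duration_per_class(transitions):
    """Sum the duration in milliseconds of all transition elements per class."""
    totals = {}
    for transition in transitions:
        length = transition["timestamp_end"] - transition["timestamp_start"]
        totals[transition["result"]] = totals.get(transition["result"], 0) + length
    return totals


# The two transition elements from the example above.
transitions = [
    {"timestamp_start": 0, "timestamp_end": 256, "result": "moderate", "confidence": 1.0},
    {"timestamp_start": 256, "timestamp_end": 448, "result": "faint", "confidence": 1.0},
]
durations = duration_per_class(transitions)
```

For the example elements this gives 256 ms of moderate audio and 192 ms of faint audio.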
