The UnderageSpeaker model can be used to classify a speaker as an adult or a child. It also recognizes moments of silence or no speech.
If the analysis is inconclusive, the result is classified as unknown.
In the examples below, we demonstrate:
using the UnderageSpeaker model to find the percentage of the audio during which an adult and a child speak (example 1)
using the model to detect whether the audio contains a child speaker, and to determine their energy and emotion (example 2)
These can be especially useful for content moderation, bullying detection, and similar use cases.
You can download this multichannel audio sample for the following examples.
Determine the percentage that an adult/child speaks - Example 1
Remember to add a valid license key before running the example.
In this example, you can use the summary-level output, which is optionally calculated when processing a file, to calculate the percentage of speech attributed to an adult and to a child.
We analyse each channel independently, and we know there is only one speaker per channel.
With this approach, we can determine the age range of each speaker with high confidence.
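The per-channel arithmetic can be sketched in plain Python without the SDK. This is only an illustration: the label strings, the `speaker_percentages` helper, and the sample data are all hypothetical and are not part of the Deeptone API, which exposes its own constants and summary output.

```python
from collections import Counter

# Hypothetical label names, chosen for this sketch only; the SDK exposes its
# own constants (for example UNDERAGE_ADULT and UNDERAGE_CHILD).
ADULT, CHILD, NO_SPEECH, UNKNOWN = "adult", "child", "no_speech", "unknown"
ALL_LABELS = (ADULT, CHILD, NO_SPEECH, UNKNOWN)


def speaker_percentages(labels):
    """Return the percentage of analysed windows that carry each label."""
    total = len(labels)
    if total == 0:
        # Nothing was analysed, so every share is zero.
        return {label: 0.0 for label in ALL_LABELS}
    counts = Counter(labels)
    return {label: 100.0 * counts[label] / total for label in ALL_LABELS}


# Made-up per-window results for a two-channel file, one speaker per channel.
channels = {
    "0": [ADULT, ADULT, NO_SPEECH, ADULT, UNKNOWN],
    "1": [CHILD, NO_SPEECH, CHILD, CHILD, CHILD],
}

for channel_id, labels in channels.items():
    print(f"Channel {channel_id}: {speaker_percentages(labels)}")
```

Because each channel holds a single speaker, the dominant label on a channel is a reasonable guess at that speaker's age range, assuming the unknown share stays small.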
from deeptone import Deeptone
from deeptone.deeptone import UNDERAGE_NO_SPEECH, UNDERAGE_CHILD, UNDERAGE_ADULT