Emotions Model Recipes
Overview
The Emotions model classifies the emotion in an audio snippet as happy, neutral, irritated or tired. You can use it, e.g., to assess how the speakers in a snippet or stream are feeling.
In the examples below, you will see how to use the Emotions model to detect the speaker's emotion while streaming from a microphone, and how to display a warning if the speaker sounds tired for too long.
Pre-requisites
- DeepTone with license key and models
- pyaudio
- a microphone
Installing pyaudio in a Python environment may require some extra steps unless you are using Anaconda to manage your environment.
We still feel it's the easiest way to get your microphone input in Python, though. For more details on how to install it, head to the Gender model recipes.
Detect when speaker is tired - Example 1
Remember to add a valid license key before running the example.
In these examples we make use of the summary and transitions level outputs, calculated optionally when processing a file.
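To make the two terms concrete, here is a rough sketch of how summary-like and transition-like aggregates could be derived by hand from per-timestep results. It assumes each result has the shape used in the streaming example below (`timestamp` plus `results["emotions"]["result"]`). The helper names `summarise` and `transitions` and the output field names are hypothetical and chosen for illustration only; the schema DeepTone actually returns may differ.

```python
from collections import Counter
from itertools import groupby


def _label(step):
    # Assumed result shape, mirroring the streaming example
    return step["results"]["emotions"]["result"]


def summarise(time_series):
    """Fraction of timesteps per emotion label (illustrative schema)."""
    counts = Counter(_label(step) for step in time_series)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()} if total else {}


def transitions(time_series, output_period_ms):
    """Merge consecutive timesteps with the same label into segments."""
    segments = []
    for label, group in groupby(time_series, key=_label):
        steps = list(group)
        segments.append({
            "result": label,
            "timestamp_start": steps[0]["timestamp"],
            "timestamp_end": steps[-1]["timestamp"] + output_period_ms,
        })
    return segments


# Hand-made sample data, not real model output
sample = [
    {"timestamp": i * 1024, "results": {"emotions": {"result": label}}}
    for i, label in enumerate(["neutral", "neutral", "tired", "tired"])
]
print(summarise(sample))
print(transitions(sample, 1024))
```

The example below consumes results of the same per-timestep shape from a live stream.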
from collections import deque
from deeptone import Deeptone
from math import ceil
import pyaudio

# Set the required constants
VALID_LICENSE_KEY = None  # replace with your license key
OUTPUT_PERIOD_MS = 1024
CHUNK_SIZE = 1024  # frames per audio buffer
data_buffer = deque()  # raw bytes received from the microphone

assert VALID_LICENSE_KEY is not None, "Set the required constants"

# Initialise an audio stream
def writer_callback(in_data, frame_count, time_info, status):
    data_buffer.extend(in_data)
    return in_data, pyaudio.paContinue

pa = pyaudio.PyAudio()
stream = pa.open(
    format=pyaudio.paInt16,
    channels=1,
    rate=16000,
    input=True,
    frames_per_buffer=CHUNK_SIZE,
    stream_callback=writer_callback,
)
stream.start_stream()

def input_generator(buffer):
    # Each 16-bit sample takes 2 bytes, so one chunk is CHUNK_SIZE * 2 bytes
    while stream.is_active():
        while len(buffer) >= CHUNK_SIZE * 2:
            samples_read = [buffer.popleft() for _ in range(CHUNK_SIZE * 2)]
            yield bytes(samples_read)

# Initialise Deeptone
engine = Deeptone(license_key=VALID_LICENSE_KEY)
audio_generator = input_generator(data_buffer)

print("Listening to you ...")
output = engine.process_stream(
    input_generator=audio_generator,
    models=[engine.models.Emotions],
    output_period=OUTPUT_PERIOD_MS,
    volume_threshold=0.005
)
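
ts = None  # timestamp of the latest result; stays None until one arrives
try:
    # Inspect the result
    tired_counter = 0
    for ts_result in output:
        ts = ts_result["timestamp"]
        res = ts_result["results"]["emotions"]
        print(
            f'Timestamp: {ts}ms\tresult: {res["result"]}'
            f' with confidence {res["confidence"]}'
        )
        # Count consecutive "tired" results; any other result resets the count
        if res["result"] == "tired":
            tired_counter += 1
        else:
            tired_counter = 0
        # Warn after roughly 5 seconds of uninterrupted "tired" results
        if tired_counter >= 5 * ceil(1000 / OUTPUT_PERIOD_MS):
            print("\tYou seem tired. Take a break and get some rest!")
            tired_counter = 0
except KeyboardInterrupt:
    # Avoid a NameError/TypeError when interrupted before the first result
    processed_ms = 0 if ts is None else ts + OUTPUT_PERIOD_MS
    print(f"Congrats! You processed {round(processed_ms / 1000)}s of audio with Deeptone.")
    print("Goodbye!")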
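
The warning logic in the example above can be tried without a microphone or license key. The following is a minimal sketch that isolates the streak counting into a generator and feeds it hand-made results. The names `tired_warnings` and `fake_results` are hypothetical helpers, not part of DeepTone, and the result dictionaries only imitate the shape used in the example.

```python
from math import ceil


def tired_warnings(results, output_period_ms=1024, limit_s=5):
    """Yield the timestamp (ms) of each result at which a warning would fire."""
    # Number of consecutive "tired" results that cover about limit_s seconds
    threshold = limit_s * ceil(1000 / output_period_ms)
    streak = 0
    for item in results:
        if item["results"]["emotions"]["result"] == "tired":
            streak += 1
        else:
            streak = 0
        if streak >= threshold:
            yield item["timestamp"]
            streak = 0  # start counting again after a warning


def fake_results(labels, output_period_ms=1024):
    """Build hand-made results in the shape used by the example (not model output)."""
    return [
        {
            "timestamp": i * output_period_ms,
            "results": {"emotions": {"result": label, "confidence": 1.0}},
        }
        for i, label in enumerate(labels)
    ]


labels = ["tired"] * 4 + ["neutral"] + ["tired"] * 5 + ["happy"]
warnings = list(tired_warnings(fake_results(labels)))
print(warnings)  # → [9216]
```

The four initial "tired" results do not trigger a warning because the "neutral" result resets the streak; the warning fires only on the fifth consecutive "tired" result. Since the threshold is computed with `ceil(1000 / output_period_ms)`, a period of 1024 ms needs 5 results and a period of 512 ms needs 10, which is about 5.1 seconds in both cases.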