SpeakerMap Model with Voice Signatures

When combined with voice signatures created by the Voice Signatures feature, the SpeakerMap model can identify known speakers. New, unknown speakers are still detected as well. For example usage with the SDK, see the Voice Signatures Recipes page under Usage ➜ Recipes in the sidebar.

Voice signatures are created from audio containing known speakers. They can then be passed to the SpeakerMap model as an additional input.

Voice signatures are stored in an object that maps each speaker_id to a voice signature, which contains the base64-serialized speaker identity data:

{
    "pedro": {
        "version": 1,
        "data": "U3VwZXJkdXBlcm1lZ2FzdGFydm9pY2VzaWduYXR1cmVzMQ..." # base64 encoded voice signature
    },
    "mariyana": {
        "version": 1,
        "data": "T1RPIGlzIGdyZWF0ISBJdCdzIHRydWUu..." # base64 encoded voice signature
    },
}
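
As a minimal sketch of working with the mapping shown above, the following Python builds an object of the same shape and checks that each entry has an integer version and a base64-decodable data string. The helper names make_entry and validate_signatures are hypothetical and not part of the SDK, and the signature bytes are placeholders; real signatures come from the Voice Signatures feature.

```python
import base64
import binascii


def make_entry(raw: bytes, version: int = 1) -> dict:
    """Wrap raw bytes in the {"version", "data"} shape used by the mapping.

    Hypothetical helper for illustration only; not an SDK function.
    """
    return {"version": version, "data": base64.b64encode(raw).decode("ascii")}


def validate_signatures(signatures: dict) -> list:
    """Return a list of problems; an empty list means the mapping looks well-formed."""
    problems = []
    for speaker_id, entry in signatures.items():
        if not isinstance(entry, dict):
            problems.append(f"{speaker_id}: entry is not an object")
            continue
        if not isinstance(entry.get("version"), int):
            problems.append(f"{speaker_id}: missing or non-integer 'version'")
        data = entry.get("data")
        if not isinstance(data, str):
            problems.append(f"{speaker_id}: missing 'data' string")
            continue
        try:
            # validate=True rejects characters outside the base64 alphabet.
            base64.b64decode(data, validate=True)
        except (binascii.Error, ValueError):
            problems.append(f"{speaker_id}: 'data' is not valid base64")
    return problems


# Placeholder bytes stand in for real voice signature data (assumption).
voice_signatures = {
    "pedro": make_entry(b"placeholder-bytes-for-pedro"),
    "mariyana": make_entry(b"placeholder-bytes-for-mariyana"),
}
```

Running such a check before passing the mapping to the model can catch truncated or corrupted signature strings early. It only verifies the encoding, not that the decoded bytes are a usable signature.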

As when used without voice signatures, the SpeakerMap model classifies audio with a speaker label. For known speakers, it returns the speaker_id of the matching voice signature. For unknown speakers, it returns a default label (speaker_1, speaker_2, etc.). If the speaker could not be identified, unknown is returned.
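
The three kinds of labels described here can be told apart in client code. The sketch below is an illustration only: the function name classify_label and the category names are hypothetical, and it assumes default labels always have the form speaker_ followed by digits.

```python
import re

# Assumed pattern for default labels such as speaker_1, speaker_2, ...
DEFAULT_LABEL = re.compile(r"speaker_\d+")


def classify_label(label: str, known_ids: set) -> str:
    """Sort a returned speaker label into one of the categories described above.

    Hypothetical helper for illustration only; not an SDK function.
    """
    if label in known_ids:
        return "known"          # speaker_id of a supplied voice signature
    if label == "unknown":
        return "unidentified"   # speaker could not be identified
    if DEFAULT_LABEL.fullmatch(label):
        return "new"            # unknown speaker with a default label
    return "unexpected"         # anything that fits none of the above


known_ids = {"pedro", "mariyana"}
labels = ["pedro", "speaker_1", "unknown", "mariyana", "speaker_2"]
categories = [classify_label(label, known_ids) for label in labels]
```

Known identifiers are checked first, so a speaker_id that happens to look like a default label is still treated as known. Choosing speaker_id values that cannot collide with speaker_N or unknown avoids that ambiguity.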

All other outputs of the SpeakerMap model remain the same as when used without voice signatures. The only difference is that the speaker labels for known speakers can be user-defined identifiers.