Add ElevenLabs tagging of speakers to Speech to Text Block

royden · August 12, 2025, 3:56am

What problem does this feature request solve?

The speech to text block returns a large block of text without clearly flagging speakers.

What is the use case for this feature?

We analyse large blocks of audio from interviews, discussions, focus groups, etc. To analyse these effectively it is important to know when there is a change of speaker and who says what (speaker 1, speaker 2, etc).

Please describe the functionality of this feature request.

In the Speech to Text block, insert speaker tags into the text returned by ElevenLabs Scribe, or return the JSON that contains these details.

GPT-4o-transcribe is limited because it does not accept long audio files. Scribe does accept them, but the block currently returns only the text output, not the details of who was speaking.

sean · August 12, 2025, 8:10pm

Great idea! The block has been updated so that Scribe now has an option called “Include Speakers”. When it is set to yes, the response is returned in a subtitle-style format:

00:00:00,000 --> 00:00:04,739 [speaker_0]
...built a professional services firm to 35 million in revenue.

00:00:05,039 --> 00:00:10,259 [speaker_1]
Let me guess, spreadsheet guy who thought he could DIY his exit strategy.

00:00:10,439 --> 00:00:13,800 [speaker_0]
You know it, Michael. Classic case of "I built this bus-
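
For anyone post-processing this output for analysis, here is a minimal Python sketch that splits the text into speaker turns. It assumes the format matches the sample above exactly (a `start --> end [speaker_N]` header line followed by one or more lines of text); the `Turn` class and both function names are hypothetical helpers for illustration, not part of the block.

```python
import re
from dataclasses import dataclass

# Header pattern inferred from the sample above, e.g.
# "00:00:00,000 --> 00:00:04,739 [speaker_0]". This is an assumption:
# adjust it if the real output differs.
HEADER = re.compile(
    r"^(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3}) \[(\w+)\]$"
)


@dataclass
class Turn:
    start: str    # start timestamp, kept as the original string
    end: str      # end timestamp, kept as the original string
    speaker: str  # speaker label, e.g. "speaker_0"
    text: str     # spoken text for this segment


def parse_turns(transcript: str) -> list[Turn]:
    """Split subtitle-style text into one Turn per header line."""
    turns: list[Turn] = []
    for raw in transcript.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines only separate segments
        match = HEADER.match(line)
        if match:
            turns.append(Turn(match.group(1), match.group(2), match.group(3), ""))
        elif turns:
            # Any non-header line belongs to the most recent segment.
            current = turns[-1]
            current.text = f"{current.text} {line}".strip()
    return turns


def merge_consecutive(turns: list[Turn]) -> list[Turn]:
    """Join adjacent segments from the same speaker into a single turn."""
    merged: list[Turn] = []
    for turn in turns:
        if merged and merged[-1].speaker == turn.speaker:
            prev = merged[-1]
            merged[-1] = Turn(
                prev.start, turn.end, prev.speaker, f"{prev.text} {turn.text}".strip()
            )
        else:
            merged.append(turn)
    return merged


if __name__ == "__main__":
    sample = (
        "00:00:00,000 --> 00:00:04,739 [speaker_0]\n"
        "...built a professional services firm to 35 million in revenue.\n"
        "\n"
        "00:00:05,039 --> 00:00:10,259 [speaker_1]\n"
        "Let me guess, spreadsheet guy who thought he could DIY his exit strategy.\n"
    )
    for turn in merge_consecutive(parse_turns(sample)):
        print(f"{turn.speaker} ({turn.start} to {turn.end}): {turn.text}")
```

Keeping the timestamps as strings avoids guessing at how they should be rounded; merging consecutive segments is optional and mainly useful when you want one row per change of speaker.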

Let me know if that fixes it for you!

royden · August 12, 2025, 11:08pm

Yes, that is perfect. Thank you so much for the quick response.

Regards,

Royden
