More resources on speech-to-text solutions:
@roland.alton could you try to figure out how complex such a solution would be and how many hours would be appropriate for implementing such a feature?
Regarding accessibility, we would like to integrate a service that automatically transcribes the spoken words in a video call into text.
That text should be automatically made available using the Closed Captions Feature.
We suggest integrating the easiest possible solution, because data security is not the top priority for this feature. Google offers a streaming service for this: https://cloud.google.com/speech-to-text/docs/streaming-recognize?hl=de#performing_streaming_speech_recognition_on_an_audio_stream
Or a service that currently uses the Web Speech API as implemented in Google Chrome: https://webcaptioner.com
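
To help with the estimate, here is a rough sketch of what the Web Speech API route could look like in the browser. It is only an illustration under assumptions: the `Caption` shape and the `onCaption` callback are hypothetical stand-ins for whatever the Closed Captions Feature actually expects, and the default language is a guess. As far as I know, the API listens to the local microphone, so each participant would probably transcribe their own speech and send the text to the others.

```typescript
// Hypothetical caption shape; not taken from any existing Closed Captions API.
interface Caption {
  text: string;
  isFinal: boolean;
}

// Minimal structural view of one entry in the Web Speech API result list.
interface ResultLike {
  isFinal: boolean;
  0: { transcript: string };
}

// Collapse the results reported since `resultIndex` into one caption line.
function toCaption(results: ArrayLike<ResultLike>, resultIndex: number): Caption {
  let text = "";
  let isFinal = true;
  for (let i = resultIndex; i < results.length; i++) {
    text += results[i][0].transcript; // first alternative of each result
    isFinal = isFinal && results[i].isFinal;
  }
  return { text: text.trim(), isFinal };
}

// Start recognition on the local microphone and forward captions to `onCaption`.
// Returns a stop function, or null when the browser has no Web Speech API support.
function startCaptioning(
  onCaption: (caption: Caption) => void,
  lang: string = "en-US" // assumption: the real language would come from user settings
): (() => void) | null {
  const g = globalThis as any;
  // Chrome exposes the constructor with a webkit prefix.
  const Recognition = g.SpeechRecognition ?? g.webkitSpeechRecognition;
  if (!Recognition) {
    return null;
  }
  const recognition = new Recognition();
  recognition.continuous = true;     // keep listening across pauses
  recognition.interimResults = true; // deliver partial text while the user speaks
  recognition.lang = lang;
  recognition.onresult = (event: any) => {
    onCaption(toCaption(event.results, event.resultIndex));
  };
  recognition.start();
  return () => recognition.stop();
}
```

Usage would be something like `const stop = startCaptioning(c => sendToClosedCaptions(c))`, where `sendToClosedCaptions` is a placeholder for the actual integration point. The Google Cloud streaming option would instead need a server-side component that receives the audio, which is likely the larger part of the effort.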