CR: Automatically adjust audio signal for listeners of a translation
URL
https://dev.fairteaching.net/b/den-gwm-s1c-hlp
Current state
- As a participant of a conference with translations I can choose which translation I would like to hear
- Given a conference with two languages (e.g. German and Russian) - any of those languages may be spoken in the main room
- As a participant who only speaks Russian I need to choose the Russian translation
- Now when somebody speaks German in the main room, I hear the Russian translation
- But when somebody speaks Russian in the main room, I hear nothing, because the translator does not speak in the translation channel
Desired state
There are a couple of possible solutions to this scenario.
The most important requirement above all is: the user should not have to react to the changing languages. This is important because the client wants to be able to stream a conference (using a virtual participant), and while streaming it is not possible to interact with the UI once it has been set up.
- Solution 1: Automatically adjust volume when there is nothing going on in the translation channel
- As a participant you always listen to both the main room and the translation
- When the translation has output, play the translation at a high volume and keep the main room in the background at a very low volume
- When the translation currently has no output, turn down the volume of the translation and turn up the volume of the main room
- So essentially this comes down to analyzing the audio signals and adjusting the volumes accordingly.
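A minimal sketch of solution 1, assuming browser Web Audio API; the RMS threshold, gain values, and polling interval below are placeholder assumptions, not tested values:

```javascript
// Sketch of solution 1: duck the main room while the translation carries signal.
// Threshold and volume levels are assumptions and would need tuning.

// Pure decision logic: given the RMS level of the translation channel,
// return the target gains for both channels.
function targetGains(translationRms, threshold = 0.02) {
  if (translationRms > threshold) {
    // Translator is speaking: translation loud, main room in the background.
    return { translation: 1.0, mainRoom: 0.1 };
  }
  // Translation is silent: fall back to the main room audio.
  return { translation: 0.0, mainRoom: 1.0 };
}

// Browser-only wiring (not executed here): route both streams through
// GainNodes and poll an AnalyserNode on the translation stream.
function attachDucking(audioCtx, translationStream, mainRoomStream) {
  const analyser = audioCtx.createAnalyser();
  analyser.fftSize = 2048;

  const translationSrc = audioCtx.createMediaStreamSource(translationStream);
  const mainRoomSrc = audioCtx.createMediaStreamSource(mainRoomStream);
  const translationGain = audioCtx.createGain();
  const mainRoomGain = audioCtx.createGain();

  translationSrc.connect(analyser);
  translationSrc.connect(translationGain).connect(audioCtx.destination);
  mainRoomSrc.connect(mainRoomGain).connect(audioCtx.destination);

  const samples = new Float32Array(analyser.fftSize);
  setInterval(() => {
    analyser.getFloatTimeDomainData(samples);
    const rms = Math.sqrt(
      samples.reduce((sum, x) => sum + x * x, 0) / samples.length
    );
    const gains = targetGains(rms);
    // Ramp the gains instead of jumping, to avoid audible clicks.
    translationGain.gain.setTargetAtTime(gains.translation, audioCtx.currentTime, 0.3);
    mainRoomGain.gain.setTargetAtTime(gains.mainRoom, audioCtx.currentTime, 0.3);
  }, 100);
}
```

In practice some hysteresis (e.g. only switching back to the main room after the translation has been silent for a second or two) would be needed so short pauses in the interpreter's speech don't cause the volumes to flap.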
- Solution 2: Play the translation when any translator is unmuted; play the main audio when all translators are muted.
- The desired behavior is similar to the one above, but this time you don't analyze the audio signal of the translation.
- Instead you will need a way to count the unmuted speakers in an audio channel and adjust the volumes based on that.
- It requires the possibility to mute/unmute oneself in a translation channel, see proposal in #12 (closed)
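Solution 2 could then look like the following sketch; the shape of the translator objects and their `muted` flag are assumptions about what the conferencing backend exposes:

```javascript
// Sketch of solution 2: switch channels based on translator mute state
// instead of signal analysis. The translator objects with a `muted` flag
// are an assumed shape, not a real API.

function countUnmuted(translators) {
  return translators.filter((t) => !t.muted).length;
}

// Decide which channel the listener should hear.
function selectGains(translators) {
  if (countUnmuted(translators) > 0) {
    // At least one translator is unmuted: play the translation.
    return { translation: 1.0, mainRoom: 0.1 };
  }
  // All translators are muted: fall back to the main room audio.
  return { translation: 0.0, mainRoom: 1.0 };
}
```

The volume switch would be triggered by mute/unmute events from the channel rather than by polling an audio analyser, which is what makes this variant independent of signal processing.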
The second solution would technically be safer, because it relies on an explicit decision by the interpreters rather than on audio signal processing.
Resources
- Maybe the Web Audio API AnalyserNode can help for solution 1: https://developer.mozilla.org/en-US/docs/Web/API/AnalyserNode
Edited by Kollotzek Markus