CR: Automatically adjust audio signal for listeners of a translation

URL

https://dev.fairteaching.net/b/den-gwm-s1c-hlp

Current state

  • As a participant of a conference with translations I can choose which translation I would like to hear
  • Given a conference with two languages (e.g. German and Russian), either of those languages may be spoken in the main room
  • As a participant who only speaks Russian I need to choose the Russian translation
  • Now when somebody speaks German in the main room, I hear the Russian translation
  • But when somebody speaks Russian in the main room, I hear nothing, because the translator does not speak in the translation channel

Desired state

There are a couple of possible solutions to this scenario.

The most important requirement above all is: the user should not have to react to the changing languages. This is so important because the client would like to be able to stream a conference (using a virtual participant), and while streaming it is not possible to interact with the UI once it has been set up.

  • Solution 1: Automatically adjust volume when there is nothing going on in the translation channel
    • As a participant you always listen to both the main room and the translation
    • When the translation has any output, play the translation at a high volume, but keep the main room in the background at a very low volume
    • When the translation currently has no output, turn down the volume of the translation and turn up the volume of the main room
    • So this is essentially about analyzing the audio signals and adjusting the volumes accordingly (see the sketch after this list)
  • Solution 2: Play the translation when any translator is unmuted; play the main audio when all translators are muted
    • The desired behavior is similar to the one above, but this time you do not analyze the audio level of the translation.
    • Instead you need a way to count the number of unmuted speakers in an audio channel and adjust the volumes based on that.
    • This requires the ability to mute/unmute oneself in a translation channel; see the proposal in #12 (closed)
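
A minimal sketch of how Solution 1 could look in a browser-based client, assuming the main room and the translation channel are already available as separate MediaStreams; the thresholds, timings and volume levels are illustrative placeholders, not values agreed in this CR:

```typescript
// Cross-fades the main room and the translation channel based on whether the
// translation channel currently carries an audible signal (Web Audio API).
// All constants are assumptions for illustration only.
const FULL_VOLUME = 1.0;
const DUCKED_VOLUME = 0.15;     // background volume for the channel that is "off"
const SILENCE_THRESHOLD = 0.01; // RMS below which the translation counts as silent
const HOLD_MS = 1500;           // keep the translation up during short pauses

function setupAutoDucking(mainRoomStream: MediaStream, translationStream: MediaStream): void {
  const ctx = new AudioContext();

  const mainGain = ctx.createGain();
  const translationGain = ctx.createGain();
  ctx.createMediaStreamSource(mainRoomStream).connect(mainGain).connect(ctx.destination);
  const translationSource = ctx.createMediaStreamSource(translationStream);
  translationSource.connect(translationGain).connect(ctx.destination);

  // The analyser taps the translation channel to detect whether anyone speaks there.
  const analyser = ctx.createAnalyser();
  analyser.fftSize = 1024;
  translationSource.connect(analyser);

  const samples = new Float32Array(analyser.fftSize);
  let lastActive = 0;

  const update = () => {
    analyser.getFloatTimeDomainData(samples);
    const rms = Math.sqrt(samples.reduce((sum, s) => sum + s * s, 0) / samples.length);

    if (rms > SILENCE_THRESHOLD) lastActive = performance.now();
    const translationActive = performance.now() - lastActive < HOLD_MS;

    // Smooth ramps avoid audible clicks when the volumes flip.
    const now = ctx.currentTime;
    mainGain.gain.setTargetAtTime(translationActive ? DUCKED_VOLUME : FULL_VOLUME, now, 0.2);
    translationGain.gain.setTargetAtTime(translationActive ? FULL_VOLUME : DUCKED_VOLUME, now, 0.2);

    requestAnimationFrame(update);
  };
  update();
}
```

The hold time is there so the main room does not pop back up during every short pause of the interpreter; how long it should be, and what counts as "silent", would have to be tuned with real interpreters.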

The second solution would technically be safer, because it relies on an explicit decision by the interpreters rather than on audio signal processing. A sketch of how the translators' mute state could drive the volumes follows below.
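
A minimal sketch of Solution 2, assuming the client can observe the participants of a translation channel together with their mute state; the TranslationChannelUser shape and the applyVolumes helper are hypothetical names, not an existing API:

```typescript
// Switches between main room and translation based on the interpreters' mute state
// rather than on audio analysis. The data shape below is an assumption.
interface TranslationChannelUser {
  userId: string;
  isInterpreter: boolean;
  muted: boolean;
}

const FULL_VOLUME = 1.0;
const DUCKED_VOLUME = 0.15;

// Intended to be called whenever the mute state in the translation channel changes.
function applyVolumes(
  users: TranslationChannelUser[],
  mainGain: GainNode,
  translationGain: GainNode,
  ctx: AudioContext,
): void {
  // As soon as at least one interpreter is unmuted, the translation takes over.
  const interpreterUnmuted = users.some((u) => u.isInterpreter && !u.muted);

  const now = ctx.currentTime;
  mainGain.gain.setTargetAtTime(interpreterUnmuted ? DUCKED_VOLUME : FULL_VOLUME, now, 0.2);
  translationGain.gain.setTargetAtTime(interpreterUnmuted ? FULL_VOLUME : DUCKED_VOLUME, now, 0.2);
}
```

Compared to Solution 1, the only input is the mute state, so no audio analysis (and no threshold tuning) is needed; the prerequisite from #12 is that interpreters can actually mute/unmute themselves in the translation channel.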

Resources
