Enable speech to text

How to train models: https://github.com/german-asr/kaldi-german

The duplicate string problem should have been fixed: https://github.com/alphacep/vosk-server/issues/89

Jitsi has developed Skynet, which uses Faster Whisper for transcription: https://github.com/jitsi/skynet. Skynet is an API server for various AI services and also offers summaries.

Docs: https://github.com/jitsi/skynet/blob/master/docs/streaming_whisper_module.md

For a Jitsi videoconference, Skynet is called from a Jigasi server (which uses Kaldi models by default); see https://community.jitsi.org/t/jigasi-whisper-setup/130312. We have not yet deployed Jigasi on k8s (https://git.fairkom.net/hosting/k8s-deployments/fairmeeting), but that should be easy.

@oliver.maklott @chagai.friedlander @jan.krammer

mentioned in issue hosting/conf/bbb/fairblue3#4

We have started adding a jigasi.yaml for fairmeeting staging in a branch: https://git.fairkom.net/hosting/k8s-deployments/fairmeeting/-/commits/feat/transcription

Here is the ChatGPT guide for using Jigasi with Whisper.

To deploy Jigasi and Skynet in Kubernetes using Kustomize and enable transcription using Whisper, you'll follow a systematic approach to:

- Set up Jigasi to handle audio transcription.
- Deploy a Skynet API server to handle Whisper transcription requests (or connect Jigasi to an existing Skynet API endpoint).
- Use Kustomize to manage the Kubernetes deployment, allowing you to configure environments like production or staging.

Steps Overview:

Jigasi Deployment Configuration in Kubernetes:
- Set up Jigasi to run as a container in your Kubernetes cluster.
- Configure Jigasi for transcription (integrating Skynet Whisper API for transcription).
Skynet API Deployment (or connection to an existing API):
- Deploy Skynet if you're running it yourself, or point Jigasi to the external Skynet API server.
Configure Kustomize:
- Use Kustomize to manage overlays for different environments (e.g., production and staging).

Let’s break it down in detail:

1. Jigasi Deployment in Kubernetes

You need to define a Kubernetes Deployment for Jigasi with appropriate environment variables and configurations to enable transcription. Here's how you can configure the Jigasi deployment YAML file:

`base/jigasi-deployment.yaml`:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: jigasi
  labels:
    app: jigasi
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jigasi
  template:
    metadata:
      labels:
        app: jigasi
    spec:
      containers:
      - name: jigasi
        image: jitsi/jigasi:latest
        env:
        - name: JIGASI_SIP_URI
          value: "sip:example@domain.com"  # SIP URI for the service
        - name: JIGASI_XMPP_SERVER
          value: "xmpp-server.jitsi.example.com"  # XMPP server for Jitsi Meet
        - name: JIGASI_TRANSCRIBE
          value: "true"  # Enable transcription
        - name: JIGASI_TRANSCRIPTION_SERVICE
          value: "SkynetTranscriptionService"  # Custom transcription service for Skynet API
        - name: SKYNET_API_ENDPOINT
          value: "<Skynet API endpoint>"  # Endpoint for Skynet API server
        - name: SKYNET_API_KEY
          valueFrom:
            secretKeyRef:
              name: skynet-api-key
              key: apiKey  # Reference secret for Skynet API key
        ports:
        - name: sip
          containerPort: 5060  # SIP port
        - name: http
          containerPort: 8080  # HTTP port for health checks, etc.

In this configuration:

- JIGASI_TRANSCRIBE is set to true to enable transcription.
- JIGASI_TRANSCRIPTION_SERVICE points to a custom transcription service that integrates with the Skynet API.
- SKYNET_API_ENDPOINT and SKYNET_API_KEY configure access to the Skynet Whisper API.

These variable names come from the generated guide and should be checked against the documentation of the Jigasi image in use before relying on them.

2. Skynet API Deployment

If you're running your own Skynet API for Whisper transcription, you can deploy it as a service in Kubernetes.

`base/skynet-deployment.yaml`:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: skynet
  labels:
    app: skynet
spec:
  replicas: 1
  selector:
    matchLabels:
      app: skynet
  template:
    metadata:
      labels:
        app: skynet
    spec:
      containers:
      - name: skynet-api
        image: skynet-api:latest  # Your Skynet API image
        ports:
        - name: http
          containerPort: 8000
        env:
        - name: SKYNET_WHISPER_MODEL
          value: "fast-whisper"  # Set to use the fast Whisper model

This sets up Skynet to expose an API that Jigasi will use for transcription.
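
A Deployment alone does not give Jigasi a stable address for Skynet; inside the cluster that normally takes a Service. The following is a minimal sketch, assuming the `app: skynet` label and the `http` port name from the Deployment above. The file name `base/skynet-service.yaml` is hypothetical and would have to be added to the `resources` list of the base `kustomization.yaml`.

```yaml
# base/skynet-service.yaml (hypothetical file name)
apiVersion: v1
kind: Service
metadata:
  name: skynet
  labels:
    app: skynet
spec:
  selector:
    app: skynet          # matches the pod label of the skynet Deployment
  ports:
  - name: http
    port: 8000           # port other pods connect to
    targetPort: http     # named container port (8000) in the Deployment
```

With such a Service in place, pods in the same namespace could reach Skynet under the host name `skynet` on port 8000, which is the kind of value SKYNET_API_ENDPOINT would point at.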

3. Kustomize Configuration

You’ll use Kustomize to manage the deployment of Jigasi and Skynet across different environments (e.g., staging, production).

Base `kustomization.yaml`:

Create a base kustomization.yaml file in the base/ directory, which includes both the Jigasi and Skynet resources:

resources:
  - jigasi-deployment.yaml
  - skynet-deployment.yaml

Overlays

Create overlays for different environments. For example, let’s say you want to have different configurations for production and staging.

`overlays/production/kustomization.yaml`:

resources:
  - ../../base

# "patches" replaces the deprecated "patchesStrategicMerge" field
patches:
  - path: production-jigasi-patch.yaml
  - path: production-skynet-patch.yaml

This kustomization.yaml will apply the base resources but also include production-specific patches.

`overlays/production/production-jigasi-patch.yaml`:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: jigasi
spec:
  replicas: 3  # Set more replicas for production
  template:
    spec:
      containers:
      - name: jigasi
        env:
        - name: JIGASI_SIP_URI
          value: "sip:prod@domain.com"  # Production SIP URI
        - name: JIGASI_XMPP_SERVER
          value: "xmpp-server-prod.jitsi.example.com"

`overlays/production/production-skynet-patch.yaml`:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: skynet
spec:
  replicas: 2  # More replicas for Skynet API in production

These patches update Jigasi and Skynet for production: both get more replicas, and Jigasi also gets production-specific SIP and XMPP values.

Optional: Staging Overlay

Similarly, you can create a staging overlay:

`overlays/staging/kustomization.yaml`:

resources:
  - ../../base

# "patches" replaces the deprecated "patchesStrategicMerge" field
patches:
  - path: staging-jigasi-patch.yaml

The staging-jigasi-patch.yaml file referenced here holds staging-specific settings such as fewer replicas, different API keys, or alternate endpoints.
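
The guide does not spell out the staging patch itself. A minimal sketch of what `overlays/staging/staging-jigasi-patch.yaml` might contain, mirroring the production patch; the host name and SIP URI are placeholder values, not taken from a real setup:

```yaml
# overlays/staging/staging-jigasi-patch.yaml (illustrative values only)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jigasi
spec:
  replicas: 1  # a single replica is usually enough for staging
  template:
    spec:
      containers:
      - name: jigasi
        env:
        - name: JIGASI_SIP_URI
          value: "sip:staging@domain.com"  # placeholder staging SIP URI
        - name: JIGASI_XMPP_SERVER
          value: "xmpp-server-staging.jitsi.example.com"  # placeholder host
```

Only the fields that differ from the base need to appear; Kustomize merges the patch into the base Deployment by matching kind, name, and container name.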

Running the Deployments with Kustomize

Once everything is set up, you can deploy Jigasi and Skynet to your Kubernetes cluster.

For production:

kubectl apply -k overlays/production

For staging:

kubectl apply -k overlays/staging

4. Custom Transcription Service for Skynet

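To enable Jigasi to use Skynet for transcription, Jigasi needs a transcription service that talks to Skynet. The upstream Jigasi code base appears to ship a Whisper transcription service for this purpose already (see the Jigasi Whisper setup thread linked above), so check that before writing your own. Here’s a simplified workflow for how it might work: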

1. Jigasi streams audio from a conference.
2. The audio is sent to Skynet’s Whisper API for transcription.
3. Skynet returns the transcription result, which is forwarded back to Jitsi Meet for display.

You’ll need to develop or configure Jigasi to send audio streams to the Skynet API using the environment variables provided in the Jigasi deployment.

For example, you can modify Jigasi’s internal transcription handler to use the Skynet API:

// Illustrative only: TranscriptionService is assumed here to be a simplified
// interface with a single transcribeAudio method, not Jigasi's actual interface.
public class SkynetWhisperTranscriptionService implements TranscriptionService {
    @Override
    public String transcribeAudio(byte[] audioData) {
        return sendAudioToSkynet(audioData);
    }

    private String sendAudioToSkynet(byte[] audioData) {
        // 1. Convert audioData into the format Skynet expects.
        // 2. Send it to the Skynet API (see SKYNET_API_ENDPOINT).
        // 3. Return the transcription from the response.
        throw new UnsupportedOperationException("Skynet call not implemented yet");
    }
}

5. Secrets Management (Optional)

For better security, use Kubernetes Secrets to store sensitive data like the Skynet API key. Define a secret in your Kubernetes cluster and reference it in your Jigasi deployment.

Create Secret:

kubectl create secret generic skynet-api-key --from-literal=apiKey=YOUR_API_KEY

Reference Secret in `jigasi-deployment.yaml`:

env:
- name: SKYNET_API_KEY
  valueFrom:
    secretKeyRef:
      name: skynet-api-key
      key: apiKey
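
As an alternative to creating the Secret by hand, Kustomize can generate it as part of the overlay. This is a sketch under assumptions: the file `skynet.env` is hypothetical, would contain a single line `apiKey=YOUR_API_KEY`, and should stay out of version control.

```yaml
# Addition to an overlay's kustomization.yaml (sketch)
secretGenerator:
- name: skynet-api-key
  envs:
  - skynet.env   # hypothetical file with the line: apiKey=YOUR_API_KEY
```

Kustomize appends a content hash to the generated Secret's name and rewrites the `secretKeyRef` in the Deployment to match, so a changed key also triggers a rollout of the pods that use it.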

Conclusion

By using Kustomize to deploy both Jigasi and Skynet, you can flexibly manage different environments and easily scale or reconfigure your deployment as needed. Integrating Jigasi with Skynet for transcription involves configuring both services to communicate, and once this is complete, you’ll have real-time transcription enabled for your Jitsi conferences using Whisper.

Some more answers from AI:

How to create the xmpp jigasi user and password

Yes, you need to create an XMPP user in Prosody whose password matches JIGASI_XMPP_PASSWORD, and make sure the account is persistent so that it survives container restarts and upgrades. Below are the steps to achieve this:

1. Create the XMPP User in Prosody

In a typical Jitsi setup, Prosody is used as the XMPP server. You can use the prosodyctl command to add a user.

Inside the Prosody Container

Access the Prosody container:
```
kubectl exec -it <prosody-pod-name> -- /bin/bash
```
Replace <prosody-pod-name> with the name of the Prosody pod running in your Kubernetes cluster.
Run the prosodyctl command to create a new user:
```
prosodyctl register jigasi xmpp-server.example.com YOUR_PASSWORD
```
- Replace jigasi with the username you want to use for Jigasi.
- Replace xmpp-server.example.com with the domain of your Prosody server (usually matches the value of JIGASI_XMPP_SERVER).
- Replace YOUR_PASSWORD with the desired password that matches the value in JIGASI_XMPP_PASSWORD.
Exit the container after the user is created:
```
exit
```

2. Make the XMPP User Persistent

By default, Prosody stores user data as files under its data directory; a database backend is optional. For persistence across container restarts or updates, ensure that Prosody’s data directory is backed by a persistent volume.

a. Verify Persistent Storage for Prosody

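Check whether the Prosody data is stored on a persistent volume in Kubernetes. The data directory (/var/lib/prosody by default) holds the per-host account files; the configuration file prosody.cfg.lua is kept separately, typically under /etc/prosody.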

If Prosody uses a persistent volume, verify it in the deployment YAML file or Helm chart:

# under spec.template.spec.containers[]:
volumeMounts:
- name: prosody-data
  mountPath: /var/lib/prosody
# under spec.template.spec:
volumes:
- name: prosody-data
  persistentVolumeClaim:
    claimName: prosody-data-pvc

Ensure that prosody-data-pvc or a similar volume claim is set up to store the /var/lib/prosody directory persistently.
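
If no such claim exists yet, it could be declared along these lines. This is a minimal sketch: the claim name matches the fragment above, while the size and access mode are assumptions and the cluster's default storage class is used.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prosody-data-pvc
spec:
  accessModes:
  - ReadWriteOnce        # assumed: a single Prosody pod mounts the volume
  resources:
    requests:
      storage: 1Gi       # assumed size; account data is small
```

With ReadWriteOnce the volume can only be mounted read-write by one node at a time, which fits a single-replica Prosody deployment.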

b. Backup or Restore the User

If you need to back up or restore the user data manually:

Export Prosody’s data directory:

kubectl cp <prosody-pod-name>:/var/lib/prosody ./prosody-data

To restore, copy it back to the container:

kubectl cp ./prosody-data <prosody-pod-name>:/var/lib/prosody

3. Automate User Creation (Optional)

If you want to automate the creation of the jigasi user, you can include the prosodyctl command in the Prosody container’s startup script. Modify the container’s entrypoint to check if the user exists, and create it if it doesn’t.

Example Startup Script:

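#!/bin/bash
# Create the Jigasi user if no account file exists yet.
# Assumes Prosody's default file-based storage under /var/lib/prosody,
# where each host has an accounts/ directory with one <user>.dat file.
if ! ls /var/lib/prosody/*/accounts/jigasi.dat >/dev/null 2>&1; then
    echo "Creating Jigasi XMPP user..."
    prosodyctl register jigasi xmpp-server.example.com YOUR_PASSWORD
fi

# Start Prosody
exec prosody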

Mount this script into the container as an executable file, or build a custom Prosody image with this script as the entrypoint.
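
One way to mount the script without building a custom image is a ConfigMap volume. The patch below is a sketch under assumptions: the Deployment and container are both named `prosody`, and a ConfigMap called `prosody-scripts` (hypothetical) holds the script under the key `start.sh`.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prosody                      # assumed Deployment name
spec:
  template:
    spec:
      containers:
      - name: prosody                # assumed container name
        command: ["/scripts/start.sh"]
        volumeMounts:
        - name: prosody-scripts
          mountPath: /scripts
      volumes:
      - name: prosody-scripts
        configMap:
          name: prosody-scripts      # hypothetical ConfigMap containing start.sh
          defaultMode: 0755          # make the mounted script executable
```

Setting `command` replaces the image's own entrypoint, so it is worth checking what that entrypoint does before overriding it.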

4. Testing

After creating the user, ensure that Jigasi can authenticate with Prosody:

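Verify the user exists (assuming Prosody’s default file-based storage, where each account is a .dat file in the host’s accounts directory):

kubectl exec <prosody-pod-name> -- sh -c 'ls /var/lib/prosody/*/accounts/'

You should see jigasi.dat listed.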

Restart the Jigasi deployment to test authentication:
```
kubectl rollout restart deployment/jigasi
```
Check Jigasi logs for successful authentication:
```
kubectl logs -l app=jigasi
```
Look for log messages indicating that Jigasi connected successfully to the XMPP server.

Conclusion

To make the Jigasi XMPP user persistent:

- Create the user using prosodyctl inside the Prosody container.
- Ensure Prosody’s data directory is backed by a Kubernetes persistent volume.
- Optionally, automate user creation in the Prosody container’s startup script.

This setup ensures the XMPP user exists consistently and supports Jigasi authentication seamlessly.
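
On the Jigasi side, the same password has to reach the container as JIGASI_XMPP_PASSWORD. A sketch of the corresponding fragment for the Jigasi Deployment, assuming the password is stored in a Secret named `jigasi-xmpp` under the key `password` (both names are hypothetical):

```yaml
# under spec.template.spec.containers[] of the jigasi Deployment
env:
- name: JIGASI_XMPP_PASSWORD
  valueFrom:
    secretKeyRef:
      name: jigasi-xmpp   # hypothetical Secret name
      key: password       # hypothetical key holding the XMPP password
```

Keeping the password in one Secret avoids the value registered in Prosody and the value used by Jigasi drifting apart.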

assigned to @suman and unassigned @roland.alton

Jigasi and vosk pods are starting without errors in fairmeeting-staging namespace.

We added related configs in https://git.fairkom.net/hosting/k8s-deployments/fairmeeting/-/blob/feat/transcription/k8s/staging/config/web/custom-config.js

When we start transcription in a room at https://fairmeeting.staging.osalliance.com, we see the following in the console:

<iq xmlns="jabber:client" to="ba96ea46-75cb-42f9-9159-696a2cf1454d@fairmeeting.net/31klTixIGc4Q" id="851463bc-bca5-478c-9968-791baa4f4c8b:sendIQ" type="error" xml:lang="en-US" from="suman@muc.fairmeeting.net/focus"><error type="cancel"><feature-not-implemented xmlns="urn:ietf:params:xml:ns:xmpp-stanzas"/></error><dial xmlns="urn:xmpp:rayo:1" to="jitsi_meet_transcribe" from="fromnumber"/></iq>

It looks like Jitsi Meet web sends the Rayo dial request for jitsi_meet_transcribe, but the focus replies with feature-not-implemented, so Jigasi is not yet answering the transcription request properly.
