ARI + Stasis + AsterNET + Vosk

Hi there.

Has anyone tried setting up speech recognition on a live call (not an IVR) using the tech in the title?
I’m having issues wiring it all together.

I have a C# app using AsterNET that is connected to FreePBX and listening for events.
I also have a Vosk server running in Docker on the same machine as FreePBX.
What I want is, when an extension is called and picked up, to stream the call audio to the Vosk server.

Currently, I can call the extension and OnStasisStartEvent fires, but the extension I’m calling never rings.

My extensions_custom.conf is simple:

[from-internal-custom]
exten => 100,1,Stasis(StreamToVosk)
;same => n,Dial(PJSIP/${EXTEN}) ; even with this uncommented, the extension is still not called
;same => n,Stasis(StreamToVosk)
;same => n,Hangup()
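
A note on the commented-out lines: Stasis() does not return to the dialplan until the ARI application releases the channel (or the channel hangs up), so a Dial() placed after it will not run while the app is holding the call. One option, sketched below, is to pass the dialed extension to the app as a Stasis argument and let the app ring the target itself; the argument then appears in the args of the StasisStart event. The NoOp line and the choice of argument are illustrative assumptions, not a tested configuration.

```
[from-internal-custom]
exten => 100,1,NoOp(Handing ${EXTEN} to the StreamToVosk ARI app)
 same => n,Stasis(StreamToVosk,${EXTEN}) ; ${EXTEN} arrives in StasisStart args
 same => n,Hangup()                      ; reached only after the app releases the channel
```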

Also, I haven’t found any reasonable code for streaming the audio to Vosk once I manage to connect the two extensions.
I asked ChatGPT, and this is the code it gave me (modified by me):

using System;
using System.Threading.Tasks;
using AsterNET.ARI;
using AsterNET.ARI.Middleware.Default;
using AsterNET.ARI.Models;
using NAudio.Wave;
using NAudio.Wave.SampleProviders;
using WebSocketSharp;

namespace StreamToVosk
{
    internal class Program
    {
        private static readonly string _app = "StreamToVosk";
        private static readonly string _username = "xxx";
        private static readonly string _password = "xxx";
        private static readonly string _host = "xxx";
        private static readonly int _port = 8088;
        private static readonly string _voskServer = "ws://xxx:2700";
        private static AriClient _client;
        
        public static void Main(string[] args)
        {
            var stasisEndpoint = new StasisEndpoint(_host, _port, _username, _password);
            _client = new AriClient(stasisEndpoint, _app, true);
            _client.OnStasisStartEvent += StasisStartEventHandler;
            _client.OnStasisEndEvent += StasisEndEventHandler;
            _client.OnChannelTalkingStartedEvent += StasisStartedTalkingHandler;
            
            _client.Connect();
            
            Console.WriteLine("Connected to ARI...");
            Console.ReadLine();
        }

        private static void StasisStartedTalkingHandler(IAriClient sender, ChannelTalkingStartedEvent e)
        {
            Console.WriteLine("Extension picked up");
        }

        private static async void StasisStartEventHandler(IAriClient client, StasisStartEvent e)
        {
            var channel = e.Channel;
            Console.WriteLine($"Channel {channel.Id} entered Stasis.");

            await _client.Channels.AnswerAsync(channel.Id);          
            
            var bridge = await _client.Bridges.CreateAsync("mixing", channel.Id);
            await client.Bridges.AddChannelAsync(bridge.Id, channel.Id);

            await StreamToVosk(client, channel.Id, bridge.Id);
        }
        
        

        private static void StasisEndEventHandler(IAriClient client, StasisEndEvent e)
        {
            Console.WriteLine($"Channel {e.Channel.Id} left Stasis.");
        }
        
        private static async Task StreamToVosk(IAriClient client, string channelId, string bridgeId)
        {
            try
            {
                using (var ws = new WebSocket(_voskServer))
                {
                    ws.OnMessage += (sender, e) => Console.WriteLine("Vosk Result: " + e.Data);

                    ws.Connect();

                    var format = new WaveFormat(8000, 16, 1);
                    var waveProvider = new BufferedWaveProvider(format);
                    var sampleStream = new Wave16ToFloatProvider(waveProvider);
                    var sampleChannel = new SampleChannel(waveProvider);
                    var rmsProvider = new MeteringSampleProvider(sampleChannel);
                    var waveIn = new WaveInEvent { WaveFormat = format };
                    waveIn.DataAvailable += (s, a) =>
                    {
                        waveProvider.AddSamples(a.Buffer, 0, a.BytesRecorded);
                        ws.Send(a.Buffer);
                    };
                    waveIn.StartRecording();
                    await Task.Delay(1000);
                    var snoopChannel = await _client.Channels.SnoopChannelAsync(channelId, _app);
                    await _client.Channels.AnswerAsync(snoopChannel.Id);
                    await _client.Bridges.AddChannelAsync(bridgeId, snoopChannel.Id);
                    Console.WriteLine($"Snoop channel {snoopChannel.Id} created and added to bridge {bridgeId}");
                    await Task.Run(async () =>
                    {
                        while (true)
                        {
                            var samplesRead = rmsProvider.Read(new float[sampleStream.WaveFormat.SampleRate / 10], 0, sampleStream.WaveFormat.SampleRate / 10);
                            if (samplesRead == 0) break;
                    
                            await Task.Delay(100);
                        }
                    });
                    waveIn.StopRecording();

                    ws.Close();
                }
            }
            catch (Exception e)
            {
                Console.WriteLine(e);
                throw;
            }

            Console.WriteLine("Disconnected from Vosk server.");
        }
    }
}
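
One thing to watch in the code above: NAudio's WaveInEvent records from the local machine's capture device, so it would not see the call audio at all. Separately from how the PCM gets out of Asterisk (not shown here), the Vosk side can be sketched on its own. The sketch below assumes the stock vosk-server WebSocket behaviour: a JSON config frame with the sample rate, binary frames of 16-bit mono PCM, one JSON reply per audio frame, and a final `{"eof":1}`. The class and method names are made up for illustration, and it uses the built-in ClientWebSocket rather than WebSocketSharp.

```csharp
using System;
using System.IO;
using System.Net.WebSockets;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of Vosk or AsterNET.
public static class VoskFraming
{
    // 200 ms of 8 kHz, 16-bit, mono PCM (8000 samples/s * 2 bytes * 0.2 s).
    public const int ChunkBytes = 3200;

    // Text frame asking the server to flush and return the final result.
    public const string EofMessage = "{\"eof\":1}";

    // Text frame announcing the sample rate of the PCM that follows.
    public static string ConfigMessage(int sampleRate) =>
        "{\"config\":{\"sample_rate\":" + sampleRate + "}}";

    // Sends config, then the PCM in fixed-size binary frames, then EOF.
    // Assumes the server answers every audio frame and the EOF with one JSON message.
    public static async Task RecognizeAsync(Uri server, byte[] pcm, int sampleRate,
                                            Action<string> onResult, CancellationToken ct)
    {
        using (var ws = new ClientWebSocket())
        {
            await ws.ConnectAsync(server, ct);
            await SendTextAsync(ws, ConfigMessage(sampleRate), ct);

            for (int offset = 0; offset < pcm.Length; offset += ChunkBytes)
            {
                int count = Math.Min(ChunkBytes, pcm.Length - offset);
                await ws.SendAsync(new ArraySegment<byte>(pcm, offset, count),
                                   WebSocketMessageType.Binary, true, ct);
                onResult(await ReceiveTextAsync(ws, ct)); // partial or final JSON
            }

            await SendTextAsync(ws, EofMessage, ct);
            onResult(await ReceiveTextAsync(ws, ct));

            if (ws.State == WebSocketState.Open)
                await ws.CloseAsync(WebSocketCloseStatus.NormalClosure, "done", ct);
        }
    }

    private static Task SendTextAsync(ClientWebSocket ws, string text, CancellationToken ct) =>
        ws.SendAsync(new ArraySegment<byte>(Encoding.UTF8.GetBytes(text)),
                     WebSocketMessageType.Text, true, ct);

    // Reads frames until the end of one complete message and decodes it as UTF-8.
    private static async Task<string> ReceiveTextAsync(ClientWebSocket ws, CancellationToken ct)
    {
        var buffer = new byte[4096];
        using (var ms = new MemoryStream())
        {
            WebSocketReceiveResult r;
            do
            {
                r = await ws.ReceiveAsync(new ArraySegment<byte>(buffer), ct);
                ms.Write(buffer, 0, r.Count);
            } while (!r.EndOfMessage);
            return Encoding.UTF8.GetString(ms.ToArray());
        }
    }
}
```

A call would look like `await VoskFraming.RecognizeAsync(new Uri("ws://xxx:2700"), pcmBytes, 8000, Console.WriteLine, CancellationToken.None);`, where `pcmBytes` is whatever 8 kHz PCM has been collected from the call.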

Does anyone have any pointers on how to set all of this up?
Thanks

I am not sure, but I am very interested in your approach and whether it works. Did you ask GPT how to implement the generated code?

I did a few times, and this is one of those implementations. Now I’m just trying to get it to tell me whether I can somehow bridge the extension call through the Stasis function, but no luck 🙂
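
On bridging through Stasis: the dialplan does not ring anything while the app holds the channel, so the app has to create the second leg itself. A rough sketch of the ARI REST calls involved is below, written against plain HttpClient rather than AsterNET so that no AsterNET method signatures are assumed. The paths are the standard ARI ones (originate with id, create bridge with id, addChannel, externalMedia). The ids, the `dialed` app argument, and the external host and port are placeholders, and externalMedia needs a reasonably recent Asterisk (16.6 or later, as far as I know).

```csharp
using System;
using System.Linq;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;

// Hypothetical helper; only the ARI paths themselves come from Asterisk.
public static class AriSketch
{
    private static string E(string s) => Uri.EscapeDataString(s);

    // POST channels/{id}: ring an endpoint; it enters the app once it answers.
    public static string Originate(string channelId, string endpoint, string app, string appArgs) =>
        $"channels/{E(channelId)}?endpoint={E(endpoint)}&app={E(app)}&appArgs={E(appArgs)}";

    // POST bridges/{id}: create a mixing bridge with a caller-chosen id.
    public static string CreateBridge(string bridgeId) =>
        $"bridges/{E(bridgeId)}?type=mixing";

    // POST bridges/{id}/addChannel: channel ids are comma-separated.
    public static string AddToBridge(string bridgeId, params string[] channelIds) =>
        $"bridges/{E(bridgeId)}/addChannel?channel={string.Join(",", channelIds.Select(E))}";

    // POST channels/externalMedia: Asterisk sends RTP to hostPort for this channel.
    public static string ExternalMedia(string channelId, string app, string hostPort) =>
        $"channels/externalMedia?channelId={E(channelId)}&app={E(app)}" +
        $"&external_host={E(hostPort)}&format=slin";

    // HttpClient preconfigured with the ARI base URL and basic auth.
    public static HttpClient CreateClient(string host, int port, string user, string password)
    {
        var http = new HttpClient { BaseAddress = new Uri($"http://{host}:{port}/ari/") };
        var token = Convert.ToBase64String(Encoding.UTF8.GetBytes($"{user}:{password}"));
        http.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Basic", token);
        return http;
    }

    // Sends one POST with an empty body and throws on a non-2xx status.
    public static async Task PostAsync(HttpClient http, string relativePath)
    {
        using (var response = await http.PostAsync(relativePath, null))
        {
            response.EnsureSuccessStatusCode();
        }
    }
}
```

The order I would try: when StasisStart fires for the caller, post `CreateBridge("b1")`, `AddToBridge("b1", callerId)` and `Originate("callee-1", "PJSIP/100", "StreamToVosk", "dialed")`. When StasisStart fires for `callee-1` (meaning it was picked up), post `AddToBridge("b1", "callee-1")`, then `ExternalMedia("media-1", "StreamToVosk", "127.0.0.1:40000")` and `AddToBridge("b1", "media-1")`. Something then has to listen on that UDP port, strip the RTP headers, and forward the audio to Vosk.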
