Microphone stream manipulation

Talking2442 · Jan-07-2021, 07:36 PM

Hello and happy new year!

I'm working with the Vosk speech-to-text engine. It already works, but I want to improve the microphone input.

What I have in mind:
- denoising the stream
- applying gain to the stream

I've read many examples, but most of them are for WAV files only and not for a live microphone stream.

My code looks like this:

#!/usr/bin/env python3

import json
import os
import pyaudio
from vosk import Model, KaldiRecognizer


lang = "en-US"
DEBUG = True


class Vosk:
    def __init__(self, language):
        if not os.path.exists("models/vosk/" + language):
            print("Please download the model from https://alphacephei.com/vosk/models and unpack as "
                  "'" + language + "' in 'models/vosk'.")
            exit(1)

        model = Model("models/vosk/" + language)
        self.rec = KaldiRecognizer(model, 16000)

        p = pyaudio.PyAudio()
        self.stream = p.open(format=pyaudio.paInt16,
                             channels=1,
                             rate=16000,
                             input=True,
                             frames_per_buffer=8000)
        self.stream.start_stream()

    def run(self):
        print("Listening...")
        while True:
            data = self.stream.read(4000)
            if len(data) == 0:
                break
            if self.rec.AcceptWaveform(data):
                # A complete utterance was recognized. Result() returns it;
                # FinalResult() is only needed once the stream has ended.
                res = json.loads(self.rec.Result())
                if DEBUG:
                    print("Text:", res['text'])

        res = json.loads(self.rec.FinalResult())
        if DEBUG:
            print("Listened: " + res['text'])
        return res['text']


if __name__ == '__main__':
    stt = Vosk(lang)
    print(stt.run())

I think the right modules might be audioop or Pydub, but most examples are for WAV files...
Do you have better ideas for which modules to use, and if so, can you explain how I should work with the stream?

palumanic · (This post was last modified: Nov-28-2023, 08:16 AM by palumanic.)

Improving microphone input for speech-to-text is a cool project. You're on the right track with denoising and adjusting gain! Streaming might need some real-time processing. Have you checked out https://asmrmicrophones.com/? They've got insights on enhancing microphone quality that could help fine-tune your stream for better speech recognition.
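
As a starting point for the real-time processing mentioned above, here is a minimal sketch that works on the raw byte chunks returned by `stream.read()`, assuming 16-bit signed mono PCM as in the posted code (`paInt16`, `channels=1`). It uses only the standard library. The function names `apply_gain` and `noise_gate` and the example values for gain and threshold are assumptions made for illustration, not part of Vosk or PyAudio. The noise gate is only a crude stand-in for denoising: it silences quiet chunks but does not remove noise underneath speech, and it may or may not help recognition.

```python
import array
import math


def apply_gain(chunk: bytes, gain: float) -> bytes:
    """Scale 16-bit signed PCM samples by `gain`, clipping to the int16 range."""
    samples = array.array("h")  # "h" = signed 16-bit, native byte order
    samples.frombytes(chunk)
    for i, s in enumerate(samples):
        samples[i] = max(-32768, min(32767, int(s * gain)))
    return samples.tobytes()


def rms(chunk: bytes) -> float:
    """Root-mean-square level of a 16-bit signed PCM chunk."""
    samples = array.array("h")
    samples.frombytes(chunk)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def noise_gate(chunk: bytes, threshold: float) -> bytes:
    """Return silence of the same length if the chunk's RMS is below `threshold`."""
    if rms(chunk) < threshold:
        return bytes(len(chunk))
    return chunk
```

In the `run()` loop of the posted code, the chunk could be processed before it reaches the recognizer, for example `data = noise_gate(apply_gain(data, 2.0), 300)` just before `self.rec.AcceptWaveform(data)`. The values 2.0 and 300 are placeholders and would need tuning against the actual microphone.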
