Dec-04-2017, 10:34 AM
I have hundreds of audio files (MP3) from a teaching course and, because of copyright restrictions, we are not permitted to upload them. I therefore need to be able to convert the audio/speech to text offline.
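One practical note on the input format: as far as I know, SpeechRecognition's AudioFile reads WAV, AIFF and FLAC but not MP3, so the course files would need converting locally first. Below is a minimal sketch of that step. It assumes ffmpeg is installed (it is not part of the setup described here), and the function names are hypothetical.

```python
import os
import subprocess


def ffmpeg_command(mp3_path, wav_path):
    """Build an ffmpeg command that converts one MP3 to 16 kHz mono WAV."""
    # Assumption: 16 kHz mono suits the default PocketSphinx English model.
    return ["ffmpeg", "-y", "-i", mp3_path, "-ar", "16000", "-ac", "1", wav_path]


def convert_folder(src_dir, dst_dir, run=subprocess.check_call):
    """Convert every .mp3 in src_dir to a .wav of the same name in dst_dir.

    `run` is the callable that executes each command; it defaults to
    subprocess.check_call and can be swapped out for a dry run.
    """
    if not os.path.isdir(dst_dir):
        os.makedirs(dst_dir)
    converted = []
    for name in sorted(os.listdir(src_dir)):
        if not name.lower().endswith(".mp3"):
            continue  # skip anything that is not an MP3
        wav_path = os.path.join(dst_dir, os.path.splitext(name)[0] + ".wav")
        run(ffmpeg_command(os.path.join(src_dir, name), wav_path))
        converted.append(wav_path)
    return converted
```

Calling convert_folder("course_mp3", "course_wav") (hypothetical folder names) would run everything on the local machine, so nothing is uploaded.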
I have recently installed the "Uberi" SpeechRecognition package. I initially ran into a number of problems, but these came down to making sure the correct packages were installed, plus some confusion over python2 versus python3, pip versus pip3, etc.
The system details are:
Kubuntu 16.04.3 LTS (xenial), kernel 4.4.0-101-generic
Python 2.7.12 and 3.5.2 (both installed)
pip 9.0.1 from /home/********/.local/lib/python2.7/site-packages (python 2.7)
Speech_Recognition 3.7.1
PyAudio 0.2.11
When I run
python -m speech_recognition
and speak a few words or many words, the text displayed is either perfect or _almost_ perfect. I later realised, by examining the code used there, that it relies on the Google service, which is why the output is so accurate.
As the requirement is to do this offline, I have tested the sample Python script audio_transcribe.py in the /examples path. The input file is english.wav, but the output is just 'garbage'.
Being new to Python (but not to programming), I'm currently unable to work out what to change to get the SpeechRecognition package to do this offline (not Google, IBM, Bing, etc).
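For reference, the offline engine in SpeechRecognition is reached through recognize_sphinx, which needs the separate PocketSphinx module installed for the interpreter in use. Below is a minimal sketch of an offline-only batch run. The function names and the folder layout are hypothetical, and recognition quality on real lecture audio is not guaranteed.

```python
import os


def sphinx_transcribe(wav_path):
    """Transcribe one WAV file locally with PocketSphinx (no network calls)."""
    # Imported here so transcribe_folder can be exercised without the library.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)  # read the entire file
    try:
        # sr.RequestError (e.g. missing PocketSphinx) is left to propagate.
        return recognizer.recognize_sphinx(audio)
    except sr.UnknownValueError:
        return ""  # Sphinx could not make sense of the audio


def transcribe_folder(wav_dir, transcribe=sphinx_transcribe):
    """Write a .txt transcript next to every .wav file in wav_dir."""
    results = {}
    for name in sorted(os.listdir(wav_dir)):
        if not name.lower().endswith(".wav"):
            continue
        text = transcribe(os.path.join(wav_dir, name))
        txt_path = os.path.join(wav_dir, os.path.splitext(name)[0] + ".txt")
        with open(txt_path, "w") as handle:
            handle.write(text + "\n")
        results[name] = text
    return results
```

Because the recognizer is passed in as a parameter, a different offline engine could be swapped in later without touching the folder loop.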
I notice that pip, python2 and python3 are located in ~/.local/bin and ~/.local/lib, a result of the installation problems and of not knowing where packages should be installed. I would also prefer to run only python3, but I see that a number of Kubuntu packages rely on python2.
If I run
python3 audio_transcribe.py
there are no errors, other than the garbage output. If I run
python audio_transcribe.py
there is an error message:
Quote: Sphinx error; missing PocketSphinx module: ensure that PocketSphinx is set up correctly.
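The different behaviour under python and python3 suggests that the two interpreters see different sets of installed packages. Below is a minimal sketch of a check to run once under each interpreter. The function names are hypothetical, and the module names listed are the import names as I understand them.

```python
import sys


def can_import(module_name):
    """Return True if this interpreter can import the named module."""
    try:
        __import__(module_name)
        return True
    except ImportError:
        return False


def report(modules=("speech_recognition", "pocketsphinx", "pyaudio")):
    """Print which interpreter is running and which modules it can see."""
    print("interpreter: " + sys.executable)
    print("version:     " + sys.version.split()[0])
    status = {}
    for name in modules:
        status[name] = can_import(name)
        print("{0:<20} {1}".format(name, "found" if status[name] else "MISSING"))
    return status


if __name__ == "__main__":
    report()
```

Saved as, say, check_env.py (a hypothetical name) and run with both python and python3, this would show whether PocketSphinx is missing for only one of them.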