Apr-10-2024, 11:49 AM
(Apr-10-2024, 08:48 AM)Gribouillis Wrote: If the filename's suffix is not one of m4a, wav, wma, the function convtomp3()
will return None, this is probably what happens in your case.
Hello and thank you for your answer.
I tried adding
return file
after the match statement in the function. I did not expect it to help, because the suffix is already restricted by the
file_uploader
button (only mp3, m4a, wav, wma accepted) and because I call convtomp3() only if the suffix is not mp3. So here is the modified code (this time I left the docstrings in, otherwise the line numbers won't match):
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
# Upload an audio file, if needed it will be converted to mp3 #
# Then it will be transcripted using bofenghuang/whisper-small-cv11-french #
# Finally it will give you possibility to download transcription #
# in .txt format #
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

## Imports ##
from pathlib import Path
import streamlit as st
import torch
from tempfile import NamedTemporaryFile
from datasets import load_dataset
from transformers import pipeline
from pydub import AudioSegment

## Initialize environment ##
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
pipe = pipeline("automatic-speech-recognition", model="bofenghuang/whisper-small-cv11-french", device=device)
pipe.model.config.forced_decoder_ids = pipe.tokenizer.get_decoder_prompt_ids(language="fr", task="transcribe")

## Functions ##
def convtomp3(file):
    """Convert an audio file from m3a, wav or wma to mp3"""
    match Path(file.name).suffix:
        case "m4a":
            wav_audio = AudioSegment.from_file(file, format="m4a")
            result = wav_audio.export("audio1.mp3", format="mp3")
            return result
        case "wav":
            wav_audio = AudioSegment.from_file(file, format="wav")
            result = wav_audio.export("audio1.mp3", format="mp3")
            return result
        case "wma":
            wav_audio = AudioSegment.from_file(file, format="wma")
            result = wav_audio.export("audio1.mp3", format="mp3")
            return result
    return file # do othing if no match in the suffix

def transcriptor(file):
    """Transcript from an audio file to a string"""
    if(Path(file.name).suffix != "mp3"):
        file=convtomp3(file)

    waveform = file.getvalue()
    predicted_sentence = pipe(waveform, max_new_tokens=225)
    return str(predicted_sentence)

## Display ##
st.title("Convertisseur / Transcripteur")
audio_source = st.sidebar.file_uploader(label = "Fichiers audio uniquement", type=["mp3","m4a","wav","wma"])

if audio_source is not None:
    st.toast("Lancement de la transcription")
    with NamedTemporaryFile(suffix=Path(audio_source.name).suffix) as temp_file:
        temp_file.write(audio_source.getvalue())
        temp_file.seek(0)
        cpte_rendu = transcriptor(temp_file)
    st.write("Transcription : ")
    st.write(cpte_rendu)
    st.sidebar.download_button(label = "Télécharger le compte-rendu", data = cpte_rendu, file_name = "cr.txt", mime = "text/plain")

And here is the new error:
AttributeError: '_io.BufferedRandom' object has no attribute 'getvalue'
Traceback:
  File "/home/ild/.local/lib/python3.12/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 542, in _run_script
    exec(code, module.__dict__)
  File "/home/ild/conv-cripteur.py", line 58, in <module>
    cpte_rendu = transcriptor(temp_file)
                 ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ild/conv-cripteur.py", line 45, in transcriptor
    waveform = file.getvalue()
               ^^^^^^^^^^^^^
  File "/home/ild/miniconda3/lib/python3.12/tempfile.py", line 494, in __getattr__
    a = getattr(file, name)
        ^^^^^^^^^^^^^^^^^^^

As I see it, the main problem in my code is in the function transcriptor (more precisely at line 45):
waveform = file.getvalue()
But I can't see where I made (a) mistake(s) :(
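For what it's worth, the error can be reproduced outside Streamlit with a minimal sketch that uses only the standard library. It suggests that the object returned by NamedTemporaryFile wraps an ordinary binary file, which offers read() but not getvalue(); getvalue() belongs to in-memory buffers such as io.BytesIO. The payload bytes below are made up for illustration and are not real audio.

```python
import io
from tempfile import NamedTemporaryFile

payload = b"fake audio bytes"  # hypothetical content, for illustration only

# An in-memory buffer provides getvalue().
buffer = io.BytesIO(payload)
from_buffer = buffer.getvalue()

# A temporary file on disk does not; after seek(0), read() returns the bytes.
with NamedTemporaryFile(suffix=".mp3") as temp_file:
    temp_file.write(payload)
    temp_file.seek(0)
    has_getvalue = hasattr(temp_file, "getvalue")
    from_temp = temp_file.read()

print("temp file has getvalue():", has_getvalue)
print("same bytes from both:", from_temp == from_buffer)
```

If that holds in the real script too, calling read() on the temporary file (or passing the uploaded bytes directly) would be the thing to try at line 45.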
Might this be because I should type
if(Path(file.name).suffix != ".mp3"):
(the . would be important), or because I have to convert audio_source to something else (I don't know what)?
Thank you
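PS: a quick check of the first guess with pathlib alone (the file names are made up). Path.suffix keeps the leading dot, so a comparison against "mp3" is never equal, and the same would apply to the case "m4a", "wav" and "wma" branches in convtomp3().

```python
from pathlib import Path

# Hypothetical file names, only to show what Path.suffix returns.
for name in ("talk.mp3", "talk.m4a", "talk"):
    print(repr(name), "->", repr(Path(name).suffix))

suffix_mp3 = Path("talk.mp3").suffix
matches_without_dot = suffix_mp3 == "mp3"   # comparison as written in the script
matches_with_dot = suffix_mp3 == ".mp3"     # comparison with the leading dot
no_extension = Path("talk").suffix          # empty string when there is no extension

print("equals 'mp3':", matches_without_dot)
print("equals '.mp3':", matches_with_dot)
```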