(Apr-11-2024, 01:06 PM)Gribouillis Wrote: Don't manipulate the garbage collector, it is completely useless for what you are doing.
OK, I didn't see any difference with or without the garbage collector, so I removed it.
(Apr-11-2024, 01:06 PM)Gribouillis Wrote: On the other hand yes, calling Path('audio.mp3').unlink(missing_ok=True) after the call to transcriptor() should work.
OK, it is working with
Path("audio1.mp3").unlink(missing_ok=True)
on line 61. This one removes audio1.mp3, and on line 62
Path(temp_file.name).unlink(missing_ok=True)
removes (or seems to remove) the uploaded audio file (I find no trace of my m4a file).
Here is my full code. In my opinion it works; maybe it could be optimized, but I don't know enough about Python to say:
################################################################################
# Upload an audio file; if needed it will be converted to mp3.                 #
# Then it will be transcribed using bofenghuang/whisper-small-cv11-french.     #
# Finally it gives you the possibility to download the transcription           #
# in .txt format.                                                              #
################################################################################

## Imports ##
from pathlib import Path
import streamlit as st
import torch
from tempfile import NamedTemporaryFile
from datasets import load_dataset
from transformers import pipeline
from pydub import AudioSegment

## Initialize environment ##
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
pipe = pipeline("automatic-speech-recognition", model="bofenghuang/whisper-small-cv11-french", device=device)
pipe.model.config.forced_decoder_ids = pipe.tokenizer.get_decoder_prompt_ids(language="fr", task="transcribe")

## Functions ##
def convtomp3(file):
    """Convert an audio file from m4a, wav or wma to mp3"""
    match Path(file.name).suffix:
        case ".m4a":
            wav_audio = AudioSegment.from_file(file, format="m4a")
            result = wav_audio.export("audio1.mp3", format="mp3")
            return result
        case ".wav":
            wav_audio = AudioSegment.from_file(file, format="wav")
            result = wav_audio.export("audio1.mp3", format="mp3")
            return result
        case ".wma":
            wav_audio = AudioSegment.from_file(file, format="wma")
            result = wav_audio.export("audio1.mp3", format="mp3")
            return result
        case _:
            raise ValueError(f"Fichier invalide : {file.name!r}")
    return file  # never reached: every case above returns or raises

def transcriptor(file):
    """Transcribe an audio file to a string"""
    if Path(file.name).suffix != ".mp3":
        file = convtomp3(file)
    waveform = file.read()
    predicted_sentence = pipe(waveform, max_new_tokens=225)
    return str(predicted_sentence)

## Display ##
st.title("Convertisseur / Transcripteur")
audio_source = st.sidebar.file_uploader(label="Fichiers audio uniquement", type=["mp3", "m4a", "wav", "wma"])

if audio_source is not None:
    st.toast("Lancement de la transcription")
    with NamedTemporaryFile(suffix=Path(audio_source.name).suffix) as temp_file:
        temp_file.write(audio_source.getvalue())
        temp_file.seek(0)
        cpte_rendu = transcriptor(temp_file)
    # Clean up the converted mp3 and, as a safeguard, the temporary upload
    Path("audio1.mp3").unlink(missing_ok=True)
    Path(temp_file.name).unlink(missing_ok=True)
    st.write("Transcription : ")
    st.write(cpte_rendu)
    st.sidebar.download_button(label="Télécharger le compte-rendu", data=cpte_rendu, file_name="cr.txt", mime="text/plain")
    torch.cuda.empty_cache()

Not sure about the future; maybe I'll try with a copy of my script that directly calls
cpte_rendu = transcriptor(audio_source.getvalue())
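Note that transcriptor() as posted expects a file object (it uses file.name and file.read()), so passing raw bytes would need a small variant. Below is a minimal sketch of that idea, not tested with the real model. It assumes the pipeline accepts the raw bytes of an audio file (the posted code already hands file.read() to pipe) and returns a dict with a "text" key. The names transcribe_bytes and fake_pipe are hypothetical; fake_pipe only stands in for the real pipe so the sketch runs without loading Whisper.

```python
from typing import Callable


def transcribe_bytes(data: bytes, asr: Callable[..., dict]) -> str:
    """Transcribe raw audio bytes with an ASR callable (hypothetical helper)."""
    if not data:
        raise ValueError("empty audio data")
    # Same call shape as in the posted transcriptor(): raw bytes go to the pipeline.
    result = asr(data, max_new_tokens=225)
    # Assumption: the result is a dict holding the transcription under "text".
    return result["text"]


def fake_pipe(data: bytes, max_new_tokens: int = 225) -> dict:
    """Stand-in for the real `pipe`, so the sketch runs without the model."""
    return {"text": f"<{len(data)} bytes transcribed>"}


if __name__ == "__main__":
    print(transcribe_bytes(b"\x00\x01\x02", fake_pipe))
```

In the app this would become cpte_rendu = transcribe_bytes(audio_source.getvalue(), pipe), which skips both the temporary file and the mp3 conversion. Whether m4a and wma uploads decode correctly this way depends on the ffmpeg installation and would have to be checked.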
For my use case, this question is solved.
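For anyone reading later, the cleanup behaviour discussed above can be checked with the standard library alone. This is a small sketch, assuming the default delete=True of NamedTemporaryFile; the ".m4a" suffix and the fake bytes are only placeholders for a real upload.

```python
from pathlib import Path
from tempfile import NamedTemporaryFile

with NamedTemporaryFile(suffix=".m4a") as temp_file:
    temp_file.write(b"fake audio bytes")  # placeholder for the uploaded data
    temp_file.seek(0)
    temp_path = Path(temp_file.name)
    exists_inside = temp_path.exists()

# With the default delete=True, the file is removed when the with block exits.
exists_after = temp_path.exists()

# missing_ok=True makes unlink() do nothing when the file is already gone.
temp_path.unlink(missing_ok=True)

print(exists_inside, exists_after)
```

So the temporary upload disappears on its own once the with block ends, and the extra Path(temp_file.name).unlink(missing_ok=True) is only a harmless safeguard. The audio1.mp3 written by the conversion is a regular file, so that one does need its own unlink.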
Thank you for your patience and your help