Python Forum
[Solved] I'm not getting the good type
#19
(Apr-11-2024, 01:06 PM)Gribouillis Wrote: Don't manipulate the garbage collector, it is completely useless for what you are doing.
OK, I didn't see any difference with or without the garbage collector calls, so I removed them.

(Apr-11-2024, 01:06 PM)Gribouillis Wrote: On the other hand yes, calling Path('audio.mp3').unlink(missing_ok=True) after the call to transcriptor() should work.
OK, it works with Path("audio1.mp3").unlink(missing_ok=True) on line 61.
That call removes audio1.mp3, and on line 62 Path(temp_file.name).unlink(missing_ok=True) removes (or seems to remove) the uploaded audio file: I can find no trace of my m4a file afterwards.
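A minimal sketch of the cleanup behaviour discussed above, using only the standard library. The helper names write_and_check and remove_if_present are hypothetical and not part of the script. It suggests that a NamedTemporaryFile opened with the default delete=True is already removed when its with block ends, so the explicit unlink of temp_file.name is probably redundant, while audio1.mp3 (created by pydub, not by tempfile) still needs its own unlink.

```python
from pathlib import Path
from tempfile import NamedTemporaryFile


def write_and_check(data: bytes, suffix: str = ".m4a") -> tuple[bool, bool]:
    """Return (file existed inside the with block, file exists after it)."""
    with NamedTemporaryFile(suffix=suffix) as temp_file:
        temp_file.write(data)
        temp_file.flush()
        temp_path = Path(temp_file.name)
        existed_inside = temp_path.exists()
    # With the default delete=True, leaving the with block closes
    # the temporary file and removes it from disk.
    return existed_inside, temp_path.exists()


def remove_if_present(path) -> bool:
    """Delete path if it exists; return True when nothing is left afterwards."""
    path = Path(path)
    path.unlink(missing_ok=True)  # no error if the file is already gone
    return not path.exists()
```

Calling write_and_check(b"...") should report that the file was present inside the block and gone afterwards, without any manual unlink.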

Here is my full code. In my opinion it works; maybe it could be optimized, but I don't know enough about Python to say:
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
# Upload an audio file, if needed it will be converted to mp3                #
# Then it will be transcribed using bofenghuang/whisper-small-cv11-french    #
# Finally it will let you download the transcription                         #
#    in .txt format                                                          #
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

## Imports ##
from pathlib import Path
import streamlit as st
import torch

from tempfile import NamedTemporaryFile
from datasets import load_dataset
from transformers import pipeline
from pydub import AudioSegment

## Initialize environment ##
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
pipe = pipeline("automatic-speech-recognition", model="bofenghuang/whisper-small-cv11-french", device=device)
pipe.model.config.forced_decoder_ids = pipe.tokenizer.get_decoder_prompt_ids(language="fr", task="transcribe")

## Functions ##
def convtomp3(file):
    """Convert an audio file from m4a, wav or wma to mp3"""
    match Path(file.name).suffix:
        case ".m4a":
            wav_audio = AudioSegment.from_file(file, format="m4a")
            result = wav_audio.export("audio1.mp3", format="mp3")
            return result
        case ".wav":
            wav_audio = AudioSegment.from_file(file, format="wav")
            result = wav_audio.export("audio1.mp3", format="mp3")
            return result
        case ".wma":
            wav_audio = AudioSegment.from_file(file, format="wma")
            result = wav_audio.export("audio1.mp3", format="mp3")
            return result
        case _:
            # Every unsupported suffix is rejected here, so nothing runs after the match.
            raise ValueError(f"Fichier invalide : {file.name!r}")

def transcriptor(file):
    """Transcribe an audio file to a string"""
    if Path(file.name).suffix != ".mp3":
        file = convtomp3(file)
    waveform = file.read()
    predicted_sentence = pipe(waveform, max_new_tokens=225)
    return str(predicted_sentence)

## Display ##
st.title("Convertisseur / Transcripteur")
audio_source = st.sidebar.file_uploader(label="Fichiers audio uniquement", type=["mp3", "m4a", "wav", "wma"])
if audio_source is not None:
    st.toast("Lancement de la transcription")

    with NamedTemporaryFile(suffix=Path(audio_source.name).suffix) as temp_file:
        temp_file.write(audio_source.getvalue())
        temp_file.seek(0)
        cpte_rendu = transcriptor(temp_file)
        Path("audio1.mp3").unlink(missing_ok=True)
        Path(temp_file.name).unlink(missing_ok=True)
    st.write("Transcription : ")
    st.write(cpte_rendu)
    st.sidebar.download_button(label="Télécharger le compte-rendu", data=cpte_rendu, file_name="cr.txt", mime="text/plain")
    torch.cuda.empty_cache()
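On the optimization question: the three branches of convtomp3 differ only in the format string, so one possible simplification is a lookup table. This is a sketch, not a drop-in replacement. SOURCE_FORMATS and source_format are hypothetical names, the format strings are copied from the script, and lower-casing the suffix is an added assumption (the original match is case-sensitive).

```python
from pathlib import Path

# Source formats copied from the script's match statement, keyed by file suffix.
SOURCE_FORMATS = {".m4a": "m4a", ".wav": "wav", ".wma": "wma"}


def source_format(filename: str) -> str:
    """Return the format string to pass to pydub for filename, or raise ValueError."""
    # Assumption: accept upper-case suffixes such as ".M4A" as well.
    suffix = Path(filename).suffix.lower()
    try:
        return SOURCE_FORMATS[suffix]
    except KeyError:
        raise ValueError(f"Fichier invalide : {filename!r}") from None


# Hypothetical use inside convtomp3 (needs pydub, which is not imported here):
#     audio = AudioSegment.from_file(file, format=source_format(file.name))
#     return audio.export("audio1.mp3", format="mp3")
```

Supporting another input format would then only mean adding one entry to the dictionary.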
Not sure about the future; maybe I'll try, with a copy of my script, calling cpte_rendu = transcriptor(audio_source.getvalue()) directly.
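A rough sketch of that idea, with the pipeline passed in as a parameter so the function can be tried without loading the model. transcribe_bytes is a hypothetical name. Whether the pipeline's ffmpeg-based decoding handles m4a or wma bytes without the mp3 conversion step is an assumption that would need testing. The sketch also returns only the "text" entry of the result, whereas str(predicted_sentence) in the script gives the text of the whole result dict.

```python
def transcribe_bytes(audio_bytes: bytes, asr) -> str:
    """Run the speech-recognition callable on raw audio bytes and return the text."""
    result = asr(audio_bytes, max_new_tokens=225)
    # The speech-recognition pipeline returns a dict with a "text" key;
    # fall back to str() for any other kind of result.
    if isinstance(result, dict) and "text" in result:
        return result["text"]
    return str(result)


# Hypothetical use in the Streamlit script, where pipe is the pipeline
# created at the top of the file:
#     cpte_rendu = transcribe_bytes(audio_source.getvalue(), pipe)
```

With this shape, neither the temporary file nor the two unlink calls would be needed for inputs the pipeline can decode directly.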
For my use case, this question is solved.
Thank you for your patience and your help.


Messages In This Thread
[Solved] I'm not getting the good type - by slain - Apr-10-2024, 07:51 AM
RE: I'm not getting the good type - by Gribouillis - Apr-10-2024, 08:48 AM
RE: I'm not getting the good type - by slain - Apr-10-2024, 11:49 AM
RE: I'm not getting the good type - by Gribouillis - Apr-10-2024, 12:26 PM
RE: I'm not getting the good type - by slain - Apr-10-2024, 01:49 PM
RE: I'm not getting the good type - by Gribouillis - Apr-10-2024, 02:29 PM
RE: I'm not getting the good type - by slain - Apr-10-2024, 02:48 PM
RE: I'm not getting the good type - by Gribouillis - Apr-10-2024, 03:01 PM
RE: I'm not getting the good type - by slain - Apr-11-2024, 07:17 AM
RE: [almost solved] I'm not getting the good type - by slain - Apr-11-2024, 01:36 PM
