[Solved] I'm not getting the good type

slain · Apr-11-2024, 07:17 AM

(Apr-10-2024, 03:01 PM)Gribouillis Wrote: I disapprove of the solution that you found on Stack Overflow because instead of solving the problem, it hides it. You still don't know why the file name was not intercepted by the case statements. There is an error in the logic of your program, and you chose to conceal it instead of resolving it.

Sorry, but I don't totally agree with you ;)
The solution I found on Stack Overflow just solves my problem with reading data from the audio file. My bad, I didn't use the right function (I used getvalue() instead of read()).
And in my context (just mine, I won't promise this script would be applicable everywhere), I know that I'll handle a limited number of file formats (I didn't test with an iPhone, but with Android and Windows 10 I just need to handle m4a).

But I think for a larger program (or for a function that can be called from several other files) you would be right, and it would be cleaner to add a case _: that raises an error if somebody tries to call convtomp3() with a csv, a png or another unforeseen format. ;)

(Apr-10-2024, 03:01 PM)Gribouillis Wrote: For the temporary files, a good solution is to put them all in a temporary directory which is automatically destroyed at the end of a context

>>> from tempfile import TemporaryDirectory, NamedTemporaryFile
>>>
>>> with TemporaryDirectory() as tdir:  # create temporary directory
...     print(tdir)
...     with NamedTemporaryFile(dir=tdir) as f:  # create temporary file inside temporary directory
...         print(f.name)
...     with NamedTemporaryFile(dir=tdir) as f:  # same thing
...         print(f.name)
... 
/tmp/tmpdhu6psra
/tmp/tmpdhu6psra/tmp0b_i1wfp
/tmp/tmpdhu6psra/tmpuvc0lsbc
>>> 
>>> # all the files are gone now

I'll try to add something like that. Do you advise me to add it in the transcriptor function?
Or is it possible to use os.remove(temp_file.name) after calling transcriptor at line 58?

**Gribouillis** · (This post was last modified: Apr-11-2024, 09:02 AM by Gribouillis.)

(Apr-11-2024, 07:17 AM)slain Wrote: Or is it possible to use os.remove(temp_file.name) after calling transcriptor at line 58?

Reading the documentation about NamedTemporaryFile, the temporary file should be automatically deleted when it is closed because the default values of parameters 'delete' and 'delete_on_close' are True. So another mystery is why the temporary files persist on your disk?
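
A quick way to check this behaviour on a given machine is sketched below, assuming the default parameters of NamedTemporaryFile. The function name temp_file_lifetime is hypothetical and the bytes written are a placeholder, not real audio.

```python
from pathlib import Path
from tempfile import NamedTemporaryFile


def temp_file_lifetime(suffix=".m4a"):
    """Report whether a named temporary file exists inside and after its context."""
    with NamedTemporaryFile(suffix=suffix) as temp_file:
        temp_file.write(b"placeholder bytes")
        temp_file.flush()
        name = temp_file.name
        existed_inside = Path(name).exists()
    # The context has been left, so the file has been closed.
    existed_after = Path(name).exists()
    return name, existed_inside, existed_after


name, existed_inside, existed_after = temp_file_lifetime()
print(name, existed_inside, existed_after)
```

If the last value printed is False, the temporary file itself is being removed as documented, and any file that remains on disk was written by some other part of the program.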

I'd suggest that you leave the context as soon as possible that is to say

if audio_source is not None:
    st.toast("Lancement de la transcription")
     
    with NamedTemporaryFile(suffix=Path(audio_source.name).suffix) as temp_file:
        temp_file.write(audio_source.getvalue())
        temp_file.seek(0)         # <-- This is not useful because transcriptor only uses the file name.
        cpte_rendu = transcriptor(temp_file) # <-- we are done with temp_file, now unindent to leave the context
    st.write("Transcription : ")
    st.write(cpte_rendu)
    st.sidebar.download_button(label = "Télécharger le compte-rendu", data = cpte_rendu, file_name = "cr.txt", mime = "text/plain")

If you want to stay with pathlib, you could also write

Path(temp_file.name).unlink(missing_ok=True)

(Apr-11-2024, 07:17 AM)slain Wrote: The solution I found on stack overflow just solves my problem

What revolts me is that you solve the bug without understanding it, namely why the code reaches a theoretically unreachable point. So the code still has the bug, but you don't perceive it anymore.
----
I think the transcriptor function should take only the filename as argument, not the temporary file open for writing.
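
One way to picture this suggestion, as a minimal sketch rather than the actual script: write the uploaded bytes into a TemporaryDirectory, hand only the path to the worker, and let the directory remove everything when the context ends. The names run_on_temp_copy and worker are hypothetical; worker stands in for a transcriptor() that accepts a path.

```python
from pathlib import Path
from tempfile import TemporaryDirectory


def run_on_temp_copy(data: bytes, original_name: str, worker):
    """Write `data` to a temporary file and call worker(path) on it."""
    with TemporaryDirectory() as tdir:
        # Keep the original suffix so the worker can still detect the format.
        path = Path(tdir) / f"upload{Path(original_name).suffix}"
        path.write_bytes(data)
        # The file is already closed here, so the worker only needs the path.
        return worker(path)


# Usage with a stand-in worker that only reports what it received.
seen = run_on_temp_copy(b"fake audio", "memo.m4a", lambda p: (p, p.stat().st_size))
```

Any intermediate file that the worker writes inside the same directory disappears together with the upload, so no separate cleanup call is needed for it.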

slain · Apr-11-2024, 09:11 AM

(Apr-11-2024, 08:55 AM)Gribouillis Wrote: Reading the documentation about NamedTemporaryFile, the temporary file should be automatically deleted when it is closed because the default values of parameters 'delete' and 'delete_on_close' are True. So another mystery is why the temporary files persist on your disk?

I'd suggest that you leave the context as soon as possible that is to say
if audio_source is not None:
    st.toast("Lancement de la transcription")
     
    with NamedTemporaryFile(suffix=Path(audio_source.name).suffix) as temp_file:
        temp_file.write(audio_source.getvalue())
        temp_file.seek(0)    
        cpte_rendu = transcriptor(temp_file) # <-- we are done with temp_file, now unindent to leave the context
    st.write("Transcription : ")
    st.write(cpte_rendu)
    st.sidebar.download_button(label = "Télécharger le compte-rendu", data = cpte_rendu, file_name = "cr.txt", mime = "text/plain")

I just did it; the good news is that my script still works.
But it re-created a file audio1.mp3 (which I had deleted from the filesystem before re-running), and the file remains.
I read a little about NamedTemporaryFile and understood that on Windows the file is deleted when the context is left, but not on Linux, right?
So would this be because my test machine runs Linux?

(Apr-11-2024, 08:55 AM)Gribouillis Wrote: If you want to stay with pathlib, you could also write
Path(temp_file.name).unlink(missing_ok=True)

Should I put this on line 59? Just after having passed temp_file to the function transcriptor?

(Apr-11-2024, 08:55 AM)Gribouillis Wrote: What revolts me is that you solve the bug without understanding it, namely why the code reaches a theoretically unreachable point. So the code still has the bug, but you don't perceive it anymore.

Sorry, I didn't mean to revolt you.
As I understood it, the only thing I found on Stack Overflow was a user making the same mistake as me: instead of reading his file, he was trying to get the value of the file (getvalue() is useful only when we create the temporary file and fill it with the contents of the audio stream, in my case).
I think we are mixing up two different things that happened in my code: the first was my mistake of calling getvalue() on my file instead of reading it (so this one is solved); the second is about the convtomp3() function. There you are right, I should handle the case where the function is called with a file whose suffix is not one of [m4a, wav, wma]. Is that what you meant?

slain · (This post was last modified: Apr-11-2024, 09:49 AM by slain.)

Oh, I have another question again

So, I know my code is still not perfect, but I would like to understand one more point, and solve it if possible.
At this point, my code lets me convert an m4a file to mp3 and then transcribe it, which is good.
But on the Linux system hosting it, nvidia-smi returns this:

Thu Apr 11 11:16:58 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.29.06              Driver Version: 545.29.06    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 1050 Ti     Off | 00000000:01:00.0 Off |                  N/A |
|  0%   39C    P8              N/A /  95W |   2294MiB /  4096MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      9299      C   /home/ild/miniconda3/bin/python            2290MiB |
+---------------------------------------------------------------------------------------+

That means the GPU's memory is still used by Python, although it no longer needs it (because it has finished transcribing the audio file).
Is there a way to "deallocate" (not sure if that is the right word) the GPU memory, so that I can transcribe a new audio file without having to restart Streamlit?

Thanks

Edit: I found a way to clear some memory.
I added torch.cuda.empty_cache()
From 3302MiB used, I'm now down to 3108MiB used, which allows me to run a transcription again.
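
The modest drop is consistent with how torch.cuda.empty_cache() works: it releases only cached blocks that no tensor is using, while memory held by live objects such as the loaded model stays allocated. A minimal sketch to observe this, where the function name reserved_before_and_after is hypothetical and the function returns None when PyTorch or CUDA is not available:

```python
def reserved_before_and_after():
    """Return reserved GPU bytes before and after empty_cache(), or None."""
    try:
        import torch
    except ImportError:
        return None  # PyTorch is not installed
    if not torch.cuda.is_available():
        return None  # no CUDA device to inspect
    before = torch.cuda.memory_reserved()
    torch.cuda.empty_cache()  # frees cached blocks that are not in use
    after = torch.cuda.memory_reserved()
    return before, after


result = reserved_before_and_after()
print(result)
```

Calling it right after a transcription shows how much of the reserved memory was only cache and how much is still held by the model.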

slain · Apr-11-2024, 09:54 AM

(Apr-11-2024, 09:02 AM)Gribouillis Wrote: If you want to stay with pathlib, you could also write
Path(temp_file.name).unlink(missing_ok=True)

I tried to add it at line 61; ls still shows me an audio1.mp3.
I tried it both at the same indentation as cpte_rendu = transcriptor(temp_file) and unindented; audio1.mp3 still survives.
Is this because I'm not putting it in the right place?

**Gribouillis** · Apr-11-2024, 10:49 AM

(Apr-11-2024, 09:54 AM)slain Wrote: tried to add it at line 61, ls still shows me an audio1.mp3.

Is audio1.mp3 the name of your temporary file? Or was it written by some other portion of the code?

slain · Apr-11-2024, 12:35 PM

(Apr-11-2024, 10:49 AM)Gribouillis Wrote: Is audio1.mp3 the name of your temporary file? Or was it written by some other portion of the code?

For me, it's written in the function convtomp3()

def convtomp3(file):
    """Convert an audio file from m4a, wav or wma to mp3"""
    match Path(file.name).suffix:
        case ".m4a":
            wav_audio = AudioSegment.from_file(file, format="m4a")
            result = wav_audio.export("audio1.mp3", format="mp3")
            return result
        case ".wav":
            wav_audio = AudioSegment.from_file(file, format="wav")
            result = wav_audio.export("audio1.mp3", format="mp3")
            return result
        case ".wma":
            wav_audio = AudioSegment.from_file(file, format="wma")
            result = wav_audio.export("audio1.mp3", format="mp3")
            return result
        case _:
            raise ValueError(f"Fichier invalide : {file.name!r}")
    return file  # never reached: every case above returns or raises
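
The three conversion branches of convtomp3() above differ only in the format string, so the suffix check can be isolated in a small lookup. This is a sketch, not the code used in the thread: SUPPORTED_FORMATS and source_format are hypothetical names, and lower-casing the suffix is an added assumption.

```python
from pathlib import Path

# Suffixes handled by convtomp3(), mapped to the format name passed to pydub.
SUPPORTED_FORMATS = {".m4a": "m4a", ".wav": "wav", ".wma": "wma"}


def source_format(filename):
    """Return the audio format for `filename`, or raise ValueError."""
    suffix = Path(filename).suffix.lower()  # assumption: ".M4A" should count too
    if suffix not in SUPPORTED_FORMATS:
        raise ValueError(f"Fichier invalide : {filename!r}")
    return SUPPORTED_FORMATS[suffix]


print(source_format("memo.m4a"))
```

With such a helper, the body of convtomp3() could shrink to a single AudioSegment.from_file(file, format=source_format(file.name)) followed by the same export call.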

Maybe I should add an instruction at the end of each case, like os.remove("audio1.mp3") or gc.collect()?
I kept googling (not sure if this is a good method, but my idea is not to just ask my question and wait for someone to do the job for me), and found the garbage collector.
So I added import gc to my imports and gc.enable() after initializing device and pipe; I hope this is not an error.

**Gribouillis** · Apr-11-2024, 01:06 PM

(Apr-11-2024, 12:35 PM)slain Wrote: Maybe I should give an instruction at the end of each case, like os.remove(audio1.mp3) or gc.collect()?

Don't manipulate the garbage collector, it is completely useless for what you are doing.

On the other hand, yes, calling Path('audio1.mp3').unlink(missing_ok=True) after the call to transcriptor() should work.
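
If the cleanup should also happen when the transcription raises an exception, the unlink call can go in a finally clause. A minimal sketch, in which call_then_unlink and fake_transcriptor are hypothetical names and the latter only imitates a function that leaves a file behind:

```python
import tempfile
from pathlib import Path


def call_then_unlink(work, *leftovers):
    """Run work() and delete the given files afterwards, even on error."""
    try:
        return work()
    finally:
        for name in leftovers:
            Path(name).unlink(missing_ok=True)


# Usage with a stand-in that writes a leftover file, as convtomp3() does.
with tempfile.TemporaryDirectory() as tdir:
    leftover = Path(tdir) / "audio1.mp3"

    def fake_transcriptor():
        leftover.write_bytes(b"mp3 data")
        return "transcription"

    text = call_then_unlink(fake_transcriptor, leftover)
    leftover_exists = leftover.exists()
```

Because missing_ok=True is passed, the call is harmless when no conversion took place and the file was never created.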

slain · (This post was last modified: Apr-11-2024, 01:36 PM by slain.)

(Apr-11-2024, 01:06 PM)Gribouillis Wrote: Don't manipulate the garbage collector, it is completely useless for what you are doing.

OK, I didn't see any difference with or without garbage collector, so I removed it.

(Apr-11-2024, 01:06 PM)Gribouillis Wrote: On the other hand, yes, calling Path('audio1.mp3').unlink(missing_ok=True) after the call to transcriptor() should work.

OK, it works with Path("audio1.mp3").unlink(missing_ok=True) on line 61.
This removes audio1.mp3, and on line 62 Path(temp_file.name).unlink(missing_ok=True) removes (or seems to remove) any uploaded audio file (no trace of my m4a file).

Here is my full code. In my opinion it works; maybe it could be optimized, but I don't know enough about Python to say:

''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
# Upload an audio file, if needed it will be converted to mp3                #
# Then it will be transcripted using bofenghuang/whisper-small-cv11-french   #
# Finally it will give you possibility to download transcription             #
#    in .txt format                                                          #
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

## Imports ##
from pathlib import Path
import streamlit as st
import torch

from tempfile import NamedTemporaryFile
from datasets import load_dataset
from transformers import pipeline
from pydub import AudioSegment

## Initialize environment ##
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
pipe = pipeline("automatic-speech-recognition", model="bofenghuang/whisper-small-cv11-french", device=device)
pipe.model.config.forced_decoder_ids = pipe.tokenizer.get_decoder_prompt_ids(language="fr", task="transcribe")

## Functions ##
def convtomp3(file):
    """Convert an audio file from m4a, wav or wma to mp3"""
    match Path(file.name).suffix:
        case ".m4a":
            wav_audio = AudioSegment.from_file(file, format="m4a")
            result = wav_audio.export("audio1.mp3", format="mp3")
            return result
        case ".wav":
            wav_audio = AudioSegment.from_file(file, format="wav")
            result = wav_audio.export("audio1.mp3", format="mp3")
            return result
        case ".wma":
            wav_audio = AudioSegment.from_file(file, format="wma")
            result = wav_audio.export("audio1.mp3", format="mp3")
            return result
        case _:
            raise ValueError(f"Fichier invalide : {file.name!r}")
    return file  # never reached: every case above returns or raises

def transcriptor(file):
    """Transcript from an audio file to a string"""
    if(Path(file.name).suffix != ".mp3"):
        file=convtomp3(file)
    waveform = file.read()
    predicted_sentence  = pipe(waveform, max_new_tokens=225)
    return str(predicted_sentence)

## Display ##
st.title("Convertisseur / Transcripteur")
audio_source = st.sidebar.file_uploader(label = "Fichiers audio uniquement", type=["mp3","m4a","wav","wma"])
if audio_source is not None:
    st.toast("Lancement de la transcription")
    
    with NamedTemporaryFile(suffix=Path(audio_source.name).suffix) as temp_file:
        temp_file.write(audio_source.getvalue())
        temp_file.seek(0)    
        cpte_rendu = transcriptor(temp_file)
        Path("audio1.mp3").unlink(missing_ok=True)
        Path(temp_file.name).unlink(missing_ok=True)
    st.write("Transcription : ")
    st.write(cpte_rendu)
    st.sidebar.download_button(label = "Télécharger le compte-rendu", data = cpte_rendu, file_name = "cr.txt", mime = "text/plain")
    torch.cuda.empty_cache()

Not sure about the future; maybe I'll try, with a copy of my script, to call cpte_rendu = transcriptor(audio_source.getvalue()) directly.
For my use case, this question is solved.
Thank you for your patience and your help.
