Nov-30-2020, 04:44 PM
I would like to extract and label certain segments of an audio file (audio.wav). The start and end times of the segments are given by the DateTimeStamp (first column) and the duration of action in milliseconds (third column) in another file, the annotation file (annotat.csv):
DateTimeStamp ---------- Action -- Duration of action in milliseconds
04/16/20 21:25:36:241 ----- A ----- 502
04/16/20 21:25:36:317 ----- B ----- 2253
04/16/20 21:25:36:734 ----- X ----- 118
04/16/20 21:25:36:837 ----- C ----- 10
04/16/20 21:25:37:537 ----- D ----- 797
04/16/20 21:25:37:606 ----- X ----- 70
04/16/20 21:25:37:874 ----- A ----- 1506
. ----- . -----.
The audio.wav file starts at the time of the first DateTimeStamp in the annotat.csv file. How can I use the info in annotat.csv to save the segments (audio_seg) as individual files with unique filenames that contain the value from the "Action" column?
This is how far I got today:
import numpy as np
import pandas
import librosa
import soundfile as sf

def read_data(annotat, date_format):
    df = pandas.read_csv(annotat, sep=',')

    # Use proper pandas datatypes
    # (the column names below must match the header row of the CSV)
    df['Time'] = pandas.to_datetime(df['DateTime'], format=date_format)
    df['Duration'] = pandas.to_timedelta(df['Duration ms'], unit='ms')
    df['Offset'] = pandas.to_datetime(df['StartOffset ms'], unit='ms')
    df = df.drop(columns=['DateTime', 'Duration ms', 'StartOffset ms'])

    # Compute start and end time of each segment
    # audio starts at time of first segment
    first = df['Time'].iloc[0]
    df['Start'] = df['Time'] - first
    df['End'] = df['Start'] + df['Duration']
    return df

def extract_segments(y, sr, segments):
    # compute segment regions in number of samples
    # (numpy is imported as np, so use np.floor / np.ceil)
    starts = np.floor(segments.Start.dt.total_seconds() * sr).astype(int)
    ends = np.ceil(segments.End.dt.total_seconds() * sr).astype(int)

    # slice the audio into segments
    for start, end in zip(starts, ends):
        audio_seg = y[start:end]
        print('extracting audio segment:', len(audio_seg), 'samples')

segments = read_data("C:/Users/Mergorine/Audio/annotat.csv", date_format="%m/%d/%y %H:%M:%S:%f")
print(segments)
y, sr = librosa.load("C:/Users/Mergorine/Audio/audio.wav", sr=16000, duration=2027)
extract_segments(y, sr, segments)
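One possible way to do the missing step (saving each segment under a unique name that contains the Action) is sketched below. This is only a sketch under some assumptions: it uses the standard-library wave module instead of librosa/soundfile so that it runs without extra packages, it assumes audio.wav is an uncompressed PCM WAV, and the function names (segment_plan, save_segments) and the filename pattern "0000_A.wav" are made up for illustration. The running index in the name is what keeps the files unique when an action such as A or X occurs more than once.

```python
import wave
from datetime import datetime, timedelta
from pathlib import Path

TIME_FORMAT = "%m/%d/%y %H:%M:%S:%f"  # same format string as in the post


def segment_plan(rows, sr):
    """Map (timestamp, action, duration_ms) rows to (filename, start, end).

    start and end are sample indices; the audio is assumed to begin at the
    timestamp of the first row. The filename pattern is hypothetical.
    """
    first = datetime.strptime(rows[0][0], TIME_FORMAT)
    plan = []
    for index, (stamp, action, duration_ms) in enumerate(rows):
        delta = datetime.strptime(stamp, TIME_FORMAT) - first
        # Integer arithmetic avoids floating-point rounding surprises
        offset_us = delta // timedelta(microseconds=1)
        start = offset_us * sr // 1_000_000
        end = start + int(duration_ms) * sr // 1000
        # The running index keeps names unique when an action repeats
        plan.append((f"{index:04d}_{action}.wav", start, end))
    return plan


def save_segments(src_path, rows, out_dir):
    """Cut every planned region out of a PCM WAV file and write it to out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with wave.open(str(src_path), "rb") as src:
        params = src.getparams()
        total = src.getnframes()
        for name, start, end in segment_plan(rows, src.getframerate()):
            # Clamp to the file length so late annotations do not raise
            src.setpos(min(start, total))
            frames = src.readframes(max(0, end - start))
            with wave.open(str(out_dir / name), "wb") as dst:
                dst.setparams(params)  # frame count is fixed up on close
                dst.writeframes(frames)
            written.append(out_dir / name)
    return written
```

With the first two annotation rows and a 16 kHz sample rate, segment_plan would give samples 0 to 8032 for "0000_A.wav" and 1216 to 37264 for "0001_B.wav". If the audio is already loaded as an array with librosa, as in the code above, the same start/end indices could instead be passed to sf.write(filename, y[start:end], sr) from soundfile.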