Feb-24-2017, 08:32 PM
A .wav file (like most other audio and image files) consists of two main parts: a header and the actual data. For a canonical PCM .wav file the 'header' is 44 bytes (0 - 43) and the data is the remainder of the file (bytes 44 - end). The 'header' is further divided into 'file' and 'format' bytes (chunks). The following is something I wrote to 'read' that information. It's somewhat long, as I included a lot of comments. It only requires the built-in struct module. This is based on the information provided here: .WAV format. Hope this helps you understand what's going on 'under the hood'.
The script:
#!/usr/bin/env python3
"""
be = big endian, le = little endian, followed by the field size in bytes

file_id     = bytes 0-3;   be, 4, ASCII (each character = 1 byte), default 'RIFF'
file_sz     = bytes 4-7;   le, 4, 32 bit integer, file size minus 8 bytes
file_type   = bytes 8-11;  be, 4, file type, default = 'WAVE'
file_format = bytes 12-15; be, 4, format chunk marker, default = 'fmt ',
              NOTE there is a space at the end of the marker
format_data = bytes 16-19; le, 4, length of the format data that follows, 16 for PCM
format_type = bytes 20-21; le, 2, type of format, default 1 (PCM),
              other number indicates compression
num_ch      = bytes 22-23; le, 2, number of channels, 1=mono, 2=stereo
sample_rate = bytes 24-27; le, 4, sample frames per sec (i.e. Hz),
              default 44100 (CD quality)
byte_rate   = bytes 28-31; le, 4, (sample_rate * bits_sample * num_ch) / 8
blk_align   = bytes 32-33; le, 2, (bits_sample * num_ch) / 8, rounded up to the
              next whole number. (A sample frame for a 16-bit mono wave is
              2 bytes. A sample frame for a 16-bit stereo wave is 4 bytes. Etc.)
bits_sample = bytes 34-35; le, 2, bits per sample (i.e. a 16-bit waveform
              would have wBitsPerSample = 16)
d_tag       = bytes 36-39; be, 4, marks beginning of data, default = 'data'
data_sz     = bytes 40-43; le, 4, size of data section;
              (number of samples * num_ch * bits_sample) / 8
"""
import struct


def main(w_file):
    # Sizes and counts are unsigned in the header, so unpack them
    # with 'I' (4 bytes) and 'H' (2 bytes) rather than 'i' and 'h'.
    with open(w_file, 'rb') as bif:
        bif.seek(0)
        file_id = str(bif.read(4), encoding='utf-8')

        bif.seek(4)
        sz = bif.read(4)
        file_sz = struct.unpack('<I', sz)

        bif.seek(8)
        file_type = str(bif.read(4), encoding='utf-8')

        bif.seek(12)
        file_format = str(bif.read(4), encoding='utf-8')

        # One 4-byte integer, not two 2-byte integers
        bif.seek(16)
        fd = bif.read(4)
        format_data = struct.unpack('<I', fd)

        bif.seek(20)
        ft = bif.read(2)
        fmt = struct.unpack('<H', ft)
        if fmt[0] == 1:
            format_type = 'PCM (Uncompressed)'
        else:
            format_type = 'Compressed'

        bif.seek(22)
        nc = bif.read(2)
        num_ch = struct.unpack('<H', nc)

        bif.seek(24)
        sr = bif.read(4)
        sample_rate = struct.unpack('<I', sr)

        bif.seek(28)
        br = bif.read(4)
        byte_rate = struct.unpack('<I', br)

        bif.seek(32)
        ba = bif.read(2)
        blk_align = struct.unpack('<H', ba)

        bif.seek(34)
        bps = bif.read(2)
        bits_sample = struct.unpack('<H', bps)

        bif.seek(36)
        d_tag = str(bif.read(4), encoding='utf-8')

        bif.seek(40)
        ds = bif.read(4)
        data_size = struct.unpack('<I', ds)

    # struct.unpack always returns a tuple, hence the [0] indexing.
    # file_sz excludes the first 8 bytes of the file, so add them back.
    info = (file_id, file_sz[0] + 8, file_type, file_format, format_data[0],
            format_type, num_ch[0], sample_rate[0], byte_rate[0],
            blk_align[0], bits_sample[0], d_tag, data_size[0])
    return info


if __name__ == '__main__':
    wav_file = 'SineWave_440Hz.wav'  # Use whatever .wav file you'd like
    stats = main(wav_file)
    print("\n.WAV File information for ", wav_file)
    print("=" * 79, "\n")
    print("File ID: {} \nFile Size: {} \nFile Type: {} \nFile Format: {} \nLength of Format Data: {}"
          "\nType of Format: {} \nNumber of Channels: {} \nSample Rate: {} \nByte Rate: {} \nBlock Align: {}"
          "\nBits Per Sample: {} \nData Header: {} \nSize of Data Section: {}".format(
              stats[0], stats[1], stats[2], stats[3], stats[4], stats[5], stats[6],
              stats[7], stats[8], stats[9], stats[10], stats[11], stats[12]))
Output:
C:\Python36\python.exe C:/Python/Sound/read_wave.py
.WAV File information for SineWave_440Hz.wav
===============================================================================
File ID: RIFF
File Size: 264644
File Type: WAVE
File Format: fmt
Length of Format Data: 16
Type of Format: PCM (Uncompressed)
Number of Channels: 1
Sample Rate: 44100
Byte Rate: 88200
Block Align: 2
Bits Per Sample: 16
Data Header: data
Size of Data Section: 264600
Process finished with exit code 0
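As a quick sanity check, the derived header fields can be recomputed from the basic ones. The sketch below just plugs in the numbers from the output above (the variable names mirror the ones in the docstring; the duration calculation is my own addition, not something stored in the header):

```python
# Values copied from the output above
sample_rate = 44100   # Hz
num_ch = 1            # mono
bits_sample = 16
data_size = 264600    # bytes in the data section

blk_align = (bits_sample * num_ch) // 8   # bytes per sample frame
byte_rate = sample_rate * blk_align       # bytes per second
num_frames = data_size // blk_align       # sample frames in the file
duration = data_size / byte_rate          # length in seconds

print(blk_align, byte_rate, num_frames, duration)
# prints: 2 88200 132300 3.0
```

The file size works out the same way: 264600 data bytes plus the 44 byte header gives the 264644 reported as File Size.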
I did not include the actual data, but the concept remains the same: just seek(44), read the appropriate number of bytes per sample (i.e. 1, 2, 3 or 4 bytes for 8, 16, 24 or 32 bits), then unpack it. Writing the file is pretty much just the reverse, except you 'pack' instead of 'unpack'.
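Here is a rough sketch of both directions, assuming 16-bit mono PCM and a plain 44 byte header with no extra chunks. The names write_wav and read_samples, the test tone and the temp file name are all made up for this example, so treat it as an illustration rather than a general-purpose reader or writer:

```python
import math
import os
import struct
import tempfile

HEADER_FMT = '<4sI4s4sIHHIIHH4sI'   # the 13 header fields, 44 bytes total


def write_wav(path, samples, sample_rate=44100):
    """Pack 16-bit mono PCM samples behind a 44 byte header."""
    num_ch, bits_sample = 1, 16
    blk_align = num_ch * bits_sample // 8
    data = struct.pack('<{}h'.format(len(samples)), *samples)
    header = struct.pack(
        HEADER_FMT,
        b'RIFF', 36 + len(data), b'WAVE',          # file size minus 8 bytes
        b'fmt ', 16, 1, num_ch, sample_rate,       # 16 = fmt length, 1 = PCM
        sample_rate * blk_align, blk_align, bits_sample,
        b'data', len(data))
    with open(path, 'wb') as f:
        f.write(header + data)


def read_samples(path):
    """Unpack 16-bit samples, assuming the data starts at byte 44."""
    with open(path, 'rb') as f:
        f.seek(40)                                  # data size field
        (data_size,) = struct.unpack('<I', f.read(4))
        raw = f.read(data_size)                     # now positioned at byte 44
    return list(struct.unpack('<{}h'.format(len(raw) // 2), raw))


# Hypothetical test signal: 10 ms of a 440 Hz sine wave
tone = [int(32767 * math.sin(2 * math.pi * 440 * n / 44100))
        for n in range(441)]
path = os.path.join(tempfile.gettempdir(), 'roundtrip_demo.wav')
write_wav(path, tone)
print(read_samples(path) == tone)   # the samples survive the round trip
```

Note that 'h' only fits 16-bit data. 8-bit .wav samples are unsigned (struct code 'B'), and 24-bit samples have no struct code at all, so those bytes have to be assembled by hand (for example with int.from_bytes).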