Python Forum
Decompressing bz2 in multiple sub-directories
#21
Hello guys! I have a follow-up question about the code you helped me with earlier (for which I am super thankful to you all). I am running the code below on AWS, but sometimes my machine disconnects with a "broken pipe" error (for no apparent reason, the connection just gets lost). How can I adjust the code so that, when I re-run it after a disconnection, it skips the files that have already been extracted and placed into the target folder? Otherwise, as far as I understand, the code starts the process all over again and overwrites all the already extracted files. Since there are 44,000 files to extract, this is very time-consuming. Thank you in advance for your help!

import os
import bz2

file_counter = 0
# Walk the source tree and decompress every .json.bz2 file into the target folder.
for dirpath, dirnames, files in os.walk('/home/ec2-user/Notebook/Source'):
    for filename in files:
        file_counter += 1  # counts every file seen, not only the .json.bz2 ones
        if filename.endswith('.json.bz2'):
            filepath = os.path.join(dirpath, filename)
            newfilepath = os.path.join('/home/ec2-user/Notebook/Target', "{0}.json".format(file_counter))
            # Opening the target in 'wb' mode overwrites any file that is already there.
            with open(newfilepath, 'wb') as new_file, bz2.BZ2File(filepath, 'rb') as file:
                # Copy in 100 KiB chunks until read() returns an empty bytes object.
                for data in iter(lambda: file.read(100 * 1024), b''):
                    new_file.write(data)
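
One possible way to make the loop above resumable, as a rough sketch rather than a tested fix for this exact setup: check whether the target file already exists before extracting, and write each file under a temporary name that is renamed only once it is complete, so a file cut off by a dropped connection is not mistaken for a finished one. The function name `decompress_tree`, the `.part` suffix, and the returned counts are assumptions made up for illustration. The sketch also sorts directory and file names, on the assumption that the counter-based output names must come out the same on every run for the skip check to be meaningful.

```python
import bz2
import os
import shutil


def decompress_tree(source_root, target_root, chunk_size=100 * 1024):
    """Decompress every *.json.bz2 under source_root into target_root.

    Files whose numbered target already exists are skipped.
    Returns a tuple (extracted, skipped). Hypothetical helper, not from the thread.
    """
    os.makedirs(target_root, exist_ok=True)
    extracted = 0
    skipped = 0
    file_counter = 0
    for dirpath, dirnames, files in os.walk(source_root):
        dirnames.sort()  # fix the traversal order so numbering repeats on every run
        for filename in sorted(files):
            file_counter += 1  # counts every file, as in the original loop
            if not filename.endswith('.json.bz2'):
                continue
            target = os.path.join(target_root, "{0}.json".format(file_counter))
            if os.path.exists(target):
                skipped += 1  # finished in an earlier run
                continue
            partial = target + '.part'  # assumed suffix for unfinished output
            source = os.path.join(dirpath, filename)
            with bz2.BZ2File(source, 'rb') as src, open(partial, 'wb') as dst:
                shutil.copyfileobj(src, dst, chunk_size)
            os.replace(partial, target)  # rename only after the copy has completed
            extracted += 1
    return extracted, skipped
```

With the paths from the post, this would be called as `decompress_tree('/home/ec2-user/Notebook/Source', '/home/ec2-user/Notebook/Target')`. Note that the numbering only stays stable if the contents of the source folder do not change between runs, and that files extracted by the original unsorted loop may have been numbered in a different order.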


Messages In This Thread
RE: Decompressing bz2 in multiple sub-directories - by kiton - Apr-13-2017, 02:25 PM

