 Group files according to first few characters in filename
#1
I used the script below, which worked for one set of files that had the same name in different folders. For this second set of files, however, only the first five characters of the name are shared across the folders, so I am finding it difficult to combine them. I tried using .startswith, but it is not giving me any results either.

import os
import pprint

file_list = {}

for dirname in os.listdir(path):
    #print("dir:",dirname)
    # os.path.join works whether or not path ends with a separator
    for root, dirs, files in os.walk(os.path.join(path, dirname)):
        for filename in files:
            # group identical filenames found in different folders
            if filename not in file_list:
                file_list[filename] = []
            file_list[filename].append(os.path.join(root, filename))
print()
pprint.pprint(file_list)
How do I improve this code to look only at the first five characters of each filename before adding it to the dictionary? Thank you.
#2
(Aug-01-2019, 10:36 AM)python_newbie09 Wrote: how do I improve this code to only look for the first five characters in each file before adding into the dictionary?

Hello, I can't understand exactly what you want; my English is not good. Can you provide an example?
#3
I don't know why this shouldn't work. What exactly didn't work when using 'startswith'?

if filename.startswith('...'):         # your five letters
Yeah, an example would help in understanding what you really need.
#4
(Aug-01-2019, 11:32 AM)Friend Wrote: Yeah, an example would help understanding what you really need

My directory and file structure is as below
Main_Folder
--SubFolder1
---aaa_001
---bbb_002

--SubFolder2
---aaa_002
---bbb_004

So the idea is I want to group files based on the first 3 characters in this case and have a dictionary structure that will look as below when printing the file_list

{aaa: [aaa_001,aaa_002],
bbb: [bbb_002, bbb_004]}

When using startswith it just gives a True or False value, so it does not group anything, and the result I get is as below:

{aaa_001: [aaa_001],
aaa_002: [aaa_002],
bbb_002: [bbb_002],
bbb_004:[bbb_004]}
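
A minimal sketch of the difference, using the sample names above: `str.startswith` only answers yes or no, so it cannot serve as the dictionary key, whereas a slice such as `name[:3]` returns the prefix itself. The variable names `names` and `groups` are illustrative and not part of the original script.

```python
# Sample filenames from the folder structure above (no disk access needed).
names = ["aaa_001", "bbb_002", "aaa_002", "bbb_004"]

# startswith returns a bool, which is not useful as a grouping key.
print(names[0].startswith("aaa"))  # True

groups = {}
for name in names:
    prefix = name[:3]  # the first three characters become the key
    # setdefault creates the empty list the first time a prefix is seen
    groups.setdefault(prefix, []).append(name)

print(groups)
# {'aaa': ['aaa_001', 'aaa_002'], 'bbb': ['bbb_002', 'bbb_004']}
```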
#5
(Aug-01-2019, 06:39 PM)python_newbie09 Wrote: So the idea is I want to group files based on the first 3 characters in this case

Try this

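import os

file_list = {}

for root, dirs, files in os.walk(path):
    for name in files:
        identifier = name[:3]  # grouping key: the first three characters
        if identifier not in file_list:
            # first file with this prefix: start a new list for it
            file_list[identifier] = [os.path.join(root, name)]
        else:
            # prefix already seen: add the path to its existing list
            file_list[identifier].append(os.path.join(root, name))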
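
The same grouping can be sketched without the explicit if/else by using `collections.defaultdict` from the standard library, which creates the empty list automatically the first time a key is used. The function name `group_by_prefix` and its parameters `top` and `prefix_len` are hypothetical names chosen for this illustration; this is one possible variant, not the code posted above.

```python
import os
from collections import defaultdict


def group_by_prefix(top, prefix_len=3):
    """Map each filename prefix to the full paths of the files sharing it."""
    grouped = defaultdict(list)
    for root, _dirs, files in os.walk(top):
        for name in files:
            # name[:prefix_len] is the grouping key; a missing key
            # starts out as an empty list, so no if/else is needed
            grouped[name[:prefix_len]].append(os.path.join(root, name))
    return dict(grouped)  # plain dict for printing or further use
```

Calling `group_by_prefix(path, 5)` would group on the first five characters, as asked in the first post.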
#6
(Aug-01-2019, 08:23 PM)cvsae Wrote: Try this

Brilliant!! Thank you. But if I may ask, how is this script different from the other one? I understand that you added the identifier to look up the first few characters, but I don't really understand what is going on in the if/else condition. I would appreciate your explanation. Thanks!
#7
Observation: I would avoid misleading names. file_list is not a good name for a dictionary.
#8
(Aug-02-2019, 05:56 AM)perfringo Wrote: Observation: I would avoid misleading names. Not good to have name file_list for dictionary.

You are right