Python Forum
I need to copy all the directories that do not match the pattern
#1
Greetings!
I need to copy all the directories whose names do not match this pattern:
8 uppercase alphanumeric characters, an underscore, and then 4 digits at the end.
Something like this:
ED1234ND_2345
YD1COP1Z_3456
And so on...

I’m having a hell of a time creating a regex for it.

Here is what I have so far:
import re
from pathlib import Path
tofind = '^[A-Z0-9]{8}_d{4}$' 
for ed in Path('C:/01/TLogs/').iterdir() :
    if ed.is_dir():
        rudir = Path(ed).parts[3]
        #print(f" Dir_Name -{type(rudir)}")
        if re.findall(tofind,rudir) :
            print(f" found -> {rudir}")
I think the problem is in the 'underscore' part of the regex.
Any help is appreciated.
Thank you!
#2
To match a digit in a regex you need \d, not d alone.
You can test your regex at regex101.
You can delete line 6 (the rudir = ... line) and change line 8 (the re.findall test) to this:
if re.search(tofind, ed.stem):
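To see why the missing backslash matters, here is a minimal sketch. The second sample name is made up purely for illustration. Without the backslash, d{4} means the literal letter d repeated four times, so a real directory name can never match it.

```python
import re

broken = r'^[A-Z0-9]{8}_d{4}$'   # d{4}  = the letter "d" four times
fixed = r'^[A-Z0-9]{8}_\d{4}$'   # \d{4} = any four digits

# "ED1234ND_dddd" is an invented name, used only to show what the broken pattern accepts
for name in ["ED1234ND_2345", "ED1234ND_dddd"]:
    print(name, bool(re.search(broken, name)), bool(re.search(fixed, name)))
```

The first name matches only the fixed pattern, and the invented second name matches only the broken one.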
#3
My bad! That was a typo...
It actually looks like this:
tofind = '^[A-Z0-9]{8}_\d{4}$'
but it does not print anything...
If I remove this part of it:
_\d{4}$

it runs and prints, but it does not filter out everything I'm looking for.
#4
It's not hurting you here, but you should get in the habit of using "r-strings" for regex patterns so backslashes don't get modified.
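As a small illustration of what "modified" means here (the sample text "a cat" is just an assumption for the demo): \d is not a string escape, so it survives in a plain string, but \b is, and it silently becomes a backspace character before the regex engine ever sees it.

```python
import re

plain = '\bcat\b'    # \b is converted to a backspace character (0x08)
raw = r'\bcat\b'     # \b reaches the regex engine as a word boundary

print(len(plain), len(raw))              # 5 7
print(bool(re.search(plain, "a cat")))   # False
print(bool(re.search(raw, "a cat")))     # True
```

Recent Python versions also emit a warning for unrecognized escapes such as \d inside a plain string, which is another reason to prefer the r prefix.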

We don't have your filesystem to know what's there. But your pattern seems okay. Your problem may be in the filesystem or how you're trying to prune the names. As mentioned above, use .stem to pull the final component of a Path.

Your pattern is anchored, so it can't match multiple times. Use re.match or re.search instead of re.findall.

import re
tofind = r'^[A-Z0-9]{8}_\d{4}$'

for d in ["ED1234ND_2345", "YD1COP1Z_3456", "mydir"]:
    if re.findall(tofind,d):
        print(f"{d} Matched")
    else:
        print(f"{d}  no match")
Output:
ED1234ND_2345 Matched
YD1COP1Z_3456 Matched
mydir  no match
#5
Thank you for the code!
Could you elaborate on why your snippet works and mine does not?
Even if I use your regex, it does not print anything...
It seems exactly the same.

Your code:
tofind = r'^[A-Z0-9]{8}_\d{4}$'
 
for d in ["_Y151029E_7345", "D151009EN_7295", "mydir","small_11","TST___3456","TST_3456","TST3456"]:
    if re.findall(tofind,d):
        print(f"{d} Matched")
    else:
        print(f"{d}  no match")
And here is what I wrote (the directory names are the same):
import re
from pathlib import Path
tofind = r'^[A-Z0-9]{8}_\d{4}$'
for ed in Path('C:\\01\\TLogs').iterdir() :
    if ed.is_dir():
        rudir = Path(ed).parts[3]
        print(f" Dir_Name -{rudir}")
        if re.findall(tofind,rudir) :
            print(f" found -> {rudir}")
#6
If I run a test:
import re
from pathlib import Path

my_dir = r'G:\div_code\foo_folder'
pattern = re.compile(r'^[A-Z0-9]{8}_\d{4}$')
for ed in Path(my_dir).iterdir():
    if ed.is_dir():
        #print(ed)
        # .stem is the final path component without its suffix
        if re.search(pattern, ed.stem):
            print(ed)
            print(ed.stem)
            print('-' * 30)
Output:
G:\div_code\foo_folder\11111111_1111
11111111_1111
------------------------------
G:\div_code\foo_folder\ED1234ND_2345
ED1234ND_2345
------------------------------
G:\div_code\foo_folder\YD1COP1Z_3456
YD1COP1Z_3456
So it's working as I expected. This is the content of my foo_folder:
Output:
G:\div_code\foo_folder
λ ls
 11111111_1111/        AFILE111_1111.txt   YD1COP1Z_3456/   find_dir.py
'11111111_1111 not'/   ED1234ND_2345/      boy_2.txt        test_folder/
If I want the opposite, that is, all folders that don't match this pattern:
if not re.search(pattern, ed.stem):
Output:
G:\div_code\foo_folder\11111111_1111 not
11111111_1111 not
------------------------------
G:\div_code\foo_folder\test_folder
test_folder
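Since the original goal was to copy the non-matching directories, the negated test can be combined with shutil.copytree. This is only a sketch under some assumptions: the function name copy_non_matching and the destination folder are hypothetical, dirs_exist_ok needs Python 3.8 or newer, and the destination should not sit inside the source folder.

```python
import re
import shutil
from pathlib import Path

# Same pattern as above; fullmatch anchors both ends, so ^ and $ are not needed
PATTERN = re.compile(r'[A-Z0-9]{8}_\d{4}')

def copy_non_matching(src, dst):
    """Copy every sub-directory of src whose name does NOT fit PATTERN into dst."""
    src, dst = Path(src), Path(dst)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for entry in src.iterdir():
        # Skip plain files and skip directories whose name fits the pattern
        if entry.is_dir() and not PATTERN.fullmatch(entry.name):
            shutil.copytree(entry, dst / entry.name, dirs_exist_ok=True)
            copied.append(entry.name)
    return sorted(copied)

# Hypothetical usage; the destination path is an assumption:
# copy_non_matching(r'C:\01\TLogs', r'C:\01\TLogs_copy')
```

The function returns the names it copied, which makes it easy to print or log them and check the filter before trusting it on real data.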
#7
(Feb-04-2022, 08:23 AM)tester_V Wrote: even if I add your regex it is not printing anything...

If it's not printing anything, then line 7 (the Dir_Name print) isn't being reached and your problem isn't related to the regex. As I mentioned, you might not be pulling the path data properly.

Try adding a statement such as print(ed) between lines 4 and 5, right after the for line. Are you getting the paths you expect? Is the 4th component the part you want?
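To check the components without touching the filesystem, something like the sketch below may help. It uses PureWindowsPath, which only parses the string and therefore runs on any OS. The second path, with the extra sub level, is a hypothetical layout used only to show where a fixed index goes wrong.

```python
from pathlib import PureWindowsPath

# Path taken from the original post plus one of the example directory names
p = PureWindowsPath(r'C:\01\TLogs\ED1234ND_2345')
print(p.parts)      # ('C:\\', '01', 'TLogs', 'ED1234ND_2345')
print(p.parts[3])   # ED1234ND_2345
print(p.name)       # ED1234ND_2345

# Hypothetical layout one level deeper: parts[3] is no longer the last component
q = PureWindowsPath(r'C:\01\TLogs\sub\ED1234ND_2345')
print(q.parts[3])   # sub
print(q.name)       # ED1234ND_2345
```

A fixed index like parts[3] depends on how deep the base folder is, whereas .name (or .stem, for names without a dot) always gives the last component.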
#8
Thank you!
I really appreciate your help!
This is the best forum for Python and not just because of the level of knowledge.
Very friendly attitude, not condescending... you guys are great!
Thank you for the snippet and the coaching again!