Python Forum
First time with Python.. need help with simple script
#1
Hi everyone. I'm completely new to Python and I'm hoping to get a little direction with a simple script. I'm trying to get the script to do the following:

1. From the current working directory, gather a list of all the subfolders and store them in a variable.
2. cd into each subfolder and create an additional set of folders (e.g. cd Folder1, then mkdir Folder A, Folder B, Folder C).
3. While in the subfolder (Folder1), move all the files that start with a certain string (e.g. blaster*) into Folder A, then move all the files that start with a different string (e.g. clash*) into Folder B, and so on.

I've written two versions of the script. One works if I feed it individual directory names using sys.argv[1] (myscript.py Folder1). I'm trying to get the second script to run on every subfolder in the current working directory. I hope I'm making sense, and I apologize for sounding like a complete idiot. Here's a copy of the script:

#!/usr/bin/python3

import os
import shutil
import sys
import fnmatch

top_dir = "/home/mmcneil/Desktop/TMP"
root_dir = next(os.walk('.'))[1]

work_dir = ['Blaster','Clash','Force','Lockup','PowerOFF','PowerON','Sign1','Spin','Stab','Swing']
for root_folder in root_dir:

        for folder in work_dir:
                os.makedirs(os.path.join(root_folder,folder), exist_ok=True)
## Everthing ABOVE this line works the way I want it to##

## Trying to get the script to chdir to 'top_dir/root_dir'
for ret_folder in next(os.walk('.'))[1]:
        os.chdir(os.path.join(top_dir,ret_folder))

#for file in os.listdir('.'):
for file in os.path.join(top_dir,ret_folder):
    dst = (work_dir)
    if fnmatch.fnmatch(file, 'blaster*'):
          shutil.move(file, dst[0])

    if fnmatch.fnmatch(file, 'clash*'):
     shutil.move(file, dst[1])

    if fnmatch.fnmatch(file, 'force*'):
     shutil.move(file, dst[2])

    if fnmatch.fnmatch(file, 'lockup*'):
     shutil.move(file, dst[3])

    if fnmatch.fnmatch(file, 'p*w*off*'):
     shutil.move(file, dst[4])

    if fnmatch.fnmatch(file, 'p*w*on*'):
     shutil.move(file, dst[5])

    if fnmatch.fnmatch(file, 'combo*'):
     shutil.move(file, dst[6])

    if fnmatch.fnmatch(file, 'spin*'):
     shutil.move(file, dst[7])

    if fnmatch.fnmatch(file, 'stab*'):
     shutil.move(file, dst[8])

    if fnmatch.fnmatch(file, 'swing*'):
     shutil.move(file, dst[9])
I appreciate any help I can get with this.

Kind Regards,

Shakir
#2
Hi,

You are using os.walk to access the current directory, but remember that os.walk is meant to traverse the entire folder tree. If you only want to list the current directory, os.listdir is enough, or even better, os.scandir.
# List all the elements of the current directory that are folders
folders = [d.name for d in os.scandir('.') if d.is_dir()]
I think the problem is when you try to chdir into 'top_dir/root_dir'. The last for loop (the one on line 23, for file in os.path.join(top_dir,ret_folder):) is supposed to run for each of the ret_folder values, but because it sits outside the for loop on line 19, it only runs for the last value. Also remember that os.walk returns the folders in no particular order.
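To illustrate that point, here is a minimal sketch (the folder names are made up for the example, they are not taken from your directory). Once a for loop has finished, its variable keeps only the last value, so anything written after the loop instead of inside it sees just that one folder:

```python
# Hypothetical folder names, purely for illustration.
folders = ['Folder1', 'Folder2', 'Folder3']

visited_outside = []
for ret_folder in folders:
    pass  # the first loop ends here; nothing else is inside it

# This statement is outside the loop, so it runs once, with the last value.
visited_outside.append(ret_folder)

visited_nested = []
for ret_folder in folders:
    # This statement is inside the loop, so it runs once per folder.
    visited_nested.append(ret_folder)

print(visited_outside)  # ['Folder3']
print(visited_nested)   # ['Folder1', 'Folder2', 'Folder3']
```

So the code that moves the files has to be indented under the loop over the folders.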

As a small recommendation, try to chdir as little as possible. In your case the whole loop can work without changing the execution directory, because shutil.move works fine with absolute paths (os.getcwd gives you the full path of the current folder).
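As a rough sketch of what "no chdir" can look like: build the full source and destination paths with os.path.join and hand them to shutil.move. The helper name move_into_subfolder, its parameters and the file name blaster01.wav are assumptions made up for this example, and the demonstration runs in a temporary directory so it does not touch your files:

```python
import os
import shutil
import tempfile


def move_into_subfolder(folder, filename, subfolder):
    """Move folder/filename into folder/subfolder without calling os.chdir.

    The function and its parameters are hypothetical, for illustration only.
    """
    src = os.path.join(folder, filename)
    dst_dir = os.path.join(folder, subfolder)
    os.makedirs(dst_dir, exist_ok=True)
    # shutil.move returns the path of the moved file.
    return shutil.move(src, dst_dir)


# Small demonstration in a throwaway temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    folder = os.path.join(tmp, 'Folder1')
    os.makedirs(folder)
    with open(os.path.join(folder, 'blaster01.wav'), 'w') as fh:
        fh.write('demo')

    start_dir = os.getcwd()
    new_path = move_into_subfolder(folder, 'blaster01.wav', 'Blaster')
    moved_ok = os.path.isfile(os.path.join(folder, 'Blaster', 'blaster01.wav'))
    cwd_unchanged = os.getcwd() == start_dir
```

The working directory is the same before and after the move, which is the whole point: nothing later in the script depends on where you happen to be.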

Another nice trick to avoid repeating the same fnmatch pattern is to store the masks in a dictionary. If it is {mask: Folder}, many masks can go to the same directory; if you use the opposite, {Folder: mask}, each folder can only receive one set of files.

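mapping = {'Blaster': 'blaster*', 'Clash': 'clash*'} # And all the others...
# os.path.join only builds the path string; os.listdir lists what is inside it
source = os.path.join(top_dir, ret_folder)
for file in os.listdir(source):
    for target in mapping:
        if fnmatch.fnmatch(file, mapping[target]):
            # Move with full paths so no chdir is needed
            shutil.move(os.path.join(source, file), os.path.join(source, target))
            break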
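For comparison, this is a small sketch of the other orientation, {mask: Folder}, where several masks end up in the same folder. The 'shutdown*' mask and the helper name folder_for are invented for the example; only the idea comes from the explanation above:

```python
import fnmatch

# Hypothetical mask -> folder table; two masks share the 'PowerOFF' folder.
mask_to_folder = {
    'blaster*': 'Blaster',
    'p*w*off*': 'PowerOFF',
    'shutdown*': 'PowerOFF',  # made-up extra mask, for illustration only
}


def folder_for(filename):
    """Return the folder of the first mask that matches, or None."""
    for mask, folder in mask_to_folder.items():
        if fnmatch.fnmatch(filename, mask):
            return folder
    return None


print(folder_for('blaster01.wav'))
print(folder_for('shutdown1.wav'))
print(folder_for('readme.txt'))
```

Which orientation is better depends on whether you think of the table as "where does this kind of file go" or "what belongs in this folder".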
#3
Here's an alternative to os.walk that handles one directory at a time.
It uses an f-string (Python 3.6+).
from pathlib import Path


class FindPaths:
    def __init__(self):
        self.current_dir = None

    def get_info(self, path, what=None):
        """
        return list of directories (what == 'd', or None) or files (what == 'f') in path

        :param path: path for search
        :param what: 'd' or default (None) for directory list, 'f' for files
        :return: list of dirs or files
        """
        if what is None or what == 'd':
            return [item for item in path.iterdir() if item.is_dir()]
        else:
            return [item for item in path.iterdir() if item.is_file()]

    def get_dir_info(self, path):
        """
        Generator which given a starting directory yields a tuple of two lists, first for subdirectories,
        second for files.

        :param path: pathname
        :return: two lists, dirs and files
        """
        yield self.get_info(path, 'd'), self.get_info(path, 'f')


def testit():
    # for test, start with current path
    homepath = Path('.')
    # set starting directory two up
    targetpath = homepath / '..' / '..'

    fp = FindPaths()
    def Walkdir(path):
        for dirs, files in fp.get_dir_info(path):
            print(f'\nDirs: {dirs}\nFiles: {files}')
            for dir in dirs:
                Walkdir(dir)
    Walkdir(targetpath)

if __name__ == '__main__':
    testit()
#4
(May-03-2018, 10:07 AM)killerrex Wrote: Hi,

You are using os.walk to access the current directory, but remember that os.walk is meant to traverse the entire folder tree. If you only want to list the current directory, os.listdir is enough, or even better, os.scandir.

First of all, thanks for getting back to me killerrex. I've modified the script to look like this:
#!/usr/bin/python3

import os
import shutil
import sys
import fnmatch

top_dir = "/home/mmcneil/Desktop/TMP"
root_dir = [d.name for d in os.scandir('.') if d.is_dir()]

mapping = {'Blaster': 'blaster*', 'Clash': 'clash*', 'Force': 'force*', 'Lockup': 'lockup*', 'PowerOFF': 'p*w*off*', 'PowerON': 'p*w*on*', 'Sign1': 'combo*', 'Spin': 'spin*', 'Stab': 'stab*', 'Swing': 'swing*'}

work_dir = ['Blaster','Clash','Force','Lockup','PowerOFF','PowerON','Sign1','Spin','Stab','Swing']
for root_folder in root_dir:

        for folder in work_dir:
                os.makedirs(os.path.join(root_folder,folder), exist_ok=True)
## Everthing ABOVE this line works the way I want it to##


## Trying to get the script to chdir to 'top_dir/root_dir'
for ret_folder in root_dir:
 os.chdir(ret_folder)
 for file in os.path.join(top_dir, ret_folder):
  for target in mapping:
   if fnmatch.fnmatch(file, mapping[target]):
    shutil.move(file, target)
Now the folders listed in "work_dir" get created with no problems. The problem, as you described, is when I try to 'cd' into 'root_dir'. So my question to you is: how do I tell the script to go into each 'root_dir' and move the files into the folders defined in 'mapping'? Also, thank you so much for showing me how to condense the fnmatch schema.

Kind Regards,

Shakir
#5
Still hoping to get a bit more help here.

Thanks
#6
Hi Shakir,

What I was saying is that it can be better to avoid chdir altogether. Changing the current directory is practical for quick-and-dirty code on the command line and on some really obscure special occasions, but normally it is just a source of bugs...

This code is what I managed to understand from your explanation (maybe I got the idea of what you want to do wrong, but look at the tools for playing with paths and adapt them to your code):
#!/usr/bin/python3
 
import os
import shutil
import sys
import fnmatch
 
top_dir = "/home/mmcneil/Desktop/TMP"
root_dir = [d.name for d in os.scandir('.') if d.is_dir()]
 
mapping = {
    'Blaster': 'blaster*', 'Clash': 'clash*', 'Force': 'force*',
    'Lockup': 'lockup*', 'PowerOFF': 'p*w*off*',
    'PowerON': 'p*w*on*', 'Sign1': 'combo*', 'Spin': 'spin*', 
    'Stab': 'stab*', 'Swing': 'swing*'
}

## For every folder in the current directory, create the set of subdirectories.
#  So if the current directory has the folders a, b, c, everything ends up as
#   ./a/Blaster    ./b/Blaster   ./c/Blaster
#   ./a/Clash      ./b/Clash     ./c/Clash
#   ./a/...        ./b/...       ./c/...
for root_folder in root_dir:
    for section in mapping:
        os.makedirs(os.path.join(root_folder, section), exist_ok=True)
## Everything ABOVE this line works the way I want it to##
 
## Try to move all the files that are under top_dir/<Folder>/<glob> to
## ./<Folder>/<section>/
for ret_folder in root_dir:
    
    source = os.path.join(top_dir, ret_folder)
    # List all the files in the source
    for entry in os.scandir(source):
        if not entry.is_file():
            continue
        for section, pattern in mapping.items():
            if not fnmatch.fnmatch(entry.name, pattern):
                continue
            # Move top_dir/<Folder>/file to ./<Folder>/<section>/
            ori = os.path.join(source, entry.name)
            dst = os.path.join('.', ret_folder, section)
            shutil.move(ori, dst)
            break
I think the problem in your code was in the line
for file in os.path.join(top_dir, ret_folder):
That does not list all the files in the directory "<top_dir>/<ret_folder>"; it iterates over the characters of the path string (so file was "/", "h", "o", "m", "e"...).
That is one of the advantages of Python (you can iterate over almost anything), but sometimes it is not what you want.
To solve this type of problem, the best thing is to put some traces (print calls) in your code, or use a debugger to set a breakpoint before the loop.
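A minimal sketch of the difference, with a trace of the kind I mean. The path and the file names are made up, and the listing runs in a temporary directory so it works anywhere:

```python
import os
import tempfile

path = os.path.join('home', 'user')  # hypothetical path; it is just a string

# Iterating over the string yields one character at a time.
chars = [c for c in path]
print(chars[:4])

# Listing the directory contents needs os.listdir (or os.scandir).
with tempfile.TemporaryDirectory() as tmp:
    for name in ('blaster01.wav', 'clash01.wav'):
        open(os.path.join(tmp, name), 'w').close()
    names = sorted(os.listdir(tmp))
    for file in names:
        # A trace like this shows at once what the loop is really receiving.
        print('processing:', repr(file))
```

With a print like that inside your original loop you would have seen single characters instead of file names.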
#7
(May-04-2018, 10:51 PM)killerrex Wrote: Hi Shakir, What I was saying is that it can be better to avoid chdir altogether. Changing the current directory is practical for quick-and-dirty code on the command line and on some really obscure special occasions, but normally it is just a source of bugs... This code is what I managed to understand from your explanation (maybe I got the idea of what you want to do wrong, but look at the tools for playing with paths and adapt them to your code)
Thank you soooooo much killerrex. This works EXACTLY how I needed it to. I'm now going to do myself a favor and go through Google's free 2 day Python course and also take advantage of a few of the available free online courses. I really like Python. One question: after looking at the last part of your code, beginning on line 32, how would I have known to do that? Being a noob and all. Thanks again.

Shakir
#8
(May-06-2018, 02:06 AM)shakir_abdul_ahad Wrote: One question: after looking at the last part of your code, beginning on line 32, how would I have known to do that? Being a noob and all. Thanks again.
Well... have you seen this in the Python documentation?
Quote:keep this under your pillow
It is not a joke.

Really, that part of the code will not look so special to you after a few weeks of using Python. The only thing different about it is my use of continue to quickly discard the uninteresting passes of a loop (other people prefer an if/else; it is just a matter of preference) and of break to stop testing once the file has been moved. The rest is just a matter of reading the os entry in the Python library reference.
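To show that the two styles are equivalent, here is a small sketch with made-up file names and a single pattern. Both loops collect exactly the same names:

```python
import fnmatch

names = ['blaster01.wav', 'notes.txt', 'clash01.wav']  # made-up file names
pattern = 'blaster*'

# Style 1: 'continue' discards the uninteresting passes early.
matched_with_continue = []
for name in names:
    if not fnmatch.fnmatch(name, pattern):
        continue
    matched_with_continue.append(name)

# Style 2: the same logic with the work nested inside the 'if'.
matched_with_if = []
for name in names:
    if fnmatch.fnmatch(name, pattern):
        matched_with_if.append(name)

print(matched_with_continue == matched_with_if)
```

The continue version keeps the main work at a shallower indentation, which I find easier to read when the body gets long.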

One thing that is really important, even though it does not look like part of the code, is the comments. My professional code has even more comments, describing what I expect to find in each part of the code that is a little bit complex. Always assume that your code must be understood, and fixed, by someone who has less experience than you...
When you plan to use something complex (like the nested for loops in the previous code, or a long regex), write the comment before writing the code. If you cannot summarise it in a few sentences, your code will be too complex and might backfire at any moment.

Good luck learning Python!

