Python Forum

Full Version: How and where to store a data for path tree?
You're currently viewing a stripped down version of our content. View the full version with proper formatting.
Hello.
I am developing a small scrapper that allows to save images from different urls and save it on many different folders. I have the skeleton for directory hierarchy(6-8 folders and 3-6 subfolders) and basic names, but each session I need to create new root folder with current date, and some custom text from text input.

I want that my folder structure would be easy to edit from the outside of script. Also I want to be able in future to add a feature where I could visually create directory tree with using wxpython treectrls.

My question is what's the best way to store all path tree data? So
1) It would be easy and comprehensively to edit it outside of a script
2) It could be quickly converted and handled by python.


Maybe xml, or json or a simple python file with lists(but it looks too hard to read a hierarchy for me, maybe I wrong)?
I couldn't find the solution or common pattern at google.
I use relative path's all the time, using pathlib
I keep all of my paths in one file for each project, and name it ProjectPaths.py (replace 'Project' with actual name, like 'IllinoisPaths.py'
I then usually define paths for each type of file, csv, json, html, txt etc.
Note that the directory will be created if it doesn't exist.
It will not overwrite existing directories of same name.

example:
# BusinessPaths.py

from pathlib import Path
import os


class BusinessPaths:
    def __init__(self):
        os.chdir(os.path.abspath(os.path.dirname(__file__)))
        self.homepath = Path('.')
        self.rootpath = self.homepath / '..'

        self.datapath = self.rootpath / 'data'
        self.datapath.mkdir(exist_ok=True)

        self.dbpath = self.datapath / 'database'
        self.dbpath.mkdir(exist_ok=True)

        self.htmlpath = self.datapath / 'html'
        self.htmlpath.mkdir(exist_ok=True)

        self.idpath = self.datapath / 'Idfiles'
        self.idpath.mkdir(exist_ok=True)
        
        self.jsonpath = self.datapath / 'json'
        self.jsonpath.mkdir(exist_ok=True)

        self.prettypath = self.datapath / 'pretty'
        self.prettypath.mkdir(exist_ok=True)

        self.textpath = self.datapath / 'text'
        self.textpath.mkdir(exist_ok=True)

        self.tmppath = self.datapath / 'tmp'
        self.tmppath.mkdir(exist_ok=True)

        self.base_url = 'http://searchctbusiness.ctdata.org/'

        self.cities_json = self.jsonpath / 'cities.json'
        self.city_list_url = 'https://ctstatelibrary.org/cttowns/counties'
        self.raw_city_file = self.tmppath / 'raw_city.html'
        # self.cities_text = self.textpath / 'cities.txt'

        self.company_master_json = self.jsonpath / 'CompanyMaster.json'

        self.CompanyMasterDb = self.dbpath / 'CompanyMaster.db'

        self.company_main = self.jsonpath / 'CompanyMain.json'
        self.company_detail = self.jsonpath / 'CompanyDetail.json'
        self.company_principals = self.jsonpath / 'CompanyPrincipals.json'
        self.company_agents = self.jsonpath / 'CompanyAgents.json'
        self.company_filings = self.jsonpath / 'CompanyFilings.json'


if __name__ == '__main__':
    BusinessPaths()
So if I want to open a file to dump some data in my json directory into a file names 'mystuff.json' it's as simeple as:
from BusinessPaths import BusinessPaths
import json

sampledict = {
    'one': 1,
    'two': 2,
    'three': 3
}

bpaths = BusinessPaths()

jsonfile = bpaths.jsonpath / 'junk.json'

with jsonfile.open('w') as fp:
    json.dump(sampledict, fp)