Python Forum
Regular Expression Help - Printable Version




Regular Expression Help - Dazzler - Apr-06-2018

Hi All

I wonder if somebody can help. I'm running a script to help me order files for a 3D PDF application. Basically, I have software that outputs .stl files, which the script puts into order so the 3D PDF can play as a small frame-by-frame progression.
My output files look like this:
VS_Subsetup1_Maxillar.stl
VS_Subsetup2_Maxillar.stl
VS_Subsetup3_Maxillar.stl
VS_Subsetup4_Maxillar.stl
VS_Subsetup5_Maxillar.stl
The script I'm running at the moment looks like this:
#re_frame = r'- 0*(\d+) -'
re_frame = r'_*(\d+)_'
But with this script my team has to add an underscore before the number for the program to work.

Can anybody tell me how to solve this so that it works without making any changes to the file name?


RE: Regular Expression Help - Gribouillis - Apr-06-2018

You could use
re_frame = r'[a-zA-Z_]*(\d+)[a-zA-Z_]*[.]stl'
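A quick way to check this pattern is sketched below. It assumes the script applies `re_frame` with `re.search` and reads the number from the first capture group; the helper name `frame_number` is made up for illustration, and the file names are the five from the original post.

```python
import re

re_frame = r'[a-zA-Z_]*(\d+)[a-zA-Z_]*[.]stl'

# The five sample file names from the original post.
names = ['VS_Subsetup%d_Maxillar.stl' % i for i in range(1, 6)]


def frame_number(name):
    """Hypothetical helper: return the captured number as an int, or None."""
    match = re.search(re_frame, name)
    return int(match.group(1)) if match else None


for name in names:
    print(name, '->', frame_number(name))
```

Note that the pattern only accepts a lower-case `.stl` extension, so names ending in `.STL` would not match.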



RE: Regular Expression Help - ljmetzger - Apr-06-2018

Good job Gribouillis.

There is an excellent online regex tester at https://regex101.com/#python courtesy of DeaD_EyE from the following thread: https://python-forum.io/Thread-Regular-Expressions-in-Files-find-all-phone-numbers-and-credit-card-numbers

Lewis


RE: Regular Expression Help - Dazzler - Apr-06-2018

Thanks for your great suggestion, but now it doesn't see the models. Any thoughts?


RE: Regular Expression Help - Gribouillis - Apr-06-2018

(Apr-06-2018, 12:53 PM)Dazzler Wrote: now it doesn't see the models? Any thoughts?
Can you give more details about the issue?


RE: Regular Expression Help - ljmetzger - Apr-06-2018

The regex pattern provided by Gribouillis seems to match all the file names you gave.

We need additional information, e.g. your Python code that uses the regex in context, to give you further help.


RE: Regular Expression Help - Dazzler - Apr-06-2018

# local subdirectory for 3d files (must be STL format) relative to templates/, with forward slashes:
model_subfolder = '../models'


# 3d files are grouped in 2-level local subdirectories, True or False (without quotes, first letter upper case):
grouped_subfolders = False


# local pictures folder for button resources relative to templates/ (jpeg, png), with forward slashes:
image_resource_subfolder = './images'


# Regular Expression pattern match rules for tray number from filename
# see: https://en.wikipedia.org/wiki/Regular_expression

#re_frame = r'[a-zA-Z_]*(\d+)[a-zA-Z_]*[.]stl'
re_frame = r'[a-zA-Z_]*(\d+)[a-zA-Z_]*[.]stl'

# segmented model identifiers
re_teeth = r'teeth'
re_gum = r'gum'
re_crown = r'crown'

# For multi-page template.pdf designs, set where 3D should go (starting at page 1)
pdf_page_number = 2

# Output folder, location for new generated files, relative to templates, with forward slashes:
path_to_output = '../output'
So I have four folders: one for the input template, one for the output PDF, one for the models and one for my template images.

The other part of my setup is below:
# Orthodontic Treatment Plan Tutorial
# Copyright Visual Technology Services Ltd. All rights reserved.
# Python version 3.6 expected

import time
import os
import re
from setup_configuration import *


def input_treatment():
    result = {
        'doctor' : input('Dr. (default=Who): ') or 'Who',
        'patient': input('patient name (default=Anonymous): ') or 'Anonymous',
        'Order no.': input('order no. (default=1): ') or '1'
    }
    return result

def get_frame(path, name):
    if not os.path.isfile(os.path.join(path, name)) or re.search(re_frame, name) is None:
        return 0
    return int(re.search(re_frame, name).group(1))


def is_model_type(name):
    return re.search(re_teeth, name) is not None or re.search(re_gum, name) is not None or re.search(re_crown, name) is not None


def gather_files(frame_count):
    path = os.path.abspath(model_subfolder)
    result = []
    print('frame count = %d'%frame_count)
	
    if grouped_subfolders:
        folders = [f for f in os.listdir(path) if os.path.isdir(os.path.join(path, f))]
        for directory in folders:
            abs_dir = os.path.join(path, directory)
            file_names = [f for f in os.listdir(abs_dir) if get_frame(abs_dir, f) in range(1, frame_count+1)]
            print(directory)
            for filename in file_names:
                result.append({'dir': directory, 'file': filename, 'frame': get_frame(abs_dir, filename)})
    else:
        file_names = [f for f in os.listdir(path) if get_frame(path, f) in range(1, frame_count+1)]
        for filename in file_names:
            result.append({'file': filename, 'frame': get_frame(path, filename)})
    return result


def input_keywords():
    # build the table of ordered keywords
    wildcards = [
        input("keyword to match the Mandibular lower trays (default = Mandibular): ") or 'Mandibular',
        input("keyword to match the Maxillary upper trays (default = Maxillar): ") or 'Maxillar'
    ]
    print('keywords are : ' + ', '.join(wildcards) + ', length is : ' + str(len(wildcards)))
    return wildcards


def input_simplification():
    print('type here the maximum number of triangles, (more triangles gives higher accuracy but larger file size)')
    print(' type 1 for 500 000')
    print(' type 2 for 800 000')
    print(' type 3 for 1 000 000')
    print(' type 4 for 1 500 000')
    print(' type 5 for 2 000 000')
    print(' type 6 for 2 500 000')
    print(' type 7 for 3 000 000')
    mode = int(input("number of triangles for simplification (numeric, default = 2 000 000) : ") or 5)

    result = 2000000
    # compare integers with ==, not with identity (is)
    if mode == 1:
        result = 500000
    elif mode == 2:
        result = 800000
    elif mode in range(3, 8):  # options 3 to 7 map to 1 000 000 ... 3 000 000
        result = 500000 * (mode - 1)

    return result


def create_pdf(substitutes):
    create_settings('case01.pdf3dsettings', substitutes, 'result.pdf3dsettings')
    run_reportgen('result.pdf3dsettings')


def create_html(substitutes):
    create_settings('case01_web.pdf3dsettings', substitutes, 'result_web.pdf3dsettings')
    create_settings('case01.json', substitutes, path_to_output + '/treatment.json', '@{key}@')
    run_reportgen('result_web.pdf3dsettings')


def create_settings(template, substitutes, output, key_format='[ {key} ]'):

    template_file = open(template, 'r')
    output_file = open(output, 'w')
    # apply all substitutions:
    for line in template_file:
        for key, value in substitutes.items():
            line = line.replace(key_format.format(key=key), value)
        output_file.write(line)
    template_file.close()
    output_file.close()
    print('file created : ', output)


def run_reportgen(state_file):

    print('working, please wait...')
    print('""' + path_to_program + '" -state ', state_file, ' -silent "STL Interface""')
    try:
        os.system('""' + path_to_program + '" -state ' + state_file + ' -silent "STL Interface""')
        print('generation completed.')
    except:
        print('error: generation failed')


def generate_assemblies(files):
    assemblies = ''
    input_template = '<InputFileName value="{path}"/>'
    node_template_with_folder = '<NodeName value="{folder} - {file}_{frame}_"/>'
    node_template_bare = '<NodeName value="{file}_{frame}_"/>'
    for item in files:
        fname = os.path.splitext(item['file'])[0]
        assemblies += '\n    <Assembly>'
        if 'dir' in item:
            assemblies += input_template.format(
                path=os.path.normpath(os.path.join(os.curdir, model_subfolder, item['dir'], item['file'])))
            # the template also contains {frame}; omitting it raises KeyError
            assemblies += node_template_with_folder.format(folder=item['dir'].lower(), file=fname, frame=item['frame'])
        else:
            assemblies += input_template.format(
                path=os.path.normpath(os.path.join(os.curdir, model_subfolder, item['file'])))
            assemblies += node_template_bare.format(file=fname, frame=item['frame'])
        assemblies += '</Assembly>'
    return assemblies


def generate_stages(files, frame_count, keywords):
    stages = ''

    frames = [dict() for i in range(1, frame_count+1)]

    for item in files:
        name = item['dir'] + ' ' + item['file'] if 'dir' in item else item['file']
        name = '.'.join(name.split('.')[:-1])
        frame = item['frame'] - 1

        model_type = 'generic'
        if re.search(re_gum, name, flags=re.IGNORECASE) is not None:
            model_type = 'gum'
        if re.search(re_teeth, name, flags=re.IGNORECASE) is not None:
            model_type = 'teeth'
        if re.search(re_crown, name, flags=re.IGNORECASE) is not None:
            model_type = 'crown'

        keyword = None
        if re.search(keywords[0], name) is not None:
            keyword = 'mandibular'
        if re.search(keywords[1], name) is not None:
            keyword = 'maxillary'

        if keyword not in frames[frame]:
            frames[frame][keyword] = {}

        if model_type not in frames[frame][keyword]:
            frames[frame][keyword][model_type] = []

        frames[frame][keyword][model_type].append(name)

    result = []
    for frame in frames:
        frame_data = []
        for keyword, jaw in frame.items():
            jaw_data = []
            for model_type, models in jaw.items():
                models_string = ','.join(['"{model}"'.format(model=model) for model in models])
                jaw_data.append('"{model_type}": [{models}]'.format(model_type=model_type, models=models_string))
            frame_data.append('"{jaw}": {{{data}}}'.format(jaw=keyword, data=','.join(jaw_data)))
        result.append('{{{data}}}'.format(data=','.join(frame_data)))

    return ',\n'.join(result)


def main():
    print('Orthodontic Treatment Plan script powered by PDF3D technology')
    print('Copyright Visual Technology Services - pdf3d.com ...')

    treatment = input_treatment()
    keywords = input_keywords()
    print(keywords)

    delay = int(input("delay between frames (numeric, default = 1 second) : ") or 1)
    number_of_frames = int(input("number of frames (numeric, default = 5 frames) : ") or 5)
    triangle_count = input_simplification()

    # put the date (today)
    date = time.strftime("%d-%m-%Y")
    files = gather_files(number_of_frames)

    assemblies = generate_assemblies(files)
    stages = generate_stages(files, number_of_frames, keywords)
    print ('Stages, %d files:'%(len(files)))
    print (stages)
	
    substitutes = {
        'Dr.':              treatment['doctor'],
        '#Upper Trays':     str(number_of_frames),
        '#Lower Trays':     str(number_of_frames),
        'Order no.':        treatment['Order no.'],
        'Patient name':     treatment['patient'],
        'Date':             date,
        'STL':              assemblies,
        'KEYWORDS':         '["' + '", "'.join(keywords) + '"]',
        'delay':            str(delay),
        'number of frames': str(number_of_frames),
        'triangle count':   str(triangle_count),
        'picture folder':   str(image_resource_subfolder),
        'root file':        str(image_resource_subfolder),
        'stages':           stages,
        'page number':      str(pdf_page_number),
        'output folder':    path_to_output
    }

    # accept 'Y', 'y', 'Yes', ... since the prompt itself suggests an upper-case Y
    save_pdf = (input("create the pdf ? (Y or N (default) ?) : ") or 'N').strip().lower() in ['y', 'yes']

    #save_html = (input("create the html data ? (Y or N (default) ?) : ") or 'N') in ['y', 'yes']
    save_html = False

    if save_pdf:
        create_pdf(substitutes)
    #if save_html:
    #    create_html(substitutes)

    if not save_html and not save_pdf:
        print('skipping generation.')

    breakpoint = 0

if __name__ == '__main__':
    main()

What happens is that the 3D PDF software converts my .stl files into a 3D PDF, and the Python script controls where and how it looks for the information I provide.

With the new pattern you suggested, the models no longer transfer into the 3D PDF.
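One way to narrow this down is to compare what the two patterns capture when applied the way `get_frame` applies them (`re.search` followed by `group(1)`). The sketch below is only a diagnostic under that assumption: the helper name `captured_frame` is hypothetical, it skips the file-system check, and the file names are the samples from the first post.

```python
import re

old_pattern = r'_*(\d+)_'                         # pattern from the first post
new_pattern = r'[a-zA-Z_]*(\d+)[a-zA-Z_]*[.]stl'  # pattern suggested by Gribouillis

names = ['VS_Subsetup%d_Maxillar.stl' % i for i in range(1, 6)]


def captured_frame(pattern, name):
    """Hypothetical helper mirroring get_frame without the os.path.isfile check."""
    match = re.search(pattern, name)
    return 0 if match is None else int(match.group(1))


for name in names:
    print(name, captured_frame(old_pattern, name), captured_frame(new_pattern, name))
```

For these five names both patterns capture the same frame numbers, so the pattern alone may not explain the difference. One thing the new pattern does change is that it requires a lower-case `.stl` extension; a file ending in `.STL` would get frame 0 and be skipped by `gather_files`. Whether that applies here depends on the actual contents of the models folder.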


RE: Regular Expression Help - ljmetzger - Apr-08-2018

The following code demonstrates that the regex provided by Gribouillis seems to answer your original question:
import os
import re

RE_FRAME = r'[a-zA-Z_]*(\d+)[a-zA-Z_]*[.]stl'
model_subfolder = 'dummy'

print("All files in subfolder '{}' follow:".format(model_subfolder))
my_path = os.path.abspath(model_subfolder)
for my_file_name in os.listdir(my_path):
    print (my_file_name)

print() 
print("All files in subfolder '{}' that match the following regex pattern '{}' follow:".format(model_subfolder, RE_FRAME))
my_path = os.path.abspath(model_subfolder)
for my_file_name in os.listdir(my_path):
    if re.search(RE_FRAME, my_file_name) is not None:
        print(my_file_name) 
Output:
All files in subfolder 'dummy' follow:
dummy.txt
VS_Subsetup1_Maxillar.stl
VS_Subsetup5_Maxillar.stl
VS_SubsetupXX_Maxillar.stl

All files in subfolder 'dummy' that match the following regex pattern '[a-zA-Z_]*(\d+)[a-zA-Z_]*[.]stl' follow:
VS_Subsetup1_Maxillar.stl
VS_Subsetup5_Maxillar.stl
If you need further assistance you should probably:
a. Start a new thread providing the minimum of code (and data) that demonstrates your problem.
b. Read the forum rules and provide CODE TAGS for your code.
c. Please provide details concerning your input and your expected output.
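
A minimal example of the kind asked for in point a. might look like the sketch below. It is an illustration, not code from the thread: it creates throwaway files in a temporary directory (two of the sample names from the first post plus one file that should be ignored) and reports which ones the pattern picks up, so it runs without the real models folder. The function name `matching_frames` is hypothetical.

```python
import os
import re
import tempfile

RE_FRAME = r'[a-zA-Z_]*(\d+)[a-zA-Z_]*[.]stl'


def matching_frames(folder, pattern):
    """Return {file name: frame number} for the files in folder that match pattern."""
    frames = {}
    for file_name in sorted(os.listdir(folder)):
        match = re.search(pattern, file_name)
        if match is not None:
            frames[file_name] = int(match.group(1))
    return frames


with tempfile.TemporaryDirectory() as folder:
    # Assumed test data: two sample names and one file without a frame number.
    for file_name in ('VS_Subsetup1_Maxillar.stl', 'VS_Subsetup2_Maxillar.stl', 'dummy.txt'):
        open(os.path.join(folder, file_name), 'w').close()
    result = matching_frames(folder, RE_FRAME)

print(result)
```

Stating the expected result next to what the script actually prints makes it much easier for others to see where the behaviour diverges.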

Lewis