Regular Expression Help

Dazzler · Apr-06-2018, 11:26 AM

Hi All

I wonder if somebody can help. I'm running a script that orders files for a 3D PDF application. Basically, I have software that outputs .stl files, and the script puts them in order so the 3D PDF can play them as a short frame-by-frame progression.
My output files look like this:
VS_Subsetup1_Maxillar.stl
VS_Subsetup2_Maxillar.stl
VS_Subsetup3_Maxillar.stl
VS_Subsetup4_Maxillar.stl
VS_Subsetup5_Maxillar.stl
The relevant line of the script I'm running at the moment looks like this:
#re_frame = r'- 0*(\d+) -'
re_frame = r'_*(\d+)_'
But with this pattern my team has to add an underscore before the number for the program to work.

Can anybody tell me how to solve this so that it works without making any changes to the file name?

**Gribouillis** · Apr-06-2018, 12:13 PM

You could use

re_frame = r'[a-zA-Z_]*(\d+)[a-zA-Z_]*[.]stl'
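
A quick way to check this against the file names from the question is sketched below. It is only an illustration: the names are copied from the first post, the helper name frame_number is hypothetical, and it assumes the script looks the pattern up with re.search. It also runs the original pattern for comparison. Since `_*` means zero or more underscores, the original pattern appears to capture the same numbers on these particular names.

```python
import re

# Pattern suggested in this reply; the capture group holds the frame number.
suggested = r'[a-zA-Z_]*(\d+)[a-zA-Z_]*[.]stl'
# Pattern from the first post, kept only for comparison.
original = r'_*(\d+)_'

# File names copied from the question.
names = [
    'VS_Subsetup1_Maxillar.stl',
    'VS_Subsetup2_Maxillar.stl',
    'VS_Subsetup3_Maxillar.stl',
    'VS_Subsetup4_Maxillar.stl',
    'VS_Subsetup5_Maxillar.stl',
]


def frame_number(pattern, name):
    """Return the captured number as an int, or None when the name does not match."""
    match = re.search(pattern, name)
    return int(match.group(1)) if match else None


for name in names:
    print(name, frame_number(suggested, name), frame_number(original, name))
```

If both columns show the expected frame numbers, the pattern itself is probably not what stops the models from being picked up.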

ljmetzger · Apr-06-2018, 12:40 PM

Good job Gribouillis.

There is an excellent online regex tester at https://regex101.com/#python courtesy of DeaD_EyE from the following thread: https://python-forum.io/Thread-Regular-E...rd-numbers

Lewis

Dazzler · Apr-06-2018, 12:53 PM

Thanks for your great suggestion, but now it doesn't see the models. Any thoughts?

**Gribouillis** · Apr-06-2018, 01:57 PM

(Apr-06-2018, 12:53 PM)Dazzler Wrote: now it doesn't see the models? Any thoughts?

Can you give more details about the issue?

ljmetzger · Apr-06-2018, 02:00 PM

The regex pattern provided by Gribouillis seems to match all the file names you gave.

We need additional information, e.g. your Python code that uses the regex in context, to give you further help.

Dazzler · (This post was last modified: Apr-08-2018, 03:34 PM by snippsat.)

# local subdirectory for 3d files (must be STL format) relative to templates/, with forward slashes:
model_subfolder = '../models'


# 3d files are grouped in 2-level local subdirectories, True or False (without quotes, first letter upper case):
grouped_subfolders = False


# local pictures folder for button resources relative to templates/ (jpeg, png), with forward slashes:
image_resource_subfolder = './images'


# Regular Expression pattern match rules for tray number from filename
# see: https://en.wikipedia.org/wiki/Regular_expression

#re_frame = r'[a-zA-Z_]*(\d+)[a-zA-Z_]*[.]stl'
re_frame = r'[a-zA-Z_]*(\d+)[a-zA-Z_]*[.]stl'

# segmented model identifiers
re_teeth = r'teeth'
re_gum = r'gum'
re_crown = r'crown'

# For multi-page template.pdf designs, set where 3D should go (starting at page 1)
pdf_page_number = 2

# Output folder, location for new generated files, relative to templates, with forward slashes:
path_to_output = '../output'

So I have four folders: one for the input template, one for the output PDF, one for the models and one for my template images.

The other part of my setup is below:

# Orthodontic Treatment Plan Tutorial
# Copyright Visual Technology Services Ltd. All rights reserved.
# Python version 3.6 expected

import time
import os
import re
from setup_configuration import *


def input_treatment():
    result = {
        'doctor' : input('Dr. (default=Who): ') or 'Who',
        'patient': input('patient name (default=Anonymous): ') or 'Anonymous',
        'Order no.': input('order no. (default=1): ') or '1'
    }
    return result

def get_frame(path, name):
    # Return the frame number captured by re_frame, or 0 if name is not a matching file.
    match = re.search(re_frame, name)
    if match is None or not os.path.isfile(os.path.join(path, name)):
        return 0
    return int(match.group(1))


def is_model_type(name):
    return re.search(re_teeth, name) is not None or re.search(re_gum, name) is not None or re.search(re_crown, name) is not None


def gather_files(frame_count):
    path = os.path.abspath(model_subfolder)
    result = []
    print('frame count = %d'%frame_count)
	
    if grouped_subfolders:
        folders = [f for f in os.listdir(path) if os.path.isdir(os.path.join(path, f))]
        for directory in folders:
            abs_dir = os.path.join(path, directory)
            file_names = [f for f in os.listdir(abs_dir) if get_frame(abs_dir, f) in range(1, frame_count+1)]
            print(directory)
            for filename in file_names:
                result.append({'dir': directory, 'file': filename, 'frame': get_frame(abs_dir, filename)})
    else:
        file_names = [f for f in os.listdir(path) if get_frame(path, f) in range(1, frame_count+1)]
        for filename in file_names:
            result.append({'file': filename, 'frame': get_frame(path, filename)})
    return result


def input_keywords():
    # build the table of ordered keywords
    wildcards = [
        input("keyword to match the Mandibular lower trays (default = Mandibular): ") or 'Mandibular',
        input("keyword to match the Maxillary upper trays (default = Maxillar): ") or 'Maxillar'
    ]
    print('keywords are : ' + ', '.join(wildcards) + ', length is : ' + str(len(wildcards)))
    return wildcards


def input_simplification():
    print('type here the maximum number of triangles, (more triangles gives higher accuracy but larger file size)')
    print(' type 1 for 500 000')
    print(' type 2 for 800 000')
    print(' type 3 for 1 000 000')
    print(' type 4 for 1 500 000')
    print(' type 5 for 2 000 000')
    print(' type 6 for 2 500 000')
    print(' type 7 for 3 000 000')
    mode = int(input("number of triangles for simplification (numeric, default = 2 000 000) : ") or 5)

    result = 2000000
    if mode == 1:
        result = 500000
    elif mode == 2:
        result = 800000
    elif mode in range(3, 8):
        # modes 3 to 7 map to 1 000 000 ... 3 000 000 in steps of 500 000
        result = 500000 * (mode - 1)

    return result


def create_pdf(substitutes):
    create_settings('case01.pdf3dsettings', substitutes, 'result.pdf3dsettings')
    run_reportgen('result.pdf3dsettings')


def create_html(substitutes):
    create_settings('case01_web.pdf3dsettings', substitutes, 'result_web.pdf3dsettings')
    create_settings('case01.json', substitutes, path_to_output + '/treatment.json', '@{key}@')
    run_reportgen('result_web.pdf3dsettings')


def create_settings(template, substitutes, output, key_format='[ {key} ]'):

    template_file = open(template, 'r')
    output_file = open(output, 'w')
    # apply all substitutions:
    for line in template_file:
        for key, value in substitutes.items():
            line = line.replace(key_format.format(key=key), value)
        output_file.write(line)
    template_file.close()
    output_file.close()
    print('file created : ', output)


def run_reportgen(state_file):

    print('working, please wait...')
    print('""' + path_to_program + '" -state ', state_file, ' -silent "STL Interface""')
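    # os.system does not raise when the command fails, so check its exit status
    # (assuming the program returns 0 on success).
    status = os.system('""' + path_to_program + '" -state ' + state_file + ' -silent "STL Interface""')
    if status == 0:
        print('generation completed.')
    else:
        print('error: generation failed')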


def generate_assemblies(files):
    assemblies = ''
    input_template = '<InputFileName value="{path}"/>'
    node_template_with_folder = '<NodeName value="{folder} - {file}_{frame}_"/>'
    node_template_bare = '<NodeName value="{file}_{frame}_"/>'
    for item in files:
        fname = os.path.splitext(item['file'])[0]
        assemblies += '\n    <Assembly>'
        if 'dir' in item:
            assemblies += input_template.format(
                path=os.path.normpath(os.path.join(os.curdir, model_subfolder, item['dir'], item['file'])))
            assemblies += node_template_with_folder.format(
                folder=item['dir'].lower(), file=fname, frame=item['frame'])
        else:
            assemblies += input_template.format(
                path=os.path.normpath(os.path.join(os.curdir, model_subfolder, item['file'])))
            assemblies += node_template_bare.format(file=fname, frame=item['frame'])
        assemblies += '</Assembly>'
    return assemblies


def generate_stages(files, frame_count, keywords):
    stages = ''

    frames = [dict() for i in range(1, frame_count+1)]

    for item in files:
        name = item['dir'] + ' ' + item['file'] if 'dir' in item else item['file']
        name = '.'.join(name.split('.')[:-1])
        frame = item['frame'] - 1

        model_type = 'generic'
        if re.search(re_gum, name, flags=re.IGNORECASE) is not None:
            model_type = 'gum'
        if re.search(re_teeth, name, flags=re.IGNORECASE) is not None:
            model_type = 'teeth'
        if re.search(re_crown, name, flags=re.IGNORECASE) is not None:
            model_type = 'crown'

        keyword = None
        if re.search(keywords[0], name) is not None:
            keyword = 'mandibular'
        if re.search(keywords[1], name) is not None:
            keyword = 'maxillary'

        if keyword not in frames[frame]:
            frames[frame][keyword] = {}

        if model_type not in frames[frame][keyword]:
            frames[frame][keyword][model_type] = []

        frames[frame][keyword][model_type].append(name)

    result = []
    for frame in frames:
        frame_data = []
        for keyword, jaw in frame.items():
            jaw_data = []
            for model_type, models in jaw.items():
                models_string = ','.join(['"{model}"'.format(model=model) for model in models])
                jaw_data.append('"{model_type}": [{models}]'.format(model_type=model_type, models=models_string))
            frame_data.append('"{jaw}": {{{data}}}'.format(jaw=keyword, data=','.join(jaw_data)))
        result.append('{{{data}}}'.format(data=','.join(frame_data)))

    return ',\n'.join(result)


def main():
    print('Orthodontic Treatment Plan script powered by PDF3D technology')
    print('Copyright Visual Technology Services - pdf3d.com ...')

    treatment = input_treatment()
    keywords = input_keywords()
    print(keywords)

    delay = int(input("delay between frames (numeric, default = 1 second) : ") or 1)
    number_of_frames = int(input("number of frames (numeric, default = 5 frames) : ") or 5)
    triangle_count = input_simplification()

    # put the date (today)
    date = time.strftime("%d-%m-%Y")
    files = gather_files(number_of_frames)

    assemblies = generate_assemblies(files)
    stages = generate_stages(files, number_of_frames, keywords)
    print ('Stages, %d files:'%(len(files)))
    print (stages)
	
    substitutes = {
        'Dr.':              treatment['doctor'],
        '#Upper Trays':     str(number_of_frames),
        '#Lower Trays':     str(number_of_frames),
        'Order no.':        treatment['Order no.'],
        'Patient name':     treatment['patient'],
        'Date':             date,
        'STL':              assemblies,
        'KEYWORDS':         '["' + '", "'.join(keywords) + '"]',
        'delay':            str(delay),
        'number of frames': str(number_of_frames),
        'triangle count':   str(triangle_count),
        'picture folder':   str(image_resource_subfolder),
        'root file':        str(image_resource_subfolder),
        'stages':           stages,
        'page number':      str(pdf_page_number),
        'output folder':    path_to_output
    }

    # accept the answer in any letter case, since the prompt asks for "Y"
    save_pdf = (input("create the pdf ? (Y or N (default) ?) : ") or 'N').strip().lower() in ['y', 'yes']

    #save_html = (input("create the html data ? (Y or N (default) ?) : ") or 'N') in ['y', 'yes']
    save_html = False

    if save_pdf:
        create_pdf(substitutes)
    #if save_html:
    #    create_html(substitutes)

    if not save_html and not save_pdf:
        print('skipping generation.')

    breakpoint = 0

if __name__ == '__main__':
    main()

What happens is that the 3D PDF software converts my .stl files into a 3D PDF, and the Python script sets where and how it searches, based on the info I enter.

With the new pattern you suggested, the models no longer transfer into the 3D PDF.

ljmetzger · Apr-08-2018, 03:27 PM

The following code demonstrates that the regex provided by Gribouillis seems to answer your original question:

import os
import re

RE_FRAME = r'[a-zA-Z_]*(\d+)[a-zA-Z_]*[.]stl'
model_subfolder = 'dummy'

print("All files in subfolder '{}' follow:".format(model_subfolder))
my_path = os.path.abspath(model_subfolder)
for my_file_name in os.listdir(my_path):
    print (my_file_name)

print() 
print("All files in subfolder '{}' that match the following regex pattern '{}' follow:".format(model_subfolder, RE_FRAME))
my_path = os.path.abspath(model_subfolder)
for my_file_name in os.listdir(my_path):
    if re.search(RE_FRAME, my_file_name) is not None:
        print(my_file_name)

Output:
All files in subfolder 'dummy' follow:
dummy.txt
VS_Subsetup1_Maxillar.stl
VS_Subsetup5_Maxillar.stl
VS_SubsetupXX_Maxillar.stl

All files in subfolder 'dummy' that match the following regex pattern '[a-zA-Z_]*(\d+)[a-zA-Z_]*[.]stl' follow:
VS_Subsetup1_Maxillar.stl
VS_Subsetup5_Maxillar.stl
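
Since the goal in the first post is to put the files in frame order, it may help to sort on the captured number as an integer, because plain alphabetical sorting changes the order once there are ten or more frames. This is a minimal sketch only: the file names with frame 10 are hypothetical (the thread only lists frames 1 to 5) and frame_key is a made-up helper name.

```python
import re

RE_FRAME = r'[a-zA-Z_]*(\d+)[a-zA-Z_]*[.]stl'

# Hypothetical names for illustration; not taken from the thread.
names = [
    'VS_Subsetup10_Maxillar.stl',
    'VS_Subsetup2_Maxillar.stl',
    'VS_Subsetup1_Maxillar.stl',
]


def frame_key(name):
    """Sort key: the captured frame number, or 0 when the name does not match."""
    match = re.search(RE_FRAME, name)
    return int(match.group(1)) if match else 0


alphabetical = sorted(names)           # compares character by character
by_frame = sorted(names, key=frame_key)  # compares the numbers 1, 2, 10

print(alphabetical)
print(by_frame)
```

With these names the alphabetical list starts with the frame 10 file, while the keyed sort gives frames 1, 2, 10.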

If you need further assistance you should probably:
a. Start a new thread providing the minimum of code (and data) that demonstrates your problem.
b. Read the forum rules and provide CODE TAGS for your code.
c. Please provide details concerning your input and your expected output.
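
For point (a), a minimal self-contained test case might look like the sketch below. It creates empty placeholder files in a temporary folder and applies the same kind of lookup as get_frame and gather_files from the posted script. The file names come from this thread; the temporary folder and the frames_found helper are assumptions made for illustration.

```python
import os
import re
import tempfile

re_frame = r'[a-zA-Z_]*(\d+)[a-zA-Z_]*[.]stl'


def get_frame(path, name):
    """Same logic as the posted get_frame: 0 means 'not a frame file'."""
    match = re.search(re_frame, name)
    if match is None or not os.path.isfile(os.path.join(path, name)):
        return 0
    return int(match.group(1))


def frames_found(file_names, frame_count):
    """Create empty files in a temp folder and map each selected name to its frame."""
    with tempfile.TemporaryDirectory() as folder:
        for name in file_names:
            open(os.path.join(folder, name), 'w').close()
        return {
            name: get_frame(folder, name)
            for name in sorted(os.listdir(folder))
            if get_frame(folder, name) in range(1, frame_count + 1)
        }


sample = [
    'dummy.txt',
    'VS_Subsetup1_Maxillar.stl',
    'VS_Subsetup5_Maxillar.stl',
    'VS_SubsetupXX_Maxillar.stl',
]
print(frames_found(sample, 5))
```

If this prints both numbered .stl names with their frame numbers, the file selection step is probably fine and the problem lies further along, which is the kind of narrowing a new thread would need.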

Lewis
