Python Forum

Full Version: Hard time trying to figure out the difference between two strings
Greetings, fellow pythonistas.

Considering I have the following filenames:
filenames = [
    "BCY0649_PURANATTA_120X90MM__ Cyan.tif",
    "BCY0649_PURANATTA_120X90MM__ Black.tif",
    "BCY0649_PURANATTA_120X90MM__ Magenta.tif",
    "BCY0649_PURANATTA_120X90MM__ PANTONE 357 C.tif",
    "BCY0649_PURANATTA_120X90MM__ PANTONE 465 C.tif",
    "BCY0649_PURANATTA_120X90MM__ Yellow.tif",
]
I need to get the last part of the filenames (Cyan, Magenta, PANTONE 357 C, etc.). In this particular case, I can do it easily by splitting the string on the double underscore "__".

But this will not always be the case. I've been frying my brain trying to make this work, but I can't figure it out.

What I have done so far: I tried to look for matching parts of two filenames and guess what the prefix is. But this will fail if I have, let's say, only two PANTONE colors, as "PANTONE" will be included in the prefix.
def get_filename_prefix(first: str, second: str) -> str:
    # Collect the whitespace-separated words that appear in both filenames.
    prefix = []
    first_words = first.split()
    second_words = second.split()
    for one in first_words:
        for another in second_words:
            if one == another:
                prefix.append(one)
    return " ".join(prefix)
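For comparison, the same idea can be applied to all filenames at once with os.path.commonprefix, which works character by character. This is only a sketch: strip_common_prefix is a hypothetical helper name, and it has the same weakness described above when the only files left are PANTONE ones.

```python
import os


def strip_common_prefix(names):
    """Drop the leading characters shared by every name, plus the extension.

    Hypothetical helper, not part of any library.
    """
    prefix = os.path.commonprefix(names)  # character-wise common prefix
    return [os.path.splitext(name[len(prefix):])[0].strip() for name in names]


all_colors = strip_common_prefix([
    "BCY0649_PURANATTA_120X90MM__ Cyan.tif",
    "BCY0649_PURANATTA_120X90MM__ PANTONE 357 C.tif",
    "BCY0649_PURANATTA_120X90MM__ Yellow.tif",
])
print(all_colors)  # ['Cyan', 'PANTONE 357 C', 'Yellow']

# With only PANTONE files, "PANTONE " becomes part of the shared prefix.
only_pantone = strip_common_prefix([
    "BCY0649_PURANATTA_120X90MM__ PANTONE 357 C.tif",
    "BCY0649_PURANATTA_120X90MM__ PANTONE 465 C.tif",
])
print(only_pantone)  # ['357 C', '465 C']
```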
Any better ideas?
filenames = [
    "BCY0649_PURANATTA_120X90MM__ Cyan.tif",
    "BCY0649_PURANATTA_120X90MM__ Black.tif",
    "BCY0649_PURANATTA_120X90MM__ Magenta.tif",
    "BCY0649_PURANATTA_120X90MM__ PANTONE 357 C.tif",
    "BCY0649_PURANATTA_120X90MM__ PANTONE 465 C.tif",
    "BCY0649_PURANATTA_120X90MM__ Yellow.tif",
]

for name in filenames:
    parts = name.split(" ")
    if "PANTONE" in parts:
        print(" ".join(parts[parts.index("PANTONE") :]))
    else:
        print(parts[-1])
This will not work if you can have color names like "Dark Blue", or if there is a different separator, or no separator, or if the color name is not at the end, or for a multitude of unexpected reasons.

You really need to provide more, different patterns or define the grammar used to make these names.
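As an illustration of what "defining the grammar" could look like, here is a minimal sketch that assumes every name has the form job, double underscore, color, extension. That grammar is an assumption based only on the six sample names, and NAME_PATTERN and parse_color are hypothetical names.

```python
import re

# Assumed grammar (a guess from the samples): <job>__<color>.<extension>
# The color may contain spaces but no underscores.
NAME_PATTERN = re.compile(r"^(?P<job>.+)__\s*(?P<color>[^_]+?)\.(?P<ext>\w+)$")


def parse_color(filename):
    """Return the color part, or None if the name does not fit the grammar."""
    match = NAME_PATTERN.match(filename)
    return match.group("color") if match else None


print(parse_color("BCY0649_PURANATTA_120X90MM__ PANTONE 357 C.tif"))  # PANTONE 357 C
print(parse_color("BCY0649_PURANATTA_120X90MM__ Dark Blue.tif"))      # Dark Blue
print(parse_color("no_separator.tif"))                                # None
```

Returning None for names that do not match makes the unexpected cases visible instead of silently producing a wrong color.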
(Aug-16-2023, 03:00 PM)deanhystad Wrote: This will not work if you can have color names like "Dark Blue"
This is precisely my main concern, as custom colors are common here.

But your insight gave me a partial solution to the problem. I guess I'll stick with forcing the user to put an underscore right before the color name at the end of the filename, and handle it from there using rfind('_') to get the highest index of an underscore and take the color name from there. Like so:
filename = "BCY0649_PURANATTA_120X90MM__ Magenta.tif"
color = filename[filename.rfind("_")+1:].strip()
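Two things to watch with the slice above: it keeps the extension (the result is "Magenta.tif"), and when the name has no underscore rfind returns -1, so the whole filename comes back. A minimal sketch that covers both cases, assuming the color is always whatever follows the last underscore; color_from_filename is a hypothetical name.

```python
from pathlib import Path


def color_from_filename(filename):
    """Return the text after the last underscore, without the extension."""
    stem = Path(filename).stem  # drops the final ".tif"
    _, separator, tail = stem.rpartition("_")
    if not separator:
        # No underscore at all: fail loudly instead of returning the whole name.
        raise ValueError(f"no underscore in {filename!r}")
    return tail.strip()


print(color_from_filename("BCY0649_PURANATTA_120X90MM__ Magenta.tif"))    # Magenta
print(color_from_filename("BCY0649_PURANATTA_120X90MM__ Dark Blue.tif"))  # Dark Blue
```

This still breaks if a color name itself contains an underscore, so that would have to be ruled out for the users.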