Python Forum
Hard time trying to figure out the difference between two strings
#1
Greetings, fellow pythonistas.

Considering I have the following filenames:
filenames = [
    "BCY0649_PURANATTA_120X90MM__ Cyan.tif",
    "BCY0649_PURANATTA_120X90MM__ Black.tif",
    "BCY0649_PURANATTA_120X90MM__ Magenta.tif",
    "BCY0649_PURANATTA_120X90MM__ PANTONE 357 C.tif",
    "BCY0649_PURANATTA_120X90MM__ PANTONE 465 C.tif",
    "BCY0649_PURANATTA_120X90MM__ Yellow.tif",
]
I need to get the last part of each filename (Cyan, Magenta, PANTONE 357 C, etc.). In this particular case, I can do it easily by splitting the string on the double underscore "__".

But this will not always be the case. I'm frying my brain trying to make this work, but I can't figure it out.

What I have done so far: I tried to look for matching parts of two filenames and guess the prefix from them. But this fails if I have, let's say, only two PANTONE colors, because "PANTONE" gets included in the prefix.
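def get_filename_prefix(first: str, second: str) -> str:
    """Guess the shared prefix as the whitespace-separated words found in both names."""
    prefix = []
    first_words = first.split()
    second_words = second.split()
    # Keep every word of the first name that also appears in the second.
    for one in first_words:
        for another in second_words:
            if one == another:
                prefix.append(one)
    return " ".join(prefix)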
Any better ideas?
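For comparison, here is a minimal sketch of the same prefix-guessing idea applied to the whole list at once instead of to two names. It uses the standard library's os.path.commonprefix, which compares strings character by character. The function name split_common_prefix is made up for this illustration, and the sketch assumes every file shares the same leading text and has a single extension.

```python
import os


def split_common_prefix(names):
    """Return (prefix, remainders) after dropping the characters all names share."""
    # commonprefix works character by character, not path component by component.
    prefix = os.path.commonprefix(names)
    # Drop the extension first so it does not end up in the color name.
    stems = [os.path.splitext(name)[0] for name in names]
    return prefix, [stem[len(prefix):].strip() for stem in stems]
```

With all six filenames above this should yield the color names, but it has the weakness already described: given only the two PANTONE files, "PANTONE " becomes part of the prefix and the remainders shrink to "357 C" and "465 C".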
#2
filenames = [
    "BCY0649_PURANATTA_120X90MM__ Cyan.tif",
    "BCY0649_PURANATTA_120X90MM__ Black.tif",
    "BCY0649_PURANATTA_120X90MM__ Magenta.tif",
    "BCY0649_PURANATTA_120X90MM__ PANTONE 357 C.tif",
    "BCY0649_PURANATTA_120X90MM__ PANTONE 465 C.tif",
    "BCY0649_PURANATTA_120X90MM__ Yellow.tif",
]

for name in filenames:
    parts = name.split(" ")
    if "PANTONE" in parts:
        print(" ".join(parts[parts.index("PANTONE") :]))
    else:
        print(parts[-1])
This will not work if you can have color names like "Dark Blue", or if there is a different separator, or no separator, or if the color name is not at the end, or for a multitude of unexpected reasons.

You really need to provide more, different patterns or define the grammar used to make these names.
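To illustrate what "defining the grammar" could look like, here is a sketch that assumes one possible rule: anything, then a double underscore, optional spaces, a color name containing no underscores, and a .tif extension. That rule is only a guess based on the six sample names, and NAME_PATTERN and parse_color are hypothetical names, not something taken from the real naming scheme.

```python
import re
from typing import Optional

# Assumed grammar: <job>__<optional spaces><color>.tif
NAME_PATTERN = re.compile(
    r"(?P<job>.+?)__\s*(?P<color>[^_]+?)\s*\.tif",
    re.IGNORECASE,
)


def parse_color(filename: str) -> Optional[str]:
    """Return the color part, or None when the name does not fit the assumed grammar."""
    match = NAME_PATTERN.fullmatch(filename)
    return match.group("color") if match else None
```

Returning None for names that do not fit makes the unexpected cases visible instead of silently producing a wrong color. Multi-word colors such as "Dark Blue" are handled as long as the separator rule holds.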
#3
(Aug-16-2023, 03:00 PM)deanhystad Wrote: This will not work if you can have color names like "Dark Blue"
This is precisely my main concern, as custom colors are common here.

But your insight gave me a partial solution to the problem. I guess I'll stick with forcing the user to put an underscore before the color name at the end of the filename. From there I can use rfind("_") to get the highest index of an underscore and take the color name from there. Like so:
filename = "BCY0649_PURANATTA_120X90MM__ Magenta.tif"
color = filename[filename.rfind("_")+1:].strip()
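Note that the slice above keeps the extension, so color ends up as "Magenta.tif". A small sketch of the same last-underscore idea that drops the extension first and complains when the underscore is missing could look like this. The function name color_from_filename is hypothetical, and it assumes a single extension and no underscores inside the color name.

```python
from pathlib import Path


def color_from_filename(filename: str) -> str:
    """Return the text after the last underscore, without the file extension."""
    stem = Path(filename).stem  # strips the final suffix, e.g. ".tif"
    _, separator, tail = stem.rpartition("_")
    if not separator:
        # No underscore at all, so the naming convention was not followed.
        raise ValueError(f"no underscore before the color name in {filename!r}")
    return tail.strip()
```

rpartition splits on the last occurrence of the separator, which matches what rfind does in the snippet above, and the explicit ValueError avoids quietly returning the whole filename when a user forgets the underscore.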