Python Forum
which sequence is in a sequence
#1
i have a sequence and a list/tuple of similar sequences. i want to find if any of the sequences in the list/tuple are in the first sequence. i know how i can make a loop to do this. is there any better, or more elegant, or uber pythonic way i should do this? in the use case i have today, sequence is string, but if the good solution can handle other sequence types, then it could be a solution for future use cases. it would also be great if after that code runs, i also end up with which sequence was found in the first sequence. order matters for this latter benefit. if two or more are in it the earlier in the list/tuple is the one i care about.

or

my use case might be better dealt with some other way. i have an argument that could be a single number or a range. the number separator will also indicate if the 2nd number is inclusive or exclusive. the range will be like that in the posix cut command which is '-' and inclusive, but i will be adding '..' and ':' as exclusive ways to express a range. in my current use case only integers make sense. in the future, use cases may include floats and names.
#2
You want to use set().

list1 = [1,1,1,2,3,4,5]
list1 = set(list1) # unique elements
list2 = [4,2,5,1,6,7,4]
list2 = set(list2) # unique elements
print(list1 & list2)
https://docs.python.org/3/tutorial/datastructures.html#sets
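One caveat worth illustrating: set intersection compares whole elements, so on a string it works character by character and cannot see a multi-character separator such as '..'. It also discards the order of the candidates. A small sketch, where the names `separators` and `arg` are made up for the illustration:

```python
separators = ['-', '..', ':']   # hypothetical candidates, in priority order
arg = '3..7'                    # hypothetical argument string

# set(arg) is {'3', '.', '7'}: single characters only, so '..' is never an element.
common = set(arg) & set(separators)
print(common)                   # set()

# A substring test keeps the candidate order and does find '..'.
ordered = [sep for sep in separators if sep in arg]
print(ordered)                  # ['..']
```

So the set approach fits best when both sides are collections of whole items (for example a list of arguments), not when looking for substrings.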
#3
For the first part, the most compact expression I can think of is:
>>> keywords = ['-', ':', ',', 'xx']
>>> s = 'wgewgewgwee'
>>> any(key in s for key in keywords)
False
>>> s = 'wgewgewg:wee'
>>> any(key in s for key in keywords)
True
>>> s = 'wgewgewgxxwee'
>>> any(key in s for key in keywords)
True
It has the advantage that any() stops checking at the first check that returns True.
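The short-circuit behaviour can be made visible with a small probe, and next() with a default gives the same early exit while also reporting which key matched. This is only a sketch; `checked_keys` and `first_match` are names invented here:

```python
def checked_keys(keywords, s):
    """Return (found, tested) so we can see where any() stopped."""
    tested = []

    def probe(key):
        tested.append(key)          # record every key that was actually tested
        return key in s

    found = any(probe(key) for key in keywords)
    return found, tested


def first_match(keywords, s):
    """Return the first key (in list order) contained in s, or None."""
    return next((key for key in keywords if key in s), None)


keywords = ['-', ':', ',', 'xx']
print(checked_keys(keywords, 'wgewgewg:wee'))   # (True, ['-', ':'])
print(first_match(keywords, 'wgewgewgxxwee'))   # xx
```

Note that first_match() reports which key was found but not its position in s.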
The problem is that usually you want to know not only whether any of them is in the sequence, but also which one it is and where it is in s.
For that, the best thing I can think of is a loop like:
def match_keys(keywords, s):
    for key in keywords:
        try:
            p = s.index(key)
        except ValueError:
            continue
        return key, p
    return None, -1
With this, a simplified version of your code that accepts only 'n', 'a:b' or 'a-b' and always returns a slice would be:
def parse_range(s):
    key, p = match_keys([':', '-'], s)
    if key is None:
        a = int(s)
        b = a + 1
    else:
        a = int(s[:p])
        b = int(s[p+1:])
        if key == '-':
            b += 1
    return slice(a, b)
Obviously, you might want to guard the int conversions to allow things like ':6' or '3-'.
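One possible way to do that guarding, as a rough sketch: treat an empty side as None, which slice() accepts as an open end. The function name `parse_open_range` is made up, the '..' separator comes from the original question, and the helper uses str.find, so this version assumes string input:

```python
def match_keys(keywords, s):
    """Return (key, position) for the first key found in s, else (None, -1)."""
    for key in keywords:
        p = s.find(key)
        if p != -1:
            return key, p
    return None, -1


def parse_open_range(s):
    # '..' and ':' are exclusive, '-' is inclusive (as in POSIX cut).
    key, p = match_keys(['..', ':', '-'], s)
    if key is None:
        a = int(s)
        return slice(a, a + 1)
    # Skip len(key) characters so the two-character '..' works too.
    left, right = s[:p], s[p + len(key):]
    a = int(left) if left else None     # ':6' -> open start
    b = int(right) if right else None   # '3-' -> open end
    if key == '-' and b is not None:
        b += 1                          # make the inclusive end exclusive
    return slice(a, b)


print(parse_open_range(':6'))    # slice(None, 6, None)
print(parse_open_range('3-'))    # slice(3, None, None)
print(parse_open_range('2..5'))  # slice(2, 5, None)
```

Anything that is not an integer on either side still raises ValueError from int(), which seems a reasonable way to reject bad input.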
The same can be done with a regexp, but in that case it will only work for strings:
import re

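def parse_with_re(s):
    # fullmatch rejects trailing junk such as '5x'. The separator and the
    # second number are optional, but they must appear together.
    m = re.fullmatch(r'(\d+)(?:([-:])(\d+))?', s)

    if not m:
        raise ValueError(f'Invalid input {s!r}')

    a = int(m.group(1))
    sep = m.group(2)
    if sep == ':':
        b = int(m.group(3))        # ':' is exclusive
    elif sep == '-':
        b = int(m.group(3)) + 1    # '-' is inclusive
    else:
        b = a + 1                  # single number
    return slice(a, b)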
A more elaborate regexp could also match more complex strings like '5::2' or '7-30:3'.
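As a rough idea of what such a regexp might look like, here is a sketch that adds an optional ':step' suffix and allows open ends. The pattern and the name `parse_stepped` are assumptions made for this example, not a tested grammar for every case:

```python
import re

# start, optional (separator + stop), optional ':step'
RANGE_RE = re.compile(r'(\d*)(?:([-:])(\d*))?(?::(\d+))?')


def parse_stepped(s):
    m = RANGE_RE.fullmatch(s)
    if not m:
        raise ValueError(f'Invalid input {s!r}')
    start, sep, stop, step = m.groups()
    a = int(start) if start else None
    if sep is None:
        if a is None:               # empty string: nothing to parse
            raise ValueError(f'Invalid input {s!r}')
        return slice(a, a + 1)      # single number
    b = int(stop) if stop else None
    if sep == '-' and b is not None:
        b += 1                      # '-' is inclusive
    return slice(a, b, int(step) if step else None)


print(parse_stepped('5::2'))     # slice(5, None, 2)
print(parse_stepped('7-30:3'))   # slice(7, 31, 3)
```

With this pattern 'a:b' is always read as start and stop, never as start and step, so a step needs a second colon as in '5::2'.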
#4
(May-27-2018, 09:46 AM)DeaD_EyE Wrote: You want to use set().


a good start. my code will still need to carry out the steps to see what matches. my first use case involves a list of possible keywords that might be in an argument list. at least this much can rule in or out if any keywords are present and avoid further work in that regard if no keywords are present. but this is to save time (make the command faster). so, figuring this out with minimal work needs to be less work than doing the basic work more often ... iow, i still need to figure out which way is faster.

the command is actually a shell script tool. it could be used many times or in loops.
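For the keyword-in-argument-list case described above, one way to combine the quick rule-out with the "earliest keyword wins" requirement might look like this. It is a sketch under the assumption that keywords and arguments are compared as whole items; `first_keyword` and the sample flags are invented for the example:

```python
def first_keyword(keywords, args):
    """Return the earliest keyword (in keywords order) present in args, or None."""
    present = set(keywords) & set(args)   # cheap rule-out when nothing matches
    if not present:
        return None
    # Walk keywords in their original order so the earlier one wins.
    return next(key for key in keywords if key in present)


keywords = ['--all', '-v', '-q']                    # hypothetical flags
print(first_keyword(keywords, ['-v', 'file', '--all']))   # --all
print(first_keyword(keywords, ['file1', 'file2']))        # None
```

Whether this beats a plain loop for short lists is something only a measurement can tell; the standard library's timeit module is one way to compare the two.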