Python Forum
Checking if a string contains all or any elements of a list
#1
I have some keywords in a list, and I want to check whether a string contains any or all of those keywords. For example:
teststring = 'this is a test string it contains apple, orange & banana. Moreover, this i a very long string contact length more than 500k'
keywords = ['apple', 'banana', 'length']
I want to do something like this (pseudocode):
if (any or all elements in keywords) in teststring:
    print("Match found")
I found a way using any():
if any(ext in teststring for ext in keywords):
    print("Match found")
The problem is that this can be slow when the string is long; in my case the content length is more than 500k characters. Is there a better way to do both the any and the all checks?
#2
You are doing the search backwards. Search for keywords in the sample text, not the other way around.
You can use regular expressions.
import re
keywords = ('apple', 'banana', 'orange')
teststring = (
    'this is a test string it contains apple, orange & banana. '
    'Moreover, this i a very long string contact length more than 500k'
)

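# This is any: one alternation pattern, a single scan of the text.
# re.escape keeps characters such as '.' or '+' in a keyword literal.
pattern = '|'.join(map(re.escape, keywords))
found_any = re.search(pattern, teststring)

# This is all: one search per keyword, stopping at the first miss.
found_all = all(
    re.search(re.escape(keyword), teststring) for keyword in keywords
)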

print(found_any, found_all, sep="\n")
Output:
<re.Match object; span=(34, 39), match='apple'>
True
Testing with "I like oranges and apples."
Output:
<re.Match object; span=(7, 13), match='orange'>
False
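The all check above scans the text once per keyword. For a very long string, a single pass that collects the distinct keywords it sees and stops as soon as every one has turned up may be worth trying. This is only a sketch: keywords_found is a hypothetical helper name, not something from the thread, and it assumes no keyword is a substring of another keyword (with 'app' and 'apple', the alternation reports only one of them for the text 'apple').

```python
import re

def keywords_found(keywords, text):
    """Return the set of keywords that occur in text, scanning text once.

    Hypothetical helper; assumes no keyword is a substring of another.
    """
    wanted = set(keywords)
    if not wanted:
        return set()
    # Escape each keyword so it is matched literally, then OR them together.
    pattern = re.compile('|'.join(map(re.escape, wanted)))
    found = set()
    for match in pattern.finditer(text):
        found.add(match.group())
        if found == wanted:
            # Every keyword has been seen; no need to read the rest.
            break
    return found

keywords = ('apple', 'banana', 'orange')
found = keywords_found(keywords, 'I like oranges and apples.')
found_any = bool(found)            # at least one keyword occurs
found_all = found == set(keywords)  # every keyword occurs
print(found_any, found_all)
```

Whether this beats the per-keyword loop depends on the data, so it is worth timing both on the real 500k string.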
A different approach is to use sets. Once the text has been split into a set of words, checking a keyword is a hash lookup instead of a scan of the string, so the matching step itself is very fast; building the set still takes one pass over the text. The results will be slightly different, because the regex matches orange inside oranges, while a set intersection sees these as different words. To use sets, you first need to convert teststring to a set of words. This requires removing all punctuation and splitting on whitespace. You probably also want to convert everything to upper or lower case, so capitalization doesn't prevent matches.
import string
keywords = {'apple', 'banana', 'orange'}
teststring = (
    'this is a test string it contains apple, orange & banana. '
    'Moreover, this i a very long string contact length more than 500k'
)

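# Translation table that deletes every ASCII punctuation character.
trans = str.maketrans('', '', string.punctuation)
# split() with no arguments already discards surrounding whitespace.
testwords = set(teststring.translate(trans).lower().split())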

print(keywords.intersection(testwords))
Output:
{'apple', 'orange', 'banana'}
I like how the intersection gives you the information you need for both any and all: it is non-empty when any keyword matches, and it equals keywords when all of them do.
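A minimal sketch of reading both answers off the set intersection, using the same punctuation stripping and lowercasing as above. The function name any_and_all is an illustrative assumption, not part of the original post, and the keywords are assumed to be lowercase single words.

```python
import string

def any_and_all(keywords, text):
    """Return (found_any, found_all, matched) using whole-word set matching.

    Hypothetical helper; keywords are assumed to be lowercase single words.
    """
    keywords = set(keywords)
    # Delete ASCII punctuation, lowercase, and split into a set of words.
    trans = str.maketrans('', '', string.punctuation)
    words = set(text.translate(trans).lower().split())
    matched = keywords & words
    found_any = bool(matched)        # intersection is non-empty
    found_all = matched == keywords  # every keyword is among the words
    return found_any, found_all, matched

print(any_and_all({'apple', 'banana', 'orange'},
                  'I like Apple pie, banana bread and orange juice.'))
```

Because this matches whole words only, "I like oranges and apples." yields no matches at all here, unlike the regex version.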