Python Forum
Extract text
#2
Example with a generator:
  1. Set start_found to False.
  2. Iterate over the sequence element by element (here each line holds one word).
  3. When the start word is found, set start_found to True.
  4. Yield each element while start_found is True.
  5. When the element is the end word, return from the generator. This also leaves the for-loop.
  6. Optional exceptions:
    • If the for-loop finishes and start_found is still False, the start word was not found.
    • If the for-loop finishes and start_found is True, the end word was not found.

word_start = "yes"
word_end = "no"

words = """yes
ok
ok
ok
no""".splitlines()

def split(sequence, start, end):
    start_found = False

    for element in sequence:
        # compare against the parameters, not the module-level names
        if element == start and not start_found:
            start_found = True
        elif element == end and start_found:
            # close generator
            return
        elif start_found:
            yield element

    # this point is reached only if the start or the end was not found
    if start_found:
        # seen start, but no end
        raise ValueError(f"The end word '{end}' was not found after the start word")
    else:
        # seen no start in the whole sequence
        raise ValueError(f"The start word '{start}' was not found in sequence")

oks = list(split(words, word_start, word_end))
print(oks)
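The optional exceptions from step 6 only show up while the generator is being consumed, for example inside list(). The following is a minimal sketch of how a caller might handle them; the names extract_between and try_extract are made up for this illustration, and the error messages are my own wording.

```python
def extract_between(sequence, start, end):
    """Yield the elements strictly between start and end (hypothetical helper)."""
    start_found = False
    for element in sequence:
        if element == start and not start_found:
            start_found = True
        elif element == end and start_found:
            return  # close generator
        elif start_found:
            yield element
    # reached only if the loop ran to the end of the sequence
    if start_found:
        raise ValueError(f"End word '{end}' was not found after the start word")
    raise ValueError(f"Start word '{start}' was not found in sequence")


def try_extract(sequence, start, end):
    """Consume the generator and turn a ValueError into a readable string."""
    try:
        return list(extract_between(sequence, start, end))
    except ValueError as error:
        return f"error: {error}"


result_ok = try_extract(["yes", "ok", "ok", "no"], "yes", "no")
result_no_end = try_extract(["yes", "ok", "ok"], "yes", "no")
result_no_start = try_extract(["ok", "no"], "yes", "no")

print(result_ok)
print(result_no_end)
print(result_no_start)
```

Calling the generator function alone does not raise anything, because the body only runs once iteration starts.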
Here is the version that also includes the start word and the end word in the output.
It differs from the previous generator function only by two additional yield statements.
word_start = "yes"
word_end = "no"

words = """yes
ok
ok
ok
no""".splitlines()

def split(sequence, start, end):
    start_found = False

    for element in sequence:
        # compare against the parameters, not the module-level names
        if element == start and not start_found:
            start_found = True
            yield element  # yield the start word
        elif element == end and start_found:
            yield element  # yield the end word
            # close generator
            return
        elif start_found:
            yield element

    # this point is reached only if the start or the end was not found
    if start_found:
        # seen start, but no end
        raise ValueError(f"The end word '{end}' was not found after the start word")
    else:
        # seen no start in the whole sequence
        raise ValueError(f"The start word '{start}' was not found in sequence")

oks = list(split(words, word_start, word_end))
print(oks)
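Since the two versions are so close, they could be folded into one generator with a flag. This is only a sketch of that idea, not part of the code above: the function name extract_section and the keyword inclusive are assumptions made for this example.

```python
def extract_section(sequence, start, end, inclusive=False):
    """Yield the elements between start and end.

    With inclusive=True the start and end words are yielded as well.
    (Hypothetical helper that combines both versions.)
    """
    start_found = False
    for element in sequence:
        if not start_found:
            if element == start:
                start_found = True
                if inclusive:
                    yield element
        elif element == end:
            if inclusive:
                yield element
            return  # close generator
        else:
            yield element
    # reached only if the loop ran to the end of the sequence
    if start_found:
        raise ValueError(f"End word '{end}' was not found after the start word")
    raise ValueError(f"Start word '{start}' was not found in sequence")


words = ["intro", "yes", "ok", "ok", "ok", "no", "outro"]
without_markers = list(extract_section(words, "yes", "no"))
with_markers = list(extract_section(words, "yes", "no", inclusive=True))

print(without_markers)
print(with_markers)
```

The default inclusive=False keeps the behavior of the first version, so existing calls with three arguments would not change.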
Almost dead, but too lazy to die: https://sourceserver.info
All humans together. We don't need politicians!


Messages In This Thread
Extract text - by rektcol - Jun-27-2022, 07:53 AM
RE: Extract text - by DeaD_EyE - Jun-27-2022, 08:19 AM
RE: Extract text - by Gribouillis - Jun-27-2022, 08:48 AM
RE: Extract text - by ibreeden - Jun-27-2022, 11:21 AM
RE: Extract text - by Gribouillis - Jun-27-2022, 08:21 PM
RE: Extract text - by rektcol - Jun-28-2022, 07:38 AM
RE: Extract text - by Gribouillis - Jun-28-2022, 08:57 AM
