Posts: 3
Threads: 1
Joined: Oct 2022
I have a simple Mad Libs generator and a template text. When I try to replace a part of speech, it does nothing and returns the string unchanged. I don't know what the problem is, and I don't get any error message.
import os
import random
import re

def generator(text:str) -> str:
    '''find needed words in text and ask user enter them, then print result'''
    parts_of_speech: list = re.findall('_+\s?\([^\)]+\)', text)
    words: dict = {}
    for part in parts_of_speech:
        text_to_user = part.replace('_', '').replace('(', '').replace(')', '').replace('\n', ' ')
        words[part] = input(f'Please input a(n) {text_to_user}\n')
    os.system('cls' if os.name == 'nt' else 'clear')
    for regex, replacable_world in words.items():
        text = re.sub(regex, replacable_world, text, 1)
    print(text)

def get_template(witch: str = None) -> str:
    '''just get a template from text file'''
    if witch:
        return open(f'./templates/{witch}.txt', 'r').read()
    template: str = random.choice(os.listdir('./templates'))
    return open(f'./templates/{template}', 'r').read()

def main() -> None:
    generator(get_template('The Monkey King!'))

if __name__ == '__main__':
    main()

"
The day I saw the Monkey King __________(verb) was one of the most
interesting days of the year.
After he did that, the king played chess on his brother's
__________(noun) and then combed his __________ (adjective) hair with a
comb made out of old fish bones. Later that same day, I saw the
Monkey King dance __________ (adverb)
in front of an audience of kangaroos and wombats.
"
Posts: 6,120
Threads: 16
Joined: Feb 2020
Some characters, such as ( and ), have special meanings in regular expressions. You obviously know this, because you used backslashes to remove their special meaning here:
re.findall('_+\s?\([^\)]+\)', text)
You could process the regex strings to add the backslashes, but it is easier to use str.replace().
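To make the point concrete, here is a minimal sketch with a made-up sample sentence (the variable names are only for illustration). It shows why a placeholder fails when used as a pattern, and two ways around it: re.escape(), which adds the backslashes for you, and plain str.replace().

```python
import re

text = "The Monkey King __________(verb) on his brother's __________(noun)."
placeholder = "__________(verb)"

# Used as a pattern, "(verb)" is a capturing group, so this searches for
# "__________verb" (no parentheses) and finds nothing.
unchanged = re.sub(placeholder, "danced", text, count=1)

# re.escape() backslash-escapes the special characters in the placeholder.
escaped = re.sub(re.escape(placeholder), "danced", text, count=1)

# str.replace() treats the placeholder as literal text; no regex involved.
replaced = text.replace(placeholder, "danced", 1)

print(unchanged == text)
print(escaped)
print(replaced)
```

Another reason to prefer str.replace() here: re.sub() also interprets backslashes in the replacement string, so user input such as \1 would not be inserted literally.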
Posts: 26
Threads: 1
Joined: Mar 2022
Hello,
(Oct-02-2022, 04:50 AM)deanhystad Wrote: Some characters, such as ( and ), have special meanings in regular expressions. You obviously know this, because you used backslashes to remove their special meaning here:
re.findall('_+\s?\([^\)]+\)', text)
You could process the regex strings to add the backslashes, but it is easier to use str.replace().
We can use a raw string (r'...')...
I speak Python but I don't speak English (I just read it a little). If I express myself badly, please blame the translator^^.
Posts: 795
Threads: 127
Joined: Jul 2017
Oct-02-2022, 05:59 AM
(This post was last modified: Oct-02-2022, 05:59 AM by Pedroski55.)
I make gapped texts for classroom use. Starting with a text, I make a list of words I want to extract and the line number, as the answer key.
The AK looks like this:
Quote:
3,woman
4,dog
5,monkey
Just loop through the AK, replacing the words. Later the gapped text gets saved as a .docx file, with the missing words in a table.
import re

# AKlist (the answer key lines) and string1 (the text as a list of lines)
# are assumed to be defined earlier in the script
number = 1
for word in AKlist:
    # word looks like 3,woman\n
    splitword = word.split(',')
    # get rid of the comma to get the line number
    line = splitword[0]
    # minus 1 because the list string1 starts at 0
    linenum = int(line) - 1
    print('line number is', linenum)
    newword = splitword[1].replace('\n', '')
    print('newword is', newword)
    sentence = string1[linenum]
    print('sentence', linenum, sentence)
    repl = f'{number}. ___________'
    # another re command
    # sentence = re.sub(r"\b{}\b".format(word), newword, sentences)
    sentence = re.sub(newword, repl, sentence, count=1)
    print(sentence)
    string1[linenum] = sentence
    number += 1
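For anyone who wants to try this idea, here is a self-contained sketch of the same loop with made-up sample data. The names answer_key, lines and GAP are hypothetical stand-ins (for AKlist, string1 and the row of underscores), and the sample sentences are assumptions. The re.escape() call and the \b word boundaries are additions, so that a word is only matched whole and any special characters in it are treated literally.

```python
import re

GAP = "_" * 11  # the blank that replaces each extracted word

# Made-up sample text, one sentence per list entry.
lines = [
    "Title",
    "A story",
    "The woman walked home.",
    "Her dog ran ahead.",
    "A monkey watched them.",
]
# Answer key entries look like "line number,word\n".
answer_key = ["3,woman\n", "4,dog\n", "5,monkey\n"]

for number, entry in enumerate(answer_key, start=1):
    line_no, word = entry.strip().split(",", 1)
    index = int(line_no) - 1  # the key counts from 1, the list from 0
    pattern = r"\b" + re.escape(word) + r"\b"
    lines[index] = re.sub(pattern, f"{number}. {GAP}", lines[index], count=1)

print("\n".join(lines))
```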
Posts: 3
Threads: 1
Joined: Oct 2022
(Oct-02-2022, 04:50 AM)deanhystad Wrote: Some characters, such as ( and ), have special meanings in regular expressions. You obviously know this, because you used backslashes to remove their special meaning here:
re.findall('_+\s?\([^\)]+\)', text)
You could process the regex strings to add the backslashes, but it is easier to use str.replace().
Oh, that was such a stupid mistake. Thank you!
Posts: 3
Threads: 1
Joined: Oct 2022
(Oct-02-2022, 05:42 AM)Coricoco_fr Wrote: Hello,
(Oct-02-2022, 04:50 AM)deanhystad Wrote: Some characters, such as ( and ), have special meanings in regular expressions. You obviously know this, because you used backslashes to remove their special meaning here:
re.findall('_+\s?\([^\)]+\)', text)
You could process the regex strings to add the backslashes, but it is easier to use str.replace().
We can use a raw string (r'...')...
Hello, do you know how we can convert a string to a raw string? I've googled it, but that didn't work.
Posts: 7,068
Threads: 122
Joined: Sep 2016
(Oct-02-2022, 07:06 AM)Ryokousha Wrote: Hello, do you know how we can convert a string to a raw string? I've googled it, but that didn't work.
You add an r prefix (raw string) to the regex pattern. Always do this as a habit, or you can run into problems.
Example: add car after the newline (\n).
>>> import re
>>>
>>> s = ' hello world\n'
>>> re.sub('(\n)', '\1car', s)
' hello world\x01car'
So it fails. Now add r and it's OK.
>>> import re
>>>
>>> s = ' hello world\n'
>>> re.sub(r'(\n)', r'\1car', s)
' hello world\ncar'
From the regex docs:
Quote: The solution is to use Python’s raw string notation for regular expression patterns;
backslashes are not handled in any special way in a string literal prefixed with 'r'.
So r"\n" is a two-character string containing '\' and 'n', while "\n" is a one-character string containing a newline.
Usually patterns will be expressed in Python code using this raw string notation.
Posts: 6,120
Threads: 16
Joined: Feb 2020
Oct-02-2022, 03:00 PM
(This post was last modified: Oct-02-2022, 03:00 PM by deanhystad.)
This is not an issue of raw strings. Raw strings are used to prevent backslashes from being treated as escape sequences. An r"string" has no effect on parentheses. As you know, your problem was that re.sub() interpreted the parentheses in your regex strings as grouping characters instead of literal parentheses.
When a Python program is compiled (converted to bytecode), every string literal becomes an ordinary str object: escape sequences are replaced with the character(s) they stand for. Literals with an "r" prefix skip the escape sequence processing, so their backslashes are kept as they are. "Raw" only describes how a literal is written in the source code. By the time your program runs, all strings are plain str objects, so there is no way, and no need, to convert a str to a "raw" str.
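A small sketch of both points, using made-up strings: the r prefix is only a different way of spelling a literal, and it does nothing to parentheses in a pattern.

```python
import re

# Two spellings of the same two-character string: backslash followed by 'n'.
raw = r"\n"
escaped = "\\n"
print(raw == escaped, type(raw).__name__, len(raw))

# The r prefix has no effect on parentheses: both patterns below still
# treat "(noun)" as a group, so neither matches the literal text.
text = "____(noun)"
plain_match = re.search("____(noun)", text)
raw_match = re.search(r"____(noun)", text)
print(plain_match, raw_match)

# Escaping the parentheses is what makes them literal.
literal_match = re.search(r"____\(noun\)", text)
print(literal_match.group())
```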
You don't want two loops in generator(). Replace the words as they are encountered.
def generator(text: str) -> None:
    '''find needed words in text and ask user enter them, then print result'''
    for placeholder in re.findall(r'_+\s?\([^\)]+\)', text):  # good spot for a raw string
        word_type = placeholder.replace('_', '').replace('(', '').replace(')', '').replace('\n', ' ')
        text = text.replace(placeholder, input(f'Please input a(n) {word_type}\n'), 1)
    print(text)

This is a wonderful example of simpler being better. The dictionary in your solution limits your madlib to one noun, one verb, one adjective, etc., because identical placeholders share a single dictionary key. I suppose you could have verb2 and noun3, but that either looks clunky (Enter a verb2) or requires extra processing to remove the sequence number. It is much easier to replace the placeholders as they are encountered. With no dictionary you don't have to worry about uniqueness. Your madlib can have 10 nouns, because your program only knows about the next noun.
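To check the "10 nouns" claim without typing at the keyboard, here is a hedged variant of the same idea. The function name fill_template and its ask parameter are hypothetical additions for this sketch: ask defaults to input(), but a test can pass in any function that returns the next answer. It also returns the text instead of printing it.

```python
import re
from typing import Callable

PLACEHOLDER = re.compile(r'_+\s?\([^)]+\)')

def fill_template(text: str, ask: Callable[[str], str] = input) -> str:
    """Replace each placeholder, in order, with the answer returned by ask()."""
    for placeholder in PLACEHOLDER.findall(text):
        # Reduce e.g. "____ (noun)" to "noun" for the prompt.
        word_type = placeholder.strip('_ ').strip('()').replace('\n', ' ')
        # count=1: only the first remaining occurrence is replaced, so
        # identical placeholders are filled one at a time, left to right.
        text = text.replace(placeholder, ask(f'Please input a(n) {word_type}\n'), 1)
    return text

# Two identical placeholders, answered from a canned list instead of input().
answers = iter(['banana', 'teapot'])
result = fill_template('A ____(noun) and a ____(noun).', lambda prompt: next(answers))
print(result)
```

Calling fill_template(get_template()) with no second argument would prompt the user interactively, as in the original program.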