Hi,
Reading from stdin puts whatever text is entered into a list, so a limerick (5 lines of text) is read into a list of 5 comma-separated items. How can I strip the punctuation out of this?
lines = ["There was an old man from Peru,", "Who said he had nothing to do,", "So he sat on the stairs,", "And counted he hairs,", "And found he had seventy-two"]

I can strip out the punctuation if it is just a simple string:
import re
lines = "There was an old man from Peru, Who said he had nothing to do, So he sat on the stairs, And counted he hairs, And found he had seventy-two"
lines = re.sub(r'[^\w\s]', '', lines)

But I can't figure out how to do this with a list without first joining the list into one string, which I would then only have to turn back into a list anyway. Is there a nice way to do this, please?
Found it:
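import re
lines = ["There was an old man from Peru,", "Who said he had nothing to do,", "So he sat on the stairs,", "And counted he hairs,", "And found he had seventy-two"]
# Join the lines into one lowercase string
input_text = ''.join(lines).lower()
# Replace the listed punctuation characters with spaces
input_text = re.sub(r'[,.:@#?!&$]', ' ', input_text)

This works well. Note that the result is a single string rather than a list. Looks like some regex studying is required.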
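For anyone who wants to keep the result as a list, here is a minimal sketch that applies the same re.sub pattern from the question to each item in a list comprehension, so nothing has to be joined and split again. The helper name strip_punctuation is made up for illustration; it is not from the original post.

```python
import re

lines = [
    "There was an old man from Peru,",
    "Who said he had nothing to do,",
    "So he sat on the stairs,",
    "And counted he hairs,",
    "And found he had seventy-two",
]

def strip_punctuation(text_lines):
    """Return a new list with punctuation removed from each line.

    Hypothetical helper name, used here only for illustration.
    """
    # [^\w\s] matches any character that is neither a word character
    # nor whitespace, so commas, hyphens, etc. are deleted.
    return [re.sub(r"[^\w\s]", "", line) for line in text_lines]

cleaned = strip_punctuation(lines)
print(cleaned)
```

Be aware that this pattern also deletes the hyphen, so "seventy-two" becomes "seventytwo". If that is not what you want, replacing matches with a space (' ') instead of an empty string is one option.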