Python Forum

Full Version: Removing punctuation from strings in lists
Hi,

I read whatever text is entered on stdin into a list, so a limerick (5 lines of text) ends up as a list of 5 items. How can I strip the punctuation out of it?

lines = ["There was an old man from Peru,", "Who said he had nothing to do,", "So he sat on the stairs,", "And counted he hairs,", "And found he had seventy-two"]
     
I can strip out the punctuation if it is just a simple string:
import re
# A single string, not a list
lines = "There was an old man from Peru, Who said he had nothing to do, So he sat on the stairs, And counted he hairs, And found he had seventy-two"
# Remove every character that is not a word character or whitespace
lines = re.sub(r'[^\w\s]', '', lines)
But I can't figure out how to do this with a list without first joining the items into one string, which I would then only have to turn back into a list anyway. Is there a nice way to do this, please?

Found it:

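import re

lines = ["There was an old man from Peru,", "Who said he had nothing to do,", "So he sat on the stairs,", "And counted he hairs,", "And found he had seventy-two"]

# Join the items into one lowercase string, then replace the listed punctuation with spaces.
# Note: the result is a single string, not a list.
input_text = ''.join(lines).lower()
input_text = re.sub(r'[,.:@#?!&$]', ' ', input_text)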
This works well. Looks like I need to study regex some more.
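For anyone who wants to keep the result as a list rather than one joined string, here is a minimal sketch that applies the regex from the first post to each item in turn. The names `punctuation` and `strip_punctuation` are made up for illustration and are not from the thread.

```python
import re

lines = ["There was an old man from Peru,", "Who said he had nothing to do,"]

# Character class from the first post: anything that is neither a word
# character nor whitespace. Compiled once and reused for every item.
punctuation = re.compile(r'[^\w\s]')

def strip_punctuation(items):
    """Return a new list with punctuation removed from each string (hypothetical helper)."""
    return [punctuation.sub('', item) for item in items]

cleaned = strip_punctuation(lines)
print(cleaned)
```

Be aware that this pattern also removes hyphens and apostrophes, so "seventy-two" would become "seventytwo".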
newlines = []
for item in lines:
    # str.strip(',') removes commas only from the start and end of each item
    newlines.append(item.strip(','))
lines = newlines
(May-21-2017, 04:57 PM)Larz60+ Wrote:
newlines = []
for item in lines:
   newlines.append(item.strip(','))
lines = newlines

Why not just use a list comprehension?
lines = [line.strip(',') for line in lines]
Sounds good to me. Old habit.
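One caveat on `strip(',')`: it only removes commas at the start and end of each item, so punctuation in the middle of a line is left alone. If the goal is to drop all punctuation, one possible alternative (not suggested in the thread) is `str.translate` with a table built from `string.punctuation`. The names `DELETE_PUNCT` and `remove_punctuation` below are hypothetical, and this sketch only covers ASCII punctuation.

```python
import string

# Translation table that deletes every ASCII punctuation character.
DELETE_PUNCT = str.maketrans('', '', string.punctuation)

def remove_punctuation(items):
    """Return a new list with all ASCII punctuation removed (hypothetical helper)."""
    return [item.translate(DELETE_PUNCT) for item in items]

lines = ["So he sat, on the stairs,", "And found he had seventy-two"]

# strip(',') only trims the ends; the comma after "sat" survives.
print([line.strip(',') for line in lines])

# translate() removes punctuation anywhere in the string, including the hyphen.
print(remove_punctuation(lines))
```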