Python Forum

Full Version: Removing punctuation
Hello. I need to create a function that uses a dictionary to count the different words in a string.

So far I have done this code:

def word_distribution(text_string):
  # Lowercase first so "The" and "the" count as the same word.
  words = text_string.lower().split()
  counts = {}  # named counts so the built-in dict is not shadowed
  for wd in words:
    if wd in counts:
      counts[wd] += 1
    else:
      counts[wd] = 1
  return counts
But the result still includes the punctuation, and I need to remove the punctuation from the end of the words using the method "isalpha()". This is what I have so far:

def removing(x):
  if x[-1].isalpha() == True:
    return x
  else:
    return x[:-1]
But it only works for the last word of the sentence.

So my problem is writing a for loop that goes over each word produced by the split in the first function and removes any punctuation from it.

Thank you
You could use a generator expression with a conditional and check for each char if it's in string.punctuation:

import string


def remove_punctuation(text):
    return "".join(char for char in text if char not in string.punctuation)
Written with a for-loop:
import string


def remove_punctuation(text):
    result = []
    for char in text:
        if char not in string.punctuation:
            result.append(char)
    return "".join(result)
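To connect this back to the original question, here is a minimal sketch of how remove_punctuation might be combined with the word count. It assumes the punctuation should be stripped from the whole string before splitting, and it uses dict.get() in place of the if/else from the question; both are choices made for illustration, not the only way to do it.

```python
import string


def remove_punctuation(text):
    # Keep every character that is not ASCII punctuation.
    return "".join(char for char in text if char not in string.punctuation)


def word_distribution(text_string):
    # Strip punctuation first, then lowercase and split on whitespace.
    words = remove_punctuation(text_string).lower().split()
    counts = {}
    for wd in words:
        # get() returns 0 the first time a word is seen.
        counts[wd] = counts.get(wd, 0) + 1
    return counts


print(word_distribution("Hello, world! Hello again."))
# {'hello': 2, 'world': 1, 'again': 1}
```

Note that this also removes punctuation inside words, so "don't" becomes "dont". If that matters, only the ends of each word should be stripped.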
def removing(x):
  if x[-1].isalpha() == True:
    return x
  else:
    return x[:-1]
I am not sure what 'x' refers to in the above code. If it refers to the whole string, x[-1] is the last character of that string, so no surprises there.
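If 'x' is instead meant to be a single word, the check can be applied to each word inside the counting loop. The sketch below assumes that reading. strip_trailing is a hypothetical helper name, and it uses a while loop so that endings like "..." are removed completely. Because isalpha() is the test, trailing digits are stripped as well.

```python
def strip_trailing(word):
    # Hypothetical helper: drop characters from the end until the last one is a letter.
    while word and not word[-1].isalpha():
        word = word[:-1]
    return word


def word_distribution(text_string):
    counts = {}
    for wd in text_string.lower().split():
        wd = strip_trailing(wd)
        if wd:  # skip tokens that were punctuation only, such as "--"
            counts[wd] = counts.get(wd, 0) + 1
    return counts


print(word_distribution("Wait... what? Wait!"))
# {'wait': 2, 'what': 1}
```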

You need to remove the punctuation before splitting the string into words. Look up the str.replace() method.
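A rough sketch of that suggestion, assuming string.punctuation (ASCII punctuation only) is an acceptable definition of "punctuation" here. str.replace() returns a new string, so the result has to be assigned back on every pass.

```python
import string


def remove_punctuation(text):
    # Replace each ASCII punctuation character with an empty string.
    for mark in string.punctuation:
        text = text.replace(mark, "")
    return text


# Remove punctuation first, then lowercase and split into words.
words = remove_punctuation("One, two; three!").lower().split()
print(words)  # ['one', 'two', 'three']
```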

Hope it helps.