Python Forum
Removing hyphens and adding zeros - Printable Version

Removing hyphens and adding zeros - sidney - Aug-14-2019

What does the following code from the book Learn Data Analysis with Python do? Can someone elaborate on the first 10-12 lines? Why return s[-9], or in other words, why assign that to ssn? Is that a negative index? As I understand it, the hyphens are removed first (replaced with an empty string), then the string is split on whitespace and rejoined. If the number of digits is less than 9 and the value is not 'Missing', then ... why add the zeros? And what happens next?

The ssns are:
ssns = ['867-53-0909','333-22-4444','123-12-1234',

def right(s, amount):
    return s[-amount]

def standardize_ssn(ssn):
    ssn = ssn.replace("-","")
    ssn = "".join(ssn.split())
    if len(ssn)<9 and ssn != 'Missing':
        ssn="000000000" + ssn
    return ssn

df.ssn = df.ssn.apply(standardize_ssn)

RE: Removing hyphens and adding zeros - ThomasL - Aug-14-2019

There's a colon missing, either in the book or you overlooked it.
def right(s, amount):
    return s[-amount:]
If the ssn has fewer than 9 digits, it is set to nine zeros plus the given digits, and then the last nine characters are returned.
So in other words, zeros are added to the start of the ssn until it is nine digits long.
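
To make the difference between s[-amount] and s[-amount:] concrete, here is a minimal sketch of the corrected functions. The line ssn = right(ssn, 9) is an assumption: it does not appear in the code as posted, but the explanation above (and the question's "assign to ssn") suggests the book calls right() at that point.

```python
def right(s, amount):
    # With the colon this is a slice: the last `amount` characters.
    # Without it, s[-amount] would be a single character.
    return s[-amount:]

def standardize_ssn(ssn):
    ssn = ssn.replace("-", "")      # remove hyphens
    ssn = "".join(ssn.split())      # remove any whitespace
    if len(ssn) < 9 and ssn != "Missing":
        ssn = "000000000" + ssn     # prepend nine zeros
        ssn = right(ssn, 9)         # assumed call: keep only the last nine characters
    return ssn

padded = "000000000" + "123"
print(padded[-9])                   # one character: '0'
print(right(padded, 9))             # nine characters: '000000123'
print(standardize_ssn("1-2-3"))     # '000000123'
print(standardize_ssn("Missing"))   # 'Missing' is left untouched
```

Prepending nine zeros and then keeping the last nine characters works for any input shorter than nine digits, whatever its length.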

RE: Removing hyphens and adding zeros - perfringo - Aug-14-2019

There is a string method, zfill, for padding with leading zeros. If the objective is a standard length of 9, padding shorter values with zeros and removing hyphens, one can simply do:

>>> ssns = ['867-53-0909','333-22-4444','123-12-1234', '777-93-9311','123-12-1423', '1-2-3', 'Missing', '45-67-78']
>>> [''.join(ssn.split('-')).zfill(9) for ssn in ssns if ssn != 'Missing']
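
Note that the comprehension above drops 'Missing' from the result. If the goal is a function that can be passed to df.ssn.apply, as in the original question, a variant that keeps 'Missing' in place might look like the sketch below. The function name standardize_ssn_zfill is made up for illustration.

```python
ssns = ['867-53-0909', '333-22-4444', '123-12-1234', '777-93-9311',
        '123-12-1423', '1-2-3', 'Missing', '45-67-78']

def standardize_ssn_zfill(ssn):
    # Hypothetical helper: leave the 'Missing' marker as it is.
    if ssn == 'Missing':
        return ssn
    # Remove hyphens, then left-pad with zeros to a width of 9.
    return ssn.replace('-', '').zfill(9)

cleaned = [standardize_ssn_zfill(ssn) for ssn in ssns]
print(cleaned)
```

zfill only pads; it never truncates, so values that already have nine or more digits pass through unchanged.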

RE: Removing hyphens and adding zeros - sidney - Aug-14-2019

Thank you Thomas and Perfringo. I realized the colon was missing after working through the code step by step, running it after each line. Then I logged in and noticed your reply as well. And the colon is actually missing in the book. Well, problem solved.

Sure Buran, will format with tags in future. I'm going to try to see if it's working:

for i in range(10):
    print(i)