Python Forum
Removing hyphens and adding zeros
What does the following code from the book Learn Data Analysis with Python mean? Can someone elaborate on the first 10-12 lines? Why return s[-9], or in other words assign it to ssn? Is that a negative index? As I understand it, the code first replaces the hyphens with a space, then splits at the spaces and rejoins? If the number of digits is less than 9 and the value is not 'Missing', then... why add the zeros? And what happens next?

The ssns are:
ssns = ['867-53-0909','333-22-4444','123-12-1234',

def right(s, amount):
    return s[-amount]
def standardize_ssn(ssn):
    ssn = ssn.replace("-","")
    ssn = "".join(ssn.split())
    if len(ssn)<9 and ssn != 'Missing':
        ssn="000000000" + ssn
    return ssn
df.ssn = df.ssn.apply(standardize_ssn)
There's a colon missing, either in the book or you overlooked it.
def right(s, amount):
    return s[-amount:]
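To see why the colon matters, here is a small sketch comparing the two forms on a padded string. The sample value is made up for illustration and is not from the book.

```python
# Hypothetical short SSN ("123") padded on the left with nine zeros.
s = "000000000" + "123"

# Without the colon, a negative index picks a single character,
# counted from the end of the string.
single_char = s[-9]

# With the colon, a negative slice returns everything from that
# position to the end, i.e. the last nine characters.
last_nine = s[-9:]

print(single_char)  # 0
print(last_nine)    # 000000123
```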
If the ssn has fewer than 9 digits, it is set to nine zeros plus the given digits, and then the last nine characters of that string are returned.
So, in other words, zeros are added to the start of the ssn until it is nine digits long.
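Putting the pieces together, a runnable sketch of the corrected function might look like the following. The line `ssn = right(ssn, 9)` does not appear in the code as posted; it is an assumption inferred from the question and the replies, which both describe the last nine characters being returned.

```python
def right(s, amount):
    # Return the last `amount` characters of s (note the colon).
    return s[-amount:]

def standardize_ssn(ssn):
    ssn = ssn.replace("-", "")       # drop hyphens
    ssn = "".join(ssn.split())       # drop any whitespace
    if len(ssn) < 9 and ssn != "Missing":
        ssn = "000000000" + ssn      # pad on the left with nine zeros
        ssn = right(ssn, 9)          # assumed step: keep only the last nine characters
    return ssn

for raw in ["867-53-0909", "1-2-3", "45-67-78", "Missing"]:
    print(raw, "->", standardize_ssn(raw))
```

Padding with nine zeros first and trimming afterwards means the function does not need to work out how many zeros each value is missing.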
There is a string method, zfill, for padding with zeros. If the objective is to remove the hyphens and give every value a standard length (9), filling shorter values with leading zeros, one can simply do:

>>> ssns = ['867-53-0909','333-22-4444','123-12-1234', '777-93-9311','123-12-1423', '1-2-3', 'Missing', '45-67-78']
>>> [''.join(ssn.split('-')).zfill(9) for ssn in ssns if ssn != 'Missing']
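Note that the list comprehension above drops the 'Missing' entry altogether. If the placeholder should be kept, as the book's function does, one possible variant is sketched below. The helper name `standardize` is hypothetical, and the sample list is a subset of the values above.

```python
ssns = ['867-53-0909', '1-2-3', 'Missing', '45-67-78']

def standardize(ssn):
    # Hypothetical helper: leave the placeholder untouched,
    # otherwise strip hyphens and left-pad with zeros to 9 characters.
    if ssn == 'Missing':
        return ssn
    return ssn.replace('-', '').zfill(9)

cleaned = [standardize(ssn) for ssn in ssns]
print(cleaned)
# ['867530909', '000000123', 'Missing', '000456778']
```

With a DataFrame like the one in the original post, the same helper could presumably be passed to df.ssn.apply(...).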
Thank you, Thomas and Perfringo. I spotted the missing colon myself while working through the code step by step, running it after each line. Then I logged in and saw your replies as well. The colon is actually missing in the book. Well, problem solved.

Sure Buran, will format with tags in future. I'm going to try to see if it's working:

for i in range(10):
    print(i)
