 Removing hyphens and adding zeros
What does the following code from the book Learn Data Analysis with Python do? Can someone explain the first 10-12 lines? Why does right() return s[-9], which is then assigned to ssn, and why the negative index? As I understand it, the code first removes the hyphens, then splits on whitespace and rejoins the pieces. If there are fewer than 9 digits and the value is not 'Missing', zeros are prepended. Why add the zeros, and what happens next?

The ssns are:
ssns = ['867-53-0909','333-22-4444','123-12-1234',

def right(s, amount):
    return s[-amount]

def standardize_ssn(ssn):
    ssn = ssn.replace("-", "")
    ssn = "".join(ssn.split())
    if len(ssn) < 9 and ssn != 'Missing':
        ssn = "000000000" + ssn
    return ssn

df.ssn = df.ssn.apply(standardize_ssn)
buran wrote Aug-14-2019, 10:18 AM:
Please use proper tags when posting code, tracebacks, output, etc. This time I have added the tags for you.
See BBcode help for more info.
There's a colon missing: either it is missing in the book or you overlooked it.
def right(s, amount):
    return s[-amount:]
If the ssn has fewer than 9 digits, it is set to nine zeros plus the given digits, and then the last nine characters are returned.
In other words, zeros are added to the start of the ssn.
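To make the difference concrete, here is a minimal sketch that steps through the same logic outside pandas. The names right_buggy and right_fixed are made up for this illustration, and the call to right_fixed(ssn, 9) inside the if block is an assumption based on the description above; the code quoted in the question does not show that call.

```python
def right_buggy(s, amount):
    # Indexing: returns the single character `amount` places from the end
    return s[-amount]

def right_fixed(s, amount):
    # Slicing: returns the last `amount` characters
    return s[-amount:]

def standardize_ssn(ssn):
    ssn = ssn.replace("-", "")      # drop hyphens
    ssn = "".join(ssn.split())      # drop any whitespace
    if len(ssn) < 9 and ssn != "Missing":
        ssn = "000000000" + ssn     # pad generously on the left...
        ssn = right_fixed(ssn, 9)   # ...then keep only the last nine characters (assumed call)
    return ssn

print(right_buggy("000000000123", 9))  # 0
print(right_fixed("000000000123", 9))  # 000000123
print(standardize_ssn("1-2-3"))        # 000000123
print(standardize_ssn("867-53-0909"))  # 867530909
print(standardize_ssn("Missing"))      # Missing
```

Without the colon, s[-9] picks out one character; with it, s[-9:] is a slice running from the ninth-from-last character to the end of the string.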
There is a string method, zfill, for padding with zeros. If the objective is a standard length (9), with shorter values left-padded with zeros and hyphens removed, one can simply do:

>>> ssns = ['867-53-0909','333-22-4444','123-12-1234', '777-93-9311','123-12-1423', '1-2-3', 'Missing', '45-67-78']
>>> [''.join(ssn.split('-')).zfill(9) for ssn in ssns if ssn != 'Missing']
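Note that the comprehension above filters 'Missing' out of the result. If the goal is to keep such entries untouched, as the book's function does, a sketch along these lines might work. The function name standardize_ssn_zfill is hypothetical and chosen only for this example.

```python
def standardize_ssn_zfill(ssn):
    # Leave the placeholder value as it is
    if ssn == "Missing":
        return ssn
    # Remove hyphens, then left-pad with zeros to a width of 9
    return ssn.replace("-", "").zfill(9)

ssns = ['867-53-0909', '333-22-4444', '123-12-1234', '777-93-9311',
        '123-12-1423', '1-2-3', 'Missing', '45-67-78']

cleaned = [standardize_ssn_zfill(s) for s in ssns]
print(cleaned)
# ['867530909', '333224444', '123121234', '777939311',
#  '123121423', '000000123', 'Missing', '000456778']
```

Assuming a DataFrame df with an ssn column as in the original post, the same function could be passed to df.ssn.apply in place of standardize_ssn.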
Thank you Thomas and Perfringo. I noticed the missing colon myself while working through the code step by step, running it after each line. Then I logged in and saw your replies as well. The colon is actually missing in the book. Well, problem solved.

Sure Buran, I will format with tags in future. Let me check that it works:

for i in range(10):
    print(i)
