Python Forum

Full Version: String extraction
Hi everyone,

I am looking at extracting name titles and putting them into their own column in a dataset.

The names are in this format: grant-perice, Mr. Owen Harris.

Can I start a string search at the comma and end at the period, extract whatever is in between into its own column, and strip any whitespace? Perhaps with the extract method?

Thanks
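>>> my_string = 'grant-perice, Mr. Owen Harris'
>>> # split on the comma: surname on the left, title and given names on the right
>>> surname, rest = my_string.split(',')
>>> surname
'grant-perice'
>>> rest.strip()
'Mr. Owen Harris'
>>> # everything after the last period is the given names
>>> rest.split('.')[-1].strip()
'Owen Harris'
>>> # splitting the whole string on the period works too
>>> before, after = my_string.split('.')
>>> after.strip()
'Owen Harris'
>>> before
'grant-perice, Mr'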
Of course, you can also use a regular expression; see the re module.
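As a rough illustration of the re route, here is a minimal sketch. The pattern and the helper name extract_title are my own assumptions, not something from the posts above; the pattern simply captures whatever sits between the first comma and the next period, so it would need adjusting for names that contain extra commas or periods.

```python
import re

# Assumed pattern: a comma, optional whitespace, then everything up to
# (but not including) the next period.
TITLE_RE = re.compile(r",\s*([^.]+)\.")

def extract_title(full_name):
    """Return the text between the comma and the period, or None if absent."""
    match = TITLE_RE.search(full_name)
    return match.group(1).strip() if match else None

print(extract_title('grant-perice, Mr. Owen Harris'))  # prints: Mr
```

Returning None for strings without a comma-then-period part keeps malformed rows from raising an exception.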

Or you can install the third-party package nameparser:
>>> import nameparser
>>> name = nameparser.HumanName('grant-perice, Mr. Owen Harris')
>>> name
<HumanName : [
	title: 'Mr.' 
	first: 'Owen' 
	middle: 'Harris' 
	last: 'grant-perice' 
	suffix: ''
	nickname: ''
]>
>>> name = nameparser.HumanName('Mr. Owen Harris')
>>> name
<HumanName : [
	title: 'Mr.' 
	first: 'Owen' 
	middle: '' 
	last: 'Harris' 
	suffix: ''
	nickname: ''
]>
That is just one example. There is a column called 'Names' with over 100 entries.
(Jul-21-2018, 10:49 AM) Scott wrote: That is just one example. There is a column called 'Names' with over 100 entries.
Yes, that is an example. Now start coding and implement the same in a loop. We are not going to do it for you.
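For anyone landing on this thread later, the loop idea could look roughly like the sketch below. It is only a sketch under assumptions: the list names stands in for the 'Names' column, the second and third rows are made-up sample data, and the helper title_of is a hypothetical name. It reuses the split-on-comma, split-on-period approach shown earlier, but with str.partition so that rows without a comma or period do not raise an error.

```python
# Stand-in for the 'Names' column; only the first row comes from the thread,
# the other two are invented to show the edge cases.
names = [
    'grant-perice, Mr. Owen Harris',
    'smith, Mrs. Jane Mary',
    'no title here',
]

def title_of(full_name):
    """Return the text between the first comma and the next period, or None."""
    _, comma, after_comma = full_name.partition(',')
    if not comma:
        return None          # no comma, so no title part to look at
    title, period, _ = after_comma.partition('.')
    if not period:
        return None          # comma but no period after it
    return title.strip()     # drop the surrounding whitespace

# Apply the helper to every entry, one title per row.
titles = [title_of(n) for n in names]
print(titles)  # prints: ['Mr', 'Mrs', None]
```

If the data lives in a pandas DataFrame, the extract method mentioned in the question is presumably Series.str.extract, which takes a regular expression with a capture group and returns the captured text for each row; the list comprehension above is the plain-Python equivalent.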