Python Forum

Full Version: Output substrings from rows in pandas
I have a dataframe with a column "C" which contains address information. How can I single out the unique cities and count them? In the example below they would be London, Paris and Barcelona.

A     B    C
Grey  no   84 Sussex Gardens Westminster Borough London W2 1UH United Kingdom
Red   yes  71 Rue de Charonne 11th arr 75011 Paris France
Blue  yes  Diputaci 262 264 Eixample 08007 Barcelona Spain
Do you have the list of cities you want to look for? You'll need some form of logic to identify which words in the string are cities. If you're looking for a particular city (e.g. London), you could create a column with something like this (I'm writing the syntax from memory, so you might want to play with it):

import numpy as np
df['Is London'] = np.where(df['C'].str.contains('London'), 1, 0)
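If a list of candidate cities is available, the same idea extends to all of them at once. Here is a minimal sketch, assuming a hypothetical `known_cities` list and the sample rows from the question; `Series.str.extract` pulls the first matching name out of each row, and the `City` column name is only illustrative.

```python
import re

import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    "A": ["Grey", "Red", "Blue"],
    "B": ["no", "yes", "yes"],
    "C": [
        "84 Sussex Gardens Westminster Borough London W2 1UH United Kingdom",
        "71 Rue de Charonne 11th arr 75011 Paris France",
        "Diputaci 262 264 Eixample 08007 Barcelona Spain",
    ],
})

# Hypothetical list of cities to look for (an assumption, not from the thread)
known_cities = ["London", "Paris", "Barcelona"]

# Build one regex with a single capture group: \b(London|Paris|Barcelona)\b
pattern = r"\b(" + "|".join(re.escape(c) for c in known_cities) + r")\b"

# expand=False returns a Series; rows without a match become NaN
df["City"] = df["C"].str.extract(pattern, expand=False)

print(df["City"].tolist())   # ['London', 'Paris', 'Barcelona']
print(df["City"].nunique())  # 3
```

This still requires knowing the cities up front, so it only helps when such a list exists.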
(Jun-20-2018, 02:54 PM)mecampbell Wrote: Do you have the list of cities you want to look for?
No, but the city is usually the second-to-last word in the column or, if the address is in the UK, the word before the postcode.


(Jun-20-2018, 02:54 PM)mecampbell Wrote: df['Is London'] = np.where(df['C'].str.contains('London'), 1, 0)
This works, but it means I need to know the cities in advance.
How about using split() with " " to get the second last word and then use isnumeric() to determine if it has a number. If it has a number, then consider the previous word.
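The suggestion above can be sketched roughly as follows. Two assumptions are worth flagging: `str.isnumeric()` is only true when every character is numeric, so a postcode part such as "W2" would slip through, and the sketch uses a per-character digit check instead; and "United Kingdom" is treated as a two-word country so that it is skipped as a unit. The names `has_digit` and `guess_city` are illustrative only.

```python
def has_digit(word):
    """Return True if any character in the word is a digit."""
    return any(ch.isdigit() for ch in word)


def guess_city(address):
    """Guess the city: the word before the country, skipping postcode parts."""
    words = address.split()
    # Assumption: the only two-word country in the data is "United Kingdom"
    skip = 2 if words[-2:] == ["United", "Kingdom"] else 1
    i = len(words) - 1 - skip
    # Walk backwards past any words that contain digits (postcode pieces)
    while i >= 0 and has_digit(words[i]):
        i -= 1
    return words[i] if i >= 0 else None


print(guess_city("84 Sussex Gardens Westminster Borough London W2 1UH United Kingdom"))  # London
print(guess_city("71 Rue de Charonne 11th arr 75011 Paris France"))  # Paris
print(guess_city("Diputaci 262 264 Eixample 08007 Barcelona Spain"))  # Barcelona
```

Cities whose names span several words, or other multi-word countries, would need extra rules.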
(Jun-20-2018, 03:57 PM)Nwb Wrote: How about using split() with " " to get the second last word and then use isnumeric() to determine if it has a number. If it has a number, then consider the previous word.

That sounds good. Do you know how I can output these separately for each cell?
I have singled out a column of data from a dataframe via
address = reviews.iloc[:, 1]
which contains addresses. How can I output just the city for each row? The city is the second-to-last word or, for UK addresses, the third-to-last word that does not contain numbers.
Thanks in advance.

0 372 Strand Westminster Borough London WC2R 0JJ United Kingdom
1 Rossell 249 Eixample 08008 Barcelona Spain
2 Damrak 1 5 Amsterdam City Center 1012 LG Amsterdam
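One way to apply the rule described above to every row is `Series.apply`. This is a minimal sketch, assuming the three sample rows shown and that UK rows always end in "United Kingdom"; `city_from_address` is a hypothetical helper name.

```python
import pandas as pd

# Sample rows copied from the post (stand-in for reviews.iloc[:, 1])
address = pd.Series([
    "372 Strand Westminster Borough London WC2R 0JJ United Kingdom",
    "Rossell 249 Eixample 08008 Barcelona Spain",
    "Damrak 1 5 Amsterdam City Center 1012 LG Amsterdam",
])


def city_from_address(text):
    """Apply the stated rule: drop words with digits, then count from the end."""
    # Keep only the words that contain no digits
    words = [w for w in text.split() if not any(ch.isdigit() for ch in w)]
    # Assumption: UK rows end with the two-word country "United Kingdom"
    position = 3 if text.endswith("United Kingdom") else 2
    return words[-position] if len(words) >= position else None


cities = address.apply(city_from_address)
print(cities.tolist())  # ['London', 'Barcelona', 'LG']
```

The third sample row has no country at the end, so the rule picks "LG" rather than Amsterdam; rows like that would need their own handling. Once the column is clean, `cities.unique()` and `cities.nunique()` give the distinct cities and their count.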