Python Forum
pandas dataframe.replace regex

pandas dataframe.replace regex - metalray - Feb-24-2017

Dear Pandas Experts,
I am trying to replace occurrences like "United Kingdom of Great Britain and Ireland" or "United Kingdom of Great Britain & Ireland"
with just "United Kingdom". I thought I could use a regex to look for strings that contain "United Kingdom".
However, my two attempts below do not work:

    dftwo['Country'].replace(r'^United Kingdom of Great Britain','United Kingdom',inplace=True, regex=True)

    dftwo['Country'].replace('/United Kingdom/','United Kingdom',inplace=True, regex=True)

I would really appreciate any help!


RE: pandas dataframe.replace regex - zivoni - Feb-24-2017

You have no "wildcards" in your patterns, so nothing matches the rest of the string after "Great Britain":

Output:
    In [39]: import pandas as pd

    In [40]: df = pd.DataFrame({"country":["United Kingdom of Great Britain", "Ireland", "United Kingdom of Great Britain & Ireland"], "value":[12,31, 43]})

    In [41]: df
    Out[41]:
                                         country  value
    0            United Kingdom of Great Britain     12
    1                                    Ireland     31
    2  United Kingdom of Great Britain & Ireland     43

    In [42]: df.country.replace("^United Kingdom of Great Britain.*", "United Kingdom", regex=True, inplace=True)

    In [43]: df
    Out[43]:
              country  value
    0  United Kingdom     12
    1         Ireland     31
    2  United Kingdom     43
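
If you would rather not rely on inplace=True on a column selection, here is a minimal sketch of the same idea that assigns the result back to the column instead. The sample data and the variable name `partial` are made up for illustration. It also shows what the first attempt from the question does without the trailing `.*`: only the matched prefix is swapped out and the rest of the string stays.

```python
import pandas as pd

# Hypothetical sample data, modeled on the country names in the question.
df = pd.DataFrame({
    "Country": [
        "United Kingdom of Great Britain and Ireland",
        "United Kingdom of Great Britain & Ireland",
        "Ireland",
    ],
    "value": [12, 43, 31],
})

# No trailing ".*": only the matched prefix is replaced, the tail is kept.
partial = df["Country"].replace(
    r"^United Kingdom of Great Britain", "United Kingdom", regex=True
)

# With ".*" the pattern consumes the rest of the string,
# so the whole value becomes "United Kingdom".
df["Country"] = df["Country"].replace(
    r"^United Kingdom.*", "United Kingdom", regex=True
)

print(partial.tolist())
print(df["Country"].tolist())
```

Rows that do not start with "United Kingdom" (here "Ireland") are left untouched.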



RE: pandas dataframe.replace regex - metalray - Feb-24-2017

Hi zivoni!
Thank you so much. I had no idea that I could use a wild card there!


RE: pandas dataframe.replace regex - zivoni - Feb-24-2017

"Wildcard" is a little vague. pandas' replace is built on top of the standard Python re.sub, so you can use exactly the same regular expressions you would use with re.sub, and the Python re documentation is your friend.
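
To try a pattern outside pandas first, something like the rough sketch below with plain re may help. The list `names` is just example data taken from the session above, and the variable names are made up. It also shows why the second attempt from the question matched nothing: in Python the slashes in '/United Kingdom/' are ordinary characters, not regex delimiters.

```python
import re

# Example strings taken from the session above.
names = [
    "United Kingdom of Great Britain",
    "Ireland",
    "United Kingdom of Great Britain & Ireland",
]

# Same pattern as in the pandas call; ".*" swallows the rest of the string.
pattern = r"^United Kingdom of Great Britain.*"
cleaned = [re.sub(pattern, "United Kingdom", name) for name in names]
print(cleaned)

# Slashes are literal characters in Python regexes, so this pattern
# only matches text that actually contains "/United Kingdom/".
slash_hits = [name for name in names if re.search("/United Kingdom/", name)]
print(slash_hits)
```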