Python Forum
pandas str.extract multiple regex groups with OR
#2
With a sample dataset, the desired result can be achieved as follows:

>>> import pandas as pd
>>> df = pd.DataFrame('3" deep, 4 inches deep, 5" depth'.split(','), columns=['Depth'])
>>> df
            Depth
0         3" deep
1   4 inches deep
2        5" depth
>>> df.Depth.str.extract(r'(\d+)')
   0
0  3
1  4
2  5
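The thread title asks about OR between regex groups, but the original pattern is not shown in this post. As a rough sketch only, assuming the units of interest are `"` and `inches` (my assumption, not the asker's pattern), the alternatives can go into a non-capturing group so that `str.extract` still returns a single column:

```python
import pandas as pd

df = pd.DataFrame({'Depth': ['3" deep', '4 inches deep', '5" depth']})

# One capture group for the number; the unit alternatives sit in a
# non-capturing group (?:...) so they do not create extra columns.
# The unit list ('"' or 'inches') is an assumption for illustration.
pattern = r'(\d+)\s*(?:"|inches)'

# expand=False returns a Series when the pattern has one capture group.
depth = df['Depth'].str.extract(pattern, expand=False)
print(depth.tolist())  # ['3', '4', '5']
```

If each alternative had its own capture group instead, every group would become a separate column, with NaN wherever that alternative did not match.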
EDIT: If a numeric value is needed, the column should probably have an int (or float) dtype. That can be done like this:

>>> df['Depth_number'] = df.Depth.str.extract(r'(\d+)').astype(int)
>>> df
            Depth  Depth_number
0         3" deep             3
1   4 inches deep             4
2        5" depth             5
>>> df.dtypes
Depth           object
Depth_number     int64
dtype: object
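One caveat: `astype(int)` raises an error if any row has no digits, because `str.extract` returns NaN for rows that do not match. A possible workaround, sketched here with a made-up `'unknown'` row that is not part of the sample data above, is to convert with `pd.to_numeric` and pandas' nullable `Int64` dtype:

```python
import pandas as pd

# 'unknown' is a hypothetical non-matching value added for illustration.
df = pd.DataFrame({'Depth': ['3" deep', 'unknown', '5" depth']})

# Rows without digits come back as NaN from str.extract.
extracted = df['Depth'].str.extract(r'(\d+)', expand=False)

# to_numeric converts the digit strings to numbers; the nullable Int64
# dtype keeps integer values while allowing missing entries (<NA>).
df['Depth_number'] = pd.to_numeric(extracted, errors='coerce').astype('Int64')
print(df['Depth_number'].dtype)  # Int64
```

Whether a missing value or a default such as 0 is the right choice depends on what the column is used for afterwards.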

Messages In This Thread
RE: pandas str.extract multiple regex groups with OR - by perfringo - Dec-19-2019, 02:17 PM
