Python Forum
Pandas/Dataframes, Strings and Regular Expressions...
Hi all

I am new to the Python world (20 years ago I did some C/C++). For a newcomer, I have managed to achieve quite a lot so far. I successfully avoided regular expressions all these years, but that seems to be changing now...

I have not found a solution to this problem yet, and I also think I have not fully understood the indexing/selecting mechanism.

I have a data frame 'data_total' (what a thrilling name...) with the column INFO. It contains strings like 'X-Z-34567A' or 'X-Y-123456'.
I'd like to extract the numbers into a new column INFO_NR. A trailing letter should be replaced with a '0'.
In the end, the data should read '345670' and '123456'.

First I tried a slightly different approach: I extracted the number part, converted it to int and multiplied it by 10.

See the following code snippet:

import numpy as np

# This handles 'X-Z-34567A' correctly; rows that do not match become NaN
data_total['INFO_NR'] = data_total['INFO'].str.extract(r'^X-\w-(\d*)[A-Z]$', expand=False).str.strip()
data_total['INFO_NR'] = data_total['INFO_NR'].fillna('0')
data_total['INFO_NR'] = data_total['INFO_NR'].astype(np.int64)*10

# This handles 'X-Y-123456' correctly, but overwrites the previously processed rows with NaN!
data_total['INFO_NR'] = data_total['INFO'].str.extract(r'^X-\w-(\d*)$', expand=False).str.strip()
data_total['INFO_NR'] = data_total['INFO_NR'].astype(np.int64)*10
Both regexes work, but the second one overwrites the results of the first. How can I apply the second regex only to the rows where INFO_NR == 0, without deleting the first results?
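
One possible way to keep the first results is to store each extraction in its own Series and let fillna() take values from the second only where the first is missing. This is only a minimal sketch: the sample frame (including the 'no match' row) and the names with_letter and digits_only are made up for illustration, and the pattern assumes the trailing letter is a single uppercase A-Z.

```python
import pandas as pd

# Hypothetical sample data standing in for the real 'data_total' frame.
data_total = pd.DataFrame({"INFO": ["X-Z-34567A", "X-Y-123456", "no match"]})

# First pattern: digits followed by one trailing letter; the letter becomes '0'.
# Rows that do not match stay missing (NaN).
with_letter = data_total["INFO"].str.extract(r"^X-\w-(\d+)[A-Z]$", expand=False) + "0"

# Second pattern: digits only, kept unchanged.
digits_only = data_total["INFO"].str.extract(r"^X-\w-(\d+)$", expand=False)

# fillna() only touches rows where the first result is missing,
# so the values from the first pattern are preserved.
combined = with_letter.fillna(digits_only)

# Rows matching neither pattern are set to '0' before the integer conversion.
data_total["INFO_NR"] = combined.fillna("0").astype("int64")
```

The same idea could also be written with a boolean mask and data_total.loc[mask, 'INFO_NR'] = ..., which assigns only to the selected rows.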

And from what I have seen of Python so far, there should be a much more elegant solution out there :)
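
A more compact variant might use a single pattern that captures the digits together with an optional trailing letter, and then swaps that letter for '0'. This is a sketch under assumptions, not a definitive solution: the sample frame is invented, and the trailing letter is assumed to be one uppercase A-Z.

```python
import pandas as pd

# Hypothetical sample data standing in for the real 'data_total' frame.
data_total = pd.DataFrame({"INFO": ["X-Z-34567A", "X-Y-123456"]})

data_total["INFO_NR"] = (
    data_total["INFO"]
    # Capture the digits plus an optional trailing letter in one group.
    .str.extract(r"^X-\w-(\d+[A-Z]?)$", expand=False)
    # Replace a trailing letter, if there is one, with '0'.
    .str.replace(r"[A-Z]$", "0", regex=True)
)
```

The result is a column of strings ('345670' and '123456' for the sample rows); rows that do not match the pattern stay missing and would need handling before converting to integers.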

Looking forward to your inputs
Thank you
Stephan