Substr on Pandas Dataframe - Scott - Sep-01-2019

Hi everyone,

I have a DF and I want to set an if statement in a function to sum a value if the first part of a field = '10'. This would be easy in SAS with the substr function. Can I do it in a dataframe, or do I need to put it into an array and slice?

I have pasted the DF below; the column headers don't align well, but you can make it out.

What I want is an output with month as the index and a new variable that sums Import where substr(hsc,0,2) = '01', grouped by month. I just want help with the first variable; then I am going to create a few more sums based on the HSC, also grouped by month, and have them as the new columns.

I hope that makes sense. Please let me know if you need more info.

Thanks

RE: Substr on Pandas Dataframe - scidam - Sep-02-2019

(Sep-01-2019, 06:21 AM)Scott Wrote: I have a DF and I want to set an if statement in a function to sum a value if the first part of a field = '10'.

You need to convert the values to strings first and then use the .str.startswith method. Take a look at the following minimal example I just wrote:

    import pandas as pd

    df = pd.DataFrame({"x": [100, 1000, 1000, 1919, 124], "y": [1, 2, 3, 4, 5]})
    # Sum y over the rows where x, read as a string, starts with '10'
    df.loc[df.x.astype(str).str.startswith('10'), 'y'].sum()
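To get from that single sum to the month-indexed output described in the question, the same startswith mask can be combined with a groupby. The following is only a sketch: the original DataFrame was not included in the post, so the sample data is made up, the column names `month`, `hsc` and `Import` are assumed from the question's wording, and `sum_by_prefix` is a hypothetical helper name.

```python
import pandas as pd

# Made-up sample data; the column names are assumptions based on the question.
df = pd.DataFrame({
    "month": ["2019-01", "2019-01", "2019-02", "2019-02", "2019-02"],
    "hsc": ["0101", "1001", "0102", "0203", "1005"],
    "Import": [10, 20, 30, 40, 50],
})


def sum_by_prefix(frame, prefix, code_col="hsc", value_col="Import", group_col="month"):
    """Sum value_col per group_col over rows whose code starts with prefix."""
    # Convert to str so .str.startswith also works on non-string columns.
    mask = frame[code_col].astype(str).str.startswith(prefix)
    # Zero out the non-matching rows instead of dropping them, so every
    # month still appears in the result even when nothing matches.
    return frame[value_col].where(mask, 0).groupby(frame[group_col]).sum()


# One new column per HSC prefix, indexed by month.
result = pd.DataFrame({
    "import_01": sum_by_prefix(df, "01"),
    "import_10": sum_by_prefix(df, "10"),
})
print(result)
```

One caveat with this approach: if the HSC codes were loaded as integers, a leading zero such as the one in '01' is already lost before `astype(str)` runs, so the column would need to be read in as strings in the first place.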