 Substr on Pandas Dataframe
#1
Hi everyone,

I have a DataFrame and I want to write an if condition in a function that sums a value when the first part of a field equals '10'. This would be easy in SAS with the substr function. Can I do it directly on a DataFrame, or do I need to put the data into an array and slice it?

I have pasted the DF below; the column headers may not align well, but you should be able to make it out.

Output:
         HSC      Country   Month  Imports_(NZD)  Harmonised System Description
0  101210015  New Zealand  201903        191,550  Horses; live, pure-bred breeding animals, thor...
1  101210015  New Zealand  201904        190,550  Horses; live, pure-bred breeding animals, thor...
2  101290010  New Zealand  201903         76,660  Horses; live, other than pure-bred breeding an...
3  101290010  New Zealand  201904      1,187,430  Horses; live, other than pure-bred breeding an...
4  101290013  New Zealand  201904      1,257,700  Horses; live, other than pure-bred breeding an...
What I want is an output with Month as the index and a new variable holding the sum of Imports where substr(hsc,0,2) = '01', grouped by month. I just need help with the first variable; I will then create a few more sums based on the HSC, also grouped by month, and add them as new columns.

I hope that makes sense. Please let me know if you need more info.

Thanks
#2
(Sep-01-2019, 06:21 AM)Scott Wrote: I have a DF and I want to set an if statement in a function to sum a value if the first part of a field = '10'.
You need to convert the values to strings first and then use the .str.startswith method.

Take a look at the following minimal example I just wrote:

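import pandas as pd
df = pd.DataFrame({"x": [100, 1000, 1000, 1919, 124], "y": [1, 2, 3, 4, 5]})
# Cast x to str, keep the rows whose x starts with '10', and sum their y values
total = df.loc[df.x.astype(str).str.startswith('10'), 'y'].sum()
print(total)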
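The same idea can probably be extended to the grouped-by-month sum described in the question. The following is only a sketch under a few assumptions: the column names HSC, Month and Imports_(NZD) are taken from the pasted output, the import values are assumed to be stored as strings with thousands separators, and the name imports_10 is made up for illustration.

```python
import pandas as pd

# Sample rows copied from the output pasted in the question
df = pd.DataFrame({
    "HSC": [101210015, 101210015, 101290010, 101290010, 101290013],
    "Month": [201903, 201904, 201903, 201904, 201904],
    "Imports_(NZD)": ["191,550", "190,550", "76,660", "1,187,430", "1,257,700"],
})

# Assumption: the import values are strings like "1,187,430", so strip the commas
imports = df["Imports_(NZD)"].str.replace(",", "", regex=False).astype(int)

# Boolean mask: True where the HSC code, viewed as a string, starts with '10'
mask = df["HSC"].astype(str).str.startswith("10")

# Zero out the non-matching rows, then sum per month; Month becomes the index
# (imports_10 is a hypothetical name)
imports_10 = imports.where(mask, 0).groupby(df["Month"]).sum().rename("imports_10")
print(imports_10)
```

Because non-matching rows are zeroed rather than dropped, every month stays in the index, so further sums built with other prefixes should line up as extra columns. If the codes are really meant to start with '01', the leading zero may have been lost when the column was read as integers; in that case the column would need to be loaded as strings for the prefix test to work.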