Python Forum

Full Version: how to count frequency in a column
You're currently viewing a stripped down version of our content. View the full version with proper formatting.
Hello,

Can somebody please tell me how I can count number of 'Buy', 'Sell' on a yearly basis for the chart below.

thanks,

YK303

Output:
SYMBOL PROCESS Year COMMISSION TRANSACTION_TYPE @.HSE 170120C00016000 2016-01-12 2016 11.49 BUY 2016-01-13 2016 11.49 BUY 2016-01-26 2016 11.49 BUY 2016-01-28 2016 10.00 SELL @AAPL 131227C00530000 2013-01-21 2013 7.49 BUY ... ... ... ... ... NaN 2020-01-15 2020 0.00 INT 2020 0.00 INT 2020-01-16 2020 0.00 REB 2020-01-20 2020 0.00 EFT 2020-01-28 2020 0.00 EFT
Is this a dataframe from pandas?

With Python, you can use collections.Counter, but pandas should have a function for this. There is a count and a groupby function, but they don't return the frequency of unique values.

from collections import Counter
import pandas as pd

columns = ("SYMBOL", "PROCESS", "Year", "COMMISSION", "TRANSACTION_TYPE")
data = [
    ("AB", "C", 2010, "A", "sell"),
    ("AB", "C", 2010, "C", "buy"),
    ("AC", "C", 2012, "C", "buy"),
    ("AD", "C", 2010, "A", "sell"),
]
df = pd.DataFrame(data, columns=columns)

frequency = Counter(df["SYMBOL"])
for value, frequency in frequency.most_common():
    print(f"Value: {value}, Freq: {frequency}")
Output:
Value: AB, Freq: 2 Value: AC, Freq: 1 Value: AD, Freq: 1
What you should read:
And maybe someone else has a solution with pandas-functions for this task.
Hi DeaD_EyE,

I am indeed using Pandas and DataFrame.

I don't think this is what I am looking for. Is there a way I can attach an image on this post. I don't think I can.

I just want to tally up buys, sells, etc for year.

so the new table should look like this

Year Buy Sell
2012 20 15
2013 25 20
2014 22 8
2015 35 30
2016 24 22

Thanks,

YK303

(Jan-14-2021, 09:34 AM)DeaD_EyE Wrote: [ -> ]Is this a dataframe from pandas?

With Python, you can use collections.Counter, but pandas should have a function for this. There is a count and a groupby function, but they don't return the frequency of unique values.

from collections import Counter
import pandas as pd

columns = ("SYMBOL", "PROCESS", "Year", "COMMISSION", "TRANSACTION_TYPE")
data = [
    ("AB", "C", 2010, "A", "sell"),
    ("AB", "C", 2010, "C", "buy"),
    ("AC", "C", 2012, "C", "buy"),
    ("AD", "C", 2010, "A", "sell"),
]
df = pd.DataFrame(data, columns=columns)

frequency = Counter(df["SYMBOL"])
for value, frequency in frequency.most_common():
    print(f"Value: {value}, Freq: {frequency}")
Output:
Value: AB, Freq: 2 Value: AC, Freq: 1 Value: AD, Freq: 1
What you should read:
And maybe someone else has a solution with pandas-functions for this task.
I'm not a regular user of pandas, so I've read also the documentation like others shuld.

import pandas as pd

columns = ("SYMBOL", "PROCESS", "Year", "COMMISSION", "TRANSACTION_TYPE")
data = [
    ("AB", "C", 2010, "A", "sell"),
    ("AB", "C", 2010, "C", "buy"),
    ("AC", "C", 2012, "C", "buy"),
    ("AD", "C", 2010, "A", "sell"),
]
df = pd.DataFrame(data, columns=columns)

data = []
for year, block in df.groupby("Year"):
    buy = (block["TRANSACTION_TYPE"] == "buy").sum()
    sell = (block["TRANSACTION_TYPE"] == "sell").sum()
    data.append({"Year": year, "Buy": buy, "Sell": sell})

statistic = pd.DataFrame(data=data, columns=("Year", "Buy", "Sell"))
I think this is a little closer to that what you want to solve.
I guess there is still a better way to do it.