Pandas Import CSV count between numerical values within 1 Column - Printable Version +- Python Forum (https://python-forum.io) +-- Forum: Python Coding (https://python-forum.io/forum-7.html) +--- Forum: Data Science (https://python-forum.io/forum-44.html) +--- Thread: Pandas Import CSV count between numerical values within 1 Column (/thread-19799.html) |
Pandas Import CSV count between numerical values within 1 Column - ptaylor520 - Jul-15-2019 fileuploadid completedate ... Difference successful (SUM) 0 156691956 03/06/19 00:00 ... 0 1 1 156691958 03/06/19 00:00 ... 0 1 2 156730185 10/06/19 00:00 ... 0 1 3 156815823 25/06/19 00:00 ... 0 1 4 156752103 14/06/19 00:00 ... -4 1 Hi There, fairly new to python. I have imported the csv above, head() output shown. I am been trying to start a script that will read the column 'Difference' and tell me how many instances of rows are there with a numerical value between 0 and 1. Can I do this in one simple query or do you need to create a list of all rows that contain a value of 0 and 1 and then do a count of that list? From there I can build the script to compute what normally is done through excel like this: =COUNTIFS(F:F, ">=0", F:F, "<=1") What I have seen from courses and online hasn't worked so far. One of the examples I have seen is this: df = pd.DataFrame({'a': [1, 2, 3]}) df[(df.a > 1) & (df.a < 3)].count()But when applied to my datadrame it's not giving the output I am expecting. Any assistance anybody can provide will be greatly appreciated. RE: Pandas Import CSV count between numerical values within 1 Column - ptaylor520 - Jul-15-2019 I managed to get a workaround but I don't think this this is the best practice: import pandas as pd tsjune = pd.read_csv("/Users/phil/Documents/datacsv/tsj.csv", delimiter = ',') one = tsjune[(tsjune.Difference >= 0) & (tsjune.Difference <=1)] two = tsjune[(tsjune.Difference > 1) & (tsjune.Difference <=3)] three = tsjune[(tsjune.Difference > 3) & (tsjune.Difference <30)] four = tsjune[tsjune.Difference >= 30] onecount = one.count() twocount = two.count() threecount = three.count() fourcount = four.count() total = onecount + twocount + threecount + fourcount onepercentage = round(onecount/total*100, 2) twopercentage = round(twocount/total*100, 2) threepercentage = round(threecount/total*100, 2) fourpercentage = round(fourcount/total*100, 2) print(onepercentage['Difference']) print(twopercentage['Difference']) print(threepercentage['Difference']) print(fourpercentage['Difference'])Which gives me a percentage output of: 26.89 28.29 32.8 12.01 Any better ways of writing this query? RE: Pandas Import CSV count between numerical values within 1 Column - scidam - Jul-16-2019 You need to work around .cut method.As a starting point, consider the following: df.groupby(pd.cut(df.x, [0, 1, 3, 30, df.Difference.max()], right=False)).count().values / len(df) * 100 RE: Pandas Import CSV count between numerical values within 1 Column - ptaylor520 - Jul-16-2019 Thanks! I'll give this a shot and post the results. |