Dec-03-2018, 02:23 AM
(This post was last modified: Dec-03-2018, 02:23 AM by 577e94982d620b84f7c536d5e76f1e.)
Hey everyone,
I'm trying to organise and filter data in order to plot variables against each other.
I have a DataFrame with two columns, x and y. I want to draw box plots from the data, but first I need to split the x values into subgroups and assign each y value to the subgroup its x value falls in. For example, the subgroups of x are 0-10, 10-20, 20-30 etc., and if a row in the original DataFrame has an x value of 15, its y value belongs to the 10-20 subgroup. The result should be a new DataFrame holding the y values and the subgroup each one belongs to.
I've attached an image to hopefully explain the problem better. ![[Image: jX1C4HL]](https://imgur.com/a/jX1C4HL)
So far I've imported the CSV file and created a new DataFrame from the columns I need. I'm not sure how to check which subgroup a value of x falls into, or what the correct process for doing so is.
I was thinking of making a list of the subgroups, from 0-10 up to 90-100 in steps of 10, and then writing an if statement.
```python
import pandas as pd

# importing the main dataset
dataset1 = pd.read_csv('dataset1.csv', index_col=0)

# creating a DataFrame with columns x and y
xydata = dataset1.loc[:, ['x', 'y']]

# creating subset of x values
wd_bins = ["0-10", "10-20", "20-30", "30-40", "40-50", "50-60", "60-70", "70-80", "80-90", "90-100"]
```

I am relatively new to Python, so any help or advice would be greatly appreciated.
Thanks!
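One possible way to do the grouping described above without writing an if statement per range is `pd.cut`, which assigns each x value to an interval. The sketch below is only an illustration under assumptions: the sample values are made up to stand in for `dataset1.csv`, the column name `x_bin` and the names `edges` and `y_by_bin` are hypothetical, and the choice to treat each subgroup as including its lower edge but not its upper edge is a guess at what is wanted.

```python
import pandas as pd

# Made-up sample data standing in for the x and y columns of dataset1.csv
xydata = pd.DataFrame({
    "x": [3, 15, 27, 42, 15.5, 99, 60],
    "y": [1.0, 2.5, 3.1, 4.8, 2.9, 9.7, 6.2],
})

# Bin edges 0, 10, ..., 100 and one text label per interval
edges = list(range(0, 101, 10))
wd_bins = [f"{lo}-{hi}" for lo, hi in zip(edges[:-1], edges[1:])]

# right=False makes each bin include its lower edge and exclude its upper
# edge, so x = 10 lands in "10-20". With this choice x = 100 falls outside
# every bin and is labelled NaN.
binned = xydata.assign(
    x_bin=pd.cut(xydata["x"], bins=edges, labels=wd_bins, right=False)
)

# New DataFrame holding only the subgroup and the y values
y_by_bin = binned[["x_bin", "y"]]
print(y_by_bin)
```

From there, something like `y_by_bin.boxplot(column="y", by="x_bin")` should draw one box per subgroup, provided matplotlib is installed. If the upper edge should be the inclusive one instead, dropping `right=False` switches `pd.cut` back to its default behaviour.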