Python Forum - Sample labels from excel file in order to put them on x-axis and y-axis of a plot

Pages: 1 2

Hello, I have an excel file with two columns, the 1st includes dates/datetimes and the 2nd includes temperatures. The number of rows are 100. I want to plot the values, but both on the x-axis and y-axis I want to display sampled values as labels from the excel file and not all of them. Obviously I need to include all the values inside the plot, however on x-axis and y-axis display sampled values as labels from the excel file. For instance display the labels of the 1st excel column on x-axis with step 10 (every 10 labels) and display the labels of the 2nd excel column on y-axis with step 10 (every 10 labels) . Any ideas how to implement that?

you can open the excel files with pandas,
or use
openpyxl https://pypi.org/project/openpyxl/
Docs: https://openpyxl.readthedocs.io/en/stable/

(Sep-10-2021, 09:18 AM)Larz60+ Wrote: [ -> ]you can open the excel files with pandas,
or use
openpyxl https://pypi.org/project/openpyxl/
Docs: https://openpyxl.readthedocs.io/en/stable/

Thanks! And how do I do the sampling in excel values?

This is well explained in the documents.
Pandas PDF here: https://pandas.pydata.org/docs/pandas.pdf
In pandas look at read_excel Section 2.4.6, Page 333
and to_excel same section

for openpyxl, see: https://openpyxl.readthedocs.io/en/stable/

Thank you!! I studied the links you posted... I also found this: https://pandas.pydata.org/docs/reference...excel.html where it includes:

skiprowslist-like, int, or callable, optional
Line numbers to skip (0-indexed) or number of lines to skip (int) at the start of the file. If callable, the callable function will be evaluated against the row indices, returning True if the row should be skipped and False otherwise. An example of a valid callable argument would be lambda x: x in [0, 2].

So, I have a sense of how to make it work. But the main problem is that I have an excel file with 5000 rows. I need all the values to be inside x,y area but on x-axis to have values (labels) every 5000/10 = 500 rows, meaning:

row 0, row 500, row 1000, row 1500 ... row 5000. How I skip the between (1, 499) rows and the between (501 - 999) rows and .... e.t.c. and put them in skiprows flag? Is there any other way? Do you see what I am trying to implement? Thank you...

df.iloc[::500, :] should read every 500th row.

(Sep-12-2021, 11:16 AM)Larz60+ Wrote: [ -> ]df.iloc[::500, :] should read every 500th row.

Thanks! I put it in skiprows flag I suppose?

df would be the name of your dataframe
so newdf = df.iloc[::500, :]
or simply print(df.iloc[::500, :])
'should' work

Here's an example I ran with an excel file that I had, replace filename with one of your own and try it:

import pandas as pd
from pathlib import Path
import os

def show_incremental_rows(filename, increment):
    df = pd.read_excel(filename)
    print(df.iloc[::increment, :])


def main():
    os.chdir(os.path.abspath(os.path.dirname(__file__)))
    show_incremental_rows('GP2016Cty_637257362220392064.xls', 200)


if __name__ == '__main__':
    main()

Output:          ID Election                          OfficeName  VoteFor BallotGroup  ...  CanAffilCommit      County  Votes   PartyName PartyAbbrev
0          1  GP 2016                100TH REPRESENTATIVE        1           J  ...             NaN     CALHOUN    525  REPUBLICAN         REP
500      501  GP 2016            11TH REPUBLICAN DELEGATE        3           G  ...            BUSH        WILL    409  REPUBLICAN         REP
1000    1001  GP 2016  12TH REPUBLICAN ALTERNATE DELEGATE        3           G  ...            BUSH     JACKSON     99  REPUBLICAN         REP
1500    1501  GP 2016            12TH REPUBLICAN DELEGATE        3           G  ...           TRUMP  WILLIAMSON   3549  REPUBLICAN         REP
2000    2001  GP 2016  13TH REPUBLICAN ALTERNATE DELEGATE        3           G  ...           RUBIO     MADISON    915  REPUBLICAN         REP
2500    2501  GP 2016            14TH DEMOCRATIC DELEGATE        4           G  ...         CLINTON        LAKE   3881  DEMOCRATIC         DEM
3000    3001  GP 2016            15TH DEMOCRATIC DELEGATE        4           G  ...         CLINTON        BOND    377  DEMOCRATIC         DEM
3500    3501  GP 2016  15TH REPUBLICAN ALTERNATE DELEGATE        3           G  ...          CARSON     EDWARDS     73  REPUBLICAN         REP
4000    4001  GP 2016  15TH REPUBLICAN ALTERNATE DELEGATE        3           G  ...            CRUZ    MOULTRIE   1091  REPUBLICAN         REP
4500    4501  GP 2016            15TH REPUBLICAN DELEGATE        3           G  ...            PAUL    RICHLAND     40  REPUBLICAN         REP
5000    5001  GP 2016            16TH DEMOCRATIC DELEGATE        5           G  ...         SANDERS    IROQUOIS    419  DEMOCRATIC         DEM
5500    5501  GP 2016            16TH REPUBLICAN DELEGATE        3           G  ...          CARSON   WINNEBAGO    435  REPUBLICAN         REP
6000    6001  GP 2016            17TH DEMOCRATIC DELEGATE        6           G  ...         CLINTON      MERCER    761  DEMOCRATIC         DEM
6500    6501  GP 2016            17TH REPUBLICAN DELEGATE        3           G  ...          KASICH      PEORIA    782  REPUBLICAN         REP
7000    7001  GP 2016            18TH DEMOCRATIC DELEGATE        4           G  ...         SANDERS      MORGAN    729  DEMOCRATIC         DEM
7500    7501  GP 2016  18TH REPUBLICAN ALTERNATE DELEGATE        3           G  ...         FIORINA     HANCOCK     16  REPUBLICAN         REP
8000    8001  GP 2016            18TH REPUBLICAN DELEGATE        3           G  ...          KASICH      MORGAN    698  REPUBLICAN         REP
8500    8501  GP 2016             2ND REPUBLICAN DELEGATE        3           G  ...         FIORINA        COOK     83  REPUBLICAN         REP
9000    9001  GP 2016                 57TH REPRESENTATIVE        1           J  ...             NaN        LAKE    670  DEMOCRATIC         DEM
9500    9501  GP 2016   6TH REPUBLICAN ALTERNATE DELEGATE        3           G  ...            CRUZ        COOK   3096  REPUBLICAN         REP
10000  10001  GP 2016             8TH REPUBLICAN DELEGATE        3           G  ...          CARSON        KANE     59  REPUBLICAN         REP
10500  10501  GP 2016                           PRESIDENT        1           A  ...             NaN       ADAMS    119  REPUBLICAN         REP
11000  11001  GP 2016                           PRESIDENT        1           A  ...             NaN     CALHOUN      2  REPUBLICAN         REP
11500  11501  GP 2016                           PRESIDENT        1           A  ...             NaN        LAKE  44823  DEMOCRATIC         DEM
12000  12001  GP 2016                           PRESIDENT        1           A  ...             NaN    CRAWFORD      0  REPUBLICAN         REP
12500  12501  GP 2016               UNITED STATES SENATOR        1           A  ...             NaN    HAMILTON     62  DEMOCRATIC         DEM
13000  13001  GP 2016                         COMPTROLLER        1           B  ...             NaN     LaSALLE  11592  REPUBLICAN         REP

Thank you for your code! I also use this source: https://stackoverflow.com/questions/4679...ws-to-plot and I am trying this code with the help of your code above:

import pandas as pd
from pathlib import Path
import os
import numpy as np
import matplotlib.pyplot as plt

 
def show_incremental_rows(filename, increment):
    df = pd.read_excel(filename)
    ser = df.iloc[::increment, :]
    print(ser)
 
    fig, ax = plt.subplots()
    ser.plot(ax=ax)
    ax.set_xlabel("date")
    ax.set_ylabel("temperature")
    plt.show() 
 
def main():
    os.chdir(os.path.abspath(os.path.dirname(__file__)))
    show_incremental_rows('/home/hobbyist/Desktop/temperatures.xlsx', 1000)
 
if __name__ == '__main__':
    main()

Although it prints the excel rows in the terminal, it does not plot them...I believe this command

read_excel

reads the whole excel file, so I do not need to specify something more...
Any ideas what is wrong here?

Pages: 1 2