Python Forum
Sample labels from excel file in order to put them on x-axis and y-axis of a plot
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Sample labels from excel file in order to put them on x-axis and y-axis of a plot
#1
Hello, I have an excel file with two columns, the 1st includes dates/datetimes and the 2nd includes temperatures. The number of rows are 100. I want to plot the values, but both on the x-axis and y-axis I want to display sampled values as labels from the excel file and not all of them. Obviously I need to include all the values inside the plot, however on x-axis and y-axis display sampled values as labels from the excel file. For instance display the labels of the 1st excel column on x-axis with step 10 (every 10 labels) and display the labels of the 2nd excel column on y-axis with step 10 (every 10 labels) . Any ideas how to implement that?
Reply
#2
you can open the excel files with pandas,
or use
openpyxl https://pypi.org/project/openpyxl/
Docs: https://openpyxl.readthedocs.io/en/stable/
hobbyist likes this post
Reply
#3
(Sep-10-2021, 09:18 AM)Larz60+ Wrote: you can open the excel files with pandas,
or use
openpyxl https://pypi.org/project/openpyxl/
Docs: https://openpyxl.readthedocs.io/en/stable/

Thanks! And how do I do the sampling in excel values?
Reply
#4
This is well explained in the documents.
Pandas PDF here: https://pandas.pydata.org/docs/pandas.pdf
In pandas look at read_excel Section 2.4.6, Page 333
and to_excel same section

for openpyxl, see: https://openpyxl.readthedocs.io/en/stable/
hobbyist likes this post
Reply
#5
Thank you!! I studied the links you posted... I also found this: https://pandas.pydata.org/docs/reference...excel.html where it includes:

skiprowslist-like, int, or callable, optional
Line numbers to skip (0-indexed) or number of lines to skip (int) at the start of the file. If callable, the callable function will be evaluated against the row indices, returning True if the row should be skipped and False otherwise. An example of a valid callable argument would be lambda x: x in [0, 2].


So, I have a sense of how to make it work. But the main problem is that I have an excel file with 5000 rows. I need all the values to be inside x,y area but on x-axis to have values (labels) every 5000/10 = 500 rows, meaning:

row 0, row 500, row 1000, row 1500 ... row 5000. How I skip the between (1, 499) rows and the between (501 - 999) rows and .... e.t.c. and put them in skiprows flag? Is there any other way? Do you see what I am trying to implement? Thank you...
Reply
#6
df.iloc[::500, :] should read every 500th row.
hobbyist likes this post
Reply
#7
(Sep-12-2021, 11:16 AM)Larz60+ Wrote: df.iloc[::500, :] should read every 500th row.

Thanks! I put it in skiprows flag I suppose?
Reply
#8
df would be the name of your dataframe
so newdf = df.iloc[::500, :]
or simply print(df.iloc[::500, :])
'should' work
hobbyist likes this post
Reply
#9
Here's an example I ran with an excel file that I had, replace filename with one of your own and try it:
import pandas as pd
from pathlib import Path
import os

def show_incremental_rows(filename, increment):
    df = pd.read_excel(filename)
    print(df.iloc[::increment, :])


def main():
    os.chdir(os.path.abspath(os.path.dirname(__file__)))
    show_incremental_rows('GP2016Cty_637257362220392064.xls', 200)


if __name__ == '__main__':
    main()
Output:
ID Election OfficeName VoteFor BallotGroup ... CanAffilCommit County Votes PartyName PartyAbbrev 0 1 GP 2016 100TH REPRESENTATIVE 1 J ... NaN CALHOUN 525 REPUBLICAN REP 500 501 GP 2016 11TH REPUBLICAN DELEGATE 3 G ... BUSH WILL 409 REPUBLICAN REP 1000 1001 GP 2016 12TH REPUBLICAN ALTERNATE DELEGATE 3 G ... BUSH JACKSON 99 REPUBLICAN REP 1500 1501 GP 2016 12TH REPUBLICAN DELEGATE 3 G ... TRUMP WILLIAMSON 3549 REPUBLICAN REP 2000 2001 GP 2016 13TH REPUBLICAN ALTERNATE DELEGATE 3 G ... RUBIO MADISON 915 REPUBLICAN REP 2500 2501 GP 2016 14TH DEMOCRATIC DELEGATE 4 G ... CLINTON LAKE 3881 DEMOCRATIC DEM 3000 3001 GP 2016 15TH DEMOCRATIC DELEGATE 4 G ... CLINTON BOND 377 DEMOCRATIC DEM 3500 3501 GP 2016 15TH REPUBLICAN ALTERNATE DELEGATE 3 G ... CARSON EDWARDS 73 REPUBLICAN REP 4000 4001 GP 2016 15TH REPUBLICAN ALTERNATE DELEGATE 3 G ... CRUZ MOULTRIE 1091 REPUBLICAN REP 4500 4501 GP 2016 15TH REPUBLICAN DELEGATE 3 G ... PAUL RICHLAND 40 REPUBLICAN REP 5000 5001 GP 2016 16TH DEMOCRATIC DELEGATE 5 G ... SANDERS IROQUOIS 419 DEMOCRATIC DEM 5500 5501 GP 2016 16TH REPUBLICAN DELEGATE 3 G ... CARSON WINNEBAGO 435 REPUBLICAN REP 6000 6001 GP 2016 17TH DEMOCRATIC DELEGATE 6 G ... CLINTON MERCER 761 DEMOCRATIC DEM 6500 6501 GP 2016 17TH REPUBLICAN DELEGATE 3 G ... KASICH PEORIA 782 REPUBLICAN REP 7000 7001 GP 2016 18TH DEMOCRATIC DELEGATE 4 G ... SANDERS MORGAN 729 DEMOCRATIC DEM 7500 7501 GP 2016 18TH REPUBLICAN ALTERNATE DELEGATE 3 G ... FIORINA HANCOCK 16 REPUBLICAN REP 8000 8001 GP 2016 18TH REPUBLICAN DELEGATE 3 G ... KASICH MORGAN 698 REPUBLICAN REP 8500 8501 GP 2016 2ND REPUBLICAN DELEGATE 3 G ... FIORINA COOK 83 REPUBLICAN REP 9000 9001 GP 2016 57TH REPRESENTATIVE 1 J ... NaN LAKE 670 DEMOCRATIC DEM 9500 9501 GP 2016 6TH REPUBLICAN ALTERNATE DELEGATE 3 G ... CRUZ COOK 3096 REPUBLICAN REP 10000 10001 GP 2016 8TH REPUBLICAN DELEGATE 3 G ... CARSON KANE 59 REPUBLICAN REP 10500 10501 GP 2016 PRESIDENT 1 A ... NaN ADAMS 119 REPUBLICAN REP 11000 11001 GP 2016 PRESIDENT 1 A ... NaN CALHOUN 2 REPUBLICAN REP 11500 11501 GP 2016 PRESIDENT 1 A ... NaN LAKE 44823 DEMOCRATIC DEM 12000 12001 GP 2016 PRESIDENT 1 A ... NaN CRAWFORD 0 REPUBLICAN REP 12500 12501 GP 2016 UNITED STATES SENATOR 1 A ... NaN HAMILTON 62 DEMOCRATIC DEM 13000 13001 GP 2016 COMPTROLLER 1 B ... NaN LaSALLE 11592 REPUBLICAN REP
hobbyist likes this post
Reply
#10
Thank you for your code! I also use this source: https://stackoverflow.com/questions/4679...ws-to-plot and I am trying this code with the help of your code above:

import pandas as pd
from pathlib import Path
import os
import numpy as np
import matplotlib.pyplot as plt

 
def show_incremental_rows(filename, increment):
    df = pd.read_excel(filename)
    ser = df.iloc[::increment, :]
    print(ser)
 
    fig, ax = plt.subplots()
    ser.plot(ax=ax)
    ax.set_xlabel("date")
    ax.set_ylabel("temperature")
    plt.show() 
 
def main():
    os.chdir(os.path.abspath(os.path.dirname(__file__)))
    show_incremental_rows('/home/hobbyist/Desktop/temperatures.xlsx', 1000)
 
if __name__ == '__main__':
    main()
Although it prints the excel rows in the terminal, it does not plot them...I believe this command
read_excel
reads the whole excel file, so I do not need to specify something more...
Any ideas what is wrong here?
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Python openyxl not updating Excel file MrBean12 1 252 Mar-03-2024, 12:16 AM
Last Post: MrBean12
  Copy Paste excel files based on the first letters of the file name Viento 2 349 Feb-07-2024, 12:24 PM
Last Post: Viento
  Search Excel File with a list of values huzzug 4 1,148 Nov-03-2023, 05:35 PM
Last Post: huzzug
  Updating sharepoint excel file odd results cubangt 1 756 Nov-03-2023, 05:13 PM
Last Post: noisefloor
  IndexError: index 10 is out of bounds for axis 0 with size 10 Mehboob 11 1,956 Sep-14-2023, 06:54 AM
Last Post: Mehboob
  matplotlib x-axis text move bottom upward jacklee26 3 930 May-31-2023, 04:28 AM
Last Post: jacklee26
  Print names in x-axis of a time-series values hobbyist 4 1,178 Apr-22-2023, 09:29 PM
Last Post: deanhystad
  Reading data from excel file –> process it >>then write to another excel output file Jennifer_Jone 0 1,048 Mar-14-2023, 07:59 PM
Last Post: Jennifer_Jone
  Save and Close Excel File avd88 0 2,845 Feb-20-2023, 07:19 PM
Last Post: avd88
  Trying to access excel file on our sharepoint server but getting errors cubangt 0 773 Feb-16-2023, 08:11 PM
Last Post: cubangt

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020