Python Forum
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Working with excel files
#1
Hi,

I have one excel file with 2 sheets. There are 6 columns, year, month, day, T2M_MAX, T2M_MIN e PRECTOT.

I need to select in year 2015 the highest value for T2M_MAX and save the information in the second worksheet. Same thing for T2M_MIN, but it's the minimum value that must be saved in spreadsheet 2. I need to group these two pieces of information by year and month. How to do this in python?


see the attached example...

Thank´s
.xlsx   sample.xlsx (Size: 11.79 KB / Downloads: 212)
Reply
#2
using the excel file I have attached, you can read the data with pandas

This example shows opening xlsx file with two sheets
Using your file, You can use same method to read your data, select what you want and rewrite (see references at bottom):
import pandas as pd
from pathlib import Path
import os

# Set starting path
os.chdir(os.path.abspath(os.path.dirname(__file__)))
homepath = Path('.')
excel_file = homepath / '../data/excel/Data.xlsx'

exfile = pd.ExcelFile(excel_file)

df1 = pd.read_excel(exfile, 'ZipcodeData')
df2 = pd.read_excel(exfile, 'Geography')

print(f"\n{df1}\n")
print(f"\n{df2}\n")
Output:
CountyName CountySeat FipsCd LocalCountyId ZipCode 0 Boone Belvidere 7 4 61008 1 Boone Belvidere 7 4 61011 2 Boone Belvidere 7 4 61011 3 Boone Belvidere 7 4 61012 4 Carroll Mount Carroll 15 8 61014 5 Boone Belvidere 7 4 61038 6 Boone Belvidere 7 4 61038 7 Boone Belvidere 7 4 61038 8 Carroll Mount Carroll 15 8 61046 9 Carroll Mount Carroll 15 8 61046 10 Carroll Mount Carroll 15 8 61051 11 Carroll Mount Carroll 15 8 61053 12 Carroll Mount Carroll 15 8 61053 13 Boone Belvidere 7 4 61065 14 Boone Belvidere 7 4 61065 15 Boone Belvidere 7 4 61065 16 Carroll Mount Carroll 15 8 61074 17 Carroll Mount Carroll 15 8 61074 Zip City State Latitude Longitude Timezone Daylight savings time flag Latitude.1 Longitude.1 0 61008 Belvidere IL 42.257229 -88.84755 -6 1 42.257229 -88.84755 1 61011 Caledonia IL 42.376423 -88.93250 -6 1 42.376423 -88.93250 2 61012 Capron IL 42.401104 -88.74726 -6 1 42.401104 -88.74726 3 61014 Chadwick IL 41.970680 -89.87508 -6 1 41.970680 -89.87508 4 61038 Garden Prairie IL 42.255617 -88.74344 -6 1 42.255617 -88.74344 5 61046 Lanark IL 42.111421 -89.82436 -6 1 42.111421 -89.82436 6 61051 Milledgeville IL 41.965904 -89.75896 -6 1 41.965904 -89.75896 7 61053 Mount Carroll IL 42.100900 -89.98255 -6 1 42.100900 -89.98255 8 61065 Poplar Grove IL 42.364394 -88.84665 -6 1 42.364394 -88.84665 9 61074 Savanna IL 42.099635 -90.12858 -6 1 42.099635 -90.12858

.xlsx   Data.xlsx (Size: 7.08 KB / Downloads: 95)

References:
read pandas data from excel: https://pandas.pydata.org/docs/reference...excel.html
write padnas data to excel: https://pandas.pydata.org/docs/reference...excel.html
arsouzaesilva likes this post
Reply
#3
I had to do something like this once.

I just adapted an old file I had to get your data, using openpyxl.

Just change the paths and paste myApp() in your shell to try it.

I deleted the existing data in your sheet 2 first, and put the same column headers as sheet 1.

If you really don't want the column DY, you can delete it with: targetFile[target].delete_cols(3)

def myApp():
    import openpyxl
    from openpyxl.styles import PatternFill

    # set the fill colour to highlight the maximum and minimum
    yellow = "00FFFF00"

    print('Opening the file /home/pedro/Downloads/sample.xlsx ....')
    pathToExcel = '/home/pedro/Downloads/sample.xlsx'

    print('Opening the target file ....')

    targetFile = openpyxl.load_workbook(pathToExcel)
    print('Target opened ...')

    # print the target file sheet names

    targetFileSheetNames = targetFile.sheetnames
    for sheet in targetFileSheetNames:
        print('The target file has the following worksheets', sheet)

    source = targetFileSheetNames[0]
    target = targetFileSheetNames[1]

    max_row_source = targetFile[source].max_row + 1

    # reset these values if you repeat the following loop
    T2M_MAX = targetFile[source].cell(row=2, column=4).value
    T2M_MIN = targetFile[source].cell(row=2, column=5).value

    # find the maximum and minimum values
    # also save the corresponding row numbers of the max and min values
    # if there can be 2 rows with the maximum value, you can get that later
    # search the rows by value

    for rowNum in range(2, max_row_source):
        if targetFile[source].cell(row=rowNum, column=1).value == 2015:
            print('found the correct year ... ')        
            maxValue = targetFile[source].cell(row=rowNum, column=4).value
            if maxValue > T2M_MAX:
                T2M_MAX = maxValue
                maxRownr = rowNum
                print('maxRownr is:', maxRownr)
            minValue = targetFile[source].cell(row=rowNum, column=5).value        
            if minValue < T2M_MIN:
                T2M_MIN = minValue
                minRownr = rowNum
                print('minRownr is:', minRownr)

    print('The maximum value in column T2M_MAX is:', T2M_MAX)
    print('The minimum value in column T2M_MIN is:', T2M_MIN)

    # now write these values in the target sheet
    # first get the maximum column number
    # add 1 or you won't get the last column
    max_col_source = targetFile[source].max_column + 1

    # get the maximum row in the target sheet
    # I deleted all rows in the target sheet, except the headers
    # so the row to write to is row 2
    max_row_target = targetFile[target].max_row + 1

    # get the whole row with the maximum value
    # write it to the target sheet
    print('Writing the maximum values to the target sheet ... ')
    for col in range(1, max_col_source):    
        getValue = targetFile[source].cell(row=maxRownr, column=col).value
        targetFile[target].cell(row=max_row_target, column=col, value=getValue)
        if col == 4:
            targetFile[target].cell(row= max_row_target, column=col).fill = PatternFill(start_color=yellow, end_color=yellow, fill_type = "solid")
        targetFile[target].cell(row=max_row_target, column=col+1, value='This is the maximum value row')

    # get the whole row with the minimum value
    
    # add 1 or you won't get the next empty row
    max_row_target = targetFile[target].max_row + 1

    # get the whole row with the minimum value
    print('Writing the minimum values to the target sheet ... ')
    for col in range(1, max_col_source):
        getValue = targetFile[source].cell(row=minRownr, column=col).value
        targetFile[target].cell(row=max_row_target, column=col, value=getValue)
        if col == 5:
            targetFile[target].cell(row=max_row_target, column=col).fill = PatternFill(start_color=yellow, end_color=yellow, fill_type = "solid")
        targetFile[target].cell(row=max_row_target, column=col+1, value='This is the minimum value row')
    
    # if for some reason you don't want the column DY in the target sheet
    # you can delete it with this
    # targetFile[target].delete_cols(3)
    
    print('Saving the results to:', pathToExcel)
    targetFile.save(pathToExcel)
    print('All done!')
arsouzaesilva likes this post
Reply
#4
pandas is a better solution, but openpyxl will work
arsouzaesilva likes this post
Reply
#5
(Aug-25-2021, 03:12 AM)Larz60+ Wrote: pandas is a better solution, but openpyxl will work

Tell me if I am wrong, but I always think pandas is for scientists, big-data, enormous amounts of input/output.

openpyxl works fine for my little needs.
arsouzaesilva likes this post
Reply
#6
pandas is for scientists? Not just scientists, There's even a book titles 'Pandas for Everyone'.

It's easier to use that openpyxl, code I show uses 3 lines of code to open excel file, and read contents of two sheets into dataframes.

That said, I did say ' but openpyxl will work' in my last post.
arsouzaesilva likes this post
Reply
#7
Thank´s all for your help
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Copy Paste excel files based on the first letters of the file name Viento 2 416 Feb-07-2024, 12:24 PM
Last Post: Viento
  Working with Excel and Word, Several Questions Regarding Find and Replace Brandon_Pickert 4 1,538 Feb-11-2023, 03:59 PM
Last Post: Brandon_Pickert
  How to loop through all excel files and sheets in folder jadelola 1 4,447 Dec-01-2022, 06:12 PM
Last Post: deanhystad
  Creating csv files from Excel file azizrasul 40 5,560 Nov-03-2022, 08:33 PM
Last Post: azizrasul
  win32com — How to resolve “AttributeError: xlUp” for Excel files? JaneTan 2 4,207 Aug-18-2021, 05:27 AM
Last Post: snippsat
Question Working with existing files Gilush 7 3,007 Feb-10-2021, 08:55 PM
Last Post: Gilush
  filecmp is not working for UTF-8 BOM encoded files sureshnagarajan 3 2,596 Feb-10-2021, 11:17 AM
Last Post: sureshnagarajan
  MP3 Tags to Excel Not working giddyhead 5 2,880 Jan-27-2021, 03:23 AM
Last Post: giddyhead
  code to read files in folders and transfer the file name, type, date created to excel Divya577 0 1,851 Dec-06-2020, 04:14 PM
Last Post: Divya577
  Creating Excel files compatible with microsoft access vkallavi 0 1,581 Sep-17-2020, 06:57 PM
Last Post: vkallavi

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020