Python Forum
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Newbie needs help
#1
Brand new Python user, long time APL user.

In APL I have the ability to read a matrix, designate a pararmeter (or multiple parameters) and there is functionality that will return the values in the columns that match the designated parameter(s). Here is an example:

State type Jan Feb Mar
MA widgets 1000 1200 800
MA wuzzels 555 700 750
RI widgets 200 300 375
RI wuzzels 400 300 275

In APL I have the ability to return all the cols that have MA and the function would return:
widgets 1000 1200 800
wuzzels 555 700 750

or I could say return all the cols that have RI and widgets and the function would return:
200 300 375

How would I accomplish this in Python? I started by creating a matrix in Excel. I have both the pandas and openpyxl libraries open. I've watched many tutorials that show how to read a specific cell, or entire row or entire column from an Excel table but I haven't been able to find anything like I do in APL.

Ideally, I would like to be able to read the table in Excel, find what row(s) match my designated parameters and then return the associated column data and assign it to a varaible (as values).

Any ideas? Or any resources I can use/review? Thanks!
Reply
#2
You can do what you describe using pandas. In the example below I use StringIO to simulate reading the data from a CSV file.
import pandas as pd
from io import StringIO


file = StringIO(
    """State type Jan Feb Mar
MA widgets 1000 1200 800
MA wuzzels 555 700 750
RI widgets 200 300 375
RI wuzzels 400 300 275"""
)

data = pd.read_csv(file, sep=" ")
print("Data\n", data, sep="\n")

ma_data = data.loc[data.State == "MA"]
print("\nMA Data\n", ma_data, sep="\n")
print("\nMA Data without State\n", ma_data.drop("State", axis=1), sep="\n")
print("\nMA Data without State or type\n", ma_data.drop(["State", "type"], axis=1), sep="\n")
Output:
Data State type Jan Feb Mar 0 MA widgets 1000 1200 800 1 MA wuzzels 555 700 750 2 RI widgets 200 300 375 3 RI wuzzels 400 300 275 MA Data State type Jan Feb Mar 0 MA widgets 1000 1200 800 1 MA wuzzels 555 700 750 MA Data without State type Jan Feb Mar 0 widgets 1000 1200 800 1 wuzzels 555 700 750 MA Data without State or type Jan Feb Mar 0 1000 1200 800 1 555 700 750
Reply
#3
(Jan-24-2025, 03:03 PM)deanhystad Wrote: You can do what you describe using pandas. In the example below I use StringIO to simulate reading the data from a CSV file.
import pandas as pd
from io import StringIO


file = StringIO(
    """State type Jan Feb Mar
MA widgets 1000 1200 800
MA wuzzels 555 700 750
RI widgets 200 300 375
RI wuzzels 400 300 275"""
)

data = pd.read_csv(file, sep=" ")
print("Data\n", data, sep="\n")

ma_data = data.loc[data.State == "MA"]
print("\nMA Data\n", ma_data, sep="\n")
print("\nMA Data without State\n", ma_data.drop("State", axis=1), sep="\n")
print("\nMA Data without State or type\n", ma_data.drop(["State", "type"], axis=1), sep="\n")
Output:
Data State type Jan Feb Mar 0 MA widgets 1000 1200 800 1 MA wuzzels 555 700 750 2 RI widgets 200 300 375 3 RI wuzzels 400 300 275 MA Data State type Jan Feb Mar 0 MA widgets 1000 1200 800 1 MA wuzzels 555 700 750 MA Data without State type Jan Feb Mar 0 widgets 1000 1200 800 1 wuzzels 555 700 750 MA Data without State or type Jan Feb Mar 0 1000 1200 800 1 555 700 750

Thank you for the response. Much appreciated.

Ultimate goal is to sum up the sales for the quarter, based on specified parameters.
(ex 1) What are the quarterly sales of RI wuzzels?
(ex 2) What are the number of MA widgets sold in March?

And I dont want to Print the result, I want to assign the result to a variable so I can use it further down my program.

Thanks
Reply
#4
I think pandas will work very well for all you want to do.
Gribouillis likes this post
Reply
#5
If you want to use Excel in Python, try the module openpyxl

Basically, you are starting with a csv file, so convert that to xl first, then get whatever you want from the openpyxl data.

import openpyxl
import csv

path2file = 'csv/csv_files/widgit.csv'

# open an empty workbook
wb = openpyxl.Workbook()
# designate the only work sheet in the new wb
ws = wb.active

# change the csv to xlsx
with open(path2file) as f:
    # set the delimiter to space for your given data
    reader = csv.reader(f, delimiter=' ')
    for row in reader:
        ws.append(row)

savename = 'openpyxl/xl_files/wigit.xlsx'
wb.save(savename)

sourceFile = openpyxl.load_workbook(savename)
sourceFilesheets = sourceFile.sheetnames
print(sourceFilesheets) # there is 1 sheet called Sheet in every new wb

# access things you want to see
for sheet in sourceFilesheets:
    print('sheet is', sheet)
    maxRow = sourceFile[sheet].max_row
    maxCol = sourceFile[sheet].max_column
    for rowNum in range(2, maxRow + 1):
        print('id number is', sourceFile[sheet].cell(row=rowNum, column=1).value)
        for colNum in range(1, maxCol + 1):
            header = sourceFile[sheet].cell(row=rowNum, column=colNum).value
            print('Column headers are', header)

# (ex 1) What are the quarterly sales of RI wuzzels?
# (ex 2) What are the number of MA widgets sold in March?
for sheet in sourceFilesheets:
    print('sheet is', sheet)
    maxRow = sourceFile[sheet].max_row
    maxCol = sourceFile[sheet].max_column
    # row 1 is headers, start with row 2
    for rowNum in range(2, maxRow + 1):
        state = sourceFile[sheet].cell(row=rowNum, column=1).value
        product = 'wuzzels'
        print(f'State is {state}, product is {product}')
        if state == 'RI' and product == 'wuzzels':
            total_sales = 0
            for colNum in range(3, maxCol + 1):
                total_sales = total_sales + int(sourceFile[sheet].cell(row=rowNum, column=colNum).value)

print(f'Total sales of wuzzels in RI for the first quarter was {total_sales}.')
Reply
#6
As mentioned Pandas will work very well for this case,you most try some stuff yourself.
Example quarterly sales of RI wuzzels
import pandas as pd
from io import StringIO

file = StringIO(
    """State type Jan Feb Mar
MA widgets 1000 1200 800
MA wuzzels 555 700 750
RI widgets 200 300 375
RI wuzzels 400 300 275"""
)

data = pd.read_csv(file, sep=" ")
# Quarterly sales of RI wuzzels
print(data)

ri_wuzzels_data = data.loc[(data['State'] == 'RI') & (data['type'] == 'wuzzels')]
ri_wuzzels_quarterly_sales = ri_wuzzels_data[['Jan', 'Feb', 'Mar']].sum(axis=1).iloc[0]
print(f"\nQuarterly sales of RI wuzzels: {ri_wuzzels_quarterly_sales}"
Output:
State type Jan Feb Mar 0 MA widgets 1000 1200 800 1 MA wuzzels 555 700 750 2 RI widgets 200 300 375 3 RI wuzzels 400 300 275 Quarterly sales of RI wuzzels: 975
Quote:And I dont want to Print the result, I want to assign the result to a variable so I can use it further down my program.
Can of course to both,usally want print to see stuff working before move on.
The variable will just have the result,and can be used further.
>>> ri_wuzzels_quarterly_sales
975 
Reply


Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020