Newbie needs help

actuary316 · Jan-24-2025, 01:31 PM

Brand new Python user, long time APL user.

In APL I have the ability to read a matrix, designate a pararmeter (or multiple parameters) and there is functionality that will return the values in the columns that match the designated parameter(s). Here is an example:

State type Jan Feb Mar
MA widgets 1000 1200 800
MA wuzzels 555 700 750
RI widgets 200 300 375
RI wuzzels 400 300 275

In APL I have the ability to return all the cols that have MA and the function would return:
widgets 1000 1200 800
wuzzels 555 700 750

or I could say return all the cols that have RI and widgets and the function would return:
200 300 375

How would I accomplish this in Python? I started by creating a matrix in Excel. I have both the pandas and openpyxl libraries open. I've watched many tutorials that show how to read a specific cell, or entire row or entire column from an Excel table but I haven't been able to find anything like I do in APL.

Ideally, I would like to be able to read the table in Excel, find what row(s) match my designated parameters and then return the associated column data and assign it to a varaible (as values).

Any ideas? Or any resources I can use/review? Thanks!

**deanhystad** · (This post was last modified: Jan-24-2025, 03:03 PM by deanhystad.)

You can do what you describe using pandas. In the example below I use StringIO to simulate reading the data from a CSV file.

import pandas as pd
from io import StringIO


file = StringIO(
    """State type Jan Feb Mar
MA widgets 1000 1200 800
MA wuzzels 555 700 750
RI widgets 200 300 375
RI wuzzels 400 300 275"""
)

data = pd.read_csv(file, sep=" ")
print("Data\n", data, sep="\n")

ma_data = data.loc[data.State == "MA"]
print("\nMA Data\n", ma_data, sep="\n")
print("\nMA Data without State\n", ma_data.drop("State", axis=1), sep="\n")
print("\nMA Data without State or type\n", ma_data.drop(["State", "type"], axis=1), sep="\n")

Output:Data

  State     type   Jan   Feb  Mar
0    MA  widgets  1000  1200  800
1    MA  wuzzels   555   700  750
2    RI  widgets   200   300  375
3    RI  wuzzels   400   300  275

MA Data

  State     type   Jan   Feb  Mar
0    MA  widgets  1000  1200  800
1    MA  wuzzels   555   700  750

MA Data without State

      type   Jan   Feb  Mar
0  widgets  1000  1200  800
1  wuzzels   555   700  750

MA Data without State or type

    Jan   Feb  Mar
0  1000  1200  800
1   555   700  750

actuary316 · Jan-24-2025, 03:35 PM

(Jan-24-2025, 03:03 PM)deanhystad Wrote: You can do what you describe using pandas. In the example below I use StringIO to simulate reading the data from a CSV file.

import pandas as pd
from io import StringIO


file = StringIO(
    """State type Jan Feb Mar
MA widgets 1000 1200 800
MA wuzzels 555 700 750
RI widgets 200 300 375
RI wuzzels 400 300 275"""
)

data = pd.read_csv(file, sep=" ")
print("Data\n", data, sep="\n")

ma_data = data.loc[data.State == "MA"]
print("\nMA Data\n", ma_data, sep="\n")
print("\nMA Data without State\n", ma_data.drop("State", axis=1), sep="\n")
print("\nMA Data without State or type\n", ma_data.drop(["State", "type"], axis=1), sep="\n")

Output:Data

  State     type   Jan   Feb  Mar
0    MA  widgets  1000  1200  800
1    MA  wuzzels   555   700  750
2    RI  widgets   200   300  375
3    RI  wuzzels   400   300  275

MA Data

  State     type   Jan   Feb  Mar
0    MA  widgets  1000  1200  800
1    MA  wuzzels   555   700  750

MA Data without State

      type   Jan   Feb  Mar
0  widgets  1000  1200  800
1  wuzzels   555   700  750

MA Data without State or type

    Jan   Feb  Mar
0  1000  1200  800
1   555   700  750

Thank you for the response. Much appreciated.

Ultimate goal is to sum up the sales for the quarter, based on specified parameters.
(ex 1) What are the quarterly sales of RI wuzzels?
(ex 2) What are the number of MA widgets sold in March?

And I dont want to Print the result, I want to assign the result to a variable so I can use it further down my program.

Thanks

**deanhystad** · Jan-24-2025, 05:12 PM

I think pandas will work very well for all you want to do.

Pedroski55 · Jan-24-2025, 05:54 PM

If you want to use Excel in Python, try the module openpyxl

Basically, you are starting with a csv file, so convert that to xl first, then get whatever you want from the openpyxl data.

import openpyxl
import csv

path2file = 'csv/csv_files/widgit.csv'

# open an empty workbook
wb = openpyxl.Workbook()
# designate the only work sheet in the new wb
ws = wb.active

# change the csv to xlsx
with open(path2file) as f:
    # set the delimiter to space for your given data
    reader = csv.reader(f, delimiter=' ')
    for row in reader:
        ws.append(row)

savename = 'openpyxl/xl_files/wigit.xlsx'
wb.save(savename)

sourceFile = openpyxl.load_workbook(savename)
sourceFilesheets = sourceFile.sheetnames
print(sourceFilesheets) # there is 1 sheet called Sheet in every new wb

# access things you want to see
for sheet in sourceFilesheets:
    print('sheet is', sheet)
    maxRow = sourceFile[sheet].max_row
    maxCol = sourceFile[sheet].max_column
    for rowNum in range(2, maxRow + 1):
        print('id number is', sourceFile[sheet].cell(row=rowNum, column=1).value)
        for colNum in range(1, maxCol + 1):
            header = sourceFile[sheet].cell(row=rowNum, column=colNum).value
            print('Column headers are', header)

# (ex 1) What are the quarterly sales of RI wuzzels?
# (ex 2) What are the number of MA widgets sold in March?
for sheet in sourceFilesheets:
    print('sheet is', sheet)
    maxRow = sourceFile[sheet].max_row
    maxCol = sourceFile[sheet].max_column
    # row 1 is headers, start with row 2
    for rowNum in range(2, maxRow + 1):
        state = sourceFile[sheet].cell(row=rowNum, column=1).value
        product = 'wuzzels'
        print(f'State is {state}, product is {product}')
        if state == 'RI' and product == 'wuzzels':
            total_sales = 0
            for colNum in range(3, maxCol + 1):
                total_sales = total_sales + int(sourceFile[sheet].cell(row=rowNum, column=colNum).value)

print(f'Total sales of wuzzels in RI for the first quarter was {total_sales}.')

***snippsat*** · (This post was last modified: Jan-24-2025, 06:08 PM by snippsat.)

As mentioned Pandas will work very well for this case,you most try some stuff yourself.
Example quarterly sales of RI wuzzels

import pandas as pd
from io import StringIO

file = StringIO(
    """State type Jan Feb Mar
MA widgets 1000 1200 800
MA wuzzels 555 700 750
RI widgets 200 300 375
RI wuzzels 400 300 275"""
)

data = pd.read_csv(file, sep=" ")
# Quarterly sales of RI wuzzels
print(data)

ri_wuzzels_data = data.loc[(data['State'] == 'RI') & (data['type'] == 'wuzzels')]
ri_wuzzels_quarterly_sales = ri_wuzzels_data[['Jan', 'Feb', 'Mar']].sum(axis=1).iloc[0]
print(f"\nQuarterly sales of RI wuzzels: {ri_wuzzels_quarterly_sales}"

Output:  State     type   Jan   Feb  Mar
0    MA  widgets  1000  1200  800
1    MA  wuzzels   555   700  750
2    RI  widgets   200   300  375
3    RI  wuzzels   400   300  275

Quarterly sales of RI wuzzels: 975

Quote:And I dont want to Print the result, I want to assign the result to a variable so I can use it further down my program.

Can of course to both,usally want print to see stuff working before move on.
The variable will just have the result,and can be used further.

>>> ri_wuzzels_quarterly_sales
975

Newbie needs help

User Panel Messages

Announcements