Python Forum
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Editing spreadsheet/csv
#1
Hi, sorry if something wrong im new its my first post.
Im learning python but english is not my native language and i dont know how to properly put sentence to search so i thought i might ask a question in this forum.

I have spreadhseet table with thousands of records and looking to edit it using python.
My column consist of " city, state,zip" i manually added column to add "city,state" and "zip" but im looking to make it easier. I would rather tell python to read specific column and separate zips and once zip separated - read the first 3 digits of it and if zip contains XXX number - should be writen "YYYY" (XXX and YYY is just generic for example purpose ) and this need to be done by creating another column after "zip" .

Is this possible ? If yes - i would be appreciated if pythonistas could guide me to the right direction to create this code or an example.
As i have strugle to undserstand - example and how to would be even more appretiated.

P.s. ill be using streamlit, pandas as well and i think few more extras so if things are combined together - im more than hhappy to use them.
Also im using Python version 3.11

Thank You and have a Great Day !
Reply
#2
You should post a sample spreadsheet table (19 rows or less) and post a mock-up of what you want to produce. What format is your "spreadsheet table". Is this an excel spreadsheet?

This sounds like a good fit for Pandas. You can load in your entire table into a pandas DataFrame, then you can extract rows from the dataframe based on your conditions.
Reply
#3
(Aug-31-2023, 05:13 PM)deanhystad Wrote: You should post a sample spreadsheet table (19 rows or less) and post a mock-up of what you want to produce. What format is your "spreadsheet table". Is this an excel spreadsheet?

This sounds like a good fit for Pandas. You can load in your entire table into a pandas DataFrame, then you can extract rows from the dataframe based on your conditions.

Yes, excel spreadsheet.
I dont know how to post/show sample of spreadsheet.
Left of the red color - BEFORE
RIght after black - AFTER
Example

I will create some sort of script where i will note if 780 - Laredo, if 333 - Atlanta , if 498 - Green Bay, etc. But most important is to divide column into few other and write to it.
I hope its a bit clearer.
Thank You.
Reply
#4
What do you see if you run this:
import pandas as pd

df = pd.read_excel("spreadshet_filename.xlsx").head(10)  # Change to name of your spreadsheet file.
print(df)
Post results if they don't contain anything sensitive.
Reply
#5
(Aug-31-2023, 06:48 PM)deanhystad Wrote: What do you see if you run this:
import pandas as pd

df = pd.read_excel("spreadshet_filename.xlsx").head(10)  # Change to name of your spreadsheet file.
print(df)
Post results if they don't contain anything sensitive.

I have not run anythiong yet , im trying to understand how to write this code and where can i find info about it as at the moment im maping zip codes inside another excel sheet so i can know what zip belongs to which Area.
But i will start writing the code tomorrow but before i start googling in the morning i woanted to get some links today to read and start experimenting tomorrow.
Multitasking Angel
Reply
#6
Starting with an excel spreadsheet that looks like this:
Output:
Location Laredo, TX 78045 Atlanta, GA 33333 Chicago, IL 60000 Escanaba, MI 49888 Pendergrass, GA 33345
I read the spreadsheet into a DataFrame and split the location column into City and State_ZIP columns using ", " as the delimiter. After that I split the State_ZIP column into state and zip columns using " " as the delimiter.
import pandas as pd

df = pd.read_excel("test.xlsx", header=0)
df[["City", "State Zip"]] = df["Location"].str.split(", ", expand=True)
df[["State", "Zip"]] = df["State Zip"].str.split(" ", expand=True)
print(df)
Output:
Location City State Zip State Zip 0 Laredo, TX 78045 Laredo TX 78045 TX 78045 1 Atlanta, GA 33333 Atlanta GA 33333 GA 33333 2 Chicago, IL 60000 Chicago IL 60000 IL 60000 3 Escanaba, MI 49888 Escanaba MI 49888 MI 49888 4 Pendergrass, GA 33345 Pendergrass GA 33345 GA 33345
You can generate the Area column values in python.
import pandas as pd

df = pd.read_excel("test.xlsx", header=0)
df[["City", "State Zip"]] = df["Location"].str.split(", ", expand=True)
df[["State", "Zip"]] = df["State Zip"].str.split(" ", expand=True)

area_zips = {"333": "Atlanta"}
cities = df["City"].values
codes = df["Zip"].values
df["Area"] = [area_zips.get(code[:3], city) for code, city in zip(codes, cities)]
print(df)
Output:
Location City State Zip State Zip Area 0 Laredo, TX 78045 Laredo TX 78045 TX 78045 Laredo 1 Atlanta, GA 33333 Atlanta GA 33333 GA 33333 Atlanta 2 Chicago, IL 60000 Chicago IL 60000 IL 60000 Chicago 3 Escanaba, MI 49888 Escanaba MI 49888 MI 49888 Escanaba 4 Pendergrass, GA 33345 Pendergrass GA 33345 GA 33345 Atlanta
Another way is to use a map function. If you have a dictionary of Area zip codes with full coverage, you can use the dictionary.
import pandas as pd

df = pd.read_excel("test.xlsx", header=0)
df[["City", "State Zip"]] = df["Location"].str.split(", ", expand=True)
df[["State", "Zip"]] = df["State Zip"].str.split(" ", expand=True)

area_zips = {"333": "Atlanta", "600": "Chicago", "498": "Marquette", "780": "Laredo"}
df["zip_prefix"] = df["Zip"].str[:3]
df["Area"] = df["zip_prefix"].map(area_zips)
print(df)
Output:
Location City State Zip State Zip zip_prefix Area 0 Laredo, TX 78045 Laredo TX 78045 TX 78045 780 Laredo 1 Atlanta, GA 33333 Atlanta GA 33333 GA 33333 333 Atlanta 2 Chicago, IL 60000 Chicago IL 60000 IL 60000 600 Chicago 3 Escanaba, MI 49888 Escanaba MI 49888 MI 49888 498 Marquette 4 Pendergrass, GA 33345 Pendergrass GA 33345 GA 33345 333 Atlanta
map can also call a function.
import pandas as pd


def area_zips(zipcode):
    zips = {"333": "Atlanta", "600": "Chicago", "498": "Marquette", "780": "Laredo"}
    return zips[zipcode[:3]]


df = pd.read_excel("test.xlsx", header=0)
df[["City", "State Zip"]] = df["Location"].str.split(", ", expand=True)
df[["State", "Zip"]] = df["State Zip"].str.split(" ", expand=True)
df["Area"] = df["Zip"].map(area_zips)
df = df[["City", "State", "Zip", "Area"]]
print(df)
None of these methods (using python on values, map dictionary, map function) are particularly fast, but should be fine as long as you aren't processing hundreds of millions of locations.

When you are all done you either drop the unwanted rows or extract the rows you want to keep.
df = df[["City", "State", "Zip", "Area"]]
Output:
City State Zip Area 0 Laredo TX 78045 Laredo 1 Atlanta GA 33333 Atlanta 2 Chicago IL 60000 Chicago 3 Escanaba MI 49888 Marquette 4 Pendergrass GA 33345 Atlanta
BSDevo likes this post
Reply
#7
This is awesome !
Thank You very much.
I dont care if its slow as long as it works for now and no, there wont be hundreds of millions of location but good to know that will be to slow for such a huge DB.
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Returning Column and Row Data From Spreadsheet knight2000 0 451 Oct-22-2023, 07:07 AM
Last Post: knight2000
  Looking to automate updating a spreadsheet with image from email cubangt 2 983 Feb-14-2023, 03:43 PM
Last Post: cubangt
  Import XML file directly into Excel spreadsheet demdej 0 860 Jan-24-2023, 02:48 PM
Last Post: demdej
  updating Google spreadsheet with gspread mgallotti 0 1,105 Sep-30-2022, 11:26 PM
Last Post: mgallotti
  Modify LibreOffice's ods spreadsheet Pavel_47 0 1,190 Jul-13-2022, 11:28 AM
Last Post: Pavel_47
  Python create a spreadsheet with column and row header ouruslife 4 1,648 Jul-09-2022, 11:01 AM
Last Post: Pedroski55
  Searching for URLs and printing the corresponding row from an Excel spreadsheet johnbernard 0 1,285 Aug-20-2021, 06:43 PM
Last Post: johnbernard
  "Switch-to-spreadsheet" entry. Feasible in Python? whatspython 2 2,029 Sep-30-2020, 01:12 PM
Last Post: buran
  PYTHON - GOOGLE... create new spreadsheet? danclark81 3 2,708 Feb-02-2020, 08:57 PM
Last Post: danclark81
  Skype BOT project with backend as MS Excel spreadsheet uunniixx 0 1,797 Aug-16-2018, 01:43 PM
Last Post: uunniixx

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020