Python Forum
Logic suggestions for comparing 2 csv's
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Logic suggestions for comparing 2 csv's
#1
So im not looking for functional code, i want to keep learning and just need suggestions or pointers on what to look at in order to accomplish the following:

I have a CSV file with (IP, Store, Register, Status) this is provided from one system. Then in a totally different system we have support tickets, which we can get an export out of in CSV as well. It will contain (Store, Problem Description, Problem #, Date Opened)

The common link between the 2 csv files is the store #, so i would like to be able to write some logic to consume both files, and "search/return" all records from csv file #2 that exist in csv #1 by store and have a new dataframe returned with just those results.

Simple Example
CSV #1 (Store 5 was reported as "Down" when the ping script ran)
111.22.33.4 , 4, 200, 0
111.22.34.5 , 5, 201, 1
111.22.35.6 , 6, 202, 0
111.22.35.7 , 6, 203, 0


CSV #2 (Service report shows that there is currently a ticket/problem opened for Store 6)
6,"Register 202 down", 29743, 11/7/2023
6,"Register 203 down", 29745, 11/7/2023

Return a dataframe that returns a total count of records found by store

Expected Dataframe Data: (Store, Ticket Count, [Ticket #])
If possible in the dataframe, if not in whatever can return the data in this structure

4,0,0
5,0,0
6,2,[29743,29745]

If this is possible, what should i look at using? Pandas Dataframes, Series, Numpy, etc....)
Looking for suggestions on being able to accomplish this efficiently
Reply
#2
This is maybe a little rough, and I don't fully grasp some of the details in your post, but it could be a starting point for you:

import csv

PING_RUN = {}
TICKETS = {}


# KEYS
IP_ADDRESS = 0
STORE = 1
REGISTER = 2
STATUS = 3
with open("ping.csv", mode="r", encoding="UTF-8") as ping:
    READER = csv.reader(ping)
    for row in READER:
        PING_RUN[row[STORE]] = [row[IP_ADDRESS], row[REGISTER], row[STATUS]]


# KEYS
STORE = 0
PROBLEM = 1
TICKET = 2
DATE = 3
with open("ser.rpt", mode="r", encoding="UTF-8") as ser_rpt:
    READER = csv.reader(ser_rpt)
    for row in READER:
        TICKETS[row[TICKET]] = {row[STORE]: [row[PROBLEM], row[DATE]]}


for ticket in TICKETS:
    details = TICKETS[ticket]
    # store = list(details.keys())[STORE]
    store = [*details][STORE]  # this is a nice way
    if store in PING_RUN:
        print(f"Ticket: {ticket}")
        print(f"Store {store}")
        print(f"{details[store]}")
        print()
Output:
Ticket: 29743 Store 6 ['Register 202 down', '11/7/2023'] Ticket: 29745 Store 6 ['Register 203 down', '11/7/2023']
Sig:
>>> import this

The UNIX philosophy: "Do one thing, and do it well."

"The danger of computers becoming like humans is not as great as the danger of humans becoming like computers." :~ Konrad Zuse

"Everything should be made as simple as possible, but not simpler." :~ Albert Einstein
Reply
#3
Why do you want to make a file like this?
Output:
4,0,0 5,0,0 6,2,[29743,29745]
What is the purpose? It is not an easy format to work with. When writing this dataframe I think pandas would save as.
Output:
4,0,0 5,0,0 6,2,"[29743,29745]"
Quotes are placed around the list to indicate the list is a single value. The quotes also turn the list into a string. When you read the CSV file you'll have to convert the list-like string back to a list.

If you want to make a file that contains this kind of information I suggest using a json format file. But first you should really think about if a list like this, containing so little information, is worth making.
Reply
#4
Not making a file, just need to filter the ping results against our ticket system data and return ticket counts and ticket numbers by store.. No additional file is needed or being created, just need the counts and ticket #'s for any store that has any reported tickets about registers. Format doesnt matter, only that way in my post for better understanding of what im trying to accomplish..

the end result will be presented in a HTML file.
But my main ask is how to look at the support ticket file and see if any of the "Down" registers already have tickets opened.
Reply
#5
In your example which is the "support ticket file" and which is the "already opened ticket file"?

"opened tickets" sounds like it should be a database. Your ping results should open a ticket for a register that is down, and repairing a register should close the ticket. You can tell the database to only open the ticket if there is no open ticket for the register.

A file can stand in as a database, but it is a clunky substitute. You could read in your list of open tickets (do you need the store number?) and save as a set. Before opening a ticket, check if the register number is in the open ticket set. You could do this with pandas using the "isin()" function. The set of registers with open tickets would be the "values" for the function.

https://pandas.pydata.org/pandas-docs/st....isin.html
Reply
#6
OK here is my best sample data i can provide to help illustrate better what im trying to accomplish..

My current script pings a long list of IP addresses for our stores.. the results from those pings are in the following format: (this is saved out to a CSV file for reporting and other historical reasons, will eventually be saved into a database in the near future)

IP,Store,Register #,Status,Runtime
121.50.10.201,100,201,0,"11/07/2023,15:58:20"
121.50.10.202,100,202,1,"11/07/2023,15:58:20"
121.50.10.203,100,203,0,"11/07/2023,15:58:20"
121.50.10.204,100,204,0,"11/07/2023,15:58:20"
121.50.10.205,100,205,0,"11/07/2023,15:58:20"
121.50.10.206,100,206,1,"11/07/2023,15:58:20"
121.50.10.207,100,207,0,"11/07/2023,15:58:20"
121.50.10.208,100,208,0,"11/07/2023,15:58:20"
121.50.10.209,100,209,0,"11/07/2023,15:58:20"
121.50.10.221,100,221,0,"11/07/2023,15:58:20"
121.50.10.222,100,222,0,"11/07/2023,15:58:20"
Then out of our support system, we can get a excel file with the following data/structure.

Number	Caller	Created	Short description
847733	Store110	2023-11-02 14:02:26	Reg#208 System busy
848518	Store110	2023-11-04 11:14:54	Reg 206 DOA Base with bad 9A port
844657	Store107	2023-10-27 18:12:06	Reg#206 No communicating
842064	Store105	2023-10-22 18:13:36	Register 206 has latency in all functions 
846591	Store100	2023-10-31 12:43:06	Register 211 display is zoomed incorrectly - Touchscreen touching wrong spots
850157	Store100	2023-11-07 17:20:35	Reg #202  Screen went to a blank blue screen
So what i would like to be able to do is that IF there are any registers that return as "Down" check that store and register against the support system excel file. IF that store/register combination is in the excel file, then keep a count. And either add two new columns to the ping results with a True/False indicating that a support ticket is opened and another column with the ticket numbers, or just one new column with the ticket numbers for that register..

I mean a count would be a good first phase, but if the ticket numbers are just as easy to pull out of the excel, then that may be an even better first phase results. Or if both the count and support numbers is easy then that would work as well.. Ultimately we are trying to update our dashboard to provide key high level data/numbers so the support teams can act accordingly every time the dashboard updates.

So based on the above sample data, we can see that registers 202 and 206 returned as "Down" in the ping results, so i would then like to look in the support file(excel) for store 100 for registers 202 and 206 and in that file, there is 1 ticket for register 202 and nothing for 206

So own our dashboard we would show: (this really doesnt matter as all i really care for is the data and it will be presented in some form or fashion on the page)
Store 100 (1)
Store 100 (850157)

What makes is a little difficult, is that in the ping results, we only have the digit representation of the store number, where as in the support file, the numbers are formatted differently, could be Store 100, Store100, ST100. Also if you look at the description field, you can see that the registers are not all uniform either, since that is manually provided by the person and up to them on how to state the issues.
hope this helps demonstrate what im trying to accomplish.
Reply
#7
The support system file doesn't have the information you need to know if a down register has an open ticket. The support system file should have a register column. As is, you have to parse the description, which appears to be free-form, in hope that it contains the register number.

Are register numbers unique, or can two stores have the same register number? If register numbers are unique there is no reason to look at the store. If register numbers are unique, I would open up the support system file and get a list of all registers that have open tickets. Next I would open up the ping file and filter out all registers that are online. Using the register numbers pulled from the support system file I would filter out all registers mentioned in the support file. That leaves you with a ping file of registers that are offline and not mentioned in the support system file.
Reply
#8
The field is free form and up to the support analyst to enter the key information when they take the calls from the stores, so thats the reason why its not very clean.. and the only column where we would find or know if there is a specific register. So yea it would have to be some sort of:
Description contains 202 logic.. The store is somewhat cleaner because they have to pick from a list, but still have some instances where it may very, but it can be parsed out to only have the number without the text values.

As for the register numbers, no, each store has the same set of register numbers, just unique is the "Store#"
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Require Some Suggestions gouravlal 2 1,894 Jul-27-2020, 06:14 AM
Last Post: gouravlal
  Python Debugger Suggestions nilamo 3 3,090 Oct-22-2018, 07:05 PM
Last Post: jdjeffers
  Logic error when comparing randomly generated integers SadoDeomeoon 5 4,684 Jun-05-2017, 02:38 PM
Last Post: SadoDeomeoon
  Learning Python, need suggestions vkozinec 1 25,447 Mar-22-2017, 06:43 PM
Last Post: wavic

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020