Good day - I thought I was going to be able to put my continual Python failures behind me for another month or so, but I've received a new task at work that *may* be easier than what I was trying to do before.
Unlike prior efforts, I'm not looking for an answer, but am hoping for some guidance (what libraries and which functions to investigate).
Here's the problem I'm trying to solve:
I have two CSV files with different row counts. I have deleted all but one column from each to remove the complexity of dealing with a grid or table or whatever it may be called in Python, so I'm left with two files, each with a single column of data. I have truncated, capitalized, ensured that the data are all alpha-only strings, de-duped, and removed all non-printable characters to avoid receiving confusing error messages. I THINK my data is as clean and simple as I can make it.
I would like to end up with 3 things (but I'd be thrilled if I can make any one of them happen) in an output file (which can be a text file or csv or anything else I can open and read or send to a printer):
1) What is in File A that is not in File B
2) What is in File B that is not in File A
3) What is common to both File A and File B
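For reference, these three results correspond to Python's built-in set operations: difference in each direction, and intersection. The following is only a minimal sketch, assuming one value per line in each file. The function names (read_values, compare_files, write_report) are hypothetical and do not come from any existing script. It strips whitespace and line endings before comparing, on the assumption that stray trailing characters could otherwise keep identical-looking values from matching.

```python
def read_values(path):
    """Return the set of non-empty lines in a file, stripped of
    surrounding whitespace and line endings."""
    with open(path, "r") as handle:
        return {line.strip() for line in handle if line.strip()}


def compare_files(path_a, path_b):
    """Return three sorted lists: (only in A, only in B, in both)."""
    a = read_values(path_a)
    b = read_values(path_b)
    return sorted(a - b), sorted(b - a), sorted(a & b)


def write_report(path_out, only_a, only_b, both):
    """Write the three lists to a plain-text file, one section each."""
    sections = [
        ("In File A but not in File B", only_a),
        ("In File B but not in File A", only_b),
        ("In both File A and File B", both),
    ]
    with open(path_out, "w") as out:
        for title, values in sections:
            # str.format is used so the sketch also runs on Python 3.5.
            out.write("{} ({} items)\n".format(title, len(values)))
            for value in values:
                out.write(value + "\n")
            out.write("\n")
```

A call might look like only_a, only_b, both = compare_files(r'C:\Temp\INDU.csv', r'C:\Temp\MOV.csv'), followed by write_report(r'C:\Temp\CompareResults.txt', only_a, only_b, both). The r prefix makes each path a raw string, so the backslashes are not treated as escape characters. The output file name here is only an example.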
I copied some code that I found in a colleague's old repository, but it outputs an empty file. I'll add it here in case the best approach is to simply modify her script.
And if it matters, I believe I'm using Python 3.5.1, with PyCharm as the editor.

    import pandas

    with open('C:\Temp\INDU.csv', 'r') as file1:
        with open('C:\Temp\MOV.csv', 'r') as file2:
            same = set(file1).intersection(file2)

    same.discard('\t')

    with open('C:\Temp\CompareResults.csv', 'w') as file_out:
        for line in same:
            file_out.write(line)

Thank you for any suggestions!