Posts: 41
Threads: 15
Joined: Aug 2022
Hi all,
I want to sort a CSV file, find which items failed (where a column has a limit), paint those cells a colour, and delete the other items afterwards so the result looks like the final data.xlsx format. I only know how to sort out the failed rows; I don't know how to compare against the limits. Kindly help me with this, thanks!!
import csv

array=[]
failed=[]
count=0

with open('Log.csv','r') as fp:
    csv_reader=csv.reader(fp)
    header = next(csv_reader)
    listreport=list(csv_reader)
    for line in listreport:
        if count<3:
            array.append(line)
        if line[5]=='Failed':
            failed.append(line)
        count+=1

with open('new_name.csv','w',encoding='utf-8',newline='') as new_line:
    csv_writer=csv.writer(new_line)
    csv_writer.writerow(header)
    csv_writer.writerows(array)
    csv_writer.writerows(failed)
Posts: 453
Threads: 16
Joined: Jun 2022
Aug-06-2022, 07:51 AM
(This post was last modified: Aug-06-2022, 07:51 AM by rob101.)
You are working with a large amount of data, which creates huge lists (len = 385); much of the data is duplicated within those lists, which is very confusing.
Also, I'm unsure why you are creating two objects: header and listreport. You could do the exact same thing with one object (the iterator in the for loop), which I've named row.
with open('Log.csv') as log:
    csv_reader = csv.reader(log)
    for row in csv_reader:
        if count<3:
            array.append(row)
        if row[5]=='Failed':
            failed.append(row)
        count+=1
I'm still unclear about your objective, as your English is a little hard for me to follow; I can see that you are doing your best, so the fail here is more on my side than yours.
I'll have a think about this and try to offer better help, but if you can find a way to help me better understand your objective, then please do.
edit to add:
I think that the major issue here is that you're trying to do too much at once: the task needs to be broken down into smaller steps (sub-tasks, if you will), solving one thing at a time, each step moving ever closer to the solution.
Sig:
>>> import this
The UNIX philosophy: "Do one thing, and do it well."
"The danger of computers becoming like humans is not as great as the danger of humans becoming like computers." :~ Konrad Zuse
"Everything should be made as simple as possible, but not simpler." :~ Albert Einstein
Posts: 41
Threads: 15
Joined: Aug 2022
Thank you for your prompt reply. This is a test log of measured products. For this file I want two steps:
1) Sort out the failed items. --> Already done.
2) Check whether the data in each column is within its limit, if a limit is assigned. For example, in Log.csv column M (VDD Continuity) the limit is -0.8 to 0.25, so select that column and check whether each value is in range, then do the same check for the next column that has a limit (LR Continuity). If a value is out of range, paint that cell a colour and skip the row, thanks!
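(For reference, the comparison described above amounts to a simple per-column range test; here is a minimal sketch, assuming the VDD Continuity limits from this post and hypothetical names for everything else.)
# Minimal sketch of the range check described above.
# The limits come from the example in this post; the names are hypothetical.
limits = {"VDD Continuity": (-0.8, 0.25)}  # column -> (low, high)

def out_of_range(column, value):
    low, high = limits[column]
    return not (low <= float(value) <= high)

print(out_of_range("VDD Continuity", "0.10"))  # False: within the limits
print(out_of_range("VDD Continuity", "0.30"))  # True: above the upper limit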
Posts: 453
Threads: 16
Joined: Jun 2022
You are very welcome.
I have worked with Excel for many years and (from my experience) I see no reason to use Python to solve this problem. All the tools you need (the functions) are already there in Excel, such as Conditional Formatting, Logical, Search, etc., which can be used to reference any cell in any worksheet.
By trying to use Python, you now need to solve two problems, rather than just the one.
Posts: 41
Threads: 15
Joined: Aug 2022
Thanks in advance!
Yes, I have tried using Excel conditional formatting with a "between" rule, but many columns need to be compared manually. It is a large workload, so I want to try using Python to handle this more easily.
Posts: 2,111
Threads: 10
Joined: May 2017
Aug-06-2022, 10:47 AM
(This post was last modified: Aug-06-2022, 10:47 AM by DeaD_EyE.)
It's not so easy, because you need to know how csv.reader works and you need to convert str to float, for example; otherwise you can't do the comparison.
Here is an example which:
- skips the first line before using csv.DictReader
- lets DictReader automatically take the next line (after skipping, the second line) for the column names, which later become the keys
- reads two rows, where the first should be the minimum limits and the second the maximum limits
- converts these str values to float and attaches them as attributes on the Reader instance
- iterates through the remaining data, using the methods reader.test_pass(row) and reader.in_valid_range(row)
Code:
import csv
from operator import itemgetter


class Reader:
    def __init__(self, csv_file, sort_fields):
        self.fd = open(csv_file, newline="")
        self.sort_fields = itemgetter(*sort_fields)
        # self.sort_fields(row) -> returns a tuple
        # with the wanted data from the mapping

        # skipping first line
        next(self.fd)
        # DictReader reads automatically the first line
        # to get the column names
        self.csv = csv.DictReader(self.fd, delimiter=",")
        # getting minimum and maximum values
        self.minimum_values = next(self.csv)
        self.maximum_values = next(self.csv)
        # getting the limit values from the dict and converting them to a float
        # so they can be used for comparison
        self.vdd_continuity_min = float(self.minimum_values["VDD Continuity"])
        self.vdd_continuity_max = float(self.maximum_values["VDD Continuity"])
        self.lr_continuity_min = float(self.minimum_values["LR Continuity"])
        self.lr_continuity_max = float(self.maximum_values["LR Continuity"])

    def sort_by_float(self, row):
        # calling self.sort_fields with row -> data
        # map calls float() on each element from data
        # tuple consumes the map
        # a tuple with floats is returned
        return tuple(map(float, self.sort_fields(row)))

    def __iter__(self):
        # sorting requires reading the remaining rows into memory
        yield from sorted(self.csv, key=self.sort_by_float)

    def __enter__(self):
        return self

    def __exit__(self, exc_typ, exc_obj, exc_tb):
        # closes the file automatically when leaving
        # the context manager
        self.fd.close()

    @staticmethod
    def test_pass(row):
        # no instance (self) is needed to do the test
        return row.get("Pass/Fail", "").lower() == "passed"

    def in_valid_range(self, row):
        # here self is needed, because attributes from this
        # instance are read
        vdd = float(row["VDD Continuity"])
        lr = float(row["LR Continuity"])
        return (self.vdd_continuity_min <= vdd <= self.vdd_continuity_max) and (
            self.lr_continuity_min <= lr <= self.lr_continuity_max
        )


if __name__ == "__main__":
    with Reader("Log.csv", sort_fields=["VDD Continuity", "Temperature"]) as reader:
        # the reader instance now has some attributes
        print(f"{reader.vdd_continuity_min=}")
        print(f"{reader.vdd_continuity_max=}")
        print(f"{reader.lr_continuity_min=}")
        print(f"{reader.lr_continuity_max=}")

        # implicitly calls the __iter__ method of the Reader instance
        for row in reader:
            if reader.test_pass(row) and reader.in_valid_range(row):
                part_id = row.get("Part ID")
                if part_id:
                    print(part_id)
The output could be made with: https://www.geeksforgeeks.org/working-wi...in-python/
But working directly with Excel will solve your task faster (for now). Later, once you are more experienced, this code will be nothing difficult to write.
EDIT: Added the sorting. It's not possible to use this method if the file is half the size of your RAM.
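(The link above covers openpyxl; here is a minimal sketch of how a cell could be coloured when a value is out of range, assuming hypothetical file, sheet and cell positions.)
# Minimal sketch: painting a cell with openpyxl when a value is out of range.
# The workbook, cell position and output file name are hypothetical;
# the limits follow the VDD Continuity example from this thread.
from openpyxl import Workbook
from openpyxl.styles import PatternFill

wb = Workbook()
ws = wb.active

red = PatternFill(start_color="FFFF0000", end_color="FFFF0000", fill_type="solid")

low, high = -0.8, 0.25
value = 0.30
cell = ws.cell(row=2, column=13, value=value)  # column M
if not (low <= value <= high):
    cell.fill = red  # paint the cell when the value is out of range

wb.save("checked.xlsx")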
Posts: 12,005
Threads: 482
Joined: Sep 2016
FYI: you can convert to a pandas DataFrame and manipulate that.
This may (or may not) be useful in this instance.
import pandas as pd
import os


def display_log(filename):
    df = pd.read_csv(filename)
    print(df)


def main():
    # this code changes CWD to same directory as python script
    os.chdir(os.path.abspath(os.path.dirname(__file__)))
    display_log('Log.csv')


if __name__ == '__main__':
    main()
which produces (partial):
Output: Unnamed: 0 Unnamed: 1 Unnamed: 2 ... 5045.1 5047 5047.1
0 TesterID Batch Number Site Number ... DUT_LM_relativeShock_PAT_Hi DUT_LM_Sens_Corr1000Hz_94SPL__PAT_Lo DUT_LM_Sens_Corr1000Hz_94SPL__PAT_Hi
1 NaN NaN NaN ... -0.053597136 -38.80975306 -38.80975306
2 NaN NaN NaN ... 0.112112568 -37.5058393 -37.5058393
3 Tester#2 1 3 ... 0.107265399 -38.83249426 -37.45496388
4 Tester#2 1 4 ... 0.112704539 -38.82323609 -37.47194499
5 Tester#2 1 5 ... 0.109072175 -38.82126228 -37.49127838
6 Tester#2 1 6 ... 0.111968008 -38.83483278 -37.47525095
7 Tester#2 1 7 ... NaN NaN NaN
8 Tester#2 1 8 ... 0.121047481 -38.85764922 -37.44677698
9 Tester#2 1 9 ... 0.107312512 -38.84481032 -37.48223468
10 Tester#2 1 10 ... 0.113937343 -38.81981198 -37.49873383
11 Tester#2 1 11 ... 0.105469581 -38.83446045 -37.49203972
12 Tester#2 1 13 ... 0.112050497 -38.83122185 -37.47174578
13 Tester#2 1 14 ... 0.113251309 -38.81597385 -37.46724701
14 Tester#2 1 15 ... 0.107133 -38.87445237 -37.44735208
15 Tester#2 1 17 ... 0.106066352 -38.82747034 -37.46790153
16 Tester#2 1 18 ... 0.111913245 -38.84638211 -37.47201892
17 Tester#2 1 19 ... 0.105938991 -38.83105884 -37.45517457
18 Tester#2 1 20 ... 0.11032486 -38.86610586 -37.46020799
19 Tester#2 1 22 ... 0.108430019 -38.83336922 -37.48581459
20 Tester#2 1 23 ... 0.111608632 -38.8631526 -37.47320664
21 Tester#2 1 24 ... 0.11198719 -38.85636541 -37.49164502
22 Tester#2 1 26 ... 0.118748873 -38.81289836 -37.51235239
23 Tester#2 1 27 ... 0.116329872 -38.83398578 -37.47402664
24 Tester#2 1 28 ... 0.111891007 -38.82510794 -37.46041127
25 Tester#2 1 29 ... 0.112180879 -38.83137426 -37.47103944
26 Tester#2 1 30 ... 0.115581321 -38.83673984 -37.49914098
27 Tester#2 1 31 ... 0.111984757 -38.84949218 -37.45391246
28 Tester#2 1 32 ... 0.114766975 -38.84627817 -37.46742165
29 Tester#2 1 33 ... 0.116238335 -38.82761274 -37.47276069
30 Tester#2 1 34 ... 0.11051097 -38.84194923 -37.44687194
31 Tester#2 2 0 ... 0.112107334 -38.80890703 -37.50574658
32 Tester#2 2 1 ... 0.114155945 -38.83613146 -37.49142007
33 Tester#2 2 2 ... NaN NaN NaN
34 Tester#2 2 3 ... NaN NaN NaN
35 Tester#2 2 4 ... 0.112702549 -38.8238681 -37.47156012
36 Tester#2 2 5 ... 0.109082816 -38.82111536 -37.49128727
37 Tester#2 2 6 ... 0.11193 -38.83687834 -37.47190235
38 Tester#2 2 7 ... 0.110769315 -38.81795084 -37.48783699
39 Tester#2 2 8 ... 0.120575644 -38.85765686 -37.44676501
40 Tester#2 2 9 ... 0.107202601 -38.84448086 -37.48138477
41 Tester#2 2 10 ... 0.113890241 -38.82515551 -37.49137495
42 Tester#2 2 11 ... 0.105438307 -38.83445528 -37.49161599
43 Tester#2 2 13 ... NaN NaN NaN
44 Tester#2 2 14 ... 0.113201626 -38.81667164 -37.46618239
45 Tester#2 2 15 ... 0.107045831 -38.87418522 -37.44798724
46 Tester#2 2 17 ... 0.105892395 -38.82975426 -37.46608974
47 Tester#2 2 18 ... 0.111842669 -38.84624438 -37.47263779
48 Tester#2 2 19 ... 0.106001641 -38.83279308 -37.45250412
49 Tester#2 2 20 ... 0.110353187 -38.86689691 -37.4586455
50 Tester#2 2 22 ... 0.108419992 -38.83444891 -37.4837277
51 Tester#2 2 23 ... 0.111471083 -38.86236253 -37.47474858
52 Tester#2 2 24 ... 0.112015322 -38.8547756 -37.49260473
53 Tester#2 3 3 ... NaN NaN NaN
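(Following on from the DataFrame suggestion above, a minimal sketch of how one column could be compared against its limits with pandas; the skiprows value, column name and file names are assumptions based on this thread.)
# Minimal sketch: flagging out-of-range values with pandas.
# The column name and limits follow the examples in this thread;
# skiprows=1 assumes the real header sits on the second line of Log.csv.
import pandas as pd

df = pd.read_csv("Log.csv", skiprows=1)
df["VDD Continuity"] = pd.to_numeric(df["VDD Continuity"], errors="coerce")

low, high = -0.8, 0.25
in_range = df["VDD Continuity"].between(low, high)

df[in_range].to_csv("in_range.csv", index=False)       # rows within the limits
df[~in_range].to_csv("out_of_range.csv", index=False)  # rows to highlight or skip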
Posts: 2,111
Threads: 10
Joined: May 2017
Pandas feels to me like numpy + black magic.
Pandas is powerful, but what I often see is people trying to work with this framework without knowledge of the core Python language itself.
Posts: 41
Threads: 15
Joined: Aug 2022
I appreciate your reply; I am just a newbie in Python.
Posts: 41
Threads: 15
Joined: Aug 2022
import csv

up_limit=[]
low_limit=[]
dt=[]
array=[]
num=0
index_number=[12,13,14,15,22,26,32,41,42,44,45,47,48,49,50,51,52,53,54,55,56,57,58,60,62,64,66,74,80,85,86,91,92,97,98,102,
              104,108,110,114,116,120,122,126,128,132,134,138,140,144,146,152,173,177,204,205,206,207,208,210,212,214,215,216,
              218,239,240,241,242,243,247,248,263,264,270,271,275,277,281,283,292,299,301,304,306,309,310,311,312,313,314,315,
              316,317,318,319,320,321,322,323,324,325,326,327,328,329,330,331,332,333,334,335,336,337,338,339,340,341,342,343,
              344,345,346,347,348,349,350,351,352,353,354,355,356,357,358,359,360,361,362,363,364,365,366,367,368,369,370,371,
              372,373,374,375,376,377,378,379,380,381,382,383,384]

with open('Log.csv','r') as file:
    readerfile=csv.reader(file)
    for i, rows in enumerate(readerfile):
        if i==2:
            for j in range(len(index_number)):
                col=rows[index_number[j]]
                up_limit.append(float(col))
        elif i==3:
            for j in range(len(index_number)):
                col = rows[index_number[j]]
                low_limit.append(float(col))
file.close()

with open('Log.csv', 'r') as fp:
    readerfile = csv.reader(fp)
    header0=next(readerfile)
    header1=next(readerfile)
    header2=next(readerfile)
    header3=next(readerfile)
    data=list(readerfile)
    data.sort(key=lambda row:(int(row[2])))
    for row in data:
        if row[5]=='Failed':
            array.append(row)
    for i,rows in enumerate(array):
        for j in range(len(index_number)):
            col_value = rows[index_number[j]]
            if num<len(index_number):
                if float(col_value)>=up_limit[num] and float(col_value)<=low_limit[num]:
                    dt.append(col_value)
                    num += 1
                else:
                    num=0
                    break

with open('sorted_data.csv','w',encoding='utf-8',newline='') as new:
    csvwriter=csv.writer(new)
    csvwriter.writerow(header0)
    csvwriter.writerow(header1)
    csvwriter.writerow(header2)
    csvwriter.writerow(header3)
    csvwriter.writerows(dt)
I have updated the code so it can read the upper and lower limits and compare them row by row. Why is the format of the final generated file wrong? The sorted_data.csv is what I want.