Posts: 64
Threads: 16
Joined: Nov 2019
(Jan-01-2021, 07:48 AM)SunWers Wrote: (Jan-01-2021, 07:41 AM)buran Wrote: can you upload sample input file? Looking at the screenshot again, it is more likely a fixed-width file.
Sorry, crazy question.... how do I upload a file?
Just copy and paste a small section of the text file, using the output tags.
Posts: 18
Threads: 5
Joined: Dec 2020
Jan-01-2021, 08:04 AM
(This post was last modified: Jan-01-2021, 08:04 AM by SunWers.)
(Jan-01-2021, 07:53 AM)palladium Wrote: (Jan-01-2021, 07:31 AM)SunWers Wrote: Hi Palladium,
I just tried adding "sep = None" and this is the error message I get...
runfile('C:/Users/vhubb/Documents/Python Scripts/temp.py', wdir='C:/Users/vhubb/Documents/Python Scripts')
C:\Users\vhubb\Documents\Python Scripts\temp.py:37: ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support regex separators (separators > 1 char and different from '\s+' are interpreted as regex); you can avoid this warning by specifying engine='python'.
df = pd.read_csv(in_filename, sep="None")
It's not an error message. Its a warning to let you know that the pandas engine is using a non default method to parse your file (the default is C engine which does not guessing delimiters; pandas is telling you that it has switched to the python engine which I understand is slower but has some built in support for guessing delimiters). If you add the keyword argument engine = 'python' as per the warning message, the message should disappear.
An error message will tell you what type of error it is (e.g. "TypeError" or "ValueError") followed by a message.
If you don't see anything else it means the process is successful. Print out the resulting dataframe to verify.
As buran pointed out you can use pandas' built in to_excel() method, which dumps the dataframe into an xlsx file format. It is customisable to some extent,b ut as (s)he astutely pointed out, given your complex needs openpyxl may be a better bet.
I checked the output file and it is still combining all columns into one. I also just uploaded the text file above.
Posts: 8,151
Threads: 160
Joined: Sep 2016
Jan-01-2021, 08:39 AM
(This post was last modified: Jan-01-2021, 08:39 AM by buran.)
yes, it's fixed-width file.
If you want to read it with pandas - look at pandas.read_fwf
or just parse it your self (i.e. there is no much use of pandas if you just want to read it and then write to excel, without doing calcutions).
Also note that there is extra header line (the underlines of the header) which you will need to deal with in pandas.
import pandas as pd
df = pd.read_fwf('mytxt.txt')
df.drop(index=0, inplace=True) # drop the underlines
print(df.head()) Output: Cutfile Name Start Time End Time Total Time Status ... Dry Run.1 Cut Speed Throughput Feed Rate Operator
1 BWW34580401 B 11/2/2020 6:39:37 AM 11/2/2020 6:40:05 AM 0:00:28 2 ... 0.000 0.000 0.000 0.000 Unknown
2 BWW34580401 B 11/2/2020 6:40:20 AM 11/2/2020 7:11:45 AM 0:31:25 3 ... 0.000 394.513 168.167 40.000 Unknown
3 PRECUT.$$$ 11/2/2020 7:21:07 AM 11/2/2020 7:21:22 AM 0:00:15 3 ... 0.000 354.574 173.032 40.000 Unknown
4 LFRF0106 30D A2 11/2/2020 7:32:59 AM 11/2/2020 8:00:49 AM 0:27:50 3 ... 0.000 430.553 200.613 40.000 Unknown
5 LFRF0106 30D A1 11/2/2020 8:19:21 AM 11/2/2020 9:08:27 AM 0:49:06 3 ... 0.000 430.338 200.812 40.000 Unknown
[5 rows x 27 columns]
palladium likes this post
Posts: 1
Threads: 0
Joined: Oct 2020
why aren't you using the csv module to do this? your code seems to pick the first letter of the name to put it in the second column .
Posts: 8,151
Threads: 160
Joined: Sep 2016
Jan-01-2021, 03:04 PM
(This post was last modified: Jan-01-2021, 03:05 PM by buran.)
(Jan-01-2021, 09:55 AM)stevenroger Wrote: why aren't you using the csv module to do this? csv module will not do. It's a fixed-width file, not delimited (comma-, tab- or whatever).
Of course, as I mentioned, they can parse it themselves.
Posts: 18
Threads: 5
Joined: Dec 2020
Thank You! I think what I will do is parse them myself and then go from there. I'm thinking there is a way to pull the files as .csv format. But I won't know for sure until I get back to work Monday. I could set the script up to be more automatic if the files were .csv format, correct?
Thanks again for everyone's help!
Hope everyone is having a Happy Holiday
-Veronica
(Jan-01-2021, 08:39 AM)buran Wrote: yes, it's fixed-width file.
If you want to read it with pandas - look at pandas.read_fwf
or just parse it your self (i.e. there is no much use of pandas if you just want to read it and then write to excel, without doing calcutions).
Also note that there is extra header line (the underlines of the header) which you will need to deal with in pandas.
import pandas as pd
df = pd.read_fwf('mytxt.txt')
df.drop(index=0, inplace=True) # drop the underlines
print(df.head()) Output: Cutfile Name Start Time End Time Total Time Status ... Dry Run.1 Cut Speed Throughput Feed Rate Operator
1 BWW34580401 B 11/2/2020 6:39:37 AM 11/2/2020 6:40:05 AM 0:00:28 2 ... 0.000 0.000 0.000 0.000 Unknown
2 BWW34580401 B 11/2/2020 6:40:20 AM 11/2/2020 7:11:45 AM 0:31:25 3 ... 0.000 394.513 168.167 40.000 Unknown
3 PRECUT.$$$ 11/2/2020 7:21:07 AM 11/2/2020 7:21:22 AM 0:00:15 3 ... 0.000 354.574 173.032 40.000 Unknown
4 LFRF0106 30D A2 11/2/2020 7:32:59 AM 11/2/2020 8:00:49 AM 0:27:50 3 ... 0.000 430.553 200.613 40.000 Unknown
5 LFRF0106 30D A1 11/2/2020 8:19:21 AM 11/2/2020 9:08:27 AM 0:49:06 3 ... 0.000 430.338 200.812 40.000 Unknown
[5 rows x 27 columns]
Posts: 8,151
Threads: 160
Joined: Sep 2016
Jan-02-2021, 03:24 AM
(This post was last modified: Jan-02-2021, 03:24 AM by buran.)
(Jan-02-2021, 01:22 AM)SunWers Wrote: I could set the script up to be more automatic if the files were .csv format, correct? It will be "automatic" all the same. Just the code will be slightly different, e.g. pandas.read_fwf vs pandas.read_csv or if you parse yourself it will allow to use csv, but in any case it will not do big difference.
Given that pandas did good job reading the file, concentrate on writing to excel. You have 2 options - write to excel from pandas using to_excel, then format the file to look like what you want using other package like openpyxl or write each column from the dataframe in the template you created one by one. In any case you need to handle start and end time because date and time are one column, while you want them separate columns.
Posts: 1,090
Threads: 143
Joined: Jul 2017
A .csv file is a text file.
Something has to tell the csv reader what belongs in what column.
Below, the separator is a comma.
The comma was put there when I saved a Libre Office spreadsheet as .csv
Here is a peek:
Quote:Photo,phone,number,name,attendance,attn%,Final Score,points%,points,hwW16,cwW16,Week 16,hwW15,cwW15,Week 15,hwW14,cwW14,Week 14,hwW13,cwW13,Week 13,hwW12,cwW12,Week 12,hwW11,cwW11,Week 11,hwW10,cwW10,Week 10,hwW9,cwW9,Week 9
,,,,max Attendance,,,Max Score,425,,,,50,25,,50,25,,50,25,,50,25,,50,25,,0,25,,0,25,
,,,23名,8,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Photo,phone,学号,姓名,出勤,出勤率,期末分数,得分率,得分,HW,CW,Week 16,HW,CW,Week 15,HW,CW,Week 14,HW,CW,Week 13,HW,CW,Week 12,HW,CW,Week 11,HW,CW,Week 10,HW,CW,Week 9
,,1921030146,聂林,0,0,0,0,0,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,,1921090147,郜阳杰,0,0,0,0,0,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,,1922190144,汪毓韬,0,0,0,0,0,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,,1923010312,张长乐,7,100,100,100,425,,,1,50,25,1,50,25,1,50,25,1,50,25,1,50,25,1,0,25,1,0,25,1
When I open this .csv file with Libre Office, LO looks for the separator. Can be tab, comma, dunno what.
This reads the above file and also can save the data I want, better than csv.reader() (I think):
import pandas as pd
# Read the file
file1 = '/home/pedro/winter2020/20PY/dumpFiles/attendanceWinter2020_20PYlatestPyMade_test1.csv'
dg = pd.read_csv(file1, usecols=['number', 'name', 'Final Score'])
# write the file
file2 = '/home/pedro/winter2020/20PY/dumpFiles/panda_bear.csv'
dg.to_csv(file2, encoding='utf8', index=False, sep=',')
Posts: 7
Threads: 0
Joined: Dec 2020
How to Convert a TXT file to CSV
- Open Excel and create a new spreadsheet.
- Select the Data tab.
- On the far right, click “Get External Data”, then select the “From Text” option.
- Find the TXT file on your computer and click “Open”.
- In the first step of the Import Wizard, select “Delimited”
- Make sure to select that your data has headers (check the box) then click next.
- In the second step, select “Tab” if you’re working with the Amazon All Products Report, or “Comma” if you’re using another file. The best way to tell is to take a look at the contents of the file and see how they’re divided. If there is just a bunch of space between one field and another, then Tab is the right choice. If you look at the file and there are commas after each entry, then Comma is the right choice.
Regards,
Caleb
Posts: 8,151
Threads: 160
Joined: Sep 2016
@ caleb_cruze, what does this have to do with OP question on how to automate report creation using python. Based on discussion so far she is perfectly good at doing the report manually :)
|