Python Forum
Converting .txt to .csv file
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Converting .txt to .csv file
#11
(Jan-01-2021, 07:48 AM)SunWers Wrote:
(Jan-01-2021, 07:41 AM)buran Wrote: can you upload sample input file? Looking at the screenshot again, it is more likely a fixed-width file.

Sorry, crazy question.... how do I upload a file?

Just copy and paste a small section of the text file, using the output tags.
Reply
#12
(Jan-01-2021, 07:53 AM)palladium Wrote:
(Jan-01-2021, 07:31 AM)SunWers Wrote: Hi Palladium,

I just tried adding "sep = None" and this is the error message I get...

runfile('C:/Users/vhubb/Documents/Python Scripts/temp.py', wdir='C:/Users/vhubb/Documents/Python Scripts')
C:\Users\vhubb\Documents\Python Scripts\temp.py:37: ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support regex separators (separators > 1 char and different from '\s+' are interpreted as regex); you can avoid this warning by specifying engine='python'.
  df = pd.read_csv(in_filename, sep="None")

It's not an error message. Its a warning to let you know that the pandas engine is using a non default method to parse your file (the default is C engine which does not guessing delimiters; pandas is telling you that it has switched to the python engine which I understand is slower but has some built in support for guessing delimiters). If you add the keyword argument engine = 'python' as per the warning message, the message should disappear.

An error message will tell you what type of error it is (e.g. "TypeError" or "ValueError") followed by a message.

If you don't see anything else it means the process is successful. Print out the resulting dataframe to verify.

As buran pointed out you can use pandas' built in to_excel() method, which dumps the dataframe into an xlsx file format. It is customisable to some extent,b ut as (s)he astutely pointed out, given your complex needs openpyxl may be a better bet.

I checked the output file and it is still combining all columns into one. I also just uploaded the text file above.
Reply
#13
yes, it's fixed-width file.
If you want to read it with pandas - look at pandas.read_fwf

or just parse it your self (i.e. there is no much use of pandas if you just want to read it and then write to excel, without doing calcutions).
Also note that there is extra header line (the underlines of the header) which you will need to deal with in pandas.

import pandas as pd

df = pd.read_fwf('mytxt.txt')
df.drop(index=0, inplace=True) # drop the underlines
print(df.head())
Output:
Cutfile Name Start Time End Time Total Time Status ... Dry Run.1 Cut Speed Throughput Feed Rate Operator 1 BWW34580401 B 11/2/2020 6:39:37 AM 11/2/2020 6:40:05 AM 0:00:28 2 ... 0.000 0.000 0.000 0.000 Unknown 2 BWW34580401 B 11/2/2020 6:40:20 AM 11/2/2020 7:11:45 AM 0:31:25 3 ... 0.000 394.513 168.167 40.000 Unknown 3 PRECUT.$$$ 11/2/2020 7:21:07 AM 11/2/2020 7:21:22 AM 0:00:15 3 ... 0.000 354.574 173.032 40.000 Unknown 4 LFRF0106 30D A2 11/2/2020 7:32:59 AM 11/2/2020 8:00:49 AM 0:27:50 3 ... 0.000 430.553 200.613 40.000 Unknown 5 LFRF0106 30D A1 11/2/2020 8:19:21 AM 11/2/2020 9:08:27 AM 0:49:06 3 ... 0.000 430.338 200.812 40.000 Unknown [5 rows x 27 columns]
palladium likes this post
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply
#14
why aren't you using the csv module to do this? your code seems to pick the first letter of the name to put it in the second column.
Reply
#15
(Jan-01-2021, 09:55 AM)stevenroger Wrote: why aren't you using the csv module to do this?
csv module will not do. It's a fixed-width file, not delimited (comma-, tab- or whatever).

Of course, as I mentioned, they can parse it themselves.
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply
#16
Thank You! I think what I will do is parse them myself and then go from there. I'm thinking there is a way to pull the files as .csv format. But I won't know for sure until I get back to work Monday. I could set the script up to be more automatic if the files were .csv format, correct?

Thanks again for everyone's help!

Hope everyone is having a Happy Holiday

-Veronica

(Jan-01-2021, 08:39 AM)buran Wrote: yes, it's fixed-width file.
If you want to read it with pandas - look at pandas.read_fwf

or just parse it your self (i.e. there is no much use of pandas if you just want to read it and then write to excel, without doing calcutions).
Also note that there is extra header line (the underlines of the header) which you will need to deal with in pandas.

import pandas as pd

df = pd.read_fwf('mytxt.txt')
df.drop(index=0, inplace=True) # drop the underlines
print(df.head())
Output:
Cutfile Name Start Time End Time Total Time Status ... Dry Run.1 Cut Speed Throughput Feed Rate Operator 1 BWW34580401 B 11/2/2020 6:39:37 AM 11/2/2020 6:40:05 AM 0:00:28 2 ... 0.000 0.000 0.000 0.000 Unknown 2 BWW34580401 B 11/2/2020 6:40:20 AM 11/2/2020 7:11:45 AM 0:31:25 3 ... 0.000 394.513 168.167 40.000 Unknown 3 PRECUT.$$$ 11/2/2020 7:21:07 AM 11/2/2020 7:21:22 AM 0:00:15 3 ... 0.000 354.574 173.032 40.000 Unknown 4 LFRF0106 30D A2 11/2/2020 7:32:59 AM 11/2/2020 8:00:49 AM 0:27:50 3 ... 0.000 430.553 200.613 40.000 Unknown 5 LFRF0106 30D A1 11/2/2020 8:19:21 AM 11/2/2020 9:08:27 AM 0:49:06 3 ... 0.000 430.338 200.812 40.000 Unknown [5 rows x 27 columns]
Reply
#17
(Jan-02-2021, 01:22 AM)SunWers Wrote: I could set the script up to be more automatic if the files were .csv format, correct?
It will be "automatic" all the same. Just the code will be slightly different, e.g. pandas.read_fwf vs pandas.read_csv or if you parse yourself it will allow to use csv, but in any case it will not do big difference.

Given that pandas did good job reading the file, concentrate on writing to excel. You have 2 options - write to excel from pandas using to_excel, then format the file to look like what you want using other package like openpyxl or write each column from the dataframe in the template you created one by one. In any case you need to handle start and end time because date and time are one column, while you want them separate columns.
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply
#18
A .csv file is a text file.

Something has to tell the csv reader what belongs in what column.

Below, the separator is a comma.

The comma was put there when I saved a Libre Office spreadsheet as .csv

Here is a peek:

Quote:Photo,phone,number,name,attendance,attn%,Final Score,points%,points,hwW16,cwW16,Week 16,hwW15,cwW15,Week 15,hwW14,cwW14,Week 14,hwW13,cwW13,Week 13,hwW12,cwW12,Week 12,hwW11,cwW11,Week 11,hwW10,cwW10,Week 10,hwW9,cwW9,Week 9
,,,,max Attendance,,,Max Score,425,,,,50,25,,50,25,,50,25,,50,25,,50,25,,0,25,,0,25,
,,,23名,8,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Photo,phone,学号,姓名,出勤,出勤率,期末分数,得分率,得分,HW,CW,Week 16,HW,CW,Week 15,HW,CW,Week 14,HW,CW,Week 13,HW,CW,Week 12,HW,CW,Week 11,HW,CW,Week 10,HW,CW,Week 9
,,1921030146,聂林,0,0,0,0,0,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,,1921090147,郜阳杰,0,0,0,0,0,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,,1922190144,汪毓韬,0,0,0,0,0,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,,1923010312,张长乐,7,100,100,100,425,,,1,50,25,1,50,25,1,50,25,1,50,25,1,50,25,1,0,25,1,0,25,1

When I open this .csv file with Libre Office, LO looks for the separator. Can be tab, comma, dunno what.

This reads the above file and also can save the data I want, better than csv.reader() (I think):

import pandas as pd

# Read the file

file1 = '/home/pedro/winter2020/20PY/dumpFiles/attendanceWinter2020_20PYlatestPyMade_test1.csv'

dg = pd.read_csv(file1, usecols=['number', 'name', 'Final Score'])

# write the file

file2 = '/home/pedro/winter2020/20PY/dumpFiles/panda_bear.csv'

dg.to_csv(file2, encoding='utf8', index=False, sep=',')
Reply
#19
How to Convert a TXT file to CSV
  • Open Excel and create a new spreadsheet.
  • Select the Data tab.
  • On the far right, click “Get External Data”, then select the “From Text” option.
  • Find the TXT file on your computer and click “Open”.
  • In the first step of the Import Wizard, select “Delimited”
  • Make sure to select that your data has headers (check the box) then click next.
  • In the second step, select “Tab” if you’re working with the Amazon All Products Report, or “Comma” if you’re using another file. The best way to tell is to take a look at the contents of the file and see how they’re divided. If there is just a bunch of space between one field and another, then Tab is the right choice. If you look at the file and there are commas after each entry, then Comma is the right choice.

Regards,
Caleb
Reply
#20
@caleb_cruze, what does this have to do with OP question on how to automate report creation using python. Based on discussion so far she is perfectly good at doing the report manually :)
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Converting a json file to a dataframe with rows and columns eyavuz21 13 4,457 Jan-29-2023, 03:59 PM
Last Post: eyavuz21
  [split] Help- converting file with pyton script eltomassito 6 3,253 Jul-02-2021, 05:29 PM
Last Post: snippsat
  Help- converting file with pyton script grinleon 3 2,486 Sep-23-2020, 11:48 AM
Last Post: grinleon

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020