Posts: 18
Threads: 5
Joined: Dec 2020
Jan-01-2021, 06:29 AM
(This post was last modified: Jan-01-2021, 06:29 AM by SunWers.)
Hi,
I'm needing a little help getting a text/tab delimited file imported in with python. I think I'm missing how to import the columns... This is what my text file looks like. I was thinking it was a tab delimited file but when I import it in from out Gerber cutters it comes in as a text file. So maybe I'm wrong somewhere there...
I run the code below to try and convert it to .csv so I can convert it to .xlsx
import pandas as pd
save_path = "C:/Users/vhubb/Documents/Python Scripts"
in_filename = os.path.join(save_path,'mytxt.txt')
out_filename = os.path.join(save_path,'Output.csv')
df = pd.read_csv(in_filename, sep=";")
df.to_csv(out_filename, index=False) But when I run this code it converts all columns into one
I'm thinking it's because of this part of the code with the ;
df = pd.read_csv(in_filename, sep=";") Or am I going completely wrong with what I'm thinking. I'm trying to import this text file with specific columns into a .xlsx file that I created with python. The same .xlsx file (Cutter A - Marker Report.xlsx) that I mentioned in an earlier post. I was thinking I needed to convert it to a .csv file that I would then need to convert to a .xlsx file. From there I was going to figure out how to import it in to the correct columns for the Cutter A - Marker Report.xlsx that I have created.
Thanks,
Veronica
Posts: 64
Threads: 16
Joined: Nov 2019
(Jan-01-2021, 06:29 AM)SunWers Wrote: I'm thinking it's because of this part of the code with the ;
Easiest way to find out is to print out the dataframe and see how many columns you get. I can't see any ';' delimiters in your text file, though. The solution may be to use sep = None, this will force the Python engine which can automatically detect the separator (read the docs for details).
Pandas can also directly write to an Excel file, fyi.
Posts: 8,156
Threads: 160
Joined: Sep 2016
(Jan-01-2021, 06:29 AM)SunWers Wrote: I was thinking I needed to convert it to a .csv file that I would then need to convert to a .xlsx file. No need to convert to csv. Parse the input file and write to excel file. Your excel file has specific/custom format, so I guess you need to write using package like openpyxl , just using pandas.DataFrame.to_excel() will not work for you.
(Jan-01-2021, 06:29 AM)SunWers Wrote: I'm thinking it's because of this part of the code with the ;
you think it's tab-delimied, but use sep=";" ? try with sep='\t' or let pandas engine to guess as suggested.
Posts: 18
Threads: 5
Joined: Dec 2020
(Jan-01-2021, 06:52 AM)palladium Wrote: (Jan-01-2021, 06:29 AM)SunWers Wrote: I'm thinking it's because of this part of the code with the ;
Easiest way to find out is to print out the dataframe and see how many columns you get. I can't see any ';' delimiters in your text file, though. The solution may be to use sep = None, this will force the Python engine which can automatically detect the separator (read the docs for details).
Pandas can also directly write to an Excel file, fyi.
Hi Palladium,
I just tried adding "sep = None" and this is the error message I get...
runfile('C:/Users/vhubb/Documents/Python Scripts/temp.py', wdir='C:/Users/vhubb/Documents/Python Scripts')
C:\Users\vhubb\Documents\Python Scripts\temp.py:37: ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support regex separators (separators > 1 char and different from '\s+' are interpreted as regex); you can avoid this warning by specifying engine='python'.
df = pd.read_csv(in_filename, sep="None") If I'm understanding you right, you said panda's can write directly to a .xlsx file? I was thinking it should be able to but I couldn't find anything about how to import the .txt file (corresponding columns from .txt to .xlsx) to an existing .xlsx file. So I was thinking it needed to be converted over first.
Thanks so much for everyone's help here.
Veronica
Posts: 18
Threads: 5
Joined: Dec 2020
Jan-01-2021, 07:39 AM
(This post was last modified: Jan-01-2021, 07:43 AM by SunWers.)
(Jan-01-2021, 07:27 AM)buran Wrote: (Jan-01-2021, 06:29 AM)SunWers Wrote: I was thinking I needed to convert it to a .csv file that I would then need to convert to a .xlsx file. No need to convert to csv. Parse the input file and write to excel file. Your excel file has specific/custom format, so I guess you need to write using package like openpyxl , just using pandas.DataFrame.to_excel() will not work for you.
(Jan-01-2021, 06:29 AM)SunWers Wrote: I'm thinking it's because of this part of the code with the ;
you think it's tab-delimied, but use sep=";" ? try with sep='\t' or let pandas engine to guess as suggested.
Hi buran,
I tried setting sep='\t' and it still didn't do the trick. I will work on your suggestion above. I haven't tried using openpyxl. So I need to use a combination of openpyxl and `pandas.DataFrame.to_excel()? Not sure how to set that up yet but I will work on that now and see what I can do.
Thanks,
Veronica
Posts: 8,156
Threads: 160
Joined: Sep 2016
Jan-01-2021, 07:41 AM
(This post was last modified: Jan-01-2021, 07:42 AM by buran.)
can you upload sample input file? Looking at the screenshot again, it is more likely a fixed-width file.
palladium likes this post
Posts: 18
Threads: 5
Joined: Dec 2020
(Jan-01-2021, 07:41 AM)buran Wrote: can you upload sample input file? Looking at the screenshot again, it is more likely a fixed-width file.
Sorry, crazy question.... how do I upload a file?
Posts: 64
Threads: 16
Joined: Nov 2019
(Jan-01-2021, 07:31 AM)SunWers Wrote: (Jan-01-2021, 06:52 AM)palladium Wrote: Easiest way to find out is to print out the dataframe and see how many columns you get. I can't see any ';' delimiters in your text file, though. The solution may be to use sep = None, this will force the Python engine which can automatically detect the separator (read the docs for details).
Pandas can also directly write to an Excel file, fyi.
Hi Palladium,
I just tried adding "sep = None" and this is the error message I get...
runfile('C:/Users/vhubb/Documents/Python Scripts/temp.py', wdir='C:/Users/vhubb/Documents/Python Scripts')
C:\Users\vhubb\Documents\Python Scripts\temp.py:37: ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support regex separators (separators > 1 char and different from '\s+' are interpreted as regex); you can avoid this warning by specifying engine='python'.
df = pd.read_csv(in_filename, sep="None")
It's not an error message. Its a warning to let you know that the pandas engine is using a non default method to parse your file (the default is C engine which does not guessing delimiters; pandas is telling you that it has switched to the python engine which I understand is slower but has some built in support for guessing delimiters). If you add the keyword argument engine = 'python' as per the warning message, the message should disappear.
An error message will tell you what type of error it is (e.g. "TypeError" or "ValueError") followed by a message.
If you don't see anything else it means the process is successful. Print out the resulting dataframe to verify.
As buran pointed out you can use pandas' built in to_excel() method, which dumps the dataframe into an xlsx file format. It is customisable to some extent,b ut as (s)he astutely pointed out, given your complex needs openpyxl may be a better bet.
Posts: 8,156
Threads: 160
Joined: Sep 2016
Make it a zip (it will not allow txt file because it can be copy/pasted, but in this case we want to check the delimiter) and attach it
Or use some third-party service and upload the file there. Then share a link here
Posts: 18
Threads: 5
Joined: Dec 2020
I'm thinking this might work. This is from my google drive
https://drive.google.com/file/d/1j-Uc3XL...sp=sharing
|