Python Forum

Hi!

I have a problem that I can't solve by myself. I want to read a big text file (>300 MB) into a DataFrame. The file consists of coordinates separated by tabs (\t). There are 4,000 columns and 5,000 rows.

When I read the file in, Python creates only one big column with 20,000 entries, so the dimensions are 20,000 x 1.

I don't get it, as there must obviously be line breaks (\n) in the text file.

Can someone help me please?

At first I thought something was going wrong with interpreting the end-of-line character(s), but then you would have one big row.
Instead, you say you get one big column. In that case something is going wrong with the interpretation of the tab character.

Without more information it is hard to say more about this. I suggest you show us a sample of the code that reads the file and produces the column.
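In the meantime, one way to check which separators the file really contains is to look at the raw bytes of the first few lines. This is only a minimal sketch: the helper name peek_lines and the small temporary sample file are made up for illustration, so point the function at your own file instead.

```python
import os
import tempfile


def peek_lines(path, n=3):
    """Return the first n lines as raw bytes, so tabs and line endings stay visible."""
    with open(path, "rb") as fh:
        return [fh.readline() for _ in range(n)]


# Hypothetical sample standing in for the real 300 MB file.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"1.0\t2.0\t3.0\n4.0\t5.0\t6.0\n")
    sample_path = tmp.name

raw_lines = peek_lines(sample_path, n=2)
os.remove(sample_path)

for line in raw_lines:
    # repr() shows \t, \r and \n explicitly instead of rendering them.
    print(repr(line), "tabs:", line.count(b"\t"))
```

If the tab count per line is zero, the values are separated by something else (spaces, for example) and the reader has to be told about that instead.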

Maybe you can use the following code to create a (numpy) array?

import pandas as pd

PATH = '.'  # folder that contains the file; adjust to your setup
MyFile = 'file.txt'
# Tab-delimited file without a header row
MyResults = pd.read_csv(PATH + '/' + MyFile, header=None, delimiter='\t')
MyArray = MyResults.to_numpy()

Please show us your code as well as sample data (only a few lines). Obviously, if it is a tab-delimited file you need to specify this, because the default separator in pd.read_csv is a comma.
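To illustrate the point about the separator, here is a small sketch using an in-memory sample rather than the real file (the sample string is made up). It shows one way a single-column result can arise: with the default comma separator each line is kept as one field, while sep='\t' splits it into columns. Whether this is what happened in the original code is an assumption until we see it.

```python
import io

import pandas as pd

# Made-up tab-separated sample: 2 rows, 3 columns.
sample = "1.0\t2.0\t3.0\n4.0\t5.0\t6.0\n"

# Default separator is a comma, so each whole line ends up in one column.
default_df = pd.read_csv(io.StringIO(sample), header=None)

# Telling read_csv about the tab separator gives one column per value.
tab_df = pd.read_csv(io.StringIO(sample), header=None, sep="\t")

print("default separator:", default_df.shape)
print("tab separator:", tab_df.shape)
```

Comparing the two shapes on a few lines of the real data is a quick check before loading the full file.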

G_rizzle

ibreeden

paul18fr

buran