Python Forum
beginner text formatting single line to column

beginner text formatting single line to column - jafrost - Apr-23-2021

Hello, I'm brand new to Python. I have worked through an online Python course for beginners but have no other experience. For my first project I've been given what I assume is a pretty simple task, but it is going right over my head. I'm usually pretty good at tracking down examples or tutorials that get me where I need to go, but this one has me stuck. I am two days into online searches and tutorials, none of which seem to be what I need or get me any closer to a path forward. Any help or direction this group could provide would be very helpful.

Essentially, I have a .txt file that contains a single very long string that I need to rearrange so it's readable in a different program. Here is a much simplified example:

Original data file, all in a single string:
{"header1":[data1],"header2"[dataA, dataB, dataC], "header3":[dataX, dataY, dataZ], "header4":[[0, 1, 2]], "header5":[dataz]}

Needed format
#header1= data1
#header5= dataz
#data table= header4, header2, header3
0 dataA dataX
1 dataB dataY
2 dataC dataZ

As I said, any help or direction on where to look or what this type of reformatting is called would be very helpful. Thank you.


RE: beginner text formatting single line to column - ibreeden - Apr-24-2021

I think this is not just a text file; I think it is a JSON file. Please read the Python documentation for the json module ("JSON encoder and decoder") to see if this fits.
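
As a minimal sketch of what the json module does with such a string: the text below is a hypothetical stand-in for the file contents, with the placeholder values quoted and a colon added after "header2" so that it is valid JSON. The real file may differ.

```python
import json

# Hypothetical stand-in for the file contents: the sample from the question,
# with the placeholder values quoted and the missing colon added.
raw = (
    '{"header1": ["data1"], "header2": ["dataA", "dataB", "dataC"], '
    '"header3": ["dataX", "dataY", "dataZ"], "header4": [[0, 1, 2]], '
    '"header5": ["dataz"]}'
)

# json.loads() parses JSON text into ordinary Python objects (here a dict of lists).
data = json.loads(raw)

print(type(data).__name__)
print(data["header2"])
print(data["header4"][0])
```

When the JSON lives in a file, json.load() on an open file object does the same parsing without reading the text into a string first.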


RE: beginner text formatting single line to column - snippsat - Apr-24-2021

As mentioned by ibreeden, the data is JSON, or may already have been converted to a Python dictionary.
Saving it to a text file is something that should not be done, as it destroys the data structure.
It may be possible to get it back using e.g. eval(), but that is the wrong way.

So here are some fixes to make it work; it can then be loaded into Pandas to get the wanted format more easily.
# Some fixes so it works as a dictionary
>>> d = {"header1":['data1'], "header2":['dataA', 'dataB', 'dataC'], "header3":['dataX', 'dataY', 'dataZ'], "header4":[[0, 1, 2]], "header5":['dataz']}
>>> d
{'header1': ['data1'],
 'header2': ['dataA', 'dataB', 'dataC'],
 'header3': ['dataX', 'dataY', 'dataZ'],
 'header4': [[0, 1, 2]],
 'header5': ['dataz']}

>>> import pandas as pd
>>> 
# Load into Pandas; orient='index' makes each key a row and fills missing values with None
>>> df = pd.DataFrame.from_dict(d, orient='index')
>>> df
                 0      1      2
header1      data1   None   None
header2      dataA  dataB  dataC
header3      dataX  dataY  dataZ
header4  [0, 1, 2]   None   None
header5      dataz   None   None
>>> 
# Transpose index to columns 
>>> df.transpose()
  header1 header2 header3    header4 header5
0   data1   dataA   dataX  [0, 1, 2]   dataz
1    None   dataB   dataY       None    None
2    None   dataC   dataZ       None    None
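
To get from that dictionary to the layout asked for in the first post, one option is plain string formatting. This is only a sketch: it assumes header1 and header5 are the single-value entries, and that the 0, 1, 2 column comes from header4 (which is nested one level deeper than the other lists). The names scalar_keys, columns and lines are made up for the example.

```python
# The dictionary from the session above (placeholder values quoted as strings).
d = {
    "header1": ["data1"],
    "header2": ["dataA", "dataB", "dataC"],
    "header3": ["dataX", "dataY", "dataZ"],
    "header4": [[0, 1, 2]],
    "header5": ["dataz"],
}

# Assumption: these keys hold one value each and become "#name= value" lines.
scalar_keys = ["header1", "header5"]

# Assumption: these lists have equal length and form the table columns.
# header4 is a list inside a list, so the inner list is taken with [0].
columns = {
    "header4": d["header4"][0],
    "header2": d["header2"],
    "header3": d["header3"],
}

lines = [f"#{key}= {d[key][0]}" for key in scalar_keys]
lines.append("#data table= " + ", ".join(columns))

# zip() walks the column lists in parallel, yielding one tuple per table row.
for row in zip(*columns.values()):
    lines.append(" ".join(str(value) for value in row))

output = "\n".join(lines)
print(output)
```

The resulting string can be written to a file with an ordinary open(..., "w") and write() call.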



RE: beginner text formatting single line to column - jafrost - Apr-26-2021

Thanks for the pointers. It was a JSON file originally.


RE: beginner text formatting single line to column - jafrost - Apr-28-2021

Thank you all very much, with your help I've made considerable progress and learned a lot. I have opened the JSON directly and moved it into a dictionary. However, I'm still doing something wrong with the conversion from the dictionary to the dataframe as it will not transpose the data.

#Build dictionary
d={"freq":[FREQ], "sw":[sw], "ref":[ref], "spec":[spec], "WholeEcho":[WE], "Time":[T], "dataReal":[RD], "dataImag":[ID]}

#import pandas
import pandas as pd

# Load into Pandas,orient change so it fill(None) for missing values
df = pd.DataFrame.from_dict(d, orient='index')

#transpose index to columns
df.T
print(df)

FREQ, sw, spec, and WholeEcho are all floats. ref is a string, and T, RD, and ID are lists. My intention is that T, RD, and ID are lists of floats, but I'm not sure if I was successful.

The output is:

0
freq 1.23456e+08
sw 10000
ref [NAN]
spec 0
WholeEcho 0
Time [[0.00, 0.01, 0.02, 0.03, 0.04, 0.05]]
dataReal [[4, 6, 8, 9, 10, 12]]
dataImag [[1, 2, 3, 5, 7, 11]]

But ideally what I'd like to see in the output is...

freq 1.23456e+08
sw 10000
ref [NAN]
spec 0
WholeEcho 0

Time dataReal dataImag
0.00 4 1
0.01 6 2
0.02 8 3
0.03 9 5
0.04 10 7
0.05 12 11

Thank you for the help and direction.
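
Two things in the posted code are worth checking. First, df.T returns a new, transposed DataFrame and leaves df unchanged, so print(df) still shows the original; the result has to be assigned or printed directly. Second, wrapping each list in another pair of brackets ([T], [RD], [ID]) puts the whole list into a single cell, so there is nothing to spread over rows. A possible approach, sketched below with made-up values standing in for the variables in the post, is to keep the scalars out of the DataFrame and build the table only from the lists.

```python
import pandas as pd

# Hypothetical values standing in for the variables in the post above.
FREQ, sw, ref, spec, WE = 1.23456e8, 10000.0, "[NAN]", 0.0, 0.0
T = [0.00, 0.01, 0.02, 0.03, 0.04, 0.05]
RD = [4, 6, 8, 9, 10, 12]
ID = [1, 2, 3, 5, 7, 11]

# Scalars stay out of the DataFrame and are printed as "name value" lines.
scalars = {"freq": FREQ, "sw": sw, "ref": ref, "spec": spec, "WholeEcho": WE}

# The lists are passed as-is (T, not [T]) so each element gets its own row.
df = pd.DataFrame({"Time": T, "dataReal": RD, "dataImag": ID})

for name, value in scalars.items():
    print(name, value)
print()
# index=False leaves out the 0..5 row labels.
print(df.to_string(index=False))
```

If T, RD and ID are themselves nested (the double brackets in the printed output suggest they might be), T[0], RD[0] and ID[0] would be the lists to pass instead.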