Hello, I'm brand new to Python. I have worked through an online Python course for beginners but have no other experience. For my first project I've been given what I assume is a pretty simple task, but it is going right over my head. I'm usually pretty good at tracking down examples or tutorials that get me where I need to go, but this one has me stumped. I am two days into online searches and tutorials, none of which seem to be what I need or get me any closer to a path forward. Any help or direction this group could provide would be much appreciated.
Essentially, I have a .txt file that contains a single very long string that I need to rearrange so it's readable in a different program. Here is a much-simplified example:
Original data file, all in a single string:
{"header1":[data1],"header2"[dataA, dataB, dataC], "header3":[dataX, dataY, dataZ], "header4":[[0, 1, 2]], "header5":[dataz]}
Needed format
#header1= data1
#header5= dataz
#data table= header5, header2, header3
0 dataA dataX
1 dataB dataY
2 dataC dataZ
As I said, any help or direction on where to look or what this type of reformatting is called would be very helpful. Thank you.
I think this is not just a text file; I think it is a JSON file. Please read the Python documentation for the json module, "JSON encoder and decoder", to see if this fits.
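A minimal sketch of what reading such data with the json module could look like. It assumes the real file holds valid JSON; the simplified example above would need its values quoted and a colon after "header2", so the string below is a made-up, corrected version used only for illustration.

```python
import json

# Hypothetical, valid-JSON version of the simplified example string.
raw = ('{"header1": ["data1"], "header2": ["dataA", "dataB", "dataC"], '
       '"header3": ["dataX", "dataY", "dataZ"], "header4": [[0, 1, 2]], '
       '"header5": ["dataz"]}')

# json.loads turns JSON text into ordinary Python dicts and lists.
data = json.loads(raw)

print(type(data).__name__)   # dict
print(data["header2"])       # ['dataA', 'dataB', 'dataC']
print(data["header4"][0])    # [0, 1, 2]
```

When the text lives in a file, json.load(f) on the opened file object does the same job without reading the string by hand.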
As ibreeden mentioned, the data is JSON, or it may already have been converted to a Python dictionary.
Saving it to a text file like this should be avoided, as it destroys the data structure.
It may be possible to get it back using e.g. eval(), but that is the wrong way.
So here are a few fixes to make it work as a dictionary; it can then be loaded into Pandas, which makes it easier to get the format you want.
# A few fixes so that it works as a dictionary
>>> d = {"header1":['data1'], "header2":['dataA', 'dataB', 'dataC'], "header3":['dataX', 'dataY', 'dataZ'], "header4":[[0, 1, 2]], "header5":['dataz']}
>>> d
{'header1': ['data1'],
'header2': ['dataA', 'dataB', 'dataC'],
'header3': ['dataX', 'dataY', 'dataZ'],
'header4': [[0, 1, 2]],
'header5': ['dataz']}
>>> import pandas as pd
>>>
# Load into Pandas; orient='index' fills missing values with None
>>> df = pd.DataFrame.from_dict(d, orient='index')
>>> df
                 0      1      2
header1      data1   None   None
header2      dataA  dataB  dataC
header3      dataX  dataY  dataZ
header4  [0, 1, 2]   None   None
header5      dataz   None   None
>>>
# Transpose index to columns
>>> df.transpose()
  header1 header2 header3    header4 header5
0   data1   dataA   dataX  [0, 1, 2]   dataz
1    None   dataB   dataY       None    None
2    None   dataC   dataZ       None    None
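From a dictionary like this, the layout asked for in the first post could be written out roughly as follows. This is only a sketch in plain Python rather than Pandas, and it assumes the first table column is meant to be header4, since that is where the values 0, 1, 2 live; which keys count as single-value headers is hard-coded here as an assumption.

```python
# The dictionary from the example above.
d = {"header1": ["data1"], "header2": ["dataA", "dataB", "dataC"],
     "header3": ["dataX", "dataY", "dataZ"], "header4": [[0, 1, 2]],
     "header5": ["dataz"]}

lines = []

# Single-value entries become "#name= value" header lines.
for key in ("header1", "header5"):
    lines.append(f"#{key}= {d[key][0]}")

# Column order for the table; header4 holds the row numbers as a nested list.
columns = ["header4", "header2", "header3"]
lines.append("#data table= " + ", ".join(columns))

# zip walks the three lists in step, giving one tuple per table row.
for row in zip(d["header4"][0], d["header2"], d["header3"]):
    lines.append(" ".join(str(value) for value in row))

print("\n".join(lines))
```

Writing "\n".join(lines) to a file instead of printing it would give the text file for the other program.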
Thanks for the pointers. It was a JSON file originally.
Thank you all very much, with your help I've made considerable progress and learned a lot. I have opened the JSON directly and moved it into a dictionary. However, I'm still doing something wrong with the conversion from the dictionary to the dataframe as it will not transpose the data.
#Build dictionary
d={"freq":[FREQ], "sw":[sw], "ref":[ref], "spec":[spec], "WholeEcho":[WE], "Time":[T], "dataReal":[RD], "dataImag":[ID]}
#import pandas
import pandas as pd
# Load into Pandas; orient='index' fills missing values with None
df = pd.DataFrame.from_dict(d, orient='index')
#transpose index to columns
df.T
print(df)
FREQ, sw, spec, and WholeEcho are all floats. ref is a string, and T, RD, and ID are lists. My intention is that T, RD, and ID are lists of floats, but I'm not sure if I was successful.
The output is:
0
freq 1.23456e+08
sw 10000
ref [NAN]
spec 0
WholeEcho 0
Time [[0.00, 0.01, 0.02, 0.03, 0.04, 0.05]]
dataReal [[4, 6, 8, 9, 10, 12]]
dataImag [[1, 2, 3, 5, 7, 11]]
But ideally what I'd like to see in the output is...
freq 1.23456e+08
sw 10000
ref [NAN]
spec 0
WholeEcho 0
Time dataReal dataImag
0.00 4 1
0.01 6 2
0.02 8 3
0.03 9 5
0.04 10 7
0.05 12 11
Thank you for the help and direction.
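Two things seem to be going on in the code above, sketched here under some assumptions. First, df.T returns a new, transposed DataFrame rather than changing df in place, so print(df) still shows the original. Second, wrapping every value in a list gives one row per key, with each whole list sitting in a single cell. The sketch below keeps the single values apart from the list columns. It assumes T, RD and ID are flat lists of equal length (the double brackets in the output suggest they may be nested, in which case T[0] and so on would be needed first), and the values are stand-ins copied from the output above.

```python
import pandas as pd

# Hypothetical stand-ins for the values read from the JSON file.
FREQ, sw, ref, spec, WE = 1.23456e8, 10000.0, "NAN", 0.0, 0.0
T = [0.00, 0.01, 0.02, 0.03, 0.04, 0.05]
RD = [4.0, 6.0, 8.0, 9.0, 10.0, 12.0]
ID = [1.0, 2.0, 3.0, 5.0, 7.0, 11.0]

# Single values go in a plain dict; they are printed as header lines.
scalars = {"freq": FREQ, "sw": sw, "ref": ref, "spec": spec, "WholeEcho": WE}

# T, RD and ID are already lists, so they are passed as-is (not [T]),
# which gives one column per key and one row per sample.
table = pd.DataFrame({"Time": T, "dataReal": RD, "dataImag": ID})

for name, value in scalars.items():
    print(f"{name} {value}")

# to_string(index=False) prints the table without the 0..5 row labels.
print(table.to_string(index=False))
```

If a transposed frame is still wanted somewhere, the result has to be kept, e.g. df = df.T, since the call on its own line discards it.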