Python Forum
Trying to parse only 3 key values from json file
#1
I'm playing around with parsing a JSON file in Python.
I'm able to read in the file and print it to the console, but now I want to extract 3 values from each "section" (I'm not sure what the proper terminology is).

Here is an example of the data structure:
  "messages": [
    {
      "sender_name": "Me",
      "timestamp_ms": 1653260883178,
      "content": "There are plenty of leftovers",
      "type": "Generic",
      "is_unsent": false,
      "is_taken_down": false,
      "bumped_message_metadata": {
        "bumped_message": "There are plenty of leftovers",
        "is_bumped": false
      }
    },
    {
      "sender_name": "Me",
      "timestamp_ms": 1653260872966,
      "content": "Watching the new scream movie",
      "type": "Generic",
      "is_unsent": false,
      "is_taken_down": false,
      "bumped_message_metadata": {
        "bumped_message": "Watching the new scream movie",
        "is_bumped": false
      }
    },
I basically need to pull out only the first 3 key/value pairs of each message and save them to a CSV file:

      "sender_name": "Me",
      "timestamp_ms": 1653260883178,
      "content": "There are plenty of leftovers",
Right now I have this basic code, but I need to figure out how to get inside the "messages" section and pull out those 3 values per group.

import json

# Load the exported conversation; the file is closed automatically
with open('message_1.json') as f:
    data = json.load(f)

# Print each message dict in full
for i in data['messages']:
    print(i)
#2
Have a look at dicts
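Each entry in data["messages"] is a plain dict, so the three values can be read by key. A minimal sketch, assuming the structure shown in the first post; the helper name `pick_fields` is made up for illustration, and `.get()` is used so that a missing key yields `None` instead of raising `KeyError`:

```python
import json

# Sample trimmed from the structure in the question
json_str = """
{"messages": [
    {"sender_name": "Me", "timestamp_ms": 1653260883178,
     "content": "There are plenty of leftovers", "type": "Generic"},
    {"sender_name": "Me", "timestamp_ms": 1653260872966,
     "content": "Watching the new scream movie", "type": "Generic"}
]}
"""

WANTED = ("sender_name", "timestamp_ms", "content")


def pick_fields(message, keys=WANTED):
    """Return only the wanted keys from one message dict (hypothetical helper)."""
    return {key: message.get(key) for key in keys}


data = json.loads(json_str)
rows = [pick_fields(m) for m in data["messages"]]
for row in rows:
    print(row)
```

Each item in `rows` is then a small dict holding just the sender, timestamp, and content.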
#3
import json
from datetime import datetime

json_str = """
{
    "messages": [
        {
            "sender_name": "Me",
            "timestamp_ms": 1653260883178,
            "content": "There are plenty of leftovers",
            "type": "Generic",
            "is_unsent": false,
            "is_taken_down": false,
            "bumped_message_metadata": {
                "bumped_message": "There are plenty of leftovers",
                "is_bumped": false
            }
        },
        {
            "sender_name": "Me",
            "timestamp_ms": 1653260872966,
            "content": "Watching the new scream movie",
            "type": "Generic",
            "is_unsent": false,
            "is_taken_down": false,
            "bumped_message_metadata": {
                "bumped_message": "Watching the new scream movie",
                "is_bumped": false
            }
        }
    ]
}
"""

data = json.loads(json_str)

for message in data["messages"]:
    timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)
    print(
        f"""{timestamp} from {message["sender_name"]}\n{message["content"]}\n"""
    )
Output:
2022-05-22 18:08:03.178000 from Me
There are plenty of leftovers

2022-05-22 18:07:52.966000 from Me
Watching the new scream movie
#4
Here is what I have, and it seems to work. Now I'm trying to save this to a CSV so I can test importing it into my Excel report.

import json
from datetime import datetime

with open('messages.json') as f:
    data = json.load(f)

for message in data["messages"]:
    timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)

    # Some messages have no "content" key, so print only the header line
    if 'content' not in message:
        print(
            f"""{timestamp} from {message["sender_name"]}\n"""
        )
    else:
        print(
            f"""{timestamp} from {message["sender_name"]}\n{message["content"]}\n"""
        )
#5
What am I doing wrong?

import json
from datetime import datetime
import pandas as pd


f = open('message_1.json')

data = json.load(f)

for message in data["messages"]:
    timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)
    
    if 'content' not in message:
        rw = pd.DataFrame([timestamp,message["sender_name"],pd.NA], columns=['Date', 'Name', 'Comment'])

    else:
        rw = pd.DataFrame([timestamp,message["sender_name"], message["content"]], columns=['Date', 'Name', 'Comment'])


rw.to_csv('igMess.csv',columns=["Date", "Name", "Comment"], header=None, index=None, mode='a')

f.close()
I get this error:

Error:
ValueError: Shape of passed values is (3, 1), indices imply (3, 3)
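The error most likely comes from the shape of the data: a flat list of three values is read by `pd.DataFrame` as three rows of one column, while `columns=` names three columns. Wrapping the values in an outer list (one inner list per row) gives the expected shape. Note also that `rw` is overwritten on every pass of the loop, so only the last message would reach `to_csv`. Below is a sketch of collecting one row per message first, using plain lists. The sample data is assumed from the first post, the second message without "content" is hypothetical, and `build_rows` is an illustrative name:

```python
from datetime import datetime

messages = [
    {"sender_name": "Me", "timestamp_ms": 1653260883178,
     "content": "There are plenty of leftovers"},
    {"sender_name": "Me", "timestamp_ms": 1653260872966},  # hypothetical: no "content"
]


def build_rows(messages):
    """Return a list of [date, name, comment] lists, one per message."""
    rows = []
    for message in messages:
        timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)
        rows.append([timestamp, message["sender_name"], message.get("content")])
    return rows


rows = build_rows(messages)
# Shape is (number of messages, 3), which matches three column names.
print(len(rows), len(rows[0]))
# With pandas installed, this list of lists can be passed in one call:
#   pd.DataFrame(rows, columns=["Date", "Name", "Comment"])
```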
#6
OK, I got past the error and a file was generated, but I'm not sure how to split the timestamp so that the date and the time are separate columns in the CSV.

import json
from datetime import datetime
import pandas as pd


f = open('message_1.json')

data = json.load(f)

lv = []

for message in data["messages"]:
    timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)
    
    date_val = timestamp.strftime('%Y-%m-%d')
    
    if 'content' not in message:
        st = date_val +","+message["sender_name"]+","+""
        lv.append(st)

    else:
        st = date_val +","+message["sender_name"]+","+message["content"]
        lv.append(st)

df = pd.DataFrame(lv)

df.to_csv('igMess.csv', header=None, index=None, mode='a')

f.close()
The file generated by the code above contains this output:

"2022-05-22,Me,There are plenty of leftovers"
"2022-05-22,Me,Watching the new scream movie"
The expected results should look like this:

5/17/22, 5:28 PM,Me: There are plenty of leftovers
5/17/22, 5:28 PM,Me: Watching the new scream movie
Notice that the generated results have quotes ("") around each row and are missing the time (5:28 PM).
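The date and the time can be produced as two separate strings from the same `datetime`. Below is a sketch assuming the target format from the expected results (`5/17/22` and `5:28 PM`, no zero padding). The pieces are assembled by hand because the unpadded `strftime` codes differ between platforms, and `split_timestamp` is an illustrative name:

```python
from datetime import datetime


def split_timestamp(ts):
    """Return (date, time) strings without zero padding on month, day, or hour."""
    date_val = f"{ts.month}/{ts.day}/{ts:%y}"
    hour12 = ts.hour % 12 or 12          # 0 -> 12, 13 -> 1, ...
    suffix = "AM" if ts.hour < 12 else "PM"
    time_val = f"{hour12}:{ts.minute:02d} {suffix}"
    return date_val, time_val


# In the script, ts would come from
# datetime.fromtimestamp(message["timestamp_ms"] / 1000), in local time.
date_val, time_val = split_timestamp(datetime(2022, 5, 22, 18, 8, 3))
print(date_val, time_val)
```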
#7
OK, I got the time added and working, so now the only question is how to remove the quotes ("") around each row in the file.

Here is the currently working code:

import json
from datetime import datetime
import pandas as pd

f = open('message_1.json')

data = json.load(f)
lv = []

for message in data["messages"]:
    timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)
    
    date_val = timestamp.strftime('%Y-%m-%d')
    time_val = timestamp.strftime("%I:%M %p")
    
    if 'content' not in message:
        st = date_val + "," + time_val + "," + message["sender_name"] + "," + ""
        lv.append(st)

    else:
        st = date_val +"," + time_val + "," + message["sender_name"] + "," + message["content"]
        lv.append(st)

df = pd.DataFrame(lv)

df.to_csv('igMess.csv', header=None, index=None, mode='a')

f.close()
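The quotes probably appear because each row is a single string that contains commas, so `to_csv` treats it as one field and quotes it. Keeping the values as separate fields avoids that. Below is a sketch using the standard `csv` module rather than pandas (a swapped-in approach, not what the post uses), writing to an in-memory buffer so it runs without a file. The sample rows are assumed from the earlier output:

```python
import csv
import io

rows = [
    ["2022-05-22", "06:08 PM", "Me", "There are plenty of leftovers"],
    ["2022-05-22", "06:07 PM", "Me", "Watching the new scream movie"],
]

buffer = io.StringIO()           # stand-in for open('igMess.csv', 'a', newline='')
writer = csv.writer(buffer)
writer.writerows(rows)           # one list per row, one element per column

output = buffer.getvalue()
print(output)
```

The same idea should carry over to pandas: build the DataFrame from a list of lists so that each value lands in its own column.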
#8
I have run this a few times since the post above and found a few things that I'm hoping to fix. If a message is very large, it gets split up in my CSV file; I only want the CSV to have 4 columns.

Here is the current code that does work; it just needs some adjustments to make sure the "content" column holds the whole message and is not split. When I ran this code today against the newest JSON file, I found data in 4 or 6 other columns; data for certain rows was spread across columns A through M.

import json
from datetime import datetime
import pandas as pd
import os
import csv

f = open('message_1.json')

data = json.load(f)

lv = []

for message in data["messages"]:
    timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)
    
    lv.append([
        timestamp.strftime("%m/%d/%Y"),
        timestamp.strftime("%I:%M:%S %p"),
        message["sender_name"],
        message["content"] if "content" in message else "Media Link"])  

df = pd.DataFrame(lv, columns=["Date", "Time", "Sender", "Content"])

df.to_csv('igMess.csv', header=None, index=None, quoting=csv.QUOTE_NONE,  escapechar=",", mode='a')

f.close()
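With `quoting=csv.QUOTE_NONE` and `escapechar=","`, a comma inside a message is not protected by quotes, which would explain content spilling into extra columns; line breaks inside a message can have a similar effect. Below is a sketch, assuming the default `csv.QUOTE_MINIMAL` quoting is acceptable, that writes a message containing commas and a line break and reads it back to confirm it is still four columns. The sample message text is invented:

```python
import csv
import io

row = ["05/22/2022", "06:08:03 PM", "Me",
       "Leftovers: pasta, salad,\nand cake"]   # invented sample content

buffer = io.StringIO()
# QUOTE_MINIMAL (the default) quotes only the fields that need it
csv.writer(buffer, quoting=csv.QUOTE_MINIMAL).writerow(row)

# Read the line back the way a CSV-aware tool would
buffer.seek(0)
parsed = next(csv.reader(buffer))
print(len(parsed))
```

In the pandas call, this would correspond to dropping the `quoting` and `escapechar` arguments, since `to_csv` uses minimal quoting by default.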
#9
What are all the possible keys that contain content values? How should the content values be combined?
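One way to answer the first question is to tally which keys actually occur across all messages. Below is a sketch using `collections.Counter`; the sample list is assumed from the structure in the first post, with one hypothetical message lacking "content":

```python
from collections import Counter

messages = [
    {"sender_name": "Me", "timestamp_ms": 1653260883178,
     "content": "There are plenty of leftovers", "type": "Generic"},
    # Hypothetical message without a "content" key
    {"sender_name": "Me", "timestamp_ms": 1653260872966, "type": "Generic"},
]

# Count how many messages carry each top-level key
key_counts = Counter(key for message in messages for key in message)

for key, count in key_counts.most_common():
    print(f"{key}: {count} of {len(messages)} messages")
```

Running this over `data["messages"]` from the real file would show which keys appear in place of "content".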