Python Forum
Trying to parse only 3 key values from json file
#1
So I'm playing around with parsing a JSON file in Python.
I'm able to read in the file and print it to the console, but now I want to extract 3 values from each "section" (I'm not sure what the proper terminology is).

Here is an example of the data structure:
  "messages": [
    {
      "sender_name": "Me",
      "timestamp_ms": 1653260883178,
      "content": "There are plenty of leftovers",
      "type": "Generic",
      "is_unsent": false,
      "is_taken_down": false,
      "bumped_message_metadata": {
        "bumped_message": "There are plenty of leftovers",
        "is_bumped": false
      }
    },
    {
      "sender_name": "Me",
      "timestamp_ms": 1653260872966,
      "content": "Watching the new scream movie",
      "type": "Generic",
      "is_unsent": false,
      "is_taken_down": false,
      "bumped_message_metadata": {
        "bumped_message": "Watching the new scream movie",
        "is_bumped": false
      }
    },
I basically need to pull out only the first 3 key/value pairs from each message and save them into a CSV file.

      "sender_name": "Me",
      "timestamp_ms": 1653260883178,
      "content": "There are plenty of leftovers",
Right now I have this simple code, but I need to figure out how to get inside the "messages" section and pull out those 3 values per group.

import json

# The with block closes the file automatically.
with open('message_1.json') as f:
    data = json.load(f)

for i in data['messages']:
    print(i)
Reply
#2
Have a look at dicts
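To make that hint concrete, here is a minimal sketch, assuming the file has the shape shown in the first post. Each entry in "messages" is a dict, so the three values can be picked out by key. The names `raw`, `wanted`, `rows`, and `buffer` are made up for this illustration, and the sample is trimmed to two short records.

```python
import csv
import io
import json

# Two trimmed records in the same shape as the sample in the first post.
raw = """
{"messages": [
  {"sender_name": "Me", "timestamp_ms": 1653260883178,
   "content": "There are plenty of leftovers", "type": "Generic"},
  {"sender_name": "Me", "timestamp_ms": 1653260872966,
   "content": "Watching the new scream movie", "type": "Generic"}
]}
"""

data = json.loads(raw)
wanted = ["sender_name", "timestamp_ms", "content"]

# Each message is a dict, so message.get(key) returns the value, or None if the key is absent.
rows = [{key: message.get(key) for key in wanted} for message in data["messages"]]

# Written to an in-memory buffer here; a file opened with newline="" works the same way.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=wanted)
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Using `.get()` rather than `message[key]` is a design choice for records that may lack a key, such as messages without "content".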
Reply
#3
import json
from datetime import datetime

json_str = """
{
    "messages": [
        {
            "sender_name": "Me",
            "timestamp_ms": 1653260883178,
            "content": "There are plenty of leftovers",
            "type": "Generic",
            "is_unsent": false,
            "is_taken_down": false,
            "bumped_message_metadata": {
                "bumped_message": "There are plenty of leftovers",
                "is_bumped": false
            }
        },
        {
            "sender_name": "Me",
            "timestamp_ms": 1653260872966,
            "content": "Watching the new scream movie",
            "type": "Generic",
            "is_unsent": false,
            "is_taken_down": false,
            "bumped_message_metadata": {
                "bumped_message": "Watching the new scream movie",
                "is_bumped": false
            }
        }
    ]
}
"""

data = json.loads(json_str)

for message in data["messages"]:
    timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)
    print(
        f"""{timestamp} from {message["sender_name"]}\n{message["content"]}\n"""
    )
Output:
2022-05-22 18:08:03.178000 from Me
There are plenty of leftovers

2022-05-22 18:07:52.966000 from Me
Watching the new scream movie
Reply
#4
So here is what I have, and it seems to work. Now I'm trying to save this to a CSV so I can test importing it into my Excel report.

import json
from datetime import datetime

f = open('messages.json')

data = json.load(f)

for message in data["messages"]:
    timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)
    
    if 'content' not in message:
        print(
            f"""{timestamp} from {message["sender_name"]}\n"""
        )
    else:
        print(
            f"""{timestamp} from {message["sender_name"]}\n{message["content"]}\n"""
        )

f.close()
Reply
#5
What am I doing wrong?

import json
from datetime import datetime
import pandas as pd


f = open('message_1.json')

data = json.load(f)

for message in data["messages"]:
    timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)
    
    if 'content' not in message:
        rw = pd.DataFrame([timestamp,message["sender_name"],pd.NA], columns=['Date', 'Name', 'Comment'])

    else:
        rw = pd.DataFrame([timestamp,message["sender_name"], message["content"]], columns=['Date', 'Name', 'Comment'])


rw.to_csv('igMess.csv',columns=["Date", "Name", "Comment"], header=None, index=None, mode='a')

f.close()
I get this error:

Error:
ValueError: Shape of passed values is (3, 1), indices imply (3, 3)
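
A likely reading of this error: pandas treats a flat list of three values as three rows of one column, which cannot match the three names in `columns`. Wrapping the values in an outer list makes them one row of three columns. Note also that `rw` is reassigned on every pass through the loop and `to_csv` runs once after it, so only the last message would reach the file. The sketch below uses made-up sample values and collects all rows before building the frame once.

```python
import pandas as pd

# Hypothetical values standing in for one parsed message.
values = ["2022-05-22", "Me", "There are plenty of leftovers"]
columns = ["Date", "Name", "Comment"]

# A flat list is read as three rows of one column, which cannot match three column names.
try:
    pd.DataFrame(values, columns=columns)
    flat_list_failed = False
except ValueError:
    flat_list_failed = True

# A list of lists is read as rows: here one row with three columns.
one_row = pd.DataFrame([values], columns=columns)

# Collect every message first, then build the frame once after the loop.
rows = [values, ["2022-05-22", "Me", pd.NA]]
df = pd.DataFrame(rows, columns=columns)
print(df.shape)
```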
Reply
#6
OK, I got past the error and a file was generated, but I'm not sure how to split the timestamp so that the date and the time are separate in the CSV.

import json
from datetime import datetime
import pandas as pd


f = open('message_1.json')

data = json.load(f)

lv = []

for message in data["messages"]:
    timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)
    
    date_val = timestamp.strftime('%Y-%m-%d')
    
    if 'content' not in message:
        st = date_val +","+message["sender_name"]+","+""
        lv.append(st)

    else:
        st = date_val +","+message["sender_name"]+","+message["content"]
        lv.append(st)

df = pd.DataFrame(lv)

df.to_csv('igMess.csv', header=None, index=None, mode='a')

f.close()
The file generated by the code above contained this output:

"2022-05-22,Me,There are plenty of leftovers"
"2022-05-22,Me,Watching the new scream movie"
The expected results should look like this:

5/17/22, 5:28 PM,Me: There are plenty of leftovers
5/17/22, 5:28 PM,Me: Watching the new scream movie
If you notice, the generated results have quotes ("") around each row and are missing the 5:28 PM time.
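
For the date and time layout in the expected results, one possible approach is to format the two parts separately. This is only a sketch: the fixed `timestamp` below is a made-up value so the result does not depend on the local time zone, and the unpadded forms are built by hand because `strftime` pads with zeros.

```python
from datetime import datetime

# Hypothetical fixed value; in the thread's code this comes from
# datetime.fromtimestamp(message["timestamp_ms"] / 1000), which uses the local time zone.
timestamp = datetime(2022, 5, 17, 17, 28)

# strftime pads month, day, and hour with leading zeros.
date_padded = timestamp.strftime("%m/%d/%y")
time_padded = timestamp.strftime("%I:%M %p")

# Unpadded versions: take month and day as integers, and strip the hour's leading zero.
date_val = f"{timestamp.month}/{timestamp.day}/{timestamp:%y}"
time_val = time_padded.lstrip("0")

print(date_val, time_val)
```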
Reply
#7
OK, I got the time added and working, so now the only question is how to remove the " " around each row in the file.

Here is the currently working code:

import json
from datetime import datetime
import pandas as pd

f = open('message_1.json')

data = json.load(f)
lv = []

for message in data["messages"]:
    timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)
    
    date_val = timestamp.strftime('%Y-%m-%d')
    time_val = timestamp.strftime("%I:%M %p")
    
    if 'content' not in message:
        st = date_val + "," + time_val + "," + message["sender_name"] + "," + ""
        lv.append(st)

    else:
        st = date_val +"," + time_val + "," + message["sender_name"] + "," + message["content"]
        lv.append(st)

df = pd.DataFrame(lv)

df.to_csv('igMess.csv', header=None, index=None, mode='a')

f.close()
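
On the quotes: each entry in `lv` is one pre-joined string, so the DataFrame has a single column whose value contains commas, and a CSV writer quotes any field that contains the delimiter. The sketch below shows the same rule with the standard `csv` module, whose `QUOTE_MINIMAL` behaviour is also the default for `to_csv`. The helper `to_csv_text` is a made-up name for this illustration.

```python
import csv
import io

def to_csv_text(row):
    """Write one row with the default csv settings and return the text."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(row)
    return buffer.getvalue().strip()

# One pre-joined string is a single field that contains commas, so it gets quoted.
joined = to_csv_text(["2022-05-22,06:08 PM,Me,There are plenty of leftovers"])

# Four separate values are four fields, none of which needs quoting.
separate = to_csv_text(["2022-05-22", "06:08 PM", "Me", "There are plenty of leftovers"])

print(joined)
print(separate)
```

Appending a list of four values per message to `lv`, instead of one joined string, should give the unquoted form.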
Reply
#8
I have run this a few times since the above post and found a few things I'm hoping to fix in the code. I noticed that if a message is very large, it gets split across columns in my CSV file. I only want my CSV to have 4 columns.

Here is the current code that does work; it just needs some adjustments to make sure my "content" column is all-inclusive and not split out. When I ran this code today against the newest JSON file, I found data in 4 or 6 other columns; basically, certain rows had data spread across columns A through M.

import json
from datetime import datetime
import pandas as pd
import os
import csv

f = open('message_1.json')

data = json.load(f)

lv = []

for message in data["messages"]:
    timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)
    
    lv.append([
        timestamp.strftime("%m/%d/%Y"),
        timestamp.strftime("%I:%M:%S %p"),
        message["sender_name"],
        message["content"] if "content" in message else "Media Link"])  

df = pd.DataFrame(lv, columns=["Date", "Time", "Sender", "Content"])

df.to_csv('igMess.csv', header=None, index=None, quoting=csv.QUOTE_NONE,  escapechar=",", mode='a')

f.close()
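
A probable cause of the extra columns is the combination of `quoting=csv.QUOTE_NONE` and `escapechar=","`: with quoting turned off, any comma or line break inside a message is no longer protected, so a spreadsheet splits the content at those points. The sketch below uses made-up rows and the standard `csv` module to show that default quoting keeps such content in one field. Dropping those two arguments from `to_csv` should give the same default.

```python
import csv
import io

# Hypothetical rows: the content contains commas and a line break.
rows = [
    ["05/22/2022", "06:08:03 PM", "Me", "Pasta, salad, and bread are left"],
    ["05/22/2022", "06:07:52 PM", "Me", "First line\nsecond line, same message"],
]

buffer = io.StringIO(newline="")
# Default quoting (csv.QUOTE_MINIMAL) wraps only the fields that need it.
csv.writer(buffer).writerows(rows)

# Reading the text back shows every record still has exactly four fields.
buffer.seek(0)
parsed = list(csv.reader(buffer))
print([len(record) for record in parsed])
```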
Reply
#9
What are all the possible keys that contain content values? How should the content values be combined?
Reply

