Need help formatting dataframe data before saving to CSV
Python Forum (https://python-forum.io), General Coding Help, /thread-37605.html
Need help formatting dataframe data before saving to CSV - cubangt - Jun-29-2022

I am trying to parse and save my IG messages for stat purposes. I have the following code that works and saves to a CSV file, but when I open the file each line is wrapped in " " like so ("2022-06-29,08:50 AM,Joe,Liked a message"), so when I try to import the data into Excel, it's not split out correctly. That's issue #1.

Issue #2 is how the date is formatted. Either I'm not choosing the correct format below or I'm just missing something, but the date placed in the file is formatted as the code below shows, yyyy-mm-dd, and I need it formatted as mm/dd/yyyy.

Issue #3 is the same thing with the time. In this case I need to include the seconds in the format: currently the code below returns 07:05 PM, but I need it to return 07:05:00 PM.

Can someone suggest what I'm doing wrong, or offer some samples that I can change or implement in my code to clean up the results some more?

    import json
    from datetime import datetime
    import pandas as pd
    import os

    f = open('message_1.json')
    data = json.load(f)

    lv = []
    for message in data["messages"]:
        timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)
        date_val = timestamp.strftime('%Y-%m-%d')
        time_val = timestamp.strftime("%I:%M %p")
        if 'content' not in message:
            st = date_val + "," + time_val + "," + message["sender_name"] + "," + ""
            lv.append(st)
        else:
            st = date_val + "," + time_val + "," + message["sender_name"] + "," + message["content"]
            lv.append(st)

    df = pd.DataFrame(lv)

RE: Need help formatting dataframe data before saving to CSV - cubangt - Jun-29-2022

OK, I figured out #2 and #3:

    date_val = timestamp.strftime("%m/%d/%Y")
    time_val = timestamp.strftime("%I:%M:%S %p")

How can I prevent the " " wrapped around each line?
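The wrapping quotes come from CSV quoting rules rather than from the strings themselves: a field that contains the delimiter is quoted so that it reads back as a single value. Here is a minimal sketch of that behaviour using Python's built-in csv module, whose `csv.QUOTE_*` constants are the ones `to_csv` accepts, so pandas would be expected to behave the same way. The in-memory buffer and the helper name `write_row` are assumptions for illustration, not part of the original code.

```python
import csv
import io


def write_row(fields):
    """Write one row with the default QUOTE_MINIMAL rules and return the text."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(fields)
    return buffer.getvalue()


# One field that already contains commas: quoted so it stays a single column.
one_column = write_row(["2022-06-29,08:50 AM,Joe,Liked a message"])

# Four separate fields: nothing needs quoting.
four_columns = write_row(["2022-06-29", "08:50 AM", "Joe", "Liked a message"])

print(one_column, end="")
print(four_columns, end="")
```

The first row comes out wrapped in quotes and the second does not, which suggests the quotes are a symptom of storing the whole line as one value.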
RE: Need help formatting dataframe data before saving to CSV - cubangt - Jun-29-2022

Here is the latest updated code, which is now giving an error:

    import json
    from datetime import datetime
    import pandas as pd
    import os
    import csv

    f = open('message_1.json')
    data = json.load(f)

    lv = []
    for message in data["messages"]:
        timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)
        date_val = timestamp.strftime("%m/%d/%Y")
        time_val = timestamp.strftime("%I:%M:%S %p")
        if 'content' not in message:
            st = date_val + "," + time_val + "," + message["sender_name"] + "," + ""
            lv.append(st)
        else:
            st = date_val + "," + time_val + "," + message["sender_name"] + "," + message["content"]
            lv.append(st)

    df = pd.DataFrame(lv)
    df.to_csv('igMess.csv', header=None, index=None, quoting=csv.QUOTE_NONE, mode='a')
    f.close()

The "quoting=csv.QUOTE_NONE" parameter got rid of the quotes on each line, but the above gives that error now.
RE: Need help formatting dataframe data before saving to CSV - ibreeden - Jun-30-2022

Please show the complete error message. And which line in your program causes this error?

RE: Need help formatting dataframe data before saving to CSV - cubangt - Jun-30-2022
RE: Need help formatting dataframe data before saving to CSV - Axel_Erfurt - Jun-30-2022

Set escapechar, replacing your_escapechar with your own character:

    df.to_csv('igMess.csv', header=None, index=None, quoting=csv.QUOTE_NONE, escapechar="your_escapechar", mode='a')
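To illustrate what escapechar does when quoting is turned off, here is a small sketch using Python's csv module, which defines the `csv.QUOTE_NONE` constant passed to `to_csv` above. The backslash escape character and the helper name `write_unquoted` are assumptions chosen for the example.

```python
import csv
import io


def write_unquoted(fields, escapechar=None):
    """Write one row with quoting disabled and return the resulting text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_NONE, escapechar=escapechar)
    writer.writerow(fields)
    return buffer.getvalue()


row = ["06/29/2022,08:50:16 AM,Joe,Liked a message"]  # one field with embedded commas

# Without an escape character the writer cannot represent the embedded commas.
try:
    write_unquoted(row)
    failed = False
except csv.Error as error:
    failed = True
    print("csv.Error:", error)

# With an escape character, each embedded comma is prefixed instead of quoted.
escaped = write_unquoted(row, escapechar="\\")
print(escaped, end="")
```

The escaped output is still a single column as far as the writer is concerned; the escape character only marks the commas as data. Doubled commas in the file would be consistent with a comma itself having been chosen as the escape character.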
RE: Need help formatting dataframe data before saving to CSV - cubangt - Jun-30-2022

OK, so doing that adds the delimiter, but since I'm already building the string with the comma, I now have 2 commas between each column. So how can I change these lines:

    st = date_val + "," + time_val + "," + message["sender_name"] + "," + ""
    lv.append(st)

so that I get my 3 values comma delimited? This is what's in my CSV now with the addition of the escapechar param:

    06/29/2022,,08:50:16 AM,,Joe,,Liked a message
    06/29/2022,,08:50:04 AM,,Ron,,Liked a message
    06/29/2022,,08:49:58 AM,,Ron,,Liked a message

RE: Need help formatting dataframe data before saving to CSV - Axel_Erfurt - Jun-30-2022

How is message_1.json structured? Can you show some lines?

RE: Need help formatting dataframe data before saving to CSV - deanhystad - Jun-30-2022

You are making a 1 column dataframe of strings. If you made a dataframe that had columns for date, time, sender and content you would get a proper CSV file from df.to_csv(). Here's a short example:

    import pandas as pd
    from random import choice
    import time, datetime

    senders = ["Me", "You", "Dog named Boo"]
    content = [
        "Just wanted to say Hi!",
        "Coming home soon.",
        "Take me for a walk.",
        "We're out of dog food.",
        None,
    ]

    messages = []
    for _ in range(10):
        timestamp = datetime.datetime.now()
        messages.append(
            [
                timestamp.strftime("%m/%d/%Y"),
                timestamp.strftime("%I:%M:%S %p"),
                choice(senders),
                choice(content),
            ]
        )
        time.sleep(1)

    df = pd.DataFrame(messages, columns=["Date", "Time", "Sender", "Content"])
    print(df)
    df.to_csv("test.csv")

The dataframe looks like this:

And the generated CSV file like this.

If I want to remove the column or row headers, this is easily accomplished using arguments in the to_csv() call.
RE: Need help formatting dataframe data before saving to CSV - cubangt - Jun-30-2022

    {
      "participants": [
        { "name": "Ron" },
        { "name": "Joe" }
      ],
      "messages": [
        {
          "sender_name": "Ron",
          "timestamp_ms": 1656510616932,
          "content": "Liked a message",
          "type": "Generic",
          "is_unsent": false,
          "is_taken_down": false,
          "bumped_message_metadata": { "is_bumped": false }
        },
        {
          "sender_name": "Ron",
          "timestamp_ms": 1656510604636,
          "content": "Liked a message",
          "type": "Generic",
          "is_unsent": false,
          "is_taken_down": false,
          "bumped_message_metadata": { "is_bumped": false }
        },
        {
          "sender_name": "Ron",
          "timestamp_ms": 1656510598826,
          "content": "Liked a message",
          "type": "Generic",
          "is_unsent": false,
          "is_taken_down": false,
          "bumped_message_metadata": { "is_bumped": false }
        },
        {
          "sender_name": "Joe",
          "timestamp_ms": 1656510439241,
          "reactions": [
            { "reaction": "\u00e2\u009d\u00a4\u00ef\u00b8\u008f", "actor": "Ron" },
            { "reaction": "\u00e2\u009d\u00a4\u00ef\u00b8\u008f", "actor": "Ron" }
          ],
          "type": "Generic",
          "is_unsent": false,
          "is_taken_down": false,
          "bumped_message_metadata": { "bumped_message": "", "is_bumped": false }
        },
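Given the structure of message_1.json above, deanhystad's suggestion of one column per value could be applied roughly as follows. This is a sketch under stated assumptions rather than a tested drop-in: the inline `data` dict stands in for `json.load(f)`, `message_to_row` is a hypothetical helper name, and the column names are borrowed from the example earlier in the thread.

```python
import io
from datetime import datetime

import pandas as pd

# Trimmed stand-in for json.load(f), shaped like the message_1.json excerpt.
data = {
    "messages": [
        {"sender_name": "Ron", "timestamp_ms": 1656510616932, "content": "Liked a message"},
        {"sender_name": "Joe", "timestamp_ms": 1656510439241},  # no "content" key
    ]
}


def message_to_row(message):
    """Return [date, time, sender, content] for one message dict."""
    # fromtimestamp() converts to the local time zone of the machine running it.
    timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)
    return [
        timestamp.strftime("%m/%d/%Y"),
        timestamp.strftime("%I:%M:%S %p"),
        message["sender_name"],
        message.get("content", ""),  # empty string when the key is missing
    ]


rows = [message_to_row(message) for message in data["messages"]]
df = pd.DataFrame(rows, columns=["Date", "Time", "Sender", "Content"])

# Written to an in-memory buffer here; a filename such as "igMess.csv" works too.
buffer = io.StringIO()
df.to_csv(buffer, header=False, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

Because each value sits in its own column, the default quoting only kicks in for fields that really need it (for example, message content that contains a comma), so neither `quoting=csv.QUOTE_NONE` nor `escapechar` should be required.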