Posts: 5
Threads: 1
Joined: May 2020
May-05-2020, 05:45 AM
(This post was last modified: May-05-2020, 07:55 AM by buran.)
I'm trying to match a regular expression against the value of Id in a JSON file. My goal is to iterate over a bunch of JSON files in a directory and replace the value of the Id key with the regex match. The regex works well by itself when I try it on regex101, but when I run it against the JSON files on my computer I get this error: TypeError: expected string or bytes-like object
Any and all help appreciated.
import json
import os
import re
rootdir = r'C:\\Users\\homersimpson\\jsondumps'
for files in os.scandir(rootdir):
    with open(files, "r") as file:
        json_data = json.load(file)
        extracted = re.findall((r'.+?(?<=\$apples\$)'),json_data)
        print(something)
Posts: 1,583
Threads: 3
Joined: Mar 2020
json.load() returns a Python object. For instance, it might be a deeply nested set of dictionaries or lists.
re.findall() takes a single string and operates on it, not a dict or list.
I wouldn't recommend running a regex over a whole JSON file, but if that's what you're trying to do, just read the file as text (open() it and call read() instead of passing it to json.load()).
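To make the difference concrete, here is a minimal sketch. The sample text and the pattern `\$\w+` are made up for illustration and are not taken from the original files; the point is only that re.findall() rejects the parsed object but accepts plain text or a single string value.

```python
import json
import re

# Hypothetical sample text standing in for the contents of one JSON file.
raw_text = '{"id": "name-5746-$pc2$name$5746$uspc02", "State": 0}'

# json.loads() (like json.load() on a file) returns a Python object, here a dict.
parsed = json.loads(raw_text)

try:
    re.findall(r'\$\w+', parsed)  # passing the dict is what raises TypeError
except TypeError as exc:
    error_message = str(exc)

# The same pattern works on the raw text, because that is a plain str.
matches_in_text = re.findall(r'\$\w+', raw_text)

# It also works on one string value pulled out of the parsed object.
matches_in_value = re.findall(r'\$\w+', parsed['id'])

print(type(parsed).__name__)
print(error_message)
print(matches_in_text)
```

Applying the regex to parsed['id'] rather than to the whole file keeps the match from accidentally touching other fields.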
Posts: 8,156
Threads: 160
Joined: Sep 2016
load the json. iterate over it. match the keys (maybe regex here). Replace the value. at the end - dump back to json file.
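The steps listed above could be sketched roughly as follows. This assumes the top level of the file is a single JSON object (a dict); the helper name rewrite_matching_keys, the key pattern, the transform, and the demo data are all hypothetical and only there to show the shape of load, iterate, match, replace, dump.

```python
import json
import os
import re
import tempfile

def rewrite_matching_keys(path, key_pattern, transform):
    """Load a JSON object, transform values whose key matches, write it back."""
    with open(path, "r") as f:
        data = json.load(f)  # assumed to be a dict at the top level
    for key in data:
        # Match the key name with a regex; values are replaced in place.
        if re.fullmatch(key_pattern, key, flags=re.IGNORECASE):
            data[key] = transform(data[key])
    with open(path, "w") as f:
        json.dump(data, f, indent=2)  # dump the modified object back to the file
    return data

# Demo on a throwaway file with made-up content.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.json")
with open(demo_path, "w") as f:
    json.dump({"Id": "abc$def", "Name": "Peak"}, f)

# Hypothetical rule: keep only the text before the first "$".
result = rewrite_matching_keys(demo_path, r"id", lambda value: value.split("$")[0])
print(result)
```

Only top-level keys are visited here; nested dictionaries or lists would need a recursive walk.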
Posts: 5
Threads: 1
Joined: May 2020
May-05-2020, 04:15 PM
(This post was last modified: May-05-2020, 04:22 PM by senaint.)
Well, I'm thinking maybe I can load the JSON as a Python dictionary, cycle through the values, modify them, and write the result out as JSON.
(May-05-2020, 07:57 AM)buran Wrote: load the json. iterate over it. match the keys (maybe regex here). Replace the value. at the end - dump back to json file.
I load the JSON, but it won't let me iterate.
Posts: 8,156
Threads: 160
Joined: Sep 2016
We don't know what your JSON file looks like. Can you show some sample data?
Posts: 5
Threads: 1
Joined: May 2020
May-05-2020, 07:54 PM
(This post was last modified: May-05-2020, 08:00 PM by senaint.)
Quote:{
"id": "companyName-channelruleprintermap-5746-$pc2$companyName$5746$uspc02",
"ChannelRuleCollections": [
{
"Name": "Peak",
"State": 0,
"Modified": "2020-01-10T08:17:11.9072155Z",
"Rules": [
{
"RuleType": "Item",
"Channel": "cafe",
"ChannelSource": "usrg01005746",
"RuleName": "Ground Coffee",
"Printer": "None",
"LabelPrinter": "None",
"ChitPrinter": "None",
"FormatName": "Label",
"ChitFormat": "OrderTicket"
},
{
"RuleType": "Item",
"Channel": "cafe",
"ChannelSource": "usrg01005746",
"RuleName": "Warmed Food",
"Printer": "Warming Printer",
"LabelPrinter": "Warming Printer",
"ChitPrinter": "None",
"FormatName": "Label",
"ChitFormat": "OrderTicket"
},
{
"RuleType": "Item",
"Channel": "cafe",
"ChannelSource": "usrg01005746",
"RuleName": "Blended Beverages",
"Printer": "Bar 1 (Closest to HOP)",
"LabelPrinter": "Bar 1 (Closest to HOP)",
"ChitPrinter": "None",
"FormatName": "Label",
"ChitFormat": "OrderTicket"
}...
So I'm trying to change
"id": "companyName-channelruleprintermap-5746-$pc2$companyName$5746$uspc02"
to
"id": "companyName-channelruleprintermap-5746"
Posts: 8,156
Threads: 160
Joined: Sep 2016
probably something like this
json_data = json.load(file)
json_data['id'] = json_data['id'].split('-$')[0]
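As a quick check of what the split does, here is a small sketch using the id string posted earlier in the thread. The re.sub() line is an extra, assumed-equivalent regex version added only for comparison; it is not something suggested in the thread.

```python
import re

original_id = "companyName-channelruleprintermap-5746-$pc2$companyName$5746$uspc02"

# split('-$') cuts at the first "-$"; element 0 is everything before it.
shortened = original_id.split('-$')[0]

# For comparison: drop "-$" and everything after it with a regex.
via_regex = re.sub(r'-\$.*', '', original_id)

# An id without "-$" passes through unchanged.
already_short = "companyName-channelruleprintermap-5746".split('-$')[0]

print(shortened)  # companyName-channelruleprintermap-5746
```

Because split() returns the whole string as the only element when the separator is missing, running this on an already-shortened id is harmless.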
Posts: 5
Threads: 1
Joined: May 2020
May-05-2020, 09:49 PM
(This post was last modified: May-05-2020, 10:41 PM by senaint.)
Thank you SO MUCH, buran. I have no idea why I thought regex would be the solution to this.
I had neglected to mention that there is a [ before { "id": ..., so in essence it is in an array. Does this make it iterable?
[
{
"id": "companyName-channelruleprintermap-5746-$pc2$companyName$5746$uspc02",
"ChannelRuleCollections": [
{
"Name": "Peak",
"State": 0,
"Modified": "2020-01-10T08:17:11.9072155Z",
"Rules": [
{
"RuleType": "Item",
"Channel": "cafe",
"ChannelSource": "usrg01005746",
"RuleName": "Ground Coffee",
"Printer": "None",
"LabelPrinter": "None",
"ChitPrinter": "None",
"FormatName": "Label",
"ChitFormat": "OrderTicket"
},
{
"RuleType": "Item",
"Channel": "cafe",
"ChannelSource": "usrg01005746",
"RuleName": "Warmed Food",
"Printer": "Warming Printer",
"LabelPrinter": "Warming Printer",
"ChitPrinter": "None",
"FormatName": "Label",
"ChitFormat": "OrderTicket"
},
{
"RuleType": "Item",
"Channel": "cafe",
"ChannelSource": "usrg01005746",
"RuleName": "Blended Beverages",
"Printer": "Bar 1 (Closest to HOP)",
"LabelPrinter": "Bar 1 (Closest to HOP)",
"ChitPrinter": "None",
"FormatName": "Label",
"ChitFormat": "OrderTicket"
}...
]
Posts: 1,583
Threads: 3
Joined: Mar 2020
That just adds another list around the json object. Assuming there's only one item inside, your content will be item number 0. Instead of json_data['id'] you'd access the ID as json_data[0]['id']
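If the list can hold more than one item, a loop over the list handles them all. A minimal sketch, assuming every element is a dict with an "id" key; the sample text is modelled on the data in this thread, and the second id is a made-up variant.

```python
import json

# Shortened sample; the second entry is hypothetical.
raw = '''
[
  {"id": "company-channelruleprintermap-8913-$pc2$company$8913$uspc02"},
  {"id": "company-channelruleprintermap-5746-$pc2$company$5746$uspc02"}
]
'''

json_data = json.loads(raw)   # the top level is now a Python list

first_id = json_data[0]['id']  # item number 0, then its "id" key

# Each element of the list is a dict, so update every "id" in turn.
for doc in json_data:
    doc['id'] = doc['id'].split('-$')[0]

new_ids = [doc['id'] for doc in json_data]
print(new_ids)
```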
Posts: 5
Threads: 1
Joined: May 2020
So, roughly speaking, all the objects with an "id" key are elements of one big array. I have tried various versions of the simple code below, but I keep getting TypeError: the JSON object must be str, bytes or bytearray, not TextIOWrapper
import json
import os
rootfile = r'C:\\Users\\homer\\dumps\\cosmodb-sample.json'
with open(rootfile,'r') as file:
    json_data = json.loads(file)
    for docs in json_data[0]:
        print(docs)
Quote:[
{
"id": "company-channelruleprintermap-8913-$pc2$company$8913$uspc02",
"ChannelRuleCollections": [
{
"Name": "Peak",
"State": 0,
"Modified": "2020-01-10T08:34:34.8015346Z"
}
]
},
{
"id": "company-channelruleprintermap-8913-$pc2$company$8913$uspc02",
"ChannelRuleCollections": [
{
"Name": "Peak",
"State": 0,
"Modified": "2020-01-10T08:34:34.8015346Z"
}
]
}
]
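For reference, a minimal sketch of where that TypeError comes from: json.loads() expects text (str, bytes or bytearray), while json.load() reads from an open file object. The temporary file and its contents are stand-ins made up for the example, not the real cosmodb-sample.json.

```python
import json
import os
import tempfile

# Hypothetical stand-in for the real file: a list of two small documents.
sample = [
    {"id": "company-channelruleprintermap-8913-$pc2$company$8913$uspc02"},
    {"id": "company-channelruleprintermap-8913-$pc2$company$8913$uspc02"},
]
demo_path = os.path.join(tempfile.mkdtemp(), "sample.json")
with open(demo_path, "w") as f:
    json.dump(sample, f)

with open(demo_path, "r") as file:
    try:
        json.loads(file)  # loads() wants text, so a file object fails
    except TypeError as exc:
        error_message = str(exc)

with open(demo_path, "r") as file:
    json_data = json.load(file)  # load() reads from the file object

with open(demo_path, "r") as file:
    same_data = json.loads(file.read())  # loads() works once given the text

# Iterating json_data[0] walks the keys of the first dict only.
keys_of_first = [key for key in json_data[0]]

# Iterating json_data itself walks the documents in the list.
ids = [doc["id"] for doc in json_data]

print(error_message)
print(keys_of_first)
```

Note that looping over json_data[0] prints key names such as id, not the documents; looping over json_data visits each document.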