Posts: 453
Threads: 16
Joined: Jun 2022
(Oct-10-2022, 08:07 AM)Calli Wrote: It doesn't work even after adding 4 lines
I'll reiterate: provide a bigger sample (say 10 records) and I'll sort it out. Reading 3,000,000 records into memory is not good practice, as we can do this one record at a time.
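To illustrate the one-record-at-a-time point: a file object is itself a lazy iterator, so looping over it holds only the current line in memory. This is a minimal sketch, not the final solution; `count_records` is a made-up helper name, and `io.StringIO` stands in for the real file.

```python
import io

def count_records(lines):
    """Count non-blank lines while holding only one line in memory at a time."""
    total = 0
    for line in lines:  # the file-like object yields one line per iteration
        if line.strip():
            total += 1
    return total

# StringIO stands in for open("data") here; both iterate line by line.
sample = io.StringIO('{"a": 1}\n{"a": 2}\n\n{"a": 3}\n')
print(count_records(sample))
```

The same loop works unchanged on the object returned by `open()`, whatever the size of the file.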
Sig:
>>> import this
The UNIX philosophy: "Do one thing, and do it well."
"The danger of computers becoming like humans is not as great as the danger of humans becoming like computers." :~ Konrad Zuse
"Everything should be made as simple as possible, but not simpler." :~ Albert Einstein
Posts: 77
Threads: 19
Joined: Apr 2020
(Oct-10-2022, 08:11 AM)rob101 Wrote: (Oct-10-2022, 08:07 AM)Calli Wrote: It doesn't work even after adding 4 lines
I'll reiterate: provide a bigger sample (say 10 records) and I'll sort it out. Reading 3,000,000 records into memory is not good practice, as we can do this one record at a time.
{"_index":"testdataset","_type":"_doc","_id":"11234567891098646","_score":1,"_source":{"_class":"net.local.host.ca","orderNo":"16536668566434698646","orderDt":"20220527","source":0,"mchntId":"0000000002","mchntOrderNo":"01a3f2b53d16290f41f","appid":"0000000003","payChannelId":"payid","amount":300,"clientIp":"192.168.0.1","currency":"0","subject":"","body":"","cpChannel":"google_upi","timeExpire":1653678578000,"description":"","created":1653657857000,"timePaid":1653657876000,"bankType":"","paySt":2,"refundSt":0,"refundedAmt":0,"checkSt":0,"fee":21,"chnlFee":9,"settleSt":1,"rutId":"0000000098","bankRspDesc":"","bankTransactionId":"20220527212419602959728091384475","credential":"214721678116","notifyUrl":"https://localhost/test/site","pageNotifyUrl":"https://localhost/test/site","notifyCnt":0,"notifySt":0,"openId":"","extra":"","countryId":"Brazil","areaId":"","regionId":"brazil","cityId":"toto","countyId":"","modified":1124657876000,"channlInfoId":"b0768be2248a4aee94ac747c2ab0000","email":"[email protected]","mobile":"100000012457","accountOwner":"Tom Hank","merchantParam":"game","transTp":0,"payAccount":"","payType":"","bankCode":"","settleBatchNo":"2022124578950","startRow":0,"pageSize":0}}
{"_index":"testdataset","_type":"_doc","_id":"16537381196745423359","_score":1,"_source":{"_class":"net.local.host.ca","orderNo":"16537381196745423359","orderDt":"20220528","source":0,"mchntId":"0000000070","mchntOrderNo":"205281711586206","appid":"0000000118","payChannelId":"payid","amount":500,"clientIp":"192.168.0.1","currency":"0","subject":"","body":"","cpChannel":"google_upi","timeExpire":1653764971000,"description":"","created":1653729120000,"timePaid":1653729131000,"bankType":"","paySt":2,"refundSt":0,"refundedAmt":0,"checkSt":0,"fee":40,"chnlFee":15,"settleSt":1,"rutId":"0000000045","bankRspDesc":"","bankTransactionId":"20220528171201602789750098488753","credential":"214839251743","notifyUrl":"https://localhost/test/site","pageNotifyUrl":"https://localhost/test/site","notifyCnt":0,"notifySt":0,"openId":"","extra":"","countryId":"UK","areaId":"","regionId":"hs dsk","cityId":"asd","countyId":"","modified":1653729131000,"channlInfoId":"b6b32e2c2dc14fa492810e1a47387a29","email":"[email protected]","mobile":"7845147210","accountOwner":"Cotton Kate","merchantParam":"","transTp":0,"payAccount":"","payType":"","bankCode":"","settleBatchNo":"20220528000001","startRow":0,"pageSize":0}}
{"_index":"testdataset","_type":"_doc","_id":"16537381191385423350","_score":1,"_source":{"_class":"net.local.host.ca","orderNo":"16537381191385423350","orderDt":"20220528","source":0,"mchntId":"0000000002","mchntOrderNo":"01f9c97994562920a82","appid":"0000000003","payChannelId":"","amount":300,"clientIp":"192.168.0.1","currency":"0","subject":"","body":"","cpChannel":"","timeExpire":1653815519000,"description":"","created":1653729119000,"bankType":"","paySt":0,"refundSt":0,"refundedAmt":0,"checkSt":0,"fee":21,"chnlFee":0,"settleSt":0,"rutId":"","bankRspDesc":"","bankTransactionId":"","credential":"","notifyUrl":"hhttps://localhost/test/site","pageNotifyUrl":"https://localhost/test/site","notifyCnt":0,"notifySt":0,"openId":"","extra":"","countryId":"UK","areaId":"","regionId":"Maharashtra","cityId":"hs","countyId":"","modified":1653729119000,"channlInfoId":"","email":"[email protected]","mobile":"1457845478","accountOwner":"Stefen James","merchantParam":"rummygold","transTp":0,"payAccount":"","payType":"","bankCode":"","settleBatchNo":"","startRow":0,"pageSize":0}}
{"_index":"testdataset","_type":"_doc","_id":"16537381191685423352","_score":1,"_source":{"_class":"net.local.host.ca","orderNo":"16537381191685423352","orderDt":"20220528","source":0,"mchntId":"0000000037","mchntOrderNo":"42702205281711529003346340502","appid":"0000000112","payChannelId":"payid","amount":100,"clientIp":"192.168.0.1","currency":"0","subject":"","body":"","cpChannel":"phonepe_upi","timeExpire":1653764971000,"description":"","created":1653729119000,"timePaid":1653729150000,"bankType":"","paySt":2,"refundSt":0,"refundedAmt":0,"checkSt":0,"fee":8,"chnlFee":3,"settleSt":1,"rutId":"0000000098","bankRspDesc":"","bankTransactionId":"20220528171203602959283068389297","credential":"214831644044","notifyUrl":"https://localhost/test/site","pageNotifyUrl":"https://localhost/test/site","notifyCnt":0,"notifySt":0,"openId":"","extra":"","countryId":"UK","areaId":"","regionId":"Himachal Pradesh","cityId":"Una","countyId":"","modified":1653729150000,"channlInfoId":"b0768be2248a4aee94ac747c2ab45878","email":"[email protected]","mobile":"1457812014","accountOwner":"Michel","merchantParam":"","transTp":0,"payAccount":"","payType":"","bankCode":"","settleBatchNo":"20220528000001","startRow":0,"pageSize":0}}
{"_index":"testdataset","_type":"_doc","_id":"16537381191715423351","_score":1,"_source":{"_class":"net.local.host.ca","orderNo":"16537381191715423351","orderDt":"20220528","source":0,"mchntId":"0000000037","mchntOrderNo":"44602205281711569669753200502","appid":"0000000112","payChannelId":"payid","amount":100,"clientIp":"192.168.0.1","currency":"0","subject":"","body":"","cpChannel":"phonepe_upi","timeExpire":1653815519000,"description":"","created":1653729119000,"bankType":"","paySt":0,"refundSt":0,"refundedAmt":0,"checkSt":0,"fee":8,"chnlFee":3,"settleSt":0,"rutId":"0000000056","bankRspDesc":"","bankTransactionId":"20220528171202602889028098613873","credential":"","notifyUrl":"https://localhost/test/site","pageNotifyUrl":"https://localhost/test/site","notifyCnt":0,"notifySt":0,"openId":"","extra":"","countryId":"UK","areaId":"","regionId":"United kingdom","cityId":"ac","countyId":"","modified":1653729119000,"channlInfoId":"f63243ecdff349c5871c51c060a11954","email":"[email protected]","mobile":"4578412457","accountOwner":"Tom Willims","merchantParam":"","transTp":0,"payAccount":"","payType":"","bankCode":"","settleBatchNo":"","startRow":0,"pageSize":0}}
Posts: 453
Threads: 16
Joined: Jun 2022
Oct-10-2022, 08:36 AM
(This post was last modified: Oct-10-2022, 08:39 AM by rob101.)
@ Calli
The data is not consistent.
Is there anything else I need to know about, do you think?
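One inconsistency visible in the five posted records is that the unpaid ones (paySt 0) have no "timePaid" key; there may be others in the full file. A rough sketch of how such gaps could be listed, using cut-down stand-ins for the records rather than the real data:

```python
import json

# Cut-down stand-ins for the posted records: the unpaid one has no "timePaid".
lines = [
    '{"_source": {"amount": 300, "paySt": 2, "timePaid": 1653657876000}}',
    '{"_source": {"amount": 300, "paySt": 0}}',
]

records = [json.loads(line)["_source"] for line in lines]
all_keys = set().union(*(record.keys() for record in records))

# For each record, list the keys that appear elsewhere but not here.
missing = [sorted(all_keys - set(record)) for record in records]
print(missing)

# dict.get returns None instead of raising KeyError for an absent key.
print([record.get("timePaid") for record in records])
```

This version keeps all records in a list, which is fine for a ten-record sample but not for the full file.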
Posts: 582
Threads: 1
Joined: Aug 2019
(Oct-10-2022, 08:07 AM)Calli Wrote: It doesn't work even after adding 4 lines
Not 4 lines, I said 4 spaces.
Posts: 453
Threads: 16
Joined: Jun 2022
Oct-10-2022, 09:54 AM
(This post was last modified: Oct-10-2022, 09:54 AM by rob101. Edit Reason: code update: now reads one line into memory)
Maybe this?
with open ('data', 'r', ) as f:
    content = 'start'
    while content:
        content = f.readline()
        temp = content.split(',')
        amount = email = mobile = accountOwner = ''
        for item in temp:
            if 'amount' in item:
                amount = item
            elif 'email' in item:
                email = item
            elif 'mobile' in item:
                mobile = item
            elif 'accountOwner' in item:
                accountOwner = item
            if amount and email and mobile and accountOwner:
                print(f"{amount} {email} {mobile} {accountOwner}")
                amount = email = mobile = accountOwner = ''
Output:
"amount":300 "email":"[email protected]" "mobile":"100000012457" "accountOwner":"Tom Hank"
"amount":500 "email":"[email protected]" "mobile":"7845147210" "accountOwner":"Cotton Kate"
"amount":300 "email":"[email protected]" "mobile":"1457845478" "accountOwner":"Stefen James"
"amount":100 "email":"[email protected]" "mobile":"1457812014" "accountOwner":"Michel"
"amount":100 "email":"[email protected]" "mobile":"4578412457" "accountOwner":"Tom Willims"
Posts: 77
Threads: 19
Joined: Apr 2020
(Oct-10-2022, 09:54 AM)rob101 Wrote: Maybe this?
This worked well. DM me your BTC address and your Telegram ID, if you have one.
Posts: 6,780
Threads: 20
Joined: Feb 2020
Oct-10-2022, 03:58 PM
(This post was last modified: Oct-10-2022, 03:58 PM by deanhystad.)
Instead of using file.readline():
with open ('data', 'r', ) as f:
    content = 'start'
    while content:
        content = f.readline()
Use the file object as a lazy iterator with "in":
with open ('data', 'r', ) as f:
    for content in f:
And why ignore that the file is in a known format and a parser is available?
from io import StringIO
import json
# Simulate getting json strings from the file one line at a time.
file = StringIO(
"""{"_index":"testdataset","_type":"_doc","_id":"11234567891098646","_score":1,"_source":{"_class":"net.local.host.ca","orderNo":"16536668566434698646","orderDt":"20220527","source":0,"mchntId":"0000000002","mchntOrderNo":"01a3f2b53d16290f41f","appid":"0000000003","payChannelId":"payid","amount":300,"clientIp":"192.168.0.1","currency":"0","subject":"","body":"","cpChannel":"google_upi","timeExpire":1653678578000,"description":"","created":1653657857000,"timePaid":1653657876000,"bankType":"","paySt":2,"refundSt":0,"refundedAmt":0,"checkSt":0,"fee":21,"chnlFee":9,"settleSt":1,"rutId":"0000000098","bankRspDesc":"","bankTransactionId":"20220527212419602959728091384475","credential":"214721678116","notifyUrl":"https://localhost/test/site","pageNotifyUrl":"https://localhost/test/site","notifyCnt":0,"notifySt":0,"openId":"","extra":"","countryId":"Brazil","areaId":"","regionId":"brazil","cityId":"toto","countyId":"","modified":1124657876000,"channlInfoId":"b0768be2248a4aee94ac747c2ab0000","email":"[email protected]","mobile":"100000012457","accountOwner":"Tom Hank","merchantParam":"game","transTp":0,"payAccount":"","payType":"","bankCode":"","settleBatchNo":"2022124578950","startRow":0,"pageSize":0}}
{"_index":"testdataset","_type":"_doc","_id":"16537381196745423359","_score":1,"_source":{"_class":"net.local.host.ca","orderNo":"16537381196745423359","orderDt":"20220528","source":0,"mchntId":"0000000070","mchntOrderNo":"205281711586206","appid":"0000000118","payChannelId":"payid","amount":500,"clientIp":"192.168.0.1","currency":"0","subject":"","body":"","cpChannel":"google_upi","timeExpire":1653764971000,"description":"","created":1653729120000,"timePaid":1653729131000,"bankType":"","paySt":2,"refundSt":0,"refundedAmt":0,"checkSt":0,"fee":40,"chnlFee":15,"settleSt":1,"rutId":"0000000045","bankRspDesc":"","bankTransactionId":"20220528171201602789750098488753","credential":"214839251743","notifyUrl":"https://localhost/test/site","pageNotifyUrl":"https://localhost/test/site","notifyCnt":0,"notifySt":0,"openId":"","extra":"","countryId":"UK","areaId":"","regionId":"hs dsk","cityId":"asd","countyId":"","modified":1653729131000,"channlInfoId":"b6b32e2c2dc14fa492810e1a47387a29","email":"[email protected]","mobile":"7845147210","accountOwner":"Cotton Kate","merchantParam":"","transTp":0,"payAccount":"","payType":"","bankCode":"","settleBatchNo":"20220528000001","startRow":0,"pageSize":0}}
{"_index":"testdataset","_type":"_doc","_id":"16537381191385423350","_score":1,"_source":{"_class":"net.local.host.ca","orderNo":"16537381191385423350","orderDt":"20220528","source":0,"mchntId":"0000000002","mchntOrderNo":"01f9c97994562920a82","appid":"0000000003","payChannelId":"","amount":300,"clientIp":"192.168.0.1","currency":"0","subject":"","body":"","cpChannel":"","timeExpire":1653815519000,"description":"","created":1653729119000,"bankType":"","paySt":0,"refundSt":0,"refundedAmt":0,"checkSt":0,"fee":21,"chnlFee":0,"settleSt":0,"rutId":"","bankRspDesc":"","bankTransactionId":"","credential":"","notifyUrl":"hhttps://localhost/test/site","pageNotifyUrl":"https://localhost/test/site","notifyCnt":0,"notifySt":0,"openId":"","extra":"","countryId":"UK","areaId":"","regionId":"Maharashtra","cityId":"hs","countyId":"","modified":1653729119000,"channlInfoId":"","email":"[email protected]","mobile":"1457845478","accountOwner":"Stefen James","merchantParam":"rummygold","transTp":0,"payAccount":"","payType":"","bankCode":"","settleBatchNo":"","startRow":0,"pageSize":0}}
{"_index":"testdataset","_type":"_doc","_id":"16537381191685423352","_score":1,"_source":{"_class":"net.local.host.ca","orderNo":"16537381191685423352","orderDt":"20220528","source":0,"mchntId":"0000000037","mchntOrderNo":"42702205281711529003346340502","appid":"0000000112","payChannelId":"payid","amount":100,"clientIp":"192.168.0.1","currency":"0","subject":"","body":"","cpChannel":"phonepe_upi","timeExpire":1653764971000,"description":"","created":1653729119000,"timePaid":1653729150000,"bankType":"","paySt":2,"refundSt":0,"refundedAmt":0,"checkSt":0,"fee":8,"chnlFee":3,"settleSt":1,"rutId":"0000000098","bankRspDesc":"","bankTransactionId":"20220528171203602959283068389297","credential":"214831644044","notifyUrl":"https://localhost/test/site","pageNotifyUrl":"https://localhost/test/site","notifyCnt":0,"notifySt":0,"openId":"","extra":"","countryId":"UK","areaId":"","regionId":"Himachal Pradesh","cityId":"Una","countyId":"","modified":1653729150000,"channlInfoId":"b0768be2248a4aee94ac747c2ab45878","email":"[email protected]","mobile":"1457812014","accountOwner":"Michel","merchantParam":"","transTp":0,"payAccount":"","payType":"","bankCode":"","settleBatchNo":"20220528000001","startRow":0,"pageSize":0}}
{"_index":"testdataset","_type":"_doc","_id":"16537381191715423351","_score":1,"_source":{"_class":"net.local.host.ca","orderNo":"16537381191715423351","orderDt":"20220528","source":0,"mchntId":"0000000037","mchntOrderNo":"44602205281711569669753200502","appid":"0000000112","payChannelId":"payid","amount":100,"clientIp":"192.168.0.1","currency":"0","subject":"","body":"","cpChannel":"phonepe_upi","timeExpire":1653815519000,"description":"","created":1653729119000,"bankType":"","paySt":0,"refundSt":0,"refundedAmt":0,"checkSt":0,"fee":8,"chnlFee":3,"settleSt":0,"rutId":"0000000056","bankRspDesc":"","bankTransactionId":"20220528171202602889028098613873","credential":"","notifyUrl":"https://localhost/test/site","pageNotifyUrl":"https://localhost/test/site","notifyCnt":0,"notifySt":0,"openId":"","extra":"","countryId":"UK","areaId":"","regionId":"United kingdom","cityId":"ac","countyId":"","modified":1653729119000,"channlInfoId":"f63243ecdff349c5871c51c060a11954","email":"[email protected]","mobile":"4578412457","accountOwner":"Tom Willims","merchantParam":"","transTp":0,"payAccount":"","payType":"","bankCode":"","settleBatchNo":"","startRow":0,"pageSize":0}}
""")
for line in file:
    # Convert json string to a dictionary, pull out the fields of interest, convert resulting dict to string and strip brackets.
    source = json.loads(line)["_source"]
    info = {field:source[field] for field in ["amount", "email", "mobile", "accountOwner"]}
    print(str(info).strip("{}"))
Output:
'amount': 300, 'email': '[email protected]', 'mobile': '100000012457', 'accountOwner': 'Tom Hank'
'amount': 500, 'email': '[email protected]', 'mobile': '7845147210', 'accountOwner': 'Cotton Kate'
'amount': 300, 'email': '[email protected]', 'mobile': '1457845478', 'accountOwner': 'Stefen James'
'amount': 100, 'email': '[email protected]', 'mobile': '1457812014', 'accountOwner': 'Michel'
'amount': 100, 'email': '[email protected]', 'mobile': '4578412457', 'accountOwner': 'Tom Willims'
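One practical reason to prefer the parser over splitting on commas: a value may itself contain a comma. The record below is made up for illustration (nothing like it appears in the posted sample), so treat this as a sketch of the failure mode rather than a report of a real one.

```python
import json

# Made-up record: the accountOwner value itself contains a comma.
line = '{"_source": {"amount": 300, "accountOwner": "Hank, Tom"}}'

# Substring matching on comma-separated pieces keeps only part of the value.
pieces = [item for item in line.split(",") if "accountOwner" in item]
print(pieces)

# A JSON parser returns the whole value.
owner = json.loads(line)["_source"]["accountOwner"]
print(owner)
```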
When reading from a file, you might want to catch JSON decoding exceptions to get past the brackets at the very start and end of the file.
import json
with open("data.json", "r") as file:
    for line in file:
        try:
            source = json.loads(line)["_source"]
            info = {field:source[field] for field in ["amount", "email", "mobile", "accountOwner"]}
            print(str(info).strip("{}"))
        except json.decoder.JSONDecodeError:
            pass
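If the file really is a JSON array with one record per line, every record line except the last would presumably end in a comma, and json.loads rejects that as extra data, so a bare try/except would skip those records silently. Here is a hedged sketch under that assumption (I have not seen the real file). `extract` and `FIELDS` are made-up names, and the sample values are invented.

```python
import io
import json

FIELDS = ["amount", "email", "mobile", "accountOwner"]

def extract(lines):
    """Yield the fields of interest from each line that parses as a record."""
    for line in lines:
        # Drop surrounding whitespace and a trailing comma, if any.
        text = line.strip().rstrip(",")
        try:
            source = json.loads(text)["_source"]
        except (json.JSONDecodeError, KeyError, TypeError):
            # Skips the lone "[" and "]" lines, blank lines and non-record lines.
            continue
        # .get() leaves a missing field as None rather than raising KeyError.
        yield {field: source.get(field) for field in FIELDS}

# StringIO stands in for the real file; the second record has no email.
sample = io.StringIO(
    '[\n'
    '{"_source": {"amount": 300, "email": "a@example.com", "mobile": "1", "accountOwner": "A"}},\n'
    '{"_source": {"amount": 500, "mobile": "2", "accountOwner": "B"}}\n'
    ']\n'
)
for info in extract(sample):
    print(info)
```

Because `extract` is a generator, it still handles one line at a time, so the memory use stays flat however long the file is.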