Python Forum
Extract only certain text which are needed
#1
Say, for instance, I want to extract only certain fields from the data below. How should I go about it, with or without regular expressions?

Data Sample
{"_index":"testdataset","_type":"_doc","_id":"11234567891098646","_score":1,"_source":{"_class":"net.local.host.ca","orderNo":"16536668566434698646","orderDt":"20220527","source":0,"mchntId":"0000000002","mchntOrderNo":"01a3f2b53d16290f41f","appid":"0000000003","payChannelId":"payid","amount":300,"clientIp":"192.168.0.1","currency":"0","subject":"","body":"","cpChannel":"google_upi","timeExpire":1653678578000,"description":"","created":1653657857000,"timePaid":1653657876000,"bankType":"","paySt":2,"refundSt":0,"refundedAmt":0,"checkSt":0,"fee":21,"chnlFee":9,"settleSt":1,"rutId":"0000000098","bankRspDesc":"","bankTransactionId":"20220527212419602959728091384475","credential":"214721678116","notifyUrl":"https://localhost/test/site","pageNotifyUrl":"https://localhost/test/site","notifyCnt":0,"notifySt":0,"openId":"","extra":"","countryId":"Brazil","areaId":"","regionId":"brazil","cityId":"toto","countyId":"","modified":1124657876000,"channlInfoId":"b0768be2248a4aee94ac747c2ab0000","email":"[email protected]","mobile":"100000012457","accountOwner":"Tom Hank","merchantParam":"game","transTp":0,"payAccount":"","payType":"","bankCode":"","settleBatchNo":"2022124578950","startRow":0,"pageSize":0}}
Output needed
amount:300, email:[email protected], mobile:100000012457, accountOwner:Tom Hank
Help this poor dude and leave your BTC address; I'll send some love. Thank you.
#2
Is this data a string or a dictionary? What have you tried so far?
#3
The data is probably coming from JSON and has already been decoded into a Python dictionary.
In that case, work with the dictionary directly; no regex is needed.
data = {
  "_index": "testdataset",
  "_type": "_doc",
  "_id": "11234567891098646",
  "_score": 1,
  "_source": {
    "_class": "net.local.host.ca",
    "orderNo": "16536668566434698646",
    "orderDt": "20220527",
    "source": 0,
    "mchntId": "0000000002",
    "mchntOrderNo": "01a3f2b53d16290f41f",
    "appid": "0000000003",
    "payChannelId": "payid",
    "amount": 300,
    "clientIp": "192.168.0.1",
    "currency": "0",
    "subject": "",
    "body": "",
    "cpChannel": "google_upi",
    "timeExpire": 1653678578000,
    "description": "",
    "created": 1653657857000,
    "timePaid": 1653657876000,
    "bankType": "",
    "paySt": 2,
    "refundSt": 0,
    "refundedAmt": 0,
    "checkSt": 0,
    "fee": 21,
    "chnlFee": 9,
    "settleSt": 1,
    "rutId": "0000000098",
    "bankRspDesc": "",
    "bankTransactionId": "20220527212419602959728091384475",
    "credential": "214721678116",
    "notifyUrl": "https://localhost/test/site",
    "pageNotifyUrl": "https://localhost/test/site",
    "notifyCnt": 0,
    "notifySt": 0,
    "openId": "",
    "extra": "",
    "countryId": "Brazil",
    "areaId": "",
    "regionId": "brazil",
    "cityId": "toto",
    "countyId": "",
    "modified": 1124657876000,
    "channlInfoId": "b0768be2248a4aee94ac747c2ab0000",
    "email": "[email protected]",
    "mobile": "100000012457",
    "accountOwner": "Tom Hank",
    "merchantParam": "game",
    "transTp": 0,
    "payAccount": "",
    "payType": "",
    "bankCode": "",
    "settleBatchNo": "2022124578950",
    "startRow": 0,
    "pageSize": 0
  }
}
Usage:
>>> data['_id']
'11234567891098646'

>>> data["_source"]["amount"]
300
>>> data["_source"]["countryId"]
'Brazil'
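To get the exact line asked for in the first post, the same lookups can be done in a loop. This is only a sketch: `wanted`, `selected` and `line` are illustrative names, and `data` here is a trimmed copy of the sample so the snippet runs on its own.

```python
# Trimmed copy of the sample record (most keys left out for brevity).
data = {
    "_id": "11234567891098646",
    "_source": {
        "amount": 300,
        "countryId": "Brazil",
        "email": "[email protected]",
        "mobile": "100000012457",
        "accountOwner": "Tom Hank",
    },
}

# Hypothetical list of the keys we care about, in output order.
wanted = ["amount", "email", "mobile", "accountOwner"]

source = data["_source"]
# Build a smaller dict holding only the wanted keys.
selected = {key: source[key] for key in wanted}

# Join the pairs as key:value, separated by commas.
line = ", ".join(f"{key}:{value}" for key, value in selected.items())
print(line)
```

Changing `wanted` is then enough to pick different fields; a key that is not in `_source` raises a KeyError.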
#4
(Oct-08-2022, 11:00 AM)snippsat Wrote: The data is probably coming from JSON and has already been decoded into a Python dictionary.
In that case, work with the dictionary directly; no regex is needed.

Yes it's a json file
#5
Donating some bitcoin to whoever solves this.
#6
To be 'literal', this will do what you've asked for...

print(f"amount: {data['_source']['amount']}, email: {data['_source']['email']}, mobile: {data['_source']['mobile']}, accountOwner: {data['_source']['accountOwner']}")
Output:
amount: 300, email: [email protected], mobile: 100000012457, accountOwner: Tom Hank
... but my guess is that you don't want to be that literal.

So, what's your search criteria?
#7
(Oct-09-2022, 06:25 AM)rob101 Wrote: To be 'literal', this will do what you've asked for...


NameError: name 'data' is not defined

f = open('df.json', 'r')

content = f.read(f"amount: {data['_source']['amount']}, email: {data['_source']['email']}, mobile: {data['_source']['mobile']}, accountOwner: {data['_source']['accountOwner']}")

print(content)
#8
Okay, so the JSON file has yet to be parsed; I'll work on that.

What about:
with open('df.json', 'r') as f:
    content = f.read()

# Split the raw text on commas; each piece looks like '"key":value'.
temp = content.split(',')

# Keep the pieces that mention the fields we want.
for item in temp:
    if 'amount' in item:
        amount = item.strip()
    elif 'email' in item:
        email = item.strip()
    elif 'mobile' in item:
        mobile = item.strip()
    elif 'accountOwner' in item:
        accountOwner = item.strip()

print(amount,email,mobile,accountOwner)
Output:
"amount":300 "email":"[email protected]" "mobile":"100000012457" "accountOwner":"Tom Hank"
#9
(Oct-09-2022, 07:39 AM)rob101 Wrote: Okay, so the JSON file has yet to be parsed; I'll work on that.
I think there is nothing wrong with the JSON. You should use the json module to parse it.

(Oct-09-2022, 07:22 AM)Calli Wrote: NameError: name 'data' is not defined
Of course, you should define data first.
The whole problem can be solved in 4 lines of code.
import json     # First import the json module.

# Opening the file was right; "with" also closes it for you afterwards.
with open('df.json', 'r') as f:
    data = json.load(f)
# Now "data" contains the content of the file. According to what you
# showed us, it is a nested dictionary.

# You can now print it like rob101 showed you.
print(f"amount: {data['_source']['amount']}, email: {data['_source']['email']}, mobile: {data['_source']['mobile']}, accountOwner: {data['_source']['accountOwner']}")
Output:
amount: 300, email: [email protected], mobile: 100000012457, accountOwner: Tom Hank
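If some records might lack one of the fields, a slightly more defensive variant could use `.get()`, which returns a default instead of raising KeyError. This is only a sketch: `pick_fields` and its `default` parameter are illustrative names, and the JSON text is inlined with `json.loads` so the snippet runs without df.json.

```python
import json

# Inlined stand-in for the file content; note that "mobile" is missing here.
raw = '{"_id": "1", "_source": {"amount": 300, "accountOwner": "Tom Hank"}}'

def pick_fields(record, keys, default=None):
    """Return a dict with the requested keys taken from record["_source"]."""
    # .get() avoids a KeyError when "_source" or a field is absent.
    source = record.get("_source", {})
    return {key: source.get(key, default) for key in keys}

data = json.loads(raw)
picked = pick_fields(data, ["amount", "mobile", "accountOwner"])
print(picked)
```

When reading from a file instead, `json.load(f)` takes the place of `json.loads(raw)`.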
#10
Oooo... does that mean I get some SATS? :D