Python Forum
Extract only certain text which are needed
#11
(Oct-09-2022, 10:08 AM)rob101 Wrote: Oooo... does that mean I get some SATS? :D

I reckon on receiving 1 bitcoin. But we can split it if you want. :D
#12
Just for info: the data shown is already a dictionary, so there is no need to put it through the json module.
I guess that, in code you don't show, you have been using Requests to get the JSON data.
Requests has a built-in JSON decoder that gives a Python dictionary back.

So in my post I did nothing more than copy your data as-is (and format it), then assign it to a variable named data.
>>> type(data)
<class 'dict'>
>>> 
>>> print(f"amount: {data['_source']['amount']}, email: {data['_source']['email']}, mobile: {data['_source']['mobile']}, accountOwner: {data['_source']['accountOwner']}")
amount: 300, email: [email protected], mobile: 100000012457, accountOwner: Tom Hank
So no json module is used, and the f-string from ibreeden's code works on it.
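To make the distinction concrete, here is a minimal sketch, assuming a record shaped like the one discussed in this thread. The sample values and the `summarize` helper are made up for illustration. JSON text needs decoding once (Requests does this for you in `response.json()`); after that you hold a plain dict and index it directly.

```python
import json

# Hypothetical sample record, shaped like the data discussed in this thread.
raw_text = (
    '{"_source": {"amount": 300, "email": "tom@example.com", '
    '"mobile": "100000012457", "accountOwner": "Tom Hank"}}'
)

# A JSON *string* has to be decoded once...
data = json.loads(raw_text)

# ...after which it is an ordinary dict and needs no further decoding.
print(type(data))


def summarize(record):
    """Format the four wanted fields of one record (illustrative helper)."""
    src = record["_source"]
    return (
        f"amount: {src['amount']}, email: {src['email']}, "
        f"mobile: {src['mobile']}, accountOwner: {src['accountOwner']}"
    )


print(summarize(data))
```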
#13
(Oct-09-2022, 09:41 AM)ibreeden Wrote:
(Oct-09-2022, 07:39 AM)rob101 Wrote: Okay, so the JSON file has yet to be translated; I'll work on that.
I think there is nothing wrong with the JSON. You should use the json module to do the translating.

(Oct-09-2022, 07:22 AM)Calli Wrote: NameError: name 'data' is not defined
Of course, you should define data.
The whole problem can be solved in 4 lines of code.
import json     # First import the json module.

with open('df.json', 'r') as f:  # You did that right.
    data = json.load(f)
# Now "data" contains the content of the file. According to what you
# showed us, it is a nested dictionary.

# You can now print it like rob101 showed you.
print(f"amount: {data['_source']['amount']}, email: {data['_source']['email']}, mobile: {data['_source']['mobile']}, accountOwner: {data['_source']['accountOwner']}")
Output:
amount: 300, email: [email protected], mobile: 100000012457, accountOwner: Tom Hank

This is the error I am getting:
Traceback (most recent call last):
  File "/media/redhat/test/Shodan/data/df.py", line 4, in <module>
    data = json.load(f)
  File "/usr/lib/python3.10/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/usr/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.10/json/decoder.py", line 340, in decode
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 1289)
#14
(Oct-09-2022, 07:39 AM)rob101 Wrote: Okay, so the JSON file has yet to be translated; I'll work on that.



What about:
with open('df.json', 'r') as f:
    content = f.read()

temp = content.split(',')

for item in temp:
    if 'amount' in item:
        amount = item.strip()
    elif 'email' in item:
        email = item.strip()
    elif 'mobile' in item:
        mobile = item.strip()
    elif 'accountOwner' in item:
        accountOwner = item.strip()

print(amount,email,mobile,accountOwner)
Output:
"amount":300 "email":"[email protected]" "mobile":"100000012457" "accountOwner":"Tom Hank"

With three million records, it prints out only one line of data.
It looks like it's printing out the last line of the file.
#15
(Oct-10-2022, 03:41 AM)Calli Wrote:
Error:
json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 1289)
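This means the parser read one complete JSON value and then found more data after it, starting at line 2. Either the file is malformed, or it holds several JSON objects one after another. Open the JSON file and look at the beginning of line 2. Check that the brackets and braces are balanced (every opening one needs a matching closing one) and that the commas are in the right places.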
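An "Extra data" message pointing at line 2, column 1 is also what the json module reports when a file holds one JSON object per line. The thread does not show the raw file, so treat that layout as an assumption. Under that assumption, a minimal sketch with two made-up records reproduces the error and then decodes each line on its own:

```python
import json

# Two hypothetical records, one JSON object per line (an assumed layout).
text = (
    '{"_source": {"amount": 300, "accountOwner": "Tom Hank"}}\n'
    '{"_source": {"amount": 450, "accountOwner": "Ann Lee"}}\n'
)

first_error = None
try:
    json.loads(text)              # the whole text is not ONE JSON value
except json.JSONDecodeError as err:
    first_error = err.msg         # message without the position details

# Decoding each non-empty line separately succeeds.
records = [json.loads(line) for line in text.splitlines() if line.strip()]
amounts = [rec["_source"]["amount"] for rec in records]

print(first_error)
print(amounts)
```

If the real file turns out to be a single large array or a genuinely damaged document, this per-line approach does not apply and the file itself needs inspecting.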
#16
(Oct-10-2022, 03:55 AM)Calli Wrote: It looks like it's printing out the last line of the file
Isn't that what you wanted? You want more? Then add 4 spaces before the print function.
#17
(Oct-10-2022, 03:55 AM)Calli Wrote: It looks like it's printing out the last line of the file

(Oct-10-2022, 07:06 AM)ibreeden Wrote: Isn't that what you wanted? You want more? Then add 4 spaces before the print function.

Indeed: move the print() function into the for loop.

To add: either that, or provide a bigger sample (say 10 records) and I'll sort it out. Looking back, you have 3,000,000 records; reading all of them into memory is not good practice, as we can do this one record at a time.
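A rough sketch of the one-record-at-a-time idea, assuming the file holds one JSON object per line with the fields under `_source` (the thread has not confirmed that layout). The `iter_records` helper, the `WANTED` tuple and the sample records are invented for illustration; a temporary file stands in for the real df.json.

```python
import json
import os
import tempfile

WANTED = ("amount", "email", "mobile", "accountOwner")


def iter_records(path):
    """Yield the wanted fields of each record, reading one line at a time."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:                  # file objects iterate lazily, line by line
            line = line.strip()
            if not line:
                continue                # skip blank lines
            source = json.loads(line).get("_source", {})
            yield {key: source.get(key) for key in WANTED}


# Demo data: a small temporary file standing in for the real df.json.
sample = [
    {"_source": {"amount": 300, "email": "tom@example.com",
                 "mobile": "100000012457", "accountOwner": "Tom Hank"}},
    {"_source": {"amount": 450, "email": "ann@example.com",
                 "mobile": "100000099999", "accountOwner": "Ann Lee"}},
]
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False,
                                 encoding="utf-8") as tmp:
    for rec in sample:
        tmp.write(json.dumps(rec) + "\n")

rows = list(iter_records(tmp.name))     # list() only because the demo is tiny
os.remove(tmp.name)

for row in rows:
    print(row)
```

On the real file you would loop over `iter_records('df.json')` directly instead of building a list, so only one record is held in memory at any moment.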
Sig:
>>> import this

The UNIX philosophy: "Do one thing, and do it well."

"The danger of computers becoming like humans is not as great as the danger of humans becoming like computers." :~ Konrad Zuse

"Everything should be made as simple as possible, but not simpler." :~ Albert Einstein
#18
(Oct-10-2022, 07:58 AM)rob101 Wrote:
(Oct-10-2022, 03:55 AM)Calli Wrote: It looks like it's printing out the last line of the file

(Oct-10-2022, 07:06 AM)ibreeden Wrote: Isn't that what you wanted? You want more? Then add 4 spaces before the print function.

Indeed: move the print() function into the for loop.

Traceback (most recent call last):
  File "/media/redhat/test/data/df.py", line 15, in <module>
    print(amount,email,mobile,accountOwner)
NameError: name 'amount' is not defined
with open ('df.json', 'r') as f:
    content = f.read()

temp = content.split(',')

for item in temp:
    if 'amount' in item:
        amount = item.strip()
    elif 'email' in item:
        email = item.strip()
    elif 'mobile' in item:
        mobile = item.strip()
    elif 'accountOwner' in item:
        accountOwner = item.strip()
    print(amount,email,mobile,accountOwner)
#19
(Oct-10-2022, 08:02 AM)Calli Wrote: NameError: name 'amount' is not defined

See my updated post (above).
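As far as the posted code shows, the NameError comes from print() now running on the very first fragment, before all four names have been assigned. One way around that, as a minimal sketch: collect the fields in a dict and print only once a record is complete. The `fragments` list stands in for `content.split(',')` on the real file, and `FIELDS`, `current` and `lines` are names made up for this example.

```python
# Hypothetical comma-split fragments, in the order they might appear in one record.
fragments = [
    '{"amount":300',
    '"email":"tom@example.com"',
    '"mobile":"100000012457"',
    '"accountOwner":"Tom Hank"}',
]

FIELDS = ("amount", "email", "mobile", "accountOwner")
current = {}    # fields collected so far for the record in progress
lines = []      # one output line per completed record

for item in fragments:
    for name in FIELDS:
        if f'"{name}"' in item:
            # Trim whitespace and any surrounding braces from the fragment.
            current[name] = item.strip(' {}\n')
            break
    if len(current) == len(FIELDS):     # print only once all four exist
        lines.append(" ".join(current[name] for name in FIELDS))
        current = {}                    # start collecting the next record

print("\n".join(lines))
```

Splitting on commas is fragile (a comma inside a value breaks it), so decoding with the json module remains the safer route once the file layout is known.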
#20
(Oct-10-2022, 07:06 AM)ibreeden Wrote:
(Oct-10-2022, 03:55 AM)Calli Wrote: It looks like it's printing out the last line of the file
Isn't that what you wanted? You want more? Then add 4 spaces before the print function.

It doesn't work even after adding the 4 spaces.