Python Forum
[SOLVED] Using Python to connect to an XML ?
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
[SOLVED] Using Python to connect to an XML ?
#11
I cannot disagree much more. I find pysimplegui anything but simple.
Larz60+ and jehoshua like this post
Reply
#12
(Jul-08-2023, 02:40 AM)deanhystad Wrote: I find pysimplegui anything but simple.

Oh well, 'horses for courses' as they say. Big Grin
Reply
#13
(Jul-08-2023, 02:40 AM)deanhystad Wrote: I cannot disagree much more. I find pysimplegui anything but simple.

@deanhystad - What GUI do you recommend please ?
Reply
#14
If the XML data is like this ..

Quote: <TRANSACTION id="T000000000000014727" entrydate="2024-04-20" memo="" commodity="AUD" postdate="2024-04-20">
<SPLITS>
<SPLIT id="S0001" number="" reconciledate="" price="1/1" payee="P000035" account="A000831" value="-1499/50" memo="AAA Duracell - 48 pack" action="" bankid="" reconcileflag="2" shares="-1499/50"/>
<SPLIT id="S0002" number="" reconciledate="" price="1/1" payee="P000035" account="A000620" value="1499/50" memo="AAA Duracell - 48 pack" action="" bankid="" reconcileflag="0" shares="1499/50"/>
</SPLITS>
</TRANSACTION>

and I want the transactions where the account is equal to "A000831" or "A000832", this works ok .

#!/usr/bin/python

import bs4
from bs4 import BeautifulSoup

with open('testxml.xml', 'r') as f:
    file = f.read()

soup = BeautifulSoup(file, 'xml')

for split in soup.find_all("SPLIT", account=["A000831","A000832"]):
    print(split)

print('finished')
1. How do I get the 'entrydate' when it is at a higher level ? Meaning it is part of the TRANSACTION data

Quote:<TRANSACTION id="T000000000000014727" entrydate="2024-04-20" memo="" commodity="AUD" postdate="2024-04-20">

I seem to remember there was a function in BeautifulSoup that can go up one or more levels (parents I guess).

2. How do I total up the amounts for each account ? Just 2 totals to summarise. The amount is stored in the SPLIT as

Quote:value="-1499/50"

In the above example the amount is $ -29.98
Reply
#15
As I only needed to go up 2 levels, this seems to work

#!/usr/bin/python

import bs4
from bs4 import BeautifulSoup

with open('testxml.xml', 'r') as f:
    file = f.read()

soup = BeautifulSoup(file, 'xml')

# only transactions with account A000831 or A000832
for split in soup.find_all("SPLIT", account=["A000831","A000832"]):
    print(split)
    parents=split.find_parents(limit=2)
    for parent in parents:
        print(parent)

print('finished')
This is the output

Quote:<SPLIT account="A000831" action="" bankid="" id="S0001" memo="AAA Duracell - 48 pack" number="" payee="P000035" price="1/1" reconciledate="" reconcileflag="2" shares="-1499/50" value="-1499/50"/>
<SPLITS>
<SPLIT account="A000831" action="" bankid="" id="S0001" memo="AAA Duracell - 48 pack" number="" payee="P000035" price="1/1" reconciledate="" reconcileflag="2" shares="-1499/50" value="-1499/50"/>
<SPLIT account="A000620" action="" bankid="" id="S0002" memo="AAA Duracell - 48 pack" number="" payee="P000035" price="1/1" reconciledate="" reconcileflag="0" shares="1499/50" value="1499/50"/>
</SPLITS>
<TRANSACTION commodity="AUD" entrydate="2024-04-20" id="T000000000000014727" memo="" postdate="2024-04-20">

The TRANSACTION tag is found, however the results/output are not always what is expected.
Reply
#16
Quote:<TRANSACTION id="T000000000000014727" entrydate="2024-04-20" memo="" commodity="AUD" postdate="2024-04-20">
<SPLITS>
<SPLIT id="S0001" number="" reconciledate="" price="1/1" payee="P000035" account="A000831" value="-1499/50" memo="AAA Duracell - 48 pack" action="" bankid="" reconcileflag="2" shares="-1499/50"/>
<SPLIT id="S0002" number="" reconciledate="" price="1/1" payee="P000035" account="A000620" value="1499/50" memo="AAA Duracell - 48 pack" action="" bankid="" reconcileflag="0" shares="1499/50"/>
</SPLITS>
</TRANSACTION>

#!/usr/bin/python

import bs4
from bs4 import BeautifulSoup

with open('testxml.xml', 'r') as f:
    file = f.read()

soup = BeautifulSoup(file, 'xml')

# only transactions with account A000831 or A000832
results = soup.find_all("SPLIT", account=["A000831","A000832"])
for result in results:
    if (result.text):
        print(result.get_text())
    else:
        print(result)
        print("else")

print('finished')
Output

Quote:<SPLIT account="A000831" action="" bankid="" id="S0001" memo="AAA Duracell - 48 pack" number="" payee="P000035" price="1/1" reconciledate="" reconcileflag="2" shares="-1499/50" value="-1499/50"/>
else

This isn't true obviously ..

if (result.text):
I have been looking through the objects and other parts that are returned from BeautifulSoup , but not finding how to retrieve the tag "value" and it's value of -1499/50 ??

Have looked at find() , it's not returning anything. What I need is to extract the following from the string

Quote:value="-1499/50"

and the split into two numeric values..

value1 = -1499
value2 = 50

then divide value1 by value2, and keep an accumulative total by account number. Only 2 - with accounts A000831 and A000832
Reply
#17
(Nov-30-2024, 05:11 AM)jehoshua Wrote: I have been looking through the objects and other parts that are returned from BeautifulSoup , but not finding how to retrieve the tag "value" and it's value of -1499/50 ??
value="1499/50" it's an attribute in the xml tag,not text.
import bs4
from bs4 import BeautifulSoup

with open('testxml.xml', 'r') as f:
    file = f.read()

soup = BeautifulSoup(file, 'xml')
# only transactions with account A000831 or A000832
results = soup.find_all("SPLIT", account=["A000831","A000832"])
>>> results[0].get('value')
'-1499/50'
>>> results[0].get('value').split('/')
['-1499', '50']
>>> value1, value2 = results[0].get('value').split('/')
>>> value1
'-1499'
>>> value2
'50'
So as you see i use .get() as BS return a dictionary with all attributes.
Can look all with .attrs.
>>> results[0].attrs
{'account': 'A000831',
 'action': '',
 'bankid': '',
 'id': 'S0001',
 'memo': 'AAA Duracell - 48 pack',
 'number': '',
 'payee': 'P000035',
 'price': '1/1',
 'reconciledate': '',
 'reconcileflag': '2',
 'shares': '-1499/50',
 'value': '-1499/50'}
jehoshua likes this post
Reply
#18
Thanks, that works fine. This test file has more data.

Quote: <TRANSACTION id="T000000000000014726" entrydate="2024-04-20" memo="" commodity="AUD" postdate="2024-04-20">
<SPLITS>
<SPLIT id="S0001" number="" reconciledate="" price="1/1" payee="P000035" account="A000831" value="-547/50" memo="keys cut (2) $7.94, plus key tags $3" action="" bankid="" reconcileflag="2" shares="-547/50"/>
<SPLIT id="S0002" number="" reconciledate="" price="1/1" payee="P000035" account="A000021" value="547/50" memo="keys cut (2) $7.94, plus key tags $3" action="" bankid="" reconcileflag="0" shares="547/50"/>
</SPLITS>
</TRANSACTION>
<TRANSACTION id="T000000000000014727" entrydate="2024-04-20" memo="" commodity="AUD" postdate="2024-04-20">
<SPLITS>
<SPLIT id="S0001" number="" reconciledate="" price="1/1" payee="P000035" account="A000832" value="-1499/50" memo="AAA Duracell - 48 pack" action="" bankid="" reconcileflag="2" shares="-1499/50"/>
<SPLIT id="S0002" number="" reconciledate="" price="1/1" payee="P000035" account="A000620" value="1499/50" memo="AAA Duracell - 48 pack" action="" bankid="" reconcileflag="0" shares="1499/50"/>
</SPLITS>
</TRANSACTION>
<TRANSACTION id="T000000000000014728" entrydate="2024-04-20" memo="" commodity="AUD" postdate="2024-04-20">
<SPLITS>
<SPLIT id="S0001" number="" reconciledate="" price="1/1" payee="P000035" account="A000831" value="-1499/50" memo="AA Duracell - 48pk" action="" bankid="" reconcileflag="2" shares="-1499/50"/>
<SPLIT id="S0002" number="" reconciledate="" price="1/1" payee="P000035" account="A000620" value="1499/50" memo="AA Duracell - 48pk" action="" bankid="" reconcileflag="0" shares="1499/50"/>
</SPLITS>
</TRANSACTION>

#!/usr/bin/python

import bs4
from bs4 import BeautifulSoup

with open('testxml_1.xml', 'r') as f:
    file = f.read()

soup = BeautifulSoup(file, 'xml')

# only transactions with account A000831 or A000832
accounts = ["A000831","A000832"]

for tag in soup.find_all("SPLIT", account=accounts):
    print("Value:", tag.get('value'))
    print(tag.get('value').split('/'))
    value1, value2 = tag.get('value').split('/')
    print(value1)
    print(value2)
    print(tag.attrs)
    amount = int(value1)/int(value2)
    print("Amount:", amount)
    # find the index of this account in the list accounts
    print("Account index:", accounts.index(tag['account']))


print('finished')
and the output ..

Quote:Value: -547/50
['-547', '50']
-547
50
{'id': 'S0001', 'number': '', 'reconciledate': '', 'price': '1/1', 'payee': 'P000035', 'account': 'A000831', 'value': '-547/50', 'memo': 'keys cut (2) $7.94, plus key tags $3', 'action': '', 'bankid': '', 'reconcileflag': '2', 'shares': '-547/50'}
Amount: -10.94
Account index: 0
Value: -1499/50
['-1499', '50']
-1499
50
{'id': 'S0001', 'number': '', 'reconciledate': '', 'price': '1/1', 'payee': 'P000035', 'account': 'A000832', 'value': '-1499/50', 'memo': 'AAA Duracell - 48 pack', 'action': '', 'bankid': '', 'reconcileflag': '2', 'shares': '-1499/50'}
Amount: -29.98
Account index: 1
Value: -1499/50
['-1499', '50']
-1499
50
{'id': 'S0001', 'number': '', 'reconciledate': '', 'price': '1/1', 'payee': 'P000035', 'account': 'A000831', 'value': '-1499/50', 'memo': 'AA Duracell - 48pk', 'action': '', 'bankid': '', 'reconcileflag': '2', 'shares': '-1499/50'}
Amount: -29.98
Account index: 0
finished

Possibly I should protect a potential "dive by zero" and check that "value2" is not zero. Also the filter by accounts is now in a list, so the accumulation by account can be by the key/index found in the list "accounts".
Reply
#19
Now accumulating amounts by account number

#!/usr/bin/python

import bs4
from bs4 import BeautifulSoup

with open('testxml_1.xml', 'r') as f:
    file = f.read()

soup = BeautifulSoup(file, 'xml')

# only transactions with account A000831 or A000832
accounts = ["A000831","A000832"]
totals = [0,0]          #totals by account

for tag in soup.find_all("SPLIT", account=accounts):
    print("Value:", tag.get('value'))
    print(tag.get('value').split('/'))
    value1, value2 = tag.get('value').split('/')
    print(value1)
    print(value2)
    print(tag.attrs)
    amount = int(value1)/int(value2)
    print("Amount:", amount)

    # find the index of this account in the list accounts
    index_of_this_account = accounts.index(tag['account'])
    print("Account index:", index_of_this_account)

    # Adding two amounts - cumulative totals
    totals[index_of_this_account]  += amount

print('finished')
print(totals)
OUTPUT

Quote:Value: -547/50
['-547', '50']
-547
50
{'id': 'S0001', 'number': '', 'reconciledate': '', 'price': '1/1', 'payee': 'P000035', 'account': 'A000831', 'value': '-547/50', 'memo': 'keys cut (2) $7.94, plus key tags $3', 'action': '', 'bankid': '', 'reconcileflag': '2', 'shares': '-547/50'}
Amount: -10.94
Account index: 0
Value: -1499/50
['-1499', '50']
-1499
50
{'id': 'S0001', 'number': '', 'reconciledate': '', 'price': '1/1', 'payee': 'P000035', 'account': 'A000832', 'value': '-1499/50', 'memo': 'AAA Duracell - 48 pack', 'action': '', 'bankid': '', 'reconcileflag': '2', 'shares': '-1499/50'}
Amount: -29.98
Account index: 1
Value: -1499/50
['-1499', '50']
-1499
50
{'id': 'S0001', 'number': '', 'reconciledate': '', 'price': '1/1', 'payee': 'P000035', 'account': 'A000831', 'value': '-1499/50', 'memo': 'AA Duracell - 48pk', 'action': '', 'bankid': '', 'reconcileflag': '2', 'shares': '-1499/50'}
Amount: -29.98
Account index: 0
finished
[-40.92, -29.98]

It works, but is there a better method to accumulate the totals ??

..Later - actually it worked okay with only a few transactions. Just ran it on the full file and the totals don't match. Back to the drawing board. : Blush
Reply
#20
Like this is,and i use lxml as parser it's better
So now first collect all values in a list,then do the calculation.
import bs4
from bs4 import BeautifulSoup
from fractions import Fraction

with open('testxml.xml', 'r') as f:
    file = f.read()

soup = BeautifulSoup(file, 'lxml')
# only transactions with account A000831 or A000832
value_lst = []
results = soup.find_all("split", account=["A000831", "A000832"])
for tag in results:
    value_lst.append(tag.get('value'))

# Fraction(val) converts the string fraction to a Fraction object.
# Then iterate over value_lst and do calculation.
print(value_lst)
decimal_lst = [float(Fraction(val)) for val in value_lst]
print(decimal_lst)
print(sum(decimal_lst))
Output:
['-547/50', '-1499/50', '-1499/50'] [-10.94, -29.98, -29.98] -70.9
jehoshua likes this post
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  connect sql by python using txt. file dawid294 2 2,130 Jan-12-2024, 08:54 PM
Last Post: deanhystad
  How to Connect to PostgreSQL Through Jump Server and SSH Tunnel using Python? nishans 1 3,949 Jan-02-2024, 10:37 AM
Last Post: khanzain
  python connect to mssql wailoonho 7 5,720 Dec-07-2023, 02:06 AM
Last Post: wailoonho
  Trying to make a bot to connect on discord with Selenium Python johnsmith43 2 75,439 Mar-21-2022, 02:56 PM
Last Post: Cloudytechnical
  How to connect Mysql databse with python and VSCode IDE madhusoodhananER 1 9,765 Oct-31-2019, 10:15 AM
Last Post: Larz60+
  Connect a Teradata DB to Python OscarBoots 10 11,399 Jan-31-2019, 10:23 PM
Last Post: OscarBoots
  Python connect to Sybase IQ FORTITUDE 1 5,758 Jan-24-2019, 02:14 AM
Last Post: micseydel
  How to connect Atom and Python? Jack_Sparrow 1 4,443 May-01-2018, 10:53 AM
Last Post: Larz60+
  connect to remote database via python script sunstar20 5 6,573 Apr-23-2018, 11:05 AM
Last Post: Gribouillis
  Can anyone Please help on fixing this python code to connect and save things on sqlit Roscoes 2 3,624 Mar-06-2018, 04:48 AM
Last Post: Roscoes

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020