[SOLVED] Using Python to connect to an XML ?

**deanhystad** · Jul-08-2023, 02:40 AM

I cannot disagree much more. I find pysimplegui anything but simple.

jehoshua · Jul-08-2023, 03:39 AM

(Jul-08-2023, 02:40 AM)deanhystad Wrote: I find pysimplegui anything but simple.

Oh well, 'horses for courses' as they say. Big Grin

jehoshua · (This post was last modified: Jul-11-2023, 12:34 AM by jehoshua.)

(Jul-08-2023, 02:40 AM)deanhystad Wrote: I cannot disagree much more. I find pysimplegui anything but simple.

@deanhystad - What GUI do you recommend please ?

jehoshua · Nov-28-2024, 04:53 AM

If the XML data is like this ..

Quote: <TRANSACTION id="T000000000000014727" entrydate="2024-04-20" memo="" commodity="AUD" postdate="2024-04-20">
<SPLITS>
<SPLIT id="S0001" number="" reconciledate="" price="1/1" payee="P000035" account="A000831" value="-1499/50" memo="AAA Duracell - 48 pack" action="" bankid="" reconcileflag="2" shares="-1499/50"/>
<SPLIT id="S0002" number="" reconciledate="" price="1/1" payee="P000035" account="A000620" value="1499/50" memo="AAA Duracell - 48 pack" action="" bankid="" reconcileflag="0" shares="1499/50"/>
</SPLITS>
</TRANSACTION>

and I want the transactions where the account is equal to "A000831" or "A000832", this works ok .

#!/usr/bin/python

import bs4
from bs4 import BeautifulSoup

with open('testxml.xml', 'r') as f:
    file = f.read()

soup = BeautifulSoup(file, 'xml')

for split in soup.find_all("SPLIT", account=["A000831","A000832"]):
    print(split)

print('finished')

1. How do I get the 'entrydate' when it is at a higher level ? Meaning it is part of the TRANSACTION data

Quote:<TRANSACTION id="T000000000000014727" entrydate="2024-04-20" memo="" commodity="AUD" postdate="2024-04-20">

I seem to remember there was a function in BeautifulSoup that can go up one or more levels (parents I guess).

2. How do I total up the amounts for each account ? Just 2 totals to summarise. The amount is stored in the SPLIT as

Quote:value="-1499/50"

In the above example the amount is $ -29.98

jehoshua · Nov-29-2024, 02:36 AM

As I only needed to go up 2 levels, this seems to work

#!/usr/bin/python

import bs4
from bs4 import BeautifulSoup

with open('testxml.xml', 'r') as f:
    file = f.read()

soup = BeautifulSoup(file, 'xml')

# only transactions with account A000831 or A000832
for split in soup.find_all("SPLIT", account=["A000831","A000832"]):
    print(split)
    parents=split.find_parents(limit=2)
    for parent in parents:
        print(parent)

print('finished')

This is the output

Quote:<SPLIT account="A000831" action="" bankid="" id="S0001" memo="AAA Duracell - 48 pack" number="" payee="P000035" price="1/1" reconciledate="" reconcileflag="2" shares="-1499/50" value="-1499/50"/>
<SPLITS>
<SPLIT account="A000831" action="" bankid="" id="S0001" memo="AAA Duracell - 48 pack" number="" payee="P000035" price="1/1" reconciledate="" reconcileflag="2" shares="-1499/50" value="-1499/50"/>
<SPLIT account="A000620" action="" bankid="" id="S0002" memo="AAA Duracell - 48 pack" number="" payee="P000035" price="1/1" reconciledate="" reconcileflag="0" shares="1499/50" value="1499/50"/>
</SPLITS>
<TRANSACTION commodity="AUD" entrydate="2024-04-20" id="T000000000000014727" memo="" postdate="2024-04-20">

The TRANSACTION tag is found, however the results/output are not always what is expected.

jehoshua · Nov-30-2024, 05:11 AM

Quote:<TRANSACTION id="T000000000000014727" entrydate="2024-04-20" memo="" commodity="AUD" postdate="2024-04-20">
<SPLITS>
<SPLIT id="S0001" number="" reconciledate="" price="1/1" payee="P000035" account="A000831" value="-1499/50" memo="AAA Duracell - 48 pack" action="" bankid="" reconcileflag="2" shares="-1499/50"/>
<SPLIT id="S0002" number="" reconciledate="" price="1/1" payee="P000035" account="A000620" value="1499/50" memo="AAA Duracell - 48 pack" action="" bankid="" reconcileflag="0" shares="1499/50"/>
</SPLITS>
</TRANSACTION>

#!/usr/bin/python

import bs4
from bs4 import BeautifulSoup

with open('testxml.xml', 'r') as f:
    file = f.read()

soup = BeautifulSoup(file, 'xml')

# only transactions with account A000831 or A000832
results = soup.find_all("SPLIT", account=["A000831","A000832"])
for result in results:
    if (result.text):
        print(result.get_text())
    else:
        print(result)
        print("else")

print('finished')

Output

Quote:<SPLIT account="A000831" action="" bankid="" id="S0001" memo="AAA Duracell - 48 pack" number="" payee="P000035" price="1/1" reconciledate="" reconcileflag="2" shares="-1499/50" value="-1499/50"/>
else

This isn't true obviously ..

if (result.text):

I have been looking through the objects and other parts that are returned from BeautifulSoup , but not finding how to retrieve the tag "value" and it's value of -1499/50 ??

Have looked at find() , it's not returning anything. What I need is to extract the following from the string

Quote:value="-1499/50"

and the split into two numeric values..

value1 = -1499
value2 = 50

then divide value1 by value2, and keep an accumulative total by account number. Only 2 - with accounts A000831 and A000832

***snippsat*** · (This post was last modified: Nov-30-2024, 09:51 AM by snippsat.)

(Nov-30-2024, 05:11 AM)jehoshua Wrote: I have been looking through the objects and other parts that are returned from BeautifulSoup , but not finding how to retrieve the tag "value" and it's value of -1499/50 ??

value="1499/50" it's an attribute in the xml tag,not text.

import bs4
from bs4 import BeautifulSoup

with open('testxml.xml', 'r') as f:
    file = f.read()

soup = BeautifulSoup(file, 'xml')
# only transactions with account A000831 or A000832
results = soup.find_all("SPLIT", account=["A000831","A000832"])

>>> results[0].get('value')
'-1499/50'
>>> results[0].get('value').split('/')
['-1499', '50']
>>> value1, value2 = results[0].get('value').split('/')
>>> value1
'-1499'
>>> value2
'50'

So as you see i use .get() as BS return a dictionary with all attributes.
Can look all with .attrs.

>>> results[0].attrs
{'account': 'A000831',
 'action': '',
 'bankid': '',
 'id': 'S0001',
 'memo': 'AAA Duracell - 48 pack',
 'number': '',
 'payee': 'P000035',
 'price': '1/1',
 'reconciledate': '',
 'reconcileflag': '2',
 'shares': '-1499/50',
 'value': '-1499/50'}

jehoshua · (This post was last modified: Dec-01-2024, 12:10 AM by jehoshua.)

Thanks, that works fine. This test file has more data.

Quote: <TRANSACTION id="T000000000000014726" entrydate="2024-04-20" memo="" commodity="AUD" postdate="2024-04-20">
<SPLITS>
<SPLIT id="S0001" number="" reconciledate="" price="1/1" payee="P000035" account="A000831" value="-547/50" memo="keys cut (2) $7.94, plus key tags $3" action="" bankid="" reconcileflag="2" shares="-547/50"/>
<SPLIT id="S0002" number="" reconciledate="" price="1/1" payee="P000035" account="A000021" value="547/50" memo="keys cut (2) $7.94, plus key tags $3" action="" bankid="" reconcileflag="0" shares="547/50"/>
</SPLITS>
</TRANSACTION>
<TRANSACTION id="T000000000000014727" entrydate="2024-04-20" memo="" commodity="AUD" postdate="2024-04-20">
<SPLITS>
<SPLIT id="S0001" number="" reconciledate="" price="1/1" payee="P000035" account="A000832" value="-1499/50" memo="AAA Duracell - 48 pack" action="" bankid="" reconcileflag="2" shares="-1499/50"/>
<SPLIT id="S0002" number="" reconciledate="" price="1/1" payee="P000035" account="A000620" value="1499/50" memo="AAA Duracell - 48 pack" action="" bankid="" reconcileflag="0" shares="1499/50"/>
</SPLITS>
</TRANSACTION>
<TRANSACTION id="T000000000000014728" entrydate="2024-04-20" memo="" commodity="AUD" postdate="2024-04-20">
<SPLITS>
<SPLIT id="S0001" number="" reconciledate="" price="1/1" payee="P000035" account="A000831" value="-1499/50" memo="AA Duracell - 48pk" action="" bankid="" reconcileflag="2" shares="-1499/50"/>
<SPLIT id="S0002" number="" reconciledate="" price="1/1" payee="P000035" account="A000620" value="1499/50" memo="AA Duracell - 48pk" action="" bankid="" reconcileflag="0" shares="1499/50"/>
</SPLITS>
</TRANSACTION>

#!/usr/bin/python

import bs4
from bs4 import BeautifulSoup

with open('testxml_1.xml', 'r') as f:
    file = f.read()

soup = BeautifulSoup(file, 'xml')

# only transactions with account A000831 or A000832
accounts = ["A000831","A000832"]

for tag in soup.find_all("SPLIT", account=accounts):
    print("Value:", tag.get('value'))
    print(tag.get('value').split('/'))
    value1, value2 = tag.get('value').split('/')
    print(value1)
    print(value2)
    print(tag.attrs)
    amount = int(value1)/int(value2)
    print("Amount:", amount)
    # find the index of this account in the list accounts
    print("Account index:", accounts.index(tag['account']))


print('finished')

and the output ..

Quote:Value: -547/50
['-547', '50']
-547
50
{'id': 'S0001', 'number': '', 'reconciledate': '', 'price': '1/1', 'payee': 'P000035', 'account': 'A000831', 'value': '-547/50', 'memo': 'keys cut (2) $7.94, plus key tags $3', 'action': '', 'bankid': '', 'reconcileflag': '2', 'shares': '-547/50'}
Amount: -10.94
Account index: 0
Value: -1499/50
['-1499', '50']
-1499
50
{'id': 'S0001', 'number': '', 'reconciledate': '', 'price': '1/1', 'payee': 'P000035', 'account': 'A000832', 'value': '-1499/50', 'memo': 'AAA Duracell - 48 pack', 'action': '', 'bankid': '', 'reconcileflag': '2', 'shares': '-1499/50'}
Amount: -29.98
Account index: 1
Value: -1499/50
['-1499', '50']
-1499
50
{'id': 'S0001', 'number': '', 'reconciledate': '', 'price': '1/1', 'payee': 'P000035', 'account': 'A000831', 'value': '-1499/50', 'memo': 'AA Duracell - 48pk', 'action': '', 'bankid': '', 'reconcileflag': '2', 'shares': '-1499/50'}
Amount: -29.98
Account index: 0
finished

Possibly I should protect a potential "dive by zero" and check that "value2" is not zero. Also the filter by accounts is now in a list, so the accumulation by account can be by the key/index found in the list "accounts".

jehoshua · (This post was last modified: Dec-01-2024, 12:58 AM by jehoshua.)

Now accumulating amounts by account number

#!/usr/bin/python

import bs4
from bs4 import BeautifulSoup

with open('testxml_1.xml', 'r') as f:
    file = f.read()

soup = BeautifulSoup(file, 'xml')

# only transactions with account A000831 or A000832
accounts = ["A000831","A000832"]
totals = [0,0]          #totals by account

for tag in soup.find_all("SPLIT", account=accounts):
    print("Value:", tag.get('value'))
    print(tag.get('value').split('/'))
    value1, value2 = tag.get('value').split('/')
    print(value1)
    print(value2)
    print(tag.attrs)
    amount = int(value1)/int(value2)
    print("Amount:", amount)

    # find the index of this account in the list accounts
    index_of_this_account = accounts.index(tag['account'])
    print("Account index:", index_of_this_account)

    # Adding two amounts - cumulative totals
    totals[index_of_this_account]  += amount

print('finished')
print(totals)

OUTPUT

Quote:Value: -547/50
['-547', '50']
-547
50
{'id': 'S0001', 'number': '', 'reconciledate': '', 'price': '1/1', 'payee': 'P000035', 'account': 'A000831', 'value': '-547/50', 'memo': 'keys cut (2) $7.94, plus key tags $3', 'action': '', 'bankid': '', 'reconcileflag': '2', 'shares': '-547/50'}
Amount: -10.94
Account index: 0
Value: -1499/50
['-1499', '50']
-1499
50
{'id': 'S0001', 'number': '', 'reconciledate': '', 'price': '1/1', 'payee': 'P000035', 'account': 'A000832', 'value': '-1499/50', 'memo': 'AAA Duracell - 48 pack', 'action': '', 'bankid': '', 'reconcileflag': '2', 'shares': '-1499/50'}
Amount: -29.98
Account index: 1
Value: -1499/50
['-1499', '50']
-1499
50
{'id': 'S0001', 'number': '', 'reconciledate': '', 'price': '1/1', 'payee': 'P000035', 'account': 'A000831', 'value': '-1499/50', 'memo': 'AA Duracell - 48pk', 'action': '', 'bankid': '', 'reconcileflag': '2', 'shares': '-1499/50'}
Amount: -29.98
Account index: 0
finished
[-40.92, -29.98]

It works, but is there a better method to accumulate the totals ??

..Later - actually it worked okay with only a few transactions. Just ran it on the full file and the totals don't match. Back to the drawing board. : Blush

***snippsat*** · (This post was last modified: Dec-01-2024, 12:05 PM by snippsat.)

Like this is,and i use lxml as parser it's better
So now first collect all values in a list,then do the calculation.

import bs4
from bs4 import BeautifulSoup
from fractions import Fraction

with open('testxml.xml', 'r') as f:
    file = f.read()

soup = BeautifulSoup(file, 'lxml')
# only transactions with account A000831 or A000832
value_lst = []
results = soup.find_all("split", account=["A000831", "A000832"])
for tag in results:
    value_lst.append(tag.get('value'))

# Fraction(val) converts the string fraction to a Fraction object.
# Then iterate over value_lst and do calculation.
print(value_lst)
decimal_lst = [float(Fraction(val)) for val in value_lst]
print(decimal_lst)
print(sum(decimal_lst))

Output:['-547/50', '-1499/50', '-1499/50']
[-10.94, -29.98, -29.98]
-70.9

Possibly Related Threads…
Thread		Author	Replies	Views	Last Post
	connect sql by python using txt. file	dawid294	2	2,130	Jan-12-2024, 08:54 PM Last Post: deanhystad
	How to Connect to PostgreSQL Through Jump Server and SSH Tunnel using Python?	nishans	1	3,949	Jan-02-2024, 10:37 AM Last Post: khanzain
	python connect to mssql	wailoonho	7	5,720	Dec-07-2023, 02:06 AM Last Post: wailoonho
	Trying to make a bot to connect on discord with Selenium Python	johnsmith43	2	75,439	Mar-21-2022, 02:56 PM Last Post: Cloudytechnical
	How to connect Mysql databse with python and VSCode IDE	madhusoodhananER	1	9,765	Oct-31-2019, 10:15 AM Last Post: Larz60+
	Connect a Teradata DB to Python	OscarBoots	10	11,399	Jan-31-2019, 10:23 PM Last Post: OscarBoots
	Python connect to Sybase IQ	FORTITUDE	1	5,758	Jan-24-2019, 02:14 AM Last Post: micseydel
	How to connect Atom and Python?	Jack_Sparrow	1	4,443	May-01-2018, 10:53 AM Last Post: Larz60+
	connect to remote database via python script	sunstar20	5	6,573	Apr-23-2018, 11:05 AM Last Post: Gribouillis
	Can anyone Please help on fixing this python code to connect and save things on sqlit	Roscoes	2	3,624	Mar-06-2018, 04:48 AM Last Post: Roscoes

[SOLVED] Using Python to connect to an XML ?

User Panel Messages

Announcements