Jul-06-2023, 12:40 AM
The data is of the format
>> <PAYEE matchingenabled="0" email="" name="Transfers - inter account" reference="" id="P000001">
<ADDRESS street="" telephone="" state="" city="" postcode=""/>
</PAYEE>
and this code
    #!/usr/bin/python
    from bs4 import BeautifulSoup

    with open('testxml.xml', 'r') as f:
        file = f.read()
    soup = BeautifulSoup(file, 'xml')
    for tag in soup.find_all('PAYEE'):
        print(f'{tag.name}: {tag.get_text}')

displays all the payees, but it also outputs these 2 lines
>>PAYEE: <bound method PageElement.get_text of <PAYEE id="P000242"/>>
PAYEE: <bound method PageElement.get_text of <PAYEE id="P000344"/>>
at the end. This is caused by this type of data
>><REPORT investments="0" group="Transactions" type="querytable 1.15" rowtype="category" querycolumns="number,payee,tag,account" name="Transactions by Category (Customized)" comment="Custom Report" convertcurrency="1" loans="0" showcolumntotals="1" detail="all" includestransfers="0" skipZero="0" hidetransactions="0" favorite="0" datelock="userdefined" tax="0" id="R000023">
<PAYEE id="P000242"/>
<CATEGORY id="A000536"/>
<DATES to="2020-08-05" from="2001-01-01"/>
</REPORT>
So the code works, but it includes PAYEE elements I don't want: those PAYEE entries are children of a different parent (REPORT). I need to parse only the PAYEE elements at the parent level. I have tried 'recursive=no' and various other ways of excluding children and siblings while keeping only the parent-level elements, but every attempt gave syntax errors as I adjusted the code to suit.
I simply need the 'find_all' for PAYEE, but only at the parent level.
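
One possible way to restrict the search to parent-level PAYEE elements is sketched below. It uses the standard library's `xml.etree.ElementTree` rather than BeautifulSoup, because its path syntax can name the direct parent explicitly. The `ROOT`, `PAYEES` and `REPORTS` wrapper elements are assumptions (the post shows only the PAYEE and REPORT fragments), the REPORT attributes are trimmed for brevity, and `top_level_payees` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the ROOT, PAYEES and REPORTS wrappers are assumptions;
# only the PAYEE and REPORT fragments come from the post (REPORT is trimmed).
SAMPLE = """<ROOT>
 <PAYEES>
  <PAYEE matchingenabled="0" email="" name="Transfers - inter account" reference="" id="P000001">
   <ADDRESS street="" telephone="" state="" city="" postcode=""/>
  </PAYEE>
 </PAYEES>
 <REPORTS>
  <REPORT name="Transactions by Category (Customized)" id="R000023">
   <PAYEE id="P000242"/>
   <CATEGORY id="A000536"/>
   <DATES to="2020-08-05" from="2001-01-01"/>
  </REPORT>
 </REPORTS>
</ROOT>"""


def top_level_payees(xml_text, container="PAYEES"):
    """Return the PAYEE elements that are direct children of `container`."""
    root = ET.fromstring(xml_text)
    # "./PAYEES/PAYEE" matches only PAYEE elements directly under PAYEES,
    # so the PAYEE reference nested inside REPORT is not selected.
    return root.findall(f"./{container}/PAYEE")


payees = top_level_payees(SAMPLE)
for payee in payees:
    print(f"{payee.tag}: {payee.get('id')} {payee.get('name')}")
```

With BeautifulSoup the same idea would be to call `find_all('PAYEE', recursive=False)` on the parent element instead of on `soup`, for example `soup.find('PAYEES').find_all('PAYEE', recursive=False)`, again assuming a `PAYEES` wrapper. The argument is the Python boolean `False`, not `no`. Separately, `tag.get_text` without parentheses refers to the method itself, which is why the output shows `<bound method ...>`; `tag.get_text()` calls it.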