(Feb-23-2019, 03:17 PM)erdem_ustunmu Wrote: How do I get the tags, elements, and attributes between the <dataDscr> ... </dataDscr> tags and transfer them to CSV?
Since it's a big file, you have to start by testing how to get the data out, then think about the structure you want in the CSV.
How do I find and loop over the <dataDscr><var> ... </var></dataDscr> tags?
There are many XML files, so do I have to define all the tags and their attributes one by one?
from bs4 import BeautifulSoup

# The 'xml' parser needs lxml installed
with open('NPL_2008_LFS_v01_M_v01_A_ILOVAR.xml', encoding='utf-8') as f:
    soup = BeautifulSoup(f, 'xml')
data = soup.find('dataDscr')

So inside dataDscr there are many var tags. find() gets the first one, and find_all() gets all of them. Look at the data in the first one.
>>> var = data.find('var')
>>> var
<var ID="V270" dcml="0" files="F6" intrvl="contin" name="PSU">
<location width="16"/>
<labl> PSU </labl>
<valrng>
<range max="1800" min="1001"/>
</valrng>
<sumStat type="vald"> 76208 </sumStat>
<sumStat type="invd"> 0 </sumStat>
<sumStat type="min"> 1001 </sumStat>
<sumStat type="max"> 1800 </sumStat>
<sumStat type="mean"> 1412.79 </sumStat>
<sumStat type="stdev"> 231.955 </sumStat>
<varFormat schema="other" type="numeric"/>
</var>

# All attributes
>>> var.attrs
{'ID': 'V270', 'dcml': '0', 'files': 'F6', 'intrvl': 'contin', 'name': 'PSU'}

# Get name
>>> var.attrs.get('name')
'PSU'

# All sumStat
>>> [i.text.strip() for i in var.find_all('sumStat')]
['76208', '0', '1001', '1800', '1412.79', '231.955']
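
To loop over every var and get the result into a CSV, something like the following could be a starting point. It is only a sketch under some assumptions: it uses the standard library's xml.etree.ElementTree and csv instead of BeautifulSoup so it runs without extra installs, and the helper names (local, var_rows, rows_to_csv), the sumStat_ column prefix, and the choice to keep only the attributes, labl and sumStat are mine, not something the file dictates. The inline sample is trimmed from the var element shown above.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Inline sample trimmed from the <var> element shown above.
SAMPLE = """<codeBook><dataDscr>
<var ID="V270" dcml="0" files="F6" intrvl="contin" name="PSU">
<labl> PSU </labl>
<sumStat type="vald"> 76208 </sumStat>
<sumStat type="invd"> 0 </sumStat>
<sumStat type="min"> 1001 </sumStat>
<sumStat type="max"> 1800 </sumStat>
<sumStat type="mean"> 1412.79 </sumStat>
<sumStat type="stdev"> 231.955 </sumStat>
</var>
</dataDscr></codeBook>"""


def local(tag):
    """Drop a '{namespace}' prefix so lookups work with or without xmlns."""
    return tag.rsplit('}', 1)[-1]


def var_rows(root):
    """Return one dict per <var> that is a direct child of <dataDscr>."""
    rows = []
    for data in root.iter():
        if local(data.tag) != 'dataDscr':
            continue
        for var in data:
            if local(var.tag) != 'var':
                continue
            row = dict(var.attrib)  # ID, name, dcml, ...
            for child in var:
                name = local(child.tag)
                if name == 'labl':
                    row['labl'] = (child.text or '').strip()
                elif name == 'sumStat':
                    # One column per statistic, e.g. sumStat_mean
                    key = 'sumStat_' + child.get('type', '')
                    row[key] = (child.text or '').strip()
            rows.append(row)
    return rows


def rows_to_csv(rows, out):
    """Write rows to a file-like object; columns are the union of all keys."""
    fields = []
    for row in rows:
        for key in row:
            if key not in fields:
                fields.append(key)
    writer = csv.DictWriter(out, fieldnames=fields, restval='')
    writer.writeheader()
    writer.writerows(rows)


rows = var_rows(ET.fromstring(SAMPLE))
buf = io.StringIO()
rows_to_csv(rows, buf)
print(buf.getvalue())
```

For the real file you would use ET.parse(path).getroot() in place of ET.fromstring(SAMPLE), and write to a file opened with newline='' instead of the StringIO buffer. Because the columns are collected from whatever attributes and sumStat types actually appear, you should not need to define every tag and attribute by hand for each file; the same functions can be called in a loop over the file names.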