Feb-25-2019, 04:14 PM
(This post was last modified: Feb-25-2019, 04:14 PM by erdem_ustunmu.)
Thank you @snippsat so much for your help.
I've been working on this all day and have tried various experiments with the code you wrote.
I managed to get some of them working, but only the simple ones.
What you did for sumStat works great.
I tried to do something similar for catgry, but the result is not correct.
I always get only the last value; not all of the values come through.
import itertools
from bs4 import BeautifulSoup

lst = []
soup = BeautifulSoup(open('NPL_2008_LFS_v01_M_v01_A_ILOVAR.xml', encoding='utf-8'), 'xml')
title = soup.find('titl')
producer = soup.find('producer')
#affiliation=(soup.find('producer'))['affiliation']
#print(title.text.strip())
#print(producer.attrs.get('affiliation'))
data = soup.find('dataDscr')
vars = data.find_all('var')
for var in vars:
    ID=var.attrs.get('ID')
    name=var.attrs.get('name')
    files=var.attrs.get('files')
    dcml=var.attrs.get('dcml')
    intrvl=var.attrs.get('intrvl')
    labl=var.find('labl').text.strip()
    sumStat=[i.text.strip() for i in var.find_all('sumStat')]
    VarFormat=(var.find('varFormat')).attrs.get('type')
    stdCatgry = [stdCat.text.strip() for stdCat in var.find_all("stdCatgry")]
    #There is a mistake, I will look after merge the categories.
    #Range_Min=var.find_all('range')
    #Range_Unit=(var.find_all('range'))['UNITS']
    #Range_Min=(var.find_all('range'))['min']
    #Range_Max=(var.find_all('range'))['max']
    #print(Range_Min)
    #I tried to do as follows. I could not be successful.
    for cat in var.find_all('catgry'):
        catValu = [ values.text.strip() for values in cat.findAll("catValu")]
        catlabl = [ values.text.strip() for values in cat.findAll("labl")]
        data = [item for item in itertools.zip_longest(catValu, catlabl)]
    print(title.text.strip(),producer.text.strip(),ID,name,files,dcml,intrvl,labl,sumStat,VarFormat,data,stdCatgry)
    lst.append((title.text.strip(),producer.text.strip(),ID,name,files,dcml,intrvl,labl,sumStat,VarFormat,data,stdCatgry))

As you said, I'm trying to do something from the code you've written.
However, the part I marked in red above (the catgry loop) did not work. It returns only the last value, and I could not merge <catValu> and <labl> into the form <catValu>-<labl>.
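
To make the goal concrete, here is a minimal sketch of the result I am after. It is not the real file: the XML snippet is made up for illustration, the helper name merge_categories is hypothetical, and it uses the standard library's xml.etree.ElementTree rather than BeautifulSoup so that it runs without extra installs. The idea is to collect one "<catValu>-<labl>" string per catgry in a list that is created before the loop, instead of overwriting a single variable on every pass.

```python
import xml.etree.ElementTree as ET

# Made-up sample that only imitates the shape of the real codebook file.
SAMPLE = """<dataDscr>
  <var ID="V1" name="sex">
    <labl>Sex</labl>
    <catgry><catValu>1</catValu><labl>Male</labl></catgry>
    <catgry><catValu>2</catValu><labl>Female</labl></catgry>
  </var>
  <var ID="V2" name="age">
    <labl>Age</labl>
  </var>
</dataDscr>"""


def merge_categories(var):
    """Return one 'value-label' string for every catgry child of var."""
    merged = []  # created once per var, so earlier categories are kept
    for cat in var.findall('catgry'):
        # findtext returns None when the child is missing, hence the 'or'
        value = (cat.findtext('catValu') or '').strip()
        label = (cat.findtext('labl') or '').strip()
        merged.append(f'{value}-{label}')
    return merged


root = ET.fromstring(SAMPLE)
rows = [(var.get('ID'), var.get('name'), merge_categories(var))
        for var in root.findall('var')]

for row in rows:
    print(row)
```

With the sample above, the first var should come out with both categories ('1-Male' and '2-Female'), and a var without any catgry should give an empty list rather than the previous var's values.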
Yours sincerely