Python Forum
BeautifulSoup returning text as N/A - Printable Version



BeautifulSoup returning text as N/A - tantony - Sep-08-2021

I'm trying to get the text value from the HTML below, so I should be getting FFT606, but when I run my Python code I get n/a. Please help.

<span id="highlighted_callsign">FFT606</span>
from bs4 import BeautifulSoup
import requests

url = 'https://globe.adsbexchange.com/?icao=ad627d'

r = requests.get(url)
soup = BeautifulSoup(r.content, 'html.parser')
div = soup.findAll('div',attrs={'id':'layout_container'})
for d in div:
    s = (soup.find('span'))
    print(s.text)
Output:
n/a



RE: BeautifulSoup returning text as N/A - tantony - Sep-08-2021

I tried Selenium WebDriver and I'm able to get the text values with it, but I'd prefer to use BeautifulSoup.


RE: BeautifulSoup returning text as N/A - tantony - Sep-09-2021

I made some changes to my code. At least now I can "see" what I need, but I'm still not sure how to get the text values. The text I need is showing as n/a. Please see the output.

from bs4 import BeautifulSoup
import requests
import lxml

url = 'https://globe.adsbexchange.com/?icao=a7d3d3'

r = requests.get(url)
soup = BeautifulSoup(r.content, 'lxml')
main_div = soup.find('div', attrs={'id': 'infoblock-container'})
div = main_div.findAll(class_='infoBlockSection')
for divs in div:
    infoData = divs.findAll('div', class_='infoData')
    print(infoData, '\n')
Output:
[] 

[] 

[] 

[<div class="infoData"><span id="selected_registration">n/a</span></div>, <div class="infoData">
<span id="selected_country" title="Country of registration">n/a</span>
</div>, <div class="infoData"><span id="selected_dbFlags">n/a</span></div>, <div class="infoData">
<span id="selected_squawk1"></span>
</div>] 

[] 

[] 

[<div class="infoData">
<span id="selected_vert_rate">n/a</span>
</div>, <div class="infoData">
<span id="selected_track1">n/a</span>
</div>, <div class="infoData">
<span id="selected_position">n/a</span>
</div>, <div class="infoData">
<span id="selected_sitedist2">n/a</span>
</div>] 

[<div class="infoData">
<span id="selected_source">n/a</span>
</div>, <div class="infoData">
<span id="selected_rssi1">n/a</span>
</div>, <div class="infoData">
<span id="selected_message_rate">n/a</span>
</div>, <div class="infoData" id="selected_message_count">
</div>, <div class="infoData">
<span id="selected_seen_pos">n/a</span>
</div>, <div class="infoData">
<span id="selected_seen">n/a</span>
</div>] 

[<div class="infoData">
<span id="selected_nav_altitude">n/a</span>
</div>, <div class="infoData">
<span id="selected_nav_heading">n/a</span>
</div>] 

[<div class="infoData">
<span id="selected_ws">n/a</span>
</div>, <div class="infoData">
<span id="selected_wd">n/a</span>
</div>, <div class="infoData">
<span id="selected_temp">n/a</span>
</div>] 

[<div class="infoData">
<span id="selected_speed2">n/a</span>
</div>, <div class="infoData">
<span id="selected_tas">n/a</span>
</div>, <div class="infoData">
<span id="selected_ias">n/a</span>
</div>, <div class="infoData">
<span id="selected_mach">n/a</span>
</div>] 

[<div class="infoData">
<span id="selected_altitude2"></span>
</div>, <div class="infoData">
<span id="selected_baro_rate">n/a</span>
</div>, <div class="infoData">
<span id="selected_altitude_geom">n/a</span>
</div>, <div class="infoData fourColumnSection4">
<span id="selected_geom_rate">n/a</span>
</div>, <div class="infoData">
<span id="selected_nav_qnh">n/a</span>
</div>] 

[<div class="infoData">
<span id="selected_track2">n/a</span>
</div>, <div class="infoData">
<span id="selected_true_heading">n/a</span>
</div>, <div class="infoData">
<span id="selected_mag_heading">n/a</span>
</div>, <div class="infoData">
<span id="selected_mag_declination">n/a</span>
</div>, <div class="infoData">
<span id="selected_trackrate">n/a</span>
</div>, <div class="infoData">
<span id="selected_roll">n/a</span>
</div>] 

[<div class="infoData">
<span id="selected_nav_modes">n/a</span>
</div>, <div class="infoData">
<span id="selected_version">n/a</span>
</div>, <div class="infoData">
<span id="selected_category">n/a</span>
</div>] 

[<div class="infoData">
<span id="selected_nac_p">n/a</span>
</div>, <div class="infoData">
<span id="selected_sil">n/a</span>
</div>, <div class="infoData">
<span id="selected_nac_v">n/a</span>
</div>, <div class="infoData">
<span id="selected_nic_baro">n/a</span>
</div>, <div class="infoData">
<span id="selected_rc">n/a</span>
</div>] 



RE: BeautifulSoup returning text as N/A - snippsat - Sep-09-2021

(Sep-08-2021, 11:27 PM)tantony Wrote: tried with Selenium web driver, and I'm able to get text values on that, but I prefer to use BeautifulSoup.
Because this site relies heavily on JavaScript, Selenium is the easiest way. You can still use BS, since you can pass the page source to it.
Running with --headless also helps, because no browser window is loaded.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup
import time

#--| Setup
options = Options()
options.add_argument("--headless")  # Run Chrome without opening a window
browser = webdriver.Chrome(executable_path=r'C:\cmder\bin\chromedriver.exe', options=options)
#--| Parse or automation
url = 'https://globe.adsbexchange.com/?icao=a7d3d3'
browser.get(url)
# Give the page's JavaScript time to fill in the values before reading the source
time.sleep(3)
soup = BeautifulSoup(browser.page_source, 'lxml')
browser.quit()  # Close the headless browser once the source is captured
main_div = soup.find('div', attrs={'id': 'infoblock-container'})
div = main_div.findAll(class_='infoBlockSection')
for divs in div:
    infoData = divs.findAll('div', class_='infoData')
    print(infoData, '\n')
Output:
[<div class="infoData"><span id="selected_registration">N603QS</span></div>, <div class="infoData"> <span id="selected_country" title="Country of registration">United States</span> </div>, <div class="infoData"><span id="selected_dbFlags">none</span></div>, <div class="infoData"> <span id="selected_squawk1">3764</span> </div>] 

[] 

[] 

[<div class="infoData"> <span id="selected_vert_rate">n/a</span> </div>, <div class="infoData"> <span id="selected_track1">n/a</span> </div>, <div class="infoData"> <span id="selected_position">41.075°, -73.709°</span> </div>, <div class="infoData"> <span id="selected_sitedist2">n/a</span> </div>] 

.....
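The output above is still a list of tags, while the original question was how to get the text values. A minimal sketch of one way to do that, assuming the spans keep the ids shown in the output: collect every span inside an infoData div into a dict keyed by its id. The helper name extract_values and the shortened sample markup are made up for illustration. With Selenium you would pass browser.page_source instead of the sample string.

```python
from bs4 import BeautifulSoup

# Sample markup shortened from the rendered output above (illustration only)
html = '''
<div id="infoblock-container">
  <div class="infoBlockSection">
    <div class="infoData"><span id="selected_registration">N603QS</span></div>
    <div class="infoData"> <span id="selected_country" title="Country of registration">United States</span> </div>
    <div class="infoData"> <span id="selected_squawk1">3764</span> </div>
    <div class="infoData"> <span id="selected_vert_rate">n/a</span> </div>
  </div>
</div>
'''

def extract_values(page_source):
    """Map the id of each span inside an infoData div to its stripped text."""
    soup = BeautifulSoup(page_source, 'html.parser')
    values = {}
    # CSS selector: any span that has an id attribute and sits inside div.infoData
    for span in soup.select('div.infoData span[id]'):
        values[span['id']] = span.get_text(strip=True)
    return values

values = extract_values(html)
print(values['selected_registration'])  # N603QS
```

Fields the page never fills in (selected_vert_rate in this sample) still come back as n/a, so check for that value before using a field.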



RE: BeautifulSoup returning text as N/A - tantony - Sep-09-2021

Thank you. The reason I want to use BS4 is memory: wouldn't using Selenium use more memory? And what happens when it is not run in headless mode?


RE: BeautifulSoup returning text as N/A - snippsat - Sep-09-2021

It will use some more memory, but not much, so it should not cause any problems.
Without headless it will load a visible browser window. With headless it feels just like running BS, with just a little more running time.
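
Part of that extra running time in the code earlier in the thread is the fixed time.sleep(3). A rough sketch of an alternative, under the assumption that a field showing n/a has not been filled in yet: re-read the page source until a chosen span holds a real value or a timeout expires. The helper wait_for_text and its parameters are hypothetical, and the iterator below only stands in for browser.page_source so the snippet runs without a browser.

```python
import time
from bs4 import BeautifulSoup

def wait_for_text(get_source, element_id, placeholder='n/a', timeout=10.0, interval=0.5):
    """Poll get_source() until the element's text is no longer the placeholder.

    Returns the text, or None if the timeout expires first.
    """
    deadline = time.monotonic() + timeout
    while True:
        soup = BeautifulSoup(get_source(), 'html.parser')
        tag = soup.find(id=element_id)
        if tag is not None:
            text = tag.get_text(strip=True)
            if text and text != placeholder:
                return text
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)

# Stand-in for browser.page_source: placeholder twice, real value on the third read
_pages = iter([
    '<span id="selected_registration">n/a</span>',
    '<span id="selected_registration">n/a</span>',
    '<span id="selected_registration">N603QS</span>',
])
registration = wait_for_text(lambda: next(_pages), 'selected_registration', interval=0.01)
print(registration)  # N603QS
```

With a real browser the call would look like wait_for_text(lambda: browser.page_source, 'selected_registration'). Pick a field that is expected to fill in, since some stay n/a even on the rendered page. Selenium also ships its own WebDriverWait class for this kind of waiting.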


RE: BeautifulSoup returning text as N/A - tantony - Sep-09-2021

(Sep-09-2021, 12:56 PM)snippsat Wrote: It will use some more memory, but not much, so it should not cause any problems.
Without headless it will load a visible browser window. With headless it feels just like running BS, with just a little more running time.

OK, thank you so much. Now I can move on to the next portion of my project.