Python Forum
I want to scrape latitude and longitude of cell towers from the FCC web site.
#1
I want to scrape the latitude and longitude of cell towers from the FCC website. I am writing in Python, using the requests package.

The problem is that I’m not getting back any data. Here is the site:
https://wireless2.fcc.gov/UlsApp/AsrSear...Search.jsp
When I enter the search criteria manually in the browser, I get my data.

Here is the manual query.
https://drive.google.com/file/d/1GsswwgB...sp=sharing

Some of the data I got back.
https://drive.google.com/file/d/1GdKW_zHdNDJlE6Ddo5DIaHNaOKHjnwfH/view?usp=sharing







Output:
# Scrape Fed website
#
from datetime import datetime
import pandas as pd
import requests
from bs4 import BeautifulSoup

# ------------------------------>
# https://stackoverflow.com/questions/16337511/log-all-requests-from-the-python-requests-module
# turns on debugging.
#import requests #already imported
import logging
import http.client
http.client.HTTPConnection.debuglevel = 1
logging.basicConfig()
logging.getLogger().setLevel(logging.DEBUG)
requests_log = logging.getLogger("requests.packages.urllib3")
requests_log.setLevel(logging.DEBUG)
requests_log.propagate = True
# <-------------- end of debug code --------

# try :  # why hide the errors?
# datetime object containing current date and time
now = datetime.now()
# mm/dd/YYYY H:M:S
dt_string = now.strftime("%m/%d/%Y %H:%M:%S")
print("|| dt_string is", dt_string)
print("||-----------------------> ", dt_string + " ---------------")

# post your link
url = "https://wireless2.fcc.gov/UlsApp/AsrSearch/asrRegistrationSearch.jsp"

# Get the page
page = requests.get(url)
print("||type of page is ", type(page))
print("||encoding of page is ", page.encoding)
print("||convert page to string is ", str(page))
print("||length of converted string is ", len(str(page)))
#print("||page.content is ", page.content)
#print("||web page,\n",page)

# "w" or "a"
# The with block closes each file; the original f.close (no parentheses) never did.
with open("webPageQuery.html", "w") as f:
    f.write("<!-- Now the file has more content! -->")
    f.write(page.text)
with open("webPageQuery.txt", "w") as f:
    f.write("<!-- Now the file has more content for sure. -->")
    f.write(page.text)
print("||page.status_code is ", page.status_code)
Output:
# fill in the data
varN = "N"
varW = "W"
varKilometers = "Kilometers"
varState = "state"
varAL = "AL"
var01001 = "01001"
varSubmit = "Submit"
varAny = "any"
varMeters = "Meters"
vartrue = "true"

# format for results
payload = {
    "fiLatDeg": "",
    "fiLatMin": "",
    "fiLatSec": "",
    "fiLatDir": varN,
    "fiLongDeg": "",
    "fiLongMin": "",
    "fiLongSec": "",
    "fiLongDir": varW,
    "fiRadius": "",
    "fiRadiusMetricType": varKilometers,
    "locatechoice": varState,
    "asr_r_city": "",
    "asr_r_state": varAL,
    "asr_r_county": var01001,
    "asr_r_structure_zipcode": "",
    "Submit": varSubmit,
    "fiExactMatchInd": varN,
    "fiHeightChoice": varAny,
    "fiOverallHgtAGL": "",
    "fiOverallAGLExactMetricType": varMeters,
    "fiLowerOverallHgtAGL": "",
    "fiUpperOverallHgtAGL": "",
    "fiOverallAGLRangeMetricType": varMeters,
    "jsValidated": vartrue,
}
print("||payload is ", payload)

Output:
r = requests.post(url, data=payload)
print("||type of page is ", type(r))
print("||encoding of page is ", r.encoding)
print("||convert page to string is ", str(r))
print("||length of converted string is ", len(str(r)))
#print("||page.content is ", r.content)
#print("||web page,\n",r)

# "w" or "a"
# The with block closes each file; the original f.close (no parentheses) never did.
with open("webPageFedQuery.html", "w") as f:
    f.write("<!-- Now the file has more content! -->")
    f.write(r.text)
with open("webPageFedQuery.txt", "w") as f:
    f.write("<!-- Now the file has more content for sure. -->")
    f.write(r.text)
print("||page.status_code is ", r.status_code)
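As a sanity check on the payload, here is a minimal sketch that form-encodes the same dictionary with the standard library's urllib.parse.urlencode, on the assumption that this mirrors what requests does for a flat dict of strings passed as data=. The resulting body is 414 characters long, which agrees with the Content-Length: 414 header in the debug log further down, so the fields appear to reach the server as written.

```python
from urllib.parse import urlencode

# Same field names and values as the payload built in the post.
payload = {
    "fiLatDeg": "", "fiLatMin": "", "fiLatSec": "", "fiLatDir": "N",
    "fiLongDeg": "", "fiLongMin": "", "fiLongSec": "", "fiLongDir": "W",
    "fiRadius": "", "fiRadiusMetricType": "Kilometers",
    "locatechoice": "state",
    "asr_r_city": "", "asr_r_state": "AL", "asr_r_county": "01001",
    "asr_r_structure_zipcode": "",
    "Submit": "Submit",
    "fiExactMatchInd": "N",
    "fiHeightChoice": "any",
    "fiOverallHgtAGL": "", "fiOverallAGLExactMetricType": "Meters",
    "fiLowerOverallHgtAGL": "", "fiUpperOverallHgtAGL": "",
    "fiOverallAGLRangeMetricType": "Meters",
    "jsValidated": "true",
}

# Encode as application/x-www-form-urlencoded, keeping dict order.
body = urlencode(payload)

print(len(body))   # compare with the Content-Length header in the log
print(body)        # compare field by field with what the browser submits
```

Comparing this string with the form data shown in the browser's developer tools for the manual query is one way to spot a field the script sends differently or leaves out.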
Next are the results. I don't get any of the cell tower locations. I enabled debug logging (the import logging section above), and I put || at the start of my own print statements so they stand out from the log lines.

Output:
Parallels-HS-user-mac:python mac$ python3 scrapeState\&Local.py
|| foo
|| <module>
|| dt_string is 07/13/2023 15:32:35
||-----------------------> 07/13/2023 15:32:35 ---------------
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): wireless2.fcc.gov:443
send: b'GET /UlsApp/AsrSearch/asrRegistrationSearch.jsp HTTP/1.1\r\nHost: wireless2.fcc.gov\r\nUser-Agent: python-requests/2.31.0\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Date: Thu, 13 Jul 2023 19:32:35 GMT
header: Content-Type: text/html; charset=ISO-8859-1
header: Transfer-Encoding: chunked
header: Connection: keep-alive
header: Set-Cookie: AWSALB=4P9ulojhmIW4BvMIfRFbMw6h05JCR21eZi3qC+3I/WhLdl0XMwptO5bRI0zceveiX1LQaXDa4CCT2Cb0+/VGIP2GlV8gfijFF3ppfswVipx7c16iIb9kKYWIPLVu; Expires=Thu, 20 Jul 2023 19:32:35 GMT; Path=/
header: Set-Cookie: AWSALBCORS=4P9ulojhmIW4BvMIfRFbMw6h05JCR21eZi3qC+3I/WhLdl0XMwptO5bRI0zceveiX1LQaXDa4CCT2Cb0+/VGIP2GlV8gfijFF3ppfswVipx7c16iIb9kKYWIPLVu; Expires=Thu, 20 Jul 2023 19:32:35 GMT; Path=/; SameSite=None; Secure
header: Server: Apache
header: Strict-Transport-Security: max-age=31536000; includeSubDomains
header: X-Frame-Options: SAMEORIGIN
header: Set-Cookie: JSESSIONID_ASRSEARCH=Uj5Qva28W93KrBEbU3rIeeqMLLLfs0T06ToO_ERht3w_SNktaEcl!-188465283!1997095822; path=/UlsApp/; HttpOnly
header: X-OneAgent-JS-Injection: true
header: X-ruxit-JS-Agent: true
header: Server-Timing: dtSInfo;desc="0", dtRpid;desc="-176933270"
header: Set-Cookie: dtCookie=v_4_srv_3_sn_B2EC06D566E8DB06EB91CF9992E6E07C_perc_100000_ol_0_mul_1_app-3A12116eb046fa524b_1; Path=/; Domain=.fcc.gov
DEBUG:urllib3.connectionpool:https://wireless2.fcc.gov:443 "GET /UlsApp/AsrSearch/asrRegistrationSearch.jsp HTTP/1.1" 200 None
||type of page is <class 'requests.models.Response'>
||encoding of page is ISO-8859-1
||convert page to string is <Response [200]>
||length of converted string is 16
||page.status_code is 200
||------------------- First data page ----------------------------
||payload is {'fiLatDeg': '', 'fiLatMin': '', 'fiLatSec': '', 'fiLatDir': 'N', 'fiLongDeg': '', 'fiLongMin': '', 'fiLongSec': '', 'fiLongDir': 'W', 'fiRadius': '', 'fiRadiusMetricType': 'Kilometers', 'locatechoice': 'state', 'asr_r_city': '', 'asr_r_state': 'AL', 'asr_r_county': '01001', 'asr_r_structure_zipcode': '', 'Submit': 'Submit', 'fiExactMatchInd': 'N', 'fiHeightChoice': 'any', 'fiOverallHgtAGL': '', 'fiOverallAGLExactMetricType': 'Meters', 'fiLowerOverallHgtAGL': '', 'fiUpperOverallHgtAGL': '', 'fiOverallAGLRangeMetricType': 'Meters', 'jsValidated': 'true'}
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): wireless2.fcc.gov:443
send: b'POST /UlsApp/AsrSearch/asrRegistrationSearch.jsp HTTP/1.1\r\nHost: wireless2.fcc.gov\r\nUser-Agent: python-requests/2.31.0\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\nContent-Length: 414\r\nContent-Type: application/x-www-form-urlencoded\r\n\r\n'
send: b'fiLatDeg=&fiLatMin=&fiLatSec=&fiLatDir=N&fiLongDeg=&fiLongMin=&fiLongSec=&fiLongDir=W&fiRadius=&fiRadiusMetricType=Kilometers&locatechoice=state&asr_r_city=&asr_r_state=AL&asr_r_county=01001&asr_r_structure_zipcode=&Submit=Submit&fiExactMatchInd=N&fiHeightChoice=any&fiOverallHgtAGL=&fiOverallAGLExactMetricType=Meters&fiLowerOverallHgtAGL=&fiUpperOverallHgtAGL=&fiOverallAGLRangeMetricType=Meters&jsValidated=true'
reply: 'HTTP/1.1 200 OK\r\n'
header: Date: Thu, 13 Jul 2023 19:32:35 GMT
header: Content-Type: text/html; charset=ISO-8859-1
header: Transfer-Encoding: chunked
header: Connection: keep-alive
header: Set-Cookie: AWSALB=bSnLvJkOlLdcM7LDq2Hgmjuuin9ams5fy2ohNGoGjUod3iIG2ZDtVG+0V0KyLgwrKg6AJ57wR3ESCppOyw0vwIZRqAmJMDOW2lQhvRLYR5ftAdIjtFreqCqJ6Yiz; Expires=Thu, 20 Jul 2023 19:32:35 GMT; Path=/
header: Set-Cookie: AWSALBCORS=bSnLvJkOlLdcM7LDq2Hgmjuuin9ams5fy2ohNGoGjUod3iIG2ZDtVG+0V0KyLgwrKg6AJ57wR3ESCppOyw0vwIZRqAmJMDOW2lQhvRLYR5ftAdIjtFreqCqJ6Yiz; Expires=Thu, 20 Jul 2023 19:32:35 GMT; Path=/; SameSite=None; Secure
header: Server: Apache
header: Strict-Transport-Security: max-age=31536000; includeSubDomains
header: X-Frame-Options: SAMEORIGIN
header: Set-Cookie: JSESSIONID_ASRSEARCH=p9lQva9ZjHOzmvlXxki25yTiTd2CEwYBeIBieLFy8j0533OmwmiP!-188465283!1997095822; path=/UlsApp/; HttpOnly
header: X-OneAgent-JS-Injection: true
header: X-ruxit-JS-Agent: true
header: Server-Timing: dtSInfo;desc="0", dtRpid;desc="-1328681484"
header: Set-Cookie: dtCookie=v_4_srv_5_sn_A49E4149C72FD6723F1F2F9C9DDEED70_perc_100000_ol_0_mul_1_app-3A12116eb046fa524b_1; Path=/; Domain=.fcc.gov
DEBUG:urllib3.connectionpool:https://wireless2.fcc.gov:443 "POST /UlsApp/AsrSearch/asrRegistrationSearch.jsp HTTP/1.1" 200 None
||type of page is <class 'requests.models.Response'>
||encoding of page is ISO-8859-1
||convert page to string is <Response [200]>
||length of converted string is 16
||page.status_code is 200
Parallels-HS-user-mac:python mac$
This is the data I should get from the request.
https://drive.google.com/file/d/12SH98yg...sp=sharing
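One detail in the debug log above may be worth checking: the GET response sets a JSESSIONID_ASRSEARCH cookie, but the POST is logged as starting a new HTTPS connection and its request headers show no Cookie line. Whether the FCC search actually depends on that session cookie is an assumption, not something confirmed in this thread. The sketch below uses a hypothetical helper, parse_send_line, to pull the headers out of an http.client "send:" debug line so the absence of a cookie can be checked mechanically rather than by eye.

```python
import ast


def parse_send_line(line: str):
    """Split one http.client debug line ("send: b'...'") into
    (request_line, headers_dict). Hypothetical helper for this post."""
    prefix = "send: "
    if not line.startswith(prefix):
        raise ValueError("not a 'send:' debug line")
    # The rest of the line is a Python bytes literal; evaluate it safely.
    raw = ast.literal_eval(line[len(prefix):].strip())
    # Everything before the first blank line is the request head.
    head = raw.decode("iso-8859-1").split("\r\n\r\n", 1)[0]
    request_line, *header_lines = head.split("\r\n")
    headers = dict(h.split(": ", 1) for h in header_lines if h)
    return request_line, headers


# The POST "send:" line copied from the debug log in the post.
post_line = (
    r"send: b'POST /UlsApp/AsrSearch/asrRegistrationSearch.jsp HTTP/1.1\r\n"
    r"Host: wireless2.fcc.gov\r\n"
    r"User-Agent: python-requests/2.31.0\r\n"
    r"Accept-Encoding: gzip, deflate\r\n"
    r"Accept: */*\r\n"
    r"Connection: keep-alive\r\n"
    r"Content-Length: 414\r\n"
    r"Content-Type: application/x-www-form-urlencoded\r\n\r\n'"
)

request_line, headers = parse_send_line(post_line)
print(request_line)
print("Cookie header sent:", "Cookie" in headers)
```

If the missing cookie does turn out to matter, requests.Session keeps the cookies from one response and sends them on later requests made through the same session object, so issuing both the GET and the POST through one session would be a natural thing to try.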
#2
Look at the FCC download page. The data is probably there, and technical documentation should also be available.
What's the name of the dataset used?

EDIT: 3:25 UTC
The specific databases are probably
Cellular – 47 CFR Part 22
and/or
Antenna Structure Registration
Both are located here.
Documentation for the same is here.
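Whichever source ends up supplying the records, the search form's own fields (fiLatDeg, fiLatMin, fiLatSec, fiLatDir and the fiLong equivalents) suggest that coordinates are handled as degrees, minutes and seconds plus a hemisphere letter. Assuming the tower records arrive in that form, which is not confirmed here, a small hypothetical helper such as dms_to_decimal below would turn them into signed decimal degrees for mapping or for a pandas column.

```python
def dms_to_decimal(degrees: float, minutes: float, seconds: float,
                   direction: str) -> float:
    """Convert degrees/minutes/seconds plus N/S/E/W to signed decimal degrees.

    Hypothetical helper; south and west come out negative.
    """
    direction = direction.strip().upper()
    if direction not in {"N", "S", "E", "W"}:
        raise ValueError(f"unexpected direction: {direction!r}")
    value = abs(degrees) + minutes / 60 + seconds / 3600
    return -value if direction in {"S", "W"} else value


# Arbitrary illustrative coordinates, not an actual tower record.
lat = dms_to_decimal(32, 30, 0, "N")
lon = dms_to_decimal(86, 15, 0, "W")
print(lat, lon)
```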