Python Forum
Creating Python dataframes using a County Jail Daily Booking Register (Snohomish, WA) - Printable Version




Creating Python dataframes using a County Jail Daily Booking Register (Snohomish, WA) - BrandonKastning - Mar-24-2020

Hey everyone,

I am very new to Python and have been learning from a couple of good people here on the forums. I am not trying to be redundant; however, each question and building-block solution I work through helps me toward what I am trying to achieve with Python.

What would be the best way to work with a site like this:

Snohomish County, WA - Corrections - Daily Booking Register

As of 03/23/2020 at approximately 18:38, there are 67 registered bookings in the last 72 hours.

It shows a list of 67 people arrested in my county within the last 72 hours. All the names appear collapsed until you click one.

Once you click a name, the entry expands for that specific person and lists various fields of data.

The first field is the name of the person
The second field has a group of columns with 1 row which consists of:

"Book Number"
"CIN"
"Book Date"
"Sex"

Then the third field has a group of columns with 1 row which consists of:

"Charge Description"
"Disposition"

The fourth field has a group of columns with 1 row which consists of:

"Warrant/Citation/Court Case Num"
"Bail Amount"
"Bail Type"

The fifth field has a group of columns with 1 row which consists of:

"Court"

The sixth field has a group of columns with 1 row which consists of:

"Charging Agency"
"Charge Date"
"Arrest Type"


I would like to have a Python script monitor this website 24/7, refreshing every 60 seconds, and turn each booked American in our county into a dictionary that holds all of these contents, as snippsat showed me in another thread with a different example. I then want to be able to write that data to MySQL after sorting it within Python.
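
A minimal sketch of the 60-second monitoring loop described above. The names `fetch_bookings` and `handle` are hypothetical callables standing in for the scraping and storage steps, and `max_cycles` and the injectable `sleep` exist only so the example can terminate and be tried without waiting; none of these names come from the thread.

```python
import time
from typing import Callable, Dict, List, Optional

Booking = Dict[str, str]


def monitor(
    fetch_bookings: Callable[[], List[Booking]],
    handle: Callable[[Booking], None],
    interval_seconds: float = 60.0,
    max_cycles: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll fetch_bookings every interval_seconds and pass each record to handle.

    Returns the number of polling cycles completed.
    max_cycles=None keeps polling until the process is stopped.
    """
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        for booking in fetch_bookings():
            handle(booking)
        cycles += 1
        # Skip the final sleep so a bounded run returns promptly.
        if max_cycles is None or cycles < max_cycles:
            sleep(interval_seconds)
    return cycles
```

Calling `monitor(fetch, store)` with the defaults would poll once a minute indefinitely; passing `max_cycles=1` runs a single pass, which is handy while the scraping step is still being developed.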

Example: American1 = dictionary1(named_by_timestamp_as_a_prefix_maybe)

All the fields from fields 1, 2, 3, 4, 5, and 6 go within dictionary1:

"Book Number"
"CIN"
"Book Date"
"Sex"
"Charge Description"
"Disposition"
"Warrant/Citation/Court Case Num"
"Bail Amount"
"Bail Type"
"Court"
"Charging Agency"
"Charge Date"
"Arrest Type"

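One possible shape for that dictionary, as a sketch only. The field names are copied from the list above; the extra `"Name"` key, the `FIELDS` constant, and the `make_booking` helper are assumptions made for illustration, and the values are expected to arrive in the same order as the fields.

```python
from typing import Dict, List

# Field names copied from the list above, in the order they appear on the page.
FIELDS: List[str] = [
    "Book Number", "CIN", "Book Date", "Sex",
    "Charge Description", "Disposition",
    "Warrant/Citation/Court Case Num", "Bail Amount", "Bail Type",
    "Court",
    "Charging Agency", "Charge Date", "Arrest Type",
]


def make_booking(name: str, values: List[str]) -> Dict[str, str]:
    """Pair each scraped value with its field name and add the person's name."""
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} values, got {len(values)}")
    record = {"Name": name}  # "Name" is an assumed key for field 1
    record.update(zip(FIELDS, (v.strip() for v in values)))
    return record
```

A list of such dictionaries can later be handed to whatever storage or tabular layer is chosen, since every record shares the same keys.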

Then cycle to the next person until the program is caught up, and then switch to monitor mode.
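
A rough sketch of the "catch up, then monitor" idea: store each booking once and report only the ones not seen before. It uses Python's built-in `sqlite3` as a stand-in for MySQL so it runs without a server, and it assumes "Book Number" uniquely identifies a booking; the table layout and the helper names `open_store` and `save_new` are hypothetical.

```python
import sqlite3
from typing import Dict, Iterable, List


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create the bookings table if needed; book_number is the unique key."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS bookings ("
        " book_number TEXT PRIMARY KEY,"
        " name TEXT,"
        " charge_description TEXT)"
    )
    return conn


def save_new(conn: sqlite3.Connection, bookings: Iterable[Dict[str, str]]) -> List[str]:
    """Insert bookings that are not stored yet and return their book numbers."""
    added = []
    for b in bookings:
        cur = conn.execute(
            "INSERT OR IGNORE INTO bookings VALUES (?, ?, ?)",
            (b["Book Number"], b.get("Name", ""), b.get("Charge Description", "")),
        )
        # rowcount is 1 when the row was inserted, 0 when it already existed.
        if cur.rowcount == 1:
            added.append(b["Book Number"])
    conn.commit()
    return added
```

On the first pass every booking is new, so the program catches up; on later passes `save_new` returns only the fresh arrivals. MySQL offers `INSERT IGNORE` for the same skip-duplicates behavior.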

For instance, the program could monitor keywords: while it is saving the incoming data, it also checks for keywords to flag.

Say I would like to watch for arrests for "Curfew Violations" due to COVID-19, if that happens here in our county against the United States Constitution & Washington State Constitution. I would like to be able to see all the arrests for a specific charge.
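
The keyword check could look something like this sketch. It assumes each booking is a dictionary with a "Charge Description" key, as in the field list earlier in this post; the function name `flag_bookings` and the sample records are made up for illustration.

```python
from typing import Dict, Iterable, List


def flag_bookings(
    bookings: Iterable[Dict[str, str]],
    keywords: Iterable[str],
    field: str = "Charge Description",
) -> List[Dict[str, str]]:
    """Return the bookings whose field contains any keyword, ignoring case."""
    needles = [k.casefold() for k in keywords]
    return [
        b for b in bookings
        if any(n in b.get(field, "").casefold() for n in needles)
    ]


# Hypothetical records, not real data from the register.
sample = [
    {"Book Number": "1", "Charge Description": "CURFEW VIOLATION"},
    {"Book Number": "2", "Charge Description": "PAROLE VIOLATION"},
]
flagged = flag_bookings(sample, ["curfew"])
```

Substring matching is the simplest choice; it would also match the keyword inside longer words, so a stricter rule may be needed for short keywords.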

I am a college student studying pre-law, by the way. I will be asking lots of court- and law-related questions for advice.

Thank you for this awesome forum!

Any tips will be extremely helpful. I have a lot to learn from what snippsat has already shared with me tonight, which sparked many of the ideas behind this thread.

Best Regards and God bless,

Brandon Kastning


RE: Creating Python dataframes using a County Jail Daily Booking Register (Snohomish, WA) - ndc85430 - Mar-24-2020

Firstly, be aware that if you intend to make requests to the page quite frequently, you'll be placing more load on the server(s) (in the absence of any caching they have). Do you know how the site will cope with that extra load?
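
One way to keep the load modest, sketched under assumptions: never poll faster than a base interval, and wait longer after each consecutive failed request. The function name `next_delay` and the 15-minute cap are illustrative choices, not anything the site documents.

```python
def next_delay(
    base_seconds: float,
    consecutive_failures: int,
    cap_seconds: float = 900.0,
) -> float:
    """Return how long to wait before the next request.

    The wait doubles after each consecutive failure and never exceeds cap_seconds.
    """
    if consecutive_failures < 0:
        raise ValueError("consecutive_failures must be non-negative")
    return min(cap_seconds, base_seconds * (2 ** consecutive_failures))
```

With a 60-second base, two failures in a row would stretch the wait to 240 seconds, and the wait returns to 60 seconds as soon as a request succeeds and the failure count is reset.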

That being said, you can use Beautiful Soup to extract the data from the HTML.
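
As a small illustration of that, the sketch below reads a header row and a data row from a table and pairs them up. The HTML is a hypothetical fragment with placeholder values, since the real page's markup may differ, and `table_to_dict` is a made-up helper name. It uses the `html.parser` backend that ships with Python, so only `beautifulsoup4` needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with placeholder values; the real page may differ.
SAMPLE_HTML = """
<table>
  <tr><th>Book Number</th><th>CIN</th><th>Book Date</th><th>Sex</th></tr>
  <tr><td>0000001</td><td>000001</td><td>03/23/2020</td><td>M</td></tr>
</table>
"""


def table_to_dict(html: str) -> dict:
    """Map the header row of the first table to the data row beneath it."""
    soup = BeautifulSoup(html, "html.parser")
    rows = soup.find("table").find_all("tr")
    headers = [cell.get_text(strip=True) for cell in rows[0].find_all(["th", "td"])]
    values = [cell.get_text(strip=True) for cell in rows[1].find_all(["th", "td"])]
    return dict(zip(headers, values))


record = table_to_dict(SAMPLE_HTML)
```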


RE: Creating Python dataframes using a County Jail Daily Booking Register (Snohomish, WA) - snippsat - Mar-24-2020

Yes, you have something to look into and learn; the problem is that you have chosen a task that is not easy to start with.
Also, when you are struggling with the basic fundamentals of Python, it is not so easy.

So here is a demo of how to start, as this site has some challenges.
The description info is generated by JavaScript, so you need to use Selenium.
You first need to click "expand all" to activate all the data.
Then you can parse the name and one value as a test.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from bs4 import BeautifulSoup

#--| Setup
options = Options()
options.add_argument("--headless")
options.add_argument("--window-size=1980,1020")
# chromedriver must be on PATH (Selenium 4.6+ can locate or download it itself)
browser = webdriver.Chrome(options=options)

#--| Parse or automation
url = 'http://www.snoco.org/app/corrections/jailregister/dailyBookingRegister.aspx'
browser.get(url)
# Click "expand all" first so the detail tables are filled in
expan = browser.find_elements(By.CSS_SELECTOR, '#expandAll')
expan[0].click()
# Parse the page source after expanding
soup = BeautifulSoup(browser.page_source, 'lxml')
name = soup.select_one('#bookingName1') # Use BS
charge_description = browser.find_elements(By.CSS_SELECTOR, '#booking1 > p > table > tbody > tr:nth-child(2) > td:nth-child(1)') # Use Sel
print(name.text)
print(charge_description[0].text)
browser.quit()
Output:
ALFORD, CALEB AARON
PAROLE VIOLATION
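
To move past the first entry, the same selectors could be generated for any position in the list. This is only a sketch: it assumes the page numbers its elements `#bookingName2`, `#booking2`, and so on, following the `#bookingName1` and `#booking1` pattern in the demo, which has not been verified against the live page. The helper name `booking_selectors` is made up.

```python
def booking_selectors(index: int) -> dict:
    """Build the CSS selectors for the booking at a 1-based position.

    Assumes element ids follow the '#bookingName1' / '#booking1' pattern.
    """
    if index < 1:
        raise ValueError("index starts at 1")
    return {
        "name": f"#bookingName{index}",
        "charge_description": (
            f"#booking{index} > p > table > tbody"
            " > tr:nth-child(2) > td:nth-child(1)"
        ),
    }


first = booking_selectors(1)
```

Each selector string can then be passed to `soup.select_one(...)` or to Selenium's element lookup inside a loop over the number of bookings shown on the page.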



RE: Creating Python dataframes using a County Jail Daily Booking Register (Snohomish, WA) - BrandonKastning - Mar-25-2020

(Mar-24-2020, 02:41 PM)snippsat Wrote: Yes, you have something to look into and learn; the problem is that you have chosen a task that is not easy to start with.

snippsat,

Thank you, sir, for a great starting point on the tasks and challenges before me, which are complicated, non-beginner Python projects. You have given me lots of information between this thread and the other, which I believe will set me in the right direction.

Upon making progress I will update the threads accordingly!

Best Regards and God Bless,

Brandon Kastning