Python Forum
Creating Python dataframes using a County Jail Daily Booking Register (Snohomish, WA)
#1
Hey everyone,

I am very new to Python and have been learning from a couple of good guys here on the forums. I am not trying to be redundant; however, each question and building-block solution I work through will help me with what I am trying to achieve with Python.

What would be the best way to take a site like this:

Snohomish County, WA - Corrections - Daily Booking Register

Currently on 03/23/2020 @ Approx. 18:38 there are 67 Registered Bookings in the Last 72 Hours:

It shows a list of 67 people arrested in my county within the last 72 hours. Each name appears to stay collapsed until you click it.

Once you click a name, it expands for that specific person and lists various fields of data.

The first field is the name of the person
The second field has a group of columns with 1 row which consists of:

"Book Number"
"CIN"
"Book Date"
"Sex"

Then the third field has a group of columns with 1 row which consists of:

"Charge Description"
"Disposition"

The fourth field has a group of columns with 1 row which consists of:

"Warrant/Citation/Court Case Num"
"Bail Amount"
"Bail Type"

The fifth field has a group of columns with 1 row which consists of:

"Court"

The sixth field has a group of columns with 1 row which consists of:

"Charging Agency"
"Charge Date"
"Arrest Type"


I would like to have a Python script monitor this website 24/7, refreshing every 60 seconds, and turn each booked American in our county into a dictionary that holds all the contents, like snippsat was showing me on this thread with a different example. I then want to write that data to a MySQL database after it has been sorted within Python.

Example: American1 = dictionary1(named_by_timestamp_as_a_prefix_maybe)

all the fields from field 1, 2, 3, 4, 5, 6:
within dictionary1

"Book Number"
"CIN"
"Book Date"
"Sex"
"Charge Description"
"Disposition"
"Warrant/Citation/Court Case Num"
"Bail Amount"
"Bail Type"
"Court"
"Charging Agency"
"Charge Date"
"Arrest Type"

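One way to picture the dictionary described above is the minimal sketch below. It assumes the thirteen values for one person have already been scraped in the order listed; `FIELD_NAMES`, `make_booking`, and all sample values are hypothetical and not taken from the site.

```python
# Field names as listed in the post, in page order.
FIELD_NAMES = [
    "Book Number", "CIN", "Book Date", "Sex",
    "Charge Description", "Disposition",
    "Warrant/Citation/Court Case Num", "Bail Amount", "Bail Type",
    "Court",
    "Charging Agency", "Charge Date", "Arrest Type",
]


def make_booking(name, values):
    """Pair one person's scraped values with the field names (hypothetical helper)."""
    if len(values) != len(FIELD_NAMES):
        raise ValueError(f"expected {len(FIELD_NAMES)} values, got {len(values)}")
    booking = {"Name": name}
    booking.update(zip(FIELD_NAMES, values))
    return booking


# Made-up sample values, for illustration only.
sample = make_booking(
    "DOE, JOHN",
    ["2020000001", "123456", "03/23/2020", "M",
     "EXAMPLE CHARGE", "", "CASE-0001", "500.00", "CASH",
     "EXAMPLE COURT", "EXAMPLE AGENCY", "03/23/2020", "WARRANT"],
)
print(sample["Book Number"], sample["Charge Description"])
```

Keying a larger collection by "Book Number" instead of a timestamp prefix may be simpler, assuming book numbers are unique per booking.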

Then cycle to the next person until the program is caught up, and then switch to monitor mode.
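The "catch up, then monitor" idea could be sketched as storing only bookings that have not been seen before. This sketch uses SQLite from the standard library as a stand-in for MySQL, and it assumes each booking is a dictionary with a unique "Book Number"; the table layout and the names `open_store` and `save_new` are hypothetical.

```python
import sqlite3


def open_store(path=":memory:"):
    """Create the table if needed; SQLite stands in for MySQL in this sketch."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS bookings ("
        " book_number TEXT PRIMARY KEY,"
        " name TEXT,"
        " charge_description TEXT)"
    )
    return conn


def save_new(conn, bookings):
    """Insert only bookings that are not stored yet and return those new ones."""
    new = []
    for b in bookings:
        found = conn.execute(
            "SELECT 1 FROM bookings WHERE book_number = ?",
            (b["Book Number"],),
        ).fetchone()
        if found is None:
            conn.execute(
                "INSERT INTO bookings VALUES (?, ?, ?)",
                (b["Book Number"], b["Name"], b["Charge Description"]),
            )
            new.append(b)
    conn.commit()
    return new


# Made-up bookings, for illustration only.
conn = open_store()
first = {"Book Number": "1", "Name": "DOE, JOHN", "Charge Description": "EXAMPLE CHARGE"}
second = {"Book Number": "2", "Name": "ROE, JANE", "Charge Description": "EXAMPLE CHARGE"}
print(len(save_new(conn, [first])))          # first pass: everything is new
print(len(save_new(conn, [first, second])))  # later pass: only unseen bookings are stored
```

On the first pass every booking is new (the catch-up phase); on later passes only unseen book numbers are inserted, which is the monitor phase.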

For instance, monitoring keywords: while it's saving the data coming in, it's also checking for keywords to flag.

Say I would like to watch for arrests for "Curfew Violations" due to COVID-19, if that happens here in our county against the United States Constitution & Washington State Constitution. I would like to be able to see all the arrests for a specific charge.
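Keyword flagging like this could be sketched as a case-insensitive substring check on the charge description. The watch list, the function name `flagged_keywords`, and the sample bookings are hypothetical; the real charge wording on the site may differ.

```python
WATCH_KEYWORDS = ["curfew"]  # hypothetical watch list


def flagged_keywords(booking, keywords=WATCH_KEYWORDS):
    """Return the keywords found in the charge description (case-insensitive)."""
    charge = booking.get("Charge Description", "").lower()
    return [kw for kw in keywords if kw.lower() in charge]


# Made-up bookings, for illustration only.
bookings = [
    {"Name": "DOE, JOHN", "Charge Description": "CURFEW VIOLATION"},
    {"Name": "ROE, JANE", "Charge Description": "EXAMPLE CHARGE"},
]
for b in bookings:
    hits = flagged_keywords(b)
    if hits:
        print(f"FLAG {b['Name']}: {b['Charge Description']} (matched {hits})")
```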

I am a college student studying Pre-Law, by the way. I will be asking lots of court- and law-related questions for advice.

Thank you for this awesome forum!

Any tips will be extremely helpful. I have a lot to learn from what snippsat has already shared with me tonight, which sparked a lot of ideas and produced this thread.

Best Regards and God bless,

Brandon Kastning
“And one of the elders saith unto me, Weep not: behold, the Lion of the tribe of Juda, the Root of David, hath prevailed to open the book,...” - Revelation 5:5 (KJV)

“And oppress not the widow, nor the fatherless, the stranger, nor the poor; and ...” - Zechariah 7:10 (KJV)

#LetHISPeopleGo

#2
Firstly, be aware that if you intend to make requests to the page quite frequently, you'll be placing more load on the server(s) (in the absence of any caching they have). Do you know how the site will cope with that extra load?
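One way to keep the extra load small is to poll at a longer interval and skip the parsing work whenever the page has not changed. The sketch below assumes the page source is identical between requests when nothing changed, which may not hold for this site; `poll`, the `fetch` and `handle` callables, and the 300-second default are hypothetical choices.

```python
import hashlib
import time


def poll(fetch, handle, interval=300, max_polls=None, sleep=time.sleep):
    """Call fetch() every `interval` seconds; call handle(page) only when the page changed."""
    last_digest = None
    polls = 0
    while max_polls is None or polls < max_polls:
        page = fetch()
        digest = hashlib.sha256(page.encode("utf-8")).hexdigest()
        if digest != last_digest:
            handle(page)  # parse and store only when the content differs
            last_digest = digest
        polls += 1
        if max_polls is None or polls < max_polls:
            sleep(interval)  # wait before the next request


# Fake pages instead of real requests, for illustration only.
pages = iter(["<html>a</html>", "<html>a</html>", "<html>b</html>"])
seen = []
poll(lambda: next(pages), seen.append, interval=0, max_polls=3)
print(len(seen))
```

Note that the hash check only avoids repeated parsing; the number of requests the server sees is governed by `interval` alone.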

That being said, you can use Beautiful Soup to extract the data from the HTML.
#3
Yes, you have something to look into and learn from; the problem is that you have not chosen an easy task to start with.
Also, when you are still struggling with the basic fundamentals of Python, it is not so easy.

So here is a demo of how to start, as this site has some challenges.
The description info is generated by JavaScript, so you need to use Selenium.
You first need to click "expand all" to activate all the data.
Then you can parse the name and one value as a test.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from bs4 import BeautifulSoup

#--| Setup
options = Options()
options.add_argument("--headless")
options.add_argument("--window-size=1980,1020")
# Selenium 4 style: the driver path goes through Service (executable_path is deprecated)
browser = webdriver.Chrome(service=Service(r'chromedriver.exe'), options=options)

#--| Parse or automation
url = 'http://www.snoco.org/app/corrections/jailregister/dailyBookingRegister.aspx'
browser.get(url)
# Click "expand all" first so every booking's details are present in the page
browser.find_element(By.CSS_SELECTOR, '#expandAll').click()
# Take the page source after the click so the soup includes the expanded details
soup = BeautifulSoup(browser.page_source, 'lxml')
name = soup.select_one('#bookingName1') # Use BS
charge_description = browser.find_element(By.CSS_SELECTOR, '#booking1 > p > table > tbody > tr:nth-child(2) > td:nth-child(1)') # Use Sel
print(name.text)
print(charge_description.text)
browser.quit() # Close the headless browser when done
Output:
ALFORD, CALEB AARON
PAROLE VIOLATION
#4

snippsat,

Thank you, sir, for a great starting point for the tasks and challenges set before me with these complicated, non-beginner Python projects. You have given me lots of information between this thread and the other, which I believe will set me in the right direction.

Upon making progress I will update the threads accordingly!

Best Regards and God Bless,

Brandon Kastning
