Python Forum
Calculate the fewest zip codes, for the largest coverage
#1
This is a problem I'm in the middle of solving, and because I don't want to spend forever on it, I'll probably brute force a solution.

Suppose there's a third party website which displays information within a radius (max 25 miles) of a given zip code.  Suppose you want to scrape that website, for all the available info for the entire US.  You know, so you can analyze it, or whatever.

The site in question isn't the fastest in the world, so just blasting it with thousands of zip codes will take... too long to run.

So the problem is: how do you choose zip codes strategically to cover the entire US, without too much overlap between the search circles around those zip codes?  This seems like basic geometry to me, but I'm bad at math, so...

What I'm working on now is basically this: pick a zip code, then in each of the cardinal directions, pick the next zip code that is as close to 10 miles away as possible without going over.  So there's a 15 mile overlap (in each direction, for each zip code), but there's no uncovered territory.  That means I'll need to dedupe the results from the external system.  Is there a better way to do this, such that there's no uncovered territory and ALSO fewer zips involved?

For reference, here are five zip codes, each with a 25(ish) mile radius, near Seattle (you can see that there will be A LOT of requests this way...).
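
A rough sketch of one alternative to the 10-mile stepping scheme above, assuming the site really returns everything inside a full 25-mile circle. Circles of radius r cover a flat area with the least overlap when their centers sit on a hexagonal (triangular) lattice with neighbors r * sqrt(3) apart, which is about 43.3 miles for r = 25. The function name `hex_grid_centers` and its bounding-box parameters are made up for illustration. It returns lat/lon points, not zip codes: each point would still have to be snapped to the nearest real zip code, so the radius passed in should be smaller than 25 by at least the largest snapping distance.

```python
import math

EARTH_RADIUS_MILES = 3958.8                       # mean Earth radius
MILES_PER_DEG_LAT = math.pi * EARTH_RADIUS_MILES / 180


def hex_grid_centers(lat_min, lat_max, lon_min, lon_max, radius_miles):
    """Return (lat, lon) circle centers on a hexagonal lattice covering a box.

    Flat-earth approximation: longitude degrees are converted to miles using
    the latitude in the box where a degree of longitude is widest, so the
    real spacing is never larger than intended anywhere in the box.
    """
    if lat_min <= 0 <= lat_max:
        cos_ref = 1.0
    else:
        cos_ref = max(math.cos(math.radians(lat_min)),
                      math.cos(math.radians(lat_max)))

    lat_step = 1.5 * radius_miles / MILES_PER_DEG_LAT            # row spacing
    lon_step = math.sqrt(3) * radius_miles / (MILES_PER_DEG_LAT * cos_ref)

    centers = []
    row = 0
    while True:
        lat = lat_min + row * lat_step
        if lat > lat_max + lat_step:      # one extra row past the top edge
            break
        # Every other row is shifted by half a step to form the hex pattern.
        start = lon_min - (lon_step / 2 if row % 2 else 0.0)
        col = 0
        while True:
            lon = start + col * lon_step
            if lon > lon_max + lon_step:  # one extra column past the right edge
                break
            centers.append((lat, lon))
            col += 1
        row += 1
    return centers


# Hypothetical box roughly around western Washington.
centers = hex_grid_centers(46.0, 49.0, -124.0, -120.0, 25.0)
print(len(centers), "circle centers")
```

Each hex-lattice circle accounts for roughly 2.6 * r^2 of area, about 1,600 square miles at r = 25, versus 100 square miles per request on a 10-mile square grid, so the request count should drop substantially even after shrinking the radius to allow for snapping.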

#2
Perhaps you can use the lon/lat of each point to build a grid. Then I think you can look up the corresponding zip code for each of these points from one of several sources. There should be some public database available, or even Google. I am not American, so I can't be more specific without more digging.
"As they say in Mexico 'dosvidaniya'. That makes two vidaniyas."
https://freedns.afraid.org
#3
The problem with using lon/lat is that... they curve. 1 degree of longitude in Minnesota is different than 1 degree of longitude in Texas.
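
To put numbers on that, here is a small sketch using the spherical approximation (miles per degree of longitude shrink with the cosine of the latitude). The city latitudes are approximate and only there for illustration.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, spherical approximation


def miles_per_degree_longitude(lat_deg):
    """Length of one degree of longitude, in miles, at the given latitude."""
    return math.radians(1) * EARTH_RADIUS_MILES * math.cos(math.radians(lat_deg))


# Approximate latitudes, for illustration only.
for name, lat in [("Minneapolis, MN", 44.98), ("Houston, TX", 29.76)]:
    print(f"{name}: {miles_per_degree_longitude(lat):.1f} miles per degree of longitude")
```

That works out to roughly 49 miles per degree in Minneapolis versus roughly 60 in Houston, while a degree of latitude stays at about 69 miles everywhere.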

This is an interesting problem, but it really has nothing to do with Python. So this thread will probably be pruned in the near future.
#4
One way to do this is to use the US Census Bureau Geocoder.
The main site is here: https://geocoding.geo.census.gov/
You can submit an address batch file if you want to do it by street
addresses (if you don't know latitude and longitude, which, by the way,
can be calculated if you know the center lat and lon, the distance from
the center, and the angle to the point).

Given that information, you will receive:
Output:
Counties:
  OID: 27590286621899
  STATE: 33
  FUNCSTAT: A
  AREAWATER: 177190123
  NAME: Belknap County
  LSADC: 06
  CENTLON: -071.4224504
  BASENAME: Belknap
  INTPTLAT: +43.5191091
  COUNTYCC: H1
  MTFCC: G4020
  COUNTY: 001
  GEOID: 33001
  CENTLAT: +43.5179017
  INTPTLON: -071.4253661
  AREALAND: 1040130770
  COUNTYNS: 00873174
  OBJECTID: 3195
Here's a document that explains the whole process: https://www.census.gov/mso/www/training/...-13-16.pdf
And another for finding lat and long from bearing and distance from a point: http://www.movable-type.co.uk/scripts/latlong.html
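
For the bearing-and-distance calculation, a minimal sketch of the standard spherical "destination point" formula looks something like this. The function name and the choice of miles are my own, and it treats the Earth as a sphere, which should be close enough at 25-mile scales.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, spherical approximation


def destination_point(lat_deg, lon_deg, bearing_deg, distance_miles):
    """Return (lat, lon) reached from a start point along an initial bearing.

    Bearing is in degrees clockwise from north; distance is in miles.
    """
    lat1 = math.radians(lat_deg)
    lon1 = math.radians(lon_deg)
    theta = math.radians(bearing_deg)
    delta = distance_miles / EARTH_RADIUS_MILES  # angular distance in radians

    lat2 = math.asin(math.sin(lat1) * math.cos(delta)
                     + math.cos(lat1) * math.sin(delta) * math.cos(theta))
    lon2 = lon1 + math.atan2(math.sin(theta) * math.sin(delta) * math.cos(lat1),
                             math.cos(delta) - math.sin(lat1) * math.sin(lat2))

    # Normalize longitude to the range [-180, 180).
    lon2_deg = (math.degrees(lon2) + 540) % 360 - 180
    return math.degrees(lat2), lon2_deg


# Example: 25 miles due east of a hypothetical center point.
print(destination_point(47.6, -122.3, 90, 25))
```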

You can also download all of the raw data (TIGER files) if you want to do it all yourself.
But it's a lot easier to use the API.
#5
Hi,

Not sure if this is still current, but some thoughts...

My initial thoughts related to long/lat, but I then read about your difficulties with this. So in that vein, have you come across the Haversine formula? (I hadn't.) It may help. You take a FULL list of zip codes and use this, perhaps combined with some alternative APIs, to remove zip codes from the full list that fall into the same area. This may give you a more manageable list, but without trying it, I have no idea.
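
Here is roughly what I mean, as an untested-on-real-data sketch: the Haversine distance plus a simple greedy pass that keeps a zip code only if it is far enough from every zip already kept. The function names, the labels, and the coordinates are hypothetical placeholders, not real zip data.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, spherical approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def thin_zip_codes(zip_centroids, min_separation_miles):
    """Greedily keep zips at least min_separation_miles from all kept zips.

    zip_centroids maps a zip code to its (lat, lon) centroid.
    """
    kept = {}
    for zip_code, (lat, lon) in zip_centroids.items():
        if all(haversine_miles(lat, lon, klat, klon) >= min_separation_miles
               for klat, klon in kept.values()):
            kept[zip_code] = (lat, lon)
    return kept


# Hypothetical centroids (placeholder labels, not real zip codes).
sample = {
    "A": (47.60, -122.33),
    "B": (47.62, -122.20),   # only a few miles from A, so it gets dropped
    "C": (47.25, -122.44),
    "D": (47.98, -122.20),
}
print(list(thin_zip_codes(sample, 20)))
```

Every dropped zip ends up within the threshold distance of a kept one, so with a threshold at or below the site's search radius each zip centroid still falls inside some kept zip's circle. It says nothing about land that is far from any centroid, and the pairwise comparison is slow for the full national list unless you add some kind of spatial index.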

Some links that you may have/have not seen:

http://mcdc.missouri.edu/allabout/zipcodes.html

https://www.zipcodeapi.com/
In particular -> Find Close Zip Codes (https://www.zipcodeapi.com/API#matchClose ) (50 free requests/hour)

The API can take a list of zip codes and match together those that are within a specified distance of each other.

Sounds like an interesting conundrum. Even Quora has been asking this:

https://www.quora.com/Geolocation-What-i...entire-USA

But I may be covering the same ground as you.

Bass

PS I have just seen this blog. It seems to generate a single number that relates to the geography by combining long and lat. This may let you feed in all the zip codes and extract those that are 'unique'.

http://mitchelsellers.com/blogs/2012/01/...erver.aspx