Mar-07-2017, 05:57 PM
This is a problem I'm in the middle of solving, and because I don't want to spend forever on it, I'll probably brute force a solution.
Suppose there's a third party website which displays information within a radius (max 25 miles) of a given zip code. Suppose you want to scrape that website, for all the available info for the entire US. You know, so you can analyze it, or whatever.
The site in question isn't the fastest in the world, so just blasting it with thousands of zip codes will take... too long to run.
So the problem is: how do you choose zip codes strategically to cover the entire US, without too much overlap between the radii of those zip codes? This seems like basic geometry to me, but I'm bad at math, so...
What I'm working on now is basically: pick a zip code, then in each of the cardinal directions, the next zip code is as close to 10 miles away as possible without going over. Each request's 25-mile radius then extends 15 miles past the next center (in each direction, for each zip code), so there's a lot of overlap but no uncovered territory, and I'll need to dedupe the results from the external system. Is there a better way to do this, such that there's no uncovered territory and ALSO fewer zip codes involved?
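For concreteness, here's roughly what that walk could look like in Python. Everything here is a sketch under my own assumptions: `ZIP_CENTROIDS` would be a zip-to-centroid lookup you'd build yourself (e.g. from the Census ZCTA gazetteer file), and the `next_zip` helper and its crude direction filter are mine, not from any library.

```python
import math

EARTH_RADIUS_MI = 3958.8  # mean Earth radius in miles


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in miles."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MI * math.asin(math.sqrt(a))


def next_zip(current_zip, bearing, centroids, max_step=10.0):
    """From current_zip, pick the zip whose centroid is as close to
    max_step miles away as possible without going over, roughly in the
    given cardinal direction ('N', 'S', 'E', or 'W').

    Returns None if no zip centroid qualifies in that direction.
    """
    lat0, lon0 = centroids[current_zip]
    best, best_dist = None, -1.0
    for z, (lat, lon) in centroids.items():
        if z == current_zip:
            continue
        d = haversine_miles(lat0, lon0, lat, lon)
        if d > max_step:
            continue  # "without going over"
        dlat, dlon = lat - lat0, lon - lon0
        # Crude direction filter: displacement must point mostly along the
        # requested axis. (E/W ignores cos(lat) scaling -- fine for a sketch.)
        ok = {
            "N": dlat > abs(dlon),
            "S": -dlat > abs(dlon),
            "E": dlon > abs(dlat),
            "W": -dlon > abs(dlat),
        }[bearing]
        if ok and d > best_dist:
            best, best_dist = z, d
    return best
```

You'd seed this with one zip code, expand outward in all four directions until `next_zip` comes up empty, and keep a visited set so you don't walk the same centers twice.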
For reference, here's five zip codes, each with a 25(ish) mile radius, near Seattle (you can see that there will be A LOT of requests this way...).