Python Forum

Full Version: Search Engine
You're currently viewing a stripped down version of our content. View the full version with proper formatting.
Can anybody tell me what are the steps to create a search engine such as google? I've read that it involves using python libraries such as beautiful soup/scrapy to crawl through different websites. And I understand that Google has an index that stores all of this information across millions of sites on a sever somewhere. Is this something that I would be able to replicate on my own?
First, build a facility and load it with servers. In a July 2016 it was reported that Google at the time had 2.5 million servers.

To start scraping data, you can go through local tutorials to get started.
web scraping part 1
web scraping part 2

I do a lot of web scraping, and have 30 Tb of disk space. You can do quite a lot with that.