Python Forum
How to create dynamic webscraper in Django using BeautifulSoup
#1
I am trying to build a dynamic web scraper using Django. I have an HTML page with buttons: when I press one button, it should start the web scraper and save the data into the database, and later, from another button, I can download that data in CSV format.

I have prepared the HTML page for the buttons and written the web scraper script. In my views.py I have imported the scraper file and called its function, but when I run python manage.py runserver, the scraper starts and my web page is not reachable.

First, how can I solve this problem? Second, how can I connect my scraper with the HTML button? Third, how can I save the data into the database? Fourth, how can I download it as CSV when I press the button?

Below is my views.py:

from django.shortcuts import render
from django.conf.urls import url
from django.conf.urls import include
from django.http import HttpResponse
from bs4 import BeautifulSoup
import requests

import Up  # this is the web scraping file

# Create your views here.

#def index(request):
#    first = {"here":"will enter more details"}
#    return render(request, "files/first-page.html", context=first)
    #return HttpResponse("<em>Rera details will be patched here</em>")    

# Uncomment the view below to execute the scraper, but then the HTML page will not be accessible

def scrapper_view(request):
    Up.scraper()  # calling the scraper function here
    return HttpResponse("<em>Scraper Started</em>")
This is my HTML code:

<!DOCTYPE html>
{% load staticfiles %}
<html>
<head>
<meta charset="utf-8"/>
<style>
table, th, td{
border: 1px solid black;
text-align:center;
}
</style>
</head>
<body>

<h1 style="background-color:rgb(180,180,180);">DOWNLOAD RERA RAW DATA FROM AVAILABLE SCRAPERS</h1>
<table align="center" style="width:45%">

<tr>
<th align="center"> State </th>
<th align="center"> Data</th>
<th align="center"> Agents</th>
<th align="center"> Download</th>
</tr>

<tr>
<td>MAHARASHTRA</td>

<td><input type="submit" value="Start Crawling"/></td>
<td><input type="submit" value="Start Crawling"/></td>
<td><input type="submit" name="submit" value="Download"/></td>
</tr>

<tr>
<td>GUJARAT</td>
<td><input type="submit" value="Start Crawling"/></td>
<td><input type="submit" value="Start Crawling"/></td>
<td><input type="submit" name="Download" value="Download"/></td>
</tr>

<tr>
<td>KARNATAKA</td>
<td><input type="submit" value="Start Crawling"/></td>
<td><input type="submit" value="Start Crawling"/></td>
<td><input type="submit" name="Download" value="Download"/></td>
</tr>

<tr>
<td>UTTARAKHAND</td>
<td><input type="submit" value="Start Crawling"/></td>
<td><input type="submit" value="Start Crawling"/></td>
<td><input type="submit" name="Download" value="Download"/></td>
</tr>

<tr>
<td>HIMACHAL PRADESH</td>
<td><input type="submit" value="Start Crawling"/></td>
<td><input type="submit" value="Start Crawling"/></td>
<td><input type="submit" name="Download" value="Download"/></td>
</tr>

<tr>
<td>UTTAR PRADESH</td>
<td><input type="submit" value="Start Crawling"/></td>
<td><input type="submit" value="Start Crawling"/></td>
<td><input type="submit" name="Download" value="Download"/></td>
</tr>
<img src="{% static "images/joker.jpg" %}" alt="No images found">
<tr>
<td>MADHYA PRADESH</td>
<td><input type="submit" value="Start Crawling"/></td>
<td><input type="submit" value="Start Crawling"/></td>
<td><input type="submit" name="Download" value="Download"/></td>
</tr>

<tr>
<td>PUNJAB</td>
<td><input type="submit" value="Start Crawling"/></td>
<td><input type="submit" value="Start Crawling"/></td>
<td><input type="submit" name="Download" value="Download"/></td>
</tr>

<tr>
<td>DELHI/NCR</td>
<td><input type="submit" value="Start Crawling"/></td>
<td><input type="submit" value="Start Crawling"/></td>
<td><input type="submit" name="Download" value="Download"/></td>
</tr>

<tr>
<td>ODISHA</td>
<td><input type="submit" value="Start Crawling"/></td>
<td><input type="submit" value="Start Crawling"/></td>
<td><input type="submit" name="Download" value="Download"/></td>
</tr>

<tr>
<td>BIHAR</td>
<td><input type="submit" value="Start Crawling"/></td>
<td><input type="submit" value="Start Crawling"/></td>
<td><input type="submit" name="Download" value="Download"/></td>
</tr>

<tr>
<td>Kerala</td>
<td><input type="submit" value="Start Crawling"/></td>
<td><input type="submit" value="Start Crawling"/></td>
<td><input type="submit" name="Download" value="Download"/></td>
</tr>

<tr>
<td>Goa</td>
<td><input type="submit" value="Start Crawling"/></td>
<td><input type="submit" value="Start Crawling"/></td>
<td><input type="submit" name="Download" value="Download"/></td>
</tr>


</table>

</body>
</html>


UP stands for Uttar Pradesh, and the button I wish to attach this scraper to is "Start Crawling" in the Data column.

Can anyone please guide me?
#2
What error do you get when you try to access the page? Post the traceback here.

Quote:Second, how can I connect my scraper with the HTML button?
Just call the view function. That's it. I don't see any problem here.
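
To illustrate what "calling the view function" could involve, here is a minimal sketch using only the standard library. It assumes the page becomes unreachable because the scraper runs for a long time inside the request (the thread does not confirm this), so the scrape is started in a background thread and the caller returns immediately. Everything here is hypothetical: `scraper()` is a stand-in for `Up.scraper()`, whose code is not shown, `SCRAPED_ROWS` is an in-memory stand-in for a database table (a Django model would take its place), and the sample rows are placeholders.

```python
import csv
import io
import threading


def scraper():
    """Hypothetical stand-in for Up.scraper(); returns placeholder rows."""
    return [
        {"state": "UTTAR PRADESH", "project": "Example Project A"},
        {"state": "UTTAR PRADESH", "project": "Example Project B"},
    ]


# In-memory stand-in for a database table (assumption: a Django model
# would replace this list in the real project).
SCRAPED_ROWS = []
_rows_lock = threading.Lock()


def run_scraper_job():
    """Run the scraper and store its rows; intended to run off the request thread."""
    rows = scraper()
    with _rows_lock:
        SCRAPED_ROWS.extend(rows)


def start_scraper_in_background():
    """Start the scraping job in a daemon thread and return without waiting."""
    worker = threading.Thread(target=run_scraper_job, daemon=True)
    worker.start()
    return worker


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dictionaries to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

In a Django view, `start_scraper_in_background()` would be called just before returning the `HttpResponse`, so the response is not held up by the scrape. A second view could return `rows_to_csv(...)` in an `HttpResponse` with `content_type="text/csv"` and a `Content-Disposition` header so the browser downloads it. Each button would then sit in a form or link that points at the URL of the matching view. A thread is only the simplest option; for long or frequent scrapes a task queue such as Celery is a common alternative.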