Python Forum
web scraping forum
#1
I often wonder whether there should be a web scraping forum (selenium, bs4, lxml usage and questions), or whether that goes into too much detail and such threads should just reside in General Coding Help.
#2
(Apr-12-2017, 03:32 PM)metulburr Wrote: I often wonder if there should be a web scraping forum?
My answer is no.
There really aren't that many threads about it.
Scientific topics (Pandas, Matplotlib, etc.) have had more traffic lately, not that we should make a category for those yet either.
#3
A whole forum? 
I wrote a simple script to download all of a user's playlists from YouTube into separate folders named after the playlists. I thought it would be something bigger, but it was quite easy. I just have to add an exclude option.

All in under 50 lines.
"As they say in Mexico 'dosvidaniya'. That makes two vidaniyas."
https://freedns.afraid.org
#4
(Apr-12-2017, 04:18 PM)wavic Wrote: A whole forum? 
Hmm, I think you misunderstand.
It's just the name for the categories.
"Forum & Off Topic" has two forums, "Board" and "Bar". If we add "Foo", there will be three forums under "Forum & Off Topic".
#5
I understood well. But I think it is related to web programming.
#6
I had never done much web scraping until recently.
Almost everything I did that involved large data came
over a direct network connection or via a download script.
Only within the past year have I become really involved in
scraping. Snippsat, metulburr (and now zivoni): thanks for sharing your wisdom
in this area; it has helped me tremendously. New for me is bs4 select vs
find.
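
For readers who have not met the two styles, here is a minimal sketch of the difference, assuming the third-party `beautifulsoup4` package is installed. The HTML snippet and every name in it are made up for illustration. `find()`/`find_all()` filter by tag name and attributes, while `select()`/`select_one()` take a CSS selector string.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, invented for this illustration only.
html = """
<div id="posts">
  <div class="post"><a class="user" href="/u/1">alice</a><p>first</p></div>
  <div class="post"><a class="user" href="/u/2">bob</a><p>second</p></div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns the first match, find_all() returns every match.
# Both filter by tag name plus attribute keyword arguments.
first_user = soup.find("a", class_="user")
users_find = [a.get_text() for a in soup.find_all("a", class_="user")]

# select() returns every match for a CSS selector, select_one() the first.
users_select = [a.get_text() for a in soup.select("div.post > a.user")]
first_post_text = soup.select_one("#posts .post p").get_text()

print(first_user["href"])
print(users_find)
print(users_select)
print(first_post_text)
```

Both approaches reach the same elements here; the selector form tends to be shorter when the path through the document is nested.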


I am now tearing into lxml so that I am aware of all that it can do, and
will do the same next with bs4. I sometimes discover little-used gems this
way.

That's a lesson I learned from my dad as a kid (literally). He often took me with him on
gemological hunts, along with my mother and siblings. One time at Mt. Mica in
West Paris, Maine, I was picking through the tailings and found a 150+ carat green
tourmaline crystal. It was almost a perfect crystal, and it had
been overlooked by the mine operators. I got to keep it, and learned a joyful lesson
that remains with me to this day.

I could have just as easily fallen down a mine shaft!
#7
I remember my first web scraping attempt. It was something simple, but I couldn't make it work. SO was useless. No one could help. I put it aside for months.
One day I felt a desire to try again. And it worked. I was so happy. The reason it worked this time was that I looked at what I got back from each function call. Print, print, a lot of prints. So I scraped my government's website. It was public data, but it was spread across multiple web pages. Bad design.
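
The "print what each call returns" habit can be sketched roughly like this. It is only an illustration using the standard library: the pages, paths, and helper names (`PAGES`, `fetch`, `parse_rows`, `CellCollector`) are hypothetical, and `fetch` reads from a dictionary instead of making a real request.

```python
from html.parser import HTMLParser

# Hypothetical pages standing in for a site that spreads one table over several URLs.
PAGES = {
    "/data?page=1": "<table><tr><td>A</td><td>1</td></tr><tr><td>B</td><td>2</td></tr></table>",
    "/data?page=2": "<table><tr><td>C</td><td>3</td></tr></table>",
}


class CellCollector(HTMLParser):
    """Collect the text of every <td>, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td":
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell and self.rows:
            self.rows[-1].append(data.strip())


def fetch(path):
    # Stand-in for a real HTTP request (assumption: pages are already in memory).
    return PAGES[path]


def parse_rows(html):
    parser = CellCollector()
    parser.feed(html)
    return parser.rows


all_rows = []
for path in sorted(PAGES):
    html = fetch(path)
    # Inspect what came back before trusting it.
    print(path, "->", type(html).__name__, len(html), "chars")
    rows = parse_rows(html)
    print("  parsed rows:", rows)
    all_rows.extend(rows)

print("total rows:", len(all_rows))
```

Printing the type and size of each intermediate value makes it obvious at which step the data stops looking the way you expected.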

I don't write code often. Only when I need something. That's not good for my coding skills, but if I have no motivation, nothing goes well. It's essential to aim at a target, but it's more important to enjoy all of this. To have fun.
#8
Some government sites are extremely well laid out.
I use them (and have used them) all the time.
I think the TIGER/Line files were one of the best, and they came out
before ESRI shapefiles. I did many a map with line files.

But back then, you ordered the data on magnetic media. Internet connectivity was too slow for the most part.
Files that today take only seconds could take days to download. I had access to very fast connections, but couldn't
use the bandwidth for such things.
#9
(Apr-12-2017, 04:53 PM)wavic Wrote: I understood well. But I think it is related to web programming.
I think of web programming and scraping as two different things. Programming is the creation of web pages, whereas scraping is grabbing web pages.

I was just thinking it would help organize content. But I guess you could do that with everything in General Coding until there is no General Coding forum at all.
#10
Grabbing?
What I see is requesting a web address and getting the response. Every web browser does the same. The difference is this: you keep only pieces of the whole HTML. Just what you need. Some formatting...
So how is clicking with the mouse for a couple of hours different from doing the same thing automated? Clicking is just what I am used to. It's not hidden. You don't sneak in and grab something that is locked away from you.
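
As a rough sketch of that idea, using only the standard library: request a page the way a browser would, then keep only the pieces you need (here, link targets). The helper names, the User-Agent string, and the sample HTML are all assumptions made for illustration, and `fetch` is defined but not called so the example runs without network access.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


def fetch(url, timeout=10):
    """Request a URL and return the decoded response body."""
    # The User-Agent value is an arbitrary placeholder.
    req = Request(url, headers={"User-Agent": "example-scraper/0.1"})
    with urlopen(req, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


class LinkExtractor(HTMLParser):
    """Keep only the href of each <a> tag and ignore everything else."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html):
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


# Hypothetical response body, so no real request is made here.
sample = '<html><body><p>intro</p><a href="/a">A</a><a>no href</a><a href="/b">B</a></body></html>'
print(extract_links(sample))
```

In real use you would pass the result of `fetch(some_url)` to `extract_links` instead of the sample string.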

I read something funny many years ago: that Windows users are forced to use the mouse so intensively that they could power the PC just by doing their everyday job. This is not so far from the truth. I am using Linux and I know the power of the shell. You can automate almost everything. The computer must serve the people, not the opposite, in order to do something useful.