Python Forum
how to add a login to a bs4 parser-script
#10
Hi there dear snippsat, many, many thanks for all you did!


I am currently trying to make progress with the script. Note: this works perfectly. You can check it with the following combination:


I have created a test account for demo testing:

login: pluginfan
pass: testpasswd123
The issues with the session are quite strange. I am musing about an appropriate solution:


Well, that said, I think we could log in with Selenium and, once that is complete, pass the session cookie to the parser, add it to the parser's session, and then parse that way. What do you think of this idea?
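
A rough sketch of how that cookie hand-over might look: Selenium's driver.get_cookies() returns a list of dicts, and each one can be set on the cookie jar of a requests.Session. The helper name copy_selenium_cookies is made up for illustration, and whether the site accepts the copied cookies (it may also look at the User-Agent or other headers) is an assumption that would need testing.

```python
def copy_selenium_cookies(selenium_cookies, session):
    """Copy cookies from Selenium into a requests-style session.

    selenium_cookies: the list of dicts returned by driver.get_cookies()
    session: an object with a requests-like cookie jar (session.cookies.set)
    Returns the number of cookies copied.
    """
    copied = 0
    for cookie in selenium_cookies:
        kwargs = {}
        # domain and path are optional keys in Selenium's cookie dicts
        if cookie.get("domain"):
            kwargs["domain"] = cookie["domain"]
        if cookie.get("path"):
            kwargs["path"] = cookie["path"]
        session.cookies.set(cookie["name"], cookie["value"], **kwargs)
        copied += 1
    return copied
```

After the Selenium login has finished, creating session = requests.Session(), calling copy_selenium_cookies(driver.get_cookies(), session) and then using session.get(...) for the parser should reuse the login, as long as the server accepts it.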


By the way: the parser code above collects conversations from the WP forums, which I would like to save in a CSV file.
There are surely smart ways to have the results contain:

author:
text:
url: - if one is given in the thread..
etc.

Well, we can fetch the pages with the Requests library or with urllib. With Requests, what I am mainly interested in is the CSV writing: how do I get the values saved in separate columns (from A to D, for example) in the CSV?
I saw that there are a number of threads on this topic but none of the solutions I have read through worked for the specific situation.


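import csv

# result_stats maps each query to the text of the matching element;
# soup and query come from the parser part of the script
result_stats = {}
result_stats[query] = soup.find(id="the wordpress-comunication-data").string

with open('the wordpress-comunication-data.csv', 'w', newline='') as fout:
    cw = csv.writer(fout)
    for q in result_stats:
        cw.writerow([q, result_stats[q]])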
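
For the author / text / url columns mentioned above, csv.DictWriter may be a better fit than a plain writer, because each post can be a dict and a missing url simply becomes an empty cell. This is only a sketch: the function name write_posts_csv and the sample rows are invented for illustration, and the real rows would come from the parser.

```python
import csv
import io

FIELDNAMES = ["author", "text", "url"]


def write_posts_csv(posts, fout):
    """Write one row per post; missing keys (e.g. no url) become empty cells."""
    writer = csv.DictWriter(
        fout, fieldnames=FIELDNAMES, restval="", extrasaction="ignore"
    )
    writer.writeheader()
    for post in posts:
        writer.writerow(post)


# made-up sample rows, only to show the shape of the data
sample_posts = [
    {"author": "alice", "text": "How do I log in?", "url": "https://example.org/a"},
    {"author": "bob", "text": "Use a session."},  # no url in this post
]

buffer = io.StringIO()
write_posts_csv(sample_posts, buffer)
print(buffer.getvalue())
```

To write a real file instead of the in-memory buffer, open it with open(filename, "w", newline="", encoding="utf-8") and pass the file object to write_posts_csv.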
But besides this CSV export, the first and most important thing is to get the

a. login part and the
b. parser part

working and running as a single script that uses one session...
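
A minimal sketch of what "login part and parser part in one session" could look like with Requests alone. Everything site-specific here is an assumption: LOGIN_URL is a placeholder, the form fields (log, pwd, wp-submit, redirect_to, testcookie) are those of the stock WordPress login form and may differ on the target site, and the function names are made up. The check for a cookie starting with wordpress_logged_in relies on WordPress's usual cookie naming.

```python
LOGIN_URL = "https://example.org/wp-login.php"  # placeholder, check the real form


def build_login_payload(username, password, redirect_to):
    """Fields of the stock WordPress login form (assumed unchanged on the site)."""
    return {
        "log": username,
        "pwd": password,
        "wp-submit": "Log In",
        "redirect_to": redirect_to,
        "testcookie": "1",
    }


def looks_logged_in(cookie_names):
    """WordPress normally sets a cookie whose name starts with 'wordpress_logged_in'."""
    return any(name.startswith("wordpress_logged_in") for name in cookie_names)


def fetch_soup_as_user(username, password, target_url):
    """Log in and fetch target_url with the SAME session, then parse it."""
    # third-party imports kept local so the helpers above work without them
    import requests
    from bs4 import BeautifulSoup

    with requests.Session() as session:
        payload = build_login_payload(username, password, target_url)
        response = session.post(LOGIN_URL, data=payload)
        response.raise_for_status()
        if not looks_logged_in(cookie.name for cookie in session.cookies):
            raise RuntimeError("login does not seem to have worked")
        page = session.get(target_url)
        page.raise_for_status()
        return BeautifulSoup(page.text, "html.parser")
```

The point of the design is that the POST to the login form and the later GET go through the same requests.Session object, so the cookies set at login are sent along automatically when the page to be parsed is requested.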

I am working on this solution :)



Update, by the way: I have seen some people who ran into very similar issues:

Selenium login looks like it works but then BeautifulSoup output shows login page

https://stackoverflow.com/questions/5238...n-pag?rq=1


question:
Quote:I'm trying to write a script in Python to grab all of the rosters
in my fantasy football league, but you have to login to ESPN first. The code I have is below. It looks like it's working when it runs -- i.e., I see the login page come up, I see it login, and the page closes. Then when I print the soup I don't see any team rosters. I saved the soup output as an html file to see what it is and it's just the page redirecting me to login again. Do I load the page through BS4 before I try to login?


answer:
Quote:Requests you're executing via Selenium in Browser has nothing common with request you're making via urllib. Just pass username/password to your HTTP-request to request data as authorized user (no Selenium required) or use pure Selenium job (note that Selenium has enough built-in methods for page scraping) – Andersson Sep 18 '18 at 10:20


To be more specific, cookies are not shared between Selenium and urllib2 so when you make the request using urllib2 the webserver won't be able to detect your previous login. As others have stated just stick with Selenium for all HTTP requests and you should be OK

answer2 :
Quote:You are using selenium to login and then using urllib2 to open the URL which uses another session to goto the site. Get the source from selenium webdriver and then use it with BeautifulSoup and it should work.



answer 3
Try this instead of urllib2
driver.get("http://games.espn.com/ffl/leaguerosters?leagueId=11111")
# query the website and return the html to the variable 'page'
page = driver.page_source
# parse the html using beautiful soup and store in variable 'soup'
soup = BeautifulSoup(page, 'html.parser')
Well, I guess that I need to dig deeper into all that stuff :)
Wordpress - super toolkits a. http://wpgear.org/ :: and b. https://github.com/miziomon/awesome-wordpress :: Awesome WordPress: A curated list of amazingly awesome WordPress resources and awesome python things https://github.com/vinta/awesome-python


Messages In This Thread
RE: how to add a login to a bs4 parser-script - by apollo - Jul-02-2020, 09:52 PM
