How to get a new line - Python Forum, Web Scraping & Web Development
How to get a new line - Calli - Apr-18-2020

After scraping the website, everything works as expected. Here is the code:

    #!/usr/bin/python
    import urllib2
    from bs4 import BeautifulSoup

    website = ("http://www.ahajokes.com/ym01.html")
    page = urllib2.urlopen(website)
    soup = BeautifulSoup(page,'html.parser')
    yomama = soup.find(id="Joke_box")
    name = yomama.text.strip()
    print (name)

Output

    Yo mama so fat God told her he had no room in heaven and the devil said there was no room in hell (Submitted by )Yo Mama so fat her BMI is measured in acres. (Submitted by )Yo Mama so fat when she went to the movies she sat next to everyone (Submitted by )Yo mama so fat when her beeper goes off, people thought she was backing upYo mama so fat her nickname is "Lardo"

What I want is this: wherever the text "(Submitted by )" appears, it should be dropped and a new line inserted in its place. How can I achieve this?

RE: How to get a new line - snippsat - Apr-18-2020

You should not use Python 2 anymore; I can tell you are using it from urllib2. The HTML is a little messy, but this gets close. I always use Requests for this kind of thing.

    import requests
    from bs4 import BeautifulSoup

    website = ("http://www.ahajokes.com/ym01.html")
    page = requests.get(website)
    soup = BeautifulSoup(page.content, 'html.parser')
    yomama = soup.find(id="Joke_box")
    temp = yomama.text.strip()
    # Replace each "(Submitted by )" marker with a newline
    name = temp.replace('(Submitted by )', '\n')
    print(name)

Fixing the long last line as well is more difficult. It may need a regex on the raw HTML that yomama returns, rather than on .text.
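As a rough sketch of the regex idea mentioned in the reply above, the text can be split without touching the raw HTML, under two assumptions that are not from the thread: that the marker always looks like "(Submitted by ...)", and that every joke on this page starts with "Yo mama" in some capitalisation. The names MARKER, SAMPLE and split_jokes are hypothetical, and SAMPLE is a shortened copy of the output quoted in the question, so the example runs without network access.

```python
import re

# Hypothetical pattern: "(Submitted by )" with or without a name inside,
# plus any whitespace around it.
MARKER = re.compile(r"\s*\(Submitted by[^)]*\)\s*")

# Shortened copy of the output quoted in the question.
SAMPLE = (
    "Yo Mama so fat her BMI is measured in acres. (Submitted by )"
    "Yo Mama so fat when she went to the movies she sat next to everyone (Submitted by )"
    "Yo mama so fat when her beeper goes off, people thought she was backing up"
    'Yo mama so fat her nickname is "Lardo"'
)


def split_jokes(text):
    """Return one joke per list item.

    Assumes every joke begins with "Yo mama" (any capitalisation),
    which holds for the sample but is a guess about the page in general.
    """
    # Step 1: turn each "(Submitted by ...)" marker into a newline.
    text = MARKER.sub("\n", text)
    # Step 2: split on newlines and also just before each "Yo mama",
    # which separates jokes that had no marker between them.
    # Splitting on an empty match needs Python 3.7 or newer.
    parts = re.split(r"\n|(?=Yo mama)", text, flags=re.IGNORECASE)
    # Drop the empty strings the split leaves behind.
    return [part.strip() for part in parts if part.strip()]


print("\n".join(split_jokes(SAMPLE)))
```

With the real page, `yomama.text.strip()` from the code above would be passed in place of SAMPLE. If a joke ever contains "Yo mama" in the middle of a sentence, this heuristic would split it in two, so working from the raw HTML structure may still be the more robust route.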

RE: How to get a new line - Calli - Apr-19-2020

Thank you so much, this was really helpful.