Jun-06-2018, 04:00 AM
I am trying to look through HTML for certain tags and, if a certain tag is found, have Python notify me as quickly as possible. This is my code so far:
import ast
import bs4 as bs

doc = open('C:/Users/Me/AppData/Local/Programs/Python/Python36/sample_requests.txt', 'r').readlines()
results_string = doc[0]
results_list = ast.literal_eval(results_string)

results = []
for i in results_list:
    # This converts my list of strings to a list of bytes of html text.
    n_coded = i.encode()
    results.append(n_coded)

to_notify_list = []

def parseru(requested):
    soup = bs.BeautifulSoup(requested, 'lxml')
    tr_list = soup.find_all('tr')
    tr_list = tr_list[3:8]  # keep rows 3 through 7
    for tr in tr_list:
        # search the row's text; 'substring' in tr on the Tag itself
        # only checks direct children, not the row's text content
        if 'text I am searching for' in tr.text:
            to_notify_list.append(requested)

for i in results:
    parseru(i)

for i in to_notify_list:
    print(i)

I've experimented with multiprocessing and multiprocessing.dummy:
from multiprocessing.dummy import Pool

if __name__ == '__main__':
    pool = Pool(4)
    pool.map(parseru, results)

However, multiprocessing.dummy just makes the code run twice as slow.
I've also experimented with multiprocessing (without the dummy):
from multiprocessing import Pool

if __name__ == '__main__':
    pool = Pool(4)
    pool.map(parseru, results)

This just ends up running the function four times at the same time and almost crashes PyCharm each run (it freezes for several seconds). It also makes code outside of the if statement run multiple times; for instance, print calls 100 lines away run four times.
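From what I've read, the repeated prints are expected on Windows: multiprocessing starts each worker by re-importing the module, so every top-level statement runs once per worker, and each worker also gets its own copy of to_notify_list, so appends in the workers never reach the parent. Here is a sketch of the structure I think is intended, with a hypothetical contains_target() standing in for my real parseru() (the actual parsing is omitted):

```python
from multiprocessing import Pool

def contains_target(page):
    # Hypothetical stand-in for parseru(): return the page when the
    # target text is present instead of appending to a global list,
    # because globals are not shared between worker processes.
    return page if b'text I am searching for' in page else None

if __name__ == '__main__':
    # All side-effectful code stays under this guard so the workers,
    # which re-import the module on Windows, do not re-run it.
    pages = [b'<tr>nothing here</tr>',
             b'<tr>text I am searching for</tr>']
    with Pool(2) as pool:
        # Matches come back through pool.map's return value.
        to_notify_list = [p for p in pool.map(contains_target, pages) if p]
    for p in to_notify_list:
        print(p)
```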
Well, I've been at this for hours, and I am starting to feel like the dummy. One version iterates through the list at half the speed, and the other runs through the whole list four times at once. What am I doing wrong?