Python3 + BeautifulSoup4 + lxml (HTML -> CSV) - How to write 3 Columns to MariaDB? - Printable Version +- Python Forum (https://python-forum.io) +-- Forum: Python Coding (https://python-forum.io/forum-7.html) +--- Forum: Web Scraping & Web Development (https://python-forum.io/forum-13.html) +--- Thread: Python3 + BeautifulSoup4 + lxml (HTML -> CSV) - How to write 3 Columns to MariaDB? (/thread-25165.html) |
Python3 + BeautifulSoup4 + lxml (HTML -> CSV) - How to write 3 Columns to MariaDB? - BrandonKastning - Mar-22-2020 Hello Python Web Scraping Coders, I am trying to determine how to write the 3 columns that I parse from HTML URL and send to a CSV; also / or directly into a MySQL (MariaDB) Database. I have successfully loaded a working mysql.connector into my existing Python3 script and it connects to my Database Successfully! I have never gotten this far until now. Below is my code and any tips in the right direction would be greatly appreciated! Current Python3 Code: from urllib.request import urlopen from bs4 import BeautifulSoup html = urlopen("http://law.justia.com/cases/federal/appellate-courts/F2/999/663/308588/") bsObj = BeautifulSoup(html.read()) allOpinion = bsObj.findAll(id="opinion") import requests from bs4 import BeautifulSoup url = "http://law.justia.com/cases/federal/appellate-courts/F2/999/663/308588/" allTitle = bsObj.findAll({"title"}) allURL = url print(allOpinion) print(allTitle) print(allURL) import csv csvRow = [allOpinion,allTitle,allURL] csvfile = "current_F2_opinion_with_tags_current.csv" with open(csvfile, "a") as fp: wr = csv.writer(fp, dialect='excel') wr.writerow(csvRow) print(allOpinion[0].get_text(),url) import csv csvRow = [allOpinion[0].get_text(),allTitle[0].get_text(),allURL] csvfile = "current_F2_opinion_without_tags_current.csv" with open(csvfile, "a") as fp: wr = csv.writer(fp, dialect='excel') wr.writerow(csvRow) import mysql.connector from mysql.connector import Error try: connection = mysql.connector.connect(host='localhost', database='PythonMariaDB1', user='PythonMariaDB1', password='password1234') if connection.is_connected(): db_Info = connection.get_server_info() print("Connected to MySQL Server version ", db_Info) cursor = connection.cursor() cursor.execute("select database();") record = cursor.fetchone() print("You're connected to database: ", record) except Error as e: print("Error while connecting to MySQL", e) finally: if (connection.is_connected()): cursor.close() connection.close() print("MySQL connection is closed")Thank you! Best Regards, Brandon Kastning P.S. - May your families be safe! God bless! RE: Python3 + BeautifulSoup4 + lxml (HTML -> CSV) - How to write 3 Columns to MariaDB? - ndc85430 - Mar-22-2020 Do you know any SQL? If not, you'll want to at least learn the basics of creating tables and inserting rows. Seems like your question is more of an SQL one than a Python one at this point. RE: Python3 + BeautifulSoup4 + lxml (HTML -> CSV) - How to write 3 Columns to MariaDB? - BrandonKastning - Mar-22-2020 (Mar-22-2020, 07:08 AM)ndc85430 Wrote: Do you know any SQL? If not, you'll want to at least learn the basics of creating tables and inserting rows. Seems like your question is more of an SQL one than a Python one at this point. ndc85430, I do know some SQL. I found some descent reference material that I will share below; however it doesn't demonstrate in ways that I understand how to pass the current Python3 variables that I am working with. (allOpinion, allTitle and allURL) to a MySQL database table. Referring back to the documentation I see that you are right on looking at the code better for MySQL. I am setting up to do an INSERT Python Variable pass off to a MySQL Table. The Instructions are found here: The code used to demonstrate this process is below: import mysql.connector from mysql.connector import Error def insertVariblesIntoTable(id, name, price, purchase_date): try: connection = mysql.connector.connect(host='localhost', database='Electronics', user='pynative', password='pynative@#29') cursor = connection.cursor() mySql_insert_query = """INSERT INTO Laptop (Id, Name, Price, Purchase_date) VALUES (%s, %s, %s, %s) """ recordTuple = (id, name, price, purchase_date) cursor.execute(mySql_insert_query, recordTuple) connection.commit() print("Record inserted successfully into Laptop table") except mysql.connector.Error as error: print("Failed to insert into MySQL table {}".format(error)) finally: if (connection.is_connected()): cursor.close() connection.close() print("MySQL connection is closed") insertVariblesIntoTable(2, 'Area 51M', 6999, '2019-04-14') insertVariblesIntoTable(3, 'MacBook Pro', 2499, '2019-06-20')I am setting up my Database as follows for this script: PythonMariaDB1 (Database) Single_No_Loop (Table) allTitle - VARCHAR - 250 (Column #1) allOpinion - LONGTEXT (Column #2) allURL - VARCHAR - 250 (Column #3) I will use the above example code as reference and hope I can integrate it properly into this existing one. I will update either way (work or fail). Thank you for your response sir! Best Regards, Brandon Kastning RE: Python3 + BeautifulSoup4 + lxml (HTML -> CSV) - How to write 3 Columns to MariaDB? - ndc85430 - Mar-22-2020 Good! Trying things out is the best way to learn. RE: Python3 + BeautifulSoup4 + lxml (HTML -> CSV) - How to write 3 Columns to MariaDB? - BrandonKastning - Mar-22-2020 Update is as follows: I have setup my MariaDB Database as follows (Couple changes): Primary Key - opinion_ID / BIGINT / 8 / Unsigned / AUTO_INCREMENT all_Title - VARCHAR / 250 all_Opinion - LONGTEXT all_URL - VARCHAR / 250 and Modified the Python3 Variable -> MySQL INSERT to look like the following: import mysql.connector from mysql.connector import Error def insertVariblesIntoTable(allTitle, allOpinion, allURL): try: connection = mysql.connector.connect(host='localhost', database='PythonMariaDB1', user='PythonMariaDB1', password='password1234') cursor = connection.cursor() mySql_insert_query = """INSERT INTO Single_No_Loop (all_Title, all_Opinion, all_URL) VALUES (%s, %s, %s) """ recordTuple = (allTitle, allOpinion, allURL) cursor.execute(mySql_insert_query, recordTuple) connection.commit() print("Record inserted successfully into Single_No_Loop table") except mysql.connector.Error as error: print("Failed to insert into MySQL table {}".format(error)) finally: if (connection.is_connected()): cursor.close() connection.close() print("MySQL connection is closed") insertVariblesIntoTable(2, 'Area 51M', 6999, '2019-04-14') insertVariblesIntoTable(3, 'MacBook Pro', 2499, '2019-06-20')What I am unsure of is the bottom lines: insertVariblesIntoTable(2, 'Area 51M', 6999, '2019-04-14') insertVariblesIntoTable(3, 'MacBook Pro', 2499, '2019-06-20')My question is; can I disregard those lines as they do not pertain to my original script that I am implementing this into? Will the above be sufficient or are other modifications required? RE: Python3 + BeautifulSoup4 + lxml (HTML -> CSV) - How to write 3 Columns to MariaDB? - ndc85430 - Mar-22-2020 I don't understand the question. You need to call the function somewhere for its code to execute. RE: Python3 + BeautifulSoup4 + lxml (HTML -> CSV) - How to write 3 Columns to MariaDB? - BrandonKastning - Mar-22-2020 Here is my full HTML2CSV w/ Python3 Variable Passoff -> MySQL INSERT (However it's not working when it gets to the database) and I left the bottom 2 lines out of the reference code because I cannot figure out how they apply to my variables. Full .py as of now: from urllib.request import urlopen from bs4 import BeautifulSoup html = urlopen("http://law.justia.com/cases/federal/appellate-courts/F2/999/663/308588/") bsObj = BeautifulSoup(html.read()) allOpinion = bsObj.findAll(id="opinion") import requests from bs4 import BeautifulSoup url = "http://law.justia.com/cases/federal/appellate-courts/F2/999/663/308588/" allTitle = bsObj.findAll({"title"}) allURL = url print(allOpinion) print(allTitle) print(allURL) import csv csvRow = [allOpinion,allTitle,allURL] csvfile = "current_F2_opinion_with_tags_current.csv" with open(csvfile, "a") as fp: wr = csv.writer(fp, dialect='excel') wr.writerow(csvRow) print(allOpinion[0].get_text(),url) import csv csvRow = [allOpinion[0].get_text(),allTitle[0].get_text(),allURL] csvfile = "current_F2_opinion_without_tags_current.csv" with open(csvfile, "a") as fp: wr = csv.writer(fp, dialect='excel') wr.writerow(csvRow) import mysql.connector from mysql.connector import Error def insertVariblesIntoTable(allTitle, allOpinion, allURL): try: connection = mysql.connector.connect(host='localhost', database='PythonMariaDB1', user='PythonMariaDB1', password='password1234') cursor = connection.cursor() mySql_insert_query = """INSERT INTO Single_No_Loop (all_Title, all_Opinion, all_URL) VALUES (%s, %s, %s) """ recordTuple = (allTitle, allOpinion, allURL) cursor.execute(mySql_insert_query, recordTuple) connection.commit() print("Record inserted successfully into Single_No_Loop table") except mysql.connector.Error as error: print("Failed to insert into MySQL table {}".format(error)) finally: if (connection.is_connected()): cursor.close() connection.close() print("MySQL connection is closed")Just a heads up: The Python3 Script Variables: - allOpinion - allTitle - allURL The MySQL Database Column Names: - opinion_ID - all_Title - all_Opinion - all_URL How would I go about debugging this? Are you able to determine the error by looking at this post? RE: Python3 + BeautifulSoup4 + lxml (HTML -> CSV) - How to write 3 Columns to MariaDB? - ndc85430 - Mar-22-2020 What happens when you run the program? Are you getting errors or something? I don't see anywhere you're calling the insertVariablesIntoTable function.
RE: Python3 + BeautifulSoup4 + lxml (HTML -> CSV) - How to write 3 Columns to MariaDB? - BrandonKastning - Mar-22-2020 (Mar-22-2020, 08:49 AM)ndc85430 Wrote: What happens when you run the program? Are you getting errors or something? I don't see anywhere you're calling the ndc85430, When I run the program it runs without error. The CSV files populate and all is well. No mention of MySQL connection closing and the Database doesn't populate. You mention that you do not see my program calling the "insertVariablesIntoTable" function. Same function that was on the last 2 lines that I didn't understand. Since that's the case; how would I go about using "insertVariablesIntoTable" to generate the same data CSV is getting from those main variables? Thanks ndc85430! RE: Python3 + BeautifulSoup4 + lxml (HTML -> CSV) - How to write 3 Columns to MariaDB? - ndc85430 - Mar-22-2020 I'm confused. You wrote a function that you don't understand? It looks like you have the data in the variables allOpinion , allTitle and allURL , so why can't you call your function, passing in the values?
|