Python Forum
Python3 + BeautifulSoup4 + lxml (HTML -> CSV) - How to write 3 Columns to MariaDB?
#11
(Mar-22-2020, 09:04 AM)ndc85430 Wrote: I'm confused. You wrote a function that you don't understand? It looks like you have the data in the variables allOpinion, allTitle and allURL, so why can't you call your function, passing in the values?

ndc85430,

I modified an existing reference function. In the reference, the bottom two lines call the function with manually entered data.

When parsing the HTML to CSV in the code above this segment, I used the same variables. I don't know how to call the function with each of those variables.

Would I go about it something like this?

CSV variables for Parse HTML Cycle #1:

allOpinion[0]
allTitle[0]
allURL

Reference Code Calling the Function w/ Custom Data:

insertVariblesIntoTable(2, 'Area 51M', 6999, '2019-04-14')
insertVariblesIntoTable(3, 'MacBook Pro', 2499, '2019-06-20')
Attempt #1 at Calling the Function w/ Parse HTML Cycle #1 Variables that CSV uses:

insertVariablesIntoTable('allTitle[0]', 'allOpinion[0]', 'allURL')
Adding the above code as the bottom line in my .py I receive the following error:

Quote:Traceback (most recent call last):
File "HTML2CSV-NoWrite3Variables-to-MySQL.Python.Variable.Passoff.MySQL.INSERT.py", line 61, in <module>
insertVariablesIntoTable('allTitle[0]', 'allOpinion[0]', 'allURL')
NameError: name 'insertVariablesIntoTable' is not defined

Do you have any suggestions as to how to remedy this?
“And one of the elders saith unto me, Weep not: behold, the Lion of the tribe of Juda, the Root of David, hath prevailed to open the book,...” - Revelation 5:5 (KJV)

“And oppress not the widow, nor the fatherless, the stranger, nor the poor; and ...” - Zechariah 7:10 (KJV)

#LetHISPeopleGo

#12
(Mar-22-2020, 09:19 AM)BrandonKastning Wrote:
insertVariablesIntoTable('allTitle[0]', 'allOpinion[0]', 'allURL')

Why the quotes? That's not how you refer to variables, is it?


Quote:Adding the above code as the bottom line in my .py I receive the following error:

Quote:Traceback (most recent call last):
File "HTML2CSV-NoWrite3Variables-to-MySQL.Python.Variable.Passoff.MySQL.INSERT.py", line 61, in <module>
insertVariablesIntoTable('allTitle[0]', 'allOpinion[0]', 'allURL')
NameError: name 'insertVariablesIntoTable' is not defined

Do you have any suggestions as to how to remedy this?

Does the function live in another file? You need to import it if so.
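
To illustrate the point about the quotes: a quoted name is a string literal, not a reference to the variable. A minimal sketch, assuming made-up stand-in data (in the real script `allTitle` comes from BeautifulSoup) and a hypothetical helper `show` that only reports what it received:

```python
# Stand-in data; in the real script this list comes from the parsed page.
allTitle = ["Example Case Title"]


def show(value):
    # Hypothetical helper: reports exactly what was passed in.
    return "received: " + repr(value)


with_quotes = show('allTitle[0]')    # passes the literal text allTitle[0]
without_quotes = show(allTitle[0])   # passes the first element of the list

print(with_quotes)
print(without_quotes)
```

With the quotes, the function never sees the scraped value, only the text of the variable name.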
#13
(Mar-22-2020, 09:23 AM)ndc85430 Wrote:
(Mar-22-2020, 09:19 AM)BrandonKastning Wrote:
insertVariablesIntoTable('allTitle[0]', 'allOpinion[0]', 'allURL')

Why the quotes? That's not how you refer to variables, is it?


Quote:Adding the above code as the bottom line in my .py I receive the following error:


Do you have any suggestions as to how to remedy this?

Does the function live in another file? You need to import it if so.

Line 37:

def insertVariblesIntoTable(allTitle, allOpinion, allURL):
Does this line not define that function?

To me it reads as "define insertVariablesIntoTable(variable1, variable2, variable3)".

Do I need to add [0] to them for it to run properly? (Except allURL, since it is just allURL = url rather than a BeautifulSoup4 HTML parse variable.)
#14
Yes, line 37 defines the function. Not sure what you mean by the last question. Remember that you need to define the function before it's called, so if your calls were before that line, then they need to be after.
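
One more thing worth checking: the definition quoted in post #13 is spelled `insertVariblesIntoTable`, while the call is spelled `insertVariablesIntoTable`. Python treats these as two unrelated names, so a mismatch like that would produce this NameError on its own. A minimal sketch of the effect, with placeholder arguments and a function body that is only an assumption for illustration:

```python
def insertVariblesIntoTable(title, opinion, url):  # note the spelling: "Varibles"
    # Placeholder body; the real function would talk to the database.
    return (title, opinion, url)


try:
    # "Variables" is a different name from "Varibles", so nothing is found.
    insertVariablesIntoTable("t", "o", "u")
except NameError as error:
    message = str(error)
    print(message)

# Calling the name exactly as it was defined works.
result = insertVariblesIntoTable("t", "o", "u")
print(result)
```

Making the `def` line and the call use the same spelling removes the error without changing the function's parameters.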
#15
(Mar-22-2020, 09:37 AM)ndc85430 Wrote: Yes, line 37 defines the function. Not sure what you mean by the last question. Remember that you need to define the function before it's called, so if your calls were before that line, then they need to be after.

ndc85430,

I have since changed the last line of my .py to call the function:

insertVariablesIntoTable()
This then results in the following error:

Quote:Traceback (most recent call last):
File "HTML2CSV-NoWrite3Variables-to-MySQL.Python.Variable.Passoff.MySQL.INSERT.py", line 62, in <module>
insertVariablesIntoTable()
NameError: name 'insertVariablesIntoTable' is not defined


I then go back in the .py and change line 37 from:

def insertVariblesIntoTable(allTitle, allOpinion, allURL):
to the following (now line 38, with line 37 commented out):

def insertVariablesIntoTable():
Then re-run the .py and receive the following error:

Quote:Failed to insert into MySQL table Failed processing format-parameters; Python 'resultset' cannot be converted to a MySQL type
MySQL connection is closed

Perhaps we are getting somewhere with this!
#16
Can you post all the code please? It's really hard to help without seeing it.
#17
Current Code for "HTML2CSV-NoWrite3Variables-to-MySQL.Python.Variable.Passoff.MySQL.INSERT.py":

from urllib.request import urlopen
from bs4 import BeautifulSoup
html = urlopen("http://law.justia.com/cases/federal/appellate-courts/F2/999/663/308588/")
bsObj = BeautifulSoup(html.read())
allOpinion = bsObj.findAll(id="opinion")
import requests
from bs4 import BeautifulSoup

url = "http://law.justia.com/cases/federal/appellate-courts/F2/999/663/308588/"
allTitle = bsObj.findAll({"title"})
allURL = url

print(allOpinion)
print(allTitle)
print(allURL)

import csv
csvRow = [allOpinion,allTitle,allURL]
csvfile = "current_F2_opinion_with_tags_current.csv"
with open(csvfile, "a") as fp:
    wr = csv.writer(fp, dialect='excel')
    wr.writerow(csvRow)

print(allOpinion[0].get_text(),url)
 
import csv
csvRow = [allOpinion[0].get_text(),allTitle[0].get_text(),allURL]
csvfile = "current_F2_opinion_without_tags_current.csv"
with open(csvfile, "a") as fp:
    wr = csv.writer(fp, dialect='excel')
    wr.writerow(csvRow)


import mysql.connector
from mysql.connector import Error

#def insertVariblesIntoTable(allTitle, allOpinion, allURL):
def insertVariablesIntoTable():
    try:
        connection = mysql.connector.connect(host='localhost',
                                             database='PythonMariaDB1',
                                             user='PythonMariaDB1',
                                             password='password1234')
        cursor = connection.cursor()
        mySql_insert_query = """INSERT INTO Single_No_Loop (all_Title, all_Opinion, all_URL) 
                                VALUES (%s, %s, %s) """

        recordTuple = (allTitle, allOpinion, allURL)
        cursor.execute(mySql_insert_query, recordTuple)
        connection.commit()
        print("Record inserted successfully into Single_No_Loop table")

    except mysql.connector.Error as error:
        print("Failed to insert into MySQL table {}".format(error))

    finally:
        if (connection.is_connected()):
            cursor.close()
            connection.close()
            print("MySQL connection is closed")

insertVariablesIntoTable()
Current Error:

Quote:Failed to insert into MySQL table Failed processing format-parameters; Python 'resultset' cannot be converted to a MySQL type
MySQL connection is closed
#18
On line 48, why aren't you extracting the text from allTitle and allOpinion in the same way that you're doing on line 27?

Also, remember to post the full traceback in future, as it contains important info about the error - like the line number it occurs on.

Also avoid using globals and pass the values into your function - the signature was sensible when you originally wrote it; I don't know why you changed that.
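
A minimal sketch of why the text has to be extracted first, assuming a tiny inline HTML document in place of the downloaded page (the markup and the snake_case names are made up for illustration). `find_all` returns a list-like result set of tag objects, indexing gives one tag, and only `get_text()` gives a plain string of the kind a database driver can store:

```python
from bs4 import BeautifulSoup

# Tiny stand-in document; the real script downloads the page instead.
html = (
    "<html><head><title>Example Title</title></head>"
    "<body><div id='opinion'>Example opinion text.</div></body></html>"
)
soup = BeautifulSoup(html, "html.parser")

all_title = soup.find_all("title")         # list-like result set of tags
all_opinion = soup.find_all(id="opinion")

first_title = all_title[0]                 # a single tag object, not a string
title_text = all_title[0].get_text()       # plain str
opinion_text = all_opinion[0].get_text()   # plain str

print(type(all_title).__name__, type(first_title).__name__, type(title_text).__name__)
print(title_text)
print(opinion_text)
```

The two strings at the end are what belongs in the tuple passed to the INSERT.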
#19
(Mar-22-2020, 10:38 AM)ndc85430 Wrote: On line 48, why aren't you extracting the text from allTitle and allOpinion in the same way that you're doing on line 27?

Also, remember to post the full traceback in future, as it contains important info about the error - like the line number it occurs on.

Also avoid using globals and pass the values into your function - the signature was sensible when you originally wrote it; I don't know why you changed that.

ndc85430,

I have since commented out line #48, and line #49 now reads as follows:

        recordTuple = (allTitle[0], allOpinion[0], allURL)
How do I post a full traceback? Is it a command switch I can append to my usual "python3 *.py" kickoff?

Regarding globals: are you referring to "allURL = url"? Does this qualify as a global variable?

How, specifically, would I avoid using globals in the function itself?

The current .py looks like the following:

from urllib.request import urlopen
from bs4 import BeautifulSoup
html = urlopen("http://law.justia.com/cases/federal/appellate-courts/F2/999/663/308588/")
bsObj = BeautifulSoup(html.read())
allOpinion = bsObj.findAll(id="opinion")
import requests
from bs4 import BeautifulSoup

url = "http://law.justia.com/cases/federal/appellate-courts/F2/999/663/308588/"
allTitle = bsObj.findAll({"title"})
allURL = url

print(allOpinion)
print(allTitle)
print(allURL)

import csv
csvRow = [allOpinion,allTitle,allURL]
csvfile = "current_F2_opinion_with_tags_current.csv"
with open(csvfile, "a") as fp:
    wr = csv.writer(fp, dialect='excel')
    wr.writerow(csvRow)

print(allOpinion[0].get_text(),url)
 
import csv
csvRow = [allOpinion[0].get_text(),allTitle[0].get_text(),allURL]
csvfile = "current_F2_opinion_without_tags_current.csv"
with open(csvfile, "a") as fp:
    wr = csv.writer(fp, dialect='excel')
    wr.writerow(csvRow)


import mysql.connector
from mysql.connector import Error

#def insertVariblesIntoTable(allTitle, allOpinion, allURL):
def insertVariablesIntoTable():
    try:
        connection = mysql.connector.connect(host='localhost',
                                             database='PythonMariaDB1',
                                             user='PythonMariaDB1',
                                             password='password1234')
        cursor = connection.cursor()
        mySql_insert_query = """INSERT INTO Single_No_Loop (all_Title, all_Opinion, all_URL) 
                                VALUES (%s, %s, %s) """

#        recordTuple = (allTitle, allOpinion, allURL)
        recordTuple = (allTitle[0], allOpinion[0], allURL)
        cursor.execute(mySql_insert_query, recordTuple)
        connection.commit()
        print("Record inserted successfully into Single_No_Loop table")

    except mysql.connector.Error as error:
        print("Failed to insert into MySQL table {}".format(error))

    finally:
        if (connection.is_connected()):
            cursor.close()
            connection.close()
            print("MySQL connection is closed")

insertVariablesIntoTable()
The current error is as follows:

Quote:Failed to insert into MySQL table Failed processing format-parameters; Python 'tag' cannot be converted to a MySQL type
MySQL connection is closed
#20
Sigh. Is there a reason you aren't calling get_text on your variables on line 49 now, exactly like you're doing on line 27?

Yes, the variables declared on lines 9-11 are all global. You should declare your function to take parameters, like you had done in post 5 and then pass the values in. You can find much information on the internet about why using globals is bad.
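
A rough sketch of the parameter-passing shape, not the thread's actual code. To keep it runnable without a server, it swaps in the standard library's `sqlite3` for MariaDB, so the placeholders are `?` rather than the `%s` used by `mysql.connector`. The function name `insert_record`, the sample values and the in-memory table are all assumptions made for illustration:

```python
import sqlite3


def insert_record(connection, title, opinion, url):
    """Insert one row; every value arrives as an argument, none as a global."""
    query = (
        "INSERT INTO Single_No_Loop (all_Title, all_Opinion, all_URL) "
        "VALUES (?, ?, ?)"
    )
    with connection:  # commits on success, rolls back on an exception
        connection.execute(query, (title, opinion, url))


# In-memory database standing in for the real MariaDB server.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE Single_No_Loop (all_Title TEXT, all_Opinion TEXT, all_URL TEXT)"
)

# The caller decides what to pass; here, plain strings.
insert_record(
    connection,
    "Example Title",
    "Example opinion text.",
    "http://example.com/case",
)

rows = connection.execute(
    "SELECT all_Title, all_Opinion, all_URL FROM Single_No_Loop"
).fetchall()
print(rows)
connection.close()
```

Because the function only uses its arguments, it can be called once per page in a loop later without touching any module-level names.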

Ah, you're handling the exception. I wasn't reading the code properly, so disregard my comment about the traceback. If you're interested, the standard library traceback module is useful when you want to print tracebacks during exception handling.
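
A minimal sketch of using the `traceback` module inside an exception handler. The function `risky` is a hypothetical stand-in for the database call and simply raises so that there is something to report:

```python
import traceback


def risky():
    # Stand-in for the database call; always fails for demonstration.
    raise ValueError("example failure")


try:
    risky()
except ValueError as error:
    # The short message, similar to what the script prints now.
    print("Failed: {}".format(error))
    # The full traceback as a string, including file names and line numbers.
    details = traceback.format_exc()
    print(details)
```

`traceback.print_exc()` does the same thing but writes straight to standard error instead of returning a string.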