Python Forum
How does one clean a populated table in MySQL/MariaDB? Copying values across tables?
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
How does one clean a populated table in MySQL/MariaDB? Copying values across tables?
#1
Question 
How does one clean a populated table in MySQL/MariaDB? Copying values across tables?

I was using python, pandas, pymysql, sqlalchemy, openpyxl, odfpy, requests and bs4 to aggregate data from Wikipedia and payload it into MariaDB databases.

I am working on getting my countries Cities, Towns and corresponding Counties organized.

This is what I currently have in one example (the table has both Cities and Towns; I want to separate them by copying all the Cities to a new Table in the same order they are set in the aggregated database and I want to copy all the Towns to a new Table in the same order as they are set in the aggregated database). *I hope that makes sense*.

Current Table: NJ_Cities_CSV

Columns:
  • number
  • name
  • type
  • county
  • population_2020_census
  • population_2010_census
  • change
  • sq_mi
  • km2
  • population_density
  • form_of_government
  • incorporated_year

Tables for Towns (NEW): NJ_Only_Towns

Tables for Cities (NEW): NJ_Only_Cities


How would I go about achieving this?

I was hoping it would be as easy as a few SQL Statements and then write a python script to execute them using PyMySQL?

Below is screenshots of the mixed table as aggregated from the following Wikipedia Article: https://en.wikipedia.org/wiki/List_of_mu...New_Jersey

[Image: 1-from-2022-01-15-07-40-27.png]
upload image online

[Image: 2-2022-01-15-07-41-13.png]

Thank you everyone for this forum!

Best Regards,

Brandon Kastning
“And one of the elders saith unto me, Weep not: behold, the Lion of the tribe of Juda, the Root of David, hath prevailed to open the book,...” - Revelation 5:5 (KJV)

“And oppress not the widow, nor the fatherless, the stranger, nor the poor; and ...” - Zechariah 7:10 (KJV)

#LetHISPeopleGo

Reply
#2
Lightbulb 
I have been working on this and so far I have come up with the following: Not working yet, but I am certainly getting closer! Even working on upgrading my coding style by adding Print Lines. Once it works; it should be a nice script to build with.

Sources of Learning: Blogs/Tutorials:


https://support.awesome-table.com/hc/en-...onditions-
https://python-forum.io/thread-35939.html
https://www.w3schools.com/python/ref_func_print.asp

I have added two more tables; the Source Table: "NJ_Cities_CSV" contained more types of municipalities, such as Villages and Boroughs; I added those. 4 Table INSERT payload from criteria specified SQL Statements (which I haven't figured out 100% just yet) ...


Code So far:

# Learning on Python-Forum.io
# Attempt #1
# Disabled American Constitutional Pre-Law Student: BrandonKastning
# Date: 01/15/2021
# Script: Kajukenbo_divide_by_criteria_mariadb.pymysql.py
# Purpose: Building Block for Python 3.9.9 + MariaDB 10.3.27
# Thread URL with Sources of Learning (Cited on Board)
# Thread URL: https://python-forum.io/thread-36085.html
 
# Kajukenbo
# DB: Exodus_J3x_Dev_Bronson
# Source Table: NJ_Cities_CSV
# Destination Table #1: NJ_Only_Cities
# Destination Table #2: NJ_Only_Towns
# Destination Table #3: NJ_Only_Villages
# Destination Table #4: NJ_Only_Boroughs
# Purpose: Populate column "american_city_name" from Source Table to Destination Table #1
# Purpose: Populate column "american_town_name" from Source Table to Destination Table #2
# Purpose: Populate column "american_village_name" from Source Table to Destination Table #3
# Purpose: Populate column "american_borough_name" from Source Table to Destination Table #4
import pymysql.cursors
 
# Connect to the database
connection = pymysql.connect(host='localhost',
                             user='brandon',
                             password='password',
                             database='Exodus_J3x_Dev_Bronson',
                             cursorclass=pymysql.cursors.DictCursor)
 
with connection:
    with connection.cursor() as cursor:
        sql = "INSERT INTO 'NJ_Only_Cities' (`american_city_name`) SELECT * WHERE type = 'City' FROM `NJ_Cities_CSV`;"
        cursor.execute(sql)
print("American Cities Payload Executed without Error! [Line 33]")

with connection:
    with connection.cursor() as cursor:
        sql = "INSERT INTO 'NJ_Only_Towns' (`american_town_name`) SELECT * WHERE type = 'Town' or type = 'Township' FROM `NJ_Cities_CSV`;"
        cursor.execute(sql)
print("American Towns Payload Executed without Error! [Line 39]")

with connection:
    with connection.cursor() as cursor:
        sql = "INSERT INTO 'NJ_Only_Villages' (`american_village_name`) SELECT * WHERE type = 'Village' FROM `NJ_Cities_CSV`;"
        cursor.execute(sql)
print("American Villages Payload Executed without Error! [Line 45]")

with connection:
    with connection.cursor() as cursor:
        sql = "INSERT INTO 'NJ_Only_Boroughs' (`american_borough_name`) SELECT * WHERE type = 'Borough' FROM `NJ_Cities_CSV`;"
        cursor.execute(sql)
print("American Boroughs Payload Executed without Error! [Line 51]")

My output is as follows:


root@icedragon:/home/brandon/Desktop/Exodus/Python.Scripts/Python.Forum.io/06_Kajukenbo# python3.9 Kajukenbo_divide_by_criteria_mariadb.pymysql.py
Traceback (most recent call last):
  File "/home/brandon/Desktop/Exodus/Python.Scripts/Python.Forum.io/06_Kajukenbo/Kajukenbo_divide_by_criteria_mariadb.pymysql.py", line 33, in <module>
    cursor.execute(sql)
  File "/usr/local/lib/python3.9/site-packages/pymysql/cursors.py", line 148, in execute
    result = self._query(query)
  File "/usr/local/lib/python3.9/site-packages/pymysql/cursors.py", line 310, in _query
    conn.query(q)
  File "/usr/local/lib/python3.9/site-packages/pymysql/connections.py", line 548, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "/usr/local/lib/python3.9/site-packages/pymysql/connections.py", line 775, in _read_query_result
    result.read()
  File "/usr/local/lib/python3.9/site-packages/pymysql/connections.py", line 1156, in read
    first_packet = self.connection._read_packet()
  File "/usr/local/lib/python3.9/site-packages/pymysql/connections.py", line 725, in _read_packet
    packet.raise_for_error()
  File "/usr/local/lib/python3.9/site-packages/pymysql/protocol.py", line 221, in raise_for_error
    err.raise_mysql_exception(self._data)
  File "/usr/local/lib/python3.9/site-packages/pymysql/err.py", line 143, in raise_mysql_exception
    raise errorclass(errno, errval)
pymysql.err.ProgrammingError: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near ''NJ_Only_Cities' (`american_city_name`) SELECT * WHERE type = 'City' FROM `NJ...' at line 1")
I believe I have to modify my SQL Statement(s) x 4 to match a working Syntax. Not sure where I went wrong.

Thank you everyone! Smile

Best Regards,

Brandon Kastning
“And one of the elders saith unto me, Weep not: behold, the Lion of the tribe of Juda, the Root of David, hath prevailed to open the book,...” - Revelation 5:5 (KJV)

“And oppress not the widow, nor the fatherless, the stranger, nor the poor; and ...” - Zechariah 7:10 (KJV)

#LetHISPeopleGo

Reply
#3
Thumbs Up 
Source of Assistance:

I was able to get some assistance from #mysql on Libera.chat (IRC Network)
  • - thumbs (Quotes)
    - Xgc (Syntax) & (DB Normalization & Importance)

MariaDB took the following STATEMENTS and it worked!

However, it does not work in pymysql for some reason! I can execute a .py with the pymysql and my working Maria DB SQL STATEMENT and upon refresh and view table (shows nothing); then if I execute it from MariaDB CLI and execute it; it works and when I refresh, the ID columns are (the amount of times I executed the python script -- yet the values are hidden and it just continues from a high # such as ID = 190 instead of 1 because it ran with pymysql and python 3.9; just a glitch or something).

The working MariaDB SQL STATEMENT for the following destination columns for the 4 types of Municipalities for New Jersey scraped from Wikipedia are as follows:

id / INT / 11 - AUTO_INCREMENT + PRIMARY KEY
number / BIGINT / 20 
name / TEXT
type / TEXT
county / TEXT
population_2020_census / BIGINT / 20
population_2010_census / BIGINT / 20
change / TEXT
sq_mi / DOUBLE
km2 / DOUBLE
population_density / TEXT
form_of_government / TEXT
incorporated_year / TEXT
american_constitutional_municipality_entry_timestamp / TIMESTAMP / CURRENT_TIMESTAMP
Destination Tables with the above schema & columns:
  • NJ_Only_Cities_3 (Worked 100% w/ CLI STATEMENT)
    NJ_Only_Cities_4 (The Glitch w/ PyMYSQL)
    NJ_Only_Towns (Worked 100% w/ CLI STATEMENT)
    NJ_Only_Villages (Worked 100% w/ CLI STATEMENT)
    NJ_Only_Boroughs (Worked 100% w/ CLI STATEMENT)

NJ_Only_Cities_3:

MariaDB [Exodus_J3x_Dev_Bronson]> INSERT INTO `NJ_Only_Cities_3` (`number`,`name`,`type`,`county`,`population_2020_census`,`population_2010_census`,`change`,`sq_mi`,`km2`,`population_density`,`form_of_government`,`incorporated_year`) SELECT `number`,`name`,`type`,`county`,`population_2020_census`,`population_2010_census`,`change`,`sq_mi`,`km2`,`population_density`,`form_of_government`,`incorporated_year` FROM `NJ_Cities_CSV` WHERE (`type`) = 'City'; 
Query OK, 52 rows affected (0.093 sec)
Records: 52  Duplicates: 0  Warnings: 0
Screenshot:

[Image: 1-THREE-2022-01-16-21-36-23.png]

Screenshot (From the pymysql payload python script which did not error) (I executed it several times before manually executing the STATEMENT on MariaDB CLI); then I see the glitch:

[Image: 2-FOUR-2022-01-16-21-37-06.png]

NJ_Only_Towns:

MariaDB [Exodus_J3x_Dev_Bronson]> INSERT INTO `NJ_Only_Towns` (`number`,`name`,`type`,`county`,`population_2020_census`,`population_2010_census`,`change`,`sq_mi`,`km2`,`population_density`,`form_of_government`,`incorporated_year`) SELECT `number`,`name`,`type`,`county`,`population_2020_census`,`population_2010_census`,`change`,`sq_mi`,`km2`,`population_density`,`form_of_government`,`incorporated_year` FROM `NJ_Cities_CSV` WHERE (`type`) = 'Town' or (`type`) ='Township';"
Query OK, 256 rows affected (0.203 sec)
Records: 256  Duplicates: 0  Warnings: 0
Screenshot:

[Image: 5-TOWNS-2022-01-16-21-45-36.png]

NJ_Only_Villages:

MariaDB [Exodus_J3x_Dev_Bronson]> INSERT INTO `NJ_Only_Villages` (`number`,`name`,`type`,`county`,`population_2020_census`,`population_2010_census`,`change`,`sq_mi`,`km2`,`population_density`,`form_of_government`,`incorporated_year`) SELECT `number`,`name`,`type`,`county`,`population_2020_census`,`population_2010_census`,`change`,`sq_mi`,`km2`,`population_density`,`form_of_government`,`incorporated_year` FROM `NJ_Cities_CSV` WHERE (`type`) = 'Village';
Query OK, 3 rows affected (0.082 sec)
Records: 3  Duplicates: 0  Warnings: 0
Screenshot:

[Image: 3-VILLAGES-2022-01-16-21-39-56.png]

NJ_Only_Boroughs:

MariaDB [Exodus_J3x_Dev_Bronson]> INSERT INTO `NJ_Only_Boroughs` (`number`,`name`,`type`,`county`,`population_2020_census`,`population_2010_census`,`change`,`sq_mi`,`km2`,`population_density`,`form_of_government`,`incorporated_year`) SELECT `number`,`name`,`type`,`county`,`population_2020_census`,`population_2010_census`,`change`,`sq_mi`,`km2`,`population_density`,`form_of_government`,`incorporated_year` FROM `NJ_Cities_CSV` WHERE (`type`) = 'Borough';
Query OK, 253 rows affected (0.141 sec)
Records: 253  Duplicates: 0  Warnings: 0
Screenshot:

[Image: 4-BOROUGHS-2022-01-16-21-40-23.png]

Thank you everyone for this forum and all your help! If anyone knows about this PyMySQL glitch populating databases (yet invisible; somehow the column ID's are generated even though it doesn't properly payload). Very strange!

Best Regards,

Brandon Kastning
“And one of the elders saith unto me, Weep not: behold, the Lion of the tribe of Juda, the Root of David, hath prevailed to open the book,...” - Revelation 5:5 (KJV)

“And oppress not the widow, nor the fatherless, the stranger, nor the poor; and ...” - Zechariah 7:10 (KJV)

#LetHISPeopleGo

Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Extracting tables and text above the table from a PDF to CSV DivAsh 3 2,475 Jan-18-2023, 07:39 AM
Last Post: perfringo
Question Is this the appropriate forum for MariaDB/MySQL threads? BrandonKastning 0 1,192 Jan-13-2022, 11:30 PM
Last Post: BrandonKastning
  Pandas copying wrong values vmarg 2 2,388 Jan-06-2020, 09:45 AM
Last Post: vmarg
  Match a table with sub-total values to its detail value table klllmmm 2 3,807 Apr-03-2019, 11:28 AM
Last Post: klllmmm
  Adding Tables and Extracting Values from Tables jamescox11480 5 3,463 Sep-29-2018, 04:49 PM
Last Post: jamescox11480
  Trying to clean the selected columns paulgureghian 5 4,435 Jun-08-2018, 10:08 PM
Last Post: volcano63
  Insert values into a column in the same table based on a criteria klllmmm 3 4,212 Apr-13-2017, 10:10 AM
Last Post: zivoni

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020