Python Forum
Matching Regex in Python from Excelfile and inserting into database
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Matching Regex in Python from Excelfile and inserting into database
#1
Hello. Im trying to store multiple excel files into my database but need to match the rows of each excel row by Timestamp, ID, Content before inserting it. Therefor im using REGEX

Each Timestamp ID and Content can vary from row to row and from excel file to excel file.
Im stucking with an Errorcode i get, I hope you can help me out here.

This is my code :

 import os
import re
import pymysql
pymysql.install_as_MySQLdb()
import pandas as pd
import sqlalchemy


def insert_or_update(engine, pd_table, table_name):
    inserts = 0
    updates = 0
    for i in range(len(pd_table)):
        vals_with_quotes = ["'" + str(x) + "'" for x in pd_table.loc[i, :].values]
        # print(vals_with_quotes)
        update_pairs = [str(c) + " = '" + str(v) + "'" for c, v in zip(pd_table.columns, pd_table.loc[i, :])]
        query = f"INSERT INTO {table_name} ({', '.join(list(pd_table.columns.values))}) " \
                f"VALUES ({', '.join(vals_with_quotes)}) " \
                f"ON DUPLICATE KEY UPDATE {', '.join(update_pairs)}"
        print(query)
        result = engine.execute(query)
        if result.lastrowid == 0:
            updates += 1
        else:
            inserts += 1
    print(f"Inserted {inserts} rows and updated {updates} rows.")


schema = '---'
alchemy_connect = "---"
engine = sqlalchemy.create_engine(alchemy_connect)  # connect to server
engine.execute(f"USE {schema}")  # select new db
# engine.execute("SET NAMES UTF8MB4;")

query = "SELECT * FROM --- where study=---"
pm = pd.read_sql(query, engine)

rootpath = "path/"

for root, dirs, files in os.walk(rootpath):
    for file in files:
        print(root, dirs, files, file)
        d = pd.read_excel(root + file, header=None)
        d.drop(columns=[0], inplace=True)
        d.rename(columns={1: "content"}, inplace=True)

        participants = []

        for ix, row in d.iterrows():
            c = row["content"]

            match = re.search(r"^\[(\d{1,2}:\d{1,2}:\d{1,2})]\s+(\w+):\s*(.*)", c)
            if match:
                timestamp = match.group(1)
                id_code = match.group(2)
                only_content = match.group(3)
                if re.search(r"(?:Versuch(?:sleit(?:er)?)?|ersuchsleiter)", id_code):
                    id_code = "Versuchsleiter"

            # get pid
            participant = pm.loc[pm["id_code"] == id_code, "pid"]
            try:
                pid = participant.values[0]
            except:
                pid = "Versuchsleiter"

            # print(ix, pid, id_code, only_content, timestamp)
            if pid and pid not in participants and pid != "Versuchsleiter":
                participants.append(pid)
            d.loc[ix, "pid"] = pid
            d.loc[ix, "timestamp"] = timestamp
            d.loc[ix, "content"] = only_content.strip()
            d.loc[ix, "is_participant"] = 0 if pid == "Versuchsleiter" else 1

        d = d[["pid", "is_participant", "content", "timestamp"]]
        d.loc[(d['pid'] == "Versuchsleiter"), "pid"] = participants[0]
        d.loc[(d['pid'] == None), "pid"] = participants[0]

        # with pd.option_context('display.max_rows', None, 'display.max_columns', None):
            # print(d)
        insert_or_update(engine, d, "YM_Transcripts") 
And this is the Errorcode i get :

 
File "C:/Users/.../PycharmProjects/.../.../---.py", line 20, in insert_or_update
    result = engine.execute(query)

  File "C:\Users\...\PycharmProjects\...\...\lib\site-packages\sqlalchemy\engine\base.py", line 2191, in execute
    return connection.execute(statement, *multiparams, **params)

TypeError: not enough arguments for format string 
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Inserting Python Buttons into KV Files edand19941 3 601 Feb-19-2025, 07:44 PM
Last Post: buran
  Regex replace in SQLite3 database WJSwan 1 1,522 Dec-04-2023, 05:55 PM
Last Post: Larz60+
  problem in matching in regex akbarza 1 1,168 Nov-21-2023, 09:31 AM
Last Post: snippsat
  python script for inserting rows into hbase table lravikumarvsp 7 8,763 Mar-24-2023, 04:44 AM
Last Post: parth_botadara
  Basic SQL query using Py: Inserting or querying sqlite3 database not returning data marlonbown 3 3,226 Nov-08-2022, 07:16 PM
Last Post: marlonbown
  Python Regex quest 2 4,351 Sep-22-2022, 03:15 AM
Last Post: quest
  python regex: get rid of double dot wardancer84 4 3,285 Sep-09-2021, 03:03 PM
Last Post: wardancer84
  Python the regex not getting any attributes sarath_unrelax 1 2,376 Dec-19-2019, 11:06 AM
Last Post: Larz60+
  Python regex to get only numbers tantony 6 5,508 Oct-09-2019, 11:53 PM
Last Post: newbieAuggie2019
  inserting data to mysql with python justin_py 1 3,715 Jul-11-2019, 10:13 PM
Last Post: metulburr

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020