Excel to SQLite

kgall89 · Apr-29-2022, 06:57 PM

Hello,

I am using Python3, PyCharm and DB Browser for SQLite. I am new to this domain and those are what I have learned on so far.

I am collecting daily data that goes to an excel/csv file. I am hoping to send the response to a database on SQLite so I can then use python to analyze and report.

Im hoping this is the right method but am open to suggestions. Excel -> CSV -> Python -> SQLite, then Python to retrieve data when needed.

I am having trouble creating the tables to store the data in SQLite. Im trying to avoid duplicates by using "ids" instead of having the data all in one table. The code I am using is below.

import pandas as pd
import sqlite3

df = pd.read_csv ('PyCharmData.csv')

# column headers
# 'Start', 'Completion', 'Email', 'Name', 'First Name', 'Last Name',
#      'RPE', 'Duration', 'Sleep', 'Fatigue', 'Stress', 'Soreness', 'Mood',
#      'Diet', 'Injuries', 'Full Name'

N = df['Full Name']
D = df['Completion']
I = df['Injuries']
R = df[['RPE', 'Duration', 'Sleep', 'Fatigue', 'Stress', 'Soreness', 'Mood', 'Diet']]

conn = sqlite3.connect('testSQL')
cur = conn.cursor()

cur.executescript('''
DROP TABLE IF EXISTS AthName;
DROP TABLE IF EXISTS Date;
DROP TABLE IF EXISTS Injury;
DROP TABLE IF EXISTS Metrics;


CREATE TABLE AthName (
    id  INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    name TEXT UNIQUE,
    Date_id  INTEGER,
    Injury_id  INTEGER,
    Metrics_id  INTEGER
);

CREATE TABLE Date (
    id  INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    day TEXT,
    AthName_id  INTEGER,
    Injury_id  INTEGER,
    Metrics_id  INTEGER
);

CREATE TABLE Injury (
    id  INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    info TEXT,
    AthName_id  INTEGER,
    Date_id  INTEGER,
    Injury_id  INTEGER
);

CREATE TABLE Metrics (
    id  INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    AthName_id  INTEGER,
    Injury_id  INTEGER,
    Date_id  INTEGER,
    sleep TEXT,
    mood TEXT,
    soreness INTEGER,
    rpe TEXT,
    fatigue INTEGER,
    diet TEXT,
    duration INTEGER,
    stress, INTEGER
);

''')

fname = input('Enter file name: ')

for entry in df:

    cur.execute('''INSERT OR IGNORE INTO AthName (name)
        VALUES ( ? )''', (N, ) )
    cur.execute('SELECT id FROM AthName WHERE name = ? ', (N, ))
    AthName_id = cur.fetchone()[0]

    conn.commit()

I would repeat 70 to 74 for each table created.

Error:Traceback (most recent call last):
  "...pyToSQL/main.py", line 71, in <module>
    cur.execute('''INSERT OR IGNORE INTO AthName (name)
sqlite3.InterfaceError: Error binding parameter 0 - probably unsupported type.

I could be way off on this, it is patch work from some courses and other articles.

Any suggestion is appreciated, even it is a different method to achieve this goal.

Thank you

kgall89 · May-02-2022, 10:05 AM

If anyone else comes along this thread and is in the same situation, I have worked out a solution. Im sure its not the best way to go about things but it worked!

rawdata = (csv file)

#this creates a new column in the original file and assigns an id based on specific criteria.
#athlete_id column is inserted into the file, id is given by the column Full name only once per name
#date_id column is inserted into the file, id is given by the column Completion once once per date etc
rawdata['athlete_id'] = rawdata.groupby(['Full Name']).ngroup()
rawdata['date_id'] = rawdata.groupby(['Completion']).ngroup()
rawdata['metrics_id'] = rawdata.groupby(['athlete_id','date_id']).ngroup()

#pulls out the columns I need from main file and puts into a data frame 
df = rawdata[['athlete_id', 'date_id', 'metrics_id', 'Completion', 'Full Name', 'Sleep', 'Fatigue', 'Stress', 'Soreness', 'Mood', 'Diet', 'RPE', 'Duration']]

#variables for new tables
N = df[['athlete_id','Full Name']]
D = df[['date_id','Completion']]
R = df[['metrics_id','athlete_id','date_id', 'Sleep', 'Fatigue', 'Stress', 'Soreness', 'Mood', 'Diet', 'RPE', 'Duration']]

conn = sqlite3.connect('testSQL')
cur = conn.cursor()


#creates tables in sqlite
cur.executescript('''
DROP TABLE IF EXISTS PyCharmData;
DROP TABLE IF EXISTS AthName;
DROP TABLE IF EXISTS Date;
DROP TABLE IF EXISTS Metrics;

CREATE TABLE PyCharmData (
    athlete_id  INTEGER,
    date_id INTEGER,
    metrics_id INTEGER,
    Completion TEXT,
    Name TEXT,
    Sleep TEXT,
    Fatigue INTEGER,
    Stress INTEGER,
    Soreness INTEGER,
    Mood TEXT,
    Diet TEXT,
    RPE TEXT,
    Duration INTEGER,
    Injuries TEXT
);

CREATE TABLE AthName (
    athlete_id   INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    name TEXT UNIQUE
);

CREATE TABLE Date (
     date_id   INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
     day TEXT UNIQUE
 );

 CREATE TABLE Metrics (
     metrics_id  INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
     athlete_id INTEGER,
     date_id INTEGER,
     sleep TEXT,
     mood TEXT,
     soreness INTEGER,
     rpe TEXT,
     fatigue INTEGER,
     diet TEXT,
     duration INTEGER,
     stress INTEGER
);

''')

#creates all the tables and loads them with the pertinent info, temporarily places the table in a temporary table then into its new table
df.to_sql('myTempTable', conn, if_exists='replace', index=False)
cur.execute('INSERT OR IGNORE INTO PyCharmData SELECT * FROM MyTempTable')
N.to_sql('myTempTable', conn, if_exists='replace', index=False)
cur.execute("INSERT OR IGNORE INTO athName SELECT * FROM myTempTable")
D.to_sql('myTempTable', conn, if_exists='replace', index=False)
cur.execute("INSERT OR IGNORE INTO date SELECT * FROM myTempTable")
R.to_sql('myTempTable', conn, if_exists='replace', index=False)
cur.execute("INSERT OR IGNORE INTO metrics SELECT * FROM myTempTable")
conn.commit()

Good luck out there!

Excel to SQLite

User Panel Messages

Announcements