Excel to SQLite - Printable Version +- Python Forum (https://python-forum.io) +-- Forum: Python Coding (https://python-forum.io/forum-7.html) +--- Forum: General Coding Help (https://python-forum.io/forum-8.html) +--- Thread: Excel to SQLite (/thread-37079.html) |
Excel to SQLite - kgall89 - Apr-29-2022 Hello, I am using Python3, PyCharm and DB Browser for SQLite. I am new to this domain and those are what I have learned on so far. I am collecting daily data that goes to an excel/csv file. I am hoping to send the response to a database on SQLite so I can then use python to analyze and report. Im hoping this is the right method but am open to suggestions. Excel -> CSV -> Python -> SQLite, then Python to retrieve data when needed. I am having trouble creating the tables to store the data in SQLite. Im trying to avoid duplicates by using "ids" instead of having the data all in one table. The code I am using is below. import pandas as pd import sqlite3 df = pd.read_csv ('PyCharmData.csv') # column headers # 'Start', 'Completion', 'Email', 'Name', 'First Name', 'Last Name', # 'RPE', 'Duration', 'Sleep', 'Fatigue', 'Stress', 'Soreness', 'Mood', # 'Diet', 'Injuries', 'Full Name' N = df['Full Name'] D = df['Completion'] I = df['Injuries'] R = df[['RPE', 'Duration', 'Sleep', 'Fatigue', 'Stress', 'Soreness', 'Mood', 'Diet']] conn = sqlite3.connect('testSQL') cur = conn.cursor() cur.executescript(''' DROP TABLE IF EXISTS AthName; DROP TABLE IF EXISTS Date; DROP TABLE IF EXISTS Injury; DROP TABLE IF EXISTS Metrics; CREATE TABLE AthName ( id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE, name TEXT UNIQUE, Date_id INTEGER, Injury_id INTEGER, Metrics_id INTEGER ); CREATE TABLE Date ( id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE, day TEXT, AthName_id INTEGER, Injury_id INTEGER, Metrics_id INTEGER ); CREATE TABLE Injury ( id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE, info TEXT, AthName_id INTEGER, Date_id INTEGER, Injury_id INTEGER ); CREATE TABLE Metrics ( id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE, AthName_id INTEGER, Injury_id INTEGER, Date_id INTEGER, sleep TEXT, mood TEXT, soreness INTEGER, rpe TEXT, fatigue INTEGER, diet TEXT, duration INTEGER, stress, INTEGER ); ''') fname = input('Enter file name: ') for entry in df: cur.execute('''INSERT OR IGNORE INTO AthName (name) VALUES ( ? )''', (N, ) ) cur.execute('SELECT id FROM AthName WHERE name = ? ', (N, )) AthName_id = cur.fetchone()[0] conn.commit() I would repeat 70 to 74 for each table created. I could be way off on this, it is patch work from some courses and other articles. Any suggestion is appreciated, even it is a different method to achieve this goal. Thank you RE: Excel to SQLite - kgall89 - May-02-2022 If anyone else comes along this thread and is in the same situation, I have worked out a solution. Im sure its not the best way to go about things but it worked! rawdata = (csv file) #this creates a new column in the original file and assigns an id based on specific criteria. #athlete_id column is inserted into the file, id is given by the column Full name only once per name #date_id column is inserted into the file, id is given by the column Completion once once per date etc rawdata['athlete_id'] = rawdata.groupby(['Full Name']).ngroup() rawdata['date_id'] = rawdata.groupby(['Completion']).ngroup() rawdata['metrics_id'] = rawdata.groupby(['athlete_id','date_id']).ngroup() #pulls out the columns I need from main file and puts into a data frame df = rawdata[['athlete_id', 'date_id', 'metrics_id', 'Completion', 'Full Name', 'Sleep', 'Fatigue', 'Stress', 'Soreness', 'Mood', 'Diet', 'RPE', 'Duration']] #variables for new tables N = df[['athlete_id','Full Name']] D = df[['date_id','Completion']] R = df[['metrics_id','athlete_id','date_id', 'Sleep', 'Fatigue', 'Stress', 'Soreness', 'Mood', 'Diet', 'RPE', 'Duration']] conn = sqlite3.connect('testSQL') cur = conn.cursor() #creates tables in sqlite cur.executescript(''' DROP TABLE IF EXISTS PyCharmData; DROP TABLE IF EXISTS AthName; DROP TABLE IF EXISTS Date; DROP TABLE IF EXISTS Metrics; CREATE TABLE PyCharmData ( athlete_id INTEGER, date_id INTEGER, metrics_id INTEGER, Completion TEXT, Name TEXT, Sleep TEXT, Fatigue INTEGER, Stress INTEGER, Soreness INTEGER, Mood TEXT, Diet TEXT, RPE TEXT, Duration INTEGER, Injuries TEXT ); CREATE TABLE AthName ( athlete_id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE, name TEXT UNIQUE ); CREATE TABLE Date ( date_id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE, day TEXT UNIQUE ); CREATE TABLE Metrics ( metrics_id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE, athlete_id INTEGER, date_id INTEGER, sleep TEXT, mood TEXT, soreness INTEGER, rpe TEXT, fatigue INTEGER, diet TEXT, duration INTEGER, stress INTEGER ); ''') #creates all the tables and loads them with the pertinent info, temporarily places the table in a temporary table then into its new table df.to_sql('myTempTable', conn, if_exists='replace', index=False) cur.execute('INSERT OR IGNORE INTO PyCharmData SELECT * FROM MyTempTable') N.to_sql('myTempTable', conn, if_exists='replace', index=False) cur.execute("INSERT OR IGNORE INTO athName SELECT * FROM myTempTable") D.to_sql('myTempTable', conn, if_exists='replace', index=False) cur.execute("INSERT OR IGNORE INTO date SELECT * FROM myTempTable") R.to_sql('myTempTable', conn, if_exists='replace', index=False) cur.execute("INSERT OR IGNORE INTO metrics SELECT * FROM myTempTable") conn.commit()Good luck out there! |