Python Forum
Data Linkage in Python
#1
Hi All,

I am new to Python and am trying to link four datasets to answer some questions. Can anyone recommend which tools to use and how to go about it? I understand how the data files link together; I just need to do it in Python.

Many thanks in advance for your comments
Reply
#2
How about some details:
What type of database is it (it does make a difference)?
What is the structure of each?
Which tables do you want to connect?
What are their keys?
Reply
#3
Thanks Larz60. The files are in CSV format.
The first is a patient demographic dataset with columns patient identifier, name, gender, dob, address, postcode and gpid.
The second is a GP clinic attendance dataset with columns patient identifier, gpid, event date, event code and event data.
The third is a hospital attendance dataset with columns patient identifier, hospid, event date and event code.
The fourth is a deaths dataset with columns patient identifier, name, gender, address, postcode, dob, dod and cause.
Reply
#4
pandas would be a good option.
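
A minimal sketch of what that could look like, assuming all four files share the patient identifier as their key. The column names (patient_id, event_date and so on) and the tiny inline samples are made up for illustration; with real data you would pass each file's path to pd.read_csv and adjust the names to match the actual headers.

```python
import io
import pandas as pd

# Made-up samples standing in for the four CSV files (columns trimmed).
# With real data, pass the file paths to pd.read_csv instead.
demographics = pd.read_csv(io.StringIO(
    "patient_id,name,gender,dob,postcode,gpid\n"
    "1,Ann,F,1950-01-01,AB1,G1\n"
    "2,Bob,M,1960-02-02,CD2,G2\n"
), parse_dates=["dob"])
gp = pd.read_csv(io.StringIO(
    "patient_id,gpid,event_date,event_code\n"
    "1,G1,2019-03-01,X10\n"
    "1,G1,2019-04-01,X11\n"
), parse_dates=["event_date"])
hospital = pd.read_csv(io.StringIO(
    "patient_id,hospid,event_date,event_code\n"
    "2,H1,2019-05-01,Y20\n"
), parse_dates=["event_date"])
deaths = pd.read_csv(io.StringIO(
    "patient_id,dod,cause\n"
    "1,2020-01-01,C1\n"
), parse_dates=["dod"])

# One row per GP event, with the patient's demographic columns attached.
# Both frames have a gpid column, so the suffixes keep them apart.
gp_linked = gp.merge(demographics, on="patient_id", how="left",
                     suffixes=("_event", "_registered"))

# One row per patient, with death details where a death record exists.
# validate raises an error if a patient appears more than once on either side.
patients = demographics.merge(deaths[["patient_id", "dod", "cause"]],
                              on="patient_id", how="left",
                              validate="one_to_one")
patients["died"] = patients["dod"].notna()

# Add per-patient event counts from the two attendance files.
for label, events in (("gp_visits", gp), ("hospital_visits", hospital)):
    counts = events.groupby("patient_id").size().reset_index(name=label)
    patients = patients.merge(counts, on="patient_id", how="left")
    patients[label] = patients[label].fillna(0).astype(int)

print(patients)
```

Using how="left" keeps every patient from the demographic file even when they have no events or death record; how="inner" would keep only patients present in both files.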
Reply
#5
You can use Python's csv module to read in the data.
I would load each file into an sqlite3 database. The sqlite3 module is part of the standard library (Python 3.5 included),
so it will already be installed; simply import sqlite3.

Create four tables, each named after the corresponding dataset
and containing all of the fields that you listed.

Build an index on the patient identifier for each table, and maybe an index on gpid
for the two tables that contain it.

Now you can access data from each table with a simple SQL query on PatientId.

I'm rusty on SQL, but you can write a query to pull from all four tables at once.

Someone else here will be able to help with that.

I see Yoriz has replied while I was typing. Use his suggestion and go for pandas.
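
In case the sqlite3 route is still of interest, here is a rough sketch of the steps above. The table names, column names and inline sample rows are assumptions for illustration only (and the columns are trimmed); with real files you would open each CSV and pass the file handle to the loader instead of an io.StringIO object.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")  # use a file path to keep the database on disk

# Hypothetical table and column names; adjust to match the real CSV headers.
conn.executescript("""
    CREATE TABLE demographics (patient_id INTEGER, name TEXT, gpid TEXT);
    CREATE TABLE gp_events (patient_id INTEGER, gpid TEXT, event_date TEXT, event_code TEXT);
    CREATE TABLE hospital_events (patient_id INTEGER, hospid TEXT, event_date TEXT, event_code TEXT);
    CREATE TABLE deaths (patient_id INTEGER, dod TEXT, cause TEXT);
    CREATE INDEX idx_demo_patient ON demographics (patient_id);
    CREATE INDEX idx_gp_patient ON gp_events (patient_id);
    CREATE INDEX idx_hosp_patient ON hospital_events (patient_id);
    CREATE INDEX idx_deaths_patient ON deaths (patient_id);
    CREATE INDEX idx_demo_gpid ON demographics (gpid);
    CREATE INDEX idx_gp_gpid ON gp_events (gpid);
""")


def load_csv(table, columns, handle):
    """Insert the named columns of a CSV (read from a file-like object) into a table."""
    reader = csv.DictReader(handle)
    sql = "INSERT INTO {} ({}) VALUES ({})".format(
        table, ", ".join(columns), ", ".join("?" for _ in columns))
    conn.executemany(sql, ([row[c] for c in columns] for row in reader))


# Made-up sample data standing in for the four files.
load_csv("demographics", ["patient_id", "name", "gpid"],
         io.StringIO("patient_id,name,gpid\n1,Ann,G1\n2,Bob,G2\n"))
load_csv("gp_events", ["patient_id", "gpid", "event_date", "event_code"],
         io.StringIO("patient_id,gpid,event_date,event_code\n"
                     "1,G1,2019-03-01,X10\n1,G1,2019-04-01,X11\n"))
load_csv("hospital_events", ["patient_id", "hospid", "event_date", "event_code"],
         io.StringIO("patient_id,hospid,event_date,event_code\n2,H1,2019-05-01,Y20\n"))
load_csv("deaths", ["patient_id", "dod", "cause"],
         io.StringIO("patient_id,dod,cause\n1,2020-01-01,C1\n"))
conn.commit()

# One row per patient, pulling from all four tables. The counts use subqueries
# so that joining two event tables does not multiply rows against each other.
rows = conn.execute("""
    SELECT d.patient_id, d.name,
           (SELECT COUNT(*) FROM gp_events g WHERE g.patient_id = d.patient_id) AS gp_visits,
           (SELECT COUNT(*) FROM hospital_events h WHERE h.patient_id = d.patient_id) AS hospital_visits,
           x.dod
    FROM demographics d
    LEFT JOIN deaths x ON x.patient_id = d.patient_id
    ORDER BY d.patient_id
""").fetchall()

for row in rows:
    print(row)
```

The LEFT JOIN keeps patients who have no death record; their dod comes back as None.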
Reply
#6
Thanks all, pandas worked just fine.
Reply

