I attached the csv:
client_id visittype_id visittype intake_date visit_date services_needed
90062526 109 Engagement 7/12/2018 7/12/2018
90063330 830 CM Eval 7/17/2018 7/17/2018 Yes
90063330 34 Case Management7/17/2018 7/25/2018 Yes
90065363 109 Engagement 7/16/2018 7/16/2018 No
90066364 123 GAP Case Mangmt Reg 7/12/2018 7/18/2018 Yes
90066364 123 GAP Case Mangmt Reg 7/12/2018 7/23/2018 Yes
90066364 122 GAP Case Mangmt High 7/12/2018 7/24/2018 Yes
90069663 Yes
90096343 233 Psychotherapy 7/5/2018 7/5/2018 No
90096343 233 Psychotherapy 7/5/2018 7/19/2018 No
90096592 436 Comprehensive Eval 7/5/2018 7/5/2018 Yes
90099646 830 CM Eval 7/16/2018 7/16/2018 Yes
90905234 481 No Show Cancelation 7/10/2018 7/19/2018 Yes
90905234 127 Group Orientation 7/10/2018 7/24/2018 Yes
91222222 481 No Show Cancelation 7/12/2018 7/23/2018 Yes
91222222 481 No Show Cancelation 7/12/2018 7/24/2018 Yes
For each client ID, we are tracking who had a followup and who had no followup after an initial evaluation.
The criteria for a with_followup:
services_needed = ‘Yes’.
There must be a visittype other than ‘Eval’ (anything that contains ‘Eval’).
There must be a visittype other than ‘No Show Cancelation’
If the visittype = ‘Evaluation’, the visit_date must be greater than intake_date.
If only one of the rows fits that criteria, it qualifies as a followup regardless of the values in the other rows for a client.
The criteria for no_followup:
services_needed = ‘Yes’.
If none of the criteria for a followup exists for a client, that client had no_followup.
The criteria for cancellation_only:
services_needed = ‘Yes’.
The client only has visittypes = ‘No Show Cancelation’.
This would also qualify for no_followup.
The sum of the percentages for with_followup and no_followup will be 100%.
The percentage for cancellation_only is separate.
The following works:
I could do a lot, if not all, of this in a query. I don't want to make the query too complicated and slow. I need to do as much outside of the query as possible. Also, I am unable to have the Python interact with the database. It is prepackaged software. I can write queries but am not allowed to write outside code to run the queries.
I realize this is not ideal Python code (understatement). I still have a lot to learn.
I'll only be on sporadically until tomorrow.
Thanks to anyone that bothers to look this over.
mockup.csv (Size: 889 bytes / Downloads: 10)
client_id visittype_id visittype intake_date visit_date services_needed
90062526 109 Engagement 7/12/2018 7/12/2018
90063330 830 CM Eval 7/17/2018 7/17/2018 Yes
90063330 34 Case Management7/17/2018 7/25/2018 Yes
90065363 109 Engagement 7/16/2018 7/16/2018 No
90066364 123 GAP Case Mangmt Reg 7/12/2018 7/18/2018 Yes
90066364 123 GAP Case Mangmt Reg 7/12/2018 7/23/2018 Yes
90066364 122 GAP Case Mangmt High 7/12/2018 7/24/2018 Yes
90069663 Yes
90096343 233 Psychotherapy 7/5/2018 7/5/2018 No
90096343 233 Psychotherapy 7/5/2018 7/19/2018 No
90096592 436 Comprehensive Eval 7/5/2018 7/5/2018 Yes
90099646 830 CM Eval 7/16/2018 7/16/2018 Yes
90905234 481 No Show Cancelation 7/10/2018 7/19/2018 Yes
90905234 127 Group Orientation 7/10/2018 7/24/2018 Yes
91222222 481 No Show Cancelation 7/12/2018 7/23/2018 Yes
91222222 481 No Show Cancelation 7/12/2018 7/24/2018 Yes
For each client ID, we are tracking who had a followup and who had no followup after an initial evaluation.
The criteria for a with_followup:
services_needed = ‘Yes’.
There must be a visittype other than ‘Eval’ (anything that contains ‘Eval’).
There must be a visittype other than ‘No Show Cancelation’
If the visittype = ‘Evaluation’, the visit_date must be greater than intake_date.
If only one of the rows fits that criteria, it qualifies as a followup regardless of the values in the other rows for a client.
The criteria for no_followup:
services_needed = ‘Yes’.
If none of the criteria for a followup exists for a client, that client had no_followup.
The criteria for cancellation_only:
services_needed = ‘Yes’.
The client only has visittypes = ‘No Show Cancelation’.
This would also qualify for no_followup.
The sum of the percentages for with_followup and no_followup will be 100%.
The percentage for cancellation_only is separate.
The following works:
import csv import xlwt import xlsxwriter from datetime import datetime workbook = xlsxwriter.Workbook('followups.xlsx') worksheet = workbook.add_worksheet() fs=open("followups.txt" , "w+") ns=open("nofollowups.txt", "w+") cs=open("cancellations.txt", "w+") with_followup = [] no_followup = [] cancellation_only = [] yes_service = [] no_service = [] followup_count = 0 no_followup_count = 0 cancellation_only_count = 0 total_count = 0 avg_followup = 0 avg_no_followup = 0 avg_no_show_cancel = 0 yes_service_count = 0 no_service_count = 0 with open(r'C:\mockup.csv', 'r') as f: reader = csv.reader(f) next(reader, None) # skip the headers for row in reader: for k in row[0]: if 'Yes' in row [5]: if 'Eval' not in row [2]: if 'No Show Cancelation' not in row [2]: if 'Engagement' not in row [2]: if row [2] != "": if row [0] not in with_followup: followup_count += 1 total_count += 1 with_followup.append(row[0]) fs.write(row[0] + "\n") break f.seek(0) next(reader, None) # skip the headers for row in reader: for k in row[0]: if 'Yes' in row [5]: if 'Engagement' in row [2]: if (row [4]) > (row [3]): if row [0] not in with_followup: followup_count += 1 total_count += 1 with_followup.append(row[0]) fs.write(row[0] + "\n") break f.seek(0) next(reader, None) # skip the headers for row in reader: for k in row[0]: if 'Yes' in row [5]: if row[0] not in with_followup: if row[0] not in no_followup: no_followup_count += 1 total_count += 1 no_followup.append(row[0]) ns.write(row[0] + "\n") break f.seek(0) next(reader, None) # skip the headers for row in reader: for k in row[0]: if 'Yes' in row [5]: if 'No Show Cancelation' in row [2]: if row[0] not in with_followup: if row[0] not in cancellation_only: cancellation_only_count += 1 cancellation_only.append(row[0]) cs.write(row[0] + "\n") break f.seek(0) next(reader, None) # skip the headers for row in reader: for k in row[0]: if 'Yes' in row [5]: if row[0] not in yes_service: yes_service_count += 1 yes_service.append(row[0]) break elif 'No' in row [5]: if row[0] not in no_service: no_service_count += 1 no_service.append(row[0]) break avg_followup = followup_count/total_count avg_no_followup = no_followup_count/total_count avg_no_show_cancel = cancellation_only_count/total_count avg_followup = round(avg_followup,2) avg_no_followup = round(avg_no_followup,2) avg_no_show_cancel = round(avg_no_show_cancel,2) print (with_followup) print (no_followup) print (cancellation_only) print ('followup ' + str(followup_count)) print ('no followup ' + str(no_followup_count)) print ('cancellation only ' + str(cancellation_only_count)) print ('total count ' + str(total_count)) print ('average followup ' + str(avg_followup)) print ('average no followup ' + str(avg_no_followup)) print ('average no show / cancellation ' + str(avg_no_show_cancel)) print ('yes service count ' + str(yes_service_count)) print ('no service count ' + str(no_service_count)) row = 0 col = 0 for item in (with_followup): worksheet.write(row,col,item) row += 1 fs.close() ns.close() cs.close() workbook.close()I made the criteria as concise as possible. I understand that some people might not want to wade through all of that. Please let me know if you need any clarifications.
I could do a lot, if not all, of this in a query. I don't want to make the query too complicated and slow. I need to do as much outside of the query as possible. Also, I am unable to have the Python interact with the database. It is prepackaged software. I can write queries but am not allowed to write outside code to run the queries.
I realize this is not ideal Python code (understatement). I still have a lot to learn.
I'll only be on sporadically until tomorrow.
Thanks to anyone that bothers to look this over.
Output:followup 3
no followup 4
cancellation only 1
total count 7
average followup 0.43
average no followup 0.57
average no show / cancellation 0.14
mockup.csv (Size: 889 bytes / Downloads: 10)