Python Forum
Is it possible to recombine values in python?
#1
If I have posted in the wrong section, please let me know.
I was advised to ask this question on a Python forum after I asked it on the MSDN SQL forum:
https://social.msdn.microsoft.com/Forums...ransactsql
Here is the question; I will try to state it more clearly.

I have data (data1.csv) for building a multiple regression model. The number of predictors (X variables) varies: today there may be 10, tomorrow 20, because I work with different models.

The main quality measure of the model is R^2, which should preferably be between 0.8 and 1.
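For reference, R^2 of a multiple regression with any number of predictors can be computed as in the minimal sketch below. It uses only the standard library and solves the normal equations directly; the names `solve` and `fit_ols` and the toy numbers are made up for illustration and are not taken from data1.csv.

```python
def solve(a, b):
    """Solve the linear system a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(X, y):
    """Ordinary least squares with an intercept.

    X is a list of rows (one list of predictor values per observation),
    y is the list of responses. Returns (coefficients, r2), where
    coefficients[0] is the intercept.
    """
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    predicted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - p) ** 2 for yi, p in zip(y, predicted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Toy data (invented): two predictors, slightly noisy response.
X_demo = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
y_demo = [9.2, 7.9, 19.1, 17.8, 26.0]
beta_demo, r2_demo = fit_ols(X_demo, y_demo)
print("coefficients:", beta_demo)
print("R^2:", r2_demo)
```

The function does not depend on the number of columns, so the same call works whether a model has 10 predictors or 20. In practice a library routine for least squares would be the more usual choice; the hand-written solver is here only to keep the sketch self-contained.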

Suppose I build a model manually, but its R^2 is small. Is it possible in Python to recombine the values of all variables until R^2 becomes larger, ideally the maximum possible for this data?

By "recombining values" I mean deleting rows or substituting values, iteratively.

To make this clearer, here is an example.

data1.csv (I can't attach a CSV file here, so I uploaded it to webshare)
data1.csv

Dependent variable: PT_POOR (the Y variable)

I ran the regression manually and got R^2 = 0.043, which is very poor.

Now let's delete Curt's row from the data and run the regression again. Without Curt, R^2 = 0.40, which is better but still not ideal.

In this example it was enough to remove a single row (Curt's row).

But there are other cases where the values that prevent a good fit are scattered across the dataset.

Consider example 2 (data2orig.csv):
original dataset

Without these values (the empty cells in data2.csv) the model has R^2 = 0.70782713, which is good.
Note: the deleted values are replaced by mean substitution. So the same kind of recombination of values is needed here, repeated until the maximum possible R^2 is obtained.
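One way to read this procedure is as a greedy search: try replacing each single cell with a column mean, keep the replacement that raises R^2 the most, and repeat. The sketch below is only an illustration of that reading, not an established method. The names `r_squared`, `greedy_mean_substitution`, `max_changes` and `min_gain` are hypothetical, the replacement value is assumed to be the mean of the other rows' original values in the same column, and the numbers are invented.

```python
def r_squared(X, y):
    """R^2 of an OLS fit with intercept, via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented normal-equation matrix [X'X | X'y]
    m = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for col in range(k):  # Gaussian elimination with partial pivoting
        piv = max(range(col, k), key=lambda r: abs(m[r][col]))
        if abs(m[piv][col]) < 1e-12:
            return float("-inf")  # singular design: treat as unusable
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, k):
            f = m[r][col] / m[col][col]
            for c in range(col, k + 1):
                m[r][c] -= f * m[col][c]
    beta = [0.0] * k
    for i in reversed(range(k)):  # back substitution
        tail = sum(m[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (m[i][k] - tail) / m[i][i]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
                 for r, yi in zip(rows, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def greedy_mean_substitution(X, y, max_changes=3, min_gain=1e-6):
    """Return (cleaned X, list of (row, column, original value), final R^2)."""
    orig = [list(r) for r in X]      # untouched copy of the input
    work = [list(r) for r in X]      # copy that receives the substitutions
    n, p = len(orig), len(orig[0])
    col_sum = [sum(r[j] for r in orig) for j in range(p)]
    bad, flagged = [], set()
    best = r_squared(work, y)
    for _ in range(max_changes):
        choice = None
        for i in range(n):
            for j in range(p):
                if (i, j) in flagged:
                    continue
                # Assumption: fill with the mean of the other rows in column j.
                fill = (col_sum[j] - orig[i][j]) / (n - 1)
                work[i][j] = fill
                r2 = r_squared(work, y)
                work[i][j] = orig[i][j]   # undo the trial
                if r2 > best + min_gain and (choice is None or r2 > choice[0]):
                    choice = (r2, i, j, fill)
        if choice is None:
            break                        # no single substitution helps any more
        best, i, j, fill = choice
        work[i][j] = fill
        flagged.add((i, j))
        bad.append((i, j, orig[i][j]))
    return work, bad, best


# Invented data: y = 2x + 1, with one corrupted predictor value (100.0).
X_demo = [[1.0], [2.0], [3.0], [100.0], [5.0], [6.0], [7.0], [8.0]]
y_demo = [3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 15.0, 17.0]
cleaned, bad_cells, final_r2 = greedy_mean_substitution(X_demo, y_demo, max_changes=1)
print("before:", r_squared(X_demo, y_demo), "after:", final_r2)
print("flagged cells:", bad_cells)
```

A limit such as `max_changes` matters in this design: without it the search can keep substituting values until very little of the original data is left, because almost any dataset reaches a high R^2 once enough inconvenient points are replaced. Each round refits the model once per remaining cell, so the cost grows quickly with the size of the table.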

Note that the removed values are not discarded: the values taken out of the analysis can be viewed in a new file generated by Python, "thebadvalue.csv".
So the output should consist of the following tables:

1.) The cleaned input table (data2.csv, without the "bad" values; here it comes from example 2)
data2.csv


2.) A table with the beta coefficients (@slope) and R^2, in matlab.csv
Variable   b         R^2
Intercept  45.41402  0.707827
POP_CHNG   -0.29028  0.707827
N_EMPLD    0.00176   0.707827
TAX_RATE   2.18822   0.707827
PT_PHONE   -0.28344  0.707827
AGE        -0.26575  0.707827
PT_RURAL   0.081     0.707827

Here is an example of "thebadvalue.csv":

POP_CHNG N_EMPLD PT_POOR TAX_RATE PT_PHONE PT_RURAL AGE
Benton
Cannon
Carrol
Cheatheam
Cumberland
DeKalb
Dyer
Gibson 3040
Greene
Hawkins
Haywood
Henry
Houston
Humphreys
Jackson
Johnson
Lawrence
McNairy
Madison
Marshall
Maury
Montgomery
Morgan
Sevier
Shelby 11500
Sullivan
Trousdale 100
Unicoi
Wayne 100
Weakley
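A file with this layout (row labels down the side, variable names across the top, and a value only where something was taken out) could be written with the standard `csv` module. The sketch below is an assumption about the intended format: `write_bad_values` is a hypothetical helper, it expects the removed cells as (row index, column index, original value) tuples, it writes only rows that contain at least one removed value, and the labels and numbers are invented.

```python
import csv
import io


def write_bad_values(handle, row_names, col_names, bad):
    """Write a grid that is empty except for the removed cells."""
    removed = {(i, j): value for i, j, value in bad}
    writer = csv.writer(handle)
    writer.writerow([""] + list(col_names))          # header: blank corner + variables
    for i, name in enumerate(row_names):
        cells = [removed.get((i, j), "") for j in range(len(col_names))]
        if any(cell != "" for cell in cells):        # skip rows with nothing removed
            writer.writerow([name] + cells)


# Invented example: one value removed from row "B", column "X1".
buffer = io.StringIO()
write_bad_values(buffer, ["A", "B", "C"], ["X1", "X2"], [(1, 0, 100.0)])
print(buffer.getvalue())
```

To produce an actual file, the same function can be given a handle from `open("thebadvalue.csv", "w", newline="")` instead of the in-memory buffer.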
How can I do this?

Sorry for my English; I am not a native speaker.
Messages In This Thread
Is it possible to recombine values in python? - by synthex - Nov-29-2018, 10:42 AM
