# Python Forum

I am trying to implement a gradient descent algorithm for linear regression. Unfortunately I can't attach the data set. My algorithm is shown below:

```
import numpy as np
import csv
import matplotlib.pyplot as plt

path = ''
population = []
profit = []
with open(path + 'ex1data1.txt', 'r') as f:
    results = csv.reader(f)
    for i in results:
        population.append(float(i[0]))
        profit.append(float(i[1]))

#This function implements gradient descent to find a regression line of
#the form y = theta0 + theta1*x
def gradientDescent(xData, yData, a, iterations, theta0, theta1):
    J = []
    it = []
    for i in xrange(iterations):
        #Begin calculation of cost function
        cost0 = 0
        cost1 = 0
        for j in xrange(len(xData)):
            cost0 += (theta0 + theta1*xData[j] - yData[j]) ** 2
            cost1 += (theta0 + theta1*xData[j] - yData[j]) ** 2 * xData[j]
        #End calculation of cost function

        #Update theta0 and theta1
        tmp0 = theta0 - a*cost0
        tmp1 = theta1 - a*cost1
        theta0 = tmp0
        theta1 = tmp1

        J.append(cost1)
        it.append(i)

    return theta0, theta1, J, it

result = gradientDescent(population, profit, 0.000000005, 8000, 1, 2)

print 'y = %s + %sx' % (result[0], result[1])

#Plot the cost function value against iteration number
plt.plot(result[3], result[2])
plt.show()

#Plot the fitted line over the data
x = np.arange(0, 30, 1)
y = result[0] + result[1]*x
plt.plot(x, y)
plt.plot(population, profit, 'rx')
plt.show()
```
I keep track of the cost function value at each iteration so I can plot it at the end and pick a good value of alpha. By varying the number of iterations, I can see from the plot that the cost function only begins to converge somewhere between 7000 and 8000 iterations. That is much higher than it should be.
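To pin down what I mean by "converge": I treat the run as converged once the relative drop in cost between successive iterations becomes negligible. A small sketch of that check (the tolerance value is arbitrary):

```python
def converged(J, tol=1e-6):
    # Converged when the relative drop in cost between the last two
    # iterations falls below tol.
    if len(J) < 2 or J[-2] == 0:
        return False
    return abs(J[-2] - J[-1]) / abs(J[-2]) < tol

history = [10.0, 5.0, 4.999999, 4.9999989]
converged(history[:2])   # -> False (cost still dropping fast)
converged(history)       # -> True  (relative change ~2e-8)
```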

Also, the value of alpha required is far too small. A more typical value of alpha, such as 0.005 or 0.0005, doesn't work at all.
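For comparison, my understanding of the textbook batch update is that it uses the derivative of the cost (the residual, not its square, appears in the gradient) and divides by the number of samples. Here is a sketch of that version; since I can't attach ex1data1.txt, the data below is synthetic and the numbers are made up:

```python
import numpy as np

# Synthetic stand-in for ex1data1.txt: y roughly 1.0 + 1.2*x plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(5, 25, 97)
y = 1.0 + 1.2 * x + rng.normal(0, 1, 97)

def gradient_descent(x, y, alpha, iterations, theta0, theta1):
    m = len(x)
    J = []
    for _ in range(iterations):
        error = theta0 + theta1 * x - y           # h(x) - y, not squared
        grad0 = error.sum() / m                   # d(cost)/d(theta0)
        grad1 = (error * x).sum() / m             # d(cost)/d(theta1)
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
        J.append((error ** 2).sum() / (2 * m))    # cost after the update
    return theta0, theta1, J

t0, t1, J = gradient_descent(x, y, 0.005, 1500, 0.0, 0.0)
```

With this form of the update, an alpha on the order of 0.005 is stable on this synthetic data and the cost falls steadily.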

Does anyone have any ideas on what I should change? I can't see why this doesn't work as well as it should.