Why is my gradient descent algorithm requiring such a small alpha?

JoeB · Dec-08-2017, 05:09 PM

JoeB · (This post was last modified: Dec-08-2017, 05:15 PM by JoeB.)

I am trying to implement a gradient descent algorithm for linear regression. Unfortunately I can't attach the data set. My algorithm is shown below:

import numpy as np 
import csv
import matplotlib.pyplot as plt
path = ''
with open(path + 'ex1data1.txt', 'r') as f:
    results = list(csv.reader(f))

population = []
profit = []
for i in results:
    population.append(float(i[0]))
    profit.append(float(i[1]))

# This function implements gradient descent to find a regression line of
# the form y = theta0 + theta1*x
def gradientDescent(xData, yData, a, iterations, theta0, theta1):
    J = []
    it = []
    for i in xrange(iterations):
        # Begin calculation of cost function
        cost0 = 0
        cost1 = 0
        for j in xrange(len(xData)):
            cost0 += (theta0 + theta1*xData[j] - yData[j]) ** 2
            cost1 += (theta0 + theta1*xData[j] - yData[j]) ** 2 * xData[j]
        # End calculation of cost function

        # Update theta0 and theta1
        tmp0 = theta0 - a*cost0
        tmp1 = theta1 - a*cost1
        theta0 = tmp0
        theta1 = tmp1

        J.append(cost1)
        it.append(i)

    return theta0, theta1, J, it

result = gradientDescent(population, profit, 0.000000005, 8000, 1, 2)

print 'y = %s + %sx' % (result[0], result[1])

plt.plot(result[3], result[2])
plt.show()

x = np.arange(0, 30, 1)
y = result[0] + result[1]*x
plt.plot(x, y)
plt.plot(population, profit, 'rx')
plt.show()

I keep track of the cost function value at each iteration so that I can plot it at the end and find a good value of alpha. By changing the number of iterations, the plot shows that the cost function only begins to converge between 7000 and 8000 iterations. This is far more than it should need.

Also, the value of alpha required (0.000000005) is far too small. More typical values of alpha, such as 0.005 or 0.0005, don't work.

Does anyone have any ideas on how I can modify this? I can't see why this doesn't work as well as it should.
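One thing worth checking, offered as a sketch rather than a confirmed diagnosis: for the usual squared-error cost J = (1/2m) * sum((theta0 + theta1*x - y)^2), the partial derivatives use the plain residual, not the squared residual, and are averaged over the m samples. The posted update squares the residual (so the step never changes sign) and sums without dividing by m, which could explain why only a tiny alpha keeps it from blowing up. The version below is written for Python 3, the function and variable names are illustrative, and the data is made up (points on y = 1 + 2x) because ex1data1.txt is not available here.

```python
def gradient_descent(x_data, y_data, alpha, iterations, theta0=0.0, theta1=0.0):
    """Batch gradient descent for the line y = theta0 + theta1*x."""
    m = len(x_data)
    cost_history = []
    for _ in range(iterations):
        grad0 = 0.0
        grad1 = 0.0
        cost = 0.0
        for x, y in zip(x_data, y_data):
            error = theta0 + theta1 * x - y  # residual, NOT squared
            grad0 += error                   # contributes to dJ/dtheta0
            grad1 += error * x               # contributes to dJ/dtheta1
            cost += error ** 2               # squared error is only for J itself
        # Simultaneous update, with the gradients averaged over m samples
        theta0, theta1 = (theta0 - alpha * grad0 / m,
                          theta1 - alpha * grad1 / m)
        # Cost measured before this iteration's update
        cost_history.append(cost / (2 * m))
    return theta0, theta1, cost_history


# Synthetic, noise-free data (an assumption for illustration): y = 1 + 2x
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x for x in xs]

theta0, theta1, history = gradient_descent(xs, ys, alpha=0.05, iterations=5000)
print('y = %s + %sx' % (theta0, theta1))
```

Plotting `history` against the iteration index gives the same kind of convergence plot as in the original script, but of the cost J itself rather than of the theta1 update term. Which alpha works on the real data set still depends on the scale of the population values.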
