The Gradient Descent Algorithm (GDA) is an algorithm used to find the minimum of a function.
In these notes we will concentrate on minimizing functions of two variables, i.e.,
In these notes we will present the basic GDA. In practice, the basic GDA is often tweaked to help ensure convergence or to speed up convergence.
The Gradient. We write the gradient of a function
Here is the GDA
Let
Typically, one runs the GDA algorithm until
is sufficiently small, or until
is sufficiently small, or for a specified number of iterations.
Notation.
which is just the usual Euclidean distance formula from the Pythagorean Theorem.
Note. Don’t forget, if the GDA produces a sequence of points which converge, they may converge to where a local minimum occurs, rather than to where the global minimum occurs. For an example of this, see the picture a few paragraphs down.
Note. The important things to remember are that the starting point
About the GDA. The GDA is based on the theorem that states that a function
The GDA produces a sequence of points
The GDA can also be applied to functions of one variable, e.g., to
For a video about the GDA and a Maple Script to implement the GDA see:
https://mccarthymat301.commons.gc.cuny.edu/calculus-i-iii/mat-303-calculus-iii/mat-303-quiz-1/
GDA implemented in R.
In the following 16 second video we see the surface
together with the GDA points (green) lifted to the surface
Technically speaking, the GDA points are not on the surface
In the R script and in the image below, the GDA starting point is
Here is the R script used to implement the GDA and which output the 3D graph shown in the video. The following R script requires the package “rgl” to make the 3D graph.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 | library (rgl) # for 3d plots # the function to minimize f <- function (x, y){ sin ( ((x)^2 + (y)^2)/3 ) } # numerical gradient grad <- function (x,y,f){ h = 0.001 fx = ( f (x + h, y) - f (x - h, y) )/ (2*h) fy = ( f (x, y + h) - f (x, y - h) )/ (2*h) c (fx, fy) } #starting point for GDA x0 = 1 y0 = 2 # iterations its = 1000; ################################# # no user input needed below here # initialize vectors x and y x = c (x0) y = c (y0) # GDA for (k in 1:(its - 1) ) { eta = 0.01 # learning rate x[k+1] = x[k] - eta* grad (x[k], y[k], f)[1] y[k+1] = y[k] - eta* grad (x[k], y[k], f)[2] } # plots id = plot3d (f, col = colorRampPalette ( c ( "blue" , "white" )), xlab = "X" , ylab = "Y" , zlab = "Z" , xlim = c (-5, 5), ylim = c (-5, 5), aspect = c (1, 1, 0.5)) contourLines3d (id) # "z" is the default function rgl.spheres (x, y, f (x,y), r = .2, color = "green" ) rgl.spheres (x[its], y[its], f (x[its], y[its]), r = .4, color = "red" ) # End R script |
For information about R, including a brief 3 1/2 minute video tutorial, how to download it for free, and how to run it online for free, see my page:
https://mccarthymat150.commons.gc.cuny.edu/r/