The gradient -- a vector derivative


In our discussion of kinematics, we often encounter derivatives — rates of change of a function. The velocity is the derivative of the position. The acceleration is the derivative of the velocity. But you may notice that those are all derivatives with respect to time: $v = dx/dt$, $a = dv/dt$.

We'll also encounter functions of position and will have to consider derivatives with respect to position in space. Since a position in space is specified by more than one variable — the coordinates x, y, and z — we have more than one derivative to consider. We put the three derivatives together as a vector — the gradient.

Biological relevance: Gradient-driven flows

Derivatives with respect to space are of great importance in biology. Here are four examples that we'll discuss in this wiki.

  • A variation in the concentration of a chemical in space moves material in ways important to cells (Fick's law of diffusion);
  • A variation of temperature from one point in space to another results in the flow of heat, and control of this flow is of great importance to warm-blooded animals (Fourier's law of heat conduction);
  • A variation in pressure from one point to another is what results in fluid being pushed through arteries (the Hagen-Poiseuille law of fluid flow);
  • The variation of electric potential in space is what drives electric currents in nerves and through membranes (Ohm's law).

In each of these cases, the spatial variation of some scalar quantity results in the transport of something through space in some direction: a vector flow. Modeling mathematically how a scalar function produces a vector flow requires a vector derivative known as the gradient. The laws (equations) mentioned above all fall into the class known as gradient-driven flows.

Creating the math to handle the many variables of space requires that we look at derivatives of a function that depends on many variables, for example, the $x$, $y$, and $z$ coordinates of space. Derivatives with respect to one of these variables are partial derivatives: they give the result of changing one of the variables while holding the others constant. The math of partial derivatives is simple at first, as long as we only consider one fixed set of coordinate variables. It gets messy when we decide to change the set of variables we are using (we may have to invoke the chain rule), but these mathematical methods have immense value in advanced science. They are critical in understanding the math of electromagnetism and thermodynamics. And they have lots of applications in studying topics in population dynamics, evolution, and ecology* that depend on many variables.

Here we will only introduce the simple math of partial derivatives that is a small extension beyond one-variable calculus. And we will focus on the conceptual interpretations of gradients described at the bottom of this page.

Partial derivatives

In Calc I you probably studied only functions of a single variable: $y = f(x)$. You then studied how the value of $y$ changes as $x$ changes. But real physical systems exist in a three-dimensional space. That means that we have to describe our position in this space with three coordinates: for example, $x$, $y$, and $z$. This is why we introduce the concept of a vector and of the rate of change in a space of many variables.

Recall that a vector is a quantity that has both a magnitude and a direction. For example, a position vector is a displacement from a fixed reference point (origin) by a given amount in three directions. We represent these directions by dimensionless unit vectors, $\hat{i}$, $\hat{j}$, and $\hat{k}$ — little arrows pointing in the positive $x$, $y$, and $z$ directions respectively. We then multiply these directions by (positive or negative) coordinates containing units and get a vector of something that has both a direction and a magnitude. Depending on what we multiply these unit vectors by, we can get different kinds of vectors — position, velocity, force. We write them looking like this:

$$\overrightarrow{r} = x\hat{i} + y\hat{j} + z\hat{k}$$

$$\overrightarrow{v} = v_x\hat{i} + v_y\hat{j} + v_z\hat{k}$$

$$\overrightarrow{F} = F_x\hat{i} + F_y\hat{j} + F_z\hat{k}$$

In each case we multiply our direction arrow by a coordinate (signed amount) of the right kind to build a total vector with direction.

An example: Temperature

Now we can also have functions of position that are just numbers (scalar functions) — ones that have only magnitude and not direction (though they might be positive or negative).  A familiar example is the temperature, which can vary from place to place. (Other scalar functions of position that we will encounter include pressure, concentration, and potential energy.) We can then write the temperature as a function of the three position coordinates:

$$T(\overrightarrow{r}) = T(x, y, z)$$

If we want to take a derivative of this function we have three choices. We could take the derivative with respect to $x$, $y$, or $z$.

We introduce a curvy "d" ($\partial$) instead of the regular "d" in these derivatives to remind ourselves that we have other variables around that we are keeping constant while we are taking our derivative: the partial derivative. So if we have an arbitrary function of position in space, $f(x, y, z)$, we could create three different derivatives:

$$\frac{\partial f}{\partial x}\;\;\;\;\;\frac{\partial f}{\partial y}\;\;\;\;\;\frac{\partial f}{\partial z} $$
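As a rough numerical illustration of "changing one variable while holding the others constant," the sketch below estimates the three partial derivatives with central differences. The field `temperature` and the helper `partial` are hypothetical, made up for this example only.

```python
def temperature(x, y, z):
    """Hypothetical temperature field (arbitrary units), chosen only for illustration."""
    return 20.0 + 3.0 * x + 2.0 * y * y - x * z


def partial(f, point, index, h=1e-5):
    """Central-difference estimate of the partial derivative of f with respect to
    coordinate number `index` (0 for x, 1 for y, 2 for z), holding the others fixed."""
    forward = list(point)
    backward = list(point)
    forward[index] += h   # step only the chosen coordinate up...
    backward[index] -= h  # ...and down; the other two are left untouched
    return (f(*forward) - f(*backward)) / (2.0 * h)


point = (1.0, 2.0, 3.0)
dT_dx = partial(temperature, point, 0)  # analytic value: 3 - z = 0
dT_dy = partial(temperature, point, 1)  # analytic value: 4y = 8
dT_dz = partial(temperature, point, 2)  # analytic value: -x = -1
print(dT_dx, dT_dy, dT_dz)
```

The printed estimates should agree with the analytic values in the comments up to small rounding errors.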

When we have a fixed set of coordinates that we will always use, like $x,y,z$, it's not critical to pay attention to the fact that we have partial derivatives. But when we are changing coordinates in mid-stream, it becomes crucial! You can easily get the wrong result if you don't pay attention to what you are keeping constant. This happens in two places of importance: when you are using curvilinear coordinates such as cylindrical or spherical coordinates, and in thermodynamics, where changing which variables you treat as independent goes hand in hand with switching between different kinds of energy (internal energy, Gibbs free energy, Helmholtz free energy). We won't have to worry about this here, but it becomes crucial in more advanced classes, such as Physical Chemistry.

Making the derivatives a vector: the gradient

Since we have three partial derivatives of a function in space, each associated with a direction ($x$, $y$, and $z$), we can create a vector from them by multiplying by the unit vectors: $\hat{i}$, $\hat{j}$, and $\hat{k}$. This combination for a function $f$ is referred to as the gradient of $f$. We write it with a funny symbol: an upside down delta officially called a "nabla" (a word meaning an Assyrian harp), though it's usually read as "del" by physicists and mathematicians. It looks like this: ∇. With this, we write the gradient of a function $f$ as

$$\overrightarrow{\nabla}f = \frac{\partial f}{\partial x} \hat{i} +\frac{\partial f}{\partial y} \hat{j} +\frac{\partial f}{\partial z} \hat{k} $$

Notice that the arrow is on the nabla since that's what turns the scalar function, $f$, into a vector.
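As a quick worked example, suppose that the temperature in some region is given by the function below. (This function and the constants $T_0$, $a$, and $b$ are assumptions made purely for illustration, not taken from a real system.)

$$T(x, y, z) = T_0 + ax + by^2$$

Holding the other two variables constant each time, the partial derivatives are

$$\frac{\partial T}{\partial x} = a \;\;\;\;\; \frac{\partial T}{\partial y} = 2by \;\;\;\;\; \frac{\partial T}{\partial z} = 0$$

so the gradient is

$$\overrightarrow{\nabla}T = a\hat{i} + 2by\hat{j}$$

This vector has no $\hat{k}$ component because this particular $T$ does not change when only $z$ changes.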

What's a gradient good for?

The gradient is good for understanding the shape of the function $f$ in space. Often, the way a function behaves in space has powerful physical consequences. One example that is important here is the potential energy. This (scalar) function of space, associated with gravitational and electric forces, contains all the information about the force an object will feel at each point. To get the force out of a potential energy, we take the negative of the gradient.** Thus, if we have a potential energy of a particular type, the force can be found like this:

$$\overrightarrow{F}_{type} = -\overrightarrow{\nabla}U_{type}  $$
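As a check on the sign, consider the gravitational potential energy near the Earth's surface, $U_{grav} = mgz$. (Here we assume that the $z$ axis points straight up, that $m$ is the object's mass, and that $g$ is the strength of the gravitational field; these symbols are not defined elsewhere on this page.) Only the $z$ derivative is nonzero:

$$\frac{\partial U_{grav}}{\partial x} = 0 \;\;\;\;\; \frac{\partial U_{grav}}{\partial y} = 0 \;\;\;\;\; \frac{\partial U_{grav}}{\partial z} = mg$$

so

$$\overrightarrow{F}_{grav} = -\overrightarrow{\nabla}U_{grav} = -mg\hat{k}$$

This is the familiar weight, of magnitude $mg$, pointing down. The gradient itself points up, the direction in which the potential energy increases, and the minus sign turns that into a force pointing toward lower potential energy.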

In other cases, the gradient is what gives the direction to our flow in the examples stated above. The negative gradient of the pressure gives the direction of fluid flow; the negative gradient of temperature gives the direction of heat flow, etc.
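In symbols, two of these laws can be sketched in the same form. The symbols below are conventional choices assumed for this illustration and are not defined elsewhere on this page: $\overrightarrow{J}$ is the flow of particles per unit area per unit time, $c$ is the concentration, and $D$ is the diffusion constant; $\overrightarrow{q}$ is the flow of heat per unit area per unit time, $T$ is the temperature, and $k$ is the thermal conductivity.

$$\overrightarrow{J} = -D\,\overrightarrow{\nabla}c \;\;\;\;\; \text{(Fick's law)}$$

$$\overrightarrow{q} = -k\,\overrightarrow{\nabla}T \;\;\;\;\; \text{(Fourier's law)}$$

In both, the minus sign says that the flow runs "downhill," from high values of the scalar to low values, and a steeper variation gives a larger flow.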

Making conceptual sense of the gradient

It's easier to think about the gradient in 2D because then we can imagine plotting our function as a function of the two variables in a 3D graph. Consider the complicated function $f(x,y)$ shown in the graph at the right. (This could represent the concentration of a chemical in a cell or relative fitness of an organism as a function of two parameters relevant to its survival.) We have plotted this function on the z axis. The shape of the surface gives the value of $f$ at each point. (We have drawn contour lines of equal values of $f$ to guide the eye.)

At each point on this surface, the gradient of the function points uphill, in the direction in which $f$ rises most quickly, and its magnitude is the slope of $f$ in that direction. In that sense, it's just like a regular derivative (the slope of a function), but for a function of many variables it is the largest slope you can find, and it points in the direction in which the function is changing the fastest.
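The "steepest uphill" claim can be checked numerically. The sketch below is only an illustration under stated assumptions: the surface `f`, the point `(x0, y0)`, and the helpers `gradient_2d` and `slope_along` are all hypothetical. It measures the slope of `f` in 360 different directions and compares the steepest one with the gradient.

```python
import math


def f(x, y):
    """Hypothetical smooth surface, chosen only for illustration."""
    return math.sin(x) * math.cos(y) + 0.1 * x * y


def gradient_2d(func, x, y, h=1e-5):
    """Central-difference estimate of (df/dx, df/dy) at the point (x, y)."""
    df_dx = (func(x + h, y) - func(x - h, y)) / (2.0 * h)
    df_dy = (func(x, y + h) - func(x, y - h)) / (2.0 * h)
    return df_dx, df_dy


def slope_along(func, x, y, angle, h=1e-5):
    """Slope of func at (x, y) when stepping in the direction (cos angle, sin angle)."""
    ux, uy = math.cos(angle), math.sin(angle)
    return (func(x + h * ux, y + h * uy) - func(x - h * ux, y - h * uy)) / (2.0 * h)


x0, y0 = 0.5, 1.0
gx, gy = gradient_2d(f, x0, y0)
grad_magnitude = math.hypot(gx, gy)
grad_angle = math.atan2(gy, gx)

# Scan 360 directions, one per degree, and keep the steepest one.
angles = [math.radians(d) for d in range(360)]
best_angle = max(angles, key=lambda a: slope_along(f, x0, y0, a))
best_slope = slope_along(f, x0, y0, best_angle)

print("gradient direction (deg):", math.degrees(grad_angle) % 360.0)
print("steepest scanned direction (deg):", math.degrees(best_angle))
print("gradient magnitude:", grad_magnitude, "steepest scanned slope:", best_slope)
```

Because the scan only tries whole-degree directions, the steepest scanned direction should land within about half a degree of the gradient direction, and the steepest scanned slope should be very slightly smaller than the gradient's magnitude.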

This isn't too bad to make sense of when you have a function of two variables. But when we have a function of three variables it's hard to imagine a plot in four dimensions. You just will have to use this kind of "points uphill" picture as an analogy (or metaphor) to help you make sense of what the gradient means physically in that case. But you can always use the math to get the answer, no matter how many variables you have.
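To illustrate that the math keeps working when a plot no longer can, here is a minimal sketch that estimates a gradient for any number of variables. The function `fitness` and its peak location are hypothetical, invented for this example, and `gradient` is just one simple central-difference approach.

```python
def gradient(f, point, h=1e-5):
    """Central-difference estimate of the gradient of f at `point`.

    `f` takes a sequence of coordinates; `point` may have any number of them.
    """
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h   # change only coordinate i,
        backward[i] -= h  # holding all the others constant
        grad.append((f(forward) - f(backward)) / (2.0 * h))
    return grad


def fitness(params):
    """Hypothetical four-parameter function with a single peak at (1, 2, 3, 4)."""
    peak = (1.0, 2.0, 3.0, 4.0)
    return -sum((p - q) ** 2 for p, q in zip(params, peak))


g = gradient(fitness, [0.0, 0.0, 0.0, 0.0])
print(g)  # analytic gradient at the origin: [2, 4, 6, 8]
```

The result is a four-component vector that points from the origin toward the peak, which is the "points uphill" picture carried over to a space we cannot draw.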

* In ecology, evolution, and population dynamics, one might consider a "fitness function" that depends on many variables. The gradient then tells how that fitness function changes as a result of changing each of those parameters. The gradient might then be a vector in a space with many more than three dimensions!

** In a sense, the gradient is the derivative that undoes (is the inverse of) the line integral that we used to create the potential energy.

Joe Redish 12/3/11



Article 319
Last Modified: May 27, 2019