<- 1
sigma <- 0
mu <- makeFun( -(x-mu)^2/(2*sigma^2) ~ x, mu=mu, sigma=sigma)
h slice_plot(h(x) ~ x, bounds(x=sigma*c(-1.5, 1.5))) %>%
gf_hline(yintercept = c(0,-0.5), color=c("dodgerblue", "orange3"))
53 Diffusion
This chapter is still in a draft state.
53.1 Origins of the gaussian function
From the start of MOSAIC Calculus we have used a small set of basic modeling functions which will by now be familiar to you:
- exponential
- logarithm: inverse to exponentials
- power-law
- sinusoid
- gaussian and sigmoid
This chapter gives a more detailed introduction to the gaussian function and provides a specific algebraic formula that composes an exponential with a low-order polynomial.
Start with the low-order polynomial:
- The mean,
, is the argmax of the function. - The variance,
, how “fat” the parabola is.
(Recall that the Greek letters
A reasonable person can point out that the domain of the low-order polynomial is
The sandbox defines the function
The width of the parabola is based on the length of the horizontal line segment between the two branches of the parabola. Specifically, the width is defined as half the length of this line segment. to avoid confusion, we will use “width” in the usual English sense and a special term, standard deviation, to refer to half the length of the line segment. (“Standard deviation” is a good candidate for the worst name ever for a simple concept: a length. Another, equivalent term you will hear for this length is “root mean square,” which is almost as bad. Still, those are the standard terms and you should be careful to use instead of non-standard alternatives.)
**Part A** In the above sandbox, set `sigma <- 2`. What *standard deviation* corresponds to $\sigma = 2$?
1/4 1
**Part B** Still holding $\sigma = 2$, what is the *variance* of the function?
1/4 1
**Part C** Set $\sigma = 3$, and read off the graph what is the *standard deviation* of the function?
1/3 1
**Part D** When $\sigma = 3$, what is the *variance* parameter?
1/3 1
**Part E** Pick a $\sigma$ of your choice, try a few non-zero values for $\mu$. Read off from the graph the standard deviation. How does the standard deviation depend on $\mu$?
- The standard deviation is
. - The standard deviation is
. - The standard deviation is
. - The standard deviation is not affected by
.
**Part F** What is the relationship between the *variance* $\sigma^2$ and the *standard deviation* $\sigma$?
- The standard deviation is the square root of the variance.
- The standard deviation is 1 over the variance.
- They are completely unconnected concepts.
- The variance is the square of the mean,
.
One of the features that make bump functions useful is that they are local, the function is practically zero except on a domain of limited width. The parabola
To produce our pattern-book bump function, we compose an exponential function
The sandbox defines
<- 1
sigma <- 0
mu <- makeFun( exp( -(x-mu)^2/(2*sigma^2) ) ~ x, mu=mu, sigma=sigma)
f slice_plot(f(x) ~ x, bounds(x=sigma*c(-3.5, 3.5))) %>%
gf_hline(yintercept = c(0,exp(-0.5)), color=c("dodgerblue", "orange3"))
<- antiD(g(x) ~ x, lower.bound = -Inf)
sigmoid # Graph the sigmoid on the domain -10 < x < 10
Notice that the vertical range of the function is
How to scale the gaussian antiD()
to compute sigmoid()
. From the graph of sigmoid()
you can read off the scaling factor that will make the vertical range of the resulting sigmoid zero to one.
**Part G** Where did the variable $u$ come from in $$\int_{-\infty}^x f(u) du ?$$
- The instructors made a mistake. They will try to be more careful in the future!
- You can use any name for the variable of integration. The function
will be a function of , not . - You can use any name for the variable of integration. The function
will be a function of , not . is the Latin equivalent of .
**Part H** This question will take a bit of detective work in the sandbox. Make a table of a few combinations of different values for $\mu$ and $\sigma$. For each combination, find the maximum value of the corresponding `sigmoid()` function. Using this data, choose the correct formula for the maximum value of the sigmoid as a function of $\mu$ and $\sigma$.
Putting all this together, we arrive at our “official” standard gaussian function, called the Gaussian function:
In R, the Gaussian function is provided as dnorm(x, mu, sigma)
. (The corresponding sigmoid, that is, the anti-derivative of dnorm()
with respect to x
is available as pnorm(x, mu, sigma)
.
The Gaussian function is so important, that it is worth pointing out some recognizable landmarks in the algebraic expression. Knowing to look for such things is one trait that defines an expert.
- The term
is not a function of , it is a constant related to the variance . It is just a number arranged to make - The exponential means that, whatever the values of
and , the function value . - The argmax is at
. This is also the “center” of the function, which is symmetrical around the argmax. - The variance
appears directly in the formula. In no place does , the standard deviation, appear without the exponent 2. This is a hint that the variance is more fundamental than the standard deviation.
**Part I** Go back to the sandbox where we defined the function `f(x)` and graphed it. The function has a distinctive shape which is almost always described in terms of a familiar word. Which one is it?
breadloaf-shaped ghost-shaped bell-shaped sombrero-shaped
Optional background. Now for a bit of irony. We’ve taken a lot of care to define a specific form of gaussian function with a formula that strikes many people as complicated. It must be that all these specifics are important, right? In reality, any function with a roughly similar shape would work in pretty much the same way. We could have defined our “official” gaussian function in any of a number of ways, some of which would be algebraically simpler. But the specific gaussian shape is a kind of fixed point in the differential equation that we will study next.
53.2 Net flux
Recall Newton’s Law of Cooling,
We’ve studied cooling in a spatially discrete context, the cooling of a single point (e.g. “a cup of coffee”) in an environment that has only one property, the “ambient” temperature.
Let’s switch to a spatially continuous context, a bar of iron with one end lying in a bed of hot coals and the other end in the open air, as in the picture:

The iron rod is incandescent at the right end and cooler toward the left. If the picture were a movie, you likely would be able to predict what the action will be: Heat will flow down the rod from right to left. The free end of the rod will eventually get burning hot.
The temperature
If we were thinking about the movie frame-by-frame, we might prefer to treat
Now back to Newton’s Law of Cooling. The flux of heat is the difference between the object’s temperature and the ambient temperature. But in the continuous spatial system, the difference in temperature between two infinitely close neighboring points is zero. That suggests no flux. Of course, a major theme in Calculus is to provide means to discuss the rate of difference of a value at two infinitely close points: the derivative
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
It might be tempting to translate this directly into the terms of Newton’s Law of Cooling and claim, wrongly that
In the spatially continuous context, the net flux or difference in differences (A to B, B to C) is represented by the second derivative with respect to
Some people might be more comfortable thinking about the discrete-time dynamics of the movie, which could be written
Exercise: Turn away from the iron rod of the picture and imagine being presented with four new rods each of which has been heated in some way to produce a temperature profile at time 0, that is
## Warning: All aesthetics have length 1, but the data has 101 rows.
## ℹ Please consider using `annotate()` or provide this layer with data containing
## a single row.
## Warning: All aesthetics have length 1, but the data has 101 rows.
## ℹ Please consider using `annotate()` or provide this layer with data containing
## a single row.
## Warning: All aesthetics have length 1, but the data has 101 rows.
## ℹ Please consider using `annotate()` or provide this layer with data containing
## a single row.
## Warning: All aesthetics have length 1, but the data has 101 rows.
## ℹ Please consider using `annotate()` or provide this layer with data containing
## a single row.
- For the function
shown in Graph (A), how will the temperature at change instantaneously? (That is, what is ?)
“+Temperature decreases+”, “No change in temperature”, “Temperature increases”, “Temperature oscillates” = “The temperature cannot oscillate instantaneously. The value of a derivative function at a point is a real number, not an oscillation.” , random_answer_order = FALSE )
For the function
shown in Graph (B), how will the temperature at change instantaneously? (That is, what is ?) “Temperature decreases”, “No change in temperature”, “+Temperature increases+”, “Temperature oscillates” = “The temperature cannot oscillate instantaneously. The value of a derivative function at a point is a real number, not an oscillation.” , random_answer_order = FALSE )For the function
shown in Graph (C), how will the temperature at change instantaneously? (That is, what is ?) “Temperature decreases”, “+No change in temperature+”, “Temperature increases”, “Temperature oscillates” = “The temperature cannot oscillate instantaneously. The value of a derivative function at a point is a real number, not an oscillation.” , random_answer_order = FALSE )
askMC( 4. For the function
53.3 Diffusion
Recall that the heat equation describes how the temperature along a approximately 1-dimensional object—an iron bar, for example—changes with time due to spatial differences in temperature from point to point. The heat equation is
**Part J** Suppose the inputs to the temperature function have units centimeters (for $x$) and seconds (for $t$) and that the output is in degrees Kelvin (which we will write "K"). What are the units of the coefficient $k$?
At the time Fourier was working, there was no molecular theory of matter and very little understanding of what the “heat substance” might consist of. Now we know that heat is the energy of molecular vibrations. This energy diffuses through the material.
Similarly, “diffusion” is one mode of physical motion of material, for example movement of sugar molecules within a cell. Other things can diffuse as well, for example the action of viscosity in fluids can be seen as the diffusion of momentum.
Starting in the 20th century and in support of the developing molecular theory of gasses, mathematicians and physicists undertook to follow the trajectories of individual diffusing particles and to develop a means to describe them mathematically. This included the concept of a “random walk,” movement of a particle shifting direction and speed randomly as it collides with other moving molecules and particles in its environment.
The movie shows a simulation of a few particle (in yellow) undergoing a random walk. The path followed by each diffusing particle is shown in blue; the velocity of one particle is indicated with a red vector.
The idea of random walks has become especially important in economics and finance. The walking “particle” might be the price of a stock or other derivative asset. The “collisions” happen as individual trades are made in the market, each trade being influenced by some news or situation or the passing whims, fancies, or fears of investors. The work on this point of view started about 1900 by a mathematics graduate student, Louis Bachelier, who undertook to study the movements of the Bourse, the Parisian stock exchange. The 1997 Nobel Prize in economics was awarded for a “new method to determine the value of [market] derivatives.”
For these reasons, we will focus and the mathematics of diffusion instead of the equivalent but historically prior mathematics of heat. We will work with a function
For the sake of visualization, suppose some odor molecules are released in at the midpoint of a pipe with absolutely still air. Over time, the molecules will diffuse throughout the along the extent of the pipe.
If
**Part K** Suppose the inputs to the concentration function have units centimeters (for $x$) and seconds (for $t$) and that the output is in nanograms per cubic centimeter ($ng\ cm^{-3}$) . What are the units of the coefficient $k$?
Many people have difficulty imagining the sorts of frequent collisions that are behind diffusion. They think, for instance, that in still air the molecules are pretty much still. This is wrong. A typical velocity of a water molecule in air at room temperature and pressure is 650 m/sec. (The speed of sound is about 350 m/sec.) But the time between molecular collisions is on the order of
53.4 Dynamics of variance
In this section, we will explore the connection between diffusion and the gaussian function. Recall that we modeled the temperature along a one-dimensional spatial domain (a “bar of iron”) as it evolves in time as a function of both position and time:
We constructed a differential equation to describe the dynamics of
In studing dynamics we worked first with time taken discretely, e.g. a sequence of states
In our present contexts, heat or diffusion, we are working with functions. Let’s return to the earlier metaphor of a movie of
To describe the dynamics—that is, the change from frame to frame in the movie—we write a finite-difference equation, generically:
In English, this says, “The concentration at
Now imagine making the movie using an ultra-high-speed camera that takes a new frame every
We will find the solution
<- list()
C_funs <- list()
flux_funs <- list()
tmp <- seq(-50, 50, length=1000)
xpts 1]] <- tibble::tibble(
C_funs[[x = xpts,
y = dnorm(x, 0, 1)
)
for (k in 1:500) {
<- spliner(y ~ x, data = C_funs[[k]])
tmp <- flux_funs[[k]] <- D(tmp(x) ~ x + x, .hstep=1)
foo +1]] <- tibble::tibble(
C_funs[[kx = xpts,
y = tmp(x) + 0.1*foo(x)
) }
Imagine that we have a long, thin pipe filled with still air and we inject at position
<- max(C_funs[[1]]$y)
top gf_line(y ~ x, data = C_funs[[1]] |> filter(abs(x) < 25)) %>%
gf_labs(y="Temperature (deg C)", x="Position along bar", title="Initial condition C(x,t=0)") %>%
gf_hline(yintercept = exp(-.5)*top, color="orange3")
The red horizontal line is positioned to enable you to read off the standard deviation of the bell-shaped function.
The next frame of the movie will show
<- spliner(y ~ x, data = C_funs[[1]])
tmp <- D(tmp(x) ~ x & x, .hstep=1)
foo slice_plot(foo(x) ~ x, bounds(x=c(-25, 25)), npts=500) %>%
gf_labs(y="Net flux (deg C/cm^2)", x="Position along bar", title="Net flux at time t=0")
You can see that there is a strong net flux out of the center point and a net flux in to neighboring regions: the heat will be spreading out. Far from the center point, the net flux is zero. In the next graphs, we will zoom in on the center of the domain,
To find the next Euler step, that is, the function
Here is the solution
<- function(n, compare_n=1, show_sd=TRUE, domain=5) {
draw_C <- C_funs[[n]] |> filter(abs(x) < domain)
Dat <- glue::glue("C(x, t={(n-1)*0.1}) in deg C.")
ylab <- glue::glue("C(x, t={(n-1)*0.1})")
title <- exp(-0.5) * max(Dat$y)
redline <- gf_line(y ~ x, data = C_funs[[n]] |> filter(abs(x) < domain)) %>%
P gf_labs(y=ylab,
x="Position along bar", title=title)
if (compare_n > 0) {
<- P |>
P gf_line(y ~ x, alpha=0.5, color="dodgerblue",
data = C_funs[[compare_n]] |> filter(abs(x) < domain))
}
if (show_sd) {
|> gf_hline(yintercept=redline, color="orange3")
P else {
}
P
}
}draw_C(6, domain=25, show_sd=FALSE)
At time
Here is the function
<- function(P, xlim=5, dxlim=xlim/50) {
axis5 |> gf_refine(scale_x_continuous(breaks=(-xlim):xlim,
P minor_breaks = seq(-xlim,xlim,by=dxlim)))
}draw_C(11, domain=5, show_sd=TRUE) |> axis5()
draw_C(21, domain=5, show_sd=TRUE) |> axis5()
draw_C(31, domain=5, show_sd=TRUE) |> axis5()
draw_C(41, domain=5, show_sd=TRUE) |> axis5()
Use the intersection between the red horizontal line and the black curve to find the width of the black curve, that is, the standard deviation. Which of these sequences correspond to the standard deviation at times 1, 2, 3, and 4?
“
Here is a similar set of graphs for
draw_C(101,0, domain=15, show_sd=TRUE) |> axis5(xlim=15, dxlim=0.5)
draw_C(201,0, domain=15, show_sd=TRUE) |> axis5(xlim=15, dxlim=0.5)
draw_C(301,0, domain=15, show_sd=TRUE) |> axis5(xlim=15, dxlim=0.5)
draw_C(401,0, domain=15, show_sd=TRUE) |> axis5(xlim=15, dxlim=0.5)
Which of these sequences correspond to the standard deviation at times 10, 20, 30, and 40?
“+
The standard deviation increases with time. Which one of these power-law relationships best corresponds to the standard deviations you measured off the graphs?
“
Essay: A more fundamental way to measure the increase in width of
53.5 Random walk
The solution to diffusion differential equation gives the concentration of the diffusing particles
Everything is smooth and nicely behaved. But this is an abstraction which summarizes the concentation of a theoretically infinite population of particles. Looking at the position of a single particle as a function of time gives a very different impression. The figure shows individually the trajectories of 50 (randomly selected) particles.
Do take note that the position of each particle in the pipe at time
Each of the trajectories is called a random walk, as if a walker were randomly taking steps forward or backward.
r insert_calcZ_exercise(“XX.XX”, “KktWcD”, “Exercises/Dynamics/fox-write-oven.Rmd”)
53.6 Investment volatility
When people invest money, they expect a return. Generally, the return is measured as a percentage per year. An
Banks and such do things in discrete time, e.g. crediting your savings account with interest once a month. But this is calculus, so we focus on continuous time. (And, of course, Nature does things in continuous time!)
If
Quick review questions:
- Which of those two equations is the differential equation and which is the solution?
- What symbol is being used to stand for the “state” of the system?
- What is the form of the dynamical function of state?
- Is there a fixed point? If so, is it stable?
Investments in the stock market provide two types of return. We will focus on the return that comes from the changing price of the stock, which can go up or down from day to day. The other kind of return is dividends, the typically quarterly payment made by the company to stock holders. In investments, dividends should not be ignored, but we aren’t interested in them here.
Now imagine that you expect, for whatever reason, the stock price to go up by 2% per year (
This situation, which includes volatility, is modeled by a modification of differential equations called “stochastic differential equations.” (“Stochastic” comes from the Greek word for “aiming”, as in aiming an arrow at a target. You won’t necessarily hit exactly.) The math is more advanced and we will not go into details. Our point here is to warn you: Now that you are expert about (ordinary) differential equations, you need to be aware that things are somewhat different in a stochastic situation.
To that end, we will show you trajectories that follow the mathematics of stochastic exponential growth (with
The eye is drawn to the trajectories leading to large returns. That is misleading. Although there are a few trajectories that show a 3-year return above 50% (that is, to $150 or higher) in fact the majority of trajectories fall below that of a purely deterministic
It is easy to understand why the stochastic shocks cause a loss in return. Consider a stock with an initial price of $100 that in two successive days goes up by 50% and down by 50%. These ups and downs should cancel out, right? that is not what happens:
Some class notes in www/2021-03-31-classnotes.Rmd