This is the second post (here’s the first one) about an approach to introducing the derivative to calculus students that runs counter to what I’ve seen in textbooks and other traditional treatments of the subject. As I wrote in the first post, in the typical first contact with the derivative, students are given a smooth curve and asked to find the slope of a tangent line to this curve at a point. But I argued that it would be more helpful to students’ understanding of the derivative to start with a simpler case, namely using only piecewise-linear functions at the beginning. This way, as we saw, we can develop some important core ideas about the derivative without resorting to anything more than pictures and an occasional slope calculation.

But now, we need to deal with the main problem: What happens if we *do* have a smooth curve, not a straight line or piecewise-linear graph, and we want to answer the same kinds of questions as we posed in the first post? Again, here’s how this might play out in a classroom setting.

Let’s go back to Charlie from the example in the previous post, who travels 100 meters over a 120-second time span to the cafeteria according to this graph:

The piecewise-linearity of this graph makes it easy to calculate Charlie’s velocity at (almost…) any point. But there’s a problem. Can a human being possibly change velocities, as Charlie does at t = 60 and t = 90, without slowing down first? That is clearly not in line with the laws of physics unless you have no mass. So, although the piecewise-linear graph can be a pretty good approximation to real life, in real life no person would ever move like this. Instead, Charlie’s motion is probably more like this:

Charlie’s story as told by this graph is basically the same as before. But now the curve is smoothed out where Charlie changes direction to account for the physical realities of motion. Now let’s ask the same kind of question as before: How fast was Charlie going at, say, 30 seconds?

I like just to give this problem to the students to see what they can make of it. We’ve done instantaneous rates several times by this point, but all for piecewise-linear functions. That was easy; how can you adapt this method to a function that is not linear? Students who come up with any sort of idea at all usually come up with the right one: **Somehow approximate the curve with a straight line at t = 30 and then measure that line’s slope**.

Some students do this by arguing that the graph from t=0 to t=60 is essentially linear already; that tiny bit of curvature we see is so small it can be neglected, so just find the slope of the “line” from 0 to 60 using the origin (through which the graph clearly passes) and either (30, 50) or (60, 80). Other students will draw the tangent line to the graph at t = 30 — without ever having been told what a tangent line is or having seen one — and measure its slope. The first approach, of course, is using a secant line, the second one a tangent line.

Both of these approaches are quite natural and also pretty accurate in this case. But eventually we want students to understand that the best approach is to create not a picture but a *process* whereby we can get an approximate slope to any degree of accuracy we like, and eventually define the slope of a curve at a point. The usual way to do this is the one in the calculus books: fix the point of tangency (e.g. t = 30) and select a movable second point (a, y(a)); calculate the slope of the secant line; repeat, moving the second point closer each time, until the differences in the secant slopes become negligible. The result is the slope of the tangent line. There’s nothing wrong with that, but here’s another approach that retains the piecewise-linear flavor of the initial encounter.

We don’t (yet) know exactly what it means when we talk about the “slope” of a curve. So let’s take a step backwards. Suppose we broke Charlie’s distance graph into a number of straight pieces by picking a bunch of points on the curve and connecting the dots, like so:

(Here the dots are plotted at t = 0, 30, 60, … , 120.) Voilà: we have piecewise-linearized the graph! Now, if there is a single line segment that contains t=30, just locate it and find its slope. This requires approximation, but that’s the price we pay. (On the other hand, if we had a formula, we wouldn’t need to approximate; that’s a separate calculation and in the spirit of keeping things relatively algebra-free here, we won’t go into that.)

But since two pieces of data are often better than one, a potentially even better approach is to plot a bunch of dots and make t = 30 one of them, as we have done above. This will create a line segment before t=30 and a line segment after t=30. Then we can estimate the two slopes and average the result.
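
As a rough worked example, take the approximate readings mentioned earlier, (0, 0), (30, 50), and (60, 80), as the dots on either side of t = 30. These are eyeballed values from the graph, so the result is only an estimate, in meters per second:

$$\frac{50 - 0}{30 - 0} \approx 1.67, \qquad \frac{80 - 50}{60 - 30} = 1.00, \qquad \frac{1.67 + 1.00}{2} \approx 1.33$$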

*Question*: How accurate is this, and can we make it more accurate? Intuitively, as long as the function is relatively well-behaved at t = 30, the more dots we plot on the graph, the better accuracy we get. So go back through and (say) double the number of dots you plot and repeat. This sounds like a lot of busy work until you realize you only need three dots: one at t=30, another just before t=30, and another just after t=30. For simplicity, make the two “outside” dots the same distance from t=30, say 0.1 units away. Find the slope from t=29.9 to t=30 and then from t=30 to t=30.1; average the results; and that’s a better approximation. Reduce the size of the offset if you want even more accuracy. And if you want a clear idea of what the “slope” of a curve at a point is, reduce the offset size repeatedly and see what the average slopes approach.
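
Here is a minimal Python sketch of that three-dot process. We don’t have a formula for Charlie’s graph, so `position` below is a made-up smooth curve that merely runs from 0 to 100 meters over 120 seconds; it and the helper `average_slope` are hypothetical names chosen for illustration.

```python
import math

def position(t):
    """Hypothetical smooth distance curve (meters) for 0 <= t <= 120 seconds.

    This is a stand-in for illustration, not the function behind Charlie's graph.
    """
    return 50 * (1 - math.cos(math.pi * t / 120))

def average_slope(f, t, offset):
    """Average the slope just before t and the slope just after t."""
    slope_before = (f(t) - f(t - offset)) / offset
    slope_after = (f(t + offset) - f(t)) / offset
    return (slope_before + slope_after) / 2

# Shrink the offset and watch the averaged slopes settle down.
for offset in [10, 1, 0.1, 0.01]:
    print(offset, average_slope(position, 30, offset))
```

For a curve like this one, the printed values should change less and less as the offset shrinks, which is exactly the behavior students are asked to notice.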

All we’re doing here is reformulating the standard method of getting the derivative. If we let $h$ represent the offset described above, and if $f$ is the function of interest, then the “slope just before t=30” is

$$\frac{f(30) - f(30-h)}{h}$$

and similarly, the “slope just after t=30” is

$$\frac{f(30+h) - f(30)}{h}$$

the average of these two is

$$\frac{1}{2}\left[\frac{f(30) - f(30-h)}{h} + \frac{f(30+h) - f(30)}{h}\right] = \frac{f(30+h) - f(30-h)}{2h}$$

and this is known as the symmetric difference quotient, a standard means of calculating numerical derivatives and perhaps the best choice for differentiating functions that are given as tables of data. What we are doing by “shrinking the offset” is merely letting $h \to 0$. So ultimately we are setting up the definition of the derivative at t=30 to be:

$$f'(30) = \lim_{h \to 0} \frac{f(30+h) - f(30-h)}{2h}$$

Of course this produces the same derivative values as the usual limit-based definitions of the derivative. What makes this possibly preferable to the usual formulas, though, is that it arises out of the piecewise-linear approach; and it lends itself very well to functions given as tables of data (if you knock out the limit). The method of going through a smooth curve, putting a bunch of equally-spaced dots on it, and then connecting the dots is also precisely how the formula for arc length is developed when students get around to applications of the integral. So this approach also provides a bit of unification between differential and integral calculus.
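
To make the table-of-data case concrete, here is a small Python sketch with the limit knocked out: the slope at a tabulated point is estimated from its two neighbors. The function name `table_derivative` and the numbers in the table are assumptions for illustration only, not measurements from Charlie’s graph.

```python
def table_derivative(times, values, i):
    """Symmetric difference quotient at interior index i of an equally spaced table."""
    if not 0 < i < len(times) - 1:
        raise ValueError("need a data point on each side of index i")
    # For equally spaced times, the denominator is twice the spacing.
    return (values[i + 1] - values[i - 1]) / (times[i + 1] - times[i - 1])

# Hypothetical readings (seconds, meters); illustrative numbers only.
times = [0, 30, 60, 90, 120]
distances = [0, 50, 80, 90, 100]

# Estimated velocity at t = 30, using the neighbors at t = 0 and t = 60.
print(table_derivative(times, distances, 1))
```

The endpoints of the table have a neighbor on only one side, so this sketch refuses to estimate there rather than quietly falling back to a one-sided slope.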

But integration, and how the piecewise-linear approach might be useful there, is the subject of the next post.