The Calculation of Slopes, Tangents, or Changes in a Function
Rate equations, or differential equations, are all about changes in smooth functions, be they trigonometric, exponential, or polynomial functions, or, as the solutions of exercises will often generate, combinations of them. As laws are introduced in physics texts, they will relate some property of matter to rates of change. The rates of change of a curve are themselves properties of that curve, along with the domain interval and the range of $f$ over the domain (the image).
The second derivative is the change in the change of the function, or the derivative of the derivative of $f$. We'll be using $y$ as ordinate and $x$ as abscissa, or domain parameter, but we should be clear that the domain can be any parameter: if $f$ is the position parametrized on time (domain), with its value on the ordinate axis (range) as $x$, then the first derivative is velocity, $v$, and the second derivative is acceleration, $a$. Each derivative is itself a function on the same domain interval, though generally each has a different range or image, which is just organizational nomenclature for such intervals of numbers.
In physical systems, the function often depends on both time and space (electromagnetic fields, quantum wave functions), whereas in mathematical frameworks (calculus, analysis), the function is often dependent upon a single parameter, which may stand for an $n$-tuple; in Cartesian space the point is represented by $x=(x_1,x_2,\dots,x_n)$.
In a direct product of (real) axes, as opposed to a union of $n$ real axes, which only includes points on the axes, the points have $n$ coordinates. Some systems naturally demand the direct product of spaces, such as classical mechanics and statistical mechanics, whose configuration-space variables are written with a bold typeface, or with a bar or arrow above the variable, as in $\boldsymbol{x}$, $\vec{x}$, and $\bar{x}$. But that is just motivation to grasp the single-variable, single-valued function framework, which is used again and again in piecewise solutions of more complicated configurations, or as a dimensional reduction of the problem to the single-variable case, as in central-force dynamics.
The slope of a function at a given point in the domain, $x$, is the derivative of that function at $x$. If the function is continuous on the entire interval, then the value $f(x)$ is near the values the function takes on a local neighborhood of $x$ of radius epsilon, $(x-\epsilon, x+\epsilon)$, or $|\Delta x|\lt\epsilon$. And if the function is smooth, then the slope of the function is defined at all points in the domain, as opposed to a curve with a cusp or discontinuity, which is not smooth at that point and where the derivative is said to be undefined.
For $\epsilon\ll 1$, the function is approximately linear, and so the slope of the function at $x$ is the tangent to the curve at that point, or the derivative of $f$ at $x$, which is the subject at hand.
Through a forward, positive increment of the argument of $f$, together with the difference in the ordinate values (Delta-$f$), we formulate the derivative of any smooth, continuous function $f$:
$$ \Delta f = f(x+\epsilon) - f(x) $$That is, the difference between $f$ at $x+\epsilon$ and $f$ at $x$, for some small quantity $\epsilon$, eventually considered as an infinitesimal, is Delta-$f$. For polynomial functions, and indeed for any smooth, continuous $f$ expressible as a power series (to be demonstrated in the Taylor Series article), it will be shown that for $\epsilon\ll 1$ the quantity Delta-$f$ can be reduced to a first-order appearance of $\epsilon$, as a result of the binomial theorem and the relative size of epsilon.
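For a concrete instance, take $f(x)=x^2$:
$$ \Delta f = (x+\epsilon)^2 - x^2 = 2x\epsilon + \epsilon^2 $$For $\epsilon\ll 1$ the $\epsilon^2$ term is negligible next to the term with a single factor of $\epsilon$, leaving $\Delta f \approx 2x\epsilon$.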
Each point on the real number line is a cluster point, meaning that for any non-zero number $\epsilon\in\mathbb{R}$, the interval $(x-\epsilon, x+\epsilon)$ contains an infinite number of points, so we can properly talk about any small, local distance from $x$, $\epsilon\lt 1$.
For the derivative of $f$ in the parameter $x$, we divide by $\epsilon$, or multiply Delta-$f$ by the inverse of $\epsilon$, to get the rise-over-run at the origin-offset point $x$. That is a triangle with its hypotenuse along the tangent of interest, its vertical right-angle side (there are three sides) as the $\Delta f$ (which could be plus or minus), and its horizontal right-angle side extending a distance $\epsilon$ from the point $(x, f(x))$ to the point $(x+\epsilon, f(x))$.
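Dividing the rise by the run and letting the run shrink gives the slope at $x$:
$$ f'(x) = \lim_{\epsilon\to 0}\frac{\Delta f}{\epsilon} = \lim_{\epsilon\to 0}\frac{f(x+\epsilon)-f(x)}{\epsilon} $$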
This is the definition of the tangent of a function $f$, with respect to its parameter (not perimeter!), or point in abscissa and domain, $x$. The tangent is independent of the infinitesimal, $\epsilon$, more conventionally written $dx$, with the infinitesimal operator $d$ acting on the domain variable $x$. Using the binomial theorem we will see that we only need the first order in the infinitesimal: higher-order terms, $(dx)^n$, are vastly smaller than the terms appearing with one factor of $dx$, and that single surviving factor is cancelled in the derivative by the denominator, the infinitesimal change in abscissa, $\Delta x\to \epsilon\to dx$. $dx$ is defined in terms of the interval of the function's domain, is also referred to as Delta-$x$, $\Delta x$, or $\delta x$, and is always the infinitesimal measure on $x$ (which could be in configuration space or the time parameter, or a combination).
Case $f$ is a Power Function
So, we will calculate the tangent for $f(x)=x^r$, for the case $x=0$ and for non-zero $x$, for general power $r\ne -1$, starting with the rise of the tangent at $x$, eq. \eqref{PowerFuncDelta}.
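In terms of the infinitesimal run, $dx$, that rise is
$$ \Delta f = (x+dx)^r - x^r \label{PowerFuncDelta} $$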
Case $x=0$
For $x=0$, the numerator of the derivative is just $(dx)^r$, and so the tangent is $\lim_{dx\to 0} \frac{(dx)^r - 0^r}{dx} = \lim_{dx\to 0} (dx)^{r-1}$. If $r\gt 1$, the limit is zero; if $r=1$, the limit is one; and if $0\lt r\lt 1$, the limit diverges to infinity. So, the slope of a power function at the origin is one for $r=1$ (the identity function, a line through zero), zero for all $r\gt 1$, and infinite for $0\lt r\lt 1$.
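A quick numerical sketch (plain Python; the $r$ and $dx$ values are arbitrary test choices) illustrates the three cases:

```python
# Forward-difference slope of f(x) = x**r at the origin, for shrinking dx.
for r in (2.0, 1.0, 0.5):
    for dx in (1e-2, 1e-4, 1e-6):
        slope = (dx**r - 0.0**r) / dx  # equals (dx)**(r - 1)
        print(f"r={r}: dx={dx:.0e} -> slope = {slope:.4g}")
# r=2.0: slopes shrink toward 0; r=1.0: slope is exactly 1; r=0.5: slopes diverge.
```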
Case $x\ne 0$
To recap Ancient Functions, the Binomial theorem is $(x+dx)^n=\sum\limits_{k=0}^{n} \binom{n}{k} x^{n-k}(dx)^k$, where the binomial coefficient is also referred to as an $n$-choose-$k$ factor; for Natural number $n$ the sum terminates after the $k=n$ term ($n+1$ terms in all), while for non-integer real powers the series does not terminate algebraically. When we normalize the binomial as $(x+dx)^n =x^n(1+dx/x)^n$, we get:
$$ (x+dx)^n = x^n\left(1+\frac{dx}{x}\right)^{n} = x^n\sum\limits_{k=0}^{n}\binom{n}{k}\left(\frac{dx}{x}\right)^{k} $$
The formula is derived as follows, paying homage to Leibniz' convention for $\Delta$, or Delta, the first letter of the Greek word for difference, diaphora, which signifies the change in whatever is formulaically contiguous immediately to its right: given $f(x)=x^n$, we calculate the change in $f$ at $x$ for an arbitrary deviation $dx$ from $x$, the arbitrariness in this case being that the distance from $x$ is measured but, algebraically, much less than unity. One can think of the derivative of $f_n$ at $x$ as associated with a right triangle, where the tangent is the hypotenuse, and the right-angle sides are the rise and run.
This says that if we want to know the difference between $f$ at $x$ and $f$ at some lesser/greater amount $dx$ away from $x$, then we need the Binomial Theorem to solve this difference equation. The Binomial Theorem states that for natural number $n\geq 0$ (first published by Bhāskara II in 1150, along with the factorial concept): $(a + \epsilon)^n= n! \sum\limits_{k=0}^n (k!(n-k)!)^{-1} a^{n-k} \epsilon^k$. The factorial notation ($!$), introduced in 1808, is shorthand for the multiplicative sequence $\prod\limits_{q=1}^n q$ (where $ab=ba$, the commutative property of integers and reals, is used: $(1)(2)\cdots(n)=(n)(n-1)\cdots(1)$):
$$ q!=(q)(q-1)(q-2)\cdots(q-(q-1)) $$Where the last term can also be written $(q-(q-1))=1$, which is useful for organization. For $q=5$, we have $5!=(5)(4)(3)(2)(1) = 120$, or 120 ways of arranging 5 distinguishable components. The binomial coefficient, eq. \eqref{Binomial_Coefficient_eq}, $\binom{n}{k}=\frac{n!}{k!(n-k)!}$, is fundamentally composed of a numerator which counts the number of ways of choosing $k$ components from $n$ choices, that is, $n$ possibilities for the first choice, $(n-1)$ possibilities for the next choice, and so on through the $k$th. Note that counting just one of the two types of factors is complete information about each expanded term: specifying which $k$ factors supplied the $\epsilon$ fixes the remaining $n-k$ factors of $a$. Since each arrangement with the same $k$ elements is practically the same, one divides the numerator by $k!$, the number of ways $k$ components can be rearranged (permuted), diminishing the count of terms in the numerator, but never below unity.
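As a worked instance, for $n=3$ the theorem gives
$$ (x+dx)^3 = x^3 + 3x^2\,dx + 3x\,(dx)^2 + (dx)^3 $$where the coefficient $3=3!/(1!\,2!)$ counts the ways to choose which one of the three factors supplies the $dx$.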
And so, terms with more numerous factors of $\Delta x$ are even more negligible than the squared-infinitesimal term. So we keep only the first two terms of the binomial expansion, labelling with big-O ($\mathcal{O}$) the largest of the terms we are omitting from consideration:
$$ \Delta f=f(x+\Delta x)-f(x) = n!\left[ (n!)^{-1}x^n (\Delta x)^0 + (1!(n-1)!)^{-1}x^{n-1}(\Delta x)^1 + \mathcal{O}\left((\Delta x)^2\right) \right] -x^n $$ $$ = n!(n!)^{-1}x^n + n!((n-1)!)^{-1}x^{n-1}(\Delta x)^1 + \mathcal{O}\left((\Delta x)^2\right) -x^n $$And since $q^0=1$, $(n!)(n!)^{-1}$ is just unity, and $n!((n-1)!)^{-1}=n$, we have:
$$ \Delta f= x^n + nx^{n-1}\Delta x + \mathcal{O}\left((\Delta x)^2\right) - x^n $$ $$ = nx^{n-1} \Delta x + \mathcal{O}\left((\Delta x)^2\right) $$So in the limit $\Delta x\to0$ the higher-order terms vanish from the quotient, and the derivative gives us the slope of the function precisely at the point $x$, denoted $\frac{df}{dx}=f'(x)=f^{(1)}(x)=nx^{n-1}$.
Since the slope of the tangent equals the rise-over-run, $\Delta f/ \Delta x$, each term of the function (see Taylor Series) which goes like $Ax^n$ in $x$ contributes a term $nAx^{n-1}$, since you can multiply the function in the derivation by $A$ without changing the dependence on $n$.
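A minimal numerical sketch of the power rule (plain Python; $A$, $n$, $x$, and $dx$ are arbitrary test values) compares the rise-over-run against $nAx^{n-1}$:

```python
# Forward-difference slope of f(x) = A*x**n versus the power-rule prediction.
A, n, x, dx = 3.0, 5, 2.0, 1e-7
numeric = (A * (x + dx)**n - A * x**n) / dx  # rise over run
exact = n * A * x**(n - 1)                   # n A x^(n-1)
print(numeric, exact)                        # agree to ~6 significant figures
```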
One can verify that the derivative of the anti-derivative is the same function of $x$, where one of the algebraic forms of unity, $(q+1)(q+1)^{-1}=1$, is utilized. It is not obvious that the slope of the Natural logarithm of $x$ is $x^{-1}$, but the derivatives and anti-derivatives of every other power of $x$ are dictated by the above formula.
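For instance, the anti-derivative of $x^n$ (any power $n\ne -1$) is $x^{n+1}(n+1)^{-1}$, and applying the slope formula to it returns
$$ D_x\left(\frac{x^{n+1}}{n+1}\right) = \frac{(n+1)x^{n}}{n+1} = x^n $$recovering the original power through the unity form $(n+1)(n+1)^{-1}=1$.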
The Chain Rule: $D_x(f \circ g(x))$
If $f(x)$ and $g(x)$ are continuous, smooth functions on an interval, $X$, then so is their composition, $f \circ g(x)$, read $f$ of $g$ and equivalently written $f(g(x))$. This formula handles any function's composition which can be broken up into two parts (an outer and an inner part), where one function could be a polynomial and the other exponential, or trigonometric. So we break it down the same way as for one function, and then introduce the algebraic unity form, $dg/dg$, to obtain the well-known formula.
Note that if $g(x)$ is the identity, then we immediately see that $d(f\circ g(x))=df(x)$, as expected. Here we recognize that we're dealing with an infinitesimal in $g$, or $dg(x)=g(x+\Delta x) - g(x)$, and introduce it using the multiplicative factor of unity.
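Carried out on the difference quotient, with the unity factor $dg/dg$ inserted:
$$ \frac{d(f\circ g)}{dx} = \lim_{\Delta x\to 0}\frac{f(g(x+\Delta x))-f(g(x))}{dg}\,\frac{dg}{\Delta x} = \frac{df}{dg}\,\frac{dg}{dx} \label{Chain_Rule_eq} $$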
Eq. \eqref{Chain_Rule_eq} is the Chain Rule.
The Slope of $f_h(x)$
The hypotenuse function, eq. \eqref{f_h}, as a function of $x$, or $f_h(x)$, for $x$ on the interval $[1.5,10]$.
Since $f_h$ is a composition of the outer function, $f$, and the inner function, $g$, $f_h=f\circ g$, we just need the derivatives of $f$ and $g$. With $f=g^\frac{1}{2}$, its derivative is $f'=\frac{1}{2} g^{-\frac{1}{2}}$. And for the inner part, $g=(x^2 + a^2)$, so its derivative is $g'=2x$. Putting them together with the Chain Rule, we have:
$$ f_h'(x) = f'(g)\,g'(x) = \frac{1}{2}\left(x^2+a^2\right)^{-\frac{1}{2}}(2x) = \frac{x}{\sqrt{x^2+a^2}} $$
Here we recognize a slope with an asymptote of unity, now quantified. To find how much less than one the slope is for $x\gt a$, we need the Taylor Series to give us its first-order estimate.
The slope of the hypotenuse function, as a function of $x$, or $f_h'(x)$, for $x$ bounded from below by the door thickness, $a=1.5$.
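A numerical sketch (plain Python; the $x$ values are arbitrary test points, with $a=1.5$ as in the figure) compares the chain-rule slope against a forward difference:

```python
# Slope of the hypotenuse function f_h(x) = sqrt(x**2 + a**2):
# chain-rule result x/sqrt(x**2 + a**2) versus a forward difference.
import math

a = 1.5

def f_h(x):
    return math.sqrt(x**2 + a**2)

for x in (1.5, 3.0, 10.0, 100.0):
    dx = 1e-7
    numeric = (f_h(x + dx) - f_h(x)) / dx
    exact = x / math.sqrt(x**2 + a**2)
    print(f"x={x}: numeric={numeric:.6f}, exact={exact:.6f}")
# The slope climbs toward its asymptote of unity once x >> a.
```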
The Slopes of $\log_b(x)$ and $b^x$
The derivation of the derivative of the logarithm uses identities which should be familiar from the Natural Exponentials article, but here's a sketch.
$$ D_x\log_b(x)=\lim_{\epsilon\to 0}\frac{\log_b(x+\epsilon)-\log_b(x)}{\epsilon}=\lim_{\epsilon\to 0}\frac{\log_b\left(1+\frac{\epsilon}{x}\right)}{\epsilon}=\lim_{\epsilon\to 0}\frac{\log_e\left(1+\frac{\epsilon}{x}\right)}{\epsilon\,\log_e(b)}=\frac{1}{x\log_e(b)}$$The first-order approximation, for small $\epsilon$, of the natural log of $1+\epsilon$ is $\log_e(1+\epsilon) \approx \epsilon$, where the accuracy increases with smaller $\epsilon$, which is proven later, after the Taylor series.
Here the logarithmic change of base formula was employed, as described in the Natural Exponentials article.
The derivative of the exponential uses the familiar property of exponentials $b^x$ that for small argument, $\epsilon$, they are approximated by $b^\epsilon \approx 1+\log_e(b)\,\epsilon$, as graphically proven in the Natural Exponentials article for $b\in\{2,e,3\}$, and analytically proven following the Taylor series.
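Using that approximation in the difference quotient gives, in one line,
$$ D_x b^x = \lim_{\epsilon\to 0}\frac{b^{x+\epsilon}-b^x}{\epsilon} = b^x\lim_{\epsilon\to 0}\frac{b^{\epsilon}-1}{\epsilon} = b^x\log_e(b) $$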
And so for base-$e$, where $\log_e(e)=1$, we have $D_x(\ln(x)) = \frac{1}{x}$ and $D_x(e^x) = e^x$.
We could arrive at the same result for $D_x b^x$ using the change of base rule (see the Natural Exponentials article) and then the differential definition of $e^x$ and the chain rule, as in $b^x=e^{x\ln(b)}$, $D_x b^x=D_x e^{x\ln(b)} = e^{x\ln(b)}\ln(b) = b^x\ln(b)$. Similarly for the arbitrary log, using the log change of base rule, $D_x\left(\frac{\log_e(x)}{\log_e(b)}\right) = \frac{1}{\log_e(b)}\frac{1}{x}$.
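A closing numerical sketch (plain Python; $b=2$ and $x=3$ are arbitrary test values) checks both slopes at once:

```python
# Forward-difference checks of D_x(log_b x) = 1/(x ln b) and D_x(b**x) = b**x ln b.
import math

b, x, eps = 2.0, 3.0, 1e-7
d_log = (math.log(x + eps, b) - math.log(x, b)) / eps
d_exp = (b**(x + eps) - b**x) / eps
print(d_log, 1.0 / (x * math.log(b)))  # both ~0.4809
print(d_exp, b**x * math.log(b))       # both ~5.5452
```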