Modernization of the Pythagorean Theorem with Help from the Binomial Theorem, and the post-Euler era

The Babylonians knew the triangle, be it acute, obtuse, or the most useful form, the right triangle (2000 BC).

The three interior angles of a triangle add to 180°, a result established in Euclid's Elements (300 BC), though much of that work could have been done earlier by students of the Pythagorean school. The proof lies in considering not only the interior angles, but also their supplementary (exterior) angles. So, we construct the sum of the angles of a triangle with an arrow on the line whose segment is a side, or edge, of the triangle. If a path is drawn around the triangle, a vector pointing in the forward direction turns one whole turn (360°) upon completing the path: it travels along the three edges and stops at the three vertices, where it rotates through the exterior angle to face the subsequent edge's direction. The three exterior angles therefore sum to 360°, and since each interior angle is 180° minus its exterior angle, the interior angles sum to $3\times 180^\circ - 360^\circ = 180^\circ$.

When a line (dashed here) crosses another line, it makes two different angles, one acute (θ) and one obtuse (φ), unless the intersection is perpendicular (clearly not the case here). The two angles sum to 180°, half the full angle around a point (360°).
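The turning argument can be checked numerically. What follows is a minimal Python sketch, not part of the original discussion: the function name `turning_angles` and the example vertices are made up for illustration. It walks around a triangle, measures the signed turn made at each vertex, and recovers each interior angle as the supplement of that turn.

```python
import math

def turning_angles(vertices):
    """Signed turn (degrees) made at each corner when walking the closed path."""
    n = len(vertices)
    turns = []
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        x2, y2 = vertices[(i + 2) % n]
        ux, uy = x1 - x0, y1 - y0   # heading along the current edge
        wx, wy = x2 - x1, y2 - y1   # heading along the next edge
        cross = ux * wy - uy * wx   # sine-like part of the turn
        dot = ux * wx + uy * wy     # cosine-like part of the turn
        turns.append(math.degrees(math.atan2(cross, dot)))
    return turns

# Arbitrary example triangle, listed counterclockwise (an assumption of this sketch).
triangle = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0)]
exterior = turning_angles(triangle)
interior = [180.0 - t for t in exterior]

print(sum(exterior))  # one whole turn, 360 up to rounding
print(sum(interior))  # 3 * 180 - 360 = 180 up to rounding
```

Any non-degenerate triangle listed counterclockwise should give the same two sums, since the argument never uses the particular side lengths.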

Right Triangles

The right triangle has two edges meeting at a 90° angle, the right-angled vertex, which is sometimes indicated by a small square. The three sides are related by the equation \begin{equation} \label{Pythagorean} a^2 + b^2 = c^2 \end{equation} where $c$ is the hypotenuse, and $a$ and $b$ are the two edges meeting at the right-angled vertex. This relation is useful any time a position is decomposed into orthogonal components, each at right angles to the others, as in Cartesian coordinates in Euclidean space.

The area of a rectangle is the width times the height. The area of a square, with four equal edges, and four right angles, and edge $x$, is $x^2$ ($x$-squared). The right-triangle theorem is geometrically a relationship of areas.

The area of a right triangle is half the product of the two non-hypotenuse edges. Cutting a rectangle along its diagonal leaves two identical right triangles, which accounts for the factor of a half in the formula for the area of a right triangle: $A=ab/2$.

An equivalent statement for $x^2$ is $x*x$ (or $x$ times $x$; the multiply operation is often implied when the two multiplicands are beside each other, $xx$). Here $x$ is the variable, because it could be $a^2$, $b^2$, or $c^2$, that is, $x\in \{a,b,c\}$. $x^2$ is an example of exponential notation, of which the general form is the $n$th power of $x$: $x^n=x*\ldots$, where the "$\ldots$" ellipsis stands for exactly $n-1$ further factors of $x$. Here $n$ is just used as an index; in advanced texts it is used alongside other symbolic indexes in abstraction (such as $df:\mathbb{R}^k\to\mathbb{R}^l$).

Since a regular polygon has an area which depends on the number of sides and the length of the side, and because we know the area of an equilateral triangle is $0.5bh$, half the base times the height, with the height calculated as $b\sin{(180^\circ{}/3)}$, the triangle's area is quadratic in its side (just like a square, but with a trigonometric factor). Regular polygons are $n$-sided flat shapes with equal sides and with the angle between adjacent sides being the same all the way around. Really, the Pythagorean areas are applicable to regular polygons, not just squares: when the same regular polygon is built on each edge of the right triangle, the $n$-factors and the trig-factors cancel.
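The cancellation can be seen with a few lines of Python. This is only a sketch: it assumes the standard area formula for a regular polygon, $n s^2 / (4\tan(\pi/n))$, which is not written out in the text above, and the helper name `regular_polygon_area` and the 3-4-5 triangle are chosen for illustration.

```python
import math

def regular_polygon_area(n_sides, side):
    """Area of a regular polygon with n_sides edges, each of length side."""
    # Standard formula n * s^2 / (4 * tan(pi / n)): quadratic in the side length.
    return n_sides * side**2 / (4.0 * math.tan(math.pi / n_sides))

a, b = 3.0, 4.0
c = math.hypot(a, b)  # hypotenuse of the example right triangle

for n in (3, 4, 5, 6):
    on_legs = regular_polygon_area(n, a) + regular_polygon_area(n, b)
    on_hypotenuse = regular_polygon_area(n, c)
    # The common factor n / (4 tan(pi/n)) multiplies a^2 + b^2 and c^2 alike.
    print(n, on_legs, on_hypotenuse)
```

For every $n$ the two printed areas should agree up to rounding, because the shape-dependent factor is the same on all three edges.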

Binomial Theorem

If you hold one of the edges of the right triangle fixed to a constant value, then you get a function $f$ for the hypotenuse as a single variable function (motivation is more advanced equations, which use the same form). \begin{equation} \label{f_h} c=f_h(x)=\sqrt{(x^2 + a^2)}=(x^2 + a^2)^\frac{1}{2} \end{equation} where the $x$ was substituted for $b$ in the right-triangle formula, $a$ is set/fixed as a constant, and the subscript on $f$, stands for hypotenuse.

It was the year 1620 when Henry Briggs discovered a series solution for the binomial with natural-number exponent, but it was not until 1665 that Isaac Newton saw that it was a general formula for a normalized binomial form (rational exponent, like we found here). [1]

The generalized binomial theorem starts with the binomial theorem, which states that the result of multiplying out the binomial raised to the $n$th power is given by the following summation: $$ (x + y)^n=\sum_{i=0}^n{\binom{n}{i}x^i y^{n-i}} $$ where $\binom{n}{i}= \frac{n!}{i!(n-i)!}$ is the binomial coefficient, a notation for a function of the two arguments $n$ and $i$, read as "$n$ choose $i$". The same holds for the normalized binomial, $$ (1+x)^n=\sum_{i=0}^n{\binom{n}{i}1^{n-i}x^i} = \sum_{i=0}^\infty \frac{n(n-1)\cdots (n-i+1)}{i!} x^i $$ with the generalization being to advance the upper limit to infinity. For natural $n$ this is justified by the fact that the series terms with $i\ge n+1$ have a factor of zero in them (due to $(\dots (n-(n+1)+1)\dots)$ in the numerator). In addition to the natural numbers, the series lends itself to fractional exponents, for which no factor vanishes and the series doesn't terminate. As a check, this is consistent with knowing that a normalized binomial with a fractional exponent (a rational, $\mathbb{Q}$, in the exponent) can result in an irrational surd ($\sqrt{2}\notin \mathbb{Q}$), which no finite sum of rational terms could produce.
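A short numerical sketch may help make the series concrete. The function name `binomial_series` and the sample inputs are illustrative choices, not from the original text. Each coefficient $n(n-1)\cdots(n-i+1)/i!$ is built from the previous one, so the same loop handles natural and fractional exponents.

```python
def binomial_series(x, n, terms):
    """Partial sum of (1 + x)**n using the first `terms` series terms."""
    total = 0.0
    coeff = 1.0  # coefficient for i = 0
    for i in range(terms):
        total += coeff * x**i
        # Next coefficient: multiply by (n - i) / (i + 1).
        coeff *= (n - i) / (i + 1)
    return total

# Natural exponent: coefficients vanish for i >= n + 1, so the sum terminates.
print(binomial_series(2.0, 3, 10), (1 + 2.0) ** 3)

# Fractional exponent: the series does not terminate, but converges for small x.
print(binomial_series(0.25, 0.5, 30), (1 + 0.25) ** 0.5)
```

In the first case extra terms add nothing because the coefficient becomes exactly zero; in the second, more terms keep refining the value.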

Monomials of the first five natural number exponent powers (skipping 0): $f_1(x)=x$ is the red one, $f_2(x)=x^2$ is a little bent (cobalt), and $f_3(x)=x^3$ is next most bent (pale green), with two more of the monomials showing progressively more bend.

The plotting of monomials is most dramatic around 1, simply because $1^n=1$ for all $n\in \mathbb{N}$; studying their behavior around 1 shows how these single variable functions differ on the two sides of it (an exercise). Just as the monomials are found in the generalized binomial formula, they also occur naturally in the Taylor series.

A web search for "history of the square root" returns high quality results (Quora, Mathforum, Stackexchange, Wikipedia), where one can learn that $n$th roots were studied and written about in the early 1600s, mostly referred to as radix (since math was written about in Latin, and radix means root); the related term radical also means related to roots. So, we continue the naming convention.

Physics is very often concerned with the slope of a function, such as the slope of a car's track in the lab, or the change in potential energy for a change in position for any action-at-a-distance field like gravity or electricity. We're working our way up to calculating the slope (or tangent) at arbitrary points on a curve.

Any given number is silently "raised" to the exponent one, or I should say there is an exponent position on every set of parentheses.

$f_h(x)$ is defined from $x$ equals zero on up, and we'll look at a few such graphs below. Not all single variable functions stay finite for finite $x$: the hyperbola, being one-over-$x$ ($1/x$), can't be plotted at zero, and grows without bound for very small $x$. In order to graph $f_h$ we consider the values of this function at every point along the interval of interest in $x$, from $x_A\ge{}0$ to $x_B>x_A$, or using interval notation the domain is $X=[x_A, x_B]$.

A single variable function is the most fundamental of graphs, as it lies in two dimensions (the plane of a screen or page). To inspect an aspect of a multivariable function's behavior, one can often reduce the problem to a single variable function from some interval of the real numbers, $X\subset \mathbb{R}$, to an interval of the reals, $Y \subset \mathbb{R}$, written $f:X\to Y$.

Plot of the hypotenuse, $c=f_h(x=b)$, with edge $x$ on the abscissa, and fixed edge $a=1.5$, versus the identity function (dashed line).

This curve always has a small amount of curvature, never a straight line, but the difference between it and the straight line drawn, $f_h - f_{id}$, is decreasing with $x$.

So any line with slope one and $y$-intercept strictly between zero and the fixed side length is crossed by the curve at some point. Which just means the swinging door needs a little space around it, no matter how thin you make the door.
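To see where such a crossing happens, one can solve $\sqrt{x^2+a^2}=x+k$ for $x$. The sketch below is illustrative only: the intercept name `k` and the function `crossing_point` are assumptions introduced here, with $0<k<a$. Squaring both sides cancels the $x^2$ terms and leaves $a^2=2kx+k^2$.

```python
import math

def crossing_point(a, k):
    """x at which sqrt(x^2 + a^2) meets the line y = x + k, for 0 < k < a."""
    # Squaring sqrt(x^2 + a^2) = x + k gives a^2 = 2*k*x + k^2.
    return (a**2 - k**2) / (2.0 * k)

a = 1.5  # fixed edge, as in the plots
for k in (1.0, 0.5, 0.1, 0.01):
    x = crossing_point(a, k)
    # At the crossing, the curve and the line have the same height.
    print(k, x, math.hypot(x, a), x + k)
```

The smaller the intercept, the farther out the crossing moves, but it always exists: the clearance can shrink with a wider door, yet never reaches zero.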

The slope of the curve starts near zero for $x\ll a$, where the value of the function, the hypotenuse $c$, is close to the constant $a$ (a horizontal line, independent of $x$). The slope then increases with $x$, and for $x\gg a$ ($x$ much greater than $a$) it approaches the slope of the asymptote.

Plot of $f_h(x)$, with fixed edge $a=1.5$, and variable edge on the interval $[2,5]$, versus $x$ (dashed line).

From $x=2$ on up, plotting the hypotenuse as a function of the varying edge demonstrates getting even closer to the asymptote $f(x)=x$, with the slope getting closer and closer to one, but never acquiring that value except in the limit of infinite $x$.
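The approach of the slope to one can be estimated without calculus by a finite difference. This is a rough sketch under stated assumptions: the helper `slope`, the step size `h`, and the sample points are choices made here for illustration.

```python
import math

def f_h(x, a=1.5):
    """Hypotenuse as a function of the varying edge x, with fixed edge a."""
    return math.sqrt(x**2 + a**2)

def slope(f, x, h=1e-6):
    """Central-difference estimate of the slope of f at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

for x in (0.1, 1.5, 2.0, 5.0, 33.0):
    # The estimate climbs toward 1 as x grows, but stays below it.
    print(x, slope(f_h, x))
```

The estimates should start near zero for small $x$ and creep toward one for large $x$, matching the shape of the plots.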

The function can be rewritten an endless number of ways (because you can multiply by any representation of 1 or add any representation of zero and make another equals sign). A useful one comes from dividing through by $a$ inside the root and leaving an $a$ out front, using this rule of algebra:

$$ (ab)^\frac{1}{2}= (a)^\frac{1}{2} (b)^\frac{1}{2} $$

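Making a substitution for extra clarity of the algebra (in these two identities $a$, $b$, and $c$ stand for any positive numbers, not the triangle's edges), letting $c^2=a$ in the previous equation, we have: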

$$ ((c)^2b)^\frac{1}{2}= ((c)^2)^\frac{1}{2} (b)^\frac{1}{2} = c (b)^\frac{1}{2} $$

We can look at an alternate form of $f_h$; in particular, we will separate the function into the asymptote (on the left of the composite) and the other (compound) part on the right. Constructed this way, the hypotenuse function over the $[a, \infty)$ part looks like so:

\begin{equation} \label{f_plus} f_{h,a+}(x)=\Big(x\Big)\left(1 + \left(\frac{a}{x}\right)^2\right)^{1/2} \end{equation}

As you can see, the $x$ is on the left and the normalized binomial function is on the right. This is the correct form for $x$ greater than $a$ because the left and right factors are well behaved both at $x=a$ and as $x\to\infty$ (the ratio $a/x$ never exceeds one), as opposed to the $f_{h,a-}$ factoring introduced below, which is ill behaved as a series on that interval because its ratio $x/a$ exceeds one. A composite number is one that can be factored (multiplication) into two whole parts (where a part can itself be composite), whereas a compound number is one that is represented as the addition of two parts.

We'll explore the complicated part by next looking at the inverted monomial (one over $x$). Inversion, generally, is not simply swapping the numerator and denominator; it's a formulation in which the combination of the inverse and the original forms the identity function.

$f_{h,a+}$ is a product of $x$ and another function of $x$, \begin{equation} \label{BinomialForm} (1 + y)^{1/2} \end{equation} where $y=(a/x)^2$, which is a constant ($a^2$) times one over $x^2$. Understanding the single-term functions gives understanding of how a compound or composite function looks on different intervals, and this one is the identity times a binomial. Monomials (expressions of one term with a natural number in the exponent) are expressed by a variable, like $x$, with an exponent in $\mathbb{N}$.

A monomial multiplied by a constant has the same behavior as the monomial scaled by unity; only its amplitude is scaled by the constant. For any function of a single variable, if $g(x)=f(x)-x$, then the graphs of $f$ and $g$ are related by an affine map, that is, a linear transformation possibly combined with a vector addition (here, subtracting $x$ from the ordinate at each point of the 2D graph). Note that the $y$ substitution is of a similar form to the inverted monomial, because putting $x$ or $x^2$ in the denominator makes the function go small for large $x$; they are similar in that they represent negative exponents, the terms of a Laurent polynomial.

The right side of the formula for $f_{h,a+}$ has a similar term in it,

$$ x^{-2} = \frac{1}{x^{2}} $$

This puts $x$ in the denominator, as the dividing variable of the fraction. If we wanted to add a column to the table for $f(x)=1/x^2$, we could just square the $1/x$ column.

$$ x^{-2} = x^{(-1)(2)} = (x^{-1})^2 = \left(\frac{1}{x}\right)^2 $$

In this way, we see the relationship between the two functions of $x$. Monomials are monotonic over the domain $[0,\infty)$, which means either never increasing or never decreasing.

From the table, and by inspecting the function, inversion takes the point of origin (zero) and returns infinity, and vice versa.

For the portion of the domain less than the fixed side $a$, the function is labelled with the subscript $a-$: \begin{equation} \label{f_minus} f_{h,a-}(x)=a\left(1 + \left(\frac{x}{a}\right)^2\right)^{1/2} \end{equation} On $[0, a]$, the asymptote is $a$, on the left side of the normalized function, and multiplying it on the right is a function deviating from $1$ by the quadratic proper fraction $(x/a)^2$. This time $y=(x/a)^2$ is the quantity on which to use the binomial series. Both forms, in equations (\ref{f_plus}) and (\ref{f_minus}), are algebraically the same as $f_h(x)$: put the asymptote part of the function back inside the outer power of $1/2$ and you recover the hypotenuse function. $$ f_h(x) = \begin{cases} f_{h,a-}(x) & 0 \le x \lt a \\ f_{h,a+}(x) & a \le x \end{cases} $$ At $x=a$, $f_{h,a-}=f_{h,a+}$ (both equal $a\sqrt{2}$), so we can write the graph of $f_h$ as the union of the two graph sections, $(X,f_h)=(X_{\lt a},f_{h,a-}) \cup (X_{\ge a},f_{h,a+})$.
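That the two factorings are the same function can be spot-checked in Python. This is just an illustrative sketch; the function names mirror the subscripts used above and the sample points are arbitrary.

```python
import math

def f_h(x, a):
    """Hypotenuse for edges x and a."""
    return math.sqrt(x**2 + a**2)

def f_h_minus(x, a):
    """Factoring with a out front; its binomial part uses y = (x/a)^2."""
    return a * math.sqrt(1.0 + (x / a) ** 2)

def f_h_plus(x, a):
    """Factoring with x out front; its binomial part uses y = (a/x)^2. Needs x > 0."""
    return x * math.sqrt(1.0 + (a / x) ** 2)

a = 1.5
for x in (0.5, 1.0, 1.5, 2.0, 33.0):
    # All three columns agree up to rounding at every sample point.
    print(x, f_h(x, a), f_h_minus(x, a), f_h_plus(x, a))
```

The values agree everywhere for $x>0$; the split at $x=a$ matters only once the square root is replaced by a truncated binomial series.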

It took roughly two thousand years to get from the Pythagorean theorem to the generalized binomial series. Binomials were studied for a very long time, with a global history of what the formula is for $(a+b)^n$ for arbitrary positive integer $n$. To use the binomial series to better understand the plot of the hypotenuse function, we need two series: one expansion for the interval $[0,a]$, in powers of $y=(x/a)^2$, and the other for $x$ over $a$, in powers of $y=(a/x)^2$, because we need a convergent series expansion, and the binomial series for the one-half power converges only for $y\le 1$.
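The two-series idea can be sketched as follows. The names `sqrt1p_series` and `hypotenuse_series` and the default of 40 terms are assumptions made for this illustration; the code picks whichever ratio is at most one, so the binomial series for the one-half power is always used inside its region of convergence.

```python
import math

def sqrt1p_series(y, terms=40):
    """Partial sum of the binomial series for (1 + y)**0.5; intended for 0 <= y <= 1."""
    total, coeff = 0.0, 1.0
    for i in range(terms):
        total += coeff * y**i
        coeff *= (0.5 - i) / (i + 1)  # next generalized binomial coefficient
    return total

def hypotenuse_series(x, a, terms=40):
    """Hypotenuse from the series: a out front below a, x out front above a."""
    if x < a:
        return a * sqrt1p_series((x / a) ** 2, terms)
    return x * sqrt1p_series((a / x) ** 2, terms)

a = 1.5
for x in (0.5, 1.0, 3.0, 33.0):
    print(x, hypotenuse_series(x, a), math.hypot(x, a))
```

Away from $x=a$ a few dozen terms reproduce the square root closely; right at $x=a$ the ratio equals one and the series still converges, but slowly, so many more terms would be needed there.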

The plot also explains a feature of the swinging door: you only have to make the clearance in its frame a small fraction of the thickness of the door. If you had a very thick door, and didn't want to leave a big space between the closed door and its frame, you would have to radius the door, or bevel it, so that the inner edge of the door didn't obstruct the opening motion. So it is instructive to demonstrate that the narrow gap around 1.5" thick household doors corresponds to the linear, slope-one asymptote over the domain $x\gg a$ (with $a=1.5$ in.), by looking at the hypotenuse function for the configuration of a swinging door in a tight frame, at about $x=33 \textrm{ in.}$:

The hypotenuse function at $x=32$ inches and $x=34$ inches, for a fixed edge of 1.5 inches, is 32.03514 and 34.03307 inches respectively (ordinate), and has very subtle curvature (meaning the required clearance is reduced by about 0.002 inch, 0.2% of an inch, by widening the door from 32" to 34").

This plot has curvature that is imperceptible to the eye, but it means something to anyone who ever wondered how swinging doors don't hit the frame when the closed-door clearance is so tight, which we elucidate with what physicists call convergence. A function is convergent when some complicated function looks like a constant, or a monomial (like the diminishing factor times the identity monomial), over significant parts of the domain, such as while tending towards infinity. We measure the difference between the $f_{h,a+}$ function and $x$ at the width-to-thickness ratio of doors, and we see a minimal frame clearance of much less than the thickness.

The hypotenuse function looks asymptotically like a monomial, written explicitly with the exponent as $x^1$ ($x$ to the first power), not to be confused with how many terms there are inside the outermost exponent (that's what the mono and bi refer to in the nomials). It approaches the linear asymptote with slope one ($f(x)\approx x$) for $x\gg a$ ($x$ much larger than $a$, the other edge of the triangle). Here I changed the variable to $x=b$ (or $b\to x$, from $b$ to $x$) because $a$ and $b$ are associated with constants of the configuration in equation (\ref{Pythagorean}).

f(x, a=1.5) | x | Δ
32.0351 | 32 | 0.0351
33.0341 | 33 | 0.0341
34.0331 | 34 | 0.0331

So by inspection of this table we can see that the deviation of $f(x)$ from $x$ (the asymptote) is still decreasing, but the change is in the third decimal place (thousandths of an inch, not very much). Because the difference between $f(x)$ and $x$ is decreasing, the wider we make the door (distance from the hinge axle), the closer the hypotenuse is to the breadth, so the tighter we can make the frame. The data is consistent with the behavior expected from inspecting $f_h$.
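The table values can be regenerated with a few lines of Python. This is a minimal sketch, with the variable names and the rounding to four decimals chosen here to match the table's layout.

```python
import math

a = 1.5  # door thickness in inches (the fixed edge)

rows = []
for x in (32, 33, 34):  # door widths in inches
    f = math.hypot(x, a)          # hypotenuse for this width
    delta = f - x                 # clearance needed beyond the width
    rows.append((round(f, 4), x, round(delta, 4)))

for f, x, delta in rows:
    print(f, x, delta)
```

Each printed row lists the hypotenuse, the width, and their difference, in the same column order as the table.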

In this plot of the hypotenuse as a function of $x$, for real ratios of door width to door thickness, we see that the hypotenuse at $x=32$" and $x=34$" (normal door widths, as measured left to right standing behind one) is very close to the value of $x$ itself (ie, the slope $m$ in "$y=mx+b$" is getting closer to 1 with increasing $x$).

One of the most important changes to all this as you progress in your physics studies is that the hypotenuse function becomes the radius, which is decomposed into orthogonal components of position measurement. You already know orthogonal: it's what the two non-hypotenuse edges of a right triangle are to each other, two lines (along which we measure position) intersecting at a 90° angle. In three dimensions, the third component is simply superposed on the plane we know and love!
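As a small illustration of adding the third component, the right-triangle relation can be applied twice: once in the plane, then once more between the in-plane hypotenuse and the perpendicular third component. The function name `radius_3d` and the sample point are hypothetical.

```python
import math

def radius_3d(x, y, z):
    """Distance from the origin, built from two right triangles."""
    planar = math.hypot(x, y)     # hypotenuse within the x-y plane
    return math.hypot(planar, z)  # planar distance and z are perpendicular

# Same result as taking the square root of x^2 + y^2 + z^2 directly.
print(radius_3d(2.0, 3.0, 6.0), math.sqrt(2.0**2 + 3.0**2 + 6.0**2))
```

Both printed numbers should be close to 7, since $4+9+36=49$.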

If you're wondering what the next step is to analyze the nature of this single variable function, then you're in the same boat as anyone studying the works of Pythagoras and Euclid was in for hundreds of years. In fact, there wasn't another breakthrough on this question until 1665, when the hypotenuse function $f(x)$ was recognized as a binomial function with a power of one half.

A notational characteristic of math and physics is that all constants and variables are a single character in length (uppercase or lowercase, boldfaced, script, calligraphic, Greek, and more variants). There is definitely convention as to where character variants are used in the math and physics frameworks. The significance of a formula is summed up as being a function of a single variable (conventionally $x$, such as in the area of any particular regular polygon with edge $x$), or the formula could be a function of more than one variable, as in $f(a, b)\to{}f(x_1,x_2)$ for the hypotenuse length of a right triangle.

The Pythagorean theorem lets us solve problems like calculating how much time it takes to “motor” straight across a river with uniform flow and parallel banks. We go straight across a river by knowing the speed of our boat in still water, then angling it so that the centerline from the point of the bow through the middle of the stern lies along the hypotenuse of a right triangle of velocities. One edge of the triangle is parallel to the bank, with this component being equal and opposite to the flow rate of the river, and the other edge, along the line going straight across, is whatever remains of the boat's speed.
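The velocity triangle translates directly into a short calculation. The sketch below is illustrative: the function name `straight_crossing` and the sample numbers (a 100 m river, a boat doing 5 m/s in still water, a 3 m/s current) are assumptions, not values from the text.

```python
import math

def straight_crossing(width, boat_speed, flow_speed):
    """Return (crossing time, upstream heading angle in degrees) for a straight crossing."""
    if boat_speed <= flow_speed:
        raise ValueError("the boat must be faster than the current to go straight across")
    # Boat speed is the hypotenuse; the along-bank edge cancels the flow.
    across_speed = math.sqrt(boat_speed**2 - flow_speed**2)
    heading = math.degrees(math.asin(flow_speed / boat_speed))
    return width / across_speed, heading

time_s, heading_deg = straight_crossing(width=100.0, boat_speed=5.0, flow_speed=3.0)
print(time_s, heading_deg)  # 3-4-5 triangle: 4 m/s across, so about 25 s
```

With these numbers the across-river speed is the remaining edge of a 3-4-5 triangle, which is why the time comes out as the width divided by 4 m/s.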

There is a Physics Stackexchange problem on a river and a swimmer focusing on a landmark on the other side. It asks about the path across a river when the swimmer (or motor boat) is always pointed at a fixed point on the bank, directly across the river from the starting point, but the question has not been dismissed, because it is an uncommon exercise: a swimmer with a focal direction. Let's say the swimmer swims twice as fast as the river flow, but the river is three times as wide as the bank-wise distance up from the start point to the focal point; then the swimmer will be taken downstream from the start point for a good portion of the journey. In general, a Newtonian equation of motion can be written for the components along the right-triangle directions, which may be as hard to solve as the catenary solution of the rope suspended at two ends, but there's always numerical solution of differential equations through integration.
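A numerical integration of this kind can be sketched with simple Euler steps. Everything here is an assumption made for illustration, not the Stackexchange poster's setup: the function name `swim_toward_landmark`, the coordinate choice (across-river $x$, along-bank $y$ with the flow in the $+y$ direction), the step size, and the stopping tolerance.

```python
import math

def swim_toward_landmark(swim_speed, flow_speed, width,
                         upstream_offset=0.0, dt=1e-4, tol=1e-3, t_max=100.0):
    """Euler-integrate a swimmer who always aims at a landmark on the far bank.

    Returns (elapsed time, largest downstream displacement reached).
    """
    x, y, t = 0.0, 0.0, 0.0
    target_x, target_y = width, -upstream_offset  # upstream is the -y direction
    max_drift = 0.0
    while t < t_max:
        dx, dy = target_x - x, target_y - y
        r = math.hypot(dx, dy)            # distance still to go
        if r <= tol:
            break
        vx = swim_speed * dx / r          # swimming component across the river
        vy = swim_speed * dy / r + flow_speed  # along-bank component plus the current
        x, y, t = x + vx * dt, y + vy * dt, t + dt
        max_drift = max(max_drift, y)
    return t, max_drift

# Swimmer twice as fast as the flow, landmark directly across a river of unit width.
print(swim_toward_landmark(swim_speed=2.0, flow_speed=1.0, width=1.0))
```

For a landmark directly across, the classic closed-form crossing time, width times swim speed divided by (swim speed squared minus flow speed squared), gives a handy check on the integration. The scenario in the text, with the landmark upstream by a third of the width, can be tried by passing `upstream_offset=width / 3`.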



Copyright © 2025 Gabriel Fernandez