Modernization of the Pythagorean Theorem

By about 2000 BC the Babylonians knew the triangle, whether acute, obtuse, or its most useful form, the right triangle.

The three interior angles of a triangle add to 180°, a result recorded in Euclid's Elements (300 BC), though much of that work could have been done by students of the Pythagorean school in the preceding centuries. The proof considers not only the interior angles, but also their supplementary (exterior) angles. We construct the sum of the angles of a triangle by placing an arrow on the line that contains each side, or edge, of the triangle. If a path is drawn around the triangle, a vector pointing in the forward direction makes one whole turn upon completing the path: it travels along the three edges and stops at the three vertices, where it rotates to face the direction of the next edge.

Figure 1: When a line (dashed here) crosses another line obliquely, it makes two angles with the crossed line, one acute (θ) and one obtuse (φ), which sum to 180°, half of a full turn (360° around).

At each vertex, extend the incoming side beyond the vertex. The change of direction is the angle $\phi_i$ between this extension and the next side, and it is supplementary to the included (interior) angle $\theta_i$, as depicted in figure 1. So for the turn at each of the three vertices, we have:

$$\theta_i=180^\circ - \phi_i$$

Because the three turns add up to one whole turn, the exterior angles satisfy:

$$ \sum_1^3 \phi_i = 360^\circ $$

To reiterate, if the diagram were endowed with three arrows showing the orientation of a walker travelling the perimeter, then upon returning to the original position the walker's arrow will have rotated three times, for a total of one whole turn. Summing $\theta_i=180^\circ-\phi_i$ over the three vertices and substituting the exterior-angle sum:

$$ \sum_1^3 \theta_i = 3(180^\circ)-360^\circ = 180^\circ $$

So we have the interior angles of a triangle sum to $180^\circ$.
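The walker argument can be checked numerically. The following is a minimal sketch, assuming a triangle given by three vertices listed counter-clockwise; the function name `exterior_turns` and the sample coordinates are illustrative choices, not part of the original argument.

```python
import math

def exterior_turns(vertices):
    """Turn angle (degrees) at each vertex for a walker going around
    a convex polygon whose vertices are listed counter-clockwise."""
    n = len(vertices)
    turns = []
    for i in range(n):
        x0, y0 = vertices[i - 1]        # previous vertex
        x1, y1 = vertices[i]            # vertex where the walker turns
        x2, y2 = vertices[(i + 1) % n]  # next vertex
        heading_in = math.atan2(y1 - y0, x1 - x0)
        heading_out = math.atan2(y2 - y1, x2 - x1)
        # Reduce the change of heading to the range [0, 360)
        turns.append(math.degrees(heading_out - heading_in) % 360.0)
    return turns

# Hypothetical sample triangle (counter-clockwise)
triangle = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0)]
phi = exterior_turns(triangle)        # exterior (turning) angles
theta = [180.0 - p for p in phi]      # interior angles
print("sum of exterior angles:", sum(phi))
print("sum of interior angles:", sum(theta))
```

Any other counter-clockwise triangle can be substituted for the sample coordinates; the two sums should stay the same up to floating-point rounding.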

Right Triangles

The right triangle has two edges meeting at a 90° angle, the right-angled vertex, which is sometimes indicated by a small square. The three sides are related by the equation $$\begin{equation} \label{Pythagorean} a^2 + b^2 = c^2 \end{equation}$$ where $c$ is the hypotenuse, and $a$ and $b$ are the two edges meeting at the right-angled vertex. This relation is useful any time a position is decomposed into orthogonal components, each at right angles to the others, as in Cartesian coordinates in Euclidean space.

Figure 2: A right triangle, with edges a and b, and hypotenuse c, where the right angle is indicated by the small square at the $90^\circ$ vertex.

The area of a rectangle is the width times the height. The area of a square, with four equal edges, and four right angles, and edge $x$, is $x^2$ ($x$-squared). The right-triangle theorem is geometrically a relationship of areas.

The area of the right triangle is half the product of the two non-hypotenuse edges. Separating a rectangle along its diagonal leaves two identical right triangles:

Figure 3: A rectangle with a diagonal drawn, separating it into two identical right triangles. The area of the rectangle is $A=ab$, and the area of each triangle is half that, $A=ab/2$. This entails the Right-Triangle Area Theorem.

This accounts for the factor of a half in the formula for the area of a right triangle: $A=ab/2$. The method also works when the shape is a parallelogram, whose area is the length of the base times the height. A diagonal of a parallelogram divides it into two congruent triangles, and so the area of a general triangle is $A_\triangle=\frac{1}{2}\text{base}\times\text{height}$.

Figure 4: A parallelogram with a diagonal $d$, and vertical sides of the end right triangles, labelled `h`, drawn. The diagonal divides the shape into two congruent triangles. The area of each triangle is $\frac{1}{2}bh$, as the area of the parallelogram is $bh$.

Regular Polygons

A regular polygon is a shape with a finite number of sides, each of the same length, $s$, and with the same angle between adjacent sides at all $n$ vertices. Its area depends on the number of sides and the length of the side. For example, the area of an equilateral triangle is $\frac{1}{2}bh$, half the base times the height; calculating the height as $h=\sqrt{b^2-(b/2)^2}=\frac{\sqrt{3}}{2}b$ from the Pythagorean theorem gives $A=\frac{\sqrt{3}}{4}b^2$. The area of the equilateral triangle is quadratic in the side length (like that of a square).

Really, the Pythagorean relation of areas applies to all regular polygons, not just squares: for a given type of regular polygon the geometric factor is the same for each of the three sides, so it simply multiplies the side-squared terms of the recognizable relationship.

$$ \frac{1}{4}na^2\cot{(\pi/n)} + \frac{1}{4}nb^2\cot{(\pi/n)} = \frac{1}{4}nc^2\cot{(\pi/n)} $$
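As a short worked step (a sketch, with the symbol $k_n$ introduced here only for illustration), the common factor can be named and divided out, which shows that the polygon-area relation is equivalent to the ordinary Pythagorean relation for any fixed $n\ge 3$:

$$ k_n=\frac{n}{4}\cot{(\pi/n)}>0, \qquad k_n a^2 + k_n b^2 = k_n c^2 \;\Longleftrightarrow\; a^2 + b^2 = c^2 $$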

Area of a Regular Polygon

The area of a regular polygon is quadratic in its side length. Each side, of length $b$, is the base of an isosceles triangle-wedge of the regular polygon, with its apex at the center. The central angle subtended by the base is $\theta=2\pi/n$. The height of the triangle is determined by the angle $\theta$ combined with the side length.

Figure 5: Decagon Area Diagram: A wedge section of a decagon is depicted with subtended angle, $\theta=(2\pi)/10$, and an overlain perpendicular from the center to the base of the isosceles, bisecting it into two mirror-image right triangles.

For an $n$-sided regular polygon, the angle subtended at the center by an isosceles wedge with base $b$ (figure 5) is $\theta=2\pi/n$. The perpendicular splits the wedge into two right triangles with half-angle $\pi/n$, so the height of the wedge is $a=\frac{b}{2}\cot{(\pi/n)}$, where the cotangent is simply one-over-tangent, and the area of the wedge is thus $A_w=\frac{1}{2}ba=\frac{b^2}{4}\cot{(\pi/n)}$. The $n$-gon consists of $n$ such wedges, so writing the side length as $s=b$ we have the following result:

$$ \begin{equation}\label{area-ngon} A_{ngon}=\frac{1}{4}ns^2\cot{(\pi/n)} \end{equation} $$
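The area formula and the Pythagorean relation of areas can be spot-checked numerically. This is a minimal sketch; the function name `regular_polygon_area` and the 3-4-5 sample triangle are assumptions made for the illustration.

```python
import math

def regular_polygon_area(n, s):
    """Area of a regular n-gon with side length s: (1/4) n s^2 cot(pi/n)."""
    return 0.25 * n * s**2 / math.tan(math.pi / n)

# Sanity checks against shapes with known areas
print("unit square:", regular_polygon_area(4, 1.0))
print("unit equilateral triangle:", regular_polygon_area(3, 1.0),
      "vs sqrt(3)/4 =", math.sqrt(3) / 4)

# Regular n-gons built on the sides of a 3-4-5 right triangle
a, b, c = 3.0, 4.0, 5.0
for n in (3, 4, 5, 10):
    lhs = regular_polygon_area(n, a) + regular_polygon_area(n, b)
    rhs = regular_polygon_area(n, c)
    print(f"n={n}: A_a + A_b = {lhs:.6f}, A_c = {rhs:.6f}")
```

The two columns agree for every $n$ because the factor $\frac{1}{4}n\cot{(\pi/n)}$ is the same for all three polygons.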

Figure 6: Regular triangles (equilateral) with areas $A_a$, $A_b$, and $A_c$, where the sides are $a$, $b$, and $c$ respectively. The areas satisfy the Pythagorean relation, $A_a + A_b = A_c$, because the trigonometric factors cancel, leaving the squared-side relation.
Figure 7: Regular squares with areas $A_a$, $A_b$, and $A_c$, where the sides are $a$, $b$, and $c$ respectively. The areas satisfy the Pythagorean relation, $A_a + A_b = A_c$, because the trigonometric factors cancel, leaving the squared-side relation.
Figure 8: Regular pentagons with areas $A_a$, $A_b$, and $A_c$, where the sides are $a$, $b$, and $c$ respectively. The areas satisfy the Pythagorean relation, $A_a + A_b = A_c$, because the trigonometric factors cancel, leaving the squared-side relation.

Binomial Theorem

If you hold one of the edges of the right triangle fixed at a constant value, the hypotenuse becomes a function $f$ of a single variable. This binomial form can be normalized and turned into a series using nothing more than the Binomial Theorem. The series for the binomial raised to the one-half power is infinite, while a binomial raised to a whole-number power gives a finite series; both appear in this section.

$$\begin{equation} \label{f_h} c=f_h(x)=\sqrt{(x^2 + a^2)}=(x^2 + a^2)^\frac{1}{2} \end{equation}$$

Here $x$ was substituted for $b$ in the right-triangle formula, $a$ is fixed as a constant, and the subscript $h$ on $f$ stands for hypotenuse.
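As a sketch of what normalizing means here (an illustration, assuming $a>0$ and $x>0$): one of the two terms is factored out of the square root so that the binomial takes the form one plus something. Which term to factor out is a choice; factoring out the larger one leaves a small quantity added to unity.

$$ f_h(x)=\sqrt{x^2+a^2}=a\left(1+\frac{x^2}{a^2}\right)^{1/2}=x\left(1+\frac{a^2}{x^2}\right)^{1/2} $$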

In 1620 Henry Briggs discovered a series solution for the binomial with natural-number exponent, but it was not until 1665 that Isaac Newton saw that it was a general formula for a normalized binomial form (with rational exponent, like the one we found here). [1]

The generalized binomial theorem starts with the binomial theorem, which states that the result of multiplying out the binomial raised to the $n$th power is given by the following summation:

$$ (x + y)^n=\sum_{i=0}^n{\binom{n}{i}x^i y^{n-i}} $$

where $\binom{n}{i}= \frac{n!}{i!(n-i)!}$ is the binomial coefficient, a notation for a function of its two arguments. It is read as "n choose i" for its combinatorial meaning, and it is the coefficient of the $i$'th-power term, $x^i$. Normalizing the binomial (setting $y=1$):

$$ (1+x)^n=\sum_{i=0}^n{\binom{n}{i}1^{n-i}x^i} = \sum_{i=0}^\infty \frac{n(n-1)\cdots (n-i+1)}{i!} x^i $$

The generalization is to let the summation index run past the exponent, which is justified because every term with $i\ge n+1$ contains a factor of zero when $n$ is a natural number. For $i=n+1$ the last factor in the numerator of the coefficient is $(n-(n+1)+1)=0$, so $n$-choose-$(n+1)$ vanishes and acts as an intrinsic delimiter (choosing two out of one, for instance, makes no sense). The next step of innovation was to extend the domain of the exponent to the rationals and the reals. For an exponent that is not a natural number, no factor in the numerator is ever zero; the factors instead continue into negative values as the index increases, giving an infinite series whose coefficients eventually alternate in sign.

$$ (1+x)^r=\sum_{i=0}^\infty{\binom{r}{i}x^i} = \sum_{i=0}^\infty \frac{r(r-1)\cdots (r-i+1)}{i!} x^i $$

For $r=0.5$, the first few terms are as follows:

$$ \binom{1/2}{0} \equiv 1 $$

$$ \binom{1/2}{1} = \frac{\frac{1}{2}}{1!} = \frac{1}{2} $$

$$ \binom{1/2}{2} = \frac{\frac{1}{2}(\frac{1}{2}-1)}{2!} = -\frac{1}{8} $$

$$ \binom{1/2}{3} = \frac{\frac{1}{2}(\frac{1}{2}-1)(\frac{1}{2}-2)}{3!} = \frac{1}{16} $$

$$ \binom{1/2}{4} = \frac{\frac{1}{2}(\frac{1}{2}-1)(\frac{1}{2}-2)(\frac{1}{2}-3)}{4!} = -\frac{5}{128} $$
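The coefficients above can be reproduced with exact rational arithmetic. This is a minimal sketch, assuming the falling-product definition of the generalized coefficient given earlier; the function name `gen_binom` is an illustrative choice.

```python
from fractions import Fraction

def gen_binom(r, i):
    """Generalized binomial coefficient r(r-1)...(r-i+1) / i! as an exact Fraction."""
    r = Fraction(r)
    coeff = Fraction(1)  # empty product: r-choose-0 is 1
    for k in range(i):
        # Multiply in the next numerator factor (r - k) and denominator factor (k + 1)
        coeff = coeff * (r - k) / (k + 1)
    return coeff

coeffs = [gen_binom(Fraction(1, 2), i) for i in range(5)]
print([str(c) for c in coeffs])

# For a natural-number exponent the coefficients vanish past i = n
print(gen_binom(3, 4))
```

Using `Fraction` rather than floating point keeps the coefficients in the same form as the hand calculation.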

Collecting the first five terms of the square root of the binomial:

$$ (1+x)^{1/2}=1 + \frac{1}{2}x -\frac{1}{8}x^2 + \frac{1}{16}x^3 -\frac{5}{128}x^4 + \cdots $$

For $x=1$, we have for square-root of two using the first four terms of the series (overshooting):

$$ \sqrt{2} \approx 1 + \frac{1}{2} -\frac{1}{8} + \frac{1}{16} = 1.4375 $$

And one more term makes it a little undershot:

$$ \sqrt{2} \approx 1 + \frac{1}{2} -\frac{1}{8} + \frac{1}{16} -\frac{5}{128} = 1.3984375 $$

Which can be compared to the calculator result:

$$\sqrt{2}=1.41421356$$

This demonstrates the success of the generalized binomial theorem: the partial sums alternately overshoot and undershoot, and the series converges to the value obtained by the previously accepted methods of calculation.
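The over/under pattern can be followed for more terms with a short script. This is a minimal sketch; the function name `sqrt_series_partial_sums` and the choice of eight terms are assumptions for the illustration. Note that $x=1$ sits at the edge of the series' interval of convergence, so the partial sums approach $\sqrt{2}$ slowly.

```python
from fractions import Fraction
import math

def sqrt_series_partial_sums(x, n_terms):
    """Partial sums of the binomial series for (1 + x)^(1/2)."""
    r = Fraction(1, 2)
    coeff = Fraction(1)   # r-choose-0
    power = Fraction(1)   # x^0
    total = Fraction(0)
    sums = []
    for i in range(n_terms):
        total += coeff * power
        sums.append(total)
        coeff = coeff * (r - i) / (i + 1)  # next generalized coefficient
        power *= x
    return sums

sums = sqrt_series_partial_sums(Fraction(1), 8)
for k, s in enumerate(sums, start=1):
    error = float(s) - math.sqrt(2)
    side = "over" if error > 0 else "under"
    print(f"{k} terms: {float(s):.7f} ({side} by {abs(error):.7f})")
```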

A note on the $r$-choose-$0$ coefficient having the value unity: it's not an arbitrary definition. The principle is that the numerator and denominator of the binomial coefficient are structured as products of terms starting with one, because multiplicatively one is what must remain when the exponent of the binomial is zero, $r=0\to \binom{0}{0}=1$ since $x^0=1$.

Binomial Approximation

Based on the binomial formula, we can make a linear approximation when the quantity added to unity is small, which is to say that the first-power term is much greater than all higher powers of the small quantity.

$$ (1+x)^r\approx 1 + rx, \qquad |x|\ll 1 $$
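To see how good the linear approximation is, the following sketch compares it with the exact power for a few small values of $x$; the function name `binomial_linear` and the sample values are assumptions for the illustration.

```python
def binomial_linear(r, x):
    """First-order binomial approximation of (1 + x)^r."""
    return 1.0 + r * x

r = 0.5
for x in (0.1, 0.01, 0.001):
    exact = (1.0 + x) ** r
    approx = binomial_linear(r, x)
    print(f"x={x}: exact={exact:.8f}, linear={approx:.8f}, error={approx - exact:.2e}")
```

The error is dominated by the first neglected term of the series, so it shrinks roughly with the square of $x$.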

Monomials

To aid intuition for the monomial factors in the normalized binomial series, here is a plot of the first five powers of $x$:

Figure 9: Monomials of the first five natural number exponent powers (skipping 0): $f_1(x)=x$ is the red one, $f_2(x)=x^2$ is a little bent (cobalt), and $f_3(x)=x^3$ is next most bent (pale green), with two more of the monomials showing progressively more bend.

The plots of the monomials are most dramatic around 1, simply because $1^n=1$ for all $n\in \mathbb{N}$; studying their behavior around 1 shows how these single-variable functions behave on either side of it (an exercise). Just as the monomials are found in the generalized binomial formula, they occur naturally in the Taylor series as well.

A web search for "history of the square root" returns high-quality results (Quora, Mathforum, Stackexchange, Wikipedia), where one can learn that $n$'th roots were studied and written about in the early 1600s. They were mostly referred to as radix, since mathematics was written in Latin and radix means root; the related term radical also refers to roots. So a naming convention keeps its legacy even when translated.

Physics is very often concerned with the slope of a function, such as the slope of a car's track in the lab, or the change in potential energy for a change in position for any action-at-a-distance field like gravity or electricity. We're working our way up to calculating the slope (or tangent) at arbitrary points on a curve.

A Simple Function

If we work with a simple, well-known function, of which we have an intuitive grasp, and we carry it through as a test for our tools, then we'll have something to look at, and remember.

Such a simple function is the binomial to the one-half power, $f_h(x)$, whose domain is $x$ equal to zero on up, and of which there are graphs below. Not all single-variable functions stay finite for finite $x$: the hyperbola, one-over-$x$ ($1/x$), can't be plotted for very small $|x|$, anything close to zero. To graph $f_h$ we consider the values of this function at every point along the interval of interest in $x$, from $x_A$ to $x_B\gt x_A$; in interval notation the domain is $X=[x_A, x_B]$.

A single-variable function gives the most fundamental of graphs. The graph of a function from one argument to one dimension (scalar, not vector) lies naturally in two dimensions (the plane of a screen or page), without any slicing or projection. To inspect an aspect of a multivariable function's behavior, we can usually reduce the problem to a function on some interval, contained in a ball of space; an example is the reduction of a 3-dimensional central-force problem, such as the Earth-Sun system in the isolated two-body approximation, to a planar elliptic orbit.

Figure 10: Plot of the hypotenuse, $c=f_h(x=b)$, with edge $x$ on the abscissa, and fixed edge $a=1.5$, versus the identity function (dashed line).

This curve always has a small amount of curvature, never a straight line, but the difference between it and the straight line drawn, $f_h - f_{id}$, is decreasing with $x$.

So every line with slope one and a $y$-intercept between zero and the fixed side length is crossed by the curve at some point. This just means the swinging door needs a little space around it, no matter how thin you make the door.

For $x\ll a$ the curve is nearly a horizontal line (close to the constant $a$); as $x$ grows the slope increases, and for $x\gg a$ ($x$ much greater than $a$) it approaches the slope of the asymptote, the identity.

Figure 11: Plot of $f_h(x)$, with fixed edge $a=1.5$, and variable edge on the interval $[2,5]$, versus $x$ (dashed line).

From $x=2$ on up, plotting the hypotenuse as a function of the varying edge demonstrates getting even closer to the asymptote $f(x)=x$, with the slope getting closer and closer, but never acquiring a value of one, except at infinity.
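The slope of the curve can be estimated numerically before any calculus is introduced. The sketch below uses a central finite difference, which is a technique swapped in here for illustration; the function names `f_h` and `slope`, the step size `h`, and the sample points are assumptions.

```python
import math

def f_h(x, a=1.5):
    """Hypotenuse as a function of the variable edge x, with fixed edge a."""
    return math.hypot(x, a)

def slope(f, x, h=1e-6):
    """Central finite-difference estimate of the slope of f at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

for x in (0.0, 0.5, 2.0, 5.0, 33.0):
    print(f"x={x}: slope ~ {slope(f_h, x):.6f}")
```

The estimates start near zero at $x=0$ and climb toward one without reaching it, consistent with the plots.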

It took roughly 2,000 years to get from the Pythagorean theorem to the generalized binomial series, so it is informative to know that binomials were studied for a very long time, with a global history of finding the formula for $(a+b)^n$ for arbitrary $n$.

The plot also explains a feature of the swinging door: you only have to make the clearance in its frame a small fraction of the thickness of the door. If you had a very thick door, and didn't want to leave a big space between the closed door and its frame, you would have to radius the door, or bevel it, so that the inner edge of the door didn't obstruct the opening motion. So it is instructive to demonstrate that the narrow gap around 1.5" thick household doors corresponds to a linear, slope-one asymptote over the domain $x\gg a$ (with $a=1.5$), by looking at the hypotenuse function over the configuration of a swinging door in a tight frame, at about $x=33 \textrm{ in.}$:

Figure 12: The hypotenuse function at x=32 inches to x=34 inches, for a fixed edge of 1.5 inches, is 32.03514 and 34.03307 inches respectively (ordinate), and has very subtle curvature (meaning the clearance is reduced by about 0.002 inch, or 0.2% of an inch, by widening the door from 32" to 34").

This plot has curvature that is imperceptible to the eye, but it means something to anyone who has wondered how swinging doors don't hit the frame when the closed-door clearance is so tight, which we elucidate with what physicists call convergence. A function converges when some complicated function looks like a constant, or a monomial (like some factor times the identity monomial), over significant parts of the domain, such as while tending towards infinity. We measure the difference between the $f_h$ function and $x$ at the width-to-thickness ratio of doors, and we see a minimal frame clearance of much less than the thickness.

The hypotenuse function looks asymptotically like a monomial, written explicitly with the exponent as $x^1$ ($x$ to the first power); the exponent is not to be confused with the number of terms inside the outermost exponent (that's what the mono and bi refer to in the nomials). It has a linear asymptote with slope one ($f(x)\approx x$) for $x\gg a$ ($x$ much larger than $a$, the other edge of the triangle). Here we changed the variable to $x=b$ (or $b\to x$, from $b$ to $x$) because $a$ and $b$ are associated with constants of the configuration in equation (\ref{Pythagorean}).

| $f(x)\vert_{a=1.5}$ | $f_{\textrm{id}}$ | $(f-f_{\textrm{id}})$ |
|---|---|---|
| 32.0351 | 32 | 0.0351 |
| 33.0341 | 33 | 0.0341 |
| 34.0331 | 34 | 0.0331 |

Table 1: Values of the hypotenuse function, $f(x)$, with fixed edge $a=1.5$, versus the identity function, $f_{id}(x)=x$, and the difference between them, over the interval [32,34] (all in inches).
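The table values can be regenerated, and compared with a first-order estimate, using the sketch below. The estimate $a^2/(2x)$ is an addition for illustration: it follows from factoring $x$ out of the square root and applying the linear binomial approximation, $\sqrt{x^2+a^2}=x\,(1+a^2/x^2)^{1/2}\approx x+a^2/(2x)$. The function names `hypotenuse` and `clearance` are likewise assumptions.

```python
import math

def hypotenuse(x, a=1.5):
    """Hypotenuse for variable edge x (door width) and fixed edge a (door thickness)."""
    return math.hypot(x, a)

def clearance(x, a=1.5):
    """Difference between the hypotenuse and the identity function."""
    return hypotenuse(x, a) - x

a = 1.5
print("x    f(x)      f(x) - x   a^2/(2x)")
for x in (32.0, 33.0, 34.0):
    estimate = a**2 / (2.0 * x)  # first-order binomial estimate of the clearance
    print(f"{x:.0f}   {hypotenuse(x, a):.4f}   {clearance(x, a):.4f}     {estimate:.4f}")
```

The estimate slightly exceeds the exact difference, since the next term of the series is negative.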

By inspection of this table we can see that the deviation of $f(x)$ from $x$ (the asymptote) is still decreasing, but the change is in the third decimal place (thousandths, not very much). A decreasing difference means that the wider we make the door (the distance from the hinge axle), the closer the hypotenuse is to the breadth, so the tighter we can make the frame. The data is consistent with the behavior expected from inspecting $f_h$.

In this plot of the hypotenuse as a function of $x$, for real ratios of door width to door thickness, we see that the hypotenuse at $x=32$" and $x=34$" (normal door widths, as measured left to right standing behind one) is very close to $x$ itself (i.e., the slope $m$ in "y=mx+b" is getting closer to 1 with increasing $x$).

One important change to all this, becoming apparent shortly, is that the hypotenuse function becomes the radius of a polar coordinate system, which is decomposed into orthogonal components of position measurement. You already know orthogonal: it's what the two non-hypotenuse edges of a right triangle are to each other, two lines (along which we measure position) intersecting at a 90° angle. In three dimensions, the third component is simply superposed on the plane we know and love!

The Pythagorean theorem lets us solve problems like calculating how much time it takes to “motor” straight across a river with uniform flow and parallel banks. We go straight across by knowing the speed of our boat in still water, then angling it so that the centerline from the point of the bow through the middle of the stern lies along the hypotenuse of a right triangle. One edge of the triangle is parallel to the bank, and this component of the boat's velocity is equal and opposite to the flow rate of the river; the other edge, along the line going straight across, is whatever remains of the boat's speed.
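The river crossing can be written as a small calculation. This is a minimal sketch, assuming a uniform flow, a boat speed (in still water) greater than the flow speed, and consistent units; the function name `straight_crossing` and the sample numbers are illustrative.

```python
import math

def straight_crossing(width, boat_speed, flow_speed):
    """Time to cross straight over, and the upstream heading angle (degrees
    from the straight-across line) needed to cancel the flow."""
    if boat_speed <= flow_speed:
        raise ValueError("boat must be faster than the flow to go straight across")
    # The boat speed is the hypotenuse; the flow speed is the edge parallel to the bank
    across_speed = math.sqrt(boat_speed**2 - flow_speed**2)
    heading_deg = math.degrees(math.asin(flow_speed / boat_speed))
    return width / across_speed, heading_deg

# Hypothetical numbers: 100 m wide river, 5 m/s boat, 3 m/s flow
time_s, heading = straight_crossing(100.0, 5.0, 3.0)
print(f"crossing time: {time_s:.1f} s, heading {heading:.1f} degrees upstream")
```

With these sample numbers the velocity triangle is the familiar 3-4-5 triangle, so the speed straight across is 4 m/s.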

There is a Physics Stackexchange problem on a river and a swimmer focusing on a landmark on the other side. It asks about the path across the river when the swimmer (or motor boat) is always pointed at a fixed point on the bank, directly across the river from the starting point. The question has not been dismissed, because it is an uncommon exercise: a swimmer with a focal direction.

Let's say the swimmer swims twice as fast as the river flows, and the river is three times as wide as the bank-wise distance up from the start point to the focal point; then the swimmer will be taken downstream from the start point for a good portion of the journey. In general, a Newtonian equation of motion can be written for the components along the right-triangle directions. It may be as hard to solve as the catenary problem of a rope suspended at two ends, but there's always the numerical solution of differential equations through integration.
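A numerical integration of this kind can be sketched with the forward Euler method. Everything specific here is an assumption for illustration: the swimmer starts at the origin, the river flows in the $+x$ direction at constant speed, the landmark sits at `(target_x, width)` on the far bank, the swimmer's speed relative to the water is constant and always aimed at the landmark, and the function name `swim_to_landmark`, the step `dt`, and the tolerance `tol` are arbitrary choices.

```python
import math

def swim_to_landmark(width, flow, swim_speed, target_x=0.0,
                     dt=1e-3, tol=1e-2, max_steps=1_000_000):
    """Forward-Euler path of a swimmer who always aims at (target_x, width)."""
    x, y, t = 0.0, 0.0, 0.0
    path = [(x, y)]
    for _ in range(max_steps):
        dx, dy = target_x - x, width - y
        dist = math.hypot(dx, dy)
        if dist < tol:
            break  # close enough to the landmark
        # Velocity = swimming velocity toward the landmark + river flow along x
        vx = swim_speed * dx / dist + flow
        vy = swim_speed * dy / dist
        x, y, t = x + vx * dt, y + vy * dt, t + dt
        path.append((x, y))
    return t, path

# Flow 1, swimmer twice as fast, river 3 wide, landmark 1 unit upstream
t, path = swim_to_landmark(width=3.0, flow=1.0, swim_speed=2.0, target_x=-1.0)
drift = max(px for px, _ in path)
print(f"time to reach landmark ~ {t:.2f}, maximum downstream drift ~ {drift:.2f}")
```

The swimmer must be faster than the flow for the loop to reach the landmark. A smaller `dt` trades run time for accuracy, and a higher-order integrator could be substituted for the Euler step.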



© 2025 Gabe Fernandez. All rights reserved.