Consider a function, $f(x)$, which doesn't change too rapidly over the interval of interest; more precisely, we limit this method to functions with no differential “infinities” on the interval (the kind that appears when differentiating a step function, at any order), which is to say the function should be smooth there. For example, the reciprocal trigonometric functions, taken over a full period (or the offending part of it), contain the first-order singularity $\lim_{x\rightarrow 0}\frac{1}{x}$; such a function is not smooth at zero simply because its derivatives are themselves reciprocal powers and blow up there as well.
Let the $n$'th derivative of $f(x)$ exist on the interval (that is, be a smooth function, not identically zero); then that derivative can be written $(\frac{d}{dx})^nf(x)=f^{(n)}(x)$, where the parentheses in the exponent are the notation for the derivative order, $n$. The $n$'th derivative existing implies that all preceding derivatives also exist, and a derivative only counts as the zero function if it vanishes over the whole domain, namely $\{f(x)=0: \forall{}x\in{}\mathbb{R}\}$. For example, the exponential and trigonometric functions have non-zero derivatives of every order (more in the Sine Series article), whereas a finite polynomial has non-zero derivatives only up to its highest power; so, step by step (by induction), we can develop the Taylor series. We can integrate the $n$'th derivative, $G=f^{(n)}(x)$, anywhere inside the interval of smooth continuity, say over $[x_0, x]$ with $x\lt x_1$ (the right end of that interval), and consider the result after repeating for a total of $n$ integrations. You might immediately see that this holds the $f^{(0)}=f$ term; through careful use of the rules of calculus we will bring the original function to the fore, with some degree of error when truncating the series short of the end, as will be demonstrated in the Sine Series article.
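To make the distinction concrete, here is a minimal SymPy sketch (my own illustration, not part of the article) comparing functions whose derivatives never terminate, such as $\sin x$ and $e^x$, with a finite polynomial whose derivatives vanish past its degree:

```python
# Minimal SymPy check (illustrative): sin(x) and exp(x) produce non-zero
# derivatives at every order, while a degree-4 polynomial's derivatives
# vanish once the order exceeds 4.
import sympy as sp

x = sp.symbols('x')
poly = 3*x**4 - 2*x + 7   # an example finite polynomial, degree 4

for n in range(1, 7):
    print(n,
          sp.diff(sp.sin(x), x, n),   # cycles through cos, -sin, -cos, sin
          sp.diff(sp.exp(x), x, n),   # always exp(x)
          sp.diff(poly, x, n))        # becomes 0 for n > 4
```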
$$ I_1 = \int_{x_0}^{x} G dx = \int_{x_0}^{x} f^{(n)} dx = \left. f^{(n-1)} \right|_{x_0}^x = f^{(n-1)}(x) - f^{(n-1)}(x_0) $$Since the last term is constant and recurring, I'll label it $c_{n-m}=-\left. f^{(n-m)}(x) \right|_{x=x_0}=-f^{(n-m)}(x_0)$, or minus the $(n-m)$'th derivative of $f(x)$ evaluated at $x_0$. Each of the $c$ terms is the result of one of the iterations: the $m$'th integration, $1\le{}m\le{}n$, takes the $(n-m+1)$'th derivative of $f(x)$ over the interval, evaluates the definite integral at its two boundaries, $x_0$ on the left and $x\le{}x_1$ on the right, and subtracts the first from the second. This separates out the $(n-m)$'th derivative (still a function of $x$, and easily integrated again) plus the constant term $c_{n-m}$, which is integrated in the formulaic monomial way in subsequent iterations, as demonstrated below.
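As a sanity check of this first step, the following SymPy sketch (an assumed example with $f = \sin$ and $n = 3$, not taken from the article) verifies that one definite integration of $f^{(n)}$ over $[x_0, x]$ returns $f^{(n-1)}(x) + c_{n-1}$ with $c_{n-1} = -f^{(n-1)}(x_0)$:

```python
# Illustrative check of the first iterated integral, with f = sin and n = 3.
import sympy as sp

x, x0 = sp.symbols('x x0')
f = sp.sin(x)
n = 3

G = sp.diff(f, x, n)                      # f^(n)
I1 = sp.integrate(G, (x, x0, x))          # first integration over [x0, x]
c_n_minus_1 = -sp.diff(f, x, n - 1).subs(x, x0)
expected = sp.diff(f, x, n - 1) + c_n_minus_1
print(sp.simplify(I1 - expected))         # prints 0
```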
$$ I_2 = \int_{x_0}^{x} I_1 dx = \int_{x_0}^{x} f^{(n-1)}(x) dx + \int_{x_0}^x c_{n-1} dx = f^{(n-2)}(x) + c_{n-2} + c_{n-1} (x-x_0) $$ $$ I_3 = \int_{x_0}^{x} I_2 dx = \int_{x_0}^{x} f^{(n-2)}(x) dx + \int_{x_0}^x [ c_{n-1} ( x - x_0 ) + c_{n-2} ] dx $$ $$ = f^{(n-3)}(x) + c_{n-3} + c_{n-2}(x-x_0) + \frac{1}{2}c_{n-1} (x-x_0)^2 $$Since $$\int_{x_0}^{x}c\,dx=c(x-x_0)$$ and $$\int_{x_0}^{x}c(x-x_0)dx = \frac{1}{2}c(x-x_0)^2$$ To be sure, this binomial of variable-plus-constant can be treated as a single term: it has the same derivative as the monomial $x$, and for every integer exponent (except $-1$, when integrating) the binomial raised to a power follows the same power rule as the monomial raised to that power. $$\int_{x_0}^{x}\frac{1}{2}c(x-x_0)^2dx = \frac{1}{3!}c(x-x_0)^3$$ Where the exclamation mark indicates a factorial; here $3!=3\cdot 2\cdot 1$ is formed in the denominator by successively integrating a constant three times. This holds for every $c\in{}\{c_{n-1}, c_{n-2},...\}$. The reason you don't have to break the binomial up into monomials in order to integrate is that the derivative of the integrated term is, as proposed above, exactly the $c(x-x_0)$ term, so we keep it as a binomial. This shows that it is the relative distance $x-x_0$ which defines these terms of the series; we will also see that a contextual element has to be added to the series, an offset depending not on the relative distance but on the absolute/global coordinate, $x_0$.
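The factorial build-up can be checked directly; this short SymPy sketch (my own, with a symbolic constant $c$ and lower limit $x_0$) integrates a constant over $[x_0, x]$ three times and recovers the $\frac{1}{k!}c(x-x_0)^k$ pattern:

```python
# Illustrative check: repeated definite integration of a constant c over
# [x0, x] produces c*(x - x0)**k / k!.
import sympy as sp

x, x0, c = sp.symbols('x x0 c')
term = c
for k in range(1, 4):
    term = sp.integrate(term, (x, x0, x))
    print(k, sp.factor(term))   # c*(x - x0), c*(x - x0)**2/2, c*(x - x0)**3/6
```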
Then, continuing the integrations, the $m=n$'th step will have the following terms: $$ I_n = \int_{x_0}^x I_{n-1} dx = f(x) +c_0 + \frac{c_{1}}{1!} (x-x_0) $$ $$ + \frac{c_{2}}{2!} (x-x_0)^2 + \frac{c_3}{3!} (x-x_0)^{3} +... $$ Where $f(x)=f^{(0)}(x)$, and some algebraic rearrangement (adding and subtracting) of the terms in $I_n$ yields the following, with $f(x)$ on the left and the remainder on the right, and with the $c$ terms substituted back for the derivatives they represent, to obtain the final form of the Taylor series.
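The whole chain can likewise be checked in one go. The sketch below (an assumed example with $f = e^x$ and $n = 4$) integrates $f^{(n)}$ over $[x_0, x]$ a total of $n$ times and confirms that the result is $f(x)$ plus a polynomial of degree $n-1$ in $(x-x_0)$, exactly the $c_j/j!$ terms above:

```python
# Illustrative check of I_n: n iterated integrations of f^(n) over [x0, x],
# with f = exp and n = 4; the difference I_n - f(x) is a cubic in (x - x0).
import sympy as sp

x, x0 = sp.symbols('x x0')
f = sp.exp(x)
n = 4

I = sp.diff(f, x, n)
for _ in range(n):
    I = sp.integrate(I, (x, x0, x))

print(sp.expand(I - f))   # a degree-3 polynomial in x and x0, scaled by exp(x0)
```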
$$ f(x) = f(x_0) + f^{(1)}(x_0) (x-x_0) + f^{(2)}(x_0) \frac{1}{2}(x-x_0)^2 $$ $$ + f^{(3)}(x_0) \frac{1}{3!}(x-x_0)^3 +\cdots{}+I_n $$The remainder term, $I_n=\int_{x_0}^x\cdots{}\int_{x_0}^x G(x)dx^n$, can be analyzed by bounding the first integral. That integral, $\int_{x_0}^x G\,dx$, is at most the length of the interval, $x-x_0$ (taking $x\gt{}x_0$), times the maximum of $G=f^{(n)}(x)$ on the interval, a maximum which exists since we are only talking about functions $f$ which are continuous and smooth (on that interval). Conversely, $G$ also attains a minimum of all the values $G(x)$ it takes on the interval; either extreme may sit at the start or end of the interval, or anywhere in between where the first derivative of $G$ goes to zero (the overall peak and pit, not the lesser local ones). So the first integral equals the length of the interval, $x-x_0$, times an average value of $G$, written $G(x_{\text{avg}})$ for some $x_{\text{avg}}$ inside the interval. Jumping ahead through the subsequent steps, this constant $G(x_{\text{avg}})$ is then integrated in the monomial way, building an $n!$ in the denominator that grows arbitrarily large with $n$ and so diminishes the remainder term. The error is therefore a constant integrated $n$ times, which gives the following: $$ I_n = \frac{(x-x_0)^{n}}{n!} f^{(n)}(x_\text{avg}) $$ The constant here is the $n$'th derivative of $f(x)$, with respect to $x$, evaluated at the fixed point $x_\text{avg}$, which could also be written $f^{(n)}(x_\text{avg})=\langle{}f^{(n)}(x)\rangle{}$, where the angle brackets mean an average (the same idea as an expectation value).
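As a numeric illustration of this final form (my own example, not from the article: $f=\sin$, $x_0=0.5$, $x=1.2$, $n=6$), the truncated series lands within the remainder bound $\frac{(x-x_0)^n}{n!}\max|f^{(n)}|$, which for sine is at most $\frac{(x-x_0)^n}{n!}$:

```python
# Illustrative numeric check of the truncated Taylor series and its remainder.
import math

# Derivatives of sin cycle with period 4: sin, cos, -sin, -cos.
derivs = [math.sin, math.cos, lambda t: -math.sin(t), lambda t: -math.cos(t)]

def taylor_sin(x, x0, n):
    """Sum of the first n terms f^(j)(x0) * (x - x0)**j / j!."""
    return sum(derivs[j % 4](x0) * (x - x0)**j / math.factorial(j)
               for j in range(n))

x0, x, n = 0.5, 1.2, 6
approx = taylor_sin(x, x0, n)
bound = abs(x - x0)**n / math.factorial(n)   # since |f^(n)| <= 1 for sin
print(approx, math.sin(x), abs(approx - math.sin(x)) <= bound)   # ..., ..., True
```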
Immediately we can see that the approximation for a function, $f$, contains a geometric-series component, $(x-x_0)^n$, which on its own tends toward convergence for intervals less than unity, $(x-x_0)\lt 1$; and since the other coefficient is monotonically decreasing with $n$, $n^{-1} \gt (n+1)^{-1}, \forall\ n\gt 0$, and therefore $(n!)^{-1} \gt ((n+1)!)^{-1}, \forall\ n\gt 0$, the remainder $I_n$ shrinks as terms are added. Provided the derivatives $f^{(n)}$ stay bounded on the interval, the series converges, and the amount of deviation from the true function is controlled with more terms (pushing out the error, $I_n$, by way of the geometric-factor-over-factorial-denominator coefficient).
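A quick numeric sketch (my own, using $f=e^x$ about $x_0=0$ at $x=0.9$) shows the error falling with each added term, tracking the $(x-x_0)^n/n!$ factor:

```python
# Illustrative check that the truncation error shrinks roughly like (x - x0)**n / n!.
import math

x0, x = 0.0, 0.9
for n in range(1, 9):
    partial = sum((x - x0)**j / math.factorial(j) for j in range(n))  # exp series
    error = abs(math.exp(x) - partial)
    print(n, error, (x - x0)**n / math.factorial(n))
```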
When studying functions (algebraically) to find significant features of relationship and behavior, one starts with monomials ($x^n$, with $n$ a natural number, as in the Taylor series). Plotting monomials is informative, so $x^n$ for $n\in \{1, 2, 3, 4, 5\}$ is plotted here:
Monomials of the first five natural number exponent powers (skipping 0): $f_1(x)=x$ is the red one, $f_2(x)=x^2$ is the middle one on either side of $f_1$, and $f_3(x)=x^3$ is the steepest curve (pale green).
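A short matplotlib sketch (my own choice of interval and styling, not necessarily the plot shown) reproduces the figure described above:

```python
# Plot the monomials x**n for n = 1..5 on a small symmetric interval.
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-1.5, 1.5, 400)
for n in range(1, 6):
    plt.plot(x, x**n, label=f"$x^{n}$")
plt.axhline(0, color="gray", linewidth=0.5)
plt.axvline(0, color="gray", linewidth=0.5)
plt.legend()
plt.show()
```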