
Equation

An equation is a mathematical statement asserting that two expressions are equal, typically involving variables, constants, and operations, and it is satisfied by specific values of the variables that make the expressions identical.[1][2] Solutions to an equation are the values that render it true, and equations form the core of algebraic reasoning by allowing the modeling and resolution of relationships between quantities.[3][4] Equations have ancient origins, with early civilizations such as the Babylonians around 2000 BCE developing methods to solve quadratic equations through geometric and verbal descriptions rather than symbolic notation.[5] By the 9th century, Persian mathematician Al-Khwarizmi formalized systematic approaches to solving linear and quadratic equations in his treatise Al-Jabr, laying foundational principles for algebra as a discipline.[6] Over time, the concept evolved to encompass more complex forms, including higher-degree polynomials and systems of equations, driven by advancements in notation by figures like René Descartes in the 17th century, who introduced modern symbolic representation.[7] In mathematics, equations are classified into various types based on their structure and the operations involved, such as linear equations (where variables appear to the first power, forming straight lines when graphed), quadratic equations (second-degree polynomials), and nonlinear forms like exponential or differential equations.[8][9] They also include conditional equations (true for specific values), identities (true for all values), and inconsistent equations (no solutions).[2][1] The study and solution of equations are pivotal across fields, enabling precise modeling of physical laws, economic systems, and scientific phenomena, and serving as a gateway to advanced topics like calculus and linear algebra.[10][11]

Introduction

Definition

An equation is a mathematical statement that asserts the equality of two expressions, typically represented as $ f(x) = g(x) $, where the expressions on either side of the equals sign may involve variables, constants, and mathematical operators.[12] This form indicates that the value of $ f(x) $ is identical to the value of $ g(x) $ for certain values of the variable $ x $, or potentially for all values depending on the equation's nature.[13] The symbolic notation for equality in equations employs the equals sign (=), which was introduced by Welsh mathematician Robert Recorde in 1557 in his book The Whetstone of Witte.[14] Recorde justified the symbol's design—two parallel horizontal lines—as a means to denote equivalence without repetition, stating that "noe 2 thynges can be moare equalle."[14] A simple example is $ 2 + 2 = 4 $, where the expressions on both sides evaluate to the same numerical value.[12] Equations differ from inequalities in that they express exact equality between expressions, whereas inequalities denote relational orders such as greater than (>) or less than (<).[15] For instance, while an equation like $ x + 1 = 5 $ seeks the precise value that balances both sides, an inequality like $ x + 1 > 5 $ identifies a range of values satisfying the condition.[15] Equations are classified as identities, conditional, or inconsistent based on the scope of their truth. Identities hold true for all values of the variables involved, such as $ x + 0 = x $.[16] In contrast, conditional equations are true only for specific values of the variables that satisfy the equality, like $ 2x = 4 $ where $ x = 2 $.[16] Inconsistent equations, also known as contradictions, are never true for any value of the variables, such as $ x = x + 1 $.[16]

Historical Development

The origins of equations trace back to ancient civilizations, where practical problems in measurement and trade prompted early algebraic thinking. In Mesopotamia, Babylonian scribes around 1800 BCE recorded solutions to quadratic equations on clay tablets, employing geometric interpretations to find areas and volumes without symbolic notation.[17] Similarly, ancient Egyptian mathematics, as documented in the Rhind Papyrus circa 1650 BCE, addressed linear equations through iterative methods like false position, applying them to problems in resource allocation and geometry.[18] Greek mathematicians advanced these ideas by integrating equations into geometric frameworks. Euclid's Elements, composed around 300 BCE, offered rigorous geometric constructions to solve linear and quadratic problems, emphasizing proofs over computation.[19] Later, Diophantus of Alexandria in his Arithmetica circa 250 CE pioneered syncopated algebra, using abbreviations and symbols to express and solve indeterminate equations, influencing subsequent numerical approaches.[20] The Islamic Golden Age marked a pivotal shift toward systematic algebra. Muhammad ibn Musa al-Khwarizmi's Al-Kitab al-mukhtasar fi hisab al-jabr wal-muqabala (circa 820 CE) classified and provided step-by-step rhetorical solutions for linear and quadratic equations, establishing algebra as a distinct discipline.[21] Building on this, Omar Khayyam around 1070 CE developed geometric techniques to solve cubic equations, intersecting conic sections to find roots, which extended algebraic methods to higher degrees.[22] During the Renaissance, symbolic representation transformed equation solving. 
François Viète in 1591 introduced letters from the alphabet to denote unknowns and parameters in his works on trigonometry and algebra, enabling general formulas and moving beyond specific numerical cases.[23] René Descartes further bridged algebra and geometry in La Géométrie (1637), using coordinates to translate geometric curves into polynomial equations, foundational to analytic geometry.[24] The late 17th century saw Isaac Newton and Gottfried Wilhelm Leibniz independently develop differential equations as part of calculus, modeling rates of change in physical phenomena through infinitesimal methods.[26] In the 18th century, Leonhard Euler standardized notation for functions and equations, introducing symbols like $ f(x) $ to describe relationships systematically, which supported advanced analysis.[25] By the 1830s, Évariste Galois formulated the theory of equation solvability using group theory, determining conditions under which polynomial equations could be solved by radicals, ushering in abstract algebra.[27]

Fundamental Concepts

Properties

Equations exhibit key properties that determine their solution behavior and structural characteristics. Solvability refers to whether an equation or system admits solutions. An equation is consistent if it has at least one solution and inconsistent if it has none; for systems of linear equations, the number of solutions depends on the ranks of the coefficient matrix and the augmented matrix: if the ranks are equal and equal to the number of variables, there is a unique solution; if equal but less than the number of variables, infinitely many solutions; if the rank of the augmented matrix is greater, no solution (inconsistent).[28] Homogeneous systems, where the constant terms are zero, always have at least the trivial solution.[29] Equivalence is a fundamental property ensuring that manipulations preserve the solution set. Two equations are equivalent if they share the identical set of solutions.[1] Transformations that maintain equivalence include adding or subtracting the same expression from both sides, multiplying or dividing both sides by a non-zero constant, or other operations that do not alter the solution set, such as those used in Gaussian elimination for linear systems.[30] Symmetry and homogeneity describe structural invariances in equations. A symmetric equation remains unchanged under the interchange of two or more variables, such as in expressions involving symmetric polynomials where the form is invariant under variable permutation.[31] Homogeneous equations are those where scaling all variables by a factor $ t $ scales both sides equally, often expressed as $ f(tx, ty) = t^k f(x, y) $ for some degree $ k $, which simplifies substitution methods like $ v = y/x $ in first-order differential equations.[32] The degree and order quantify the complexity of equations. 
For polynomial equations, the degree is the highest total power of the variables, determining the maximum number of roots by the fundamental theorem of algebra.[33] For differential equations, the order is the highest derivative present, with first-order equations involving only $ dy/dx $ and higher-order ones requiring integration of lower-order solutions.[34] Equations underpin universality in mathematics and science by providing a framework for modeling relationships between variables, from physical laws like Newton's equations of motion to abstract structures in pure math.[35] This foundational role enables predictive analysis across disciplines, capturing dynamics through balanced expressions of quantities and their rates of change.[36]
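The rank conditions described above for linear systems can be checked mechanically. The following is a minimal Python sketch rather than a production routine: the helper names `rank` and `classify` are illustrative (they do not come from the cited sources), and exact `Fraction` arithmetic is assumed so that tests for zero are reliable.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of equal-length rows) by Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        # Look for a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # Clear the entries below the pivot.
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def classify(A, b):
    """Compare rank(A) with rank([A | b]) to classify the system A x = b."""
    augmented = [row + [rhs] for row, rhs in zip(A, b)]
    rank_a, rank_aug = rank(A), rank(augmented)
    if rank_aug > rank_a:
        return "inconsistent"
    return "unique" if rank_a == len(A[0]) else "infinitely many"


print(classify([[1, 1], [1, -1]], [3, 1]))  # unique
print(classify([[1, 1], [2, 2]], [3, 6]))   # infinitely many
print(classify([[1, 1], [2, 2]], [3, 7]))   # inconsistent
```

Floating-point input would need a tolerance in place of the exact `!= 0` test, which is why the sketch converts everything to fractions first.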

Variables, Parameters, and Constants

In mathematical equations, variables represent unknowns whose values are to be determined to satisfy the equality. For instance, in the equation $ x + 3 = 5 $, $ x $ is a variable that can take on different values, serving as the quantity to solve for.[37] Variables are often classified as dependent or independent; a dependent variable, such as $ y $ in $ y = mx + b $, expresses the output that relies on the input value of an independent variable like $ x $.[38] Constants, in contrast, are fixed numerical or symbolic values that do not change within the context of a given equation, providing stability to its structure. Examples include the number 3 in $ x + 3 = 5 $ or $ \pi $ in the circumference formula $ C = 2\pi r $, where they define fundamental behaviors such as scaling or proportionality without variation.[39] These elements ensure the equation's consistency across applications, anchoring the relationship among other components.[37] Parameters function as constants within a specific equation but are treated as variables when considering families of related equations, allowing generalization across scenarios. In the linear equation $ ax + b = 0 $, $ a $ and $ b $ act as parameters that can vary to generate different instances, such as altering the slope or intercept in graphical representations.[40] This distinction enables analysis of how changes in parameters influence the equation's overall form and solutions.[39] Standard notation conventions distinguish these elements for clarity: variables are typically denoted by lowercase italic letters (e.g., $ x, y $), constants by upright Roman letters or symbols (e.g., $ c, \pi $), and parameters often by Greek letters (e.g., $ \theta, \alpha $) or uppercase letters in systems involving multiple variables.[41] In multivariable systems, such as $ x + y = 5 $, each variable is assigned distinct symbols to track interactions.[42] In higher mathematics, variables are further categorized as free or bound. 
Free variables are those not quantified or restricted within an expression, retaining their ability to take arbitrary values, as in the standalone term $ x $. Bound variables, however, are those captured by operators like integrals or summations, where their scope is limited; for example, the $ x $ in $ \int x \, dx $ is bound by the integral, representing a dummy index rather than a specific unknown.[43] This distinction is crucial in contexts like logic and calculus, where it affects substitution and evaluation. Parameters can influence properties such as solvability by determining whether an equation has unique, multiple, or no solutions within a family.[39]

Basic Examples

Simple Linear Equations

A simple linear equation is a mathematical statement of equality involving a single variable raised to the first power, typically expressed in the form $ ax + b = 0 $, where $ a $ and $ b $ are constants with $ a \neq 0 $.[44] This form ensures the equation is linear, meaning no exponents higher than 1 or products of variables appear. For instance, the equation $ 2x + 3 = 7 $ is a simple linear equation, which can be rewritten as $ 2x + 3 - 7 = 0 $ or $ 2x - 4 = 0 $.[45] To solve a simple linear equation, apply inverse operations to isolate the variable while maintaining equality on both sides. Starting with $ 2x + 3 = 7 $, subtract 3 from both sides to obtain $ 2x = 4 $, then divide both sides by 2 to yield $ x = 2 $.[44] This process relies on the addition and multiplication properties of equality, ensuring each step produces an equivalent equation. Verification involves substituting the solution back into the original equation: $ 2(2) + 3 = 7 $, which holds true.[45] In simple linear equations, the variable represents an unknown quantity to be found.[46] Simple linear equations often arise from translating real-world scenarios into algebraic form. Consider the problem: "If twice a number plus 3 equals 7, find the number." Let the number be $ x $; the equation becomes $ 2x + 3 = 7 $, solving to $ x = 2 $.[47] Such word problems model direct proportional relationships, like costs or quantities, where one variable changes linearly with another. 
Graphically, the solution to a simple linear equation in one variable, such as $ x = 2 $, is represented as a point on the number line at 2.[48] When considering linear relations in two variables, equations like $ y = mx + c $ graph as straight lines, intersecting the y-axis at $ c $ and the x-axis at $ -c/m $ (if $ m \neq 0 $).[48] In basic physics, simple linear equations describe uniform motion via the formula $ d = rt $, where $ d $ is distance, $ r $ is rate, and $ t $ is time.[49] For example, if a car travels at 60 mph for 3 hours, then $ d = 60 \times 3 = 180 $ miles.[49] Solving for time given distance and rate, such as $ t = d / r $, yields linear expressions applicable to problems like determining travel duration.[50]
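The inverse-operation procedure described above (subtract the constant, then divide by the coefficient) fits in a few lines. This is only an illustrative sketch: the function name `solve_linear` is hypothetical, and integer inputs are assumed so that the result can be returned as an exact fraction.

```python
from fractions import Fraction


def solve_linear(a, b, c):
    """Solve a*x + b = c for x, where a, b, c are integers and a != 0."""
    if a == 0:
        raise ValueError("a must be nonzero for a unique solution")
    # Step 1: subtract b from both sides  ->  a*x = c - b
    # Step 2: divide both sides by a      ->  x = (c - b) / a
    return Fraction(c - b, a)


x = solve_linear(2, 3, 7)   # the example 2x + 3 = 7
print(x)                    # 2
print(2 * x + 3 == 7)       # True: substituting back verifies the solution
```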

Identities and Equalities

In mathematics, an identity is an equation that holds true for all values of the variables within their defined domain, distinguishing it from general equations that may only be valid under specific conditions.[51] For instance, the algebraic identity $ (x + 1)^2 = x^2 + 2x + 1 $ is satisfied for every real number $ x $, as it arises from the binomial theorem expansion.[52] To verify an identity, one can perform algebraic manipulation, such as expanding the left side to match the right. Substituting a range of test values is a useful check, since a single failing value shows that the equation is not an identity, but finitely many passing values do not by themselves prove that the equality holds universally.[53]

Common identities include the Pythagorean trigonometric identity, $ \sin^2 \theta + \cos^2 \theta = 1 $, which holds for all real angles $ \theta $ and derives from the geometry of the unit circle.[54] Another example is the difference of squares, $ a^2 - b^2 = (a - b)(a + b) $, applicable to all real $ a $ and $ b $.[55] Identities play a crucial role in mathematical proofs by enabling the simplification of complex expressions, such as reducing trigonometric functions in integrals or factoring polynomials in algebraic derivations. In calculus, for example, they facilitate substitutions that streamline differentiation or integration tasks.[54]

In contrast to conditional equations, which are true only for particular solutions within a restricted domain, identities are unconditionally valid: every value in the domain is a solution, so nothing remains to be solved for.[56] This universality makes identities foundational for establishing equivalences in broader mathematical contexts.[57]
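Spot-checking with test values can be sketched as follows. The helper `looks_like_identity` is a hypothetical name chosen for this illustration; it compares both sides at a handful of sample points using a floating-point tolerance. A `False` result refutes the claimed identity, while a `True` result is only evidence, not a proof.

```python
import math


def looks_like_identity(lhs, rhs, samples, tol=1e-9):
    """Return True if lhs(x) and rhs(x) agree (within tol) at every sample point."""
    return all(
        math.isclose(lhs(x), rhs(x), rel_tol=tol, abs_tol=tol) for x in samples
    )


samples = [-3.0, -1.5, 0.0, 0.25, 2.0, 10.0]

# (x + 1)^2 = x^2 + 2x + 1 agrees at every sample point.
print(looks_like_identity(lambda x: (x + 1) ** 2,
                          lambda x: x * x + 2 * x + 1, samples))   # True

# sin^2(t) + cos^2(t) = 1 agrees at every sample point.
print(looks_like_identity(lambda t: math.sin(t) ** 2 + math.cos(t) ** 2,
                          lambda t: 1.0, samples))                 # True

# 2x = 4 is a conditional equation: it fails at most sample points.
print(looks_like_identity(lambda x: 2 * x, lambda x: 4.0, samples))  # False
```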

Algebraic Equations

Polynomial Equations

A polynomial equation is an equation that can be expressed in the form $ a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0 = 0 $, where the $ a_i $ are constants (coefficients) from a given field such as the real or complex numbers, $ n $ is a non-negative integer called the degree of the polynomial (provided $ a_n \neq 0 $), and $ x $ is the variable. These equations generalize linear equations, which are polynomials of degree 1, to higher degrees; for instance, quadratic equations have degree 2, cubic equations degree 3, and so on.

The solutions to a polynomial equation of degree $ n $, known as roots, satisfy the equation when substituted for $ x $. For quadratic equations of the form $ ax^2 + bx + c = 0 $ with $ a \neq 0 $, the roots are given by the quadratic formula:

$$ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. $$

This formula, which provides an explicit algebraic solution, originated in the work of the Indian mathematician Brahmagupta in 628 CE in his text Brahmasphutasiddhanta.[58] For real coefficients, the discriminant $ D = b^2 - 4ac $ determines the nature of the roots: if $ D > 0 $, there are two distinct real roots; if $ D = 0 $, there is exactly one real root (a repeated root); and if $ D < 0 $, there are no real roots but two complex conjugate roots. Vieta's formulas relate the coefficients to the roots; for a quadratic, if the roots are $ r_1 $ and $ r_2 $, then $ r_1 + r_2 = -b/a $ and $ r_1 r_2 = c/a $. These relations, developed by François Viète in the late 16th century, extend to higher-degree polynomials, connecting sums and products of roots (with signs and symmetries) to the coefficients.

Factoring is a key method for solving polynomial equations, often reducing them to simpler factors whose roots are easier to find. The factor theorem states that if $ f(a) = 0 $ for a polynomial $ f(x) $, then $ x - a $ is a factor of $ f(x) $, allowing synthetic division or long division to factor it out. For polynomials with integer coefficients, the rational root theorem provides a strategy to test possible rational roots: any rational root, expressed in lowest terms $ p/q $, has $ p $ as a factor of the constant term $ a_0 $ and $ q $ as a factor of the leading coefficient $ a_n $. This theorem, a consequence of Gauss's lemma on polynomial factorization, limits the candidates to a finite list, facilitating root discovery through evaluation.

The Fundamental Theorem of Algebra asserts that every non-constant polynomial equation with complex coefficients has at least one complex root, and more precisely, exactly $ n $ roots counting multiplicities for a degree-$ n $ polynomial.[59] First rigorously proved by Carl Friedrich Gauss in his 1799 doctoral dissertation, the theorem guarantees the existence of roots in the complex numbers, underpinning much of modern algebra and analysis.[60] It implies that any polynomial can be factored completely into linear factors over the complexes, though finding explicit roots for degrees higher than 4 generally requires numerical methods, since the Abel–Ruffini theorem shows that no general solution by radicals exists for degree 5 or higher.[59]
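Two of the tools described above, the quadratic formula and the rational root theorem, can be sketched in Python as follows. The function names `quadratic_roots`, `divisors`, and `rational_roots` are illustrative choices, not part of any cited source. The sketch assumes integer coefficients for the rational root search, with nonzero leading and constant terms.

```python
import cmath
from fractions import Fraction


def quadratic_roots(a, b, c):
    """Roots of a*x^2 + b*x + c = 0 (a != 0), returned as complex numbers."""
    if a == 0:
        raise ValueError("not a quadratic: a must be nonzero")
    sqrt_d = cmath.sqrt(b * b - 4 * a * c)  # square root of the discriminant
    return (-b + sqrt_d) / (2 * a), (-b - sqrt_d) / (2 * a)


def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def rational_roots(coeffs):
    """Rational roots of a polynomial given as [a_n, ..., a_0] (integers)."""
    a_n, a_0 = coeffs[0], coeffs[-1]
    # Candidates +-p/q with p dividing a_0 and q dividing a_n.
    candidates = {
        Fraction(sign * p, q)
        for p in divisors(a_0)
        for q in divisors(a_n)
        for sign in (1, -1)
    }

    def value(x):
        acc = Fraction(0)
        for coefficient in coeffs:  # Horner's rule
            acc = acc * x + coefficient
        return acc

    return sorted(x for x in candidates if value(x) == 0)


r1, r2 = quadratic_roots(1, -3, 2)        # roots 2 and 1, as complex numbers
print(r1 + r2 == 3, r1 * r2 == 2)         # True True (Vieta's formulas)
print(rational_roots([2, -3, -3, 2]))     # roots of 2x^3 - 3x^2 - 3x + 2
```

The last line finds the roots $ -1 $, $ 1/2 $, and $ 2 $, which matches the factorization $ (x + 1)(2x - 1)(x - 2) $.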

Systems of Linear Equations

A system of linear equations consists of two or more linear equations involving the same set of variables, where each equation is of the form $ a_1 x_1 + a_2 x_2 + \dots + a_n x_n = b $, with coefficients $ a_i $ and constant $ b $.[61] For instance, a two-equation system in two variables might be $ ax + by = c $ and $ dx + ey = f $, and a solution is a set of values for the variables that satisfies all equations simultaneously.[62] Geometrically, the solution represents the intersection point of the corresponding lines in the plane for two dimensions or planes in three dimensions.[63] Systems can be represented in matrix form as $ Ax = b $, where $ A $ is the coefficient matrix, $ x $ is the vector of variables, and $ b $ is the constant vector.[64]

Common methods for solving include substitution, where one equation is solved for one variable and substituted into the others; for example, from $ x + y = 3 $, solve $ x = 3 - y $ and insert into a second equation.[62] The elimination method involves multiplying equations to align coefficients and adding or subtracting to remove a variable, such as scaling the first equation by $ d $ and the second by $ a $ to eliminate $ x $ in a two-variable system.[65] For larger systems, Gaussian elimination transforms the augmented matrix $ [A \mid b] $ into row echelon form through row operations: swapping rows, multiplying by a nonzero scalar, or adding multiples of one row to another.[64] The process proceeds by eliminating variables below the pivot in each column, starting from the top-left, until back-substitution yields the solution from the upper triangular form.[66]

A system is consistent if it has at least one solution (either exactly one or infinitely many) and inconsistent otherwise; inconsistency is detectable when a row reduces to $ 0 = k $ for $ k \neq 0 $.[61] Geometrically, in two dimensions, two lines intersect at a point for a unique solution, are parallel for inconsistency, or coincide for infinite solutions; in three dimensions, planes intersect along a line, at a point, or not at all.[67] The rank of $ A $ determines solution existence: the system is consistent exactly when the rank of $ A $ equals the rank of $ [A \mid b] $.[68]

Determinants play a key role via Cramer's rule, which solves $ Ax = b $ by $ x_i = \det(A_i) / \det(A) $, where $ A_i $ replaces the $ i $-th column of $ A $ with $ b $, provided $ \det(A) \neq 0 $ for a unique solution.[69] For a 2x2 system $ \begin{cases} a x + b y = e \\ c x + d y = f \end{cases} $, $ x = \frac{ed - bf}{ad - bc} $ and $ y = \frac{af - ec}{ad - bc} $.[70] This method is efficient for small systems but computationally intensive for large ones compared to Gaussian elimination.[71]

Applications include electrical circuit analysis using Kirchhoff's laws, where loop currents satisfy systems from voltage drops equaling sources.[72] In economics, systems model input-output balances, such as Leontief models where production sectors satisfy demand equations like total output equals intermediate plus final demand.[73]
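Gaussian elimination with back-substitution and the 2x2 case of Cramer's rule can be sketched as below. This is a minimal illustration assuming a square system with a unique solution and exact rational arithmetic. The names `gauss_solve` and `cramer_2x2` are hypothetical, and a numerical library would normally be used for real workloads.

```python
from fractions import Fraction


def gauss_solve(A, b):
    """Solve A x = b for a square matrix A that has a unique solution."""
    n = len(A)
    # Build the augmented matrix [A | b] with exact fractions.
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a pivot row and swap it into place.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("system has no unique solution")
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate the entries below the pivot.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    # Back-substitution on the upper triangular form.
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - known) / M[i][i]
    return x


def cramer_2x2(a, b, c, d, e, f):
    """Solve a*x + b*y = e, c*x + d*y = f (integer inputs) by Cramer's rule."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("determinant is zero: no unique solution")
    return Fraction(e * d - b * f, det), Fraction(a * f - e * c, det)


# x + y = 3 and x - y = 1 give x = 2, y = 1 by either method.
print(gauss_solve([[1, 1], [1, -1]], [3, 1]) == [2, 1])   # True
print(cramer_2x2(1, 1, 1, -1, 3, 1) == (2, 1))            # True
```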

Geometric Equations

Analytic Geometry

Analytic geometry, also known as coordinate geometry, establishes a bridge between algebra and geometry by using equations to describe geometric figures in a Cartesian coordinate system. This approach was pioneered by René Descartes in his 1637 work La Géométrie, where he introduced the method of assigning coordinates to points in the plane, allowing geometric problems to be solved algebraically through equations relating variables $ x $ and $ y $.[74] In this framework, a geometric locus, such as a curve or line, is represented by the set of points $ (x, y) $ that satisfy a specific equation, enabling precise analysis of shapes via algebraic manipulation.[74]

A key application of analytic geometry involves conic sections, which are curves obtained as intersections of a plane with a cone and can be defined by second-degree equations. The circle centered at the origin, for instance, has the standard equation $ x^2 + y^2 = r^2 $, where $ r $ is the radius, representing all points at a fixed distance from the center.[75] The ellipse follows the form $ \frac{x^2}{h^2} + \frac{y^2}{k^2} = 1 $, describing an oval-shaped curve with semi-major axis $ h $ and semi-minor axis $ k $.[76] Parabolas are captured by equations like $ y = ax^2 $, which models a U-shaped curve opening upward or downward depending on the sign of $ a $.[76] Hyperbolas, in contrast, use $ \frac{x^2}{h^2} - \frac{y^2}{k^2} = 1 $, forming two branches symmetric about the axes.[76] These equations are special cases of the general second-degree form $ Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0 $, in which conics are classified by the discriminant $ B^2 - 4AC $.[75]

To simplify equations of conics, transformations such as translation and rotation of axes are employed to reduce them to standard forms.
Translation shifts the origin by replacing $ x $ with $ x' - h $ and $ y $ with $ y' - k $, eliminating linear terms and centering the curve.[77] Rotation, used to remove the cross term $ Bxy $, involves substituting $ x = x'\cos\theta - y'\sin\theta $ and $ y = x'\sin\theta + y'\cos\theta $, where $ \theta = \frac{1}{2}\tan^{-1}\frac{B}{A-C} $ (for $ A \neq C $), aligning the axes with the curve's symmetry.[77] These transformations preserve the geometric properties while facilitating identification and graphing.

Fundamental formulas in analytic geometry, such as those for distance and midpoint between points, are derived directly from equations and the Pythagorean theorem. The distance $ d $ between points $ (x_1, y_1) $ and $ (x_2, y_2) $ is given by $ d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} $, obtained by considering the right triangle formed by the horizontal and vertical differences along the axes.[78] Similarly, the midpoint $ M $ of the segment joining these points has coordinates $ \left( \frac{x_1 + x_2}{2}, \frac{y_1 + y_2}{2} \right) $, verified by ensuring equal distances from $ M $ to each endpoint using the distance formula.[79]

Analytic geometry plays a crucial role in calculus by providing graphical representations of functions via equations, allowing derivatives to be interpreted geometrically as slopes of tangents to curves. For a function defined implicitly by an equation $ F(x, y) = 0 $, implicit differentiation yields $ \frac{dy}{dx} = -\frac{\partial F / \partial x}{\partial F / \partial y} $ wherever $ \partial F / \partial y \neq 0 $, linking algebraic equations to rates of change along the graph.[80] This link between algebra and geometry enabled early developments in calculus, such as computing instantaneous velocities from coordinate-based motion equations.[80]
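The discriminant test for conics and the distance and midpoint formulas above translate directly into code. The following sketch uses illustrative function names and, as a stated simplification, ignores degenerate conics (such as a pair of lines or a single point), which the discriminant alone cannot detect.

```python
import math


def classify_conic(A, B, C):
    """Classify A x^2 + B xy + C y^2 + ... = 0 by the sign of B^2 - 4AC.

    Assumes a non-degenerate conic; degenerate cases are not detected.
    """
    disc = B * B - 4 * A * C
    if disc < 0:
        return "circle" if (A == C and B == 0) else "ellipse"
    if disc == 0:
        return "parabola"
    return "hyperbola"


def distance(p, q):
    """Euclidean distance between points p = (x1, y1) and q = (x2, y2)."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def midpoint(p, q):
    """Midpoint of the segment joining p and q."""
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)


print(classify_conic(1, 0, 1))    # circle     (x^2 + y^2 = r^2)
print(classify_conic(1, 0, 0))    # parabola   (x^2 - y = 0)
print(classify_conic(1, 0, -1))   # hyperbola  (x^2 - y^2 = 1)
print(distance((0, 0), (3, 4)))   # 5.0
print(midpoint((0, 0), (2, 4)))   # (1.0, 2.0)
```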

Parametric and Cartesian Forms

In the Cartesian form, an equation explicitly relates the coordinates by expressing one variable, typically $ y $, as a direct function of the other, $ x $, such as the linear equation $ y = mx + b $, where $ m $ is the slope and $ b $ is the y-intercept.[81] This explicit representation facilitates straightforward graphing and analysis of functional relationships in the plane.[82]

Parametric forms, by contrast, describe curves or paths by expressing both coordinates as functions of an independent parameter, commonly $ t $, in the form $ x = f(t) $, $ y = g(t) $.[83] For instance, the parametric equations for a circle of radius $ r $ centered at the origin are

$$ x = r \cos \theta, \quad y = r \sin \theta, $$

where $ \theta $ serves as the parameter, allowing the curve to be traced as $ \theta $ varies.[84] To obtain the Cartesian form from parametric equations, the parameter is eliminated through algebraic substitution or solving.[85] Consider the parametric equations of a line, $ x = x_0 + at $, $ y = y_0 + bt $; assuming $ a \neq 0 $, solving the first for $ t = (x - x_0)/a $ and substituting into the second yields $ y - y_0 = (b/a)(x - x_0) $, or equivalently $ y = mx + c $ with $ m = b/a $.[81]

Parametric equations offer advantages over Cartesian forms for modeling non-functional or complex curves, such as the cycloid traced by a point on the rim of a rolling circle of radius $ a $, given by

$$ x = a(\theta - \sin \theta), \quad y = a(1 - \cos \theta). $$

This parameterization naturally captures the periodic motion and cusps that would be cumbersome in explicit Cartesian coordinates.[86] Additionally, parametric forms extend to vector notation as $ \mathbf{r}(t) = \langle x(t), y(t) \rangle $, which is particularly useful for describing trajectories in physics and vector analysis.[87]

Polar coordinates represent another parameterized system, where points are defined by radial distance $ r = f(\theta) $ and angle $ \theta $, convertible to Cartesian form using the relations $ x = r \cos \theta $ and $ y = r \sin \theta $.[88] This conversion enables polar equations, like $ r = 2a \cos \theta $ for a circle, to be expressed in $ x $ and $ y $.[89]
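As a worked instance of the polar-to-Cartesian conversion, the circle $ r = 2a \cos \theta $ mentioned above can be rewritten step by step. The derivation below is a sketch that uses only the relations $ x = r \cos \theta $, $ y = r \sin \theta $, and $ r^2 = x^2 + y^2 $.

```latex
\begin{align*}
r &= 2a\cos\theta \\
r^2 &= 2a\, r\cos\theta && \text{(multiply both sides by } r\text{)} \\
x^2 + y^2 &= 2a x && \text{(use } r^2 = x^2 + y^2,\ x = r\cos\theta\text{)} \\
(x - a)^2 + y^2 &= a^2 && \text{(complete the square in } x\text{)}
\end{align*}
```

The result is a circle of radius $ a $ centered at $ (a, 0) $. Multiplying by $ r $ could in principle introduce the extra solution $ r = 0 $, but the origin already lies on the curve (at $ \theta = \pi/2 $), so no points are gained or lost.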

Equations in Number Theory

Diophantine Equations

Diophantine equations are polynomial equations with integer coefficients for which solutions in integers are sought.[90] These equations, named after the ancient Greek mathematician Diophantus of Alexandria, typically involve finitely many variables and focus on determining whether integer solutions exist and, if so, describing them completely.[91] A classic example is the linear Diophantine equation $ ax + by = c $, where $ a $, $ b $, and $ c $ are given integers, and $ x $ and $ y $ are unknowns to be solved for in the integers.[92] For the linear case, the equation $ ax + by = c $ has integer solutions if and only if the greatest common divisor $ d = \gcd(a, b) $ divides $ c $.[92] This condition arises from Bézout's identity, which states that there exist integers $ x' $ and $ y' $ such that $ ax' + by' = d $, allowing scaling to reach any multiple of $ d $.[92] If a particular solution $ (x_0, y_0) $ is found—often using the extended Euclidean algorithm—the general solution is given by
$$ x = x_0 + \frac{b}{d} t, \quad y = y_0 - \frac{a}{d} t $$
for any integer parameter $ t $.[92] This parametrization generates all integer solutions, highlighting the infinite nature of the solution set when it is non-empty.
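The solvability test and the general solution can be sketched with the extended Euclidean algorithm. The function names below are illustrative, and positive integers $ a $ and $ b $ are assumed so that the computed gcd is positive.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g, where g = gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def solve_linear_diophantine(a, b, c):
    """Return (x0, y0, d) with a*x0 + b*y0 == c, or None if d = gcd(a, b) does not divide c."""
    d, x, y = extended_gcd(a, b)
    if c % d != 0:
        return None
    scale = c // d  # scale the Bezout coefficients up from d to c
    return x * scale, y * scale, d


a, b, c = 6, 9, 15
x0, y0, d = solve_linear_diophantine(a, b, c)
# Every integer t gives another solution via the general formula.
for t in range(-2, 3):
    x, y = x0 + (b // d) * t, y0 - (a // d) * t
    print(x, y, a * x + b * y == c)

print(solve_linear_diophantine(6, 9, 4))  # None, since gcd(6, 9) = 3 does not divide 4
```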
A prominent nonlinear example is Fermat's Last Theorem, which asserts that there are no positive integers $ a $, $ b $, and $ c $ satisfying $ a^n + b^n = c^n $ for any integer $ n > 2 $. Stated by Pierre de Fermat around 1637, the theorem remained unproven for over 350 years. Andrew Wiles announced a proof in 1993; after a gap was found, he completed the argument in 1994 with the help of Richard Taylor, and the proof was published in 1995.[93] Wiles's approach established the modularity theorem for semistable elliptic curves: a counterexample to Fermat's equation would give rise to an elliptic curve that cannot be modular, so no counterexample can exist.

Another key Diophantine equation is Pell's equation, $ x^2 - d y^2 = 1 $, where $ d $ is a positive non-square integer and solutions $ (x, y) $ are sought in positive integers.[94] The solutions can be systematically found using the continued fraction expansion of $ \sqrt{d} $, as certain convergents of this expansion satisfy the equation.[94] Specifically, if the continued fraction period length is $ k $, the fundamental solution corresponds to the convergent at the end of the first period when $ k $ is even and at the end of the second period when $ k $ is odd, and all subsequent solutions are generated recursively from it using powers of the fundamental unit in the ring $ \mathbb{Z}[\sqrt{d}] $.[94] This method, originally developed by Joseph-Louis Lagrange, provides an efficient algorithm for computing solutions even for large $ d $.[94]

Diophantine equations have significant applications in cryptography, particularly in the RSA public-key cryptosystem, whose security relies on the computational difficulty of integer factorization, which can itself be posed as a Diophantine problem: find integers $ x, y > 1 $ with $ xy = n $.[95] Introduced by Rivest, Shamir, and Adleman in 1978, RSA uses the product $ n = pq $ of two large primes $ p $ and $ q $ as the modulus; factoring $ n $ to recover $ p $ and $ q $ is believed to be intractable for sufficiently large $ n $, which gives the trapdoor function the one-way property essential for encryption and digital signatures. This hardness underpins RSA's widespread use, as no efficient general classical algorithm is known for such factorizations despite extensive study.
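The continued-fraction method for Pell's equation can be sketched as follows. This is a minimal illustration rather than an optimized implementation: the names `pell_fundamental` and `next_solution` are hypothetical, and the loop simply walks through the convergents of $ \sqrt{d} $ until one satisfies $ x^2 - d y^2 = 1 $, so it does not need to track the period length explicitly.

```python
import math


def pell_fundamental(d):
    """Smallest positive (x, y) with x*x - d*y*y == 1, for a non-square d > 0."""
    a0 = math.isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    # State of the continued fraction expansion of sqrt(d).
    m, den, a = 0, 1, a0
    # Convergents h/k, starting from a0/1.
    h_prev, h = 1, a0
    k_prev, k = 0, 1
    while h * h - d * k * k != 1:
        m = den * a - m
        den = (d - m * m) // den
        a = (a0 + m) // den
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
    return h, k


def next_solution(d, fundamental, current):
    """Multiply (x + y*sqrt(d)) by the fundamental unit to get the next solution."""
    x1, y1 = fundamental
    x, y = current
    return x1 * x + d * y1 * y, x1 * y + y1 * x


print(pell_fundamental(2))                   # (3, 2)
print(pell_fundamental(7))                   # (8, 3)
print(next_solution(2, (3, 2), (3, 2)))      # (17, 12)
```

For $ d = 61 $ the same routine returns a ten-digit fundamental solution, which illustrates why a systematic method is preferable to trial and error.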

Algebraic and Transcendental Numbers

Algebraic numbers are complex numbers that are roots of a non-zero polynomial with rational coefficients.[96] For instance, $ \sqrt{2} $ is an algebraic number because it is a root of the equation $ x^2 - 2 = 0 $.[96] The set of all algebraic numbers forms an algebraically closed field, the algebraic closure of the rationals, and each individual algebraic number $ \alpha $ generates a finite-degree field extension $ \mathbb{Q}(\alpha) $ over $ \mathbb{Q} $, where the degree equals that of the minimal polynomial of $ \alpha $.[96]

Transcendental numbers are those complex numbers that are not algebraic, meaning they are not roots of any non-zero polynomial with rational coefficients.[97] Prominent examples include $ \pi $ and $ e $, the base of the natural logarithm.[97] The Lindemann–Weierstrass theorem provides key insights into their transcendence: if $ \alpha_1, \dots, \alpha_n $ are algebraic numbers that are linearly independent over $ \mathbb{Q} $, then $ e^{\alpha_1}, \dots, e^{\alpha_n} $ are algebraically independent over $ \mathbb{Q} $.[98] A special case implies that $ e^\alpha $ is transcendental for any nonzero algebraic $ \alpha $. This establishes the transcendence of $ e $ (take $ \alpha = 1 $) and of $ \pi $: if $ \pi $ were algebraic, then $ i\pi $ would be a nonzero algebraic number and $ e^{i\pi} = -1 $ would have to be transcendental, which it is not.[98]

The algebraic numbers form a countable set, as demonstrated by Georg Cantor in 1874; there are countably many polynomials with rational coefficients (since the rationals are countable and polynomials have finite degree), and each such polynomial has finitely many roots.[99] In contrast, the transcendental numbers are uncountable, following from the uncountability of the complex numbers and the countability of the algebraic numbers.[99]

Constructible numbers constitute a proper subfield of the algebraic numbers, consisting precisely of those real numbers obtainable from the rationals through a finite sequence of additions, subtractions, multiplications, divisions, and square roots, or equivalently via compass-and-straightedge constructions starting from a segment of unit length. Each such number lies in a field extension of $ \mathbb{Q} $ whose degree is a power of 2, reflecting the quadratic nature of straightedge-and-compass operations. This restriction explains the impossibility of certain classical constructions, such as doubling the cube: constructing a side length of $ \sqrt[3]{2} $ to double the volume of a unit cube requires adjoining $ \sqrt[3]{2} $ to $ \mathbb{Q} $, yielding an extension of degree 3 (the degree of the irreducible minimal polynomial $ x^3 - 2 $), and 3 does not divide any power of 2.[100]

Galois theory provides a profound application by classifying which algebraic numbers (roots of polynomials over $ \mathbb{Q} $) can be expressed using radicals; a polynomial is solvable by radicals if and only if its Galois group is solvable, linking the structural properties of field extensions to explicit expressibility via nested radicals.[101]
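The irreducibility of $ x^3 - 2 $ used in the doubling-the-cube argument can be checked with the rational root theorem, since a cubic with integer coefficients that has no rational root cannot factor over $ \mathbb{Q} $. The following is a minimal Python sketch of that check; the function names are illustrative, and it assumes integer coefficients with a nonzero leading coefficient and a nonzero constant term.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def evaluate(coeffs, x):
    """Horner evaluation; coeffs run from the highest degree to the constant."""
    result = Fraction(0)
    for c in coeffs:
        result = result * x + c
    return result

def rational_roots(coeffs):
    """Rational roots p/q of an integer polynomial (rational root theorem).

    Assumes coeffs[0] != 0 and coeffs[-1] != 0, so every rational root has
    p dividing the constant term and q dividing the leading coefficient.
    """
    lead, const = coeffs[0], coeffs[-1]
    candidates = {Fraction(sign * p, q)
                  for p in divisors(const)
                  for q in divisors(lead)
                  for sign in (1, -1)}
    return sorted(r for r in candidates if evaluate(coeffs, r) == 0)

print(rational_roots([1, 0, 0, -2]))  # x^3 - 2: no rational roots
print(rational_roots([2, -3, 1]))     # 2x^2 - 3x + 1 = (2x - 1)(x - 1)
```

An empty result for $ x^3 - 2 $ is consistent with the degree-3 extension described above. For polynomials of degree four or higher, the absence of rational roots alone would not establish irreducibility.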

Differential Equations

Ordinary Differential Equations

Ordinary differential equations (ODEs) are equations that relate a function of a single independent variable to its derivatives with respect to that variable. They are typically expressed in the form $ \frac{dy}{dx} = f(x, y) $ or higher-order equivalents, where $ y $ is the dependent variable and $ x $ is the independent variable. Unlike algebraic equations, ODEs describe dynamic systems where rates of change are involved, making them essential for modeling phenomena that evolve over time or space in one dimension.[102][103]

The order of an ODE is determined by the highest derivative present in the equation; for instance, first-order ODEs involve only the first derivative, while second-order ones include up to the second derivative. ODEs are classified as linear if the dependent variable and all its derivatives appear to the first power with no products or nonlinear functions of them, such as $ y'' + p(x)y' + q(x)y = g(x) $; otherwise, they are nonlinear. Linear ODEs are often easier to solve analytically due to the principle of superposition, which allows solutions of the homogeneous equation to be combined linearly.[104][34]

For first-order ODEs, separable equations take the form $ \frac{dy}{dx} = f(x)g(y) $, which can be solved by rearranging to $ \frac{dy}{g(y)} = f(x)\, dx $ and integrating both sides: $ \int \frac{dy}{g(y)} = \int f(x)\, dx + C $. Exact equations, written as $ M(x,y)\, dx + N(x,y)\, dy = 0 $, satisfy $ \frac{\partial M}{\partial y} = \frac{\partial N}{\partial x} $, in which case the left-hand side is the total differential of some function $ F(x,y) $ and the solutions are given implicitly by $ F(x,y) = C $. These methods provide explicit or implicit solutions for many practical first-order problems.[105][106]

Second-order linear homogeneous ODEs with constant coefficients have the standard form $ y'' + ay' + by = 0 $, where $ a $ and $ b $ are constants.
Solutions are found by assuming $ y = e^{rx} $, leading to the characteristic equation $ r^2 + ar + b = 0 $. The roots $ r_1, r_2 $ determine the general solution: if distinct real roots, $ y = c_1 e^{r_1 x} + c_2 e^{r_2 x} $; if repeated, $ y = (c_1 + c_2 x)e^{rx} $; if complex, $ y = e^{\alpha x}(c_1 \cos \beta x + c_2 \sin \beta x) $, where $ r = \alpha \pm i\beta $. This approach is foundational for solving vibrations and oscillations.[107]

Initial value problems (IVPs) for ODEs specify the solution and its derivatives at an initial point, such as $ y(x_0) = y_0 $ for first-order or additional conditions for higher order. The Picard–Lindelöf theorem guarantees existence and uniqueness of solutions for first-order IVPs $ y' = f(x,y) $, $ y(x_0) = y_0 $, if $ f $ is continuous in $ x $ and Lipschitz continuous in $ y $ on a suitable interval. This theorem underpins numerical and analytical reliability in solving IVPs.[108]

ODEs find widespread applications in physics and biology. Newton's second law, $ m \frac{d^2 x}{dt^2} = F $, formulates the motion of particles under forces, yielding second-order ODEs like the harmonic oscillator $ m x'' + kx = 0 $. In population dynamics, the Malthusian model $ \frac{dy}{dt} = ky $ describes exponential growth or decay, where $ y(t) $ is population size and $ k $ is the growth rate. These examples illustrate ODEs' role in capturing real-world rates of change.[109]
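The characteristic-equation recipe for $ y'' + ay' + by = 0 $ described above can be sketched in a few lines of Python. This is an illustrative helper rather than a standard library routine: the function name and the returned tuple layout are assumptions, and the exact comparison of the discriminant with zero is only reliable for inputs such as small integers that floating-point arithmetic represents exactly.

```python
import math

def classify_characteristic_roots(a, b):
    """Describe the roots of r^2 + a*r + b = 0 for y'' + a*y' + b*y = 0.

    Returns (case, p, q), where:
      "distinct real": p, q are the roots r1, r2
      "repeated":      p == q is the double root r
      "complex":       p, q are alpha, beta with r = alpha +/- i*beta
    """
    disc = a * a - 4 * b
    if disc > 0:
        s = math.sqrt(disc)
        return ("distinct real", (-a + s) / 2, (-a - s) / 2)
    if disc == 0:
        return ("repeated", -a / 2, -a / 2)
    return ("complex", -a / 2, math.sqrt(-disc) / 2)

# y'' - 3y' + 2y = 0 has roots 2 and 1, so y = c1*e^(2x) + c2*e^(x).
print(classify_characteristic_roots(-3, 2))
# Harmonic oscillator with m = 1, k = 4: y = c1*cos(2x) + c2*sin(2x).
print(classify_characteristic_roots(0, 4))
```

For the harmonic oscillator $ m x'' + kx = 0 $ with $ m = 1 $ and $ k = 4 $, the complex case gives $ \alpha = 0 $ and $ \beta = 2 $, which corresponds to undamped oscillation.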

Partial Differential Equations

Partial differential equations (PDEs) are equations involving a function of multiple independent variables and its partial derivatives of various orders. Unlike ordinary differential equations, which depend on a single independent variable, PDEs describe phenomena varying in space and time or other multidimensional contexts. A general form is $ F(x_1, \dots, x_n, u, \frac{\partial u}{\partial x_1}, \dots, \frac{\partial^2 u}{\partial x_i \partial x_j}, \dots) = 0 $, where $ u = u(x_1, \dots, x_n) $. First-order PDEs involve only first partial derivatives, such as the transport equation $ \frac{\partial u}{\partial t} + c \frac{\partial u}{\partial x} = 0 $, while higher-order PDEs include second or more derivatives.[110]

Second-order linear PDEs are classified based on their principal part, analogous to conic sections in geometry. For the equation $ a \frac{\partial^2 u}{\partial x^2} + 2b \frac{\partial^2 u}{\partial x \partial y} + c \frac{\partial^2 u}{\partial y^2} + d \frac{\partial u}{\partial x} + e \frac{\partial u}{\partial y} + f u = g $ in two variables, the type is determined by the discriminant $ \Delta = b^2 - ac $: if $ \Delta < 0 $, it is elliptic; if $ \Delta = 0 $, parabolic; if $ \Delta > 0 $, hyperbolic. This classification holds locally and influences solution behavior and stability. In higher dimensions, the classification generalizes using the eigenvalues of the coefficient matrix for the highest-order terms.[111][112]

Elliptic PDEs, such as Laplace's equation $ \nabla^2 u = 0 $, model steady-state problems where solutions are smooth and determined by boundary conditions without propagation of discontinuities. Parabolic PDEs, exemplified by the heat equation $ \frac{\partial u}{\partial t} = k \nabla^2 u $, describe diffusion processes with smoothing effects over time, where initial conditions evolve toward equilibrium.
Hyperbolic PDEs, like the wave equation $ \frac{\partial^2 u}{\partial t^2} = c^2 \nabla^2 u $, capture wave propagation, preserving sharp features such as shocks or fronts along characteristics. These canonical examples illustrate the distinct physical interpretations: elliptic for equilibrium, parabolic for dissipation, and hyperbolic for transport.[113][110]

Solution techniques for PDEs often begin with separation of variables, assuming $ u(x,t) = X(x) T(t) $, which reduces the PDE to ordinary differential equations solvable via standard methods. For boundary value problems on finite domains, Fourier series expansions represent solutions by decomposing into eigenfunctions satisfying the boundary conditions. This approach is particularly effective for linear PDEs with constant coefficients on rectangular or simple geometries.[110][112]

Boundary value problems specify conditions on the domain's boundary to ensure uniqueness. In Dirichlet problems, the function value $ u $ is prescribed on the boundary, as in potential theory for Laplace's equation. Neumann problems instead specify the normal derivative $ \frac{\partial u}{\partial n} $, relevant for flux conditions in heat flow. Mixed problems combine both, while initial-boundary value problems for time-dependent PDEs include initial data alongside spatial boundaries. Existence and uniqueness theorems, such as those from maximum principles for elliptic and parabolic cases, rely on the PDE type.[110][114]

PDEs underpin models in continuum mechanics and physics. The heat equation governs temperature distribution in conduction, with solutions revealing how thermal energy diffuses.
In fluid dynamics, the Navier–Stokes equations (a system of nonlinear PDEs comprising momentum and continuity equations) describe viscous incompressible flow: $ \frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u} \cdot \nabla) \mathbf{u} = -\frac{1}{\rho} \nabla p + \nu \nabla^2 \mathbf{u} $ and $ \nabla \cdot \mathbf{u} = 0 $, where $ \mathbf{u} $ is the velocity field, $ p $ the pressure, $ \rho $ the constant density, and $ \nu $ the kinematic viscosity. Challenges like turbulence arise from the nonlinear advection term $ (\mathbf{u} \cdot \nabla) \mathbf{u} $. These applications highlight PDEs' role in predicting real-world behaviors from fundamental conservation laws.[110][115]
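The discriminant test $ \Delta = b^2 - ac $ for second-order linear PDEs in two variables, described above, can be sketched as a small Python function. The function name is illustrative, and the sketch assumes constant coefficients $ a $, $ b $, $ c $ that are not all zero; for variable coefficients the same test would apply pointwise.

```python
def classify_second_order_pde(a, b, c):
    """Classify a*u_xx + 2*b*u_xy + c*u_yy + (lower-order terms) = g.

    Uses the discriminant b^2 - a*c; assumes a, b, c are not all zero.
    """
    disc = b * b - a * c
    if disc < 0:
        return "elliptic"
    if disc == 0:
        return "parabolic"
    return "hyperbolic"

# Laplace's equation u_xx + u_yy = 0: a = 1, b = 0, c = 1.
print(classify_second_order_pde(1, 0, 1))
# Heat equation k*u_xx - u_t = 0 in the variables (x, t) with k = 1:
# the only second-order term is u_xx, so a = 1, b = 0, c = 0.
print(classify_second_order_pde(1, 0, 0))
# Wave equation u_tt - u_xx = 0 in the variables (t, x), unit wave speed.
print(classify_second_order_pde(1, 0, -1))
```

The three calls reproduce the elliptic, parabolic, and hyperbolic labels attached above to the Laplace, heat, and wave equations.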

Advanced Types

Integral Equations

Integral equations are functional equations in which the unknown function appears within an integral, typically arising in the study of physical systems where the solution at a point depends on its values over an interval. A canonical form is the linear integral equation of the second kind,
$$ \phi(x) = f(x) + \lambda \int_a^b K(x, t) \phi(t) \, dt, $$
where $ \phi(x) $ is the unknown function, $ f(x) $ is a known forcing function, $ \lambda $ is a parameter, and $ K(x, t) $ is the kernel specifying the interaction between points $ x $ and $ t $.[116] This general structure was formalized by Ivar Fredholm in his seminal 1903 paper, which laid the foundations for the theory of such equations and their spectral properties.[117]

Integral equations are classified based on the limits of integration: Fredholm equations have fixed limits $ [a, b] $, making them suitable for steady-state problems, while Volterra equations feature a variable upper limit, such as $ \int_a^x K(x, t) \phi(t) \, dt $, which often model evolutionary processes and trace back to Vito Volterra's 1896 work on inverting definite integrals.[118] The equations are further categorized by the position of the unknown function. In first-kind integral equations, $ \phi $ appears solely inside the integral,
$$ \int_a^b K(x, t) \phi(t) \, dt = f(x), $$
posing challenges due to their ill-posed nature, as small perturbations in $ f $ can lead to large changes in $ \phi $; these often require regularization for numerical solution.[119] Second-kind equations include $ \phi $ both outside and inside the integral, as in the form above, and are generally well-posed under mild conditions on the kernel, such as continuity or square-integrability. Fredholm equations of the first kind frequently arise in inverse problems, like geophysical imaging, while Volterra first-kind forms appear in viscoelasticity models.[116]

Solution methods for second-kind equations exploit iterative expansions, notably the Neumann series, which assumes $ |\lambda| $ is small enough for convergence. The solution is expressed as
$$ \phi(x) = \sum_{n=0}^\infty \lambda^n K^n f(x), $$
where $ K^n $ denotes the $ n $-fold application of the integral operator $ K $, providing an exact representation when the series converges in appropriate norms, such as $ L^2[a, b] $. This method, extended from Carl Neumann's operator theory, was pivotal in Fredholm's analysis and applies directly to Fredholm equations with compact kernels.[117] For Volterra equations, differentiation under suitable smoothness assumptions converts the integral equation into an ordinary differential equation, facilitating analytical or numerical resolution; for instance, differentiating the second-kind Volterra form yields a first-order ODE with an integral term that can be integrated explicitly.[120]

Eigenvalue problems for integral equations consider the homogeneous case,
$$ \int_a^b K(x, t) \phi(t) \, dt = \lambda \phi(x), $$
where the values $ \lambda $ are eigenvalues and the functions $ \phi $ are eigenfunctions, forming the basis of Fredholm's spectral theory for compact self-adjoint operators on Hilbert spaces. The eigenvalues are real and countable, accumulating only at zero, with the eigenfunctions forming an orthogonal basis, enabling expansions akin to Fourier series for solving inhomogeneous problems.[117] These problems underpin stability analyses in operator theory and are central to numerical methods such as the Nyström approximation for computing spectra.

In applications, integral equations reformulate boundary value problems for partial differential equations into equivalent forms on the domain boundary, reducing dimensionality; for example, in potential theory, the Dirichlet problem for Laplace's equation leads to a Fredholm second-kind equation using single- and double-layer potentials, as detailed in boundary element methods for engineering simulations.[121] In quantum mechanics, the Lippmann–Schwinger equation, a Fredholm-type integral equation derived from the time-independent Schrödinger equation, describes scattering amplitudes for particles interacting via a potential, with the free-particle Green's function as the kernel; this formulation, introduced by Lippmann and Schwinger in 1950, facilitates perturbative solutions and exact treatments in one dimension.
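The Neumann series for a second-kind Fredholm equation, discussed above, can be approximated numerically by replacing the integral with a quadrature rule and summing the terms $ \lambda^n K^n f $ one at a time. The sketch below is illustrative only: the function name, the midpoint rule, and the test problem $ \phi(x) = x + \lambda \int_0^1 x t \, \phi(t) \, dt $ are choices made here, not taken from the cited sources. For this separable kernel the exact solution is $ \phi(x) = 3x / (3 - \lambda) $, so $ \lambda = 1 $ gives $ \phi(x) = 1.5x $, and the series converges for $ |\lambda| < 3 $.

```python
def neumann_partial_sum(kernel, f, lam, a, b, n_nodes=100, n_terms=40):
    """Approximate phi = sum_{n=0}^{n_terms} lam^n K^n f on midpoint nodes.

    The integral operator K is discretized with the midpoint rule on
    n_nodes equal subintervals of [a, b].
    """
    h = (b - a) / n_nodes
    xs = [a + (i + 0.5) * h for i in range(n_nodes)]  # midpoint nodes
    term = [f(x) for x in xs]                         # n = 0 term: f itself
    phi = term[:]
    for _ in range(n_terms):
        # Apply lam * K to the previous term to obtain the next term.
        term = [lam * h * sum(kernel(x, t) * v for t, v in zip(xs, term))
                for x in xs]
        phi = [p + q for p, q in zip(phi, term)]
    return xs, phi

# Test problem with kernel K(x, t) = x*t and forcing f(x) = x on [0, 1].
xs, phi = neumann_partial_sum(lambda x, t: x * t, lambda x: x,
                              lam=1.0, a=0.0, b=1.0)
max_error = max(abs(p - 1.5 * x) for x, p in zip(xs, phi))
print(max_error)  # small; dominated by the quadrature error
```

Each pass through the loop costs on the order of `n_nodes` squared kernel evaluations, so for larger problems a direct linear solve of the discretized system (the Nyström approach) is usually preferable to summing the series.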

Functional and Difference Equations

Functional equations are equations in which the unknowns are functions rather than numbers, often relating the values of a function at different points. A prominent example is Cauchy's functional equation, given by $ f(x + y) = f(x) + f(y) $ for all $ x, y $ in the domain, typically the real numbers $ \mathbb{R} $. Under the assumption of continuity, or other regularity conditions such as monotonicity or measurability, the solutions are linear functions of the form $ f(x) = kx $, where $ k = f(1) $ is a constant. Without such assumptions, and relying on the axiom of choice, there exist pathological solutions that are nonlinear and highly discontinuous, but these are not explicitly constructible.[122]

Solving functional equations often involves iterative methods or fixed-point analysis. For instance, applying Cauchy's equation repeatedly gives $ f(nx) = n f(x) $ for integer $ n $, which extends to rational multiples by additivity. Fixed points play a key role in more general solvability, where a fixed point satisfies $ f(x) = x $, and theorems like the Banach fixed-point theorem ensure unique solutions in complete metric spaces for contractive mappings derived from the equation.[123]

Another classic functional equation is d'Alembert's equation, $ f(x + y) + f(x - y) = 2 f(x) f(y) $, which arises in the study of wave propagation and trigonometric functions. Assuming continuity, the nonzero solutions are the cosine functions $ f(x) = \cos(ax) $ and the hyperbolic cosine functions $ f(x) = \cosh(ax) $ for some constant $ a $, with the constant solution $ f(x) = 1 $ arising when $ a = 0 $.
This equation connects to representations of groups and has been generalized to abstract settings like metabelian groups.[124]

Difference equations, also known as recurrence relations, describe discrete dynamical systems where the value of a function at one point determines its value at subsequent points. A basic form is the forward difference $ \Delta y_n = y_{n+1} - y_n = f(n, y_n) $, which models changes over discrete steps. Linear homogeneous difference equations take the form $ y_{n+k} + a_{k-1} y_{n+k-1} + \cdots + a_0 y_n = 0 $, solved by assuming solutions of the form $ y_n = r^n $ and deriving the characteristic equation $ r^k + a_{k-1} r^{k-1} + \cdots + a_0 = 0 $, whose roots determine the general solution as linear combinations of terms like $ n^m r^n $ for repeated roots.[125]

A well-known example is the Fibonacci recurrence, $ F_{n+1} = F_n + F_{n-1} $ with initial conditions $ F_0 = 0 $, $ F_1 = 1 $, which has the characteristic equation $ r^2 - r - 1 = 0 $ with roots $ \phi = \frac{1 + \sqrt{5}}{2} $ and $ \hat{\phi} = \frac{1 - \sqrt{5}}{2} $, yielding the closed-form Binet formula $ F_n = \frac{\phi^n - \hat{\phi}^n}{\sqrt{5}} $. For nonhomogeneous linear cases, such as $ y_{n+1} = a y_n + b $, a particular solution (e.g., constant if $ a \neq 1 $) is added to the homogeneous solution.[126]

Difference equations find applications in modeling discrete dynamical systems, such as population growth in stages or financial sequences, where stability is analyzed via eigenvalues of the companion matrix from the characteristic equation. In fractals, iterative difference equations generate complex structures; for the Mandelbrot set, points $ c \in \mathbb{C} $ are in the set if the iteration $ z_{n+1} = z_n^2 + c $ starting from $ z_0 = 0 $ remains bounded, revealing self-similar boundaries through repeated applications.
These discrete iterations serve as analogs to continuous ordinary differential equations but emphasize finite-step evolutions in fields like chaos theory.[127]
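The agreement between the Fibonacci recurrence and the Binet formula quoted above can be checked numerically. The following Python sketch uses illustrative function names and evaluates the closed form in floating point, so the rounded result is only trustworthy for moderate $ n $ (roughly below 70 in double precision); exact work for larger $ n $ would need integer or symbolic arithmetic.

```python
import math

def fib_recurrence(n):
    """F_n from F_{n+1} = F_n + F_{n-1} with F_0 = 0, F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def fib_binet(n):
    """F_n from the Binet formula, rounded to the nearest integer."""
    sqrt5 = math.sqrt(5)
    phi = (1 + sqrt5) / 2      # root of r^2 - r - 1 = 0
    phi_hat = (1 - sqrt5) / 2  # the other root
    return round((phi ** n - phi_hat ** n) / sqrt5)

print([fib_recurrence(n) for n in range(10)])
print(all(fib_recurrence(n) == fib_binet(n) for n in range(40)))
```

Because $ |\hat{\phi}| < 1 $, the second term in the formula shrinks quickly, which is why rounding $ \phi^n / \sqrt{5} $ alone already recovers $ F_n $ for moderate $ n $.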

References
