Polynomial Least Squares Data Fitting
Compute the least-squares polynomial approximation of degree up to 50 for discrete data points, continuous functions, or uploaded files. Uses Householder QR decomposition and Gauss–Legendre quadrature for numerical stability.
The Least Squares Problem
Given data points \((x_i, y_i)\) or a function \(f\) on \([a,b]\), the polynomial least squares problem seeks the polynomial \(p^*\) of degree \(\le n\) that minimises the distance to the data. Unlike interpolation, which forces \(p(x_i)=y_i\) exactly, least squares allows the polynomial to approximate the data in the best possible way under a chosen norm — essential when the data is noisy, over-determined, or when a lower-degree model is preferred.
Discrete Least Squares (Points and Files)
Given \(m\) data points with \(m \ge n+1\), write \(p(x)=\sum_{j=0}^n a_j x^j\). The residual vector is \(r = Va - y\) where \(V\) is the \(m\times(n+1)\) Vandermonde matrix \(V_{ij}=x_i^j\). Minimising \(\|Va-y\|_2^2\) yields the normal equations \(V^T V a = V^T y\), but forming and solving these is numerically hazardous because the condition number of \(V^T V\) is the square of that of \(V\).
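The squaring of the condition number can be checked numerically. The sketch below (illustrative, not the tool's code) builds a modest Vandermonde matrix and compares the conditioning of \(V\) and \(V^T V\):

```python
import numpy as np

# Illustrative sketch: the condition number of V^T V is roughly the
# square of that of V, which is why solving the normal equations loses
# about twice as many digits as working with V directly.
x = np.linspace(0.0, 1.0, 40)          # 40 sample points on [0, 1]
V = np.vander(x, 9, increasing=True)   # Vandermonde matrix, degree 8

kV = np.linalg.cond(V)
kN = np.linalg.cond(V.T @ V)
# kN is approximately kV**2, so a V with condition number 1e8 already
# pushes the normal equations to the limit of double precision.
```

In double precision this means the normal equations become unusable at roughly half the polynomial degree that a direct orthogonal factorisation can handle.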
Instead, this tool solves the overdetermined system directly via Householder QR decomposition: write \(V = QR\) where \(Q\) has orthonormal columns and \(R\) is upper triangular. The least-squares solution satisfies \(Ra = Q^T y\), solved by back-substitution. This approach is numerically stable for high degrees and irregularly spaced points, avoiding the ill-conditioning of the normal equations entirely.
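A minimal sketch of this QR route, using NumPy (whose `np.linalg.qr` applies Householder reflections via LAPACK; the function name and API below are illustrative, not the tool's actual code):

```python
import numpy as np

# Minimal sketch of least squares via QR, as described above.
def fit_poly_qr(x, y, n):
    """Coefficients a_0..a_n of the least-squares p(x) = sum_j a_j x^j."""
    V = np.vander(np.asarray(x, dtype=float), n + 1, increasing=True)
    Q, R = np.linalg.qr(V)             # thin QR: V = Q R, R upper triangular
    # Solve R a = Q^T y; a generic solve stands in for an explicit
    # back-substitution here for brevity.
    return np.linalg.solve(R, Q.T @ np.asarray(y, dtype=float))

x = np.linspace(-1.0, 1.0, 9)
a = fit_poly_qr(x, 2 - 3 * x + 0.5 * x**2, 2)   # recovers [2, -3, 0.5]
```

On exact quadratic data the fit reproduces the coefficients to machine precision, since the residual is zero.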
Continuous Least Squares (Function on \([a,b]\))
For a function \(f\in L^2([a,b])\), the best degree-\(n\) polynomial approximation is its orthogonal projection onto \(\mathcal{P}_n\) with respect to the \(L^2([a,b])\) inner product. After mapping \([a,b]\) to \([-1,1]\) by \(x=\tfrac{b-a}{2}t+\tfrac{a+b}{2}\), the projection is
\[ p^*(t)=\sum_{k=0}^{n} c_k\,P_k(t), \qquad c_k=\frac{2k+1}{2}\int_{-1}^{1} f\!\Bigl(\tfrac{b-a}{2}t+\tfrac{a+b}{2}\Bigr)\,P_k(t)\,dt, \]
where \(P_k\) are the Legendre polynomials, an orthogonal basis for \(L^2([-1,1])\) satisfying \(\int_{-1}^1 P_j P_k\,dt = \tfrac{2}{2k+1}\delta_{jk}\). The coefficients \(c_k\) are computed by high-order composite Gauss–Legendre quadrature (7-point rule, up to 300+ panels), so no matrix formation or normal equations are needed at all. The Legendre coefficients are then converted to the monomial basis for display and evaluation. This approach is extremely stable for degrees up to 50.
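The projection can be sketched in a few lines with NumPy's Legendre utilities. This is a simplification of the approach above: a single high-order Gauss–Legendre rule replaces the tool's 7-point composite rule, and the function name is illustrative:

```python
import numpy as np

# Sketch of continuous least squares by Legendre projection.
def legendre_coeffs(f, a, b, n, quad_order=64):
    t, w = np.polynomial.legendre.leggauss(quad_order)  # nodes/weights on [-1,1]
    x = 0.5 * (b - a) * t + 0.5 * (b + a)               # map nodes to [a, b]
    fx = f(x)
    c = np.empty(n + 1)
    for k in range(n + 1):
        Pk = np.polynomial.legendre.Legendre.basis(k)(t)
        c[k] = (2 * k + 1) / 2 * np.sum(w * fx * Pk)    # c_k = (2k+1)/2 ∫ f P_k dt
    return c

# Example: x^2 = (1/3) P_0 + (2/3) P_2 on [-1, 1]
c = legendre_coeffs(lambda x: x**2, -1.0, 1.0, 2)       # ≈ [1/3, 0, 2/3]
```

Because the Legendre basis is orthogonal, each \(c_k\) is an independent one-dimensional integral: no linear system is formed, so there is no conditioning problem to manage.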
Error Norms (Continuous Mode)
When the true function \(f\) is known, three error norms are displayed:
- \(L^2\) error: \(\|f-p\|_{L^2}=\bigl(\int_a^b(f-p)^2\bigr)^{1/2}\). The relative version \(\|f-p\|/\|f\|\) measures the fractional approximation error.
- \(L^1\) error: \(\|f-p\|_{L^1}=\int_a^b|f-p|\). Less sensitive to isolated spikes than \(L^2\).
- \(L^\infty\) error: \(\|f-p\|_{L^\infty}=\max_{x\in[a,b]}|f(x)-p(x)|\). The pointwise worst-case error, estimated over 2000 sample points.
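The three norms above can be estimated on a uniform sample grid. In this sketch, composite trapezoid sums stand in for the tool's quadrature, and only the \(L^\infty\) estimate over a fixed number of samples matches the description above exactly:

```python
import numpy as np

# Sketch of the three error norms on a uniform grid of sample points.
def error_norms(f, p, a, b, samples=2000):
    x = np.linspace(a, b, samples)
    r = np.abs(f(x) - p(x))                              # pointwise residual
    h = (b - a) / (samples - 1)
    trap = lambda v: h * (v.sum() - 0.5 * (v[0] + v[-1]))  # trapezoid rule
    return {"L2": np.sqrt(trap(r**2)), "L1": trap(r), "Linf": r.max()}
```

For example, with \(f=\sin\) and \(p=0\) on \([0,\pi]\), this returns \(L^1\approx 2\), \(L^2\approx\sqrt{\pi/2}\), and \(L^\infty\approx 1\), matching the exact values.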
Export Formats
After fitting, the polynomial can be exported in four formats:
LaTeX (for typesetting), a self-contained Python function (def p(x)), a NumPy polynomial (via np.poly1d), or MATLAB (via polyval).
Use the format selector and Copy button in the export panel.
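Two of these export targets can be illustrated directly, using a hypothetical fit \(p(x)=2-3x+\tfrac12 x^2\) (the exact text the tool emits may differ; note that np.poly1d expects coefficients in highest-degree-first order, the reverse of the \(a_0,\dots,a_n\) convention used above):

```python
import numpy as np

# Hypothetical fitted coefficients a_0..a_n for p(x) = 2 - 3x + 0.5x^2
coeffs = [2.0, -3.0, 0.5]

# NumPy polynomial export: reverse to highest-degree-first for poly1d
p_np = np.poly1d(coeffs[::-1])

# Self-contained Python function export, built as source text
src = "def p(x):\n    return " + " + ".join(
    f"({a!r}) * x**{j}" for j, a in enumerate(coeffs))
```

Both forms evaluate identically; the source-text form is convenient for pasting into scripts that should not depend on NumPy.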
How to Use This Tool
- Discrete points: Enter \(x\) and \(y\) as comma-separated values. Math expressions such as pi/2 or exp(1) are accepted.
- From function: Enter a function expression \(f(x)\), an interval \([a,b]\), and choose the polynomial degree. The orthogonal projection is computed automatically.
- From file: Upload a CSV, TXT, XLSX, or ODS file, then select which columns contain the \(x\) and \(y\) data.
- Set the degree \(n\) (0–50). Lower degrees give smoother fits; higher degrees capture more structure but may oscillate.
- Click Fit Polynomial to compute. The graph, polynomial expression, and (in function mode) error norms are displayed immediately.
- Evaluate the fitted polynomial at arbitrary points using the Evaluate field below the graph.
- Export the polynomial using the format selector and Copy button.