C++ | Least squares

Challenge introduction

This challenge provides valuable practice as it combines various mathematical concepts, data handling, and computational techniques. By requiring the use of the least squares method to fit a second-degree polynomial to a dataset, it encourages the application of linear algebra and matrix operations practically. Additionally, the task involves reading data from an external file, which introduces file handling and input/output operations. Moreover, dealing with potential singularity issues in the coefficient matrix further tests error handling skills. In summary, this problem encompasses a wide range of mathematical and programming skills in both mathematical and computational domains.

The problem is sourced from the programming languages subject from UNED and is available at the following link. I recommend attempting to solve the challenge independently before reviewing the provided solution.

Problem statement

The goal is to calculate, using the least squares method, the following second-degree polynomial:

    \(y = b_{0} + b_{1} · x + b_{2} · x^2\)

that best fits a dataset \((x_1, y_1), . . . , (x_n, y_n)\) stored in a text file. The pairs of values \((x_i, y_i)\) are real numbers. The coefficients of the polynomial \((b_0, b_1, b_2)\) are computed by solving the following system of three linear equations with three unknowns:

        \(\displaystyle b_0 · n + b_1 · \sum_{j=1}^{n}x_j + b_2 · \sum_{j=1}^{n}x^2_j = \sum_{j=1}^{n}y_j \)

    \(\displaystyle b_0 · \sum_{j=1}^{n}x_j + b_1 · \sum_{j=1}^{n}x^2_j + b_2 · \sum_{j=1}^{n}x^3_j = \sum_{j=1}^{n}x_j · y_j \)

    \(\displaystyle b_0 · \sum_{j=1}^{n}x^2_j + b_1 · \sum_{j=1}^{n}x^3_j + b_2 · \sum_{j=1}^{n}x^4_j = \sum_{j=1}^{n}x^2_j · y_j \)

The data file \((x_1, y_1), . . . , (x_n, y_n)\) has n rows and 2 columns. The i-th row contains \(x_i\) in the first column and \(y_i\) in the second. Write a C++ program that performs the following actions:

  1. Display a message on the console, asking the user to enter the name of the text file where the data is stored. Read the filename from the console.
  2. Open the file for reading. If an error occurs, terminate.
  3. Read the file, storing the values in two double-type vectors named \(vX\) and \(vY\).
  4. Solve the system of equations. To do this, calculate the inverse of the coefficient matrix of the system. If the matrix is singular, terminate.
  5. Display the adjusted polynomial on the console, with numeric values in scientific notation and 4 digits of precision.
  6. Terminate.

Solution

In approaching this problem, the emphasis has been on simplicity, leveraging the explicit structure of the problem rather than developing generalized functions for mathematical techniques that could be applied to other scenarios. This simplification is particularly evident in the functions defining the coefficient matrix and the constant vector, which directly translate the equations from the problem statement instead of pursuing broader, more abstract approaches. However, in certain operations, such as matrix inversion, functions suitable for matrices of any dimension have been employed, making use of recursive properties.

Concerning the technical aspects of solving this problem, the determinant method was selected for computing the inverse of the coefficient matrix. While this method may be less computationally efficient than alternatives like the Gauss-Jordan method for larger matrices, it performs well for our 2x2 or 3x3 matrices, resulting in only a minor loss of efficiency. Therefore, the determinant method was chosen due to its compatibility with a straightforward recursive implementation.


#include <iostream>
#include <fstream>
#include <vector>
#include <string>
#include <cmath>
#include <iomanip>
#include <map>
#include <stdexcept>
#include <sstream>

double sum_vector_to_nth_power(std::vector <double> input_vector, int power) {
    double sum_value = 0;
    std::vector<double>::iterator it;
    for (it = input_vector.begin(); it < input_vector.end(); it++) {
        sum_value += std::pow(*it, power);
    }
    return sum_value;
}

double dot_product_y_and_x_to_power(std::vector <double> x, std::vector <double> y, int x_power) {
    double dot_product = 0;
    std::vector<double>::iterator it_x = x.begin();
    std::vector<double>::iterator it_y = y.begin();
    for (int i = 0; i < x.size(); i++) {
        dot_product += *(it_y + i) * std::pow(*(it_x + i), x_power);
    }
    return dot_product;
}

int check_matrix_is_square(std::vector <std::vector <double>> input_matrix) {
    if (input_matrix.size() != input_matrix[0].size()) {
        std::stringstream ss;
        ss << "The input matrix is not square";
        throw std::invalid_argument(ss.str());
    }
    return 1;
}

// We create a function to generate the coefficients matrix
//     | Σ(x_i)^0   Σ(x_i)^1   Σ(x_i)^2 |
//     | Σ(x_i)^1   Σ(x_i)^2   Σ(x_i)^3 |
//     | Σ(x_i)^2   Σ(x_i)^3   Σ(x_i)^4 |
std::vector <std::vector <double>> calculate_coefficients_matrix(std::vector <double> vX) {
    std::vector <std::vector <double>>  coefficients_matrix(3, std::vector<double>(3));
    for (int i = 0; i < 3; i++) {
        for (int j = 0; j < 3; j++) {
            coefficients_matrix[i][j] = sum_vector_to_nth_power(vX, i + j);
        }
    }
    return coefficients_matrix;
}

// We create a function to generate the vector of constants
//     | Σ[(x_i)^0·(y_i)] |
//     | Σ[(x_i)^1·(y_i)] |
//     | Σ[(x_i)^2·(y_i)] |
std::vector <double> calculate_constants_vector(std::vector <double> vX, std::vector <double> vY) {
    std::vector <double>  constants_matrix(3);
    for (int i = 0; i < 3; i++) {
        constants_matrix[i] = dot_product_y_and_x_to_power(vX, vY, i);
    }
    return constants_matrix;
}

std::vector <std::vector <double>> extract_minor(std::vector <std::vector <double>> matrix,
                                                 int row, int column)
throw (std::invalid_argument) {
    try {
        check_matrix_is_square(matrix);
    }
    catch (std::invalid_argument& exception) {
        std::stringstream ss;
        ss << "The matrix is not square, minors cannot be extracted";
        throw std::invalid_argument(ss.str());
    }
    double size = matrix.size();
    std::vector <std::vector <double>>  minor(size - 1, std::vector<double>(size - 1));
    int ref_row = 0, ref_col = 0;
    for (int i = 0; i < size; i++) {
        if (i != row) {
            for (int j = 0; j < size; j++) {
                if (j != column) {
                    minor[ref_row][ref_col] = matrix[i][j];
                    ref_col++;
                }
            }
            ref_col = 0;
            ref_row++;
        }                
    }
    return minor;
}

double calculate_determinant(std::vector <std::vector <double>> matrix) 
throw (std::invalid_argument) {
    try {
        check_matrix_is_square(matrix);
    }
    catch (std::invalid_argument& exception) {
        std::stringstream ss;
        ss << "The matrix is not square, the determinant cannot be calculated";
        throw std::invalid_argument(ss.str());
    }
    double determinant = 0;
    int size = matrix.size();
    if (size == 1) {
        determinant = matrix[0][0];
        return determinant;
    }
    std::vector <std::vector <double>> minor;
    int sign = 1;
    for (int row = 0; row < size; row++) {
        minor = extract_minor(matrix, row, 0);
        determinant += sign * matrix[row][0] * calculate_determinant(minor);
        sign = -sign;
    }
    return determinant;
}

std::vector <std::vector <double>> calculate_adjoint(std::vector <std::vector <double>> matrix) 
throw (std::invalid_argument) {
    try {
        check_matrix_is_square(matrix);
    }
    catch (std::invalid_argument& exception) {
        std::stringstream ss;
        ss << "The matrix is not square, its adjoint cannot be calculated";
        throw std::invalid_argument(ss.str());
    }
    int size = matrix.size();
    std::vector <std::vector <double>> adjoint(size, std::vector<double>(size));
    if (size == 1) {
        adjoint[0][0] = 1;
        return adjoint;
    }
    int sign = 1;
    std::vector <std::vector <double>> minor;
    for (int row = 0; row < size; row++) {
        for (int col = 0; col < size; col++) {
            sign = ((row + col) % 2 == 0) ? 1 : -1;
            minor = extract_minor(matrix, row, col);
            // Swap row and col references to transpose the array
            adjoint[col][row] = sign * calculate_determinant(minor);
        }
    }
    return adjoint;
}

std::vector <std::vector <double>> calculate_inverse(std::vector <std::vector <double>> matrix)
throw (std::invalid_argument) {
    try {
        check_matrix_is_square(matrix);
    }
    catch (std::invalid_argument& exception) {
        std::stringstream ss;
        ss << "The matrix is not square, it cannot be inverted";
        throw std::invalid_argument(ss.str());
    }
    double determinant = calculate_determinant(matrix);
    if (determinant == 0) {
        std::stringstream ss;
        ss << "The matrix is singular (zero determinant), its inverse cannot be calculated";
        throw std::invalid_argument(ss.str());
    }
    int size = matrix.size();
    std::vector <std::vector <double>> adjoint = calculate_adjoint(matrix);
    std::vector <std::vector <double>> inverse(size, std::vector<double>(size));
    for (int row = 0; row < size; row++) {
        for (int col = 0; col < size; col++) {
            inverse[row][col] = adjoint[row][col] / determinant;
        }
    }
    return inverse;
}

// We obtain the fitted polinomial as X^(-1)Y 
// Where X^(-1) is the inverse of the coefficients matrix and Y is the constants vector
std::vector <double> get_polinomial_fit(std::vector <std::vector <double>> coefficients_inverse, 
                                        std::vector <double> constants_vector)
throw (std::invalid_argument) {
    if (coefficients_inverse[0].size() != constants_vector.size()) {
        std::stringstream ss;
        ss << "Input shapes are not compatible with multiplication";
        throw std::invalid_argument(ss.str());
    }
    std::vector <double> polynomial_coefficients;
    double coefficient;
    int n_rows = coefficients_inverse.size(), n_cols = coefficients_inverse[0].size();
    for (int row = 0; row < n_rows; row++) {
        coefficient = 0;
        for (int col = 0; col < n_cols; col++) {
            coefficient += coefficients_inverse[row][col] * constants_vector[col];
        }
        polynomial_coefficients.push_back(coefficient);
    }
    return polynomial_coefficients;
}

void print_polynomial(std::vector <double> polynomial_coefficients) {
    std::map<int, std::string> x_text;
    x_text[0] = ""; x_text[1] = "x"; x_text[2] = "x^2";
    std::string add_sign = "", subtract_sign = "- ";
    std::cout << "\nFitted polynomial:" << std::endl;
    std::cout << std::scientific << std::setprecision(4) << "y = ";
    for (int i = 0; i < polynomial_coefficients.size(); i++) {
        double coefficient = polynomial_coefficients[i];
        if (coefficient ≥ 0) {
            std::cout << add_sign << coefficient << x_text[i];
        }
        else {
            std::cout << subtract_sign << -coefficient << x_text[i];
        }
        add_sign = " + ";
        subtract_sign = " - ";
    }
    std::cout << std::endl;
}

int main() {
    std::string file_name;
    std::cout << "Name of the file to be uploaded: ";
    std::getline(std::cin, file_name);
    std::ifstream f(file_name, std::ios::in);
    if (!f) {
        std::cout << "An error occurred while opening the file" << std::endl;
        return 1;
    }
    double x_i, y_i;
    std::vector <double> vX, vY;
    while (f >> x_i >> y_i) {
        vX.push_back(x_i);
        vY.push_back(y_i);
    }
    std::vector <std::vector <double>> coefficients_matrix;
    coefficients_matrix  = calculate_coefficients_matrix(vX);
    try {
        std::vector <std::vector <double>> coefficients_inverse;
        coefficients_inverse = calculate_inverse(coefficients_matrix);
        std::vector <double> constants_vector = calculate_constants_vector(vX, vY);
        std::vector <double> polynomial_coefficients;
        polynomial_coefficients = get_polinomial_fit(coefficients_inverse, constants_vector);
        print_polynomial(polynomial_coefficients);
    }
    catch (std::invalid_argument& exception) {
        std::cout << exception.what() << std::endl;
        return -1;
    }
    return 0;
}