SQL | Print prime numbers

Challenge introduction

Initially, this challenge may appear to be an unconventional application of SQL, as other technologies are generally better suited for numerical analysis tasks of this nature. However, upon closer examination, it becomes evident that this problem serves as a valuable opportunity to gain proficiency in two powerful techniques: correlated queries and recursion.

The problem is sourced from HackerRank and is available at the following link. I recommend attempting to solve the challenge independently before reviewing the provided solution.

Problem statement

Write a query to print all prime numbers less than or equal to 1,000. Print your result on a single line, and use the ampersand (&) character as your separator (instead of a space).

For example, the output for all prime numbers ≤ 10 would be: 2&3&5&7

Solution

In contrast to typical SQL queries, this particular task presents a unique challenge in that we don't have an existing table from which to extract data. Instead, we must commence from scratch. Consequently, we can break down the task into two distinct steps: first, generating a sequence of numbers ranging from 2 to 1000, and second, scrutinizing each number to ascertain whether it is prime.

To generate this numerical sequence, we can employ the GENERATE_SERIES function from T-SQL, implementing it within a separate Common Table Expression (CTE). Subsequently, we can employ a correlated subquery to process the values one by one, which is essential in this case. The reason behind this sequential processing is the intricacy of checking whether a given number is prime, which necessitates examining its divisibility by smaller numbers.

This approach allows us to systematically assess each number and determine its primality efficiently.


-- Generate auxiliary CTEs
WITH sequence AS (
        SELECT value AS seq_val
        FROM GENERATE_SERIES(3, 1000)),
    primes AS (
        SELECT 2 AS prime_nums
        UNION ALL
        SELECT seq_val AS prime_nums
        FROM sequence
        -- Only prime numbers less than half the seq_val can be its multiples
        WHERE NOT EXISTS (
            SELECT value
            FROM GENERATE_SERIES(2, CAST(CEILING(seq_val / 2.0) AS INT))
            WHERE seq_val % value = 0))

-- Main query
SELECT STRING_AGG(prime_nums, '&') 
FROM primes;
				

An alternative solution to this problem involves employing recursion instead of relying on the GENERATE_SERIES function. The recursive procedure entails iteratively generating the sequence of numbers until reaching the value of 1000.

However, it's worth noting that, in this instance, this approach exhibits significantly slower performance, with an execution time approximately four times longer compared to the previous method utilizing the correlated subquery.


-- Generate a single auxiliary CTE (recursive)
WITH primes AS (
        SELECT 2 AS seq_val, 1 AS prime
        UNION ALL
        SELECT seq_val + 1,
        -- Only prime numbers less than half the seq_val can be its multiples
        CASE WHEN NOT EXISTS (
            SELECT value
            FROM GENERATE_SERIES(2, CAST(CEILING((seq_val +1) / 2.0) AS INT))
            WHERE (seq_val + 1) % value = 0) THEN 1
                ELSE 0 END
        FROM primes
        WHERE seq_val < 1000)

-- Main query
SELECT STRING_AGG(seq_val, '&') 
FROM primes
WHERE prime = 1
OPTION(MAXRECURSION 1000);