Consider this problem: A list of distinctive integers in random order is provided. The integer is in the range of $$1$$ to $$N$$ and there are $$N-1$$ integers provided, i.e. one integer in $$[1,N]$$ is not in the provided list. Find that number in $$O(1)$$ space and $$O(N)$$ time.

Simply, we sum up the numbers in the list and the difference between the sum and $$\sum_{k=1}^N k$$ is the number in concern.

But what happen if there are two numbers missing? Obviously, we can use the same trick, for the list of integers be denoted by $$n_k$$, we find

\begin{align} \sum_{k=1}^N k - \sum_{k=1}^{N-2} n_k &= s \\ \prod_{k=1}^N k - \prod_{k=1}^{N-2} n_k &= p \end{align}

Then we have the two missing numbers, $$x$$ and $$y$$, to satisfy

\begin{align} x + y &= s \\ xy &= p \end{align}

This approach works but $$\prod_k n_k$$ can be a vastly large number. Can we have another simultaneous equations to find $$x$$ and $$y$$ without overflowing an integer variable?

I have considered the following

$\begin{gather} \operatorname*{XOR}_{k=1}^{N-2} n_k \\ \sum_{k=1}^{N-2} \operatorname{NOT}(n_k) \\ \sum_{k=1}^{N-2} \operatorname{popcount}(n_k) \\ \sum_{k=1}^{N-2} \operatorname{reverse}(n_k) \end{gather}$

and some others. However, I find one test to prove if any of these constructs can help you solve the two unknowns. Consider $$x$$ and $$y$$ are different at bit $$m$$, and we define $$x'$$ and $$y'$$ to be same as $$x$$ and $$y$$ except their bit $$m$$ are flipped. Then we should have a function that

$f(x,y) \ne f(x',y')$

or otherwise we cannot distinguish the solution $$x$$, $$y$$ from $$x'$$, $$y'$$. Obviously the list above does not pass this test. That’s why we need multiplication involved: We can solve for $$x$$ and $$y$$ in the equations below

\begin{align} x + y &= s \\ x^2 + y^2 &= p \end{align}

It can also be done using a non-linear injective function other than square in the second equation, such as logarithms. However, logarithm produce floating point numbers and that may introduce noise. This is not a good solution in implementation perspective because, $$\sum_k n_k^2$$ increases quicker than linear and it has a chance to overflow the variable. But this is the best solution I get so far.