Concatenating integers and floating point instability

November 07, 2022

Suppose we have two integers, a and b, their concatenation may be simply calculated by:

def concat_simple(a, b):
    return int(str(a) + str(b))

This implementation is correct, but converting integers to strings and back to an integer is slow. It would be faster if we could somehow not perform any string conversions.

Mathematically, for $b \neq 0$, the following is true: $$ concat(a, b) = a \times 10 ^ {\lfloor log_{10}b \rfloor + 1} + b $$

Translating the above into Python:

from math import log

def concat(a, b):
    if b == 0:
        return a * 10
    upper = a * 10**int(log(b, 10) + 1)
    return upper + b

>>> concat(1234, 5678)
12345678

>>> concat(100, 1000)
101000    # Incorrect!

Our concat function seems correct, but concat(100, 1000) returns 101000 rather than 1001000 — why? The reason is due to the limited precision of floating point numbers. In this case, log(1000, 10) is 2.9999999999999996, so int(log(1000, 10) + 1) returns 3 rather than 4. To handle the imprecision, we can check if upper <= a * b, and if so, multiply it by 10 to get the correct answer.

def concat(a, b):
    if b == 0:
        return a * 10
    upper = a * 10**int(log(b, 10) + 1)
    if upper <= a * b:
        upper *= 10
    return upper + b

Now concat(100, 1000) == 1001000.