Fast base58 codec: (#4327)

This algorithm is about an order of magnitude faster than the existing
algorithm (about 10x faster for encoding and about 15x faster for
decoding - including the double hash for the checksum). The algorithms
use gcc's int128 (fast MS version will have to wait, in the meantime MS
falls back to the slow code).
This commit is contained in:
Scott Determan
2024-03-05 15:23:27 -05:00
committed by tequ
parent fd36796bbd
commit c8373de952
6 changed files with 1299 additions and 22 deletions

View File

@@ -16,11 +16,25 @@
OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
*/
//==============================================================================
//
/* The base58 encoding & decoding routines in the b58_ref namespace are taken
* from Bitcoin but have been modified from the original.
*
* Copyright (c) 2014 The Bitcoin Core developers
* Distributed under the MIT software license, see the accompanying
* file COPYING or http://www.opensource.org/licenses/mit-license.php.
*/
#include <ripple/protocol/tokens.h>
#include <ripple/basics/safe_cast.h>
#include <ripple/protocol/digest.h>
#include <ripple/protocol/tokens.h>
#include <ripple/protocol/impl/b58_utils.h>
#include <boost/container/small_vector.hpp>
#include <boost/endian.hpp>
#include <boost/endian/conversion.hpp>
#include <cassert>
#include <cstring>
#include <memory>
@@ -28,6 +42,97 @@
#include <utility>
#include <vector>
/*
Converting between bases is straight forward. First, some background:
Given the coefficients C[m], ... ,C[0] and base B, those coefficients represent
the number C[m]*B^m + ... + C[0]*B^0; The following pseudo-code converts the
coefficients to the (infinite precision) integer N:
```
N = 0;
i = m ;; N.B. m is the index of the largest coefficient
while (i>=0)
N = N + C[i]*B^i
i = i - 1
```
For example, in base 10, the number 437 represents the integer 4*10^2 + 3*10^1 +
7*10^0. In base 16, 437 is the same as 4*16^2 + 3*16^1 + 7*16^0.
To find the coefficients that represent the integer N in base B, we start by
computing the lowest order coefficients and work up to the highest order
coefficients. The following pseudo-code converts the (infinite precision)
integer N to the correct coefficients:
```
i = 0
while(N)
C[i] = N mod B
N = floor(N/B)
i = i + 1
```
For example, to find the coefficients of the integer 437 in base 10:
C[0] is 437 mod 10; C[0] = 7;
N is floor(437/10); N = 43;
C[1] is 43 mod 10; C[1] = 3;
N is floor(43/10); N = 4;
C[2] is 4 mod 10; C[2] = 4;
N is floor(4/10); N = 0;
Since N is 0, the algorithm stops.
To convert between a number represented with coefficients from base B1 to that
same number represented with coefficients from base B2, we can use the algorithm
that converts coefficients from base B1 to an integer, and then use the
algorithm that converts a number to coefficients from base B2.
There is a useful shortcut that can be used if one of the bases is a power of
the other base. If B1 == B2^G, then each coefficient from base B1 can be
converted to base B2 independently to create a a group of "G" B2 coefficient.
These coefficients can be simply concatenated together. Since 16 == 2^4, this
property is what makes base 16 useful when dealing with binary numbers. For
example consider converting the base 16 number "93" to binary. The base 16
coefficient 9 is represented in base 2 with the coefficients 1,0,0,1. The base
16 coefficient 3 is represented in base 2 with the coefficients 0,0,1,1. To get
the final answer, just concatenate those two independent conversions together.
The base 16 number "93" is the binary number "10010011".
The original (now reference) algorithm to convert from base 58 to a binary
number used the
```
N = 0;
for i in m to 0 inclusive
N = N + C[i]*B^i
```
algorithm.
However, the algorithm above is pseudo-code. In particular, the variable "N" is
an infinite precision integer in that pseudo-code. Real computers do
computations on registers, and these registers have limited length. Modern
computers use 64-bit general purpose registers, and can multiply two 64 bit
numbers and obtain a 128 bit result (in two registers).
The original algorithm in essence converted from base 58 to base 256 (base
2^8). The new, faster algorithm converts from base 58 to base 58^10 (this is
fast using the shortcut described above), then from base 58^10 to base 2^64
(this is slow, and requires multi-precision arithmetic), and then from base 2^64
to base 2^8 (this is fast, using the shortcut described above). Base 58^10 is
chosen because it is the largest power of 58 that will fit into a 64-bit
register.
While it may seem counter-intuitive that converting from base 58 -> base 58^10
-> base 2^64 -> base 2^8 is faster than directly converting from base 58 -> base
2^8, it is actually 10x-15x faster. The reason for the speed increase is two of
the conversions are trivial (converting between bases where one base is a power
of another base), and doing the multi-precision computations with larger
coefficients sizes greatly speeds up the multi-precision computations.
*/
namespace ripple {
static constexpr char const* alphabetForward =
@@ -86,16 +191,31 @@ checksum(void* out, void const* message, std::size_t size)
std::memcpy(out, h.data(), 4);
}
[[nodiscard]] std::string
encodeBase58Token(TokenType type, void const* token, std::size_t size)
{
#ifndef _MSC_VER
return b58_fast::encodeBase58Token(type, token, size);
#else
return b58_ref::encodeBase58Token(type, token, size);
#endif
}
[[nodiscard]] std::string
decodeBase58Token(std::string const& s, TokenType type)
{
#ifndef _MSC_VER
return b58_fast::decodeBase58Token(s, type);
#else
return b58_ref::decodeBase58Token(s, type);
#endif
}
namespace b58_ref {
namespace detail {
/* The base58 encoding & decoding routines in this namespace are taken from
* Bitcoin but have been modified from the original.
*
* Copyright (c) 2014 The Bitcoin Core developers
* Distributed under the MIT software license, see the accompanying
* file COPYING or http://www.opensource.org/licenses/mit-license.php.
*/
static std::string
std::string
encodeBase58(
void const* message,
std::size_t size,
@@ -146,7 +266,7 @@ encodeBase58(
return str;
}
static std::string
std::string
decodeBase58(std::string const& s)
{
auto psz = reinterpret_cast<unsigned char const*>(s.c_str());
@@ -246,5 +366,367 @@ decodeBase58Token(std::string const& s, TokenType type)
// Skip the leading type byte and the trailing checksum.
return ret.substr(1, ret.size() - 1 - guard.size());
}
} // namespace b58_ref
#ifndef _MSC_VER
// The algorithms use gcc's int128 (fast MS version will have to wait, in the
// meantime MS falls back to the slower reference implementation)
namespace b58_fast {
namespace detail {
// Note: both the input and output will be BIG ENDIAN
B58Result<std::span<std::uint8_t>>
b256_to_b58_be(std::span<std::uint8_t const> input, std::span<std::uint8_t> out)
{
// Max valid input is 38 bytes:
// (33 bytes for nodepublic + 1 byte token + 4 bytes checksum)
if (input.size() > 38)
{
return Unexpected(TokenCodecErrc::inputTooLarge);
};
auto count_leading_zeros =
[](std::span<std::uint8_t const> const& col) -> std::size_t {
std::size_t count = 0;
for (auto const& c : col)
{
if (c != 0)
{
return count;
}
count += 1;
}
return count;
};
auto const input_zeros = count_leading_zeros(input);
input = input.subspan(input_zeros);
// Allocate enough base 2^64 coeff for encoding 38 bytes
// log(2^(38*8),2^64)) ~= 4.75. So 5 coeff are enough
std::array<std::uint64_t, 5> base_2_64_coeff_buf{};
std::span<std::uint64_t> const base_2_64_coeff =
[&]() -> std::span<std::uint64_t> {
// convert input from big endian to native u64, lowest coeff first
std::size_t num_coeff = 0;
for (int i = 0; i < base_2_64_coeff_buf.size(); ++i)
{
if (i * 8 >= input.size())
{
break;
}
auto const src_i_end = input.size() - i * 8;
if (src_i_end >= 8)
{
std::memcpy(
&base_2_64_coeff_buf[num_coeff], &input[src_i_end - 8], 8);
boost::endian::big_to_native_inplace(
base_2_64_coeff_buf[num_coeff]);
}
else
{
std::uint64_t be = 0;
for (int bi = 0; bi < src_i_end; ++bi)
{
be <<= 8;
be |= input[bi];
}
base_2_64_coeff_buf[num_coeff] = be;
};
num_coeff += 1;
}
return std::span(base_2_64_coeff_buf.data(), num_coeff);
}();
// Allocate enough base 58^10 coeff for encoding 38 bytes
// log(2^(38*8),58^10)) ~= 5.18. So 6 coeff are enough
std::array<std::uint64_t, 6> base_58_10_coeff{};
constexpr std::uint64_t B_58_10 = 430804206899405824; // 58^10;
std::size_t num_58_10_coeffs = 0;
std::size_t cur_2_64_end = base_2_64_coeff.size();
// compute the base 58^10 coeffs
while (cur_2_64_end > 0)
{
base_58_10_coeff[num_58_10_coeffs] =
ripple::b58_fast::detail::inplace_bigint_div_rem(
base_2_64_coeff.subspan(0, cur_2_64_end), B_58_10);
num_58_10_coeffs += 1;
if (base_2_64_coeff[cur_2_64_end - 1] == 0)
{
cur_2_64_end -= 1;
}
}
// Translate the result into the alphabet
// Put all the zeros at the beginning, then all the values from the output
std::fill(
out.begin(), out.begin() + input_zeros, ::ripple::alphabetForward[0]);
// iterate through the base 58^10 coeff
// convert to base 58 big endian then
// convert to alphabet big endian
bool skip_zeros = true;
auto out_index = input_zeros;
for (int i = num_58_10_coeffs - 1; i >= 0; --i)
{
if (skip_zeros && base_58_10_coeff[i] == 0)
{
continue;
}
std::array<std::uint8_t, 10> const b58_be =
ripple::b58_fast::detail::b58_10_to_b58_be(base_58_10_coeff[i]);
std::size_t to_skip = 0;
std::span<std::uint8_t const> b58_be_s{b58_be.data(), b58_be.size()};
if (skip_zeros)
{
to_skip = count_leading_zeros(b58_be_s);
skip_zeros = false;
if (out.size() < (i + 1) * 10 - to_skip)
{
return Unexpected(TokenCodecErrc::outputTooSmall);
}
}
for (auto b58_coeff : b58_be_s.subspan(to_skip))
{
out[out_index] = ::ripple::alphabetForward[b58_coeff];
out_index += 1;
}
}
return out.subspan(0, out_index);
}
// Note the input is BIG ENDIAN (some fn in this module use little endian)
B58Result<std::span<std::uint8_t>>
b58_to_b256_be(std::string_view input, std::span<std::uint8_t> out)
{
// Convert from b58 to b 58^10
// Max encoded value is 38 bytes
// log(2^(38*8),58) ~= 51.9
if (input.size() > 52)
{
return Unexpected(TokenCodecErrc::inputTooLarge);
};
if (out.size() < 8)
{
return Unexpected(TokenCodecErrc::outputTooSmall);
}
auto count_leading_zeros = [&](auto const& col) -> std::size_t {
std::size_t count = 0;
for (auto const& c : col)
{
if (c != ::ripple::alphabetForward[0])
{
return count;
}
count += 1;
}
return count;
};
auto const input_zeros = count_leading_zeros(input);
// Allocate enough base 58^10 coeff for encoding 38 bytes
// (33 bytes for nodepublic + 1 byte token + 4 bytes checksum)
// log(2^(38*8),58^10)) ~= 5.18. So 6 coeff are enough
std::array<std::uint64_t, 6> b_58_10_coeff{};
auto [num_full_coeffs, partial_coeff_len] =
ripple::b58_fast::detail::div_rem(input.size(), 10);
auto const num_partial_coeffs = partial_coeff_len ? 1 : 0;
auto const num_b_58_10_coeffs = num_full_coeffs + num_partial_coeffs;
assert(num_b_58_10_coeffs <= b_58_10_coeff.size());
for (auto c : input.substr(0, partial_coeff_len))
{
auto cur_val = ::ripple::alphabetReverse[c];
if (cur_val < 0)
{
return Unexpected(TokenCodecErrc::invalidEncodingChar);
}
b_58_10_coeff[0] *= 58;
b_58_10_coeff[0] += cur_val;
}
for (int i = 0; i < 10; ++i)
{
for (int j = 0; j < num_full_coeffs; ++j)
{
auto c = input[partial_coeff_len + j * 10 + i];
auto cur_val = ::ripple::alphabetReverse[c];
if (cur_val < 0)
{
return Unexpected(TokenCodecErrc::invalidEncodingChar);
}
b_58_10_coeff[num_partial_coeffs + j] *= 58;
b_58_10_coeff[num_partial_coeffs + j] += cur_val;
}
}
constexpr std::uint64_t B_58_10 = 430804206899405824; // 58^10;
// log(2^(38*8),2^64) ~= 4.75)
std::array<std::uint64_t, 5> result{};
result[0] = b_58_10_coeff[0];
std::size_t cur_result_size = 1;
for (int i = 1; i < num_b_58_10_coeffs; ++i)
{
std::uint64_t const c = b_58_10_coeff[i];
ripple::b58_fast::detail::inplace_bigint_mul(
std::span(&result[0], cur_result_size + 1), B_58_10);
ripple::b58_fast::detail::inplace_bigint_add(
std::span(&result[0], cur_result_size + 1), c);
if (result[cur_result_size] != 0)
{
cur_result_size += 1;
}
}
std::fill(out.begin(), out.begin() + input_zeros, 0);
auto cur_out_i = input_zeros;
// Don't write leading zeros to the output for the most significant
// coeff
{
std::uint64_t const c = result[cur_result_size - 1];
auto skip_zero = true;
// start and end of output range
for (int i = 0; i < 8; ++i)
{
std::uint8_t const b = (c >> (8 * (7 - i))) & 0xff;
if (skip_zero)
{
if (b == 0)
{
continue;
}
skip_zero = false;
}
out[cur_out_i] = b;
cur_out_i += 1;
}
}
if ((cur_out_i + 8 * (cur_result_size - 1)) > out.size())
{
return Unexpected(TokenCodecErrc::outputTooSmall);
}
for (int i = cur_result_size - 2; i >= 0; --i)
{
auto c = result[i];
boost::endian::native_to_big_inplace(c);
memcpy(&out[cur_out_i], &c, 8);
cur_out_i += 8;
}
return out.subspan(0, cur_out_i);
}
} // namespace detail
B58Result<std::span<std::uint8_t>>
encodeBase58Token(
TokenType token_type,
std::span<std::uint8_t const> input,
std::span<std::uint8_t> out)
{
constexpr std::size_t tmpBufSize = 128;
std::array<std::uint8_t, tmpBufSize> buf;
if (input.size() > tmpBufSize - 5)
{
return Unexpected(TokenCodecErrc::inputTooLarge);
}
if (input.size() == 0)
{
return Unexpected(TokenCodecErrc::inputTooSmall);
}
// <type (1 byte)><token (input len)><checksum (4 bytes)>
buf[0] = static_cast<std::uint8_t>(token_type);
// buf[1..=input.len()] = input;
memcpy(&buf[1], input.data(), input.size());
size_t const checksum_i = input.size() + 1;
// buf[checksum_i..checksum_i + 4] = checksum
checksum(buf.data() + checksum_i, buf.data(), checksum_i);
std::span<std::uint8_t const> b58Span(buf.data(), input.size() + 5);
return detail::b256_to_b58_be(b58Span, out);
}
// Convert from base 58 to base 256, largest coefficients first
// The input is encoded in XPRL format, with the token in the first
// byte and the checksum in the last four bytes.
// The decoded base 256 value does not include the token type or checksum.
// It is an error if the token type or checksum does not match.
B58Result<std::span<std::uint8_t>>
decodeBase58Token(
TokenType type,
std::string_view s,
std::span<std::uint8_t> outBuf)
{
std::array<std::uint8_t, 64> tmpBuf;
auto const decodeResult =
detail::b58_to_b256_be(s, std::span(tmpBuf.data(), tmpBuf.size()));
if (!decodeResult)
return decodeResult;
auto const ret = decodeResult.value();
// Reject zero length tokens
if (ret.size() < 6)
return Unexpected(TokenCodecErrc::inputTooSmall);
// The type must match.
if (type != static_cast<TokenType>(static_cast<std::uint8_t>(ret[0])))
return Unexpected(TokenCodecErrc::mismatchedTokenType);
// And the checksum must as well.
std::array<std::uint8_t, 4> guard;
checksum(guard.data(), ret.data(), ret.size() - guard.size());
if (!std::equal(guard.rbegin(), guard.rend(), ret.rbegin()))
{
return Unexpected(TokenCodecErrc::mismatchedChecksum);
}
std::size_t const outSize = ret.size() - 1 - guard.size();
if (outBuf.size() < outSize)
return Unexpected(TokenCodecErrc::outputTooSmall);
// Skip the leading type byte and the trailing checksum.
std::copy(ret.begin() + 1, ret.begin() + outSize + 1, outBuf.begin());
return outBuf.subspan(0, outSize);
}
[[nodiscard]] std::string
encodeBase58Token(TokenType type, void const* token, std::size_t size)
{
std::string sr;
// The largest object encoded as base58 is 33 bytes; This will be encoded in
// at most ceil(log(2^256,58)) bytes, or 46 bytes. 128 is plenty (and
// there's not real benefit making it smaller). Note that 46 bytes may be
// encoded in more than 46 base58 chars. Since decode uses 64 as the
// over-allocation, this function uses 128 (again, over-allocation assuming
// 2 base 58 char per byte)
sr.resize(128);
std::span<std::uint8_t> outSp(
reinterpret_cast<std::uint8_t*>(sr.data()), sr.size());
std::span<std::uint8_t const> inSp(
reinterpret_cast<std::uint8_t const*>(token), size);
auto r = b58_fast::encodeBase58Token(type, inSp, outSp);
if (!r)
return {};
sr.resize(r.value().size());
return sr;
}
[[nodiscard]] std::string
decodeBase58Token(std::string const& s, TokenType type)
{
std::string sr;
// The largest object encoded as base58 is 33 bytes; 64 is plenty (and
// there's no benefit making it smaller)
sr.resize(64);
std::span<std::uint8_t> outSp(
reinterpret_cast<std::uint8_t*>(sr.data()), sr.size());
auto r = b58_fast::decodeBase58Token(type, s, outSp);
if (!r)
return {};
sr.resize(r.value().size());
return sr;
}
} // namespace b58_fast
#endif // _MSC_VER
} // namespace ripple