Merge branch 'master' into verify-new2
@@ -12,6 +12,9 @@ if(VERBOSE)
  set(FETCHCONTENT_QUIET FALSE CACHE STRING "Verbose FetchContent()")
endif()

#c++20 removed std::result_of but boost 1.75 is still using it.
add_definitions(-DBOOST_ASIO_HAS_STD_INVOKE_RESULT=1)

add_library(clio)
target_compile_features(clio PUBLIC cxx_std_20)
target_include_directories(clio PUBLIC src)
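As background for the `BOOST_ASIO_HAS_STD_INVOKE_RESULT` define above: `std::result_of` was deprecated in C++17 and removed in C++20, and `std::invoke_result` is its replacement, which Boost.Asio switches to when this macro is defined. A minimal sketch of the two spellings, not part of this commit (the `parseCount` callable is hypothetical):

```
#include <string>
#include <type_traits>

// Hypothetical callable, used only to illustrate the type traits.
int parseCount(std::string const&);

// Pre-C++20 spelling (removed from the standard library in C++20):
// using R = std::result_of<decltype(&parseCount)(std::string const&)>::type;

// C++17/20 replacement that Boost.Asio uses when
// BOOST_ASIO_HAS_STD_INVOKE_RESULT is defined:
using R = std::invoke_result_t<decltype(&parseCount), std::string const&>;
static_assert(std::is_same_v<R, int>);
```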
README.md
@@ -1,30 +1,81 @@

# CLIO MIGRATOR (ONE OFF!)

This tool is a (really) hacky way of migrating some data from
This tool allows you to backfill data from
[clio](https://github.com/XRPLF/clio) due to the [specific pull request
313](https://github.com/XRPLF/clio/pull/313) in that repo.

Specifically, it is meant to migrate NFT data such that:

* The new `nf_token_uris` table is populated with all URIs for all NFTs known
* The new `issuer_nf_tokens_v2` table is populated with all NFTs known
* The old `issuer_nf_tokens` table is dropped. This table was never used prior
  to the above-referenced PR, so it is very safe to drop.
- The new `nf_token_uris` table is populated with all URIs for all NFTs known
- The new `issuer_nf_tokens_v2` table is populated with all NFTs known
- The old `issuer_nf_tokens` table is dropped. This table was never used prior
  to the above-referenced PR, so it is very safe to drop.

## How to use

This tool should be used as follows, with regard to the above update:

1) Stop serving requests from your clio
2) Stop your clio and upgrade it to the version after the PR
3) Start your clio
4) Now, your clio is writing new data correctly. This tool will update your
   old data, while your new clio is running.
5) Run this tool, using the _exact_ same config as what you are using for your
   production clio.
6) Once this tool terminates successfully, you can resume serving requests
   from your clio.

1. __Compile or download the new version of `clio`__, but don't run it just yet.
2. __Stop serving requests from your existing `clio`__. If you need to achieve zero downtime, you have two options:
   - Temporarily point your traffic to someone else's `clio` that has already performed this
     migration. The XRPL Foundation should have performed this on their servers before this
     release. Ask in our Discord what server to point traffic to.
   - Create a new temporary `clio` instance running _the prior release_ and make sure
     that its config.json specifies `read_only: true`. You can safely serve data
     from this separate instance.
3. __Stop your `clio` and restart it, running the new version__. Now, your `clio` is writing new data correctly. This tool will update your
   old data, while your upgraded `clio` is running and writing new ledgers.
4. __Run this tool__, using the _exact_ same config as what you are using for your
   production `clio`.
5. __Once this tool terminates successfully__, you can resume serving requests
   from your `clio`.

## Compiling
## Notes on timing

The amount of time that this migration takes depends greatly on what your data
looks like. This migration moves data in three steps:

1. __Transaction loading__ (a condensed sketch follows this list)
   - Pull all successful transactions that relate to NFTs.
     The hashes of these transactions are stored in the `nf_token_transactions` table.
   - For each of these transactions, discard any that were posted after the
     migration started
   - For each of these transactions, discard any that are not NFTokenMint
     transactions
   - For any remaining transactions, pull the associated NFT data from them and
     write them to the database.
2. __Initial ledger loading__. We also need to scan all objects in the initial
   ledger, looking for any NFTokenPage objects that would not have an associated
   transaction recorded.
   - Pull all objects from the initial ledger
   - For each object, if it is not an NFTokenPage, discard it.
   - Otherwise, load all NFTs stored in the NFTokenPage
3. __Drop the old (and unused) `issuer_nf_tokens` table__. This step is completely
   safe, since this table is not used for anything in clio. It was meant to drive
   a clio-only API called `nfts_by_issuer`, which is still in development.
   However, we decided that for performance reasons its schema needed to change
   to the schema we have in `issuer_nf_tokens_v2`. Since the API in question is
   not yet part of clio, removing this table will not affect anything.
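
The sketch below condenses the step 1 filter described above; the real implementation is `doMigrationStepOne` in the diff further down, and the `TxRecord` type here is a hypothetical stand-in for clio's transaction and metadata types:

```
#include <cstdint>

// Hypothetical stand-in for clio's transaction record, for illustration only.
struct TxRecord
{
    std::uint32_t ledgerSequence;
    bool isNFTokenMint;  // in clio this is derived from parsing the STTx
};

// Should a transaction found in nf_token_transactions be re-parsed for NFT
// data during step 1?
bool
shouldMigrate(TxRecord const& tx, std::uint32_t maxSequenceAtStart)
{
    // Anything written after the migration started was already recorded
    // correctly by the upgraded clio, so skip it.
    if (tx.ledgerSequence > maxSequenceAtStart)
        return false;

    // Only NFTokenMint transactions carry the URI that needs backfilling.
    return tx.isNFTokenMint;
}
```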

Step 1 is highly performance-optimized. If you have a full-history clio
set-up, this migration may take only a few minutes. We tested it on a
full-history server and it completed in about 9 minutes.

However, Step 2 is not well-optimized and unfortunately cannot be. If you have a
clio server whose `start_sequence` is relatively recent (even if the
`start_sequence` indicates a ledger prior to NFTs being enabled on your
network), the migration will take longer. We tested it on a clio with a
`start_sequence` of about 1 week prior to testing and it completed in about 6
hours.
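
To illustrate why step 2 scales with the size of the start ledger rather than with NFT activity, here is a self-contained sketch of the cursor-paginated scan; the types and the `fetchLedgerPage` stub are hypothetical stand-ins for clio's backend API, which appears in the diff below:

```
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-ins for clio's ledger-page API, for illustration only.
struct LedgerObject
{
    bool isNFTokenPage;
};

struct LedgerPage
{
    std::vector<LedgerObject> objects;
    std::optional<std::size_t> cursor;  // engaged while more pages remain
};

// Stub so the sketch compiles standalone; clio's real fetchLedgerPage pulls
// 10,000 objects per page from Cassandra in this migrator.
LedgerPage
fetchLedgerPage(std::optional<std::size_t> /*cursor*/)
{
    return {};
}

// Step 2 has to touch every object of the start ledger, page by page, so its
// cost grows with the total object count, not with how many NFTs exist.
std::size_t
countNFTokenPages()
{
    std::size_t found = 0;
    std::optional<std::size_t> cursor;
    do
    {
        auto const page = fetchLedgerPage(cursor);
        for (auto const& object : page.objects)
            if (object.isNFTokenPage)
                ++found;
        cursor = page.cursor;
    } while (cursor.has_value());
    return found;
}
```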

As a result, we recommend _assuming_ the worst case: that this migration will take about 8
hours.

## Compiling and running

Git-clone this project to your server. Then from the top-level directory:
```
@@ -44,3 +95,4 @@ This migration will take a few hours to complete. After this completes, it is op
```
./clio_verifier <config path>
```

@@ -111,7 +111,7 @@ synchronous(F&& f)
     * R is the currently executing coroutine that is about to get passed.
     * If coroutine types do not match, the current one's type is stored.
     */
    using R = typename std::result_of<F(boost::asio::yield_context&)>::type;
    using R = typename boost::result_of<F(boost::asio::yield_context&)>::type;
    if constexpr (!std::is_same<R, void>::value)
    {
        /**

@@ -550,7 +550,7 @@ CassandraBackend::fetchTransactions(
    std::vector<TransactionAndMetadata> results{numHashes};
    std::vector<std::shared_ptr<ReadCallbackData<result_type>>> cbs;
    cbs.reserve(numHashes);
    auto timeDiff = util::timed([&]() {
    [[maybe_unused]] auto timeDiff = util::timed([&]() {
        for (std::size_t i = 0; i < hashes.size(); ++i)
        {
            CassandraStatement statement{selectTransaction_};

@@ -580,9 +580,9 @@ CassandraBackend::fetchTransactions(
        throw DatabaseTimeout();
    }

    log_.debug() << "Fetched " << numHashes
                 << " transactions from Cassandra in " << timeDiff
                 << " milliseconds";
    // log_.debug() << "Fetched " << numHashes
    //              << " transactions from Cassandra in " << timeDiff
    //              << " milliseconds";
    return results;
}

@@ -12,49 +12,63 @@

static std::uint32_t const MAX_RETRIES = 5;
static std::chrono::seconds const WAIT_TIME = std::chrono::seconds(60);
static std::uint32_t const NFT_WRITE_BATCH_SIZE = 10000;

static void
wait(boost::asio::steady_timer& timer, std::string const reason)
wait(boost::asio::steady_timer& timer, std::string const& reason)
{
    BOOST_LOG_TRIVIAL(info) << reason << ". Waiting";
    BOOST_LOG_TRIVIAL(info) << reason << ". Waiting then retrying";
    timer.expires_after(WAIT_TIME);
    timer.wait();
    BOOST_LOG_TRIVIAL(info) << "Done";
    BOOST_LOG_TRIVIAL(info) << "Done waiting";
}

static void
static std::vector<NFTsData>
doNFTWrite(
    std::vector<NFTsData>& nfts,
    Backend::CassandraBackend& backend,
    std::string const tag)
    std::string const& tag)
{
    if (nfts.size() <= 0)
        return;
    if (nfts.size() == 0)
        return nfts;
    auto const size = nfts.size();
    backend.writeNFTs(std::move(nfts));
    backend.sync();
    BOOST_LOG_TRIVIAL(info) << tag << ": Wrote " << size << " records";
    return {};
}

static std::optional<Backend::TransactionAndMetadata>
doTryFetchTransaction(
static std::vector<NFTsData>
maybeDoNFTWrite(
    std::vector<NFTsData>& nfts,
    Backend::CassandraBackend& backend,
    std::string const& tag)
{
    if (nfts.size() < NFT_WRITE_BATCH_SIZE)
        return nfts;
    return doNFTWrite(nfts, backend, tag);
}

static std::vector<Backend::TransactionAndMetadata>
doTryFetchTransactions(
    boost::asio::steady_timer& timer,
    Backend::CassandraBackend& backend,
    ripple::uint256 const& hash,
    std::vector<ripple::uint256> const& hashes,
    boost::asio::yield_context& yield,
    std::uint32_t const attempts = 0)
{
    try
    {
        return backend.fetchTransaction(hash, yield);
        return backend.fetchTransactions(hashes, yield);
    }
    catch (Backend::DatabaseTimeout const& e)
    {
        if (attempts >= MAX_RETRIES)
            throw e;

        wait(timer, "Transaction read error");
        return doTryFetchTransaction(timer, backend, hash, yield, attempts + 1);
        wait(timer, "Transactions read error");
        return doTryFetchTransactions(
            timer, backend, hashes, yield, attempts + 1);
    }
}

@@ -69,7 +83,7 @@ doTryFetchLedgerPage(
{
    try
    {
        return backend.fetchLedgerPage(cursor, sequence, 2000, false, yield);
        return backend.fetchLedgerPage(cursor, sequence, 10000, false, yield);
    }
    catch (Backend::DatabaseTimeout const& e)
    {

@@ -104,24 +118,12 @@ doTryGetTxPageResult(
}

static void
doMigration(
doMigrationStepOne(
    Backend::CassandraBackend& backend,
    boost::asio::steady_timer& timer,
    boost::asio::yield_context& yield)
    boost::asio::yield_context& yield,
    Backend::LedgerRange const& ledgerRange)
{
    BOOST_LOG_TRIVIAL(info) << "Beginning migration";
    auto const ledgerRange = backend.hardFetchLedgerRangeNoThrow(yield);

    /*
     * Step 0 - If we haven't downloaded the initial ledger yet, just short
     * circuit.
     */
    if (!ledgerRange)
    {
        BOOST_LOG_TRIVIAL(info) << "There is no data to migrate";
        return;
    }

    /*
     * Step 1 - Look at all NFT transactions recorded in
     * `nf_token_transactions` and reload any NFTokenMint transactions. These

@@ -130,6 +132,9 @@ doMigration(
     * the tokens in `nf_tokens` because we also want to cover the extreme
     * edge case of a token that is re-minted with a different URI.
     */
    std::string const stepTag = "Step 1 - transaction loading";
    std::vector<NFTsData> toWrite;

    std::stringstream query;
    query << "SELECT hash FROM " << backend.tablePrefix()
          << "nf_token_transactions";

@@ -140,11 +145,11 @@ doMigration(
    // For all NFT txs, paginated in groups of 1000...
    while (morePages)
    {
        std::vector<NFTsData> toWrite;

        CassResult const* result =
            doTryGetTxPageResult(nftTxQuery, timer, backend);

        std::vector<ripple::uint256> txHashes;

        // For each tx in page...
        CassIterator* txPageIterator = cass_iterator_from_result(result);
        while (cass_iterator_next(txPageIterator))

@@ -165,37 +170,29 @@ doMigration(
                    "Could not retrieve hash from nf_token_transactions");
            }

            auto const txHash = ripple::uint256::fromVoid(buf);
            auto const tx =
                doTryFetchTransaction(timer, backend, txHash, yield);
            if (!tx)
            {
                cass_iterator_free(txPageIterator);
                cass_result_free(result);
                cass_statement_free(nftTxQuery);
                std::stringstream ss;
                ss << "Could not fetch tx with hash "
                   << ripple::to_string(txHash);
                throw std::runtime_error(ss.str());
            txHashes.push_back(ripple::uint256::fromVoid(buf));
            }

            // Not really sure how cassandra paging works, but we want to skip
            // any transactions that were loaded since the migration started
            if (tx->ledgerSequence > ledgerRange->maxSequence)
        auto const txs =
            doTryFetchTransactions(timer, backend, txHashes, yield);

        for (auto const& tx : txs)
        {
            if (tx.ledgerSequence > ledgerRange.maxSequence)
                continue;

            ripple::STTx const sttx{ripple::SerialIter{
                tx->transaction.data(), tx->transaction.size()}};
                tx.transaction.data(), tx.transaction.size()}};
            if (sttx.getTxnType() != ripple::TxType::ttNFTOKEN_MINT)
                continue;

            ripple::TxMeta const txMeta{
                sttx.getTransactionID(), tx->ledgerSequence, tx->metadata};
                sttx.getTransactionID(), tx.ledgerSequence, tx.metadata};
            toWrite.push_back(
                std::get<1>(getNFTDataFromTx(txMeta, sttx)).value());
        }

        doNFTWrite(toWrite, backend, "TX");
        toWrite = maybeDoNFTWrite(toWrite, backend, stepTag);

        morePages = cass_result_has_more_pages(result);
        if (morePages)

@@ -205,8 +202,16 @@ doMigration(
    }

    cass_statement_free(nftTxQuery);
    BOOST_LOG_TRIVIAL(info) << "\nDone with transaction loading!\n";
    doNFTWrite(toWrite, backend, stepTag);
}

static void
doMigrationStepTwo(
    Backend::CassandraBackend& backend,
    boost::asio::steady_timer& timer,
    boost::asio::yield_context& yield,
    Backend::LedgerRange const& ledgerRange)
{
    /*
     * Step 2 - Pull every object from our initial ledger and load all NFTs
     * found in any NFTokenPage object. Prior to this migration, we were not

@@ -214,32 +219,43 @@ doMigration(
     * missed. This will also record the URI of any NFTs minted prior to the
     * start sequence.
     */
    std::string const stepTag = "Step 2 - initial ledger loading";
    std::vector<NFTsData> toWrite;
    std::optional<ripple::uint256> cursor;

    // For each object page in initial ledger
    do
    {
        auto const page = doTryFetchLedgerPage(
            timer, backend, cursor, ledgerRange->minSequence, yield);
            timer, backend, cursor, ledgerRange.minSequence, yield);

        // For each object in page
        for (auto const& object : page.objects)
        {
            std::vector<NFTsData> toWrite = getNFTDataFromObj(
                ledgerRange->minSequence,
                ripple::to_string(object.key),
            auto const objectNFTs = getNFTDataFromObj(
                ledgerRange.minSequence,
                std::string(object.key.begin(), object.key.end()),
                std::string(object.blob.begin(), object.blob.end()));
            doNFTWrite(toWrite, backend, "OBJ");
            toWrite.insert(toWrite.end(), objectNFTs.begin(), objectNFTs.end());
        }

        toWrite = maybeDoNFTWrite(toWrite, backend, stepTag);
        cursor = page.cursor;
    } while (cursor.has_value());

    BOOST_LOG_TRIVIAL(info) << "\nDone with object loading!\n";
    doNFTWrite(toWrite, backend, stepTag);
}

static bool
doMigrationStepThree(Backend::CassandraBackend& backend)
{
    /*
     * Step 3 - Drop the old `issuer_nf_tokens` table, which is replaced by
     * `issuer_nf_tokens_v2`. Normally, we should probably not drop old tables
     * in migrations, but here it is safe since the old table wasn't yet being
     * used to serve any data anyway.
     */
    query.str("");
    std::stringstream query;
    query << "DROP TABLE " << backend.tablePrefix() << "issuer_nf_tokens";
    CassStatement* issuerDropTableQuery =
        cass_statement_new(query.str().c_str(), 0);

@@ -249,12 +265,42 @@ doMigration(
    cass_future_free(fut);
    cass_statement_free(issuerDropTableQuery);
    backend.sync();
    if (rc != CASS_OK)
        BOOST_LOG_TRIVIAL(warning) << "\nCould not drop old issuer_nf_tokens "
    return rc == CASS_OK;
}

static void
doMigration(
    Backend::CassandraBackend& backend,
    boost::asio::steady_timer& timer,
    boost::asio::yield_context& yield)
{
    BOOST_LOG_TRIVIAL(info) << "Beginning migration";
    auto const ledgerRange = backend.hardFetchLedgerRangeNoThrow(yield);

    /*
     * Step 0 - If we haven't downloaded the initial ledger yet, just short
     * circuit.
     */
    if (!ledgerRange)
    {
        BOOST_LOG_TRIVIAL(info) << "There is no data to migrate";
        return;
    }

    doMigrationStepOne(backend, timer, yield, *ledgerRange);
    BOOST_LOG_TRIVIAL(info) << "\nStep 1 done!\n";

    doMigrationStepTwo(backend, timer, yield, *ledgerRange);
    BOOST_LOG_TRIVIAL(info) << "\nStep 2 done!\n";

    auto const stepThreeResult = doMigrationStepThree(backend);
    BOOST_LOG_TRIVIAL(info) << "\nStep 3 done!";
    if (stepThreeResult)
        BOOST_LOG_TRIVIAL(info) << "Dropped old 'issuer_nf_tokens' table!\n";
    else
        BOOST_LOG_TRIVIAL(warning) << "Could not drop old issuer_nf_tokens "
                                      "table. If it still exists, "
                                      "you should drop it yourself\n";
    else
        BOOST_LOG_TRIVIAL(info) << "\nDropped old 'issuer_nf_tokens' table!\n";

    BOOST_LOG_TRIVIAL(info)
        << "\nCompleted migration from " << ledgerRange->minSequence << " to "

@@ -178,7 +178,7 @@ public:
{
    for (auto const& handler : handlers)
    {
        handlerMap_[handler.method] = move(handler);
        handlerMap_[handler.method] = std::move(handler);
    }
}