std::barrier performs extremely poorly in certain macOS environments: the test suite takes ~1 hour to run instead of ~1 minute.
To unblock our macOS CI pipeline, std::barrier has been replaced with a custom mutex-based barrier (Barrier) that significantly improves performance without compromising correctness.
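For readers unfamiliar with the technique, here is a minimal sketch of a reusable barrier built from a mutex and a condition variable. It is not the Barrier class from this change: the name `SketchBarrier`, its members, and the helper `count_phase_violations` are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical illustration of a mutex-based barrier; not the actual
// Barrier class referred to above.
class SketchBarrier
{
public:
    explicit SketchBarrier(std::size_t count)
        : threshold_(count), remaining_(count)
    {
        assert(count > 0);
    }

    // Block until `threshold_` threads have called this function, then
    // release all of them and reset the barrier for the next phase.
    void
    arrive_and_wait()
    {
        std::unique_lock<std::mutex> lock(mutex_);
        std::size_t const generation = generation_;
        if (--remaining_ == 0)
        {
            // Last arrival: open a new generation and wake the waiters.
            ++generation_;
            remaining_ = threshold_;
            cv_.notify_all();
            return;
        }
        // Waiting on the generation number guards against spurious
        // wakeups and against a fast thread re-entering the barrier.
        cv_.wait(lock, [&] { return generation != generation_; });
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::size_t const threshold_;
    std::size_t remaining_;
    std::size_t generation_ = 0;
};

// Hypothetical helper: runs `workers` threads through `phases` phases and
// counts how often a thread got past the barrier before every thread had
// arrived for that phase. A correct barrier yields zero.
std::size_t
count_phase_violations(std::size_t workers, std::size_t phases)
{
    SketchBarrier barrier(workers);
    std::mutex m;
    std::size_t arrivals = 0;
    std::size_t violations = 0;
    std::vector<std::thread> threads;
    for (std::size_t w = 0; w < workers; ++w)
    {
        threads.emplace_back([&] {
            for (std::size_t phase = 1; phase <= phases; ++phase)
            {
                {
                    std::lock_guard<std::mutex> guard(m);
                    ++arrivals;
                }
                barrier.arrive_and_wait();
                std::lock_guard<std::mutex> guard(m);
                if (arrivals < phase * workers)
                    ++violations;
            }
        });
    }
    for (auto& t : threads)
        t.join();
    return violations;
}
```

The generation counter is the design choice that makes the barrier reusable across phases; a version that waits only for the arrival count to reach zero can release a thread into the wrong phase.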
- PR #5228 added assert=TRUE and werr=TRUE CMake flags to the
build/action.yml script which is used by all CI jobs to build rippled,
ensuring those flags were always set. The assumption was that only the
CI jobs used that script, so any extra time cost was offset by the
benefit of the extra checks. That assumption was incorrect: the script
is also used by downstream projects. Therefore, those flags
have been moved into the individual CI jobs' "cmake-args" parameter
passed to build/action.yml. This will have the same effect for CI jobs
without any side effects.
* Rename ASSERT to XRPL_ASSERT
* Upgrade to Antithesis SDK 0.4.4, and use new 0.4.4 features
  * Automatic cast to bool, like assert
* Add instrumentation workflow to verify build with instrumentation enabled
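As an illustration of what "automatic cast to bool, like assert" means in practice, the sketch below shows one way a macro can accept any contextually convertible condition before handing it to a function that takes a plain bool. The names `SKETCH_ASSERT`, `sketch_report`, `sketch_failures`, and `sketch_demo` are hypothetical; this is neither the XRPL_ASSERT definition nor the Antithesis SDK API, neither of which is shown here. It assumes C++17.

```cpp
#include <cassert>
#include <cstdio>
#include <optional>

// Hypothetical stand-in for an instrumentation hook that accepts only a
// real bool. It counts and logs failures instead of aborting.
inline int sketch_failures = 0;

inline void
sketch_report(bool condition, char const* message)
{
    if (!condition)
    {
        ++sketch_failures;
        std::fprintf(stderr, "assertion failed: %s\n", message);
    }
}

// The explicit cast lets callers pass types whose operator bool is
// explicit (std::optional, smart pointers), as plain assert() allows.
// Without it, passing such a value to a bool parameter does not compile.
#define SKETCH_ASSERT(cond, message) \
    sketch_report(static_cast<bool>(cond), (message))

// Returns how many of the two checks below failed.
int
sketch_demo()
{
    int const before = sketch_failures;
    std::optional<int> present = 7;
    std::optional<int> absent;
    SKETCH_ASSERT(present, "present holds a value");
    SKETCH_ASSERT(absent, "absent holds a value");
    return sketch_failures - before;
}
```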
GitHub Actions for the build/test jobs (nix.yml, mac.yml, windows.yml) will run only on branches that build packages (develop, release, master) and on branches with names starting with "ci/". This is intended as a compromise between disabling CI jobs on personal forks entirely and having the jobs run as a free-for-all. Note that it will not affect PR jobs at all.
- Update container for Doxygen workflow. Matches Linux workflow, with newer GLIBC version required by newer actions.
- Fix macOS workflow to install and configure Conan correctly. It still fails on tests, but that does not seem attributable to the workflow.
Artifactory support was added to the `nix` builds with #4556. This
extends that support to the Windows build. Now the Windows build works;
CI will build and test a Windows release build. This only affects CI and
does not change any C++ code.
* Copy the remote setup step outcome fix from #4716 discussion
* Allow the Windows job to succeed if tests fail:
  * Currently the tests do not always pass, even on a single-threaded
    run on the GitHub runners. So we use parallel runs and mark
    the test step as allowed to fail (continue-on-error).
  * At this point, it's more important that the build succeeds than that
    the tests succeed, because:
    * We've got plenty of test coverage on the other jobs.
    * Test failures are much rarer than build failures because of
      cross-platform issues.
    * Having a test failure locally doesn't interrupt a workflow nearly as
      much as a build failure.
Note that Conan Center cannot hold the binaries we need. They do not
build the configurations we need, and they will not add them.
## Future Tasks
This introduces a new bottleneck since the build and test take over an
hour. Speed up the job by:
* Making this job run on heavy Windows runners.
* Increasing the number of hardware threads.
Use the most recent versions in ConanCenter.
* Due to a bug in Clang 16, you may get a compile error:
  "call to 'async_teardown' is ambiguous"
  * A compiler flag workaround is documented in `BUILD.md`.
* At this time, building this with GCC 13 may require editing some files
  in `.conan/data`.
  * A patch to support GCC 13 may be added in a later PR.
---------
Co-authored-by: Scott Schurr <scott@ripple.com>
This change makes progress on the plan in #4371. It does not replicate
the full [matrix] implemented in #3851, but it does replicate the 1.ii
section of the Linux matrix. It leverages "heavy" self-hosted runners,
and demonstrates a repeatable pattern for future matrices.
[matrix]: d794a0f3f1/.github/README.md (continuous-integration)