[CFP] Building LLVM at rev.ng: a report #6

aleclearmind opened this issue Jun 18, 2021 · 10 comments

Comments

@aleclearmind

Title

Building LLVM at rev.ng: a report

Authors

Alessandro Di Federico ([email protected]), rev.ng
Filippo Cremonese ([email protected]), rev.ng

Distribution

We build and distribute LLVM both for development purposes (sort of an SDK) and as a library for our application.

Abstract

At rev.ng we're building an LLVM-based decompiler with a Qt UI.
Our development environment is Linux-only.

We build and distribute LLVM as:

  1. a Linux toolchain for the whole project, using libc++;
  2. a Windows (and soon macOS) cross-toolchain for the whole project, using libc++;
  3. a library used by our application (the decompiler).

Our own package manager, orchestra, produces binaries that can be used by end-users and developers.
Thanks to RPATH magic, all of our binaries are portable (you can run them from your $HOME) and do not require root privileges to be installed (see the sketch below).
Also, since we build and link against an ancient glibc, the binaries can run on distributions as old as Debian 7, despite being built on modern systems.
We build LLVM in debug -O0, debug -O2, debug + ASan and release.
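
As a rough illustration of the RPATH approach (a generic CMake sketch, not our actual orchestra configuration; the bin/../lib layout is an assumption), an $ORIGIN-relative RPATH can be requested straight from the configure command line:

  # Hypothetical project: binaries installed under <prefix>/bin look up their
  # shared libraries in <prefix>/lib relative to their own location, so the
  # whole tree can be moved anywhere (e.g. into $HOME) without needing root.
  cmake -G Ninja -S . -B build \
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_INSTALL_RPATH='$ORIGIN/../lib' \
    -DCMAKE_BUILD_WITH_INSTALL_RPATH=ON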

We'd like to report (and hopefully get feedback) on issues we regularly face when building LLVM:

  1. issues building clang and libc++/libc++abi for Windows;
  2. issues arising when a release build of an application uses a debug build of LLVM (or an ASan build);
  3. constraints in the version of LLVM we want/have to use as a library (for the decompiler, for building mesa...) and the desire to use the most recent version of clang as a toolchain;
  4. dealing with failures in LLVM tests due to our particular setup (building as "root", building in C++20 mode...);
  5. struggles in passing the right flags to the compiler when dealing with custom/ancient build systems;
  6. discussion on having a set of well-defined, documented and CI-tested build configurations that are supported/known to work;

What makes your distribution of LLVM unique?

  • We produce Linux binaries compatible with ancient glibc versions (and therefore with old Linux distributions)
  • We cross-compile for Windows from Linux
  • We build portable installations of LLVM
  • We support mixing different build modes across different projects (debug LLVM and a non-debug application using it)

What might others learn from your experience?

Struggles and workarounds in building (and cross-building) recent and ancient packages using a clang/libc++-based toolchain.

What could be improved in upstream LLVM to make working with it easier as a downstream packager?

  • Have a set of build configurations that are well-documented, supported, and CI-tested by upstream
  • Make cross-compiling libc++ easier
@androm3da

We cross-compile for Windows from Linux

That does sound unique, I'm curious to learn more 😁

@aleclearmind
Author

That does sound unique, I'm curious to learn more

Once you have a working mingw toolchain and all LLVM dependencies for Windows, it's not crazily hard.

LLVM patches for a successful mingw cross build:

They mainly concern building llvm-tblgen/clang-tblgen for the build machine.

Here you can find the configuration we use: revng/orchestra@800bca6

One key thing is to use lld as the linker. The GNU linker ld.bfd fails to build clangAST.dll (ld.gold can't be used, since it doesn't support PE/COFF): it tries to allocate more than 2^16 exported symbols, which is the upper limit for COFF. You run it, wait two minutes, get an error message, wait three more minutes, and it fails.
With lld, it succeeds, producing a 235 MB file in 0.8 seconds.
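
For reference, a minimal sketch of the kind of configure invocation involved (the x86_64-w64-mingw32-clang wrappers, the option selection and the shared-library setup are assumptions, not the exact orchestra configuration):

  # Cross-configure LLVM/clang for Windows from Linux with a mingw toolchain.
  # lld is used as the linker to avoid ld.bfd's COFF export limit when
  # producing clangAST.dll (which only exists with BUILD_SHARED_LIBS=ON).
  cmake -G Ninja ../llvm \
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_SYSTEM_NAME=Windows \
    -DCMAKE_C_COMPILER=x86_64-w64-mingw32-clang \
    -DCMAKE_CXX_COMPILER=x86_64-w64-mingw32-clang++ \
    -DCMAKE_RC_COMPILER=x86_64-w64-mingw32-windres \
    -DLLVM_HOST_TRIPLE=x86_64-w64-mingw32 \
    -DLLVM_ENABLE_PROJECTS=clang \
    -DLLVM_USE_LINKER=lld \
    -DBUILD_SHARED_LIBS=ON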

libc++ introduces additional challenges, but it seems to work.

@nickdesaulniers
Member

Thanks for taking the time to write up a CFP; we'd be overjoyed to have you present at LLVM Distributors Conf 2021! If you still plan on presenting, this is a reminder to get started on your slides for next week. Once they're done, we will contact you about submitting a PDF of your slides, either as a pull request to this repository or via email to the organizer.

We hope to have a schedule finalized by EOW; we may iterate on the schedule based on whether presenters have conflicts. Please keep this issue open so attendees can ask questions, or close it if you no longer plan on attending. A reminder to keep your talk concise (15 minutes); we won't be allotting time for questions in order to fit as much content as possible. Attendees should ask questions here in this GitHub issue.

@fcremo

fcremo commented Sep 10, 2021

Hi @nickdesaulniers, we do intend to present our work and we're happy to have been accepted!

@sylvestre
Contributor

We cross-compile for Windows from Linux

That does sound unique, I'm curious to learn more

Firefox for Windows is built on Linux:
https://glandium.org/blog/?p=4020

@llvm-beanz

The repository strings that are encoded in binaries should all be configurable via CMake. At the very least, you should be able to disable them all by setting LLVM_APPEND_VC_REV=Off.
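
For example (a minimal sketch; the rest of the configuration is elided and assumed):

  # Don't embed the VCS revision/repository string into the LLVM version info.
  cmake -G Ninja ../llvm \
    -DCMAKE_BUILD_TYPE=Release \
    -DLLVM_APPEND_VC_REV=OFF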

@fcremo

fcremo commented Sep 16, 2021

Here are the slides I used: llvm_at_revng_presentation.pdf

@fcremo

fcremo commented Sep 16, 2021

The repository strings that are encoded in binaries should all be configurable via CMake. At the very least you should be able to disable it all by setting LLVM_APPEND_VC_REV=Off.

Thanks for the suggestion @llvm-beanz, I missed that option. Currently trying it out to confirm it won't leak the git remote :)

nickdesaulniers added a commit that referenced this issue Sep 16, 2021
@mstorsjo
Contributor

LLVM patches for a successful mingw cross build:

They mainly concern building llvm-tblgen/clang-tblgen for the build machine.

I regularly cross build from Linux to Windows without any out-of-tree patches. However, for the nested native build of the tblgen tools, there's a rather surprising case where the compiler tool names you've specified for the outer cross build end up propagated to the nested build of the native tools. If you don't pass the CROSS_TOOLCHAIN_FLAGS_NATIVE variable when configuring LLVM, it sets a default that consists of propagating CMAKE_C_COMPILER and CMAKE_CXX_COMPILER: https://github.com/llvm/llvm-project/blob/llvmorg-13.0.0-rc3/llvm/cmake/modules/CrossCompile.cmake#L19-L25 But you can just pass -DCROSS_TOOLCHAIN_FLAGS_NATIVE= to the cross build to skip setting these implicit defaults.

(If you happen to have a matching native build of LLVM with the llvm-config and tblgen tools ready-made, you can also do e.g. -DLLVM_TABLEGEN=$native/llvm-tblgen (and so on for CLANG_TABLEGEN, LLDB_TABLEGEN and LLVM_CONFIG_PATH) to avoid rebuilding them with the nested native build altogether - although it's unclear if that's worth the extra hassle.)
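
To make the two options concrete, a sketch under the same assumptions as the earlier mingw example ($NATIVE, the triple and the compiler wrappers are hypothetical):

  # Option A: keep the nested native tools build, but stop it from inheriting
  # the cross compilers by clearing the implicit defaults.
  cmake -G Ninja ../llvm \
    -DCMAKE_SYSTEM_NAME=Windows \
    -DCMAKE_C_COMPILER=x86_64-w64-mingw32-clang \
    -DCMAKE_CXX_COMPILER=x86_64-w64-mingw32-clang++ \
    -DLLVM_ENABLE_PROJECTS=clang \
    -DCROSS_TOOLCHAIN_FLAGS_NATIVE=

  # Option B: point the cross build at prebuilt native tools and skip the
  # nested native build altogether.
  cmake -G Ninja ../llvm \
    -DCMAKE_SYSTEM_NAME=Windows \
    -DCMAKE_C_COMPILER=x86_64-w64-mingw32-clang \
    -DCMAKE_CXX_COMPILER=x86_64-w64-mingw32-clang++ \
    -DLLVM_ENABLE_PROJECTS=clang \
    -DLLVM_TABLEGEN=$NATIVE/bin/llvm-tblgen \
    -DCLANG_TABLEGEN=$NATIVE/bin/clang-tblgen \
    -DLLVM_CONFIG_PATH=$NATIVE/bin/llvm-config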

Also, regarding cross compiling libc++ and libc++abi - that's indeed a bit tricky, but I've been trying to upstream things to reduce the number of manual tweaks needed. You can have a look at https://github.com/mstorsjo/llvm-mingw/blob/master/build-libcxx.sh for how I do it. I used to have a bunch of the same hardcoded things that you're listing in your slides, but I've been able to get rid of many of them lately. It's also possible to build both of them at the same time by pointing cmake at llvm-project/runtimes, which lets you build more than one of the runtime projects in one go while still staying decoupled from the build of the compiler. If you have a look at mstorsjo/llvm-mingw@next-runtimes you can see what simplifications that allows.
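
A sketch of what such a combined runtimes configuration might look like (the triple, paths and option selection are assumptions; see build-libcxx.sh and the next-runtimes branch above for the real thing):

  # Build libc++abi and libc++ together from llvm-project/runtimes,
  # decoupled from the build of the compiler itself.
  # LIBCXX_HAS_WIN32_THREAD_API forces Win32 threading even if winpthreads
  # is installed (see the note on threading further down).
  cmake -G Ninja ../llvm-project/runtimes \
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_SYSTEM_NAME=Windows \
    -DCMAKE_C_COMPILER=x86_64-w64-mingw32-clang \
    -DCMAKE_CXX_COMPILER=x86_64-w64-mingw32-clang++ \
    -DLLVM_ENABLE_RUNTIMES='libcxxabi;libcxx' \
    -DLIBCXX_HAS_WIN32_THREAD_API=ON
  ninja cxxabi cxx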

Most of the defining of *_BUILDING_LIBRARY and _LIBCXX_DISABLE_VISIBILITY_ANNOTATIONS should not be needed anymore, since around LLVM 12 I think.

As for needing to set LIBCXXABI_LIBCXX_INCLUDES while the two projects sit next to each other in the monorepo, this changed and became a bit clearer in LLVM 12, I think. The variable isn't meant to point at the original untouched libc++ headers, but at the installed ones - which include the configuration-specific __config_site. Earlier, __config_site was appended/prepended to __config, so it worked silently even if this hadn't happened (you'd just miss the configuration-specific bits), but since llvm/llvm-project@c06a8f9 you'll get an error if you point the libc++abi build at the plain libcxx/include directory. For my build I worked around this by first configuring libc++ and running ninja generate-cxx-headers, then configuring libc++abi and pointing it at the generated headers, then building libc++abi followed by libc++: mstorsjo/llvm-mingw@ea3ff51#diff-58e7371d0eea276763191f8d101700c7eb56d5b02a37c76a8abe1e544806b431
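
In rough outline, that workaround looks like this (build directory names and the generated-headers path are assumptions, and [same cross flags as above] stands for the mingw cross options from the earlier sketches):

  # 1. Configure libc++ and generate the installed-style headers, which
  #    include the configuration-specific __config_site.
  cmake -G Ninja -S llvm-project/libcxx -B build-libcxx [same cross flags as above]
  ninja -C build-libcxx generate-cxx-headers

  # 2. Configure libc++abi against those generated headers (not the plain
  #    libcxx/include directory) and build it.
  cmake -G Ninja -S llvm-project/libcxxabi -B build-libcxxabi [same cross flags as above] \
    -DLIBCXXABI_LIBCXX_INCLUDES=$PWD/build-libcxx/include/c++/v1
  ninja -C build-libcxxabi cxxabi

  # 3. Finally, build libc++ itself.
  ninja -C build-libcxx cxx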

Regarding ENABLE_NEW_DELETE_DEFINITIONS, you're right that on Windows you have to make sure to enable it in only one of the libraries - but this has changed upstream, and it now defaults to off in libc++, so it no longer needs to be hardcoded.

Regarding HAS_WIN32_THREAD_API, the problem is that the autodetection looks for pthread.h too, and if you've got winpthreads installed, that takes precedence over plain Win32 threading (which is probably what most users of libc++ on Windows actually want). Earlier, before the effects of libc++'s __config_site were picked up by the libc++abi build, you had to specify this for the libc++abi build too, but nowadays it should be enough to specify it once, for the libc++ build.

Also, these days you don't need to hardcode LIBCXX_ENABLE_FILESYSTEM=OFF (it was only needed a long time ago; nowadays it defaults to on or off depending on whether it's supported). Since this spring, libc++ does support filesystem on Windows (it's enabled automatically for mingw configurations, but not for MSVC configurations, as it requires some compiler-rt/libgcc helper routines for __int128_t).

All in all, it used to be quite tricky - but I've tried to upstream fixes to get rid of much of the trickery needed, and it should be a fair bit less messy now than it used to be.

@aleclearmind
Author

aleclearmind commented Sep 21, 2021

All in all, it used to be quite tricky - but I've tried to upstream fixes to get rid of much of the trickery needed, and it should be a fair bit less messy now than it used to be.

Thanks a lot for the valuable suggestions. We'll look into those.
I think we set up the cross-build configuration before LLVM 12, and testing what has improved takes some time. But thanks a lot for your efforts in improving the situation! We really appreciate it.
This is exactly the feedback we were hoping to get from this event.
