The Wayback Machine - https://web.archive.org/web/20230706142749/https://github.com/facebook/buck2/pull/58
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conan integration to import third-party C/C++ dependencies #58

Open
wants to merge 134 commits into
base: main
Choose a base branch
from

Conversation

aherrmann
Copy link
Contributor

This PR adds support for importing third-party C/C++ dependencies provided by the Conan package manager.

Detailed motivation and design are contained in the module doctring of the main toolchain module under prelude/toolchains/conan/defs.bzl. In brief, some C/C++ third-party dependencies are difficult to import into a Buck2 project, e.g. the current self-hosted build of Buck2 vendors a binary openssl and skips jemalloc for that reason.

The approach proposed here is to instead use the existing package build definitions for the Conan package manager and build these third-party dependencies with Conan. This approach is similar to how rules_haskell integrates with Stack and Cabal to import third-party Haskell dependencies, or how JVM packages are imported with the help of Coursier in Bazel.

Conan was chosen for its relatively large community package index, compatibility with Linux, MacOS, and Windows, configurability and extensibility, and its support for cross-compilation. Alternatively, vcpkg was considered. However, its choice of CMake as implementation and configuration language makes it harder to integrate. Other interesting comparisons can be found here and here.

This integration aims to leave Buck2 in charge of the overall build, when which targets are built, and which configurations and toolchains are used.

Code map:

  • prelude/toolchains/conan/defs.bzl the main Starlark module for the Conan integration. Defines the relevant rules.
  • prelude/toolchains/conan/*.py Python scripts implementing the various build actions.
  • prelude/toolchains/conan/buckler/ a Conan extension that defines a generator to generate Buck2 import targets for Conan built packages.
  • examples/prelude/toolchains/BUILD an example Conan toolchain configuration.
  • examples/prelude/cpp/conan/ an example Conan use-case.

This PR adds tests of the Conan example to the Linux and MacOS CI pipelines. Windows support will be added separately.

Remaining work for future PRs

  • Better cross-platform support.
    Use Conan lock-file bundles to define cross-platform import targets.
  • Windows support in particular.
    Currently builds fail on long-path issues, Conan provides workarounds for these issues that need to be investigated.
  • Import Buck2’s third-party C/C++ dependencies with Conan.
  • Cross-compilation support.
    Conan has builtin cross-compilation support that needs to be made use of.
  • Test some more complex packages.
    Builds of some more complex packages failed when attempted with a zig cc toolchain, e.g. libmysqlclient fails to find libresolv.so. Investigate these issues. An example setup with the Zig toolchain can be found here.
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Dec 21, 2022
@arlyon
Copy link
Contributor

arlyon commented Dec 21, 2022

Excellent, looking forward to reading through this :)

static_lib = None
native.prebuilt_cxx_library(
name = name,
deps = deps, # TODO[AH] Do we need exported_deps?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is exported_deps not needed for transitive deps to work correctly? @ndmitchell please confirm

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For c++, exported_deps is required when your dependents need to include your dependency's headers. This could be because your headers that your dependencies include then further directly include ones from your dependency's or if you just otherwise expect your dependents to do so (like if your interfaces are going to require things from your dependency's). Often this isn't required.

Now, when looking at doing something like we're doing here, we're kinda limited by only being able to understand what the underlying specification encodes (and how its other interpretations behave). If the specification doesn't distinguish between public and private dependencies, it is almost certainly treating them all as public (it could be treating them as private, but then everything we sort of need to individually specify their transitive deps and that is rather gross). In that case, I'd recommend encoding them as public deps (i.e. exported_deps) in buck2 as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cjhopman Thanks for the context! I've changed the code to set exported_deps instead.

_conan_cxx_libraries(
name = name,
main = ":_bundle_" + name,
components = components,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love we support this. Some context about components for others: https://docs.conan.io/en/latest/creating_packages/package_information.html#using-components

attrs = {
"arch": attrs.string(doc = "The target architecture"),
"os": attrs.string(doc = "The target operating system"),
"build_type": attrs.string(doc = "The Conan build-type, e.g. Release or Debug"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

#env["CONAN_REVISIONS_ENABLED"] = "1"

# Prevent over-allocation.
env["CONAN_CPU_COUNT"] = "1"
Copy link
Contributor

@arlyon arlyon Dec 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I take it this means conan build libraries single-threaded. Edit: make is still invoked w/ -j16 on my machine (16 threads)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I take it this means conan build libraries single-threaded.

The intention is to avoid over-allocation due to both Buck2 and Conan parallizing their builds.

Edit: make is still invoked w/ -j16 on my machine (16 threads)

Thanks for testing! IIUC Conan recipes still need to actively use this setting:

Conan recipes can use the cpu_count() tool to build the library using more than one core.

Which package were you building there?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intention is to avoid over-allocation due to both Buck2 and Conan parallizing their builds

There is a weight on actions that you could try tweaking here. It can basically be thought of as telling buck that the action requires that many cores. it's also been used to limit parallelism due to memory requirements, not cores, but its modeled with just the one limit (trivia note: that's actually different than buck1 which had ability to configure several different weights for different resources)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cjhopman Thanks! That looks like a useful feature to address this. Given that seemingly not all Conan recipes make use of the CONAN_CPU_COUNT knob, and that the load a parallelized package build will also depend on the size of the package, this does indeed seem like something that will require a bit of tuning. In light of that I'd prefer to leave this to a follow-up PR. I've added a TODO comment to the code to keep track of it.

lockfile = ":lock",
)

# TODO[AH] Prevent double build of packages.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting. I take it this cost only appears when actually generating the lockfile itself? That seems like an ok trade-off for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, well, when generating the targets file, i.e. when invoking conan install --generator .... But, yes, that's the idea.

user_home=user_home,
trace_log=trace_log)

# TODO[AH] Allow users to define additional remotes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A conan_remote rule would be cool. You could pass it in similar to the profile similar to deps.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, a conan_remote rule sounds like a good way to do this.

@arlyon
Copy link
Contributor

arlyon commented Dec 21, 2022

This is great. I would love to see the package component limitation lifted though it seems like simply a matter of implementing it.

https://github.com/facebookincubator/buck2/blob/76c18df1fe5a0ef8e611779eb080bdcc606b1595/prelude/toolchains/conan/defs.bzl#L344

Otherwise, the code is succinct and clear. The python scripts seem solid, and I've managed to run :update for boost and glm, glfw, and glad and the output file looks correct and includes transitive deps. It even told me I had a missing system package which is cool. It's great that components are included (and I love the syntax) and dependencies are generated though I couldn't test the transitive deps due to the aforementioned issue.

I will let Neil dispense buck2 related wisdom but I am excited that this is ready to land. Hope you have a lovely holiday.

@aherrmann
Copy link
Contributor Author

aherrmann commented Dec 22, 2022

[...] I've managed to run :update for boost and glm, glfw, and glad [...] though I couldn't test the transitive deps due to the aforementioned issue.

Thanks for testing!

This is great. I would love to see the package component limitation lifted though it seems like simply a matter of implementing it.

I looked into these on Conan central and tried to build boost. It looks like the issue is that boost contains some header-only components for which len(libs) == 0. I've added a case to handle this and update the PR. With that I was able to build boost locally.

For the multiple library components case, i.e. len(libs) > 1, I haven't found an example yet to test this on, so haven't attempted implementing it, yet. Are you aware of such a Conan package? I looked into ffmpeg, which seems to have been one of the packages motivating components, but even that seems to use single library components by now as far as I can tell.

Copy link
Contributor

@ndmitchell ndmitchell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good - just a few notes/questions.

@@ -68,7 +68,7 @@ def _cxx_toolchain(ctx):
nm = RunInfo(args = ["nm"]),
# objcopy = ctx.attrs.objcopy_for_shared_library_interface[RunInfo],
# ranlib = ctx.attrs.ranlib[RunInfo],
ranlib = RunInfo(args = ["raninfo"]),
ranlib = RunInfo(args = ["ranlib"]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We no longer use shim/toolchain, since we now use the system toolchains in the examples

@@ -295,7 +295,7 @@ def _cxx_zig_toolchain_impl(ctx: "context") -> ["provider"]:
return [ctx.attrs.distribution[DefaultInfo]] + cxx_toolchain_infos(
platform_name = dist.arch,
c_compiler_info = CCompilerInfo(
compiler = RunInfo(args = cmd_args([zig, "cc"])),
compiler = RunInfo(args = cmd_args([zig, "cc"], delimiter = " ")),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Conan toolchain configures Conan to use the C/C++ toolchain provided by Buck2. It does so by setting the relevant variables in the Conan profile. Some build tools assume that these contain a single path, instead of commands with arguments, so the toolchain generates wrapper scripts that execute the required command. Without this change the command would be written on multiple lines despite the explicit delimiter = " " at a higher level. Perhaps there is a better way to do this? As the comment there indicates I also tried setting quote = "shell", but that had other unwanted effects.

@@ -41,12 +41,12 @@ def _system_cxx_toolchain_impl(ctx):
linker_info = LinkerInfo(
linker = RunInfo(args = ["clang++"]),
linker_flags = ["-fuse-ld=lld"],
archiver = RunInfo(args = ["ar", "rcs"]),
archiver = RunInfo(args = ["ar"]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defining "rcs" in the archiver's RunInfo breaks with Conan when building a package from source. The problem is that we configure ar rcs as the archiver programm and the build system of the Conan package again passes similar flags, e.g. cs, to the archiver, resulting in a command line of the form ar rcs cs .... This fails with an error of the form:

ar: libz.a: No such file or directory

As explained in madler/zlib#588 (comment) .

If we set ar alone as the archiver then this problem goes away. However, then we need to make sure that Buck2's archive actions pass the correct arguments to ar, which is why this sets use_archiver_flags = True, see here.

Perhaps the core issue is that the archiver field conflates the archiver binary and the flags to be passed to the archiver in an action to create a static archive. If instead it followed the pattern of e.g. CCompilerInfo and there were separate archiver and archiver_flags fields then we could extract the archiver even in a configuration that explicitly configures archiver_flags and sets use_archiver_flags = False.

See also related commit message.

archiver_type = archiver_type,
type = linker_type,
link_binaries_locally = True,
archive_objects_locally = True,
use_archiver_flags = False,
use_archiver_flags = True,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -209,6 +212,8 @@ jobs:
command: |
cd examples/prelude
cmd.exe /c mklink /D prelude C:\Users\circleci\project\prelude
# Skip Conan tests on Windows. TODO[AH] Implement Windows support.
Remove-Item -Recurse -Force cpp/conan
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually do this with host_info if's in the prelude - see where we do it in other places.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difficulty here is that some of those targets are generated, namely those in examples/prelude/cpp/conan/import/BUILD. See also the internal discussion we had on this from 21/12/2022. Removing the folder seemed like the best option at the time. But, I'm happy to do something else if there's a better option.

@@ -0,0 +1,57 @@
native.export_file(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file should be named TARGETS.v2, since otherwise internal stuff gets confused

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

To test that the headers are exposed and the library is linked properly.
Defining `"rcs"` in the `archiver`'s `RunInfo` breaks with Conan on
MacOS where a prebuilt package is not available for download but has to
be built. The problem is that we configure `ar rcs` as the archiver and
the Conan package's build system again passes similar flags, e.g. `cs`.

The resulting error is as follows:
```
ar: libz.a: No such file or directory
```
As explained in
madler/zlib#588 (comment) .
The profile entry in it contains absolute paths to the build tree, which
is only valid locally. This requires a workaround.
A toolchain is for the use-case of Conan exposing its configuration and
compiler toolchain to another build tool. We have the opposite use-case,
and use Conan profiles to that end instead.
The returned second value only contains inputs implied by macros, i.e.
`$(location ...)` or similar. Inputs due to the written content must be
tracked separately.

At least for, in future the produced output artifact may track these
inputs automatically.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
5 participants