Managing shared library paths in manylinux

When building geospatial Python extensions like PyProj, rasterio, or GDAL bindings, the transition from source compilation to distributable wheels most often breaks at the dynamic linking stage. The core challenge stems from the strict portability requirements of the manylinux platform tags defined in PEP 513, PEP 571, and PEP 600. Unlike system-installed packages, manylinux wheels must bundle every shared library they need that is not on the policy's allow list. This requires precise manipulation of RPATH, RUNPATH, and dynamic linker search paths to satisfy both the CPython ABI and the geospatial toolchain. Understanding how symbol resolution and relocation work, as covered in Geospatial C-Extension Fundamentals & ABI Architecture, is a prerequisite for modifying linker flags or patching binaries.

At a high level, auditwheel repair turns a host-linked extension into a self-contained wheel:

Build extension in manylinux container
  -> Links libproj / libgdal
  -> auditwheel repair
  -> Copy external .so into the wheel's .libs directory
  -> Patch RPATH to the $ORIGIN-relative .libs path
  -> auditwheel show: zero external refs
  -> manylinux-compliant wheel

Pipeline Failure Signatures & Root Cause

CI/CD pipelines typically fail during the auditwheel repair phase, or at first import of the repaired wheel, with recognizable signatures. Each maps to a specific linker or packaging misconfiguration:

| Error Signature | Root Cause | Fix Vector |
| --- | --- | --- |
| auditwheel repair failed: libgdal.so.32 not found | auditwheel cannot locate the dependency on its library search path, or the library was compiled without --disable-rpath and embedded an absolute host path. | Add the isolated prefix to LD_LIBRARY_PATH before running auditwheel repair; verify PKG_CONFIG_PATH during compilation. |
| ERROR: manylinux_2_28_x86_64 wheel contains external shared libraries: libproj.so.25 | The extension binary references a .so outside the wheel’s .libs directory, usually due to RUNPATH injection or missing auditwheel bundling. | Force RPATH via --disable-new-dtags; ensure auditwheel copies and patches the dependency. |
| ImportError: libproj.so.25: cannot open shared object file | Runtime resolution fails because $ORIGIN relative paths are malformed, or LD_LIBRARY_PATH overrides RUNPATH in the deployment environment. | Validate DT_RPATH with readelf; confirm the $ORIGIN-relative .libs layout in the extracted wheel. |

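These failures often trace back to the linker's choice of dynamic tag. Many modern distributions configure binutils with --enable-new-dtags by default, so -rpath emits DT_RUNPATH instead of DT_RPATH. The two tags behave differently at runtime: the loader searches DT_RPATH before LD_LIBRARY_PATH and applies it to transitive dependencies, whereas DT_RUNPATH is searched after LD_LIBRARY_PATH and only for the object's direct dependencies. With RUNPATH, a stray LD_LIBRARY_PATH in the deployment environment can therefore shadow a vendored library. auditwheel repair rewrites the search path on the binaries it patches, but a portable wheel ultimately depends on the extension resolving its bundled libraries through $ORIGIN-relative entries pointing to the .libs directory rather than through the environment.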
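To make the distinction checkable in CI, here is a minimal sketch that classifies the tag from readelf -d output. The helper name rpath_kind is hypothetical and not part of binutils or auditwheel; it only inspects text on stdin, so the readelf call itself is left to the caller.

```shell
# Classify the search-path tag found in `readelf -d` output read from stdin.
# rpath_kind is a hypothetical helper name used for illustration only.
rpath_kind() {
  dyn="$(cat)"
  case "$dyn" in
    # RUNPATH is checked first: when both tags exist the loader ignores RPATH
    *"(RUNPATH)"*) echo RUNPATH ;;
    *"(RPATH)"*)   echo RPATH ;;
    *)             echo NONE ;;
  esac
}

# Usage (assumes binutils is installed and ext.so is your built extension):
#   readelf -d ext.so | rpath_kind
```

A build script could compare the result against RPATH and stop early instead of waiting for an import failure on a user's machine.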

Step-by-Step Resolution Strategy

1. Enforce Deterministic RPATH at Compile Time

Override the default binutils behavior during the Python extension build. Inject $ORIGIN-relative RPATH and disable RUNPATH generation:

# Escape $ORIGIN to prevent premature shell expansion
export LDFLAGS="-Wl,-rpath,\$ORIGIN/../.libs -Wl,--disable-new-dtags"
export CFLAGS="-fPIC -O2"

The -Wl,--disable-new-dtags flag forces the linker to emit DT_RPATH instead of DT_RUNPATH. The $ORIGIN token resolves to the directory containing the .cpython-*.so file at runtime, allowing the dynamic linker to locate vendored dependencies without environment variable overrides. Refer to the GCC Link Options documentation for flag precedence rules.
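The escaping matters more than it looks. The sketch below shows what the shell hands to the linker for three quoting styles; the variable names are illustrative, and ORIGIN is set to an empty string as a stand-in for the usual case where no such shell variable exists.

```shell
# Stand-in for the common case where the shell has no ORIGIN variable
ORIGIN=""

escaped="-Wl,-rpath,\$ORIGIN/../.libs"    # backslash keeps the token literal
single='-Wl,-rpath,$ORIGIN/../.libs'      # single quotes do the same
unescaped="-Wl,-rpath,$ORIGIN/../.libs"   # the shell expands ORIGIN to nothing

printf '%s\n' "$escaped" "$single" "$unescaped"
# -Wl,-rpath,$ORIGIN/../.libs
# -Wl,-rpath,$ORIGIN/../.libs
# -Wl,-rpath,/../.libs
```

The third form silently produces an absolute path. If the flags pass through a Makefile, the dollar sign usually needs doubling ($$ORIGIN) so that make does not expand it either.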

2. Isolate Geospatial Dependencies

Never link against host-system libraries inside a manylinux container. Compile your geospatial C dependencies into a dedicated prefix without embedded rpaths so they cannot bake in absolute host paths. Autotools-based transitive deps (e.g. libtiff, libgeotiff) accept --disable-rpath; modern PROJ and GDAL build with CMake, so pass -DCMAKE_SKIP_INSTALL_RPATH=ON instead:

# Inside manylinux Docker container — autotools dependency example
./configure \
  --prefix=/opt/geospatial \
  --disable-rpath \
  --disable-static \
  --enable-shared

# Modern PROJ/GDAL build with CMake instead of ./configure:
# cmake -DCMAKE_INSTALL_PREFIX=/opt/geospatial -DCMAKE_SKIP_INSTALL_RPATH=ON \
#       -DBUILD_SHARED_LIBS=ON -DBUILD_TESTING=OFF ..

make -j$(nproc) && make install

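# Export isolated paths for the Python extension build
# (CMake may install to lib64 on RHEL-derived images; adjust the paths if so)
export PKG_CONFIG_PATH="/opt/geospatial/lib/pkgconfig"
export GDAL_CONFIG="/opt/geospatial/bin/gdal-config"
# pyproj's build reads PROJ_DIR / PROJ_INCDIR / PROJ_LIBDIR.
# PROJ_LIB is PROJ's runtime data directory, not a link path, so it is not set here.
export PROJ_DIR="/opt/geospatial"
export PROJ_INCDIR="/opt/geospatial/include"
export PROJ_LIBDIR="/opt/geospatial/lib"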

Disabling rpath on the C libraries themselves matters. If PROJ or GDAL embeds /opt/geospatial/lib as an absolute path, that build-time path can survive into the bundled copies, where it points at a directory that will not exist on the user's machine and can surface as external library violations.
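One way to catch an embedded absolute path before packaging is to scan the prefix. The following is a sketch under assumptions: non_origin_entries is a hypothetical helper, and it parses the bracketed path list that readelf -d prints for RPATH and RUNPATH tags.

```shell
# Print every RPATH/RUNPATH entry from `readelf -d` output (stdin) that is
# not $ORIGIN-relative. non_origin_entries is a hypothetical helper name.
non_origin_entries() {
  # Pull the bracketed path list out of RPATH or RUNPATH lines
  sed -n 's/.*(R[UN]*PATH).*\[\(.*\)].*/\1/p' \
    | tr ':' '\n' \
    | awk 'NF && index($0, "$ORIGIN") != 1 && index($0, "${ORIGIN}") != 1'
}

# Usage (assumes binutils and the /opt/geospatial prefix from this guide):
#   for lib in /opt/geospatial/lib/*.so*; do
#     readelf -d "$lib" | non_origin_entries
#   done
```

Any line of output is a path baked in at build time and is worth removing before the library is bundled.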

3. Execute Targeted auditwheel Repair

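Run auditwheel with the isolated prefix on its library search path. auditwheel repair does not take a dependency directory on the command line (its -L/--lib-sdir option only names the subdirectory inside the wheel); it locates libraries through LD_LIBRARY_PATH and the system linker configuration. The tool copies the required .so files into the wheel’s .libs directory, patches RPATH entries, and verifies manylinux compliance:

LD_LIBRARY_PATH=/opt/geospatial/lib auditwheel repair dist/*.whl \
  --plat manylinux_2_28_x86_64 \
  --wheel-dir wheelhouse \
  --exclude libstdc++.so.6 \
  --exclude libgcc_s.so.1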

Note on --exclude: PyPA’s manylinux policy maintains a strict allow list of system libraries (e.g., libc, libm, libpthread, libgcc_s, libstdc++). auditwheel never bundles libraries on that list, so excluding the compiler runtime explicitly is usually redundant and serves mainly as documentation. Libraries such as libcurl, libsqlite3, and libtiff are not on the list and must be vendored like any other dependency. Never exclude geospatial core libraries (libproj, libgdal, libgeos).
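A small guard can keep that last rule from being broken by accident in a CI script. This is a sketch, not an auditwheel feature: check_excludes is a hypothetical helper that inspects the names you intend to pass to --exclude.

```shell
# Refuse --exclude names that would drop geospatial core libraries.
# check_excludes is a hypothetical helper, not part of auditwheel.
check_excludes() {
  for name in "$@"; do
    case "$name" in
      libproj*|libgdal*|libgeos*)
        echo "refusing to exclude $name" >&2
        return 1
        ;;
    esac
  done
  return 0
}

# Usage: validate the list before building the auditwheel command line
#   check_excludes libstdc++.so.6 libgcc_s.so.1 && echo "exclude list OK"
```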

Validation & Compliance Verification

After repair, validate the wheel before publishing. Do not rely on ldd alone: it resolves dependencies using the host's search paths and environment, so it can report a system copy of a library rather than the vendored one.

1. Verify DT_RPATH Injection

# Extract the compiled extension from the repaired wheel
unzip -q wheelhouse/*.whl -d /tmp/wheel_check
readelf -d /tmp/wheel_check/your_ext/*.cpython-*.so | grep -E 'RPATH|RUNPATH'

Expected output:

0x000000000000000f (RPATH) Library rpath: [$ORIGIN/../.libs]

If RUNPATH appears, the --disable-new-dtags flag was ignored or overridden.

2. Confirm Library Bundling & Path Patching

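# Check that .so files were copied and RPATH patched
patchelf --print-rpath /tmp/wheel_check/your_ext/*.cpython-*.so
# The bundle directory depends on the auditwheel version: recent releases write
# <name>.libs at the wheel root, older ones wrote .libs inside the package
ls /tmp/wheel_check/*.libs/ /tmp/wheel_check/your_ext/.libs/ 2>/dev/null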

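The bundle directory must contain copies of libproj, libgdal, and any transitive dependencies not on the manylinux allow list. auditwheel appends a hash to each copied file name, so expect names of the form libproj-<hash>.so.* rather than plain libproj.so.*. Every entry in the printed RPATH must be $ORIGIN-relative and resolve to that directory.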
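The bundling check can be scripted. Below is a minimal sketch, assuming you pass in whichever bundle directory your wheel contains; missing_bundled is a hypothetical helper, and it matches on name prefixes so that hashed file names are still found.

```shell
# Report which expected libraries are missing from a bundle directory.
# missing_bundled is a hypothetical helper: first argument is the directory,
# the remaining arguments are library name prefixes such as libproj.
missing_bundled() {
  dir="$1"
  shift
  rc=0
  for prefix in "$@"; do
    # Match on the prefix only, since copied names carry a hash suffix
    if ! ls "$dir"/"$prefix"*.so* >/dev/null 2>&1; then
      echo "$prefix"
      rc=1
    fi
  done
  return "$rc"
}

# Usage (LIBS_DIR is the bundle directory found in the extracted wheel):
#   missing_bundled "$LIBS_DIR" libproj libgdal || echo "bundle incomplete"
```

The function prints one missing prefix per line and returns non-zero if anything is absent, which makes it usable directly as a CI gate.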

3. Run auditwheel Compliance Check

auditwheel show wheelhouse/your_ext-*manylinux_2_28_x86_64.whl

Verify that the output reports the manylinux_2_28_x86_64 platform tag and lists no external dependencies beyond allow-listed system libraries. Cross-reference against PEP 600 (and PEP 571 for older manylinux2010 targets) to ensure glibc symbol versioning aligns with the target platform.

4. Runtime Smoke Test

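# Execute in a clean manylinux environment with no LD_LIBRARY_PATH.
# manylinux images keep interpreters under /opt/python; cp311-cp311 is an example, match it to your wheel's tag.
docker run --rm -v "$(pwd)/wheelhouse:/wheels" quay.io/pypa/manylinux_2_28_x86_64 \
  bash -c "PY=/opt/python/cp311-cp311/bin/python; \$PY -m pip install /wheels/*.whl && \$PY -c 'import your_ext; print(\"ABI OK\")'"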

A successful import confirms that the dynamic linker resolves vendored libraries via RPATH without host interference.

Compliance Notes

  • PyPA Policy: manylinux wheels must not depend on system libraries outside the explicit whitelist. All geospatial C dependencies must be vendored.
  • ABI Stability: Geospatial libraries frequently break ABI compatibility between minor versions. Pin exact PROJ/GDAL versions during compilation and verify symbol versioning with nm -D or objdump -T.
  • Security Boundaries: Avoid RUNPATH and LD_LIBRARY_PATH overrides in production. An $ORIGIN-relative RPATH reduces exposure to library path injection and keeps loading deterministic across heterogeneous deployment environments.
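The ABI stability check mentioned above can be approximated by diffing exported symbols between two builds. This is a rough sketch: removed_symbols is a hypothetical helper, and it expects two files that each hold the output of nm -D --defined-only for one library version.

```shell
# List symbols exported by the old library but absent from the new one.
# removed_symbols is a hypothetical helper; both arguments are files holding
# `nm -D --defined-only` output.
removed_symbols() {
  old_sorted="$(mktemp)"
  new_sorted="$(mktemp)"
  # The symbol name is the last field on each nm line
  awk '{print $NF}' "$1" | sort -u > "$old_sorted"
  awk '{print $NF}' "$2" | sort -u > "$new_sorted"
  # comm -23 keeps lines that appear only in the first file
  comm -23 "$old_sorted" "$new_sorted"
  rm -f "$old_sorted" "$new_sorted"
}

# Usage (old.so and new.so stand for two builds of the same library):
#   nm -D --defined-only old.so > old.txt
#   nm -D --defined-only new.so > new.txt
#   removed_symbols old.txt new.txt
```

An empty result does not prove ABI compatibility, since struct layouts and function signatures can change without a symbol disappearing, but a non-empty one is a clear signal to rebuild against the pinned version.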

For deeper mechanics on how the ELF dynamic linker processes relocation tables and symbol visibility, consult the Shared Library Path Resolution reference.