Optimizing scikit-build-core for GDAL

Compiling GDAL bindings for production Python geospatial stacks requires abandoning legacy setuptools extension hacks in favor of deterministic PEP 517/518 build backends. The transition to scikit-build-core introduces strict CMake orchestration, isolated build environments, and rigid wheel tagging requirements. This guide targets package maintainers, DevOps engineers, and data platform teams who need exact error-to-fix mapping, rapid CI/CD recovery, and PyPA-compliant artifact generation.

CI/CD Failure Signatures & Deterministic Recovery

CI runners routinely fail during the GDAL wheel build phase because of missing CMake modules, ABI mismatches, or unresolved or conflicting PROJ/GDAL symbols. The entries below map each error signature to its root cause, an immediate fix, and a validation command.

Error signature: CMake Error: Could NOT find GDAL (missing: GDAL_LIBRARY GDAL_INCLUDE_DIR)
  • Root cause: CMake cannot resolve GDALConfig.cmake or fall back to gdal-config
  • Immediate fix: Export CMAKE_PREFIX_PATH=/opt/gdal or set GDAL_DIR explicitly in pyproject.toml
  • Validation command: cmake --find-package -DNAME=GDAL -DLANGUAGE=CXX -DMODE=EXIST -DCOMPILER_ID=GNU

Error signature: undefined reference to 'GDALOpen' / PROJ: undefined symbol: proj_create
  • Root cause: Dynamic linking against incompatible PROJ/GDAL ABIs across build stages
  • Immediate fix: Force static linkage via -DGDAL_USE_EXTERNAL_LIBS=OFF or pin PROJ_VERSION in CMake args
  • Validation command: ldd build/*/gdal.*.so | grep -E "proj|gdal"

Error signature: scikit_build_core.errors.CMakeNotFoundError: CMake 3.26+ is required
  • Root cause: Outdated runner image or missing cmake in build-system.requires
  • Immediate fix: Add cmake>=3.26 and ninja>=1.11 to [build-system].requires
  • Validation command: pip show cmake | grep Version

Error signature: manylinux2014_x86_64 wheel contains incompatible PROJ version
  • Root cause: auditwheel detects non-compliant shared libraries or glibc version drift
  • Immediate fix: Bundle PROJ/GDAL .so files into the wheel or upgrade to --plat manylinux_2_28_x86_64
  • Validation command: auditwheel show dist/*.whl
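
The signature-to-fix mapping above lends itself to automation in a CI log step. The following is a minimal sketch, not part of any tool: the names Diagnosis, SIGNATURES, and diagnose are hypothetical, and the regular expressions only mirror the signatures listed above, so they may need adjusting to the exact wording your toolchain emits.

```python
import re
from typing import NamedTuple, Optional


class Diagnosis(NamedTuple):
    root_cause: str
    fix: str


# Illustrative patterns only: they mirror the error signatures listed in
# this guide and are assumptions about the log wording.
SIGNATURES = [
    (re.compile(r"Could NOT find GDAL"),
     Diagnosis("CMake cannot locate GDAL",
               "export CMAKE_PREFIX_PATH or set GDAL_DIR")),
    (re.compile(r"undefined reference to .GDALOpen|undefined symbol: proj_create"),
     Diagnosis("incompatible PROJ/GDAL ABIs across build stages",
               "link every stage against the same pinned PROJ/GDAL build")),
    (re.compile(r"CMakeNotFoundError"),
     Diagnosis("CMake missing or too old in the build environment",
               "require cmake>=3.26 for the build")),
    (re.compile(r"incompatible PROJ version"),
     Diagnosis("non-compliant shared libraries in the wheel",
               "bundle PROJ/GDAL or target manylinux_2_28_x86_64")),
]


def diagnose(log_text: str) -> Optional[Diagnosis]:
    """Return the diagnosis for the first known signature found in a build log."""
    for line in log_text.splitlines():
        for pattern, diagnosis in SIGNATURES:
            if pattern.search(line):
                return diagnosis
    return None
```

Scanning line by line means the earliest failure in the log wins, which usually points at the root cause and not at a follow-on error.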

Production-Grade pyproject.toml Architecture

Fragmented setup.py logic must be replaced with a deterministic pyproject.toml that isolates the build environment, enforces CMake generator selection, and configures wheel tagging for geospatial compatibility. The baseline configuration below aligns with Modern Python Build Tooling & Wheel Configuration standards and eliminates implicit host-environment leakage.

[build-system]
requires = ["scikit-build-core>=0.9.0", "cmake>=3.26", "ninja>=1.11"]
build-backend = "scikit_build_core.build"

[project]
name = "gdal-bindings"
version = "3.8.4"
requires-python = ">=3.9"
dependencies = ["numpy>=1.22"]

[tool.scikit-build]
cmake.version = ">=3.26"
cmake.args = [
    "-DCMAKE_BUILD_TYPE=Release",
    "-DCMAKE_CXX_COMPILER_LAUNCHER=ccache",
    "-DGDAL_USE_EXTERNAL_LIBS=ON",
    "-DPROJ_USE_EXTERNAL_LIBS=ON",
    "-DCMAKE_FIND_ROOT_PATH=/opt/gdal;/opt/proj",
    "-DCMAKE_FIND_ROOT_PATH_MODE_PROGRAM=NEVER",
    "-DCMAKE_FIND_ROOT_PATH_MODE_LIBRARY=ONLY"
]
wheel.build-tag = "1"
build-dir = "build/{wheel_tag}"
logging.level = "INFO"

Key architectural decisions:

  • cmake.args enforces ccache for incremental CI builds and scopes CMAKE_FIND_ROOT_PATH to the pinned /opt/gdal;/opt/proj prefixes (required for CMAKE_FIND_ROOT_PATH_MODE_LIBRARY=ONLY to resolve anything) to prevent host-system library bleed.
  • build-dir = "build/{wheel_tag}" guarantees isolated artifact staging per Python ABI, preventing cross-ABI cache corruption.
  • wheel.build-tag produces deterministic, registry-friendly filenames; leaving wheel.py-api unset yields version-specific cp3X-cp3X tags, which are preferred for GDAL’s NumPy/C++ coupling.
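
These decisions can be checked mechanically before a build starts. The sketch below assumes the [tool.scikit-build] layout shown above; the helper names parse_defines, check_scikit_build_table, and check_pyproject are hypothetical, and tomllib is only available in the standard library from Python 3.11.

```python
def parse_defines(cmake_args):
    """Collect -DNAME=VALUE entries from a cmake.args list into a dict."""
    defines = {}
    for arg in cmake_args:
        if arg.startswith("-D") and "=" in arg:
            name, value = arg[2:].split("=", 1)
            # Drop an optional type suffix such as NAME:STRING.
            defines[name.split(":", 1)[0]] = value
    return defines


def check_scikit_build_table(table):
    """Return warnings for a parsed [tool.scikit-build] table (a dict)."""
    warnings = []
    defines = parse_defines(table.get("cmake", {}).get("args", []))
    # MODE_LIBRARY=ONLY restricts library lookups to the find-root prefixes,
    # so an empty CMAKE_FIND_ROOT_PATH leaves nothing to search.
    if (defines.get("CMAKE_FIND_ROOT_PATH_MODE_LIBRARY") == "ONLY"
            and not defines.get("CMAKE_FIND_ROOT_PATH")):
        warnings.append("MODE_LIBRARY=ONLY is set but CMAKE_FIND_ROOT_PATH is empty")
    if "{wheel_tag}" not in table.get("build-dir", ""):
        warnings.append("build-dir is not keyed by {wheel_tag}")
    return warnings


def check_pyproject(path="pyproject.toml"):
    """Load pyproject.toml and check its [tool.scikit-build] table."""
    import tomllib  # standard library on Python 3.11+

    with open(path, "rb") as handle:
        data = tomllib.load(handle)
    return check_scikit_build_table(data.get("tool", {}).get("scikit-build", {}))
```

An empty list from check_pyproject() means neither of the two conditions was violated; it does not validate the rest of the configuration.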

CMake Discovery, ABI Pinning & Path Resolution

scikit-build-core executes CMake in a strictly isolated virtual environment. GDAL’s modern build system, as defined in GDAL RFC 84: Migrating build systems to CMake, relies on find_package(GDAL) and find_package(PROJ) with explicit module paths. When cross-compiling or running in minimal manylinux containers, you must override the default search hierarchy.

For deeper backend mechanics on generator selection and cache hydration, consult Integrating CMake with scikit-build-core. In practice, path resolution requires three coordinated environment exports:

export CMAKE_PREFIX_PATH="/opt/gdal:/opt/proj"
export PKG_CONFIG_PATH="/opt/proj/lib/pkgconfig:/opt/gdal/lib/pkgconfig"
export CMAKE_LIBRARY_PATH="/opt/gdal/lib:/opt/proj/lib"
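
When the build is driven from a Python script instead of a shell, the same three variables can be derived from a list of install prefixes. This is a minimal sketch under stated assumptions: geospatial_build_env is a hypothetical helper, and it assumes each prefix keeps its libraries in lib/ and its pkg-config files in lib/pkgconfig (some distributions use lib64/ instead).

```python
import os


def geospatial_build_env(prefixes, base_env=None):
    """Return a copy of the environment with CMake/pkg-config search paths set.

    prefixes: install prefixes such as ["/opt/gdal", "/opt/proj"].
    base_env: mapping to extend; defaults to the current process environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Environment-variable path lists use the platform separator (":" on Linux).
    env["CMAKE_PREFIX_PATH"] = os.pathsep.join(prefixes)
    env["PKG_CONFIG_PATH"] = os.pathsep.join(
        os.path.join(prefix, "lib", "pkgconfig") for prefix in prefixes
    )
    env["CMAKE_LIBRARY_PATH"] = os.pathsep.join(
        os.path.join(prefix, "lib") for prefix in prefixes
    )
    return env
```

The returned mapping can be passed as the env argument of subprocess.run when invoking the build frontend, which keeps the overrides scoped to that one child process.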

These variables force CMake to bypass system defaults and resolve against pinned geospatial prefixes. To verify ABI compatibility before wheel packaging, inspect symbol visibility:

nm -D build/*/gdal.*.so | grep -E "GDALOpen|proj_create" | head -n 5

With nm -D, GDAL/PROJ symbols normally appear as U (undefined). This is the expected result: they are resolved at load time from the linked libgdal/libproj through the ELF NEEDED entries. They appear as T only if GDAL/PROJ were statically linked into the module. Confirm that the libraries are actually present with ldd build/*/gdal.*.so (readelf -d lists the NEEDED entries themselves); a genuinely missing dependency shows up in the ldd output as not found.
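
The U-versus-T reading described above can be scripted. The sketch below parses text in the usual nm -D layout (an optional address column, a one-letter type, then the symbol name); classify_symbols is a hypothetical helper, and the layout is an assumption about GNU nm that may differ in other toolchains.

```python
def classify_symbols(nm_output, names):
    """Map each requested symbol found in `nm -D` output to its type letter.

    "U" means the symbol is resolved from a shared library at load time;
    "T" means it is defined in the module's own text section.
    Symbols that do not appear in the output are omitted from the result.
    """
    wanted = set(names)
    found = {}
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) == 2:       # undefined: no address column
            sym_type, symbol = parts
        elif len(parts) == 3:     # defined: address, type, name
            _, sym_type, symbol = parts
        else:
            continue
        # Newer binutils append a version suffix such as name@VERSION.
        base = symbol.split("@", 1)[0]
        if base in wanted:
            found[base] = sym_type
    return found
```

A symbol that is absent from the returned mapping is not referenced by the module at all, which usually means the wrong file was inspected.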

Wheel Validation & PyPA Compliance Gates

Geospatial wheels must comply with PEP 517/518 build isolation and PEP 600 manylinux platform tags. The auditwheel repair step is non-negotiable for production distribution. Run the following pipeline immediately after scikit-build-core completes:

# 1. Verify wheel metadata integrity
python -m wheel unpack dist/gdal_bindings-*.whl -d /tmp/wheel-inspect
cat /tmp/wheel-inspect/gdal_bindings-*/gdal_bindings-*.dist-info/METADATA

# 2. Repair shared library dependencies for target platform
auditwheel repair dist/gdal_bindings-*.whl --plat manylinux_2_28_x86_64 --wheel-dir wheelhouse/

# 3. Validate glibc and bundled library compliance
auditwheel show wheelhouse/gdal_bindings-*.whl

auditwheel repair copies external .so dependencies into a .libs directory named after the package (for example gdal_bindings.libs/) inside the wheel and rewrites the RPATH of the extension modules so that they locate those copies relative to $ORIGIN. The repaired wheel can then run on any Linux host that satisfies the target platform tag, without a system-level GDAL installation.
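
Because a wheel is a zip archive, the vendoring step can be confirmed without installing anything. This is a minimal sketch: bundled_libraries is a hypothetical helper, and it assumes the top-level "<name>.libs/" layout used by recent auditwheel releases.

```python
import zipfile


def bundled_libraries(wheel):
    """List shared objects stored under any top-level '*.libs/' directory.

    `wheel` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(wheel) as archive:
        return sorted(
            name
            for name in archive.namelist()
            if name.split("/", 1)[0].endswith(".libs")
            and ".so" in name.rsplit("/", 1)[-1]
        )
```

An empty result for a repaired GDAL wheel suggests that the extension was linked statically or that the repair step did not run.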

Post-Build Verification Checklist

Execute these validation gates before promoting artifacts to staging or production registries:

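  1. ABI Tag Verification: Confirm the wheel filename ends in the tags for the interpreter you built against, for example cp39-cp39-manylinux_2_28_x86_64.whl for CPython 3.9. When the tags do not match the target interpreter or platform, pip skips the wheel and falls back to a source build if an sdist is available.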
  2. Import Smoke Test: Install the wheel into a clean venv with pip install --no-deps, then run python -c "from osgeo import gdal; print(gdal.__version__)".
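  3. Dependency Graph Audit: Install the runtime dependencies, then execute pip check to confirm that the installed numpy satisfies the declared numpy>=1.22 requirement. pip check validates declared metadata only; it does not detect ABI mismatches in the compiled extension.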
  4. Cache Determinism: Rebuild the wheel twice with identical pyproject.toml and verify SHA-256 checksums match. Divergence indicates non-deterministic CMake cache or timestamp injection.
  5. Registry Upload Compliance: Run twine check wheelhouse/*.whl to confirm that the package metadata is well formed and the long description renders before uploading to PyPI or Artifactory.
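
Gates 1 and 4 are straightforward to script. The sketch below uses only the standard library; wheel_tags, sha256_of, and same_artifact are hypothetical helper names, and the filename parsing assumes the standard name-version[-build]-python-abi-platform.whl layout.

```python
import hashlib


def wheel_tags(filename):
    """Split a wheel filename into its (python, abi, platform) tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # name-version-python-abi-platform, with an optional build tag after version
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    return tuple(parts[-3:])


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_artifact(path_a, path_b):
    """True when two build outputs are byte-for-byte identical."""
    return sha256_of(path_a) == sha256_of(path_b)
```

If two rebuilds differ, exporting SOURCE_DATE_EPOCH, the conventional reproducible-builds variable, before both builds is a common first step for ruling out embedded timestamps.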

Adhering to these gates catches the most common geospatial wheel deployment failures before release and keeps the artifacts aligned with PyPA packaging standards.