Build Artifact Structuring and Packaging
The transition from source compilation to distributable binaries in the geospatial Python ecosystem demands deterministic artifact generation. Spatial extensions depend on complex C/C++/Fortran toolchains, dynamic system libraries, and strict ABI boundaries. Without rigorous structuring, CI pipelines produce fragile wheels that fail silently during runtime linking or violate enterprise registry policies. This cluster isolates the mechanics of organizing compiled outputs, bundling native dependencies, and enforcing platform-compliant packaging for multi-architecture deployment.
Standardized Artifact Directory Layout
Predictable build outputs are foundational to reproducible CI/CD. Geospatial packages must strictly separate pure Python modules, compiled extension binaries, and vendored native libraries. Adhering to a canonical layout prevents in-tree pollution and simplifies downstream validation:
build/
├── lib/ # Pure Python modules (importable)
├── ext/ # Compiled .so / .pyd / .dylib extensions
├── native/ # Vendored GDAL, PROJ, GEOS, SQLite binaries
├── metadata/ # .dist-info, RECORD, WHEEL, entry_points
└── wheels/ # Final .whl outputs per platform tag
Enforce this hierarchy during the build phase using explicit output routing:
python -m build --outdir build/wheels/ --no-isolation
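A small pre-packaging check can confirm that the hierarchy actually exists before wheels are assembled. The sketch below is illustrative only: the helper name `missing_layout_dirs` is hypothetical, and the directory names simply mirror the tree shown above.

```python
from pathlib import Path

# Subdirectories named in the canonical layout.
EXPECTED_DIRS = ("lib", "ext", "native", "metadata", "wheels")


def missing_layout_dirs(build_root):
    """Return the expected subdirectories that are absent under build_root."""
    root = Path(build_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
```

A CI step could call `missing_layout_dirs("build")` and fail the job when the returned list is non-empty.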
In-tree builds should be disabled. Configure your build backend to map directories explicitly during the bdist_wheel phase. When using setuptools, prevent accidental data leakage by restricting package discovery to src/ and turning off automatic package-data inclusion, so that only explicitly listed binaries ship:
# pyproject.toml
[tool.setuptools]
include-package-data = false
[tool.setuptools.packages.find]
where = ["src"]
exclude = ["tests*", "benchmarks*", "docs*"]
[tool.setuptools.package-data]
"mypackage.ext" = ["*.so", "*.pyd"]
"mypackage.native" = ["*.so", "*.dylib", "*.dll"]
For CMake-driven spatial packages, [tool.scikit-build] provides stricter control over staging and wheel assembly. Refer to Mastering pyproject.toml for Spatial Wheels for advanced backend routing and metadata generation patterns.
A repaired wheel is just a zip archive with a predictable internal layout, so any zip-aware tool can confirm that extensions and vendored libraries landed where expected.
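Because a wheel is a zip archive, the standard library is enough to inspect it. The following is a minimal sketch rather than a packaging tool: both function names are hypothetical, and it only lists names without validating anything.

```python
import zipfile


def list_wheel_members(wheel_path):
    """Return the archive member names of a wheel, sorted for stable output."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return sorted(wheel.namelist())


def top_level_entries(wheel_path):
    """Return the distinct top-level directories or files inside a wheel."""
    members = list_wheel_members(wheel_path)
    return sorted({name.split("/", 1)[0] for name in members})
```

For a repaired Linux wheel, `top_level_entries` would typically show the package directory, its `.dist-info` directory, and a directory holding the bundled shared libraries.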
Native Library Bundling & RPATH Management
Spatial wheels fail when dynamic linkers cannot resolve shared objects at runtime. The bundling strategy must address platform-specific relocation semantics: $ORIGIN for Linux ELF, @loader_path/@rpath for macOS Mach-O, and explicit DLL search directories on Windows, where Python 3.8+ no longer consults PATH when resolving an extension module's dependent DLLs.
During the C/C++ compilation stage, embed relative runtime paths to guarantee relocatable binaries. When leveraging Integrating CMake with scikit-build-core, configure the following variables to prevent hard-coded absolute paths:
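# CMakeLists.txt
set(CMAKE_BUILD_WITH_INSTALL_RPATH ON)
# Use the loader-relative token that matches the target platform
if(APPLE)
set(CMAKE_INSTALL_RPATH "@loader_path/../native")
else()
set(CMAKE_INSTALL_RPATH "$ORIGIN/../native")
endif()
# OFF keeps absolute link-time directories out of the installed RPATH
set(CMAKE_INSTALL_RPATH_USE_LINK_PATH OFF)
set(CMAKE_MACOSX_RPATH ON)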
Post-compilation, repair wheels using platform-specific tooling. On Linux, auditwheel copies non-system libraries into the wheel (recent releases place them in a sibling <package>.libs directory) and rewrites RPATH entries to point at that directory relative to $ORIGIN:
LD_LIBRARY_PATH=/opt/gdal/lib:/opt/proj/lib \
auditwheel repair dist/*.whl \
--plat manylinux_2_28_x86_64 \
--wheel-dir build/wheels/
On macOS, delocate rewrites install names and verifies architecture slices:
delocate-wheel -w build/wheels/ -v dist/*.whl \
--require-archs x86_64,arm64 \
--check-archs
Windows wheels require an explicit inventory of the DLLs they bundle and os.add_dll_directory() calls in the package's import entry point (typically __init__.py), executed before the compiled extension is imported. Avoid bundling MSVCRT or system-level CRTs; rely on the Visual C++ Redistributable runtime instead.
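The os.add_dll_directory() call can be wrapped so that the same package code imports cleanly on every platform. This is a sketch under stated assumptions: the function name `register_native_dll_dir` is hypothetical, and the `native` subdirectory name is borrowed from the layout used in this guide rather than from any tool's convention.

```python
import os
import sys
from pathlib import Path


def register_native_dll_dir(package_dir, subdir="native"):
    """Add the bundled DLL directory to the Windows DLL search path.

    Returns the handle from os.add_dll_directory(), or None on other
    platforms or when the directory does not exist.
    """
    if sys.platform != "win32":
        return None
    dll_dir = (Path(package_dir) / subdir).resolve()
    if not dll_dir.is_dir():
        return None
    # The directory stays registered until close() is called on the handle.
    return os.add_dll_directory(str(dll_dir))
```

In a package's `__init__.py`, the call would be `register_native_dll_dir(Path(__file__).parent)`, placed before the first import of the compiled extension.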
ABI Compliance & Platform Tagging
Geospatial wheels must declare accurate platform tags to prevent silent ABI mismatches. Installers decide compatibility from those tags, whose Linux definitions come from PEP 600 (manylinux) and PEP 656 (musllinux). Spatial extensions compiled against glibc 2.28+ cannot safely target manylinux_2_17, and PROJ/GDAL upgrades frequently introduce C++17 ABI breaks.
Validate platform tags before distribution:
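# Inspect wheel metadata and platform compatibility
# (wheel unpack accepts one wheel at a time, so loop over the glob)
for whl in dist/*.whl; do wheel unpack "$whl" -d /tmp/wheel-inspect; done
# The WHEEL file lives inside each unpacked tree's .dist-info directory
cat /tmp/wheel-inspect/*/*.dist-info/WHEEL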
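The same inspection can be scripted without unpacking to disk. The sketch below reads the Tag entries straight from the archive; `read_wheel_tags` is a hypothetical helper name, and it relies only on the fact that the WHEEL file uses "Key: value" header lines.

```python
import zipfile
from email.parser import Parser


def read_wheel_tags(wheel_path):
    """Return the Tag entries from a wheel's .dist-info/WHEEL file."""
    with zipfile.ZipFile(wheel_path) as wheel:
        candidates = [
            name for name in wheel.namelist() if name.endswith(".dist-info/WHEEL")
        ]
        if len(candidates) != 1:
            raise ValueError(
                f"expected exactly one WHEEL file, found {len(candidates)}"
            )
        text = wheel.read(candidates[0]).decode("utf-8")
    # WHEEL uses "Key: value" headers, so the stdlib email parser can read it.
    return Parser().parsestr(text).get_all("Tag") or []
```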
Ensure your CI matrix aligns with target base images. For Linux, manylinux_2_28_x86_64 covers RHEL 8+/Ubuntu 20.04+ and is the current baseline for modern geospatial stacks. For macOS, macosx_11_0_universal2 guarantees native execution on both Intel and Apple Silicon without Rosetta translation overhead.
Cross-platform ABI verification should be automated:
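# Linux: report external shared-library dependencies and the platform tag each wheel satisfies
for whl in build/wheels/*.whl; do auditwheel show "$whl"; done
# macOS: list every library dependency, including those bundled in the wheel
delocate-listdeps --all build/wheels/*.whl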
Mismatched tags or unresolved symbols must fail the pipeline immediately. The foundational standards for wheel generation and backend selection are documented in Modern Python Build Tooling & Wheel Configuration.
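A fail-fast tag check can be as simple as comparing each wheel filename against an allow-list. This is a minimal sketch: the `ALLOWED_PLATFORMS` policy and both function names are hypothetical, and the parsing assumes the standard wheel filename scheme, in which the final dash-separated field holds one or more dot-separated platform tags.

```python
from pathlib import Path

# Hypothetical policy: the platform tags this pipeline is allowed to publish.
ALLOWED_PLATFORMS = frozenset({"manylinux_2_28_x86_64", "macosx_11_0_universal2"})


def platform_tags(wheel_filename):
    """Return the platform tags encoded in a wheel filename."""
    name = Path(wheel_filename).name
    if not name.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {name}")
    parts = name[: -len(".whl")].split("-")
    # name-version[-build]-python-abi-platform gives 5 or 6 fields.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename structure: {name}")
    return parts[-1].split(".")


def disallowed_platforms(wheel_filename, allowed=ALLOWED_PLATFORMS):
    """Return the platform tags in the filename that are not on the allow-list."""
    return [tag for tag in platform_tags(wheel_filename) if tag not in allowed]
```

A pipeline step would exit non-zero whenever `disallowed_platforms` returns a non-empty list, for example when an unrepaired `linux_x86_64` wheel slips through.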
Deterministic Validation & Registry Ingestion
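Non-deterministic artifacts undermine hash-pinned installs and registry audits: PyPI does not allow an existing filename to be re-uploaded with different contents, and many enterprise registries enforce similar immutability. Reproducible builds require explicit environment pinning and timestamp normalization: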
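# Pin timestamps to the last commit (assumes a git checkout); the current time would differ on every run
export SOURCE_DATE_EPOCH=$(git log -1 --format=%ct)
export PYTHONHASHSEED=0
# CI-specific default for runs outside a Buildkite pull request
export BUILDKITE_PULL_REQUEST=${BUILDKITE_PULL_REQUEST:-false}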
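One way to confirm that the pinning worked is to build the same revision twice and compare digests. The sketch below assumes two wheel files already exist on disk; the function names `sha256_of` and `builds_match` are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(first_wheel, second_wheel):
    """Return True when two build outputs are byte-for-byte identical."""
    return sha256_of(first_wheel) == sha256_of(second_wheel)
```

A mismatch does not say which input drifted, only that the build is not yet reproducible.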
Before registry upload, enforce a strict validation gate:
# Check that each wheel's long description will render on PyPI
twine check build/wheels/*.whl
# Smoke-test installation in an isolated, throwaway environment
python -m venv /tmp/validate-env
/tmp/validate-env/bin/pip install --no-deps build/wheels/*.whl
/tmp/validate-env/bin/python -c "import mypackage; print(mypackage.__version__)"
Registry ingestion pipelines (Artifactory, Nexus, PyPI) require exact RECORD hashes and consistent wheel naming conventions. Automate hash verification and signature generation using sigstore or GPG for internal distribution. Cache wheel artifacts across CI runs using platform-tagged keys to avoid redundant compilation, but never cache build/ directories across different ABI targets.
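RECORD hash verification can be automated with the standard library alone. The following sketch assumes the usual RECORD row format of path, `algorithm=digest` (URL-safe base64 without padding), and size; the function name `record_mismatches` is hypothetical, and signature handling is out of scope here.

```python
import base64
import csv
import hashlib
import io
import zipfile


def record_mismatches(wheel_path):
    """Return archive paths whose digest disagrees with the wheel's RECORD."""
    mismatches = []
    with zipfile.ZipFile(wheel_path) as wheel:
        records = [n for n in wheel.namelist() if n.endswith(".dist-info/RECORD")]
        if len(records) != 1:
            raise ValueError(f"expected exactly one RECORD, found {len(records)}")
        text = wheel.read(records[0]).decode("utf-8")
        for row in csv.reader(io.StringIO(text)):
            if len(row) < 2 or not row[1]:
                continue  # RECORD lists itself without a hash
            path, hash_field = row[0], row[1]
            algorithm, _, expected = hash_field.partition("=")
            try:
                data = wheel.read(path)
            except KeyError:
                mismatches.append(path)  # listed in RECORD but not in the archive
                continue
            digest = hashlib.new(algorithm, data).digest()
            actual = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
            if actual != expected:
                mismatches.append(path)
    return mismatches
```

An empty list means every hashed entry in RECORD matches the archive contents; files present in the archive but absent from RECORD are not detected by this sketch.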
For comprehensive guidance on isolating build environments and caching compiled spatial dependencies, consult Environment Isolation with Pixi and Conda and Async Build Execution and Cache Strategies.