Step-by-step C-extension lifecycle for Python GIS

Building and distributing geospatial Python packages requires strict adherence to the C-API vs CPython ABI Compatibility contract. When GDAL, PROJ, or PyProj extensions break in CI/CD, it is rarely a logic failure. It is almost always an ABI mismatch, missing dynamic symbol, or rpath misconfiguration. This guide details the exact lifecycle from source compilation to production wheel deployment, optimized for rapid pipeline recovery and deterministic builds.

The five phases form a linear pipeline, each gated by validation before the next begins:

flowchart LR
    P1["1. ABI and toolchain"] --> P2["2. Dependency resolution"]
    P2 --> P3["3. Build isolation"]
    P3 --> P4["4. RPATH hardening"]
    P4 --> P5["5. CI/CD deploy"]

Phase 1: ABI Contract & Toolchain Initialization

Geospatial C-extensions must either target a specific Python minor version or opt into the Stable ABI (abi3). Targeting abi3 shrinks the build matrix but restricts the extension to the limited C-API subset defined in PEP 384. Mixing abi3 with GDAL/PROJ C++ wrappers can trigger errors such as undefined symbol: PyUnicode_AsUTF8 at import time, because that function only joined the limited API in Python 3.13 (its sibling PyUnicode_AsUTF8AndSize joined in 3.10) and is not part of the stable ABI of older interpreters.
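
The version floor of an abi3 build is expressed through the Py_LIMITED_API macro, whose value encodes the minimum Python version in the same layout as PY_VERSION_HEX. As a minimal sketch (the helper name limited_api_hex is hypothetical and not part of any library), the value can be derived like this:

```python
def limited_api_hex(major: int, minor: int) -> str:
    """Return the Py_LIMITED_API value for a minimum Python version.

    Layout follows PY_VERSION_HEX: major in the top byte, minor in the
    next byte, and zeros for the micro and release-level bytes.
    (Hypothetical helper, shown for illustration only.)
    """
    if not (0 <= major <= 0xFF and 0 <= minor <= 0xFF):
        raise ValueError("major and minor must each fit in one byte")
    return f"0x{(major << 24) | (minor << 16):08X}"


# Floor for a cp39-abi3 wheel, then for a 3.10 floor.
print(limited_api_hex(3, 9))   # 0x03090000
print(limited_api_hex(3, 10))  # 0x030A0000
```

With setuptools, a value like this is typically passed to the extension as a define_macros entry alongside py_limited_api=True, so that the compiled code and the wheel tag agree on the same floor.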

Production pyproject.toml ABI Configuration:

[build-system]
requires = ["setuptools>=68.0", "wheel", "Cython>=3.0", "setuptools_scm[toml]>=8.0"]
build-backend = "setuptools.build_meta"

[project]
name = "my-geospatial-ext"
requires-python = ">=3.9"
classifiers = [
    "Programming Language :: Python :: 3 :: Only",
    "Programming Language :: Python :: 3.9",
    "Programming Language :: Python :: 3.10",
    "Programming Language :: Python :: 3.11",
    "Programming Language :: Python :: 3.12",
]

[tool.distutils.bdist_wheel]
py-limited-api = "cp39"  # Emit a single abi3 wheel (Python 3.9+)

Validation Command:

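# Print the suffix this interpreter uses for version-specific extensions
python -c "import sysconfig; print(sysconfig.get_config_var('EXT_SUFFIX'))"
# e.g. .cpython-311-x86_64-linux-gnu.so (an abi3 suffix is never reported here)
# Then compare against the extension actually packed into the wheel
for whl in dist/*.whl; do python -m zipfile -l "$whl" | grep "\.so"; done
# Version-specific build -> my_ext.cpython-311-x86_64-linux-gnu.so
# Stable-ABI (abi3) build -> my_ext.abi3.so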
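
The same check can be scripted so that a pipeline fails fast when a build emits the wrong kind of extension. The following is a rough sketch that assumes Linux-style file names; the function name abi_kind and the regular expression are illustrative, not taken from any packaging tool:

```python
import re

# Matches suffixes such as ".cpython-311-x86_64-linux-gnu.so".
_VERSIONED = re.compile(r"\.cpython-(\d)(\d+)[a-z]*-[\w-]+\.so$")


def abi_kind(filename: str) -> str:
    """Classify an extension file name as 'abi3', 'cpXY', or 'unknown'."""
    if filename.endswith(".abi3.so"):
        return "abi3"
    match = _VERSIONED.search(filename)
    if match:
        return f"cp{match.group(1)}{match.group(2)}"
    return "unknown"


print(abi_kind("my_ext.abi3.so"))
print(abi_kind("my_ext.cpython-311-x86_64-linux-gnu.so"))
```

A build configured with py-limited-api should yield abi3 for every compiled module; anything else suggests the setting was not picked up by the build backend.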

Phase 2: Geospatial Dependency Resolution & Linkage

GDAL and PROJ introduce complex transitive dependencies (libcurl, libsqlite3, libtiff, libproj, libgeos). As covered in Geospatial C-Extension Fundamentals & ABI Architecture, an extension must either vendor these libraries statically or link dynamically against a known, pinned system baseline. In CI/CD, the manylinux container isolates the build environment, but pkg-config can still resolve to stale headers or a mismatched libproj minor version.

CI/CD Dependency Bootstrap (GitHub Actions / Docker):

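env:
  PROJ_VERSION: "9.3.1"
  GDAL_VERSION: "3.8.4"
  # Treat implicit declarations as errors so a missing C-API prototype fails at compile time.
  CFLAGS: "-O2 -fPIC -Wno-unused-variable -Werror=implicit-function-declaration"
  # No -Wl,--no-undefined: extension modules leave Python symbols for the interpreter to resolve.
  LDFLAGS: "-Wl,-rpath,'$ORIGIN/.libs'"

steps:
  - name: Install PROJ & GDAL (manylinux compatible)
    run: |
      yum install -y epel-release
      yum install -y gcc gcc-c++ make cmake pkgconfig sqlite-devel curl-devel \
        proj-devel proj proj-data gdal-devel
  - name: Verify pkg-config resolution
    run: |
      # Fail the job when the resolved versions differ from the pins above.
      # Distribution packages often lag behind; build from source if they do not match.
      test "$(pkg-config --modversion proj)" = "$PROJ_VERSION"
      test "$(pkg-config --modversion gdal)" = "$GDAL_VERSION"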
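
Where the pin check needs to report every mismatch at once rather than stop at the first, it can be moved into a small script. This is a sketch under stated assumptions: the function names are hypothetical, the only external call is pkg-config --modversion, and the lookup is injectable so the comparison logic can be exercised without pkg-config installed.

```python
import subprocess
from typing import Callable, Dict, List


def pkg_config_version(package: str) -> str:
    """Ask pkg-config which version it resolves for a package."""
    result = subprocess.run(
        ["pkg-config", "--modversion", package],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def find_mismatches(
    pins: Dict[str, str],
    lookup: Callable[[str], str] = pkg_config_version,
) -> List[str]:
    """Return one message per package whose resolved version differs from its pin."""
    problems = []
    for package, expected in pins.items():
        actual = lookup(package)
        if actual != expected:
            problems.append(
                f"{package}: expected {expected}, pkg-config resolved {actual}"
            )
    return problems
```

In a pipeline, the pins would come from the PROJ_VERSION and GDAL_VERSION environment variables, and a non-empty result would fail the job before any compilation starts.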

Phase 3: Build Isolation & Cross-Compiler Toolchain Setup

Deterministic geospatial builds require strict environment isolation. Use cibuildwheel to orchestrate cross-platform compilation across manylinux_2_28, manylinux_2_34, and musllinux_1_2 targets. Do not rely on host system toolchains.

cibuildwheel Integration:

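[tool.cibuildwheel]
build-frontend = "build"
skip = ["cp38-*", "pp*"]
# Shorthand resolves to the image pinned by the installed cibuildwheel release (":latest" is not reproducible)
manylinux-x86_64-image = "manylinux_2_28"
environment = { PROJ_LIB="/usr/share/proj", GDAL_DATA="/usr/share/gdal" }
before-all = "yum install -y epel-release && yum install -y proj-devel gdal-devel"
test-command = "python -c \"import my_ext; assert hasattr(my_ext, 'transform_coords')\""

# musllinux images are Alpine-based and have no yum; package names here are an assumption to verify
[[tool.cibuildwheel.overrides]]
select = "*-musllinux*"
before-all = "apk add proj-dev gdal-dev"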

Cross-Compilation Validation:

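# Wheels are zip archives: unpack them, then verify toolchain architecture alignment
unzip -o -q "dist/*.whl" -d dist/unpacked
find dist/unpacked -name "*.so" -exec file {} + | grep -E "ELF 64-bit LSB|ARM|AArch64"
# Must match target platform. Mismatch = ImportError: wrong ELF class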
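
When the file utility is not available in a minimal build image, the same architecture check can be done by reading the ELF header directly. The sketch below is illustrative only: the function names are hypothetical, and the machine table covers just a few e_machine values from the ELF specification.

```python
import struct

ELF_MAGIC = b"\x7fELF"
ELF_CLASSES = {1: "ELF32", 2: "ELF64"}
# A small subset of e_machine values from the ELF specification.
ELF_MACHINES = {3: "x86", 40: "arm", 62: "x86_64", 183: "aarch64"}


def elf_arch(header: bytes):
    """Return (class, machine) decoded from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    elf_class = ELF_CLASSES.get(header[4], "unknown")
    # EI_DATA: 1 means little-endian, 2 means big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18, right after e_type.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return elf_class, ELF_MACHINES.get(machine, f"machine-{machine}")


def read_elf_arch(path: str):
    """Decode the architecture of the shared object at the given path."""
    with open(path, "rb") as handle:
        return elf_arch(handle.read(20))
```

Comparing the result against the platform tag in the wheel file name catches a cross-compilation mix-up before the wheel reaches an interpreter.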

Phase 4: Shared Library Path Resolution & rpath Hardening

Dynamic linking in geospatial wheels fails when the bundled libraries can only be found through LD_LIBRARY_PATH, which is typically unset on the deployment target, or when rpath points to absolute build directories. Wheels must use $ORIGIN-relative paths so that they stay portable across deployment targets.

rpath Patching Workflow:

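# 1. Unpack the built wheel, then inspect current rpath/runpath
unzip -o -q "dist/*.whl" -d dist/unpacked
find dist/unpacked -name "*.so" -exec readelf -d {} + | grep -E "RPATH|RUNPATH"

# 2. Patch with auditwheel (automates $ORIGIN injection and dependency bundling)
auditwheel repair dist/*.whl --plat manylinux_2_28_x86_64 -w dist/repaired/

# 3. Verify post-repair linkage on the unpacked, repaired wheel
unzip -o -q "dist/repaired/*.whl" -d dist/repaired/unpacked
find dist/repaired/unpacked -name "*.so*" -exec ldd {} + | grep "not found"
# Output must be empty. Any "not found" indicates missing transitive deps.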
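
The inspection step can also be turned into a hard gate that rejects any search path not anchored at $ORIGIN. The following sketch parses the text printed by readelf -d; the function names are hypothetical, and the expected line format is an assumption based on typical binutils output.

```python
import re
from typing import List

# Matches lines such as:
#  0x000000000000001d (RUNPATH)   Library runpath: [$ORIGIN/.libs]
_RUNPATH_LINE = re.compile(
    r"\((?:RPATH|RUNPATH)\)\s+Library r(?:un)?path: \[(.*)\]"
)


def runpath_entries(readelf_output: str) -> List[str]:
    """Collect every RPATH/RUNPATH entry found in `readelf -d` output."""
    entries: List[str] = []
    for line in readelf_output.splitlines():
        match = _RUNPATH_LINE.search(line)
        if match:
            entries.extend(part for part in match.group(1).split(":") if part)
    return entries


def non_portable_entries(entries: List[str]) -> List[str]:
    """Return the entries that are not relative to $ORIGIN."""
    return [
        entry
        for entry in entries
        if not entry.startswith(("$ORIGIN", "${ORIGIN}"))
    ]
```

An empty result from non_portable_entries means every search path travels with the wheel; any absolute path left over points back at the build machine.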

Phase 5: CI/CD Pipeline Integration & Deterministic Deployment

Finalize the lifecycle by enforcing deterministic artifact generation, cryptographic verification, and PyPI compliance checks before promotion to production.

Pipeline Gate Commands:

# 1. Inspect wheel contents against the PEP 427 layout (zipfile -l accepts one archive at a time)
for whl in dist/repaired/*.whl; do python -m zipfile -l "$whl" | grep -E "\.pyc|\.so|\.libs"; done

# 2. Verify ABI compatibility: list the tags this interpreter accepts (requires the packaging library)
python -c "from packaging.tags import sys_tags; print([str(t) for t in sys_tags()][:10])"
# The wheel's python-abi-platform tag must appear in the full list.

# 3. Upload to test index
twine upload --repository-url https://test.pypi.org/legacy/ dist/repaired/*.whl
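
One way to make the cryptographic verification step concrete is to record a SHA-256 digest for every repaired wheel and compare the manifest between pipeline runs or against what the index later serves. This is a minimal sketch using only the standard library; the function names and the idea of a per-directory manifest are assumptions, not part of twine or PyPI tooling.

```python
import hashlib
from pathlib import Path
from typing import Dict


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def wheel_manifest(directory: str) -> Dict[str, str]:
    """Map each wheel file name in a directory to its SHA-256 digest."""
    return {
        wheel.name: sha256_of(wheel)
        for wheel in sorted(Path(directory).glob("*.whl"))
    }
```

Calling wheel_manifest("dist/repaired") before upload and storing the result as a build artifact gives later stages something fixed to verify against.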

Exact Error-to-Fix Mapping & Validation Matrix

| Error Signature | Root Cause | Exact Fix | Validation Step |
| --- | --- | --- | --- |
| ImportError: ... undefined symbol: PyUnicode_AsUTF8 | abi3 wheel built for an older Python floor while using a function that only joined the limited API in 3.13 | Call PyUnicode_AsUTF8AndSize (limited API since 3.10) and raise the floor to -D Py_LIMITED_API=0x030A0000, or drop the py-limited-api setting and build version-specific wheels | The extension file name inside the wheel ends in .abi3.so for abi3 builds, or in the interpreter's EXT_SUFFIX (python -c "import sysconfig; print(sysconfig.get_config_var('EXT_SUFFIX'))") for version-specific builds |
| OSError: libgdal.so.32: cannot open shared object file | Missing rpath, or the library is only reachable through an LD_LIBRARY_PATH that was not propagated | Run auditwheel repair with a --plat flag matching the target | readelf -d on the extension must list an RPATH or RUNPATH entry that starts with $ORIGIN (for example $ORIGIN/.libs) |
| ImportError: ... undefined symbol: _ZTVN4proj11Coordinate... | C++ name mangling mismatch or missing -lstdc++ in linkage | Append -lstdc++ to LDFLAGS and ensure extern "C" wrappers | nm -D --undefined-only on the extension should list no _ZTVN symbols |
| ValueError: PROJ: proj_create_from_database: Cannot find proj.db | PROJ data directory not bundled, or PROJ_LIB not set at runtime | Bundle proj.db inside the package and point PROJ at it when the module is imported (PROJ_LIB, or PROJ_DATA on PROJ 9.1+); wheel metadata cannot set environment variables | python -c "import pyproj; print(pyproj.datadir.get_data_dir())" returns a valid path |
| Segmentation fault (core dumped) | ABI mismatch between GDAL C++ API and Python extension | Rebuild against the exact GDAL minor version; verify the GDAL_VERSION pin | gdb --args python -c "import my_ext", then run followed by bt, shows the crash frame in libgdal |
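
The error signatures above lend themselves to automated triage of CI logs. The sketch below is only an illustration of that idea: the rule list, the wording of each cause, and the function name triage are assumptions, and the patterns would need tuning against real log output.

```python
import re
from typing import Optional

# Ordered rules: the first matching pattern wins.
TRIAGE_RULES = [
    (re.compile(r"undefined symbol: Py\w+"),
     "abi3 floor too low for a limited-API symbol"),
    (re.compile(r"lib\w+\.so[.\d]*: cannot open shared object file"),
     "shared library not bundled or rpath missing"),
    (re.compile(r"undefined symbol: _Z\w+"),
     "C++ symbol unresolved at link time"),
    (re.compile(r"Cannot find proj\.db"),
     "PROJ data directory not found at runtime"),
    (re.compile(r"Segmentation fault"),
     "possible GDAL ABI mismatch"),
]


def triage(log_line: str) -> Optional[str]:
    """Return a likely root cause for a log line, or None if nothing matches."""
    for pattern, cause in TRIAGE_RULES:
        if pattern.search(log_line):
            return cause
    return None
```

Running each failing log line through triage gives a first guess at which row of the matrix applies; the validation step in that row remains the actual confirmation.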

Compliance & Standards Checklist

Deploy only when all validation gates pass. Geospatial extensions are infrastructure components; treat ABI drift as a critical pipeline failure.