Step-by-step C-extension lifecycle for Python GIS
Building and distributing geospatial Python packages requires strict adherence to the C-API vs CPython ABI Compatibility contract. When GDAL, PROJ, or PyProj extensions break in CI/CD, it is rarely a logic failure. It is almost always an ABI mismatch, missing dynamic symbol, or rpath misconfiguration. This guide details the exact lifecycle from source compilation to production wheel deployment, optimized for rapid pipeline recovery and deterministic builds.
The five phases form a linear pipeline, each gated by validation before the next begins.
Phase 1: ABI Contract & Toolchain Initialization
Geospatial C-extensions must either target a specific Python minor version or opt into the Stable ABI (abi3). Targeting abi3 shrinks the build matrix but restricts the extension to the limited C-API subset defined in PEP 384. Mixing abi3 with GDAL/PROJ C++ wrappers often triggers errors such as undefined symbol: PyUnicode_AsUTF8 during import, because the wrapper calls a function outside the limited API of its declared floor: PyUnicode_AsUTF8AndSize joined the limited API in Python 3.10 and PyUnicode_AsUTF8 itself only in 3.13, so neither is guaranteed by the stable ABI of older interpreters.
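As a rough illustration of how an abi3 floor is expressed, the sketch below derives the value usually passed as the Py_LIMITED_API macro, plus the matching py-limited-api tag, from a minimum Python version. The helper names limited_api_hex and limited_api_tag are hypothetical; the encoding (major version in the top byte, minor in the next) follows the layout of CPython's PY_VERSION_HEX.

```python
def limited_api_hex(major: int, minor: int) -> str:
    """Return the Py_LIMITED_API value for a minimum Python version.

    Hypothetical helper: major occupies bits 24-31 and minor bits 16-23,
    mirroring CPython's PY_VERSION_HEX layout (micro/release left at 0).
    """
    if not (0 <= major <= 0xFF and 0 <= minor <= 0xFF):
        raise ValueError("major and minor must each fit in one byte")
    return f"0x{(major << 24) | (minor << 16):08X}"


def limited_api_tag(major: int, minor: int) -> str:
    """Return the matching py-limited-api tag, e.g. 'cp39' for Python 3.9."""
    return f"cp{major}{minor}"


if __name__ == "__main__":
    for version in [(3, 9), (3, 10)]:
        print(version, limited_api_hex(*version), limited_api_tag(*version))
```

Keeping the macro value and the wheel tag derived from the same pair of numbers is one way to avoid a floor that says 3.9 in the wheel name but 3.10 in the compiled code.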
Production pyproject.toml ABI Configuration:
[build-system]
requires = ["setuptools>=68.0", "wheel", "Cython>=3.0", "setuptools_scm[toml]>=8.0"]
build-backend = "setuptools.build_meta"
[project]
name = "my-geospatial-ext"
requires-python = ">=3.9"
classifiers = [
"Programming Language :: Python :: 3 :: Only",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
]
[tool.distutils.bdist_wheel]
py-limited-api = "cp39" # Emit a single abi3 wheel (Python 3.9+)
Validation Command:
# Print the interpreter's default extension suffix, then compare it with the built file's suffix
python -c "import sysconfig; print(sysconfig.get_config_var('EXT_SUFFIX'))"
# Version-specific build -> .cpython-311-x86_64-linux-gnu.so (must equal EXT_SUFFIX)
# Stable-ABI (abi3) build -> .abi3.so (never equals EXT_SUFFIX; check the filename directly)
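The suffix check above can also be scripted. The following is a minimal sketch, assuming Linux-style extension filenames; abi_kind is a hypothetical helper name, not part of any packaging library.

```python
def abi_kind(filename: str) -> str:
    """Classify a built extension by its filename suffix (Linux naming assumed)."""
    if filename.endswith(".abi3.so"):
        return "abi3"
    if ".cpython-" in filename and filename.endswith(".so"):
        return "version-specific"
    return "unknown"


if __name__ == "__main__":
    for name in ("my_ext.abi3.so", "my_ext.cpython-311-x86_64-linux-gnu.so"):
        print(name, "->", abi_kind(name))
```

A CI step could fail the build when the reported kind disagrees with the py-limited-api setting.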
Phase 2: Geospatial Dependency Resolution & Linkage
GDAL and PROJ introduce complex transitive dependencies (libcurl, libsqlite3, libtiff, libproj, libgeos). As laid out in Geospatial C-Extension Fundamentals & ABI Architecture, extensions must either vendor these libraries statically or link dynamically against a known, pinned system baseline. In CI/CD, the manylinux container isolates the build environment, but pkg-config can still resolve to stale headers or a mismatched libproj minor version.
CI/CD Dependency Bootstrap (GitHub Actions / Docker):
env:
  PROJ_VERSION: "9.3.1"
  GDAL_VERSION: "3.8.4"
  # Fail on undeclared functions rather than hiding them; they often signal limited-API violations
  CFLAGS: "-O2 -fPIC -Wno-unused-variable -Werror=implicit-function-declaration"
  # Avoid -Wl,--no-undefined: extensions leave Python C-API symbols for the interpreter to resolve at import
  LDFLAGS: "-Wl,-rpath,'$ORIGIN/.libs'"
steps:
  - name: Install PROJ & GDAL (manylinux compatible)
    run: |
      yum install -y epel-release
      yum install -y gcc gcc-c++ make cmake pkgconfig sqlite-devel curl-devel \
        proj-devel proj proj-data gdal-devel
  - name: Verify pkg-config resolution
    run: |
      pkg-config --modversion proj
      pkg-config --modversion gdal
      # Should match PROJ_VERSION and GDAL_VERSION above; distro packages often lag the pins.
      # A mismatch risks ABI breakage at runtime, up to a segfault on import.
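The pin comparison can be automated rather than eyeballed. The sketch below is one possible approach, assuming pkg-config is on PATH inside the build container; the function names and the three-way result ("ok", "patch-drift", "mismatch") are illustrative choices, not an established convention.

```python
import subprocess


def pkg_config_version(package: str) -> str:
    """Return the output of `pkg-config --modversion <package>`, stripped."""
    result = subprocess.run(
        ["pkg-config", "--modversion", package],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def check_pin(pinned: str, resolved: str) -> str:
    """Compare a pinned version with the version pkg-config resolved.

    Returns "ok" on an exact match, "patch-drift" when only the patch
    component differs, and "mismatch" when major or minor differ.
    """
    if pinned == resolved:
        return "ok"
    if pinned.split(".")[:2] == resolved.split(".")[:2]:
        return "patch-drift"
    return "mismatch"


# Example use inside a CI step (needs pkg-config and PROJ installed):
#   import os
#   status = check_pin(os.environ["PROJ_VERSION"], pkg_config_version("proj"))
#   if status == "mismatch":
#       raise SystemExit("PROJ pin does not match the installed headers")
```

Whether patch-level drift should fail the build is a policy decision; treating it as a warning is an assumption made here.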
Phase 3: Build Isolation & Cross-Compiler Toolchain Setup
Deterministic geospatial builds require strict environment isolation. Use cibuildwheel to orchestrate cross-platform compilation across manylinux_2_28, manylinux_2_34, and musllinux_1_2 targets. Do not rely on host system toolchains.
cibuildwheel Integration:
[tool.cibuildwheel]
build-frontend = "build"
skip = ["cp38-*", "pp*"]
manylinux-x86_64-image = "quay.io/pypa/manylinux_2_28_x86_64:latest"
environment = { PROJ_LIB="/usr/share/proj", GDAL_DATA="/usr/share/gdal" }
before-all = "yum install -y proj-devel gdal-devel"
test-command = "python -c \"import my_ext; assert hasattr(my_ext, 'transform_coords')\""
Cross-Compilation Validation:
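# Wheels are zip archives: unpack them, then verify toolchain architecture alignment
unzip -o -q 'dist/*.whl' -d dist/unpacked/
find dist/unpacked -name "*.so" -exec file {} + | grep -E "x86-64|aarch64"
# Must match target platform. Mismatch = ImportError (e.g. wrong ELF class)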
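Where the file utility is unavailable, the same check can be approximated by reading the ELF header directly. This is a minimal sketch: elf_arch is a hypothetical helper, and the machine table covers only a few e_machine values from the ELF specification.

```python
import struct

# Subset of e_machine values from the ELF specification
_MACHINES = {3: "i386", 40: "arm", 62: "x86_64", 183: "aarch64"}


def elf_arch(header: bytes) -> tuple[str, str]:
    """Return (ELF class, machine name) from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[4] is EI_CLASS: 1 = 32-bit, 2 = 64-bit
    elf_class = {1: "ELF32", 2: "ELF64"}.get(header[4], "unknown")
    # e_ident[5] is EI_DATA: 1 = little-endian, 2 = big-endian
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18, after e_ident and e_type
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return elf_class, _MACHINES.get(machine, f"machine-{machine}")


# Example use on an unpacked extension:
#   with open("dist/unpacked/my_ext.abi3.so", "rb") as fh:
#       print(elf_arch(fh.read(20)))
```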
Phase 4: Shared Library Path Resolution & rpath Hardening
Dynamic linking in geospatial wheels fails when the wheel depends on LD_LIBRARY_PATH being set or when its rpath points to absolute build directories. Wheels must use $ORIGIN-relative paths to ensure portability across deployment targets.
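To make the $ORIGIN requirement concrete, here is a small sketch that flags rpath entries which would not survive relocation. The helper name is hypothetical, and the input is assumed to be the colon-separated string reported by readelf.

```python
def nonportable_rpath_entries(rpath: str) -> list[str]:
    """Return the rpath entries that are not $ORIGIN-relative."""
    entries = [entry for entry in rpath.split(":") if entry]
    return [
        entry
        for entry in entries
        if not (
            entry == "$ORIGIN"
            or entry.startswith("$ORIGIN/")
            or entry.startswith("${ORIGIN}")
        )
    ]


if __name__ == "__main__":
    # The absolute build path below is an illustrative example, not real output
    print(nonportable_rpath_entries("$ORIGIN/.libs:/home/ci/build/lib"))
```

An empty result means every entry is resolved relative to the extension's own location.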
rpath Patching Workflow:
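# 1. Inspect current rpath/runpath of the built extension (unpack the wheel first)
unzip -o -q 'dist/*.whl' -d dist/unpacked/
find dist/unpacked -name "my_ext*.so" -exec readelf -d {} + | grep -E "RPATH|RUNPATH"
# 2. Patch with auditwheel (automates $ORIGIN injection and dependency bundling)
auditwheel repair dist/*.whl --plat manylinux_2_28_x86_64 -w dist/repaired/
# 3. Verify post-repair linkage on the unpacked, repaired extension
unzip -o -q 'dist/repaired/*.whl' -d dist/repaired/unpacked/
find dist/repaired/unpacked -name "my_ext*.so" -exec ldd {} + | grep "not found"
# Output must be empty. Any "not found" indicates missing transitive deps.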
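The final linkage check can be turned into a hard gate by parsing the ldd output instead of reading it by eye. A minimal sketch, assuming the usual "name => not found" line format; missing_libraries is a hypothetical helper.

```python
def missing_libraries(ldd_output: str) -> list[str]:
    """Return the library names that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Unresolved lines look like: "<tab>libgdal.so.32 => not found"
        name, separator, target = line.strip().partition(" => ")
        if separator and target.strip() == "not found":
            missing.append(name)
    return missing


# Example use in a pipeline step:
#   import subprocess, sys
#   out = subprocess.run(["ldd", sys.argv[1]], capture_output=True, text=True).stdout
#   if missing_libraries(out):
#       raise SystemExit(f"unresolved: {missing_libraries(out)}")
```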
Phase 5: CI/CD Pipeline Integration & Deterministic Deployment
Finalize the lifecycle by enforcing deterministic artifact generation, cryptographic verification, and PyPI compliance checks before promotion to production.
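One simple form of cryptographic verification is to record a SHA-256 digest per wheel at build time and compare it before promotion. The sketch below assumes that approach; the function names and the manifest shape (filename mapped to hex digest) are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def wheel_manifest(wheel_dir: Path) -> dict[str, str]:
    """Map each wheel filename in a directory to its SHA-256 digest."""
    return {p.name: sha256_of(p) for p in sorted(wheel_dir.glob("*.whl"))}


# Example use: compare the manifest recorded at build time with one
# recomputed just before upload, and abort promotion if they differ.
#   assert wheel_manifest(Path("dist/repaired")) == recorded_manifest
```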
Pipeline Gate Commands:
# 1. Spot-check wheel structure (PEP 427): expect the extension .so and bundled .libs, and no stray .pyc
for whl in dist/repaired/*.whl; do python -m zipfile -l "$whl" | grep -E "\.pyc|\.so|\.libs"; done
# 2. Verify ABI compatibility: the wheel's python-abi-platform tag must appear in this list (requires packaging)
python -c "from packaging.tags import sys_tags; print(*sys_tags(), sep='\n')"
# 3. Upload to test index
twine upload --repository-url https://test.pypi.org/legacy/ dist/repaired/*.whl
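The tag comparison in step 2 can be sketched without third-party packages by parsing the wheel filename itself. This is a simplified illustration of the PEP 427 naming scheme, assuming CPython 3.x tags only; wheel_tags and supports_minor are hypothetical helpers, and packaging.tags remains the more complete option.

```python
def wheel_tags(filename: str) -> tuple[str, str, str]:
    """Return (python tag, abi tag, platform tag) from a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename")
    # name-version[-build]-python-abi-platform.whl
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError("unexpected number of wheel filename components")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return python_tag, abi_tag, platform_tag


def supports_minor(filename: str, minor: int) -> bool:
    """True if the wheel should install on CPython 3.<minor> (ABI check only)."""
    python_tag, abi_tag, _ = wheel_tags(filename)
    if not python_tag.startswith("cp3"):
        return False
    floor = int(python_tag[3:])
    if abi_tag == "abi3":
        # Stable-ABI wheels work on the floor version and anything newer
        return minor >= floor
    # Version-specific wheels match exactly one minor version
    return minor == floor
```

For example, a cp39-abi3 wheel passes for Python 3.12 but not 3.8, while a cp311-cp311 wheel passes only for 3.11.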
Exact Error-to-Fix Mapping & Validation Matrix
| Error Signature | Root Cause | Exact Fix | Validation Step |
|---|---|---|---|
| ImportError: ... undefined symbol: PyUnicode_AsUTF8 | abi3 wheel calls a function outside the limited API of its declared floor (PyUnicode_AsUTF8 joined the limited API in 3.13, PyUnicode_AsUTF8AndSize in 3.10) | Switch to PyUnicode_AsUTF8AndSize and raise the floor to -D Py_LIMITED_API=0x030A0000 (3.10), or drop the py-limited-api setting and build version-specific wheels | Built file ends in .abi3.so for stable-ABI builds; otherwise its suffix matches python -c "import sysconfig; print(sysconfig.get_config_var('EXT_SUFFIX'))" |
| OSError: libgdal.so.32: cannot open shared object file | Bundled library not found: missing rpath, or the wheel relies on LD_LIBRARY_PATH | Run auditwheel repair with a --plat flag matching the target | readelf -d *.so \| grep PATH shows an $ORIGIN-relative RPATH or RUNPATH entry |
| ImportError: ... undefined symbol: _ZTVN4proj11Coordinate... | C++ name mangling mismatch or missing -lstdc++ in linkage | Append -lstdc++ to LDFLAGS and ensure extern "C" wrappers | nm -D *.so \| grep _ZTVN returns no results |
| ValueError: PROJ: proj_create_from_database: Cannot find proj.db | PROJ data directory not bundled or not discoverable at runtime | Bundle proj.db with the package and point PROJ at it at import time (PROJ_DATA, or PROJ_LIB before PROJ 9.1) | python -c "import pyproj; print(pyproj.datadir.get_data_dir())" returns a valid path |
| Segmentation fault (core dumped) | ABI mismatch between GDAL C++ API and Python extension | Rebuild against the exact GDAL minor version; verify the GDAL_VERSION pin | gdb --args python -c "import my_ext", then run and bt: a backtrace ending in libgdal confirms the diagnosis; after the rebuild the import exits cleanly |
Compliance & Standards Checklist
Deploy only when all validation gates pass. Geospatial extensions are infrastructure components; treat ABI drift as a critical pipeline failure.