gis

A Go service scaffold following golang-standards/project-layout, with cleanly separated layers: HTTP transport → services → repositories, plus RabbitMQ messaging and embedded database migrations. Single binary, three subcommands.

Layout

cmd/gis/                 binary entrypoint
internal/
  cli/                   cobra commands: serve, worker, migrate
  config/                env-based configuration
  app/                   composition root (wires all dependencies)
  domain/                entities, enums, sentinel errors
  repository/postgres/   pgx-backed repositories
  service/               business logic
  transport/http/        chi router, middleware, handlers
  storage/s3/            MinIO/S3 object storage
  messaging/rabbitmq/    connection, publisher, consumer
  platform/logger/       slog setup
pkg/httputil/            generic JSON/validation HTTP helpers
migrations/              embedded goose SQL migrations
configs/                 .env.example
deployments/             docker-compose (postgres, minio, rabbitmq)
build/package/           Dockerfile
api/openapi.yaml         OpenAPI 3.1.1 spec (embedded + served at /openapi.yaml)

Domain

  • Category: hierarchical (self-referencing parent_id). Full CRUD; updates are cycle-safe.
  • Dataset: a geo file uploaded to S3/MinIO (file_type: vector_with_kato | vector | raster) that belongs to one Category. It carries code/name/description/unit metadata, a user-defined meta (JSONB) blob, an automated flag, a status lifecycle field (defaults to pending), properties (JSONB, populated from the file's attribute table), and a PostGIS geometry footprint stored in EPSG:4326 (returned as GeoJSON, with a STAC-style bbox array for rasters). Supported operations: upload, list, get, download, and delete (delete also removes the stored object). Uploads are validated three ways before being stored: the file_type enum, the file extension (must be allowed for the type), and a content magic-byte check (TIFF for .tif, ZIP for .zip, SQLite for .gpkg, JSON for .geojson), so mislabeled files are rejected with 422 up front.
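
The three upload checks can be sketched in Go roughly as follows. This is a minimal illustration, not the service's actual code: the names allowedExt, sniff, and validateUpload are hypothetical, and the extension sets per file_type are assumptions (the README only names the four formats).

```go
package main

import (
	"bytes"
	"fmt"
	"path/filepath"
	"strings"
)

// allowedExt lists the extensions accepted for each file_type.
// Assumption: these sets are illustrative, not taken from the real service.
var allowedExt = map[string][]string{
	"vector_with_kato": {".zip", ".geojson", ".gpkg"},
	"vector":           {".zip", ".geojson", ".gpkg"},
	"raster":           {".tif"},
}

// sniff reports whether head, the first bytes of the upload, looks like the
// container format implied by ext.
func sniff(ext string, head []byte) bool {
	switch ext {
	case ".tif":
		// TIFF and BigTIFF signatures, little- and big-endian.
		for _, sig := range []string{"II*\x00", "MM\x00*", "II+\x00", "MM\x00+"} {
			if bytes.HasPrefix(head, []byte(sig)) {
				return true
			}
		}
		return false
	case ".zip":
		return bytes.HasPrefix(head, []byte("PK\x03\x04"))
	case ".gpkg":
		return bytes.HasPrefix(head, []byte("SQLite format 3\x00"))
	case ".geojson":
		// JSON has no magic number, so accept an optional UTF-8 BOM and
		// whitespace followed by an opening brace.
		body := bytes.TrimPrefix(head, []byte("\xef\xbb\xbf"))
		body = bytes.TrimLeft(body, " \t\r\n")
		return len(body) > 0 && body[0] == '{'
	}
	return false
}

// validateUpload runs the enum, extension, and magic-byte checks in order.
// A handler would map a non-nil error to a 422 response.
func validateUpload(fileType, filename string, head []byte) error {
	exts, ok := allowedExt[fileType]
	if !ok {
		return fmt.Errorf("unknown file_type %q", fileType)
	}
	ext := strings.ToLower(filepath.Ext(filename))
	allowed := false
	for _, e := range exts {
		if e == ext {
			allowed = true
		}
	}
	if !allowed {
		return fmt.Errorf("extension %q is not allowed for file_type %q", ext, fileType)
	}
	if !sniff(ext, head) {
		return fmt.Errorf("content does not match extension %q", ext)
	}
	return nil
}

func main() {
	fmt.Println(validateUpload("raster", "dem.tif", []byte("II*\x00...")))
	fmt.Println(validateUpload("raster", "dem.tif", []byte("PK\x03\x04...")))
}
```

Only the first few bytes of the stream are needed, so a check like this can run before the object is written to storage.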

Every uploaded file is then processed asynchronously by the worker, dispatched by file_type:

  • vector: the attribute table is parsed and stored (as a JSON array of row objects) in properties (status processing → ready).
  • raster: converted to a Cloud-Optimized GeoTIFF via gdal_translate -of COG (processing → ready); the COG is stored under cog_storage_key (the original is kept) and the footprint geometry + bbox are read from the raster extent. Requires GDAL in the worker image (gdal-tools).
  • vector_with_kato: the column-selection flow below (parsing → awaiting_mapping → extracting → ready).
  • events + the example RabbitMQ consumer/publisher are a generic messaging scaffold kept alongside the real async flows.
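
The per-type status sequences listed above can be modelled as a small lookup table. The sketch below is an assumption about how one might encode them; lifecycle and nextStatus are hypothetical names, and the pending default and the failed state are deliberately left out.

```go
package main

import "fmt"

// lifecycle is the happy-path status sequence per file_type, as listed in
// this README. Assumption: "pending" and "failed" are not modelled here.
var lifecycle = map[string][]string{
	"vector":           {"processing", "ready"},
	"raster":           {"processing", "ready"},
	"vector_with_kato": {"parsing", "awaiting_mapping", "extracting", "ready"},
}

// nextStatus returns the status that follows current for fileType.
// ok is false for unknown types, unknown statuses, and the terminal status.
func nextStatus(fileType, current string) (next string, ok bool) {
	steps := lifecycle[fileType]
	for i, s := range steps {
		if s == current && i+1 < len(steps) {
			return steps[i+1], true
		}
	}
	return "", false
}

func main() {
	// Walk the vector_with_kato flow from its first status to the last.
	status := lifecycle["vector_with_kato"][0]
	for {
		fmt.Println(status)
		next, ok := nextStatus("vector_with_kato", status)
		if !ok {
			break
		}
		status = next
	}
}
```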

vector_with_kato two-phase flow

Uploading a vector_with_kato file (zipped shapefile, GeoJSON, or GeoPackage) triggers asynchronous parsing of its attribute table, after which the user maps the KATO column and the year columns:

  1. POST /datasets with file_type=vector_with_kato → dataset created with status=parsing; a dataset.parse job is published to RabbitMQ.
  2. The worker consumes the job, parses the file's columns (with sample values; CP1251/Cyrillic aware for shapefiles) and stores them in attribute_columns; status → awaiting_mapping (or failed with parse_error).
  3. The client polls GET /datasets/{id} until the status is awaiting_mapping, then submits POST /datasets/{id}/mapping with the chosen kato_column and a year_columns list (each entry {column, date}). The request is validated against the detected columns; status → extracting.
  4. A second worker job unpivots the attribute table into long-format dataset_observations: one row per (kato_code, date) with a numeric value (or value_text for non-numeric cells); status → ready. Read them via GET /datasets/{id}/observations (paginated, optional ?kato_code=).

Example mapping request:

curl -X POST https://dssgis.dwh.kz/datasets/<id>/mapping -H 'Content-Type: application/json' -d '{
  "kato_column": "като",
  "year_columns": [
    {"column": "F_2023", "date": "2023-01-01"},
    {"column": "D_2025", "date": "2025-01-01"}
  ]
}'
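
The unpivot in step 4 can be illustrated with a short Go sketch. It assumes attribute rows arrive as string maps and that rows without a KATO code and empty cells are skipped; those rules, and the names YearColumn, Observation, and unpivot, are illustrative rather than the worker's actual behaviour.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// YearColumn mirrors one entry of year_columns in the mapping request.
type YearColumn struct {
	Column string `json:"column"`
	Date   string `json:"date"`
}

// Observation is one long-format row: exactly one of Value or ValueText is set.
type Observation struct {
	KatoCode  string
	Date      string
	Value     *float64
	ValueText *string
}

// unpivot turns wide attribute rows into one observation per (kato_code, date).
// Assumption: rows without a KATO code and empty cells are skipped.
func unpivot(rows []map[string]string, katoColumn string, years []YearColumn) []Observation {
	var out []Observation
	for _, row := range rows {
		kato := strings.TrimSpace(row[katoColumn])
		if kato == "" {
			continue
		}
		for _, yc := range years {
			raw := strings.TrimSpace(row[yc.Column])
			if raw == "" {
				continue
			}
			obs := Observation{KatoCode: kato, Date: yc.Date}
			if v, err := strconv.ParseFloat(raw, 64); err == nil {
				obs.Value = &v // numeric cell
			} else {
				text := raw
				obs.ValueText = &text // non-numeric cell kept verbatim
			}
			out = append(out, obs)
		}
	}
	return out
}

func main() {
	rows := []map[string]string{
		{"като": "751110000", "F_2023": "12.5", "D_2025": "n/a"},
	}
	years := []YearColumn{
		{Column: "F_2023", Date: "2023-01-01"},
		{Column: "D_2025", Date: "2025-01-01"},
	}
	for _, o := range unpivot(rows, "като", years) {
		if o.Value != nil {
			fmt.Println(o.KatoCode, o.Date, *o.Value)
		} else {
			fmt.Println(o.KatoCode, o.Date, *o.ValueText)
		}
	}
}
```

Note that strconv.ParseFloat also accepts strings such as "NaN" and "Inf", so a real implementation may want a stricter numeric check.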

Getting started

cp configs/.env.example .env
docker compose -f deployments/docker-compose.yml up -d postgres minio rabbitmq

go run ./cmd/gis migrate up        # apply migrations
go run ./cmd/gis serve             # HTTP server on :8080
go run ./cmd/gis worker --publish-example   # consume (and seed one message)

Health: GET /healthz (liveness), GET /readyz (DB + S3 + RabbitMQ).

HTTP API

The API is described by an OpenAPI 3.1.1 spec at api/openapi.yaml, embedded into the binary. While the server runs it is served at /openapi.yaml, with an interactive Redoc UI at /docs.

Method  Path                         Description
GET     /categories                  list (optional ?parent_id=)
POST    /categories                  create (name, description, parent_id?)
GET     /categories/{id}             get
PUT     /categories/{id}             update
DELETE  /categories/{id}             delete
GET     /datasets                    paginated list of summaries (?page=, ?page_size=, ?category_id=)
POST    /datasets                    upload (multipart: file, file_type, category_id, code, name, description?, unit?, meta? (JSON), automated? (bool))
GET     /datasets/{id}               full dataset (geometry as GeoJSON, bbox for rasters)
GET     /datasets/{id}.geojson       GeoJSON FeatureCollection; plain vector returns its geometry as a single feature with the extracted attribute table as top-level properties; vector_with_kato always ignores its own geometry and joins the districts table by KATO, mapping observations onto each polygon
GET     /datasets/{id}/status        processing status; long-polls with ?current=<status> (holds up to ?wait= secs, default 25, max 60)
GET     /datasets/{id}/download      download the stored file
POST    /datasets/{id}/mapping       set KATO column + year→date map (vector_with_kato)
GET     /datasets/{id}/observations  paginated unpivoted values (?kato_code=, ?page=, ?page_size=)
DELETE  /datasets/{id}               delete (row + object)
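
A client for the long-polling status endpoint might look like the sketch below. The HTTP call and response decoding are injected as a fetch function because the response body shape is not documented here; statusURL and waitFor are hypothetical helpers, and treating failed as terminal is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// statusURL builds the long-poll URL. wait is capped at 60 seconds, the
// documented maximum; a non-positive wait leaves the server default in effect.
func statusURL(base, id, current string, waitSecs int) string {
	q := url.Values{}
	if current != "" {
		q.Set("current", current)
	}
	if waitSecs > 0 {
		if waitSecs > 60 {
			waitSecs = 60
		}
		q.Set("wait", strconv.Itoa(waitSecs))
	}
	u := base + "/datasets/" + url.PathEscape(id) + "/status"
	if enc := q.Encode(); enc != "" {
		u += "?" + enc
	}
	return u
}

// waitFor polls until the dataset reaches target or "failed", passing the last
// seen status as ?current= so the server can hold the request until it changes.
func waitFor(fetch func(url string) (string, error), base, id, target string, maxPolls int) (string, error) {
	current := ""
	for i := 0; i < maxPolls; i++ {
		status, err := fetch(statusURL(base, id, current, 25))
		if err != nil {
			return "", err
		}
		if status == target || status == "failed" {
			return status, nil
		}
		current = status
	}
	return current, fmt.Errorf("dataset %s did not reach %q after %d polls", id, target, maxPolls)
}

func main() {
	// Fake fetcher standing in for an HTTP GET plus response decoding.
	seq := []string{"parsing", "parsing", "awaiting_mapping"}
	i := 0
	fetch := func(u string) (string, error) {
		fmt.Println("GET", u)
		s := seq[i]
		i++
		return s, nil
	}
	fmt.Println(waitFor(fetch, "https://dssgis.dwh.kz", "<id>", "awaiting_mapping", 5))
}
```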

Example upload:

curl -X POST https://dssgis.dwh.kz/datasets \
  -F file=@sample.geojson -F file_type=vector -F category_id=<uuid> \
  -F code=POP_2026 -F name=Population -F description="Resident population" -F unit=people
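
The same upload can be built from Go with the standard mime/multipart package. This is a sketch only: newUploadRequest is a hypothetical helper, the field names are taken from the table above, and sending the request plus handling the response are left out.

```go
package main

import (
	"bytes"
	"fmt"
	"mime/multipart"
	"net/http"
)

// newUploadRequest builds a multipart POST /datasets request carrying the
// form fields and the file part named "file".
func newUploadRequest(baseURL, filename string, content []byte, fields map[string]string) (*http.Request, error) {
	var body bytes.Buffer
	w := multipart.NewWriter(&body)

	for name, value := range fields {
		if err := w.WriteField(name, value); err != nil {
			return nil, err
		}
	}
	part, err := w.CreateFormFile("file", filename)
	if err != nil {
		return nil, err
	}
	if _, err := part.Write(content); err != nil {
		return nil, err
	}
	// Close writes the trailing boundary; without it the body is invalid.
	if err := w.Close(); err != nil {
		return nil, err
	}

	req, err := http.NewRequest(http.MethodPost, baseURL+"/datasets", &body)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", w.FormDataContentType())
	return req, nil
}

func main() {
	req, err := newUploadRequest(
		"https://dssgis.dwh.kz",
		"sample.geojson",
		[]byte(`{"type":"FeatureCollection","features":[]}`),
		map[string]string{
			"file_type":   "vector",
			"category_id": "<uuid>",
			"code":        "POP_2026",
			"name":        "Population",
			"unit":        "people",
		},
	)
	if err != nil {
		fmt.Println("build request:", err)
		return
	}
	fmt.Println(req.Method, req.URL)
	// To send it: resp, err := http.DefaultClient.Do(req)
}
```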

Migrations

Embedded via goose and run through the binary. The first migration enables the PostGIS extension (the database runs the postgis/postgis image), so a PostGIS-capable Postgres is required.

go run ./cmd/gis migrate up|down|status|reset
go run ./cmd/gis migrate fresh    # drop everything in the schema and re-run

On Apple Silicon, postgis/postgis has no native arm64 build, so the compose file pins platform: linux/amd64 (Docker Desktop emulates it). Remove that line on amd64 hosts.

Development

Common tasks are wrapped in the Makefile (run make help for the full list):

make up            # start postgres, minio, rabbitmq
make migrate-fresh # drop the schema and re-apply migrations
make run           # run the HTTP server
make check         # go vet + go test
make lint          # golangci-lint (if installed)

CI (.github/workflows/ci.yml) runs build, vet, go test -race, and golangci-lint on every push and pull request.

Adding a feature

Each new domain is one vertical slice mirroring Category/Dataset: domain/ → repository/postgres/ → service/ → transport/http/ (+ messaging/rabbitmq/ if it needs async processing), wired in internal/app.