# gis

A Go service scaffold following [golang-standards/project-layout](https://github.com/golang-standards/project-layout), with cleanly separated layers: HTTP transport → services → repositories, plus RabbitMQ messaging and embedded database migrations. Single binary, three subcommands.

## Layout

```
cmd/gis/                    binary entrypoint
internal/
  cli/                      cobra commands: serve, worker, migrate
  config/                   env-based configuration
  app/                      composition root (wires all dependencies)
  domain/                   entities, enums, sentinel errors
  repository/postgres/      pgx-backed repositories
  service/                  business logic
  transport/http/           chi router, middleware, handlers
  storage/s3/               MinIO/S3 object storage
  messaging/rabbitmq/       connection, publisher, consumer
  platform/logger/          slog setup
pkg/httputil/               generic JSON/validation HTTP helpers
migrations/                 embedded goose SQL migrations
configs/                    .env.example
deployments/                docker-compose (postgres, minio, rabbitmq)
build/package/              Dockerfile
api/openapi.yaml            OpenAPI 3.1.1 spec (embedded + served at /openapi.yaml)
```

## Domain

- **Category**: hierarchical (self-referencing `parent_id`). Full CRUD; cycle-safe on update.
- **Dataset**: a geo file uploaded to S3/MinIO (`file_type`: `vector_with_kato | vector | raster`), belonging to one Category. Carries `code`/`name`/`description`/`unit` metadata, a user-defined `meta` (JSONB) blob, an `automated` flag, a `status` lifecycle field (defaults to `pending`), `properties` (JSONB, populated from the file's attribute table), and a PostGIS `geometry` footprint stored in EPSG:4326 (returned as GeoJSON, with a STAC-style `bbox` array for rasters). Upload / list / get / download / delete (delete also removes the stored object).

  Uploads are validated three ways before being stored: the `file_type` enum, the file **extension** (must be allowed for the type), and a **content** magic-byte check (TIFF for `.tif`, ZIP for `.zip`, SQLite for `.gpkg`, JSON for `.geojson`) so mislabeled files are rejected with 422 up front.
  Every uploaded file is then processed asynchronously by the worker, dispatched by `file_type`:

  - **`vector`**: the attribute table is parsed and stored (as a JSON array of row objects) in `properties` (`status` `processing` → `ready`).
  - **`raster`**: converted to a **Cloud-Optimized GeoTIFF** via `gdal_translate -of COG` (`processing` → `ready`); the COG is stored under `cog_storage_key` (the original is kept) and the footprint `geometry` + `bbox` are read from the raster extent. Requires GDAL in the worker image (`gdal-tools`).
  - **`vector_with_kato`**: the column-selection flow below (`parsing` → `awaiting_mapping` → `extracting` → `ready`).
- **events** + the example RabbitMQ consumer/publisher are a generic messaging scaffold kept alongside the real async flows.

### vector_with_kato two-phase flow

Uploading a `vector_with_kato` file (zipped shapefile, GeoJSON, or GeoPackage) triggers asynchronous parsing of its attribute table, after which the user maps the KATO column and the year columns:

1. `POST /datasets` with `file_type=vector_with_kato` → dataset created with `status=parsing`; a `dataset.parse` job is published to RabbitMQ.
2. The **worker** consumes the job, parses the file's columns (with sample values; CP1251/Cyrillic aware for shapefiles) and stores them in `attribute_columns`; `status` → `awaiting_mapping` (or `failed` with `parse_error`).
3. The client polls `GET /datasets/{id}` until `awaiting_mapping`, then submits `POST /datasets/{id}/mapping` with the chosen `kato_column` and a `year_columns` map (each `{column, date}`). Validated against the detected columns; `status` → `extracting`.
4. A second worker job **unpivots** the attribute table into long-format `dataset_observations`: one row per `(kato_code, date)` with a numeric `value` (or `value_text` for non-numeric cells); `status` → `ready`. Read them via `GET /datasets/{id}/observations` (paginated, optional `?kato_code=`).

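The unpivot in step 4 can be illustrated with a minimal sketch. It is not the service's actual implementation: the `YearColumn`, `Observation`, and `Unpivot` names are hypothetical, rows are assumed to arrive as string maps, and rows with an empty KATO cell are assumed to be skipped.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// YearColumn pairs a source column with the date it represents,
// mirroring one {column, date} entry of the mapping payload.
// (Hypothetical type, for illustration only.)
type YearColumn struct {
	Column string
	Date   string // kept as an ISO date string for brevity
}

// Observation is one long-format row for a (kato_code, date) pair.
type Observation struct {
	KatoCode  string
	Date      string
	Value     *float64 // set when the cell parses as a number
	ValueText *string  // set for non-numeric cells
}

// Unpivot turns wide attribute rows into long-format observations:
// each row yields one observation per mapped year column.
func Unpivot(rows []map[string]string, katoColumn string, years []YearColumn) []Observation {
	var out []Observation
	for _, row := range rows {
		kato := strings.TrimSpace(row[katoColumn])
		if kato == "" {
			continue // assumption: rows without a KATO code are skipped
		}
		for _, yc := range years {
			cell, ok := row[yc.Column]
			if !ok {
				continue // column absent from this row
			}
			obs := Observation{KatoCode: kato, Date: yc.Date}
			if v, err := strconv.ParseFloat(strings.TrimSpace(cell), 64); err == nil {
				obs.Value = &v
			} else {
				text := cell
				obs.ValueText = &text
			}
			out = append(out, obs)
		}
	}
	return out
}

func main() {
	rows := []map[string]string{
		{"kato": "710000000", "F_2023": "1350.5", "D_2025": "n/a"},
	}
	years := []YearColumn{{"F_2023", "2023-01-01"}, {"D_2025", "2025-01-01"}}
	for _, o := range Unpivot(rows, "kato", years) {
		fmt.Println(o.KatoCode, o.Date, "numeric:", o.Value != nil)
	}
}
```

One wide row with two mapped year columns becomes two observations here, the first numeric and the second carried as text.
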
```sh
curl -X POST localhost:8080/datasets//mapping -H 'Content-Type: application/json' -d '{
  "kato_column": "като",
  "year_columns": [
    {"column": "F_2023", "date": "2023-01-01"},
    {"column": "D_2025", "date": "2025-01-01"}
  ]
}'
```

## Getting started

```sh
cp configs/.env.example .env
docker compose -f deployments/docker-compose.yml up -d postgres minio rabbitmq

go run ./cmd/gis migrate up                 # apply migrations
go run ./cmd/gis serve                      # HTTP server on :8080
go run ./cmd/gis worker --publish-example   # consume (and seed one message)
```

Health: `GET /healthz` (liveness), `GET /readyz` (DB + S3 + RabbitMQ).

### HTTP API

The API is described by an **OpenAPI 3.1.1** spec at [`api/openapi.yaml`](api/openapi.yaml), embedded into the binary. While the server runs it is served at `/openapi.yaml`, with an interactive **Redoc** UI at `/docs`.

| Method | Path | Description |
|--------|------|-------------|
| GET | `/categories` | list (optional `?parent_id=`) |
| POST | `/categories` | create (`name`, `description`, `parent_id?`) |
| GET | `/categories/{id}` | get |
| PUT | `/categories/{id}` | update |
| DELETE | `/categories/{id}` | delete |
| GET | `/datasets` | paginated list of summaries (`?page=`, `?page_size=`, `?category_id=`) |
| POST | `/datasets` | upload (multipart: `file`, `file_type`, `category_id`, `code`, `name`, `description?`, `unit?`, `meta?` (JSON), `automated?` (bool)) |
| GET | `/datasets/{id}` | full dataset (geometry as GeoJSON, `bbox` for rasters) |
| GET | `/datasets/{id}.geojson` | GeoJSON `FeatureCollection`; plain `vector` returns its geometry as a single feature with the extracted attribute table as top-level properties; `vector_with_kato` maps observations, joining the `districts` table by KATO when it has no geometry of its own |
| GET | `/datasets/{id}.kato.geojson` | GeoJSON `FeatureCollection` (vector_with_kato); ignores dataset geometry and always joins `districts` by KATO, mapping observations onto each polygon |
| GET | `/datasets/{id}/status` | processing status; long-polls with `?current=` (holds up to `?wait=` secs, default 25, max 60) |
| GET | `/datasets/{id}/download` | download the stored file |
| POST | `/datasets/{id}/mapping` | set KATO column + year→date map (vector_with_kato) |
| GET | `/datasets/{id}/observations` | paginated unpivoted values (`?kato_code=`, `?page=`, `?page_size=`) |
| DELETE | `/datasets/{id}` | delete (row + object) |

Example upload:

```sh
curl -X POST localhost:8080/datasets \
  -F file=@sample.geojson -F file_type=vector -F category_id= \
  -F code=POP_2026 -F name=Population -F description="Resident population" -F unit=people
```

## Migrations

Embedded via goose and run through the binary. The first migration enables the PostGIS extension (the database runs the `postgis/postgis` image), so a PostGIS-capable Postgres is required.

```sh
go run ./cmd/gis migrate up|down|status|reset
go run ./cmd/gis migrate fresh   # drop everything in the schema and re-run
```

> On Apple Silicon, `postgis/postgis` has no native arm64 build, so the compose
> file pins `platform: linux/amd64` (Docker Desktop emulates it). Remove that line
> on amd64 hosts.

## Development

Common tasks are wrapped in the `Makefile` (run `make help` for the full list):

```sh
make up              # start postgres, minio, rabbitmq
make migrate-fresh   # drop the schema and re-apply migrations
make run             # run the HTTP server
make check           # go vet + go test
make lint            # golangci-lint (if installed)
```

CI (`.github/workflows/ci.yml`) runs build, vet, `go test -race`, and golangci-lint on every push and pull request.

## Adding a feature

Each new domain is one vertical slice mirroring Category/Dataset: `domain/` → `repository/postgres/` → `service/` → `transport/http/` (+ `messaging/rabbitmq/` if it needs async processing), wired in `internal/app`.