gis

A Go service scaffold following golang-standards/project-layout, with cleanly separated layers: HTTP transport → services → repositories, plus RabbitMQ messaging and embedded database migrations. Single binary, three subcommands.

Layout

cmd/gis/                 binary entrypoint
internal/
  cli/                   cobra commands: serve, worker, migrate
  config/                env-based configuration
  app/                   composition root (wires all dependencies)
  domain/                entities, enums, sentinel errors
  repository/postgres/   pgx-backed repositories
  service/               business logic
  transport/http/        chi router, middleware, handlers
  storage/s3/            MinIO/S3 object storage
  messaging/rabbitmq/    connection, publisher, consumer
  platform/logger/       slog setup
pkg/httputil/            generic JSON/validation HTTP helpers
migrations/              embedded goose SQL migrations
configs/                 .env.example
deployments/             docker-compose (postgres, minio, rabbitmq)
build/package/           Dockerfile
api/openapi.yaml         OpenAPI 3.1.1 spec (embedded + served at /openapi.yaml)

Domain

  • Category — hierarchical (self-referencing parent_id). Full CRUD; cycle-safe on update.
  • Dataset — a geo file uploaded to S3/MinIO (file_type: vector_with_kato | vector | raster), belonging to one Category. Carries code/name/description/unit metadata, a user-defined meta (JSONB) blob, an automated flag, a status lifecycle field (defaults to pending), properties (JSONB, populated from the file's attribute table), and a PostGIS geometry footprint stored in EPSG:4326 (returned as GeoJSON, with a STAC-style bbox array for rasters). Upload / list / get / download / delete (delete also removes the stored object). Uploads are validated three ways before being stored: the file_type enum, the file extension (must be allowed for the type), and a content magic-byte check (TIFF for .tif, ZIP for .zip, SQLite for .gpkg, JSON for .geojson) so mislabeled files are rejected with 422 up front.
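
The magic-byte check could look roughly like the sketch below. It is an illustration, not the service's actual code: sniffKind, checkContent, wantKind, and the kind labels are hypothetical names, and since JSON has no magic number the sketch assumes a "first non-whitespace byte opens an object or array" heuristic.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// sniffKind returns a coarse content kind for the leading bytes of an upload.
// The kind labels ("tiff", "zip", ...) are illustrative names for this sketch.
func sniffKind(head []byte) string {
	switch {
	case bytes.HasPrefix(head, []byte("II*\x00")), // classic TIFF, little-endian
		bytes.HasPrefix(head, []byte("MM\x00*")), // classic TIFF, big-endian
		bytes.HasPrefix(head, []byte("II+\x00")), // BigTIFF, little-endian
		bytes.HasPrefix(head, []byte("MM\x00+")): // BigTIFF, big-endian
		return "tiff"
	case bytes.HasPrefix(head, []byte("PK\x03\x04")): // ZIP local file header
		return "zip"
	case bytes.HasPrefix(head, []byte("SQLite format 3\x00")): // GeoPackage is SQLite
		return "sqlite"
	}
	// Assumption: JSON is recognised structurally, because it has no magic
	// number. Skip an optional UTF-8 BOM and leading whitespace first.
	rest := bytes.TrimLeft(bytes.TrimPrefix(head, []byte("\xef\xbb\xbf")), " \t\r\n")
	if len(rest) > 0 && (rest[0] == '{' || rest[0] == '[') {
		return "json"
	}
	return "unknown"
}

// wantKind maps each extension named above to the content it must hold.
var wantKind = map[string]string{
	".tif":     "tiff",
	".zip":     "zip",
	".gpkg":    "sqlite",
	".geojson": "json",
}

// checkContent returns an error when the extension is unknown or the leading
// bytes do not match it; a handler would translate that error into a 422.
func checkContent(ext string, head []byte) error {
	want, ok := wantKind[strings.ToLower(ext)]
	if !ok {
		return fmt.Errorf("extension %q is not allowed", ext)
	}
	if got := sniffKind(head); got != want {
		return fmt.Errorf("%s file has %s content, want %s", ext, got, want)
	}
	return nil
}

func main() {
	fmt.Println(checkContent(".zip", []byte("PK\x03\x04...")))
	fmt.Println(checkContent(".tif", []byte(`{"type":"FeatureCollection"}`)))
}
```

Only the first few bytes of the upload are needed, so a check like this can run before the object is written to storage.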

Every uploaded file is then processed asynchronously by the worker, dispatched by file_type:

  • vector — the attribute table is parsed and stored (as a JSON array of row objects) in properties (status processing → ready).
  • raster — converted to a Cloud-Optimized GeoTIFF via gdal_translate -of COG (processing → ready); the COG is stored under cog_storage_key (the original is kept) and the footprint geometry + bbox are read from the raster extent. Requires GDAL in the worker image (gdal-tools).
  • vector_with_kato — the column-selection flow below (parsing → awaiting_mapping → extracting → ready).
  • events + the example RabbitMQ consumer/publisher are a generic messaging scaffold kept alongside the real async flows.
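
The status lifecycles listed above can be read as a small state machine. The sketch below encodes them as a transition table. The Go identifiers are hypothetical, and two details are assumptions rather than documented behavior: that pending leads into processing or parsing, and that processing and extracting can end in failed the same way parsing can.

```go
package main

import "fmt"

// Status is the dataset lifecycle value; the string values come from the README.
type Status string

const (
	Pending         Status = "pending"
	Processing      Status = "processing"
	Parsing         Status = "parsing"
	AwaitingMapping Status = "awaiting_mapping"
	Extracting      Status = "extracting"
	Ready           Status = "ready"
	Failed          Status = "failed"
)

// allowed lists the legal next states for each state.
// Assumption: the Pending edges and the Failed edges out of Processing and
// Extracting are guesses; only parsing -> failed is stated in the README.
var allowed = map[Status][]Status{
	Pending:         {Processing, Parsing},
	Processing:      {Ready, Failed},
	Parsing:         {AwaitingMapping, Failed},
	AwaitingMapping: {Extracting},
	Extracting:      {Ready, Failed},
}

// CanTransition reports whether moving from one status to another is legal.
// Terminal states (ready, failed) have no entry, so nothing follows them.
func CanTransition(from, to Status) bool {
	for _, s := range allowed[from] {
		if s == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(CanTransition(Parsing, AwaitingMapping))
	fmt.Println(CanTransition(Ready, Parsing))
}
```

Guarding status updates with a table like this keeps a stale or duplicated worker job from moving a dataset backwards.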

vector_with_kato two-phase flow

Uploading a vector_with_kato file (zipped shapefile, GeoJSON, or GeoPackage) triggers asynchronous parsing of its attribute table, after which the user maps the KATO column and the year columns:

  1. POST /datasets with file_type=vector_with_kato → dataset created with status=parsing; a dataset.parse job is published to RabbitMQ.
  2. The worker consumes the job, parses the file's columns (with sample values; CP1251/Cyrillic aware for shapefiles) and stores them in attribute_columns; status → awaiting_mapping (or failed with parse_error).
  3. The client polls GET /datasets/{id} until awaiting_mapping, then submits POST /datasets/{id}/mapping with the chosen kato_column and a year_columns array (each entry {column, date}). The mapping is validated against the detected columns; status → extracting.
  4. A second worker job unpivots the attribute table into long-format dataset_observations — one row per (kato_code, date) with a numeric value (or value_text for non-numeric cells); status → ready. Read them via GET /datasets/{id}/observations (paginated, optional ?kato_code=).

Example mapping request:

curl -X POST localhost:8080/datasets/<id>/mapping -H 'Content-Type: application/json' -d '{
  "kato_column": "като",
  "year_columns": [
    {"column": "F_2023", "date": "2023-01-01"},
    {"column": "D_2025", "date": "2025-01-01"}
  ]
}'
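
The unpivot step in the flow above can be sketched as follows. This is a minimal illustration, not the worker's real implementation: the YearColumn and Observation types and the unpivot function are hypothetical, rows are modeled as string maps, and skipping rows without a KATO code and skipping empty cells are assumptions.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// YearColumn mirrors one entry of year_columns in the mapping request.
type YearColumn struct {
	Column string
	Date   string
}

// Observation is one long-format row: a numeric Value, or ValueText when the
// cell is not numeric.
type Observation struct {
	KatoCode  string
	Date      string
	Value     *float64
	ValueText string
}

// unpivot turns wide attribute rows into one Observation per (kato_code, date).
func unpivot(rows []map[string]string, katoColumn string, years []YearColumn) []Observation {
	var out []Observation
	for _, row := range rows {
		kato := strings.TrimSpace(row[katoColumn])
		if kato == "" {
			continue // assumption: rows without a KATO code are skipped
		}
		for _, yc := range years {
			cell := strings.TrimSpace(row[yc.Column])
			if cell == "" {
				continue // assumption: empty or missing cells produce no row
			}
			obs := Observation{KatoCode: kato, Date: yc.Date}
			if v, err := strconv.ParseFloat(cell, 64); err == nil {
				obs.Value = &v
			} else {
				obs.ValueText = cell
			}
			out = append(out, obs)
		}
	}
	return out
}

func main() {
	rows := []map[string]string{
		{"като": "751010000", "F_2023": "12.5", "D_2025": "n/a"},
	}
	years := []YearColumn{
		{Column: "F_2023", Date: "2023-01-01"},
		{Column: "D_2025", Date: "2025-01-01"},
	}
	for _, o := range unpivot(rows, "като", years) {
		fmt.Printf("%s %s numeric=%t text=%q\n", o.KatoCode, o.Date, o.Value != nil, o.ValueText)
	}
}
```

Note that strconv.ParseFloat also accepts spellings such as "NaN" and "Inf", so a production version would likely want a stricter numeric check.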

Getting started

cp configs/.env.example .env
docker compose -f deployments/docker-compose.yml up -d postgres minio rabbitmq

go run ./cmd/gis migrate up        # apply migrations
go run ./cmd/gis serve             # HTTP server on :8080
go run ./cmd/gis worker --publish-example   # consume (and seed one message)

Health: GET /healthz (liveness), GET /readyz (DB + S3 + RabbitMQ).

HTTP API

The API is described by an OpenAPI 3.1.1 spec at api/openapi.yaml, embedded into the binary. While the server runs it is served at /openapi.yaml, with an interactive Redoc UI at /docs.

Method  Path                         Description
GET     /categories                  list (optional ?parent_id=)
POST    /categories                  create (name, description, parent_id?)
GET     /categories/{id}             get
PUT     /categories/{id}             update
DELETE  /categories/{id}             delete
GET     /datasets                    paginated list of summaries (?page=, ?page_size=, ?category_id=)
POST    /datasets                    upload (multipart: file, file_type, category_id, code, name, description?, unit?, meta? (JSON), automated? (bool))
GET     /datasets/{id}               full dataset (geometry as GeoJSON, bbox for rasters)
GET     /datasets/{id}/status        processing status; long-polls with ?current=<status> (holds up to ?wait= secs, default 25, max 60)
GET     /datasets/{id}/download      download the stored file
POST    /datasets/{id}/mapping       set KATO column + year→date map (vector_with_kato)
GET     /datasets/{id}/observations  paginated unpivoted values (?kato_code=, ?page=, ?page_size=)
DELETE  /datasets/{id}               delete (row + object)

Example upload:

curl -X POST localhost:8080/datasets \
  -F file=@sample.geojson -F file_type=vector -F category_id=<uuid> \
  -F code=POP_2026 -F name=Population -F description="Resident population" -F unit=people

Migrations

Embedded via goose and run through the binary. The first migration enables the PostGIS extension (the database runs the postgis/postgis image), so a PostGIS-capable Postgres is required.

go run ./cmd/gis migrate up|down|status|reset
go run ./cmd/gis migrate fresh    # drop everything in the schema and re-run

On Apple Silicon, postgis/postgis has no native arm64 build, so the compose file pins platform: linux/amd64 (Docker Desktop emulates it). Remove that line on amd64 hosts.

Development

Common tasks are wrapped in the Makefile (run make help for the full list):

make up            # start postgres, minio, rabbitmq
make migrate-fresh # drop the schema and re-apply migrations
make run           # run the HTTP server
make check         # go vet + go test
make lint          # golangci-lint (if installed)

CI (.github/workflows/ci.yml) runs build, vet, go test -race, and golangci-lint on every push and pull request.

Adding a feature

Each new domain is one vertical slice mirroring Category/Dataset: domain/ → repository/postgres/ → service/ → transport/http/ (+ messaging/rabbitmq/ if it needs async processing), wired in internal/app.