Bakhtiyar Issakhmetov 30f7ae1e07 feat: reprocess all command		2026-06-28 00:40:50 +05:00
.idea	Second checkpoint	2026-06-25 01:17:42 +05:00
api	fix: Code for category	2026-06-27 19:35:53 +05:00
build/package	fix: Fix geometry parsing	2026-06-28 00:37:29 +05:00
cmd/gis	fix: Fix entrypoint	2026-06-25 01:26:16 +05:00
configs	Second checkpoint	2026-06-25 01:17:42 +05:00
deployments	fix: Platform arch for postgres	2026-06-27 19:51:15 +05:00
internal	feat: reprocess all command	2026-06-28 00:40:50 +05:00
migrations	fix: Code for category	2026-06-27 19:35:53 +05:00
pkg/httputil	Second checkpoint	2026-06-25 01:17:42 +05:00
.dockerignore	Second checkpoint	2026-06-25 01:17:42 +05:00
.env.example	First checkpoint	2026-06-04 20:52:28 +05:00
.gitignore	fix: Platform arch for postgres	2026-06-27 19:51:15 +05:00
.golangci.yml	Second checkpoint	2026-06-25 01:17:42 +05:00
go.mod	Second checkpoint	2026-06-25 01:17:42 +05:00
go.sum	Second checkpoint	2026-06-25 01:17:42 +05:00
Makefile	Second checkpoint	2026-06-25 01:19:59 +05:00
README.md	Second checkpoint	2026-06-25 01:19:59 +05:00

README.md

gis

A Go service scaffold following golang-standards/project-layout, with cleanly separated layers: HTTP transport → services → repositories, plus RabbitMQ messaging and embedded database migrations. Single binary, three subcommands.

Layout

cmd/gis/                 binary entrypoint
internal/
  cli/                   cobra commands: serve, worker, migrate
  config/                env-based configuration
  app/                   composition root (wires all dependencies)
  domain/                entities, enums, sentinel errors
  repository/postgres/   pgx-backed repositories
  service/               business logic
  transport/http/        chi router, middleware, handlers
  storage/s3/            MinIO/S3 object storage
  messaging/rabbitmq/    connection, publisher, consumer
  platform/logger/       slog setup
pkg/httputil/            generic JSON/validation HTTP helpers
migrations/              embedded goose SQL migrations
configs/                 .env.example
deployments/             docker-compose (postgres, minio, rabbitmq)
build/package/           Dockerfile
api/openapi.yaml         OpenAPI 3.1.1 spec (embedded + served at /openapi.yaml)

Domain

Category — hierarchical (self-referencing parent_id). Full CRUD; cycle-safe on update.
Dataset — a geo file uploaded to S3/MinIO (file_type: vector_with_kato | vector | raster), belonging to one Category. Carries code/name/description/unit metadata, a user-defined meta (JSONB) blob, an automated flag, a status lifecycle field (defaults to pending), properties (JSONB, populated from the file's attribute table), and a PostGIS geometry footprint stored in EPSG:4326 (returned as GeoJSON, with a STAC-style bbox array for rasters). Supported operations: upload, list, get, download, and delete (delete also removes the stored object). Uploads are validated three ways before being stored: the file_type enum, the file extension (must be allowed for the type), and a content magic-byte check (TIFF for .tif, ZIP for .zip, SQLite for .gpkg, JSON for .geojson), so mislabeled files are rejected with 422 up front.
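
The magic-byte check can be illustrated with a short sketch. This is not the service's actual code: sniffKind, matchesExtension, and expectedKind are hypothetical names, and treating a leading '{' as "JSON" is an assumption. Only the extension-to-format pairs come from the description above.

```go
package main

import (
	"bytes"
	"fmt"
)

// expectedKind maps each extension named in the README to the container
// format its leading bytes should identify.
var expectedKind = map[string]string{
	".tif":     "tiff",
	".zip":     "zip",
	".gpkg":    "sqlite",
	".geojson": "json",
}

// sniffKind inspects the first bytes of a file and returns "tiff", "zip",
// "sqlite", "json", or "" when nothing matches.
func sniffKind(head []byte) string {
	switch {
	case bytes.HasPrefix(head, []byte("II*\x00")), // classic TIFF, little-endian
		bytes.HasPrefix(head, []byte("MM\x00*")), // classic TIFF, big-endian
		bytes.HasPrefix(head, []byte("II+\x00")), // BigTIFF, little-endian
		bytes.HasPrefix(head, []byte("MM\x00+")): // BigTIFF, big-endian
		return "tiff"
	case bytes.HasPrefix(head, []byte("PK\x03\x04")):
		return "zip"
	case bytes.HasPrefix(head, []byte("SQLite format 3\x00")):
		return "sqlite"
	}
	// Assumption: a GeoJSON document is a JSON object, so after optional
	// whitespace its first byte is '{'.
	trimmed := bytes.TrimLeft(head, " \t\r\n")
	if len(trimmed) > 0 && trimmed[0] == '{' {
		return "json"
	}
	return ""
}

// matchesExtension reports whether the sniffed content agrees with the
// extension; unknown extensions never match.
func matchesExtension(ext string, head []byte) bool {
	want, ok := expectedKind[ext]
	return ok && sniffKind(head) == want
}

func main() {
	zipHead := []byte("PK\x03\x04 rest of archive")
	fmt.Println(matchesExtension(".zip", zipHead)) // content agrees with extension
	fmt.Println(matchesExtension(".tif", zipHead)) // mislabeled: a handler would answer 422
}
```

Only the first few bytes are needed, so a handler could run a check like this on a small buffered prefix before streaming the upload to object storage.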

Every uploaded file is then processed asynchronously by the worker, dispatched by file_type:

vector — the attribute table is parsed and stored (as a JSON array of row objects) in properties (status processing → ready).
raster — converted to a Cloud-Optimized GeoTIFF via gdal_translate -of COG (processing → ready); the COG is stored under cog_storage_key (the original is kept) and the footprint geometry + bbox are read from the raster extent. Requires GDAL in the worker image (gdal-tools).
vector_with_kato — the column-selection flow below (parsing → awaiting_mapping → extracting → ready).
events + the example RabbitMQ consumer/publisher are a generic messaging scaffold kept alongside the real async flows.
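
The per-type status lifecycles above can be written down as a small transition table. The sketch below is illustrative only: the Status type, flows, and canTransition are hypothetical names, and two rules are assumptions rather than documented behavior: that pending leads into the first step of a flow, and that any in-flight step may move to failed.

```go
package main

import "fmt"

// Status mirrors the status values named in the README.
type Status string

const (
	StatusPending         Status = "pending"
	StatusProcessing      Status = "processing"
	StatusParsing         Status = "parsing"
	StatusAwaitingMapping Status = "awaiting_mapping"
	StatusExtracting      Status = "extracting"
	StatusReady           Status = "ready"
	StatusFailed          Status = "failed"
)

// flows lists the ordered statuses per file_type, as described above.
var flows = map[string][]Status{
	"vector":           {StatusProcessing, StatusReady},
	"raster":           {StatusProcessing, StatusReady},
	"vector_with_kato": {StatusParsing, StatusAwaitingMapping, StatusExtracting, StatusReady},
}

// canTransition reports whether a dataset of the given file_type may move
// from one status to another.
func canTransition(fileType string, from, to Status) bool {
	flow, ok := flows[fileType]
	if !ok {
		return false
	}
	idx := -1
	for i, s := range flow {
		if s == from {
			idx = i
		}
	}
	if to == StatusFailed {
		// Assumption: any in-flight step may fail; ready is terminal.
		return idx >= 0 && from != StatusReady
	}
	if from == StatusPending {
		// Assumption: pending leads into the first step of the flow.
		return to == flow[0]
	}
	return idx >= 0 && idx+1 < len(flow) && flow[idx+1] == to
}

func main() {
	fmt.Println(canTransition("vector_with_kato", StatusParsing, StatusAwaitingMapping))
	fmt.Println(canTransition("vector_with_kato", StatusParsing, StatusReady))
	fmt.Println(canTransition("raster", StatusProcessing, StatusReady))
}
```

Keeping the flows in one table makes it easy for a worker to reject out-of-order updates, for example a mapping submitted before parsing has finished.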

vector_with_kato two-phase flow

Uploading a vector_with_kato file (zipped shapefile, GeoJSON, or GeoPackage) triggers asynchronous parsing of its attribute table, after which the user maps the KATO column and the year columns:

POST /datasets with file_type=vector_with_kato → dataset created with status=parsing; a dataset.parse job is published to RabbitMQ.
The worker consumes the job, parses the file's columns (with sample values; CP1251/Cyrillic aware for shapefiles) and stores them in attribute_columns; status → awaiting_mapping (or failed with parse_error).
The client polls GET /datasets/{id} until the status is awaiting_mapping, then submits POST /datasets/{id}/mapping with the chosen kato_column and a year_columns array (each entry {column, date}). The mapping is validated against the detected columns; status → extracting.
A second worker job unpivots the attribute table into long-format dataset_observations — one row per (kato_code, date) with a numeric value (or value_text for non-numeric cells); status → ready. Read them via GET /datasets/{id}/observations (paginated, optional ?kato_code=).

curl -X POST localhost:8080/datasets/<id>/mapping -H 'Content-Type: application/json' -d '{
  "kato_column": "като",
  "year_columns": [
    {"column": "F_2023", "date": "2023-01-01"},
    {"column": "D_2025", "date": "2025-01-01"}
  ]
}'
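
The unpivot step described above can be sketched in a few lines of Go. This is a simplified illustration rather than the worker's real code: the YearColumn and Observation types and the unpivot function are hypothetical, rows are modeled as plain string maps, and skipping rows without a KATO code and empty cells is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// YearColumn pairs an attribute column with the date it represents,
// matching one entry of year_columns in the mapping request.
type YearColumn struct {
	Column string
	Date   string
}

// Observation is one long-format row: a numeric Value, or ValueText when
// the cell does not parse as a number.
type Observation struct {
	KatoCode  string
	Date      string
	Value     *float64
	ValueText *string
}

// unpivot turns wide attribute rows into one observation per
// (kato_code, date) pair.
func unpivot(rows []map[string]string, katoColumn string, yearColumns []YearColumn) []Observation {
	var out []Observation
	for _, row := range rows {
		kato := strings.TrimSpace(row[katoColumn])
		if kato == "" {
			continue // assumption: rows without a KATO code are skipped
		}
		for _, yc := range yearColumns {
			cell := strings.TrimSpace(row[yc.Column])
			if cell == "" {
				continue // assumption: empty or missing cells produce no observation
			}
			obs := Observation{KatoCode: kato, Date: yc.Date}
			if v, err := strconv.ParseFloat(cell, 64); err == nil {
				obs.Value = &v
			} else {
				text := cell
				obs.ValueText = &text
			}
			out = append(out, obs)
		}
	}
	return out
}

func main() {
	rows := []map[string]string{
		{"като": "751110000", "F_2023": "12.5", "D_2025": "n/a"},
	}
	cols := []YearColumn{
		{Column: "F_2023", Date: "2023-01-01"},
		{Column: "D_2025", Date: "2025-01-01"},
	}
	for _, o := range unpivot(rows, "като", cols) {
		if o.Value != nil {
			fmt.Println(o.KatoCode, o.Date, *o.Value)
		} else {
			fmt.Println(o.KatoCode, o.Date, *o.ValueText)
		}
	}
}
```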

Getting started

cp configs/.env.example .env
docker compose -f deployments/docker-compose.yml up -d postgres minio rabbitmq

go run ./cmd/gis migrate up        # apply migrations
go run ./cmd/gis serve             # HTTP server on :8080
go run ./cmd/gis worker --publish-example   # consume (and seed one message)

Health: GET /healthz (liveness), GET /readyz (DB + S3 + RabbitMQ).

HTTP API

The API is described by an OpenAPI 3.1.1 spec at api/openapi.yaml, embedded into the binary. While the server runs it is served at /openapi.yaml, with an interactive Redoc UI at /docs.

Method	Path	Description
GET	`/categories`	list (optional `?parent_id=`)
POST	`/categories`	create (`name`, `description`, `parent_id?`)
GET	`/categories/{id}`	get
PUT	`/categories/{id}`	update
DELETE	`/categories/{id}`	delete
GET	`/datasets`	paginated list of summaries (`?page=`, `?page_size=`, `?category_id=`)
POST	`/datasets`	upload (multipart: `file`, `file_type`, `category_id`, `code`, `name`, `description?`, `unit?`, `meta?` (JSON), `automated?` (bool))
GET	`/datasets/{id}`	full dataset (geometry as GeoJSON, `bbox` for rasters)
GET	`/datasets/{id}/status`	processing status; long-polls with `?current=<status>` (holds up to `?wait=` secs, default 25, max 60)
GET	`/datasets/{id}/download`	download the stored file
POST	`/datasets/{id}/mapping`	set KATO column + year→date map (vector_with_kato)
GET	`/datasets/{id}/observations`	paginated unpivoted values (`?kato_code=`, `?page=`, `?page_size=`)
DELETE	`/datasets/{id}`	delete (row + object)

Example upload:

curl -X POST localhost:8080/datasets \
  -F file=@sample.geojson -F file_type=vector -F category_id=<uuid> \
  -F code=POP_2026 -F name=Population -F description="Resident population" -F unit=people

Migrations

Embedded via goose and run through the binary. The first migration enables the PostGIS extension (the database runs the postgis/postgis image), so a PostGIS-capable Postgres is required.

go run ./cmd/gis migrate up|down|status|reset
go run ./cmd/gis migrate fresh    # drop everything in the schema and re-run

On Apple Silicon, postgis/postgis has no native arm64 build, so the compose file pins platform: linux/amd64 (Docker Desktop emulates it). Remove that line on amd64 hosts.

Development

Common tasks are wrapped in the Makefile (run make help for the full list):

make up            # start postgres, minio, rabbitmq
make migrate-fresh # drop the schema and re-apply migrations
make run           # run the HTTP server
make check         # go vet + go test
make lint          # golangci-lint (if installed)

CI (.github/workflows/ci.yml) runs build, vet, go test -race, and golangci-lint on every push and pull request.

Adding a feature

Each new domain is one vertical slice mirroring Category/Dataset: domain/ → repository/postgres/ → service/ → transport/http/ (+ messaging/rabbitmq/ if it needs async processing), wired in internal/app.