# gis

A Go service scaffold following [golang-standards/project-layout](https://github.com/golang-standards/project-layout),
with cleanly separated layers: HTTP transport → services → repositories, plus
RabbitMQ messaging and embedded database migrations. Single binary, three
subcommands (`serve`, `worker`, `migrate`).

## Layout

```
cmd/gis/                  binary entrypoint
internal/
  cli/                    cobra commands: serve, worker, migrate
  config/                 env-based configuration
  app/                    composition root (wires all dependencies)
  domain/                 entities, enums, sentinel errors
  repository/postgres/    pgx-backed repositories
  service/                business logic
  transport/http/         chi router, middleware, handlers
  storage/s3/             MinIO/S3 object storage
  messaging/rabbitmq/     connection, publisher, consumer
  platform/logger/        slog setup
pkg/httputil/             generic JSON/validation HTTP helpers
migrations/               embedded goose SQL migrations
configs/                  .env.example
deployments/              docker-compose (postgres, minio, rabbitmq)
build/package/            Dockerfile
docs/                     generated OpenAPI/Swagger spec (swaggo/swag)
```

## Domain

- **Category**: hierarchical (self-referencing `parent_id`). Full CRUD; updates
  are cycle-safe.
- **Dataset**: a geo file uploaded to S3/MinIO (`file_type`: `vector_with_kato | vector | raster`),
  belonging to one Category. Carries `code`/`name`/`description`/`unit` metadata,
  a user-defined `meta` (JSONB) blob, an `automated` flag, a `status` lifecycle
  field (defaults to `pending`), `properties` (JSONB, populated from the file's
  attribute table), and a PostGIS `geometry` footprint stored in EPSG:4326
  (returned as GeoJSON, with a STAC-style `bbox` array for rasters). Supports
  upload, list, get, download, and delete (delete also removes the stored object).
  Uploads are validated three ways before being stored: the `file_type` enum, the
  file **extension** (must be allowed for the type), and a **content** magic-byte
  check (TIFF for `.tif`, ZIP for `.zip`, SQLite for `.gpkg`, JSON for `.geojson`),
  so mislabeled files are rejected with 422 up front.

  Every uploaded file is then processed asynchronously by the worker, dispatched by
  `file_type`:

  - **`vector`**: the attribute table is parsed and stored (as a JSON array of row
    objects) in `properties` (`status` `processing` → `ready`).
  - **`raster`**: converted to a **Cloud-Optimized GeoTIFF** via
    `gdal_translate -of COG` (`processing` → `ready`); the COG is stored under
    `cog_storage_key` (the original is kept) and the footprint `geometry` + `bbox`
    are read from the raster extent. Requires GDAL in the worker image (`gdal-tools`).
  - **`vector_with_kato`**: the column-selection flow below (`parsing` →
    `awaiting_mapping` → `extracting` → `ready`).
- **events** + the example RabbitMQ consumer/publisher are a generic messaging
  scaffold kept alongside the real async flows.

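The content check described above could look roughly like the following sketch. It is not taken from the service's code: the function name `sniffMatches`, the extension table, and the simplified GeoJSON test (first non-whitespace byte is `{`) are assumptions made for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"path/filepath"
	"strings"
)

// magicByExt maps a lowercase extension to the byte prefixes accepted for it.
// The table is illustrative; the real service may allow more extensions.
var magicByExt = map[string][][]byte{
	".tif":  {[]byte("II*\x00"), []byte("MM\x00*")}, // little- and big-endian TIFF
	".zip":  {[]byte("PK\x03\x04")},                 // ZIP local file header
	".gpkg": {[]byte("SQLite format 3\x00")},        // GeoPackage is a SQLite database
}

// sniffMatches reports whether the leading bytes of a file agree with its extension.
func sniffMatches(filename string, head []byte) bool {
	ext := strings.ToLower(filepath.Ext(filename))
	if ext == ".geojson" {
		// GeoJSON has no magic number; require a JSON object after optional whitespace.
		trimmed := bytes.TrimLeft(head, " \t\r\n")
		return len(trimmed) > 0 && trimmed[0] == '{'
	}
	for _, magic := range magicByExt[ext] {
		if bytes.HasPrefix(head, magic) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(sniffMatches("dem.tif", []byte("II*\x00\x08\x00")))    // true
	fmt.Println(sniffMatches("dem.tif", []byte("PK\x03\x04")))         // false: ZIP bytes, .tif name
	fmt.Println(sniffMatches("areas.geojson", []byte("  {\"type\":"))) // true
}
```

A handler would read only the first few bytes of the upload for this check and answer 422 when it returns `false`.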
### vector_with_kato two-phase flow

Uploading a `vector_with_kato` file (zipped shapefile, GeoJSON, or GeoPackage)
triggers asynchronous parsing of its attribute table, after which the user maps
the KATO column and the year columns:

1. `POST /datasets` with `file_type=vector_with_kato` → dataset created with
   `status=parsing`; a `dataset.parse` job is published to RabbitMQ.
2. The **worker** consumes the job, parses the file's columns (with sample
   values; CP1251/Cyrillic aware for shapefiles) and stores them in
   `attribute_columns`; `status` → `awaiting_mapping` (or `failed` with
   `parse_error`).
3. The client polls `GET /datasets/{id}` until `awaiting_mapping`, then submits
   `POST /datasets/{id}/mapping` with the chosen `kato_column` and a
   `year_columns` list (each entry `{column, date}`). The request is validated
   against the detected columns; `status` → `extracting`.
4. A second worker job **unpivots** the attribute table into long-format
   `dataset_observations`: one row per `(kato_code, date)` with a numeric
   `value` (or `value_text` for non-numeric cells); `status` → `ready`. Read
   them via `GET /datasets/{id}/observations` (paginated, optional
   `?kato_code=`).

```sh
curl -X POST localhost:8080/datasets/<id>/mapping -H 'Content-Type: application/json' -d '{
  "kato_column": "като",
  "year_columns": [
    {"column": "F_2023", "date": "2023-01-01"},
    {"column": "D_2025", "date": "2025-01-01"}
  ]
}'
```

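The unpivot in step 4 can be pictured with a small sketch. The types and the function below (`YearColumn`, `Observation`, `unpivot`) are hypothetical names, rows are simplified to `map[string]string`, and skipping rows without a KATO code is an assumption rather than documented behaviour.

```go
package main

import (
	"fmt"
	"strconv"
)

// YearColumn pairs a source column with the date its values belong to.
type YearColumn struct {
	Column string
	Date   string
}

// Observation is one long-format row: a KATO code, a date, and a value.
type Observation struct {
	KatoCode  string
	Date      string
	Value     *float64 // set when the cell parses as a number
	ValueText string   // raw cell text otherwise
}

// unpivot turns wide attribute rows into one observation per (kato_code, date).
func unpivot(rows []map[string]string, katoColumn string, years []YearColumn) []Observation {
	var out []Observation
	for _, row := range rows {
		code, ok := row[katoColumn]
		if !ok || code == "" {
			continue // assumption: rows without a KATO code are skipped
		}
		for _, yc := range years {
			cell, ok := row[yc.Column]
			if !ok {
				continue
			}
			obs := Observation{KatoCode: code, Date: yc.Date}
			if v, err := strconv.ParseFloat(cell, 64); err == nil {
				obs.Value = &v
			} else {
				obs.ValueText = cell
			}
			out = append(out, obs)
		}
	}
	return out
}

func main() {
	rows := []map[string]string{
		{"като": "751110000", "F_2023": "1520.5", "D_2025": "n/a"},
	}
	years := []YearColumn{{"F_2023", "2023-01-01"}, {"D_2025", "2025-01-01"}}
	for _, o := range unpivot(rows, "като", years) {
		if o.Value != nil {
			fmt.Println(o.KatoCode, o.Date, *o.Value)
		} else {
			fmt.Println(o.KatoCode, o.Date, o.ValueText)
		}
	}
}
```

One wide row with two mapped year columns yields two observations: a numeric one for `F_2023` and a text one for `D_2025`.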
## Getting started

```sh
cp configs/.env.example .env
docker compose -f deployments/docker-compose.yml up -d postgres minio rabbitmq

go run ./cmd/gis migrate up                 # apply migrations
go run ./cmd/gis serve                      # HTTP server on :8080
go run ./cmd/gis worker --publish-example   # consume (and seed one message)
```

Health: `GET /healthz` (liveness), `GET /readyz` (DB + S3 + RabbitMQ).

### HTTP API

The API is documented with [swaggo/swag](https://github.com/swaggo/swag)
annotations on the handlers. The generated spec lives in `docs/` and is served
as interactive **Swagger UI** at `/swagger/index.html` while the server runs.
Regenerate after changing annotations:

```sh
make docs   # go tool swag init -g cmd/gis/main.go --parseInternal --output docs
```

| Method | Path | Description |
|--------|------|-------------|
| GET | `/categories` | list (optional `?parent_id=`) |
| POST | `/categories` | create (`name`, `description`, `parent_id?`) |
| GET | `/categories/{id}` | get |
| PUT | `/categories/{id}` | update |
| DELETE | `/categories/{id}` | delete |
| GET | `/datasets` | paginated list of summaries (`?page=`, `?page_size=`, `?category_id=`) |
| POST | `/datasets` | upload (multipart: `file`, `file_type`, `category_id`, `code`, `name`, `description?`, `unit?`, `meta?` (JSON), `automated?` (bool)) |
| GET | `/datasets/{id}` | full dataset (geometry as GeoJSON, `bbox` for rasters) |
| GET | `/datasets/{id}/status` | processing status; long-polls with `?current=<status>` (holds up to `?wait=` secs, default 25, max 60) |
| GET | `/datasets/{id}/download` | download the stored file |
| POST | `/datasets/{id}/mapping` | set KATO column + year→date map (vector_with_kato) |
| GET | `/datasets/{id}/observations` | paginated unpivoted values (`?kato_code=`, `?page=`, `?page_size=`) |
| DELETE | `/datasets/{id}` | delete (row + object) |

Example upload:

```sh
curl -X POST localhost:8080/datasets \
  -F file=@sample.geojson -F file_type=vector -F category_id=<uuid> \
  -F code=POP_2026 -F name=Population -F description="Resident population" -F unit=people
```

## Migrations

Embedded via goose and run through the binary. The first migration enables the
PostGIS extension (the database runs the `postgis/postgis` image), so a
PostGIS-capable Postgres is required.

```sh
go run ./cmd/gis migrate up|down|status|reset   # choose one subcommand
go run ./cmd/gis migrate fresh                  # drop everything in the schema and re-run
```

> On Apple Silicon, `postgis/postgis` has no native arm64 build, so the compose
> file pins `platform: linux/amd64` (Docker Desktop emulates it). Remove that line
> on amd64 hosts.

## Development

Common tasks are wrapped in the `Makefile` (run `make help` for the full list):

```sh
make up             # start postgres, minio, rabbitmq
make migrate-fresh  # drop the schema and re-apply migrations
make run            # run the HTTP server
make check          # go vet + go test
make lint           # golangci-lint (if installed)
```

CI (`.github/workflows/ci.yml`) runs build, vet, `go test -race`, and golangci-lint
on every push and pull request.

## Adding a feature

Each new domain is one vertical slice mirroring Category/Dataset:
`domain/` → `repository/postgres/` → `service/` → `transport/http/`
(+ `messaging/rabbitmq/` if it needs async processing), wired in `internal/app`.
