Compare commits

..

128 commits

Author SHA1 Message Date
Alex Auvolat
b6b18427a5 use optimization level 3 and thin LTO for release builds (#1405)
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1405
Co-authored-by: Alex Auvolat <lx@deuxfleurs.fr>
Co-committed-by: Alex Auvolat <lx@deuxfleurs.fr>
2026-04-16 08:47:02 +00:00
Gauthier Zirnhelt
9987166b2b Fix the LifecycleWorker being uncooperative (#1396)
## Summary

This PR ensures that the `LifecycleWorker` yields at least once to the Tokio scheduler in between each batch of 100 objects.

## Problem being solved

I'm administrating a Garage cluster which has been experiencing timeouts on all endpoints while the lifecycle worker is running at midnight UTC : `Ping timeout` error messages and even requests eventually failing due to `Could not reach quorum ...`.

I have found that this happens while the lifecycle worker is working on a big bucket (containing millions of objects) with a lifecycle rule that applies to very few objects.
The `process_object()` function does not hit any `await`:
- `last_bucket` is always the same, so the `bucket_table` is not read asynchronously
- no transaction is made on the `object_table` because my lifecycle rule (almost) never applies to any object

The first commit in this PR adds an executable which reproduces the problem that I've been experiencing in a self-contained way : the lifecycle worker starves the Tokio scheduler so much that no other task is able to run (or very rarely).
To run it : `cargo run -p garage_model --bin lifecycle-starvation-test`.
This commit can be dropped post-review, as it's only useful to demonstrate the starvation.

The error messages completely stopped after adding the extra yield to the nodes of my cluster.
The duration of the lifecycle worker task does not appear to have changed at all from what I can see (looking at the timestamps produced either by the self-contained binary or by each of my nodes with the `Lifecycle worker finished` message).

## Note

An other potential fix would have been to force the `WorkerProcessor` to yield before re-enqueuing a busy task, but this would have affected all Garage workers even though it's only the `LifecycleWorker` being uncooperative.

Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1396
Reviewed-by: Alex <lx@deuxfleurs.fr>
Co-authored-by: Gauthier Zirnhelt <gauthier.zirnhelt@insimo.fr>
Co-committed-by: Gauthier Zirnhelt <gauthier.zirnhelt@insimo.fr>
2026-04-15 09:56:24 +00:00
trinity-1686a
b72b090a09 fix silent write errors (#1358)
fix #1355

some write errors are not reported when calling write_all. That's notably the case of ENOSPC on small buffers (1MiB).
on ext4, the error is catched when calling flush(). This is hopefully the case on most local filesystems, though afaik this assumption doesn't hold for NFS

Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1358
Co-authored-by: trinity-1686a <trinity@deuxfleurs.fr>
Co-committed-by: trinity-1686a <trinity@deuxfleurs.fr>
2026-02-21 07:21:24 +00:00
Armael
8551aefed4 Fix: correctly parse CORS website configuration with no rules (#1320)
When sending a website config with an empty list of CORS rules, garage currently incorrectly refuses it with error message "Invalid XML: missing field `CORSRule`".
This fix the issue by following the documentation of quick-xml related to serde field parameters for this specific scenario:  https://docs.rs/quick-xml/latest/quick_xml/de/#sequences-xsall-and-xssequence-xml-schema-types .

(I've based this PR on main-v1 because we want it for deuxfleurs' deployment.)

Co-authored-by: Armaël Guéneau <armael.gueneau@ens-lyon.org>
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1320
Co-authored-by: Armael <armael@noreply.localhost>
Co-committed-by: Armael <armael@noreply.localhost>
2026-02-07 13:11:20 +00:00
Alex Auvolat
47bf5d9fb0 bump version to v1.3.1 2026-01-24 13:01:27 +01:00
Alex Auvolat
5df37dae5e update cargo dependencies in main-v1 (#1299)
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1299
Co-authored-by: Alex Auvolat <lx@deuxfleurs.fr>
Co-committed-by: Alex Auvolat <lx@deuxfleurs.fr>
2026-01-24 11:59:01 +00:00
Alex
44af0bdab3 Merge pull request 'Backport #1283 and #1290 to main-v1' (#1297) from backports-v1 into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1297
2026-01-24 11:34:28 +00:00
rmoff
a7d6620e18 Fix typo in error message 2026-01-24 12:21:45 +01:00
Joe Anderson
8eb12755e4 Allow bucket to be missing from presigned post params 2026-01-24 12:21:25 +01:00
maximilien
c685a2cbaf Merge pull request 'Update doc/book/cookbook/binary-packages.md' (#1269) from nmstoker/garage:main-v1 into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1269
2025-12-21 21:12:15 +00:00
maximilien
969f42a970 Merge pull request 'feat: add service annotations' (#1264) from deimosfr/garage:feat/add_helm_svc_annotations into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1264
Reviewed-by: maximilien <git@mricher.fr>
2025-12-21 21:00:00 +00:00
nmstoker
424d4f8d4d Update doc/book/cookbook/binary-packages.md
Correct the Arch Linux link as garage is now available in the official repos under extra, and no longer in AUR.
2025-12-20 13:16:38 +00:00
Pierre Mavro
bf5290036f
feat: add service annotations 2025-12-18 18:12:22 +01:00
Alex
4efc8bac07 Merge pull request 'Add the parameter, which replaces . This is to accommodate different storage media such as HDD and NVMe.' (#1251) from perrynzhou/garage:dev into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1251
Reviewed-by: Alex <lx@deuxfleurs.fr>
2025-12-17 10:05:49 +00:00
perrynzhou
f3dcc39903 Merge branch 'main-v1' into dev 2025-12-17 10:05:19 +00:00
maximilien
43e02920c2 Merge pull request 'docs: fix typo in doc/book/cookbook/kubernetes.md' (#1259) from simonpasquier/garage:fix-typo into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1259
Reviewed-by: maximilien <me@mricher.fr>
2025-12-17 07:09:59 +00:00
Simon Pasquier
dcc2fe4ac5
docs: fix typo in doc/book/cookbook/kubernetes.md 2025-12-16 10:16:44 +01:00
perrynzhou@gmail.com
e3a5ec6ef6 rename put_blocks_max_parallel to block_max_concurrent_writes_per_request and update configuration.md 2025-12-12 07:09:38 +08:00
perrynzhou@gmail.com
4d124e1c76 Add the parameter, which replaces . This is to accommodate different storage media such as HDD and NVMe. 2025-12-10 06:43:51 +08:00
Alex
d769a7be5d Merge pull request 'Update rust toolchain to 1.91.0' (#1233) from toolchain-update into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1233
2025-11-25 09:58:17 +00:00
Alex Auvolat
511cf0c6ec disable awscli checksumming in ci scripts
required because garage.deuxfleurs.fr is still running v1.x
2025-11-24 18:37:34 +01:00
Alex Auvolat
95693d45b2 run cargo fmt as a nix derivation 2025-11-24 18:09:53 +01:00
Alex Auvolat
ca296477f3 disable checksums in aws cli (todo: revert in main-v2) 2025-11-24 17:58:57 +01:00
Alex Auvolat
ca3b4a050d update nixos image used in woodpecker ci 2025-11-24 17:35:51 +01:00
Alex Auvolat
a057ab23ea Update rust toolchain 2025-11-24 11:09:46 +01:00
Alex
58bc65b9a8 Merge pull request 'migrate to this error, garage-v1' (#1218) from thiserror into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1218
Reviewed-by: Alex <lx@deuxfleurs.fr>
2025-11-12 08:05:32 +00:00
trinity-1686a
ac851d6dee fmt 2025-11-01 18:04:54 +01:00
trinity-1686a
eac2aa6fe4 Merge pull request 'fix: default config path changed for alpine binary' (#1204) from berndsen-io/garage:fix-alpine-docs into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1204
Reviewed-by: trinity-1686a <trinity@deuxfleurs.fr>
2025-11-01 16:43:32 +00:00
trinity-1686a
1e0201ada2 Merge pull request 'Update link to signature v2.' (#1211) from teo-tsirpanis/garage:sigv2-docs into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1211
Reviewed-by: trinity-1686a <trinity@deuxfleurs.fr>
2025-11-01 16:43:05 +00:00
trinity-1686a
82297371bf migrate to this error
it doesn't generate a bazillion warning at compile time
2025-11-01 17:20:39 +01:00
teo-tsirpanis
174f4f01a8 Update link to signature v2. 2025-10-26 15:54:08 +00:00
fgberry
1aac7b4875 chore: spacing 2025-10-24 11:25:33 +02:00
fgberry
b43c58cbe5 fix: default config path changed for alpine binary 2025-10-24 11:22:32 +02:00
Alex
9481ac428e Merge pull request 'sigv4: don't enforce x-amz-content-sha256 to be in signed headers list (fix #770)' (#1195) from fix-770 into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1195
2025-10-14 09:34:35 +00:00
Alex Auvolat
1c29d04cc5 sigv4: don't enforce x-amz-content-sha256 to be in signed headers list (fix #770)
From the following page:
https://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-header-based-auth.html

> In both cases, because the x-amz-content-sha256 header value is already
> part of your HashedPayload, you are not required to include the
> x-amz-content-sha256 header as a canonical header.
2025-10-14 11:18:25 +02:00
Alex
b48a8eaa1f Merge pull request 'properly handle precondition time equal to object time' (#1193) from precondition-ms into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1193
2025-10-10 19:41:06 +00:00
trinity-1686a
42fd8583bd properly handle precondition time equal to object time 2025-10-08 17:54:22 +02:00
Alex
236af3a958 Merge pull request 'Garage v1.3.0' (#1166) from rel-v1.3.0 into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1166
2025-09-14 21:26:21 +00:00
Alex Auvolat
4b1fdbef55 bump version to v1.3.0 2025-09-14 21:36:33 +02:00
Alex Auvolat
0f1b488be0 fix rust warnings 2025-09-14 21:25:37 +02:00
Alex
0bbf63ee0e Merge pull request 'update rusqlite and snapshot using VACUUM INTO' (#1164) from update-rusqlite into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1164
2025-09-14 18:28:01 +00:00
Alex
879d941d7b Merge pull request 'add garage repair clear-resync-queue (fix #1151)' (#1165) from clear-resync-queue into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1165
2025-09-14 17:50:41 +00:00
Alex Auvolat
d726cf0299 add garage repair clear-resync-queue (fix #1151) 2025-09-14 19:34:44 +02:00
Alex
0c7aeab6f8 Merge pull request 'garage_db: fix error handling logic (fix #1138)' (#1163) from fix-1138 into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1163
2025-09-14 17:26:08 +00:00
Alex Auvolat
5687fc0375 update rusqlite and snapshot using VACUUM INTO 2025-09-14 19:22:36 +02:00
Alex
97f1e9ab52 Merge pull request 'Add Plakar documentation (backup tools)' (#1119) from Lapineige/garage:Plakar_support into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1119
2025-09-14 16:08:36 +00:00
Lapineige
60b1d78b56 Add Plakar documentation 2025-09-14 18:07:49 +02:00
Alex Auvolat
4c895a7186 garage_db: fix error handling logic (fix #1138) 2025-09-14 18:03:31 +02:00
Alex
c3b5cbf212 Merge pull request 'fix panic when cluster_layout cannot be saved (fix #1150)' (#1158) from fix-1150 into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1158
2025-09-13 15:58:52 +00:00
Alex
57a467b5c0 Merge pull request 'Block manager: limit simultaneous block reads from disk' (#1157) from block-max-simultaneous-reads into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1157
2025-09-13 15:53:24 +00:00
Alex Auvolat
6cf6db5c61 fix panic when cluster_layout cannot be saved (fix #1150) 2025-09-13 17:49:25 +02:00
Alex Auvolat
d5a57e3e13 block: read_block: don't add not found blocks to resync queue 2025-09-13 17:38:23 +02:00
Alex Auvolat
5cf354acb4 block: maximum number of simultaneous reads 2025-09-13 17:38:06 +02:00
Alex
2b007ddea3 Merge pull request 'woodpecker: require the nix=enabled label' (#1152) from woodpecker-nix-flag into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1152
2025-09-04 09:10:10 +00:00
Alex Auvolat
c8599a8636 woodpecker: require the nix=enabled label 2025-09-04 11:06:46 +02:00
Alex
0b901bf291 Merge pull request 'garage_db: reduce frequency of sqlite snapshot progress log (fix #1129)' (#1146) from fix-1129 into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1146
2025-08-27 22:26:32 +00:00
Alex Auvolat
c8c20d6f47 garage_db: reduce frequency of sqlite snapshot progress log (fix #1129) 2025-08-28 00:07:35 +02:00
Alex
e5db610e4c Merge pull request 'K2V client: allow custom HTTP client' (#731) from k2v/shared_http_client into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/731
Reviewed-by: maximilien <me@mricher.fr>
2025-08-27 21:21:09 +00:00
Alex
65c6f8adea Merge pull request 'garage_db: refactor open function' (#1142) from factor-db-open into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1142
2025-08-27 21:10:59 +00:00
Alex Auvolat
54b9bf02a3 garage_db: refactor open function 2025-08-27 23:03:09 +02:00
Alex
469153233f Merge pull request 'garage_db: rename len to approximate_len as it is used for stats only' (#1141) from db-approximate-len into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1141
2025-08-27 20:44:50 +00:00
Alex Auvolat
90bba5889a garage_db: rename len to approximate_len as it is used for stats only 2025-08-27 21:23:45 +02:00
Alex
a64b567d43 Merge pull request 'Add experimental support for Fjall DB engine' (#906) from withings/garage:feat/fjall-db-engine into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/906
2025-08-27 19:09:40 +00:00
Alex Auvolat
6ea86db8cd document fjall db engine, remove flakey metadata_fsync implementation 2025-08-27 20:22:41 +02:00
Alex Auvolat
aa69c06f2b fix potential race condition and naming bug in fjall adapter 2025-08-27 20:22:38 +02:00
Alex Auvolat
a6c6c44310 nix: build and test fjall feature 2025-08-27 18:54:42 +02:00
Julien Kritter
96d7713915 Add support for an LSM-tree-based backend with Fjall 2025-08-27 18:54:34 +02:00
Alex
d64498c3d3 Merge pull request 'log access keys' (#1122) from 1686a/log-access-key into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1122
2025-08-27 16:18:16 +00:00
trinity-1686a
b340599e68 log access keys 2025-08-03 15:30:56 +02:00
Alex
5448012b27 Merge pull request 'Pixelfed_support' (#1118) from Lapineige/garage:Pixelfed_support into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1118
2025-08-02 15:03:57 +00:00
Alex
ce34d11a65 Merge pull request 'don't die on SIGHUP' (#1121) from 1686a/handle-sighup into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1121
2025-08-02 14:53:58 +00:00
Alex
8cb7623ebd Merge pull request 'handle ECONNABORTED' (#1120) from 1686a/handle-econnaborted into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1120
2025-08-02 14:53:45 +00:00
trinity-1686a
5469c95877 handle ECONNABORTED 2025-08-02 13:14:01 +02:00
trinity-1686a
f930c6f643 don't die on SIGHUP 2025-08-02 13:09:33 +02:00
Alex
afcb22bf16 Merge pull request 'Fix typo in peertube buckets names' (#1117) from Lapineige/garage:main into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1117
2025-08-02 08:27:01 +00:00
Lapineige
cc29a40d51 Actualiser doc/book/connect/apps/index.md 2025-08-01 21:35:15 +00:00
Lapineige
0f3f180c3e Merge branch 'main-v1' into main 2025-08-01 21:33:58 +00:00
Lapineige
70cf6004ae Fix typo in peertube buckets names 2025-08-01 21:32:59 +00:00
Alex
c7571ff89b Merge pull request 'Fix some unsoundness in lmdb adapter unsafe' (#1099) from krtab/garage:fix_some_ub into main-v1
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1099
2025-07-31 19:38:23 +00:00
Arthur Carcano
1b42919bf7 Fix some unsoundness in lmdb adapter unsafe 2025-07-25 23:33:51 +02:00
Alex
3f4ab3a4a3 Merge pull request 'Garage v1.2.0' (#1068) from rel-1.2.0 into main
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1068
2025-06-13 16:12:29 +00:00
Alex Auvolat
3a4afc04a9 cargo: update crossbeam-channel to avoid yanked version 2025-06-13 17:22:47 +02:00
Alex Auvolat
fbf03e9378 bump version to v1.2.0 2025-06-13 14:21:28 +02:00
Alex
9eb07d4c7b Merge pull request 'cli: mark block refs as deleted in garage block purge (fix #1055)' (#1067) from fix-1055 into main
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1067
2025-06-13 11:53:41 +00:00
Alex Auvolat
85ee4f5d8c cli: mark block refs as deleted in garage block purge 2025-06-13 13:52:02 +02:00
Alex
328072d122 Merge pull request 'put web error in a basic webpage' (#1064) from trinity-1686a/garage:1686a/non-xml-web-error into main
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1064
2025-06-12 06:06:38 +00:00
trinity-1686a
26bc807905 put web error in a basic webpage
before, it was a plain string, with an xml content type

this caused browsers to show very ugly and meaningless pages
2025-06-10 22:23:06 +02:00
Alex
a9f5f242b2 Merge pull request 'feat: add log to journald feature' (#1056) from ragazenta/garage:feat/tracing-journald into main
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1056
2025-06-10 18:38:23 +00:00
maximilien
ae98abca5c Merge pull request 'Add eddster2309/ansible-role-garage as deployment option' (#1057) from eddster2309/garage:main into main
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1057
Reviewed-by: maximilien <me@mricher.fr>
2025-06-08 11:56:31 +00:00
eddster2309
adfa44ad70 Add architecture support 2025-06-03 09:22:43 +00:00
eddster2309
47143b88ad Add eddster2309/ansible-role-garage as deployment option 2025-06-03 09:15:57 +00:00
Renjaya Raga Zenta
8843aa92fa
feat: add log to journald feature
The systemd-journald is used in most major Linux distros that use systemd.
This enables logging using the systemd-journald native protocol, instead
of just writing to stderr.
2025-06-02 11:55:27 +07:00
Alex
b601b3e46d Merge pull request 'documentation: Minor doc change to clarify why the capacity does not matter and how the zone name is used' (#1051) from ddxv/garage:docs-quick-start into main
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1051
2025-05-30 16:26:19 +00:00
Alex
a19d2f16e2 Merge pull request 'api: s3: implement get bucket acl' (#1045) from ragazenta/garage:feat/dummy-acl into main
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1045
2025-05-30 16:25:04 +00:00
trinity-1686a
fc8fc60f6d emit internal error when we detect race condition (#1053) (fix #1050)
i went with a `500`/`InternalError`/`Please try again.` because that is something i've seen AWS S3 report while developing other software, and i'm not convinced all clients would understand a 409 conflict properly (GET don't usually conflict)

Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1053
Co-authored-by: trinity-1686a <trinity@deuxfleurs.fr>
Co-committed-by: trinity-1686a <trinity@deuxfleurs.fr>
2025-05-30 16:24:12 +00:00
Alex
77079a1498 Merge pull request '[1.1.x] speed up UploadPartCopy' (#1047) from yuka/garage:uploadpartcopy-v1 into main
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1047
2025-05-30 16:22:35 +00:00
James O'Claire
2a4f729b57 Minor doc change to clarify why the capacity does not matter and how the zone name is used 2025-05-28 09:49:50 +08:00
Renjaya Raga Zenta
1b042e379e
api: s3: implement get bucket acl 2025-05-26 09:43:15 +07:00
Yureka
ffbce0f689 speed up UploadPartCopy
(cherry picked from commit db54bf96c7)
2025-05-23 20:36:32 +02:00
Alex
37e5621dde Merge pull request 'documentation updates' (#1046) from doc-updates into main
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1046
2025-05-23 15:05:19 +00:00
Alex Auvolat
6529ff379a documentation updates 2025-05-23 17:02:23 +02:00
Alex
a8d73682a4 Merge pull request 'more resilience to inconsistent alias states' (#989) from fix-bucket-aliases into main
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/989
2025-05-22 17:41:42 +00:00
Alex Auvolat
8654eb19bf implement repair procedure to fix inconsistent bucket aliases 2025-05-22 19:34:38 +02:00
maximilien
54ea412188 Merge pull request 'Add kubernetes CRD' (#994) from babykart/garage:k8s-crd into main
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/994
2025-05-22 17:15:56 +00:00
Alex Auvolat
2ade8c86f6 more resilience to inconsistent alias states 2025-05-22 19:12:05 +02:00
babykart
b15e2cbb6c Update Kubernetes cookbook
Signed-off-by: babykart <babykart@gmail.com>
2025-05-22 17:11:14 +00:00
babykart
0fd1b7342b Add Kubernetes CRD and the related kustomization
Signed-off-by: babykart <babykart@gmail.com>
2025-05-22 17:11:14 +00:00
Alex
be16bc7a05 Merge pull request 'Fix behavior of CopyObject wrt x-amz-website-redirect-location' (#1037) from Armael/garage:copy-website-redirect into main
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1037
2025-05-22 13:57:28 +00:00
Alex
bfaa1ca6b7 Merge pull request 'api: lifecycle: 404 if missing lifecycle config' (#1043) from ragazenta/garage:no-lifecycle-response into main
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1043
2025-05-22 13:56:52 +00:00
Alex
de8eeab4ad Merge pull request 'optionally support puny code (fix #273)' (#1042) from trinity-1686a/garage:1686a/punnycode into main
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1042
2025-05-22 12:49:10 +00:00
Renjaya Raga Zenta
ae3f7ee76c
api: lifecycle: 404 if missing lifecycle config 2025-05-22 19:33:54 +07:00
trinity-1686a
2dc3a6dbbe document allow_punycode configuration option 2025-05-22 14:08:06 +02:00
Armaël Guéneau
c6bc3f229b Fix behavior of CopyObject wrt x-amz-website-redirect-location 2025-05-22 14:03:11 +02:00
trinity-1686a
bba9202f31 add test for punycode 2025-05-19 20:36:03 +02:00
trinity-1686a
a605a80806 support punnycode in api/web endpoint 2025-05-19 18:11:55 +02:00
trinity-1686a
539af12d21 allow punnycode in bucket name 2025-05-19 18:07:04 +02:00
Alex
a2a9e3cec4 Merge pull request 'doc: Add systemd example to increase file descriptors limit' (#1023) from baptiste/garage:systemd_openfiles into main
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1023
2025-05-09 10:34:07 +00:00
Baptiste Jonglez
14274bc13c doc: Add systemd example to increase file descriptors limit 2025-05-08 10:27:53 +02:00
Alex
bf4691d98a Merge pull request 'Fix #1007: hint that region can be changed depending on cluster config' (#1015) from garage-1007-update-region-in-doc into main
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1015
2025-04-24 07:41:32 +00:00
Maximilien R.
ad151cb1dc Fix #1007: hint that region can be changed depending on cluster config 2025-04-23 23:30:16 +02:00
babykart
3c20984a08 helm-chart: Cosmetic changes
Signed-off-by: babykart <babykart@gmail.com>
2025-04-21 10:04:53 +00:00
babykart
e6e4e051a1 helm-chart: Add metadata_auto_snapshot_interval
Signed-off-by: babykart <babykart@gmail.com>
2025-04-21 10:04:53 +00:00
babykart
9b38cba6f3 helm-chart: Add livenessProbe & readinessProbe
Signed-off-by: babykart <babykart@gmail.com>
2025-04-21 10:04:53 +00:00
Alex
4ef954d176 Merge pull request 'Fix Docker run volume mappings' (#1012) from Zoob/garage:main into main
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1012
2025-04-19 20:05:16 +00:00
Zoob
02498a93d0 doc: fix Docker run volume mappings 2025-04-19 18:46:36 +00:00
Alex
4caad5425d Merge pull request 'metadata: Create compact LMDB snapshots' (#1008) from baptiste/garage:lmdb_compact_snapshot into main
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1008
2025-04-17 19:26:32 +00:00
Baptiste Jonglez
9ec3f8cc3c metadata: Create compact LMDB snapshots
See #1006

LMDB files never shrink, so we can end up with a large database that
contains a smaller amount of actual data.

Compacting the snapshots is an easy win: it will write faster to disk,
take less space, and if needed you can reimport an already-compacted
snapshot as the main database.
2025-04-12 23:18:50 +02:00
Quentin Dufour
8b35a946d9
Allow external HTTP client 2024-02-23 17:09:47 +01:00
172 changed files with 8115 additions and 16322 deletions

View file

@ -1,3 +1,6 @@
labels:
nix: "enabled"
when:
event:
- push
@ -9,27 +12,32 @@ when:
steps:
- name: check formatting
image: nixpkgs/nix:nixos-22.05
image: nixpkgs/nix:nixos-24.05
commands:
- nix-shell --attr devShell --run "cargo fmt -- --check"
- nix-build -j4 --attr flakePackages.fmt
- name: build
image: nixpkgs/nix:nixos-22.05
image: nixpkgs/nix:nixos-24.05
commands:
- nix-build -j4 --attr flakePackages.dev
- name: unit + func tests (lmdb)
image: nixpkgs/nix:nixos-22.05
image: nixpkgs/nix:nixos-24.05
commands:
- nix-build -j4 --attr flakePackages.tests-lmdb
- name: unit + func tests (sqlite)
image: nixpkgs/nix:nixos-22.05
image: nixpkgs/nix:nixos-24.05
commands:
- nix-build -j4 --attr flakePackages.tests-sqlite
- name: unit + func tests (fjall)
image: nixpkgs/nix:nixos-24.05
commands:
- nix-build -j4 --attr flakePackages.tests-fjall
- name: integration tests
image: nixpkgs/nix:nixos-22.05
image: nixpkgs/nix:nixos-24.05
commands:
- nix-build -j4 --attr flakePackages.dev
- nix-shell --attr ci --run ./script/test-smoke.sh || (cat /tmp/garage.log; false)

View file

@ -1,3 +1,6 @@
labels:
nix: "enabled"
when:
event:
- deployment
@ -8,7 +11,7 @@ depends_on:
steps:
- name: refresh-index
image: nixpkgs/nix:nixos-22.05
image: nixpkgs/nix:nixos-24.05
environment:
AWS_ACCESS_KEY_ID:
from_secret: garagehq_aws_access_key_id
@ -19,7 +22,7 @@ steps:
- nix-shell --attr ci --run "refresh_index"
- name: multiarch-docker
image: nixpkgs/nix:nixos-22.05
image: nixpkgs/nix:nixos-24.05
environment:
DOCKER_AUTH:
from_secret: docker_auth

View file

@ -1,3 +1,6 @@
labels:
nix: "enabled"
when:
event:
- deployment
@ -16,17 +19,17 @@ matrix:
steps:
- name: build
image: nixpkgs/nix:nixos-22.05
image: nixpkgs/nix:nixos-24.05
commands:
- nix-build --attr releasePackages.${ARCH} --argstr git_version ${CI_COMMIT_TAG:-$CI_COMMIT_SHA}
- name: check is static binary
image: nixpkgs/nix:nixos-22.05
image: nixpkgs/nix:nixos-24.05
commands:
- nix-shell --attr ci --run "./script/not-dynamic.sh result/bin/garage"
- name: integration tests
image: nixpkgs/nix:nixos-22.05
image: nixpkgs/nix:nixos-24.05
commands:
- nix-shell --attr ci --run ./script/test-smoke.sh || (cat /tmp/garage.log; false)
when:
@ -35,16 +38,8 @@ steps:
- matrix:
ARCH: i386
- name: upgrade tests from v1.0.0
image: nixpkgs/nix:nixos-22.05
commands:
- nix-shell --attr ci --run "./script/test-upgrade.sh v1.0.0 x86_64-unknown-linux-musl" || (cat /tmp/garage.log; false)
when:
- matrix:
ARCH: amd64
- name: upgrade tests from v0.8.4
image: nixpkgs/nix:nixos-22.05
- name: upgrade tests
image: nixpkgs/nix:nixos-24.05
commands:
- nix-shell --attr ci --run "./script/test-upgrade.sh v0.8.4 x86_64-unknown-linux-musl" || (cat /tmp/garage.log; false)
when:
@ -52,7 +47,7 @@ steps:
ARCH: amd64
- name: push static binary
image: nixpkgs/nix:nixos-22.05
image: nixpkgs/nix:nixos-24.05
environment:
TARGET: "${TARGET}"
AWS_ACCESS_KEY_ID:
@ -63,7 +58,7 @@ steps:
- nix-shell --attr ci --run "to_s3"
- name: docker build and publish
image: nixpkgs/nix:nixos-22.05
image: nixpkgs/nix:nixos-24.05
environment:
DOCKER_PLATFORM: "linux/${ARCH}"
CONTAINER_NAME: "dxflrs/${ARCH}_garage"

2008
Cargo.lock generated

File diff suppressed because it is too large Load diff

View file

@ -24,18 +24,18 @@ default-members = ["src/garage"]
# Internal Garage crates
format_table = { version = "0.1.1", path = "src/format-table" }
garage_api_common = { version = "2.0.0", path = "src/api/common" }
garage_api_admin = { version = "2.0.0", path = "src/api/admin" }
garage_api_s3 = { version = "2.0.0", path = "src/api/s3" }
garage_api_k2v = { version = "2.0.0", path = "src/api/k2v" }
garage_block = { version = "2.0.0", path = "src/block" }
garage_db = { version = "2.0.0", path = "src/db", default-features = false }
garage_model = { version = "2.0.0", path = "src/model", default-features = false }
garage_net = { version = "2.0.0", path = "src/net" }
garage_rpc = { version = "2.0.0", path = "src/rpc" }
garage_table = { version = "2.0.0", path = "src/table" }
garage_util = { version = "2.0.0", path = "src/util" }
garage_web = { version = "2.0.0", path = "src/web" }
garage_api_common = { version = "1.3.1", path = "src/api/common" }
garage_api_admin = { version = "1.3.1", path = "src/api/admin" }
garage_api_s3 = { version = "1.3.1", path = "src/api/s3" }
garage_api_k2v = { version = "1.3.1", path = "src/api/k2v" }
garage_block = { version = "1.3.1", path = "src/block" }
garage_db = { version = "1.3.1", path = "src/db", default-features = false }
garage_model = { version = "1.3.1", path = "src/model", default-features = false }
garage_net = { version = "1.3.1", path = "src/net" }
garage_rpc = { version = "1.3.1", path = "src/rpc" }
garage_table = { version = "1.3.1", path = "src/table" }
garage_util = { version = "1.3.1", path = "src/util" }
garage_web = { version = "1.3.1", path = "src/web" }
k2v-client = { version = "0.0.4", path = "src/k2v-client" }
# External crates from crates.io
@ -48,18 +48,15 @@ blake2 = "0.10"
bytes = "1.0"
bytesize = "1.1"
cfg-if = "1.0"
chrono = { version = "0.4", features = ["serde"] }
chrono = "0.4"
crc32fast = "1.4"
crc32c = "0.6"
crc64fast-nvme = "1.2"
crypto-common = "0.1"
err-derive = "0.3"
gethostname = "0.4"
git-version = "0.3.4"
hex = "0.4"
hexdump = "0.1"
hmac = "0.12"
idna = "0.5"
itertools = "0.12"
ipnet = "2.9.0"
lazy_static = "1.4"
@ -67,8 +64,8 @@ md-5 = "0.10"
mktemp = "0.5"
nix = { version = "0.29", default-features = false, features = ["fs"] }
nom = "7.1"
parking_lot = "0.12"
parse_duration = "2.1"
paste = "1.0"
pin-project = "1.0.12"
pnet_datalink = "0.34"
rand = "0.8"
@ -86,12 +83,14 @@ pretty_env_logger = "0.5"
structopt = { version = "0.3", default-features = false }
syslog-tracing = "0.3"
tracing = "0.1"
tracing-journald = "0.3.1"
tracing-subscriber = { version = "0.3", features = ["env-filter"] }
heed = { version = "0.11", default-features = false, features = ["lmdb"] }
rusqlite = "0.31.0"
rusqlite = "0.37"
r2d2 = "0.8"
r2d2_sqlite = "0.24"
r2d2_sqlite = "0.31"
fjall = "2.4"
async-compression = { version = "0.4", features = ["tokio", "zstd"] }
zstd = { version = "0.13", default-features = false }
@ -102,7 +101,6 @@ serde = { version = "1.0", default-features = false, features = ["derive", "rc"]
serde_bytes = "0.11"
serde_json = "1.0"
toml = { version = "0.8", default-features = false, features = ["parse"] }
utoipa = { version = "5.3.1", features = ["chrono"] }
# newer version requires rust edition 2021
k8s-openapi = { version = "0.21", features = ["v1_24"] }
@ -138,7 +136,7 @@ prometheus = "0.13"
aws-sigv4 = { version = "1.1", default-features = false }
hyper-rustls = { version = "0.26", default-features = false, features = ["http1", "http2", "ring", "rustls-native-certs"] }
log = "0.4"
thiserror = "1.0"
thiserror = "2.0"
# ---- used only as build / dev dependencies ----
assert-json-diff = "2.0"
@ -148,12 +146,8 @@ aws-smithy-runtime = { version = "1.8", default-features = false, features = ["t
aws-sdk-config = { version = "1.62", default-features = false }
aws-sdk-s3 = { version = "1.79", default-features = false, features = ["rt-tokio"] }
[profile.dev]
#lto = "thin" # disabled for now, adds 2-4 min to each CI build
lto = "off"
[profile.release]
lto = true
codegen-units = 1
opt-level = "s"
strip = true
lto = "thin"
codegen-units = 16
opt-level = 3
strip = "debuginfo"

View file

@ -1,7 +1,7 @@
<!DOCTYPE html>
<html>
<head>
<title>Garage adminstration API v0</title>
<title>Garage Adminstration API v0</title>
<!-- needed for adaptive design -->
<meta charset="utf-8"/>
<meta name="viewport" content="width=device-width, initial-scale=1">

View file

@ -1,7 +1,7 @@
<!DOCTYPE html>
<html>
<head>
<title>Garage adminstration API v1</title>
<title>Garage Adminstration API v0</title>
<!-- needed for adaptive design -->
<meta charset="utf-8"/>
<meta name="viewport" content="width=device-width, initial-scale=1">

View file

@ -1,24 +0,0 @@
<!DOCTYPE html>
<html>
<head>
<title>Garage adminstration API v2</title>
<!-- needed for adaptive design -->
<meta charset="utf-8"/>
<meta name="viewport" content="width=device-width, initial-scale=1">
<link href="./css/redoc.css" rel="stylesheet">
<!--
Redoc doesn't change outer page styles
-->
<style>
body {
margin: 0;
padding: 0;
}
</style>
</head>
<body>
<redoc spec-url='./garage-admin-v2.json'></redoc>
<script src="./redoc.standalone.js"> </script>
</body>
</html>

File diff suppressed because it is too large Load diff

View file

@ -12,7 +12,7 @@ In this section, we cover the following web applications:
| [Mastodon](#mastodon) | ✅ | Natively supported |
| [Matrix](#matrix) | ✅ | Tested with `synapse-s3-storage-provider` |
| [ejabberd](#ejabberd) | ✅ | `mod_s3_upload` |
| [Pixelfed](#pixelfed) | ❓ | Not yet tested |
| [Pixelfed](#pixelfed) | ✅ | Natively supported |
| [Pleroma](#pleroma) | ❓ | Not yet tested |
| [Lemmy](#lemmy) | ✅ | Supported with pict-rs |
| [Funkwhale](#funkwhale) | ❓ | Not yet tested |
@ -69,7 +69,7 @@ $CONFIG = array(
'hostname' => '127.0.0.1', // Can also be a domain name, eg. garage.example.com
'port' => 3900, // Put your reverse proxy port or your S3 API port
'use_ssl' => false, // Set it to true if you have a TLS enabled reverse proxy
'region' => 'garage', // Garage has only one region named "garage"
'region' => 'garage', // Garage default region is named "garage", edit according to your cluster config
'use_path_style' => true // Garage supports only path style, must be set to true
],
],
@ -135,7 +135,7 @@ bucket but doesn't also know the secret encryption key.
*Click on the picture to zoom*
Add a new external storage. Put what you want in "folder name" (eg. "shared"). Select "Amazon S3". Keep "Access Key" for the Authentication field.
In Configuration, put your bucket name (eg. nextcloud), the host (eg. 127.0.0.1), the port (eg. 3900 or 443), the region (garage). Tick the SSL box if you have put an HTTPS proxy in front of garage. You must tick the "Path access" box and you must leave the "Legacy authentication (v2)" box empty. Put your Key ID (eg. GK...) and your Secret Key in the last two input boxes. Finally click on the tick symbol on the right of your screen.
In Configuration, put your bucket name (eg. nextcloud), the host (eg. 127.0.0.1), the port (eg. 3900 or 443), the region ("garage" if you use the default, or the one your configured in your `garage.toml`). Tick the SSL box if you have put an HTTPS proxy in front of garage. You must tick the "Path access" box and you must leave the "Legacy authentication (v2)" box empty. Put your Key ID (eg. GK...) and your Secret Key in the last two input boxes. Finally click on the tick symbol on the right of your screen.
Now go to your "Files" app and a new "linked folder" has appeared with the name you chose earlier (eg. "shared").
@ -191,10 +191,10 @@ garage key create peertube-key
Keep the Key ID and the Secret key in a pad, they will be needed later.
We need two buckets, one for normal videos (named peertube-video) and one for webtorrent videos (named peertube-playlist).
We need two buckets, one for normal videos (named peertube-videos) and one for webtorrent videos (named peertube-playlists).
```bash
garage bucket create peertube-videos
garage bucket create peertube-playlist
garage bucket create peertube-playlists
```
Now we allow our key to read and write on these buckets:
@ -238,7 +238,7 @@ object_storage:
# Put localhost only if you have a garage instance running on that node
endpoint: 'http://localhost:3900' # or "garage.example.com" if you have TLS on port 443
# Garage supports only one region for now, named garage
# Garage default region is named "garage", edit according to your config
region: 'garage'
credentials:
@ -253,7 +253,7 @@ object_storage:
proxify_private_files: false
streaming_playlists:
bucket_name: 'peertube-playlist'
bucket_name: 'peertube-playlists'
# Keep it empty for our example
prefix: ''
@ -441,7 +441,7 @@ media_storage_providers:
store_synchronous: True # do we want to wait that the file has been written before returning?
config:
bucket: matrix # the name of our bucket, we chose matrix earlier
region_name: garage # only "garage" is supported for the region field
region_name: garage # "garage" by default, edit according to your cluster config
endpoint_url: http://localhost:3900 # the path to the S3 endpoint
access_key_id: "GKxxx" # your Key ID
secret_access_key: "xxxx" # your Secret Key

View file

@ -161,3 +161,49 @@ kopia repository validate-provider
You can then run all the standard kopia commands: `kopia snapshot create`, `kopia mount`...
Everything should work out-of-the-box.
## Plakar
Create your key and bucket on Garage server:
```bash
garage key create my-plakar-key
garage bucket create plakar-backups
garage bucket allow plakar-backups --read --write --key my-plakar-key
```
On Plakar server, add your Garage as a storage location:
```bash
plakar store add garageS3 s3://my-garage.tld/plakar-backups \
region=garage # Or as you've specified in garage.toml \
access_key=<Key ID from "garage key info my-plakar-key"> \
secret_access_key=<Secret key from "garage key info my-plakar-key">
```
Then create the repository.
```bash
plakar at @garageS3 create -plaintext # Unencrypted
# or
plakar at @garageS3 create #encrypted
```
If you encrypt your backups (Plakar default), you will need to define a strong passphrase. Do not forget to save your password safely. It will be needed to decrypt your backups.
After the repository has been created, check that everything works as expected (that might give an empty result as no file has been added yet, but no error message):
```bash
plakar at @garageS3 check
```
Now that everything is configure, you can use Garage as your backups storage. For instance sync it with a local backup storage:
```bash
$ plakar at ~/backups sync to @garageS3
```
Or list the S3 storage content:
```bash
$ plakar at @garageS3 ls
```
More information in Plakar documentation: https://www.plakar.io/docs/main/quickstart/

View file

@ -8,18 +8,18 @@ have published Ansible roles. We list them and compare them below.
## Comparison of Ansible roles
| Feature | [ansible-role-garage](#zorun-ansible-role-garage) | [garage-docker-ansible-deploy](#moan0s-garage-docker-ansible-deploy) |
|------------------------------------|---------------------------------------------|---------------------------------------------------------------|
| **Runtime** | Systemd | Docker |
| **Target OS** | Any Linux | Any Linux |
| **Architecture** | amd64, arm64, i686 | amd64, arm64 |
| **Additional software** | None | Traefik |
| **Automatic node connection** | ❌ | ✅ |
| **Layout management** | ❌ | ✅ |
| **Manage buckets & keys** | ❌ | ✅ (basic) |
| **Allow custom Garage config** | ✅ | ❌ |
| **Facilitate Garage upgrades** | ✅ | ❌ |
| **Multiple instances on one host** | ✅ | ✅ |
| Feature | [ansible-role-garage](#zorun-ansible-role-garage) | [garage-docker-ansible-deploy](#moan0s-garage-docker-ansible-deploy) | [eddster ansible-role-garage](#eddster-ansible-role-garage) |
|------------------------------------|---------------------------------------------|---------------------------------------------------------------|---------------------------------|
| **Runtime** | Systemd | Docker | Systemd |
| **Target OS** | Any Linux | Any Linux | Any Linux |
| **Architecture** | amd64, arm64, i686 | amd64, arm64 | arm64, arm, 386, amd64 |
| **Additional software** | None | Traefik | Ngnix and Keepalived (optional) |
| **Automatic node connection** | ❌ | ✅ | ✅ |
| **Layout management** | ❌ | ✅ | ✅ |
| **Manage buckets & keys** | ❌ | ✅ (basic) | ✅ |
| **Allow custom Garage config** | ✅ | ❌ | ❌ |
| **Facilitate Garage upgrades** | ✅ | ❌ | ✅ |
| **Multiple instances on one host** | ✅ | ✅ | ❌ |
## zorun/ansible-role-garage
@ -49,3 +49,15 @@ structured DNS names, etc).
As a result, this role makes it easier to start with Garage on Ansible,
but is less flexible.
## eddster2309/ansible-role-garage
[Source code](https://github.com/eddster2309/ansible-role-garage), [Ansible galaxy](https://galaxy.ansible.com/ui/standalone/roles/eddster2309/garage/)
This role is a opinionated but customisable role using the official Garage
static binaries and only requires Systemd. As such it should work on any
Linux based host. It includes all the nesscary configuration to
automatically setup a clustered Garage deployment. Most Garage
configuration options are exposed through Ansible variables so while you
can't provide a custom config you can get very close. It can optionally
installed a HA nginx deployment with Keepalived.

View file

@ -15,9 +15,10 @@ Alpine Linux repositories (available since v3.17):
apk add garage
```
The default configuration file is installed to `/etc/garage.toml`. You can run
Garage using: `rc-service garage start`. If you don't specify `rpc_secret`, it
will be automatically replaced with a random string on the first start.
The default configuration file is installed to `/etc/garage/garage.toml`. You can run
Garage using: `rc-service garage start`.
If you don't specify `rpc_secret`, it will be automatically replaced with a random string on the first start.
Please note that this package is built without Consul discovery, Kubernetes
discovery, OpenTelemetry exporter, and K2V features (K2V will be enabled once
@ -26,7 +27,7 @@ it's stable).
## Arch Linux
Garage is available in the [AUR](https://aur.archlinux.org/packages/garage).
Garage is available in the official repositories under [extra](https://archlinux.org/packages/extra/x86_64/garage).
## FreeBSD

View file

@ -11,7 +11,7 @@ Firstly clone the repository:
```bash
git clone https://git.deuxfleurs.fr/Deuxfleurs/garage
cd garage/scripts/helm
cd garage/script/helm
```
Deploy with default options:
@ -26,6 +26,13 @@ Or deploy with custom values:
helm install --create-namespace --namespace garage garage ./garage -f values.override.yaml
```
If you want to manage the CustomRessourceDefinition used by garage for its `kubernetes_discovery` outside of the helm chart, add `garage.kubernetesSkipCrd: true` to your custom values and use the kustomization before deploying the helm chart:
```bash
kubectl apply -k ../k8s/crd
helm install --create-namespace --namespace garage garage ./garage -f values.override.yaml
```
After deploying, cluster layout must be configured manually as described in [Creating a cluster layout](@/documentation/quick-start/_index.md#creating-a-cluster-layout). Use the following command to access garage CLI:
```bash

View file

@ -96,14 +96,14 @@ to store 2 TB of data in total.
## Get a Docker image
Our docker image is currently named `dxflrs/garage` and is stored on the [Docker Hub](https://hub.docker.com/r/dxflrs/garage/tags?page=1&ordering=last_updated).
We encourage you to use a fixed tag (eg. `v2.0.0`) and not the `latest` tag.
For this example, we will use the latest published version at the time of the writing which is `v2.0.0` but it's up to you
We encourage you to use a fixed tag (eg. `v1.3.0`) and not the `latest` tag.
For this example, we will use the latest published version at the time of the writing which is `v1.3.0` but it's up to you
to check [the most recent versions on the Docker Hub](https://hub.docker.com/r/dxflrs/garage/tags?page=1&ordering=last_updated).
For example:
```
sudo docker pull dxflrs/garage:v2.0.0
sudo docker pull dxflrs/garage:v1.3.0
```
## Deploying and configuring Garage
@ -171,7 +171,7 @@ docker run \
-v /etc/garage.toml:/etc/garage.toml \
-v /var/lib/garage/meta:/var/lib/garage/meta \
-v /var/lib/garage/data:/var/lib/garage/data \
dxflrs/garage:v2.0.0
dxflrs/garage:v1.3.0
```
With this command line, Garage should be started automatically at each boot.
@ -185,7 +185,7 @@ If you want to use `docker-compose`, you may use the following `docker-compose.y
version: "3"
services:
garage:
image: dxflrs/garage:v2.0.0
image: dxflrs/garage:v1.3.0
network_mode: "host"
restart: unless-stopped
volumes:

View file

@ -28,6 +28,7 @@ StateDirectory=garage
DynamicUser=true
ProtectHome=true
NoNewPrivileges=true
LimitNOFILE=42000
[Install]
WantedBy=multi-user.target

View file

@ -129,10 +129,10 @@ docker run \
-d \
--name garaged \
-p 3900:3900 -p 3901:3901 -p 3902:3902 -p 3903:3903 \
-v /etc/garage.toml:/path/to/garage.toml \
-v /var/lib/garage/meta:/path/to/garage/meta \
-v /var/lib/garage/data:/path/to/garage/data \
dxflrs/garage:v2.0.0
-v /path/to/garage.toml:/etc/garage.toml \
-v /path/to/garage/meta:/var/lib/garage/meta \
-v /path/to/garage/data:/var/lib/garage/data \
dxflrs/garage:v1.3.0
```
Under Linux, you can substitute `--network host` for `-p 3900:3900 -p 3901:3901 -p 3902:3902 -p 3903:3903`
@ -182,11 +182,12 @@ ID Hostname Address Tag Zone Capacit
## Creating a cluster layout
Creating a cluster layout for a Garage deployment means informing Garage
of the disk space available on each node of the cluster
as well as the zone (e.g. datacenter) each machine is located in.
of the disk space available on each node of the cluster, `-c`,
as well as the name of the zone (e.g. datacenter), `-z`, each machine is located in.
For our test deployment, we are using only one node. The way in which we configure
it does not matter, you can simply write:
For our test deployment, we are have only one node with zone named `dc1` and a
capacity of `1G`, though the capacity is ignored for a single node deployment
and can be changed later when adding new nodes.
```bash
garage layout assign -z dc1 -c 1G <node_id>

View file

@ -24,7 +24,8 @@ db_engine = "lmdb"
block_size = "1M"
block_ram_buffer_max = "256MiB"
block_max_concurrent_reads = 16
block_max_concurrent_writes_per_request =10
lmdb_map_size = "1T"
compression_level = 1
@ -46,6 +47,7 @@ bootstrap_peers = [
"212fd62eeaca72c122b45a7f4fa0f55e012aa5e24ac384a72a3016413fa724ff@[fc00:F::1]:3901",
]
allow_punycode = false
[consul_discovery]
api = "catalog"
@ -80,7 +82,6 @@ add_host_to_metrics = true
[admin]
api_bind_addr = "0.0.0.0:3903"
metrics_token = "BCAdFjoa9G0KJR0WXnHHm7fs1ZAbfpI8iIZ+Z/a2NgI="
metrics_require_token = true
admin_token = "UkLeGWEvHnXBqnueR3ISEMWpOnm40jH2tM2HnnL/0F4="
trace_sink = "http://localhost:4317"
```
@ -93,29 +94,32 @@ The following gives details about each available configuration option.
[Environment variables](#env_variables).
Top-level configuration options:
Top-level configuration options, in alphabetical order:
[`allow_punycode`](#allow_punycode),
[`allow_world_readable_secrets`](#allow_world_readable_secrets),
[`block_max_concurrent_reads`](`block_max_concurrent_reads),
[`block_ram_buffer_max`](#block_ram_buffer_max),
[`block_max_concurrent_writes_per_request`](#block_max_concurrent_writes_per_request),
[`block_size`](#block_size),
[`bootstrap_peers`](#bootstrap_peers),
[`compression_level`](#compression_level),
[`consistency_mode`](#consistency_mode),
[`data_dir`](#data_dir),
[`data_fsync`](#data_fsync),
[`db_engine`](#db_engine),
[`disable_scrub`](#disable_scrub),
[`use_local_tz`](#use_local_tz),
[`lmdb_map_size`](#lmdb_map_size),
[`metadata_auto_snapshot_interval`](#metadata_auto_snapshot_interval),
[`metadata_dir`](#metadata_dir),
[`metadata_fsync`](#metadata_fsync),
[`metadata_snapshots_dir`](#metadata_snapshots_dir),
[`replication_factor`](#replication_factor),
[`consistency_mode`](#consistency_mode),
[`rpc_bind_addr`](#rpc_bind_addr),
[`rpc_bind_outgoing`](#rpc_bind_outgoing),
[`rpc_public_addr`](#rpc_public_addr),
[`rpc_public_addr_subnet`](#rpc_public_addr_subnet)
[`rpc_secret`/`rpc_secret_file`](#rpc_secret).
[`rpc_secret`/`rpc_secret_file`](#rpc_secret),
[`use_local_tz`](#use_local_tz).
The `[consul_discovery]` section:
[`api`](#consul_api),
@ -146,20 +150,23 @@ The `[s3_web]` section:
The `[admin]` section:
[`api_bind_addr`](#admin_api_bind_addr),
[`metrics_require_token`](#admin_metrics_require_token),
[`metrics_token`/`metrics_token_file`](#admin_metrics_token),
[`admin_token`/`admin_token_file`](#admin_token),
[`trace_sink`](#admin_trace_sink),
### Environment variables {#env_variables}
The following configuration parameter must be specified as an environment
variable, it does not exist in the configuration file:
The following configuration parameters must be specified as environment variables,
they do not exist in the configuration file:
- `GARAGE_LOG_TO_SYSLOG` (since `v0.9.4`): set this to `1` or `true` to make the
Garage daemon send its logs to `syslog` (using the libc `syslog` function)
instead of printing to stderr.
- `GARAGE_LOG_TO_JOURNALD` (since `v1.2.0`): set this to `1` or `true` to make the
Garage daemon send its logs to `journald` (using the native protocol of `systemd-journald`)
instead of printing to stderr.
The following environment variables can be used to override the corresponding
values in the configuration file:
@ -171,7 +178,7 @@ values in the configuration file:
### Top-level configuration options
#### `replication_factor` {#replication_factor}
#### `replication_factor` (since `v1.0.0`) {#replication_factor}
The replication factor can be any positive integer smaller or equal the node count in your cluster.
The chosen replication factor has a big impact on the cluster's failure tolerancy and performance characteristics.
@ -219,7 +226,7 @@ is in progress. In theory, no data should be lost as rebalancing is a
routine operation for Garage, although we cannot guarantee you that everything
will go right in such an extreme scenario.
#### `consistency_mode` {#consistency_mode}
#### `consistency_mode` (since `v1.0.0`) {#consistency_mode}
The consistency mode setting determines the read and write behaviour of your cluster.
@ -329,6 +336,7 @@ Since `v0.8.0`, Garage can use alternative storage backends as follows:
| --------- | ----------------- | ------------- |
| [LMDB](https://www.symas.com/lmdb) (since `v0.8.0`, default since `v0.9.0`) | `"lmdb"` | `<metadata_dir>/db.lmdb/` |
| [Sqlite](https://sqlite.org) (since `v0.8.0`) | `"sqlite"` | `<metadata_dir>/db.sqlite` |
| [Fjall](https://github.com/fjall-rs/fjall) (**experimental support** since `v1.3.0`) | `"fjall"` | `<metadata_dir>/db.fjall/` |
| [Sled](https://sled.rs) (old default, removed since `v1.0`) | `"sled"` | `<metadata_dir>/db/` |
Sled was supported until Garage v0.9.x, and was removed in Garage v1.0.
@ -365,6 +373,14 @@ LMDB works very well, but is known to have the following limitations:
so it is not the best choice for high-performance storage clusters,
but it should work fine in many cases.
- Fjall: a storage engine based on LSM trees, which theoretically allow for
higher write throughput than other storage engines that are based on B-trees.
Using Fjall could potentially improve Garage's performance significantly in
write-heavy workloads. **Support for Fjall is experimental at this point**,
we have added it to Garage for evaluation purposes only. **Do not use it for
production-critical workloads.**
It is possible to convert Garage's metadata directory from one format to another
using the `garage convert-db` command, which should be used as follows:
@ -402,6 +418,7 @@ Here is how this option impacts the different database engines:
|----------|------------------------------------|-------------------------------|
| Sqlite | `PRAGMA synchronous = OFF` | `PRAGMA synchronous = NORMAL` |
| LMDB | `MDB_NOMETASYNC` + `MDB_NOSYNC` | `MDB_NOMETASYNC` |
| Fjall | default options | not supported |
Note that the Sqlite database is always ran in `WAL` mode (`PRAGMA journal_mode = WAL`).
@ -508,6 +525,37 @@ node.
The default value is 256MiB.
#### `block_max_concurrent_reads` (since `v1.3.0` / `v2.1.0`) {#block_max_concurrent_reads}
The maximum number of blocks (individual files in the data directory) open
simultaneously for reading.
Reducing this number does not limit the number of data blocks that can be
transferred through the network simultaneously. This mechanism was just added
as a backpressure mechanism for HDD read speed: it helps avoid a situation
where too many requests are coming in and Garage is reading too many block
files simultaneously, thus not making timely progress on any of the reads.
When a request to read a data block comes in through the network, the requests
awaits for one of the `block_max_concurrent_reads` slots to be available
(internally implemented using a Semaphore object). Once it acquired a read
slot, it reads the entire block file to RAM and frees the slot as soon as the
block file is finished reading. Only after the slot is released will the
block's data start being transferred over the network. If the request fails to
acquire a reading slot wihtin 15 seconds, it fails with a timeout error.
Timeout events can be monitored through the `block_read_semaphore_timeouts`
metric in Prometheus: a non-zero number of such events indicates an I/O
bottleneck on HDD read speed.
#### `block_max_concurrent_writes_per_request` (since `v2.1.0`) {#block_max_concurrent_writes_per_request}
This parameter is designed to adapt to the concurrent write performance of
different storage media.Maximum number of parallel block writes per put request
Higher values improve throughput but increase memory usage.
Default: 3, Recommended: 10-30 for NVMe, 3-10 for HDD
#### `lmdb_map_size` {#lmdb_map_size}
This parameters can be used to set the map size used by LMDB,
@ -606,7 +654,7 @@ be obtained by running `garage node id` and then included directly in the
key will be returned by `garage node id` and you will have to add the IP
yourself.
### `allow_world_readable_secrets` or `GARAGE_ALLOW_WORLD_READABLE_SECRETS` (env) {#allow_world_readable_secrets}
#### `allow_world_readable_secrets` or `GARAGE_ALLOW_WORLD_READABLE_SECRETS` (env) {#allow_world_readable_secrets}
Garage checks the permissions of your secret files to make sure they're not
world-readable. In some cases, the check might fail and consider your files as
@ -618,6 +666,13 @@ permission verification.
Alternatively, you can set the `GARAGE_ALLOW_WORLD_READABLE_SECRETS`
environment variable to `true` to bypass the permissions check.
#### `allow_punycode` {#allow_punycode}
Allow creating buckets with names containing punycode. When used for buckets served
as websites, this allows using almost any unicode character in the domain name.
Default to `false`.
### The `[consul_discovery]` section
Garage supports discovering other nodes of the cluster using Consul. For this
@ -769,34 +824,10 @@ See [administration API reference](@/documentation/reference-manual/admin-api.md
Alternatively, since `v0.8.5`, a path can be used to create a unix socket. Note that for security reasons,
the socket will have 0220 mode. Make sure to set user and group permissions accordingly.
#### `admin_token`, `admin_token_file` or `GARAGE_ADMIN_TOKEN`, `GARAGE_ADMIN_TOKEN_FILE` (env) {#admin_token}
The token for accessing all administration functions on the admin endpoint,
with the exception of the metrics endpoint (see `metrics_token`).
You can use any random string for this value. We recommend generating a random
token with `openssl rand -base64 32`.
For Garage version earlier than `v2.0`, if this token is not set,
access to these endpoints is disabled entirely.
Since Garage `v2.0`, additional admin API tokens can be defined dynamically
in your Garage cluster using administration commands. This new admin token system
is more flexible since it allows admin tokens to have an expiration date,
and to have a scope restricted to certain admin API functions. If `admin_token`
is set, it behaves as an admin token without expiration and with full scope.
Otherwise, only admin API tokens defined dynamically can be used.
`admin_token` was introduced in Garage `v0.7.2`.
`admin_token_file` and the `GARAGE_ADMIN_TOKEN` environment variable are supported since Garage `v0.8.2`.
`GARAGE_ADMIN_TOKEN_FILE` is supported since `v0.8.5` / `v0.9.1`.
#### `metrics_token`, `metrics_token_file` or `GARAGE_METRICS_TOKEN`, `GARAGE_METRICS_TOKEN_FILE` (env) {#admin_metrics_token}
The token for accessing the Prometheus metrics endpoint (`/metrics`).
If this token is not set, and unless `metrics_require_token` is set to `true`,
the metrics endpoint can be accessed without access control.
The token for accessing the Metrics endpoint. If this token is not set, the
Metrics endpoint can be accessed without access control.
You can use any random string for this value. We recommend generating a random token with `openssl rand -base64 32`.
@ -805,12 +836,17 @@ You can use any random string for this value. We recommend generating a random t
`GARAGE_METRICS_TOKEN_FILE` is supported since `v0.8.5` / `v0.9.1`.
#### `metrics_require_token` (since `v2.0.0`) {#admin_metrics_require_token}
#### `admin_token`, `admin_token_file` or `GARAGE_ADMIN_TOKEN`, `GARAGE_ADMIN_TOKEN_FILE` (env) {#admin_token}
If this is set to `true`, accessing the metrics endpoint will always require
an access token. Valid tokens include the `metrics_token` if it is set,
and admin API token defined dynamicaly in Garage which have
the `Metrics` endpoint in their scope.
The token for accessing all of the other administration endpoints. If this
token is not set, access to these endpoints is disabled entirely.
You can use any random string for this value. We recommend generating a random token with `openssl rand -base64 32`.
`admin_token` was introduced in Garage `v0.7.2`.
`admin_token_file` and the `GARAGE_ADMIN_TOKEN` environment variable are supported since Garage `v0.8.2`.
`GARAGE_ADMIN_TOKEN_FILE` is supported since `v0.8.5` / `v0.9.1`.
#### `trace_sink` {#admin_trace_sink}

View file

@ -23,17 +23,17 @@ Feel free to open a PR to suggest fixes this table. Minio is missing because the
- 2022-05-25 - Many Ceph S3 endpoints are not documented but implemented. Following a notification from the Ceph community, we added them.
## High-level features
| Feature | Garage | [Openstack Swift](https://docs.openstack.org/swift/latest/s3_compat.html) | [Ceph Object Gateway](https://docs.ceph.com/en/latest/radosgw/s3/) | [Riak CS](https://docs.riak.com/riak/cs/2.1.1/references/apis/storage/s3/index.html) | [OpenIO](https://docs.openio.io/latest/source/arch-design/s3_compliancy.html) |
|------------------------------|----------------------------------|-----------------|---------------|---------|-----|
| [signature v2](https://docs.aws.amazon.com/general/latest/gr/signature-version-2.html) (deprecated) | ❌ Missing | ✅ | ✅ | ✅ | ✅ |
| [signature v2](https://docs.aws.amazon.com/AmazonS3/latest/API/Appendix-Sigv2.html) (deprecated) | ❌ Missing | ✅ | ✅ | ✅ | ✅ |
| [signature v4](https://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-authenticating-requests.html) | ✅ Implemented | ✅ | ✅ | ❌ | ✅ |
| [URL path-style](https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html#path-style-access) (eg. `host.tld/bucket/key`) | ✅ Implemented | ✅ | ✅ | ❓| ✅ |
| [URL vhost-style](https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html#virtual-hosted-style-access) URL (eg. `bucket.host.tld/key`) | ✅ Implemented | ❌| ✅| ✅ | ✅ |
| [Presigned URLs](https://docs.aws.amazon.com/AmazonS3/latest/userguide/ShareObjectPreSignedURL.html) | ✅ Implemented | ❌| ✅ | ✅ | ✅(❓) |
| [SSE-C encryption](https://docs.aws.amazon.com/AmazonS3/latest/userguide/ServerSideEncryptionCustomerKeys.html) | ✅ Implemented | ❓ | ✅ | ❌ | ✅ |
| [Bucket versioning](https://docs.aws.amazon.com/AmazonS3/latest/userguide/Versioning.html) | ❌ Missing | ✅ | ✅ | ❌ | ✅ |
*Note:* OpenIO does not says if it supports presigned URLs. Because it is part
of signature v4 and they claim they support it without additional precisions,

View file

@ -13,9 +13,8 @@ We will bump the version numbers prefixed to each API endpoint each time the syn
or semantics change, meaning that code that relies on these endpoints will break
when changes are introduced.
The Garage administration API was introduced in version 0.7.2, and was
changed several times.
This document applies only to the Garage v2 API (starting with Garage v2.0.0).
The Garage administration API was introduced in version 0.7.2, this document
does not apply to older versions of Garage.
## Access control
@ -53,28 +52,34 @@ Returns an HTTP status 200 if the node is ready to answer user's requests,
and an HTTP status 503 (Service Unavailable) if there are some partitions
for which a quorum of nodes is not available.
A simple textual message is also returned in a body with content-type `text/plain`.
See `/v2/GetClusterHealth` for an API that also returns JSON output.
### Other special endpoints
#### CheckDomain `GET /check?domain=<domain>`
Checks whether this Garage cluster serves a website for domain `<domain>`.
Returns HTTP 200 Ok if yes, or HTTP 4xx if no website is available for this domain.
See `/v1/health` for an API that also returns JSON output.
### Cluster operations
#### GetClusterStatus `GET /v2/GetClusterStatus`
#### GetClusterStatus `GET /v1/status`
Returns the cluster's current status in JSON, including:
- ID of the node being queried and its version of the Garage daemon
- Live nodes
- Currently configured cluster layout
- Staged changes to the cluster layout
Example response body:
```json
{
"node": "b10c110e4e854e5aa3f4637681befac755154b20059ec163254ddbfae86b09df",
"garageVersion": "v1.3.0",
"garageFeatures": [
"k2v",
"lmdb",
"sqlite",
"metrics",
"bundled-libs"
],
"rustVersion": "1.68.0",
"dbEngine": "LMDB (using Heed crate)",
"layoutVersion": 5,
"nodes": [
{
@ -164,7 +169,7 @@ Example response body:
}
```
#### GetClusterHealth `GET /v2/GetClusterHealth`
#### GetClusterHealth `GET /v1/health`
Returns the cluster's current health in JSON format, with the following variables:
@ -197,7 +202,7 @@ Example response body:
}
```
#### ConnectClusterNodes `POST /v2/ConnectClusterNodes`
#### ConnectClusterNodes `POST /v1/connect`
Instructs this Garage node to connect to other Garage nodes at specified addresses.
@ -227,7 +232,7 @@ Example response:
]
```
#### GetClusterLayout `GET /v2/GetClusterLayout`
#### GetClusterLayout `GET /v1/layout`
Returns the cluster's current layout in JSON, including:
@ -288,7 +293,7 @@ Example response body:
}
```
#### UpdateClusterLayout `POST /v2/UpdateClusterLayout`
#### UpdateClusterLayout `POST /v1/layout`
Send modifications to the cluster layout. These modifications will
be included in the staged role changes, visible in subsequent calls
@ -325,7 +330,7 @@ This returns the new cluster layout with the proposed staged changes,
as returned by GetClusterLayout.
#### ApplyClusterLayout `POST /v2/ApplyClusterLayout`
#### ApplyClusterLayout `POST /v1/layout/apply`
Applies to the cluster the layout changes currently registered as
staged layout changes.
@ -345,11 +350,23 @@ existing layout in the cluster.
This returns the message describing all the calculations done to compute the new
layout, as well as the description of the layout as returned by GetClusterLayout.
#### RevertClusterLayout `POST /v2/RevertClusterLayout`
#### RevertClusterLayout `POST /v1/layout/revert`
Clears all of the staged layout changes.
This requests contains an empty body.
Request body format:
```json
{
"version": 13
}
```
Reverting the staged changes is done by incrementing the version number
and clearing the contents of the staged change list.
Similarly to the CLI, the body must include the incremented
version number, which MUST be 1 + the value of the currently
existing layout in the cluster.
This returns the new cluster layout with all changes reverted,
as returned by GetClusterLayout.
@ -357,7 +374,7 @@ as returned by GetClusterLayout.
### Access key operations
#### ListKeys `GET /v2/ListKeys`
#### ListKeys `GET /v1/key`
Returns all API access keys in the cluster.
@ -376,8 +393,8 @@ Example response:
]
```
#### GetKeyInfo `GET /v2/GetKeyInfo?id=<acces key id>`
#### GetKeyInfo `GET /v2/GetKeyInfo?search=<pattern>`
#### GetKeyInfo `GET /v1/key?id=<acces key id>`
#### GetKeyInfo `GET /v1/key?search=<pattern>`
Returns information about the requested API access key.
@ -451,7 +468,7 @@ Example response:
}
```
#### CreateKey `POST /v2/CreateKey`
#### CreateKey `POST /v1/key`
Creates a new API access key.
@ -466,7 +483,7 @@ Request body format:
This returns the key info, including the created secret key,
in the same format as the result of GetKeyInfo.
#### ImportKey `POST /v2/ImportKey`
#### ImportKey `POST /v1/key/import`
Imports an existing API key.
This will check that the imported key is in the valid format, i.e.
@ -484,7 +501,7 @@ Request body format:
This returns the key info in the same format as the result of GetKeyInfo.
#### UpdateKey `POST /v2/UpdateKey?id=<acces key id>`
#### UpdateKey `POST /v1/key?id=<acces key id>`
Updates information about the specified API access key.
@ -506,14 +523,14 @@ The possible flags in `allow` and `deny` are: `createBucket`.
This returns the key info in the same format as the result of GetKeyInfo.
#### DeleteKey `POST /v2/DeleteKey?id=<acces key id>`
#### DeleteKey `DELETE /v1/key?id=<acces key id>`
Deletes an API access key.
### Bucket operations
#### ListBuckets `GET /v2/ListBuckets`
#### ListBuckets `GET /v1/bucket`
Returns all storage buckets in the cluster.
@ -555,8 +572,8 @@ Example response:
]
```
#### GetBucketInfo `GET /v2/GetBucketInfo?id=<bucket id>`
#### GetBucketInfo `GET /v2/GetBucketInfo?globalAlias=<alias>`
#### GetBucketInfo `GET /v1/bucket?id=<bucket id>`
#### GetBucketInfo `GET /v1/bucket?globalAlias=<alias>`
Returns information about the requested storage bucket.
@ -599,7 +616,7 @@ Example response:
}
```
#### CreateBucket `POST /v2/CreateBucket`
#### CreateBucket `POST /v1/bucket`
Creates a new storage bucket.
@ -639,7 +656,7 @@ or no alias at all.
Technically, you can also specify both `globalAlias` and `localAlias` and that would create
two aliases, but I don't see why you would want to do that.
#### UpdateBucket `POST /v2/UpdateBucket?id=<bucket id>`
#### UpdateBucket `PUT /v1/bucket?id=<bucket id>`
Updates configuration of the given bucket.
@ -671,38 +688,16 @@ In `quotas`: new values of `maxSize` and `maxObjects` must both be specified, or
to remove the quotas. An absent value will be considered the same as a `null`. It is not possible
to change only one of the two quotas.
#### DeleteBucket `POST /v2/DeleteBucket?id=<bucket id>`
#### DeleteBucket `DELETE /v1/bucket?id=<bucket id>`
Deletes a storage bucket. A bucket cannot be deleted if it is not empty.
Warning: this will delete all aliases associated with the bucket!
#### CleanupIncompleteUploads `POST /v2/CleanupIncompleteUploads`
Cleanup all incomplete uploads in a bucket that are older than a specified number
of seconds.
Request body format:
```json
{
"bucketId": "e6a14cd6a27f48684579ec6b381c078ab11697e6bc8513b72b2f5307e25fff9b",
"olderThanSecs": 3600
}
```
Response format
```json
{
"uploadsDeleted": 12
}
```
### Operations on permissions for keys on buckets
#### AllowBucketKey `POST /v2/AllowBucketKey`
#### BucketAllowKey `POST /v1/bucket/allow`
Allows a key to do read/write/owner operations on a bucket.
@ -723,7 +718,7 @@ Request body format:
Flags in `permissions` which have the value `true` will be activated.
Other flags will remain unchanged.
#### DenyBucketKey `POST /v2/DenyBucketKey`
#### BucketDenyKey `POST /v1/bucket/deny`
Denies a key from doing read/write/owner operations on a bucket.
@ -747,35 +742,19 @@ Other flags will remain unchanged.
### Operations on bucket aliases
#### AddBucketAlias `POST /v2/AddBucketAlias`
#### GlobalAliasBucket `PUT /v1/bucket/alias/global?id=<bucket id>&alias=<global alias>`
Creates an alias for a bucket in the namespace of a specific access key.
To create a global alias, specify the `globalAlias` field.
To create a local alias, specify the `localAlias` and `accessKeyId` fields.
Empty body. Creates a global alias for a bucket.
Request body format:
#### GlobalUnaliasBucket `DELETE /v1/bucket/alias/global?id=<bucket id>&alias=<global alias>`
```json
{
"bucketId": "e6a14cd6a27f48684579ec6b381c078ab11697e6bc8513b72b2f5307e25fff9b",
"globalAlias": "my-bucket"
}
```
Removes a global alias for a bucket.
or:
#### LocalAliasBucket `PUT /v1/bucket/alias/local?id=<bucket id>&accessKeyId=<access key ID>&alias=<local alias>`
```json
{
"bucketId": "e6a14cd6a27f48684579ec6b381c078ab11697e6bc8513b72b2f5307e25fff9b",
"accessKeyId": "GK31c2f218a2e44f485b94239e",
"localAlias": "my-bucket"
}
```
Empty body. Creates a local alias for a bucket in the namespace of a specific access key.
#### RemoveBucketAlias `POST /v2/RemoveBucketAlias`
#### LocalUnaliasBucket `DELETE /v1/bucket/alias/local?id=<bucket id>&accessKeyId<access key ID>&alias=<local alias>`
Removes an alias for a bucket in the namespace of a specific access key.
To remove a global alias, specify the `globalAlias` field.
To remove a local alias, specify the `localAlias` and `accessKeyId` fields.
Removes a local alias for a bucket in the namespace of a specific access key.
Request body format: same as AddBucketAlias.

16
flake.lock generated
View file

@ -50,17 +50,17 @@
},
"nixpkgs": {
"locked": {
"lastModified": 1736692550,
"narHash": "sha256-7tk8xH+g0sJkKLTJFOxphJxxOjMDFMWv24nXslaU2ro=",
"lastModified": 1763977559,
"narHash": "sha256-g4MKqsIRy5yJwEsI+fYODqLUnAqIY4kZai0nldAP6EM=",
"owner": "NixOS",
"repo": "nixpkgs",
"rev": "7c4869c47090dd7f9f1bdfb49a22aea026996815",
"rev": "cfe2c7d5b5d3032862254e68c37a6576b633d632",
"type": "github"
},
"original": {
"owner": "NixOS",
"repo": "nixpkgs",
"rev": "7c4869c47090dd7f9f1bdfb49a22aea026996815",
"rev": "cfe2c7d5b5d3032862254e68c37a6576b633d632",
"type": "github"
}
},
@ -80,17 +80,17 @@
]
},
"locked": {
"lastModified": 1738549608,
"narHash": "sha256-GdyT9QEUSx5k/n8kILuNy83vxxdyUfJ8jL5mMpQZWfw=",
"lastModified": 1763952169,
"narHash": "sha256-+PeDBD8P+NKauH+w7eO/QWCIp8Cx4mCfWnh9sJmy9CM=",
"owner": "oxalica",
"repo": "rust-overlay",
"rev": "35c6f8c4352f995ecd53896200769f80a3e8f22d",
"rev": "ab726555a9a72e6dc80649809147823a813fa95b",
"type": "github"
},
"original": {
"owner": "oxalica",
"repo": "rust-overlay",
"rev": "35c6f8c4352f995ecd53896200769f80a3e8f22d",
"rev": "ab726555a9a72e6dc80649809147823a813fa95b",
"type": "github"
}
},

View file

@ -2,13 +2,13 @@
description =
"Garage, an S3-compatible distributed object store for self-hosted deployments";
# Nixpkgs 24.11 as of 2025-01-12
# Nixpkgs 25.05 as of 2025-11-24
inputs.nixpkgs.url =
"github:NixOS/nixpkgs/7c4869c47090dd7f9f1bdfb49a22aea026996815";
"github:NixOS/nixpkgs/cfe2c7d5b5d3032862254e68c37a6576b633d632";
# Rust overlay as of 2025-02-03
# Rust overlay as of 2025-11-24
inputs.rust-overlay.url =
"github:oxalica/rust-overlay/35c6f8c4352f995ecd53896200769f80a3e8f22d";
"github:oxalica/rust-overlay/ab726555a9a72e6dc80649809147823a813fa95b";
inputs.rust-overlay.inputs.nixpkgs.follows = "nixpkgs";
inputs.crane.url = "github:ipetkov/crane";
@ -30,6 +30,10 @@
inherit system nixpkgs crane rust-overlay extraTestEnv;
release = false;
}).garage-test;
lints = (compile {
inherit system nixpkgs crane rust-overlay;
release = false;
});
in
{
packages = {
@ -53,6 +57,13 @@
tests-sqlite = testWith {
GARAGE_TEST_INTEGRATION_DB_ENGINE = "sqlite";
};
tests-fjall = testWith {
GARAGE_TEST_INTEGRATION_DB_ENGINE = "fjall";
};
# lints (fmt, clippy)
fmt = lints.garage-cargo-fmt;
clippy = lints.garage-cargo-clippy;
};
# ---- developpment shell, for making native builds only ----

View file

@ -48,7 +48,7 @@ let
inherit (pkgs) lib stdenv;
toolchainFn = (p: p.rust-bin.stable."1.82.0".default.override {
toolchainFn = (p: p.rust-bin.stable."1.91.0".default.override {
targets = lib.optionals (target != null) [ rustTarget ];
extensions = [
"rust-src"
@ -68,12 +68,13 @@ let
rootFeatures = if features != null then
features
else
([ "bundled-libs" "lmdb" "sqlite" "k2v" ] ++ (lib.optionals release [
([ "bundled-libs" "lmdb" "sqlite" "fjall" "k2v" ] ++ (lib.optionals release [
"consul-discovery"
"kubernetes-discovery"
"metrics"
"telemetry-otlp"
"syslog"
"journald"
]));
featuresStr = lib.concatStringsSep "," rootFeatures;
@ -189,4 +190,15 @@ in rec {
pkgs.cacert
];
} // extraTestEnv);
# ---- source code linting ----
garage-cargo-fmt = craneLib.cargoFmt (commonArgs // {
cargoExtraArgs = "";
});
garage-cargo-clippy = craneLib.cargoClippy (commonArgs // {
cargoArtifacts = garage-deps;
cargoClippyExtraArgs = "--all-targets -- -D warnings";
});
}

View file

@ -17,19 +17,13 @@ else
fi
$GARAGE_BIN -c /tmp/config.1.toml bucket create eprouvette
if [ "$GARAGE_OLDVER" = "v08" ]; then
if [ "$GARAGE_08" = "1" ]; then
KEY_INFO=$($GARAGE_BIN -c /tmp/config.1.toml key new --name opérateur)
ACCESS_KEY=`echo $KEY_INFO|grep -Po 'GK[a-f0-9]+'`
SECRET_KEY=`echo $KEY_INFO|grep -Po 'Secret key: [a-f0-9]+'|grep -Po '[a-f0-9]+$'`
elif [ "$GARAGE_OLDVER" = "v1" ]; then
KEY_INFO=$($GARAGE_BIN -c /tmp/config.1.toml key create opérateur)
ACCESS_KEY=`echo $KEY_INFO|grep -Po 'GK[a-f0-9]+'`
SECRET_KEY=`echo $KEY_INFO|grep -Po 'Secret key: [a-f0-9]+'|grep -Po '[a-f0-9]+$'`
else
KEY_INFO=$($GARAGE_BIN -c /tmp/config.1.toml json-api CreateKey '{"name":"opérateur"}')
ACCESS_KEY=`echo $KEY_INFO|jq -r .accessKeyId`
SECRET_KEY=`echo $KEY_INFO|jq -r .secretAccessKey`
KEY_INFO=$($GARAGE_BIN -c /tmp/config.1.toml key create opérateur)
fi
ACCESS_KEY=`echo $KEY_INFO|grep -Po 'GK[a-f0-9]+'`
SECRET_KEY=`echo $KEY_INFO|grep -Po 'Secret key: [a-f0-9]+'|grep -Po '[a-f0-9]+$'`
$GARAGE_BIN -c /tmp/config.1.toml bucket allow eprouvette --read --write --owner --key $ACCESS_KEY
echo "$ACCESS_KEY $SECRET_KEY" > /tmp/garage.s3

View file

@ -29,7 +29,7 @@ until $GARAGE_BIN -c /tmp/config.1.toml status 2>&1|grep -q HEALTHY ; do
sleep 1
done
if [ "$GARAGE_OLDVER" = "v08" ]; then
if [ "$GARAGE_08" = "1" ]; then
$GARAGE_BIN -c /tmp/config.1.toml status \
| grep 'NO ROLE' \
| grep -Po '^[0-9a-f]+' \

View file

@ -1,6 +1,7 @@
export AWS_ACCESS_KEY_ID=`cat /tmp/garage.s3 |cut -d' ' -f1`
export AWS_SECRET_ACCESS_KEY=`cat /tmp/garage.s3 |cut -d' ' -f2`
export AWS_DEFAULT_REGION='garage'
export AWS_REQUEST_CHECKSUM_CALCULATION='when_required'
# FUTUREWORK: set AWS_ENDPOINT_URL instead, once nixpkgs bumps awscli to >=2.13.0.
function aws { command aws --endpoint-url http://127.0.0.1:3911 $@ ; }

View file

@ -1,24 +1,18 @@
apiVersion: v2
name: garage
description: S3-compatible object store for small self-hosted geo-distributed deployments
# A chart can be either an 'application' or a 'library' chart.
#
# Application charts are a collection of templates that can be packaged into versioned archives
# to be deployed.
#
# Library charts provide useful utilities or functions for the chart developer. They're included as
# a dependency of application charts to inject those utilities and functions into the rendering
# pipeline. Library charts do not define any templates and therefore cannot be deployed.
type: application
version: 0.7.3
appVersion: "v1.3.1"
home: https://garagehq.deuxfleurs.fr/
icon: https://garagehq.deuxfleurs.fr/images/garage-logo.svg
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.8.0
keywords:
- geo-distributed
- read-after-write-consistency
- s3-compatible
# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
# follow Semantic Versioning. They should reflect the version the application is using.
# It is recommended to use it with quotes.
appVersion: "v2.0.0"
sources:
- https://git.deuxfleurs.fr/Deuxfleurs/garage.git
maintainers: []

View file

@ -1,9 +1,15 @@
# garage
![Version: 0.6.0](https://img.shields.io/badge/Version-0.6.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: v1.0.1](https://img.shields.io/badge/AppVersion-v1.0.1-informational?style=flat-square)
![Version: 0.7.3](https://img.shields.io/badge/Version-0.7.3-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: v1.3.1](https://img.shields.io/badge/AppVersion-v1.3.1-informational?style=flat-square)
S3-compatible object store for small self-hosted geo-distributed deployments
**Homepage:** <https://garagehq.deuxfleurs.fr/>
## Source Code
* <https://git.deuxfleurs.fr/Deuxfleurs/garage.git>
## Values
| Key | Type | Default | Description |
@ -23,6 +29,7 @@ S3-compatible object store for small self-hosted geo-distributed deployments
| garage.existingConfigMap | string | `""` | if not empty string, allow using an existing ConfigMap for the garage.toml, if set, ignores garage.toml |
| garage.garageTomlString | string | `""` | String Template for the garage configuration if set, ignores above values. Values can be templated, see https://garagehq.deuxfleurs.fr/documentation/reference-manual/configuration/ |
| garage.kubernetesSkipCrd | bool | `false` | Set to true if you want to use k8s discovery but install the CRDs manually outside of the helm chart, for example if you operate at namespace level without cluster ressources |
| garage.metadataAutoSnapshotInterval | string | `""` | If this value is set, Garage will automatically take a snapshot of the metadata DB file at a regular interval and save it in the metadata directory. https://garagehq.deuxfleurs.fr/documentation/reference-manual/configuration/#metadata_auto_snapshot_interval |
| garage.replicationMode | string | `"3"` | Default to 3 replicas, see the replication_mode section at https://garagehq.deuxfleurs.fr/documentation/reference-manual/configuration/#replication-mode |
| garage.rpcBindAddr | string | `"[::]:3901"` | |
| garage.rpcSecret | string | `""` | If not given, a random secret will be generated and stored in a Secret object |
@ -49,6 +56,7 @@ S3-compatible object store for small self-hosted geo-distributed deployments
| initImage.pullPolicy | string | `"IfNotPresent"` | |
| initImage.repository | string | `"busybox"` | |
| initImage.tag | string | `"stable"` | |
| livenessProbe | object | `{}` | Specifies a livenessProbe |
| monitoring.metrics.enabled | bool | `false` | If true, a service for monitoring is created with a prometheus.io/scrape annotation |
| monitoring.metrics.serviceMonitor.enabled | bool | `false` | If true, a ServiceMonitor CRD is created for a prometheus operator https://github.com/coreos/prometheus-operator |
| monitoring.metrics.serviceMonitor.interval | string | `"15s"` | |
@ -71,6 +79,7 @@ S3-compatible object store for small self-hosted geo-distributed deployments
| podSecurityContext.runAsGroup | int | `1000` | |
| podSecurityContext.runAsNonRoot | bool | `true` | |
| podSecurityContext.runAsUser | int | `1000` | |
| readinessProbe | object | `{}` | Specifies a readinessProbe |
| resources | object | `{}` | |
| securityContext.capabilities | object | `{"drop":["ALL"]}` | The default security context is heavily restricted, feel free to tune it to your requirements |
| securityContext.readOnlyRootFilesystem | bool | `true` | |

View file

@ -19,6 +19,10 @@ data:
compression_level = {{ .Values.garage.compressionLevel }}
{{- if .Values.garage.metadataAutoSnapshotInterval }}
metadata_auto_snapshot_interval = {{ .Values.garage.metadataAutoSnapshotInterval | quote }}
{{- end }}
rpc_bind_addr = "{{ .Values.garage.rpcBindAddr }}"
# rpc_secret will be populated by the init container from a k8s secret object
rpc_secret = "__RPC_SECRET_REPLACE__"

View file

@ -4,6 +4,10 @@ metadata:
name: {{ include "garage.fullname" . }}
labels:
{{- include "garage.labels" . | nindent 4 }}
{{- with .Values.service.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
spec:
type: {{ .Values.service.type }}
ports:

View file

@ -78,15 +78,14 @@ spec:
{{- with .Values.extraVolumeMounts }}
{{- toYaml . | nindent 12 }}
{{- end }}
# TODO
# livenessProbe:
# httpGet:
# path: /
# port: 3900
# readinessProbe:
# httpGet:
# path: /
# port: 3900
{{- with .Values.livenessProbe }}
livenessProbe:
{{- toYaml . | nindent 12 }}
{{- end }}
{{- with .Values.readinessProbe }}
readinessProbe:
{{- toYaml . | nindent 12 }}
{{- end }}
resources:
{{- toYaml .Values.resources | nindent 12 }}
volumes:

View file

@ -21,6 +21,10 @@ garage:
# https://garagehq.deuxfleurs.fr/documentation/reference-manual/configuration/#compression-level
compressionLevel: "1"
# -- If this value is set, Garage will automatically take a snapshot of the metadata DB file at a regular interval and save it in the metadata directory.
# https://garagehq.deuxfleurs.fr/documentation/reference-manual/configuration/#metadata_auto_snapshot_interval
metadataAutoSnapshotInterval: ""
rpcBindAddr: "[::]:3901"
# -- If not given, a random secret will be generated and stored in a Secret object
rpcSecret: ""
@ -120,6 +124,8 @@ service:
# - NodePort (+ Ingress)
# - LoadBalancer
type: ClusterIP
# -- Annotations to add to the service
annotations: {}
s3:
api:
port: 3900
@ -191,6 +197,21 @@ resources: {}
# cpu: 100m
# memory: 512Mi
# -- Specifies a livenessProbe
livenessProbe: {}
#httpGet:
# path: /health
# port: 3903
#initialDelaySeconds: 5
#periodSeconds: 30
# -- Specifies a readinessProbe
readinessProbe: {}
#httpGet:
# path: /health
# port: 3903
#initialDelaySeconds: 5
#periodSeconds: 30
nodeSelector: {}
tolerations: []

View file

@ -0,0 +1,43 @@
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
name: garagenodes.deuxfleurs.fr
spec:
conversion:
strategy: None
group: deuxfleurs.fr
names:
kind: GarageNode
listKind: GarageNodeList
plural: garagenodes
singular: garagenode
scope: Namespaced
versions:
- name: v1
schema:
openAPIV3Schema:
description: Auto-generated derived type for Node via `CustomResource`
properties:
spec:
properties:
address:
format: ip
type: string
hostname:
type: string
port:
format: uint16
minimum: 0
type: integer
required:
- address
- hostname
- port
type: object
required:
- spec
title: GarageNode
type: object
served: true
storage: true
subresources: {}

View file

@ -0,0 +1,5 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- garagenodes.deuxfleurs.fr.yaml

View file

@ -24,17 +24,9 @@ echo "============= insert data into old version cluster ================="
export GARAGE_BIN=/tmp/old_garage
if echo $OLD_VERSION | grep 'v0\.8\.'; then
echo "Detected Garage v0.8.x"
export GARAGE_OLDVER=v08
elif (echo $OLD_VERSION | grep 'v0\.9\.') || (echo $OLD_VERSION | grep 'v1\.'); then
echo "Detected Garage v0.9.x / v1.x"
export GARAGE_OLDVER=v1
export GARAGE_08=1
fi
if echo $OLD_VERSION | grep 'v1\.'; then
DO_SSEC_TEST=1
fi
SSEC_KEY="u8zCfnEyt5Imo/krN+sxA1DQXxLWtPJavU6T6gOVj1Y="
echo "⏳ Setup cluster using old version"
$GARAGE_BIN --version
${SCRIPT_FOLDER}/dev-clean.sh
@ -45,23 +37,7 @@ ${SCRIPT_FOLDER}/dev-bucket.sh
echo "🛠️ Inserting data in old cluster"
source ${SCRIPT_FOLDER}/dev-env-rclone.sh
rclone copy "${SCRIPT_FOLDER}/../.git/" garage:eprouvette/test_dotgit \
--stats=1s --stats-log-level=NOTICE --stats-one-line
if [ "$DO_SSEC_TEST" = "1" ]; then
# upload small file (should be single part)
rclone copy "${SCRIPT_FOLDER}/test-upgrade.sh" garage:eprouvette/test-ssec \
--s3-sse-customer-algorithm AES256 \
--s3-sse-customer-key-base64 "$SSEC_KEY" \
--stats=1s --stats-log-level=NOTICE --stats-one-line
# do a multipart upload
dd if=/dev/urandom of=/tmp/randfile-for-upgrade bs=5M count=5
rclone copy "/tmp/randfile-for-upgrade" garage:eprouvette/test-ssec \
--s3-chunk-size 5M \
--s3-sse-customer-algorithm AES256 \
--s3-sse-customer-key-base64 "$SSEC_KEY" \
--stats=1s --stats-log-level=NOTICE --stats-one-line
fi
rclone copy "${SCRIPT_FOLDER}/../.git/" garage:eprouvette/test_dotgit --stats=1s --stats-log-level=NOTICE --stats-one-line
echo "🏁 Stopping old cluster"
killall -INT old_garage
@ -71,7 +47,7 @@ killall -9 old_garage || true
echo "🏁 Removing old garage version"
rm -rv $GARAGE_BIN
export -n GARAGE_BIN
export -n GARAGE_OLDVER
export -n GARAGE_08
echo "================ read data from new cluster ==================="
@ -84,8 +60,7 @@ ${SCRIPT_FOLDER}/dev-cluster.sh >> /tmp/garage.log 2>&1 &
sleep 3
echo "🛠️ Retrieving data from old cluster"
rclone copy garage:eprouvette/test_dotgit /tmp/test_dotgit \
--stats=1s --stats-log-level=NOTICE --stats-one-line --fast-list
rclone copy garage:eprouvette/test_dotgit /tmp/test_dotgit --stats=1s --stats-log-level=NOTICE --stats-one-line --fast-list
if ! diff <(find "${SCRIPT_FOLDER}/../.git" -type f | xargs md5sum | cut -d ' ' -f 1 | sort) <(find /tmp/test_dotgit -type f | xargs md5sum | cut -d ' ' -f 1 | sort); then
echo "TEST FAILURE: directories are different"
@ -93,23 +68,6 @@ if ! diff <(find "${SCRIPT_FOLDER}/../.git" -type f | xargs md5sum | cut -d ' '
fi
rm -r /tmp/test_dotgit
if [ "$DO_SSEC_TEST" = "1" ]; then
rclone copy garage:eprouvette/test-ssec /tmp/test_ssec_out \
--s3-sse-customer-algorithm AES256 \
--s3-sse-customer-key-base64 "$SSEC_KEY" \
--stats=1s --stats-log-level=NOTICE --stats-one-line
if ! diff "/tmp/test_ssec_out/test-upgrade.sh" "${SCRIPT_FOLDER}/test-upgrade.sh"; then
echo "SSEC-FAILURE (small file)"
exit 1
fi
if ! diff "/tmp/test_ssec_out/randfile-for-upgrade" "/tmp/randfile-for-upgrade"; then
echo "SSEC-FAILURE (big file)"
exit 1
fi
rm -r /tmp/test_ssec_out
rm /tmp/randfile-for-upgrade
fi
echo "🏁 Teardown"
rm -rf /tmp/garage-{data,meta}-*
rm -rf /tmp/config.*.toml

View file

@ -34,6 +34,8 @@ in
jq
];
shellHook = ''
export AWS_REQUEST_CHECKSUM_CALCULATION='when_required'
function to_s3 {
aws \
--endpoint-url https://garage.deuxfleurs.fr \

View file

@ -1,6 +1,6 @@
[package]
name = "garage_api_admin"
version = "2.0.0"
version = "1.3.1"
authors = ["Alex Auvolat <alex@adnab.me>"]
edition = "2018"
license = "AGPL-3.0"
@ -14,9 +14,7 @@ path = "lib.rs"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
format_table.workspace = true
garage_model.workspace = true
garage_block.workspace = true
garage_table.workspace = true
garage_util.workspace = true
garage_rpc.workspace = true
@ -24,11 +22,8 @@ garage_api_common.workspace = true
argon2.workspace = true
async-trait.workspace = true
bytesize.workspace = true
chrono.workspace = true
err-derive.workspace = true
thiserror.workspace = true
hex.workspace = true
paste.workspace = true
tracing.workspace = true
futures.workspace = true
@ -39,7 +34,6 @@ url.workspace = true
serde.workspace = true
serde_json.workspace = true
utoipa.workspace = true
opentelemetry.workspace = true
opentelemetry-prometheus = { workspace = true, optional = true }
@ -47,4 +41,3 @@ prometheus = { workspace = true, optional = true }
[features]
metrics = [ "opentelemetry-prometheus", "prometheus" ]
k2v = [ "garage_model/k2v" ]

View file

@ -1,235 +0,0 @@
use std::sync::Arc;
use chrono::{DateTime, Utc};
use garage_table::*;
use garage_util::time::now_msec;
use garage_model::admin_token_table::*;
use garage_model::garage::Garage;
use crate::api::*;
use crate::error::*;
use crate::{Admin, RequestHandler};
impl RequestHandler for ListAdminTokensRequest {
type Response = ListAdminTokensResponse;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<ListAdminTokensResponse, Error> {
let now = now_msec();
let mut res = garage
.admin_token_table
.get_range(
&EmptyKey,
None,
Some(KeyFilter::Deleted(DeletedFilter::NotDeleted)),
10000,
EnumerationOrder::Forward,
)
.await?
.iter()
.map(|t| admin_token_info_results(t, now))
.collect::<Vec<_>>();
if garage.config.admin.admin_token.is_some() {
res.insert(
0,
GetAdminTokenInfoResponse {
id: None,
created: None,
name: "admin_token (from daemon configuration)".into(),
expiration: None,
expired: false,
scope: vec!["*".into()],
},
);
}
if garage.config.admin.metrics_token.is_some() {
res.insert(
1,
GetAdminTokenInfoResponse {
id: None,
created: None,
name: "metrics_token (from daemon configuration)".into(),
expiration: None,
expired: false,
scope: vec!["Metrics".into()],
},
);
}
Ok(ListAdminTokensResponse(res))
}
}
impl RequestHandler for GetAdminTokenInfoRequest {
type Response = GetAdminTokenInfoResponse;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<GetAdminTokenInfoResponse, Error> {
let token = match (self.id, self.search) {
(Some(id), None) => get_existing_admin_token(garage, &id).await?,
(None, Some(search)) => {
let candidates = garage
.admin_token_table
.get_range(
&EmptyKey,
None,
Some(KeyFilter::MatchesAndNotDeleted(search.to_string())),
10,
EnumerationOrder::Forward,
)
.await?
.into_iter()
.collect::<Vec<_>>();
if candidates.len() != 1 {
return Err(Error::bad_request(format!(
"{} matching admin tokens",
candidates.len()
)));
}
candidates.into_iter().next().unwrap()
}
_ => {
return Err(Error::bad_request(
"Either id or search must be provided (but not both)",
));
}
};
Ok(admin_token_info_results(&token, now_msec()))
}
}
impl RequestHandler for CreateAdminTokenRequest {
type Response = CreateAdminTokenResponse;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<CreateAdminTokenResponse, Error> {
let (mut token, secret) = if self.0.name.is_some() {
AdminApiToken::new("")
} else {
AdminApiToken::new(&format!("token_{}", Utc::now().format("%Y%m%d_%H%M")))
};
apply_token_updates(&mut token, self.0)?;
garage.admin_token_table.insert(&token).await?;
Ok(CreateAdminTokenResponse {
secret_token: secret,
info: admin_token_info_results(&token, now_msec()),
})
}
}
impl RequestHandler for UpdateAdminTokenRequest {
type Response = UpdateAdminTokenResponse;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<UpdateAdminTokenResponse, Error> {
let mut token = get_existing_admin_token(&garage, &self.id).await?;
apply_token_updates(&mut token, self.body)?;
garage.admin_token_table.insert(&token).await?;
Ok(UpdateAdminTokenResponse(admin_token_info_results(
&token,
now_msec(),
)))
}
}
impl RequestHandler for DeleteAdminTokenRequest {
type Response = DeleteAdminTokenResponse;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<DeleteAdminTokenResponse, Error> {
let token = get_existing_admin_token(&garage, &self.id).await?;
garage
.admin_token_table
.insert(&AdminApiToken::delete(token.prefix))
.await?;
Ok(DeleteAdminTokenResponse)
}
}
// ---- helpers ----
fn admin_token_info_results(token: &AdminApiToken, now: u64) -> GetAdminTokenInfoResponse {
let params = token.params().unwrap();
GetAdminTokenInfoResponse {
id: Some(token.prefix.clone()),
created: Some(
DateTime::from_timestamp_millis(params.created as i64)
.expect("invalid timestamp stored in db"),
),
name: params.name.get().to_string(),
expiration: params.expiration.get().map(|x| {
DateTime::from_timestamp_millis(x as i64).expect("invalid timestamp stored in db")
}),
expired: params.is_expired(now),
scope: params.scope.get().0.clone(),
}
}
async fn get_existing_admin_token(garage: &Garage, id: &String) -> Result<AdminApiToken, Error> {
garage
.admin_token_table
.get(&EmptyKey, id)
.await?
.filter(|k| !k.state.is_deleted())
.ok_or_else(|| Error::NoSuchAdminToken(id.to_string()))
}
fn apply_token_updates(
token: &mut AdminApiToken,
updates: UpdateAdminTokenRequestBody,
) -> Result<(), Error> {
if updates.never_expires && updates.expiration.is_some() {
return Err(Error::bad_request(
"cannot specify `expiration` and `never_expires`",
));
}
let params = token.params_mut().unwrap();
if let Some(name) = updates.name {
params.name.update(name);
}
if let Some(expiration) = updates.expiration {
params
.expiration
.update(Some(expiration.timestamp_millis() as u64));
}
if updates.never_expires {
params.expiration.update(None);
}
if let Some(scope) = updates.scope {
params.scope.update(AdminApiTokenScope(scope));
}
Ok(())
}

File diff suppressed because it is too large Load diff

View file

@ -1,230 +1,333 @@
use std::borrow::Cow;
use std::collections::HashMap;
use std::sync::Arc;
use http::header::{HeaderValue, ACCESS_CONTROL_ALLOW_ORIGIN, AUTHORIZATION};
use hyper::{body::Incoming as IncomingBody, Request, Response};
use serde::{Deserialize, Serialize};
use argon2::password_hash::PasswordHash;
use http::header::{ACCESS_CONTROL_ALLOW_METHODS, ACCESS_CONTROL_ALLOW_ORIGIN, ALLOW};
use hyper::{body::Incoming as IncomingBody, Request, Response, StatusCode};
use tokio::sync::watch;
use opentelemetry::trace::SpanRef;
#[cfg(feature = "metrics")]
use opentelemetry_prometheus::PrometheusExporter;
#[cfg(feature = "metrics")]
use prometheus::{Encoder, TextEncoder};
use garage_model::garage::Garage;
use garage_rpc::{Endpoint as RpcEndpoint, *};
use garage_table::EmptyKey;
use garage_util::background::BackgroundRunner;
use garage_util::data::Uuid;
use garage_rpc::system::ClusterHealthStatus;
use garage_util::error::Error as GarageError;
use garage_util::socket_address::UnixOrTCPSocketAddress;
use garage_util::time::now_msec;
use garage_api_common::generic_server::*;
use garage_api_common::helpers::*;
use crate::api::*;
use crate::bucket::*;
use crate::cluster::*;
use crate::error::*;
use crate::key::*;
use crate::router_v0;
use crate::router_v1;
use crate::Authorization;
use crate::RequestHandler;
// ---- FOR RPC ----
pub const ADMIN_RPC_PATH: &str = "garage_api/admin/rpc.rs/Rpc";
#[derive(Debug, Serialize, Deserialize)]
pub enum AdminRpc {
Proxy(AdminApiRequest),
Internal(LocalAdminApiRequest),
}
#[derive(Debug, Serialize, Deserialize)]
pub enum AdminRpcResponse {
ProxyApiOkResponse(TaggedAdminApiResponse),
InternalApiOkResponse(LocalAdminApiResponse),
ApiErrorResponse {
http_code: u16,
error_code: String,
message: String,
},
}
impl Rpc for AdminRpc {
type Response = Result<AdminRpcResponse, GarageError>;
}
impl EndpointHandler<AdminRpc> for AdminApiServer {
async fn handle(
self: &Arc<Self>,
message: &AdminRpc,
_from: NodeID,
) -> Result<AdminRpcResponse, GarageError> {
match message {
AdminRpc::Proxy(req) => {
info!("Proxied admin API request: {}", req.name());
let res = req.clone().handle(&self.garage, &self).await;
match res {
Ok(res) => Ok(AdminRpcResponse::ProxyApiOkResponse(res.tagged())),
Err(e) => Ok(AdminRpcResponse::ApiErrorResponse {
http_code: e.http_status_code().as_u16(),
error_code: e.code().to_string(),
message: e.to_string(),
}),
}
}
AdminRpc::Internal(req) => {
info!("Internal admin API request: {}", req.name());
let res = req.clone().handle(&self.garage, &self).await;
match res {
Ok(res) => Ok(AdminRpcResponse::InternalApiOkResponse(res)),
Err(e) => Ok(AdminRpcResponse::ApiErrorResponse {
http_code: e.http_status_code().as_u16(),
error_code: e.code().to_string(),
message: e.to_string(),
}),
}
}
}
}
}
// ---- FOR HTTP ----
use crate::router_v1::{Authorization, Endpoint};
pub type ResBody = BoxBody<Error>;
pub struct AdminApiServer {
garage: Arc<Garage>,
#[cfg(feature = "metrics")]
pub(crate) exporter: PrometheusExporter,
exporter: PrometheusExporter,
metrics_token: Option<String>,
metrics_require_token: bool,
admin_token: Option<String>,
pub(crate) background: Arc<BackgroundRunner>,
pub(crate) endpoint: Arc<RpcEndpoint<AdminRpc, Self>>,
}
pub enum HttpEndpoint {
Old(router_v1::Endpoint),
New(String),
}
impl AdminApiServer {
pub fn new(
garage: Arc<Garage>,
background: Arc<BackgroundRunner>,
#[cfg(feature = "metrics")] exporter: PrometheusExporter,
) -> Arc<Self> {
) -> Self {
let cfg = &garage.config.admin;
let metrics_token = cfg.metrics_token.as_deref().map(hash_bearer_token);
let admin_token = cfg.admin_token.as_deref().map(hash_bearer_token);
let metrics_require_token = cfg.metrics_require_token;
let endpoint = garage.system.netapp.endpoint(ADMIN_RPC_PATH.into());
let admin = Arc::new(Self {
Self {
garage,
#[cfg(feature = "metrics")]
exporter,
metrics_token,
metrics_require_token,
admin_token,
background,
endpoint,
});
admin.endpoint.set_handler(admin.clone());
admin
}
}
pub async fn run(
self: Arc<Self>,
self,
bind_addr: UnixOrTCPSocketAddress,
must_exit: watch::Receiver<bool>,
) -> Result<(), GarageError> {
let region = self.garage.config.s3_api.s3_region.clone();
ApiServer::new(region, ArcAdminApiServer(self))
ApiServer::new(region, self)
.run_server(bind_addr, Some(0o220), must_exit)
.await
}
async fn handle_http_api(
fn handle_options(&self, _req: &Request<IncomingBody>) -> Result<Response<ResBody>, Error> {
Ok(Response::builder()
.status(StatusCode::NO_CONTENT)
.header(ALLOW, "OPTIONS, GET, POST")
.header(ACCESS_CONTROL_ALLOW_METHODS, "OPTIONS, GET, POST")
.header(ACCESS_CONTROL_ALLOW_ORIGIN, "*")
.body(empty_body())?)
}
async fn handle_check_domain(
&self,
req: Request<IncomingBody>,
endpoint: HttpEndpoint,
) -> Result<Response<ResBody>, Error> {
let auth_header = req.headers().get(AUTHORIZATION).cloned();
let query_params: HashMap<String, String> = req
.uri()
.query()
.map(|v| {
url::form_urlencoded::parse(v.as_bytes())
.into_owned()
.collect()
})
.unwrap_or_else(HashMap::new);
let request = match endpoint {
HttpEndpoint::Old(endpoint_v1) => AdminApiRequest::from_v1(endpoint_v1, req).await?,
HttpEndpoint::New(_) => AdminApiRequest::from_request(req).await?,
let has_domain_key = query_params.contains_key("domain");
if !has_domain_key {
return Err(Error::bad_request("No domain query string found"));
}
let domain = query_params
.get("domain")
.ok_or_internal_error("Could not parse domain query string")?;
if self.check_domain(domain).await? {
Ok(Response::builder()
.status(StatusCode::OK)
.body(string_body(format!(
"Domain '{domain}' is managed by Garage"
)))?)
} else {
Err(Error::bad_request(format!(
"Domain '{domain}' is not managed by Garage"
)))
}
}
async fn check_domain(&self, domain: &str) -> Result<bool, Error> {
// Resolve bucket from domain name, inferring if the website must be activated for the
// domain to be valid.
let (bucket_name, must_check_website) = if let Some(bname) = self
.garage
.config
.s3_api
.root_domain
.as_ref()
.and_then(|rd| host_to_bucket(domain, rd))
{
(bname.to_string(), false)
} else if let Some(bname) = self
.garage
.config
.s3_web
.as_ref()
.and_then(|sw| host_to_bucket(domain, sw.root_domain.as_str()))
{
(bname.to_string(), true)
} else {
(domain.to_string(), true)
};
let (global_token_hash, token_required) = match request.authorization_type() {
Authorization::None => (None, false),
Authorization::MetricsToken => (
self.metrics_token.as_deref(),
self.metrics_token.is_some() || self.metrics_require_token,
let bucket_id = match self
.garage
.bucket_helper()
.resolve_global_bucket_name(&bucket_name)
.await?
{
Some(bucket_id) => bucket_id,
None => return Ok(false),
};
if !must_check_website {
return Ok(true);
}
let bucket = self
.garage
.bucket_helper()
.get_existing_bucket(bucket_id)
.await?;
let bucket_state = bucket.state.as_option().unwrap();
let bucket_website_config = bucket_state.website_config.get();
match bucket_website_config {
Some(_v) => Ok(true),
None => Ok(false),
}
}
fn handle_health(&self) -> Result<Response<ResBody>, Error> {
let health = self.garage.system.health();
let (status, status_str) = match health.status {
ClusterHealthStatus::Healthy => (StatusCode::OK, "Garage is fully operational"),
ClusterHealthStatus::Degraded => (
StatusCode::OK,
"Garage is operational but some storage nodes are unavailable",
),
ClusterHealthStatus::Unavailable => (
StatusCode::SERVICE_UNAVAILABLE,
"Quorum is not available for some/all partitions, reads and writes will fail",
),
Authorization::AdminToken => (self.admin_token.as_deref(), true),
};
let status_str = format!(
"{}\nConsult the full health check API endpoint at /v1/health for more details\n",
status_str
);
if token_required {
verify_authorization(&self.garage, global_token_hash, auth_header, request.name())?;
}
Ok(Response::builder()
.status(status)
.header(http::header::CONTENT_TYPE, "text/plain")
.body(string_body(status_str))?)
}
match request {
AdminApiRequest::Options(req) => req.handle(&self.garage, &self).await,
AdminApiRequest::CheckDomain(req) => req.handle(&self.garage, &self).await,
AdminApiRequest::Health(req) => req.handle(&self.garage, &self).await,
AdminApiRequest::Metrics(req) => req.handle(&self.garage, &self).await,
req => {
let res = req.handle(&self.garage, &self).await?;
let mut res = json_ok_response(&res)?;
res.headers_mut()
.insert(ACCESS_CONTROL_ALLOW_ORIGIN, HeaderValue::from_static("*"));
Ok(res)
}
fn handle_metrics(&self) -> Result<Response<ResBody>, Error> {
#[cfg(feature = "metrics")]
{
use opentelemetry::trace::Tracer;
let mut buffer = vec![];
let encoder = TextEncoder::new();
let tracer = opentelemetry::global::tracer("garage");
let metric_families = tracer.in_span("admin/gather_metrics", |_| {
self.exporter.registry().gather()
});
encoder
.encode(&metric_families, &mut buffer)
.ok_or_internal_error("Could not serialize metrics")?;
Ok(Response::builder()
.status(StatusCode::OK)
.header(http::header::CONTENT_TYPE, encoder.format_type())
.body(bytes_body(buffer.into()))?)
}
#[cfg(not(feature = "metrics"))]
Err(Error::bad_request(
"Garage was built without the metrics feature".to_string(),
))
}
}
struct ArcAdminApiServer(Arc<AdminApiServer>);
impl ApiHandler for ArcAdminApiServer {
impl ApiHandler for AdminApiServer {
const API_NAME: &'static str = "admin";
const API_NAME_DISPLAY: &'static str = "Admin";
type Endpoint = HttpEndpoint;
type Endpoint = Endpoint;
type Error = Error;
fn parse_endpoint(&self, req: &Request<IncomingBody>) -> Result<HttpEndpoint, Error> {
fn parse_endpoint(&self, req: &Request<IncomingBody>) -> Result<Endpoint, Error> {
if req.uri().path().starts_with("/v0/") {
let endpoint_v0 = router_v0::Endpoint::from_request(req)?;
let endpoint_v1 = router_v1::Endpoint::from_v0(endpoint_v0)?;
Ok(HttpEndpoint::Old(endpoint_v1))
} else if req.uri().path().starts_with("/v1/") {
let endpoint_v1 = router_v1::Endpoint::from_request(req)?;
Ok(HttpEndpoint::Old(endpoint_v1))
Endpoint::from_v0(endpoint_v0)
} else {
Ok(HttpEndpoint::New(req.uri().path().to_string()))
Endpoint::from_request(req)
}
}
async fn handle(
&self,
req: Request<IncomingBody>,
endpoint: HttpEndpoint,
endpoint: Endpoint,
) -> Result<Response<ResBody>, Error> {
self.0.handle_http_api(req, endpoint).await
let required_auth_hash =
match endpoint.authorization_type() {
Authorization::None => None,
Authorization::MetricsToken => self.metrics_token.as_deref(),
Authorization::AdminToken => match self.admin_token.as_deref() {
None => return Err(Error::forbidden(
"Admin token isn't configured, admin API access is disabled for security.",
)),
Some(t) => Some(t),
},
};
if let Some(password_hash) = required_auth_hash {
match req.headers().get("Authorization") {
None => return Err(Error::forbidden("Authorization token must be provided")),
Some(authorization) => {
verify_bearer_token(&authorization, password_hash)?;
}
}
}
match endpoint {
Endpoint::Options => self.handle_options(&req),
Endpoint::CheckDomain => self.handle_check_domain(req).await,
Endpoint::Health => self.handle_health(),
Endpoint::Metrics => self.handle_metrics(),
Endpoint::GetClusterStatus => handle_get_cluster_status(&self.garage).await,
Endpoint::GetClusterHealth => handle_get_cluster_health(&self.garage).await,
Endpoint::ConnectClusterNodes => handle_connect_cluster_nodes(&self.garage, req).await,
// Layout
Endpoint::GetClusterLayout => handle_get_cluster_layout(&self.garage).await,
Endpoint::UpdateClusterLayout => handle_update_cluster_layout(&self.garage, req).await,
Endpoint::ApplyClusterLayout => handle_apply_cluster_layout(&self.garage, req).await,
Endpoint::RevertClusterLayout => handle_revert_cluster_layout(&self.garage).await,
// Keys
Endpoint::ListKeys => handle_list_keys(&self.garage).await,
Endpoint::GetKeyInfo {
id,
search,
show_secret_key,
} => {
let show_secret_key = show_secret_key.map(|x| x == "true").unwrap_or(false);
handle_get_key_info(&self.garage, id, search, show_secret_key).await
}
Endpoint::CreateKey => handle_create_key(&self.garage, req).await,
Endpoint::ImportKey => handle_import_key(&self.garage, req).await,
Endpoint::UpdateKey { id } => handle_update_key(&self.garage, id, req).await,
Endpoint::DeleteKey { id } => handle_delete_key(&self.garage, id).await,
// Buckets
Endpoint::ListBuckets => handle_list_buckets(&self.garage).await,
Endpoint::GetBucketInfo { id, global_alias } => {
handle_get_bucket_info(&self.garage, id, global_alias).await
}
Endpoint::CreateBucket => handle_create_bucket(&self.garage, req).await,
Endpoint::DeleteBucket { id } => handle_delete_bucket(&self.garage, id).await,
Endpoint::UpdateBucket { id } => handle_update_bucket(&self.garage, id, req).await,
// Bucket-key permissions
Endpoint::BucketAllowKey => {
handle_bucket_change_key_perm(&self.garage, req, true).await
}
Endpoint::BucketDenyKey => {
handle_bucket_change_key_perm(&self.garage, req, false).await
}
// Bucket aliasing
Endpoint::GlobalAliasBucket { id, alias } => {
handle_global_alias_bucket(&self.garage, id, alias).await
}
Endpoint::GlobalUnaliasBucket { id, alias } => {
handle_global_unalias_bucket(&self.garage, id, alias).await
}
Endpoint::LocalAliasBucket {
id,
access_key_id,
alias,
} => handle_local_alias_bucket(&self.garage, id, access_key_id, alias).await,
Endpoint::LocalUnaliasBucket {
id,
access_key_id,
alias,
} => handle_local_unalias_bucket(&self.garage, id, access_key_id, alias).await,
}
}
}
impl ApiEndpoint for HttpEndpoint {
fn name(&self) -> Cow<'static, str> {
match self {
Self::Old(endpoint_v1) => Cow::Borrowed(endpoint_v1.name()),
Self::New(path) => Cow::Owned(path.clone()),
}
impl ApiEndpoint for Endpoint {
fn name(&self) -> &'static str {
Endpoint::name(self)
}
fn add_span_attributes(&self, _span: SpanRef<'_>) {}
@ -244,87 +347,20 @@ fn hash_bearer_token(token: &str) -> String {
.to_string()
}
fn verify_authorization(
garage: &Garage,
global_token_hash: Option<&str>,
auth_header: Option<hyper::http::HeaderValue>,
endpoint_name: &str,
) -> Result<(), Error> {
use argon2::{password_hash::PasswordHash, password_hash::PasswordVerifier, Argon2};
fn verify_bearer_token(token: &hyper::http::HeaderValue, password_hash: &str) -> Result<(), Error> {
use argon2::{password_hash::PasswordVerifier, Argon2};
let invalid_msg = "Invalid bearer token";
let parsed_hash = PasswordHash::new(&password_hash).unwrap();
let token = match &auth_header {
None => {
return Err(Error::forbidden(
"Bearer token must be provided in Authorization header",
))
}
Some(authorization) => authorization
.to_str()?
.strip_prefix("Bearer ")
.ok_or_else(|| Error::forbidden("Invalid Authorization header"))?
.trim(),
};
let token_hash_string = if let Some((prefix, _)) = token.split_once('.') {
garage
.admin_token_table
.get_local(&EmptyKey, &prefix.to_string())?
.and_then(|k| k.state.into_option())
.filter(|p| !p.is_expired(now_msec()))
.filter(|p| p.has_scope(endpoint_name))
.ok_or_else(|| Error::forbidden(invalid_msg))?
.token_hash
} else {
global_token_hash
.ok_or_else(|| Error::forbidden(invalid_msg))?
.to_string()
};
let token_hash =
PasswordHash::new(&token_hash_string).ok_or_internal_error("Could not parse token hash")?;
Argon2::default()
.verify_password(token.as_bytes(), &token_hash)
.map_err(|_| Error::forbidden(invalid_msg))?;
token
.to_str()?
.strip_prefix("Bearer ")
.and_then(|token| {
Argon2::default()
.verify_password(token.trim().as_bytes(), &parsed_hash)
.ok()
})
.ok_or_else(|| Error::forbidden("Invalid authorization token"))?;
Ok(())
}
pub(crate) fn find_matching_nodes(garage: &Garage, spec: &str) -> Result<Vec<Uuid>, Error> {
let mut res = vec![];
if spec == "*" {
res = garage.system.cluster_layout().all_nodes().to_vec();
for node in garage.system.get_known_nodes() {
if node.is_up && !res.contains(&node.id) {
res.push(node.id);
}
}
} else if spec == "self" {
res.push(garage.system.id);
} else {
let layout = garage.system.cluster_layout();
let known_nodes = garage.system.get_known_nodes();
let all_nodes = layout
.all_nodes()
.iter()
.copied()
.chain(known_nodes.iter().filter(|x| x.is_up).map(|x| x.id));
for node in all_nodes {
if !res.contains(&node) && hex::encode(node).starts_with(spec) {
res.push(node);
}
}
if res.is_empty() {
return Err(Error::bad_request(format!("No nodes matching {}", spec)));
}
if res.len() > 1 {
return Err(Error::bad_request(format!(
"Multiple nodes matching {}: {:?}",
spec, res
)));
}
}
Ok(res)
}

View file

@ -1,276 +0,0 @@
use std::sync::Arc;
use garage_util::data::*;
use garage_util::error::Error as GarageError;
use garage_util::time::now_msec;
use garage_table::EmptyKey;
use garage_model::garage::Garage;
use garage_model::s3::object_table::*;
use garage_model::s3::version_table::*;
use garage_api_common::common_error::CommonErrorDerivative;
use crate::api::*;
use crate::error::*;
use crate::{Admin, RequestHandler};
impl RequestHandler for LocalListBlockErrorsRequest {
type Response = LocalListBlockErrorsResponse;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<LocalListBlockErrorsResponse, Error> {
let errors = garage.block_manager.list_resync_errors()?;
let now = now_msec();
let errors = errors
.into_iter()
.map(|e| BlockError {
block_hash: hex::encode(&e.hash),
refcount: e.refcount,
error_count: e.error_count,
last_try_secs_ago: now.saturating_sub(e.last_try) / 1000,
next_try_in_secs: e.next_try.saturating_sub(now) / 1000,
})
.collect();
Ok(LocalListBlockErrorsResponse(errors))
}
}
impl RequestHandler for LocalGetBlockInfoRequest {
type Response = LocalGetBlockInfoResponse;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<LocalGetBlockInfoResponse, Error> {
let hash = find_block_hash_by_prefix(garage, &self.block_hash)?;
let refcount = garage.block_manager.get_block_rc(&hash)?;
let block_refs = garage
.block_ref_table
.get_range(&hash, None, None, 10000, Default::default())
.await?;
let mut versions = vec![];
for br in block_refs {
if let Some(v) = garage.version_table.get(&br.version, &EmptyKey).await? {
let bl = match &v.backlink {
VersionBacklink::MultipartUpload { upload_id } => {
if let Some(u) = garage.mpu_table.get(upload_id, &EmptyKey).await? {
BlockVersionBacklink::Upload {
upload_id: hex::encode(&upload_id),
upload_deleted: u.deleted.get(),
upload_garbage_collected: false,
bucket_id: Some(hex::encode(&u.bucket_id)),
key: Some(u.key.to_string()),
}
} else {
BlockVersionBacklink::Upload {
upload_id: hex::encode(&upload_id),
upload_deleted: true,
upload_garbage_collected: true,
bucket_id: None,
key: None,
}
}
}
VersionBacklink::Object { bucket_id, key } => BlockVersionBacklink::Object {
bucket_id: hex::encode(&bucket_id),
key: key.to_string(),
},
};
versions.push(BlockVersion {
version_id: hex::encode(&br.version),
ref_deleted: br.deleted.get(),
version_deleted: v.deleted.get(),
garbage_collected: false,
backlink: Some(bl),
});
} else {
versions.push(BlockVersion {
version_id: hex::encode(&br.version),
ref_deleted: br.deleted.get(),
version_deleted: true,
garbage_collected: true,
backlink: None,
});
}
}
Ok(LocalGetBlockInfoResponse {
block_hash: hex::encode(&hash),
refcount,
versions,
})
}
}
impl RequestHandler for LocalRetryBlockResyncRequest {
type Response = LocalRetryBlockResyncResponse;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<LocalRetryBlockResyncResponse, Error> {
match self {
Self::All { all: true } => {
let blocks = garage.block_manager.list_resync_errors()?;
for b in blocks.iter() {
garage.block_manager.resync.clear_backoff(&b.hash)?;
}
Ok(LocalRetryBlockResyncResponse {
count: blocks.len() as u64,
})
}
Self::All { all: false } => Err(Error::bad_request("nonsense")),
Self::Blocks { block_hashes } => {
for hash in block_hashes.iter() {
let hash = hex::decode(hash).ok_or_bad_request("invalid hash")?;
let hash = Hash::try_from(&hash).ok_or_bad_request("invalid hash")?;
garage.block_manager.resync.clear_backoff(&hash)?;
}
Ok(LocalRetryBlockResyncResponse {
count: block_hashes.len() as u64,
})
}
}
}
}
impl RequestHandler for LocalPurgeBlocksRequest {
type Response = LocalPurgeBlocksResponse;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<LocalPurgeBlocksResponse, Error> {
let mut obj_dels = 0;
let mut mpu_dels = 0;
let mut ver_dels = 0;
for hash in self.0.iter() {
let hash = hex::decode(hash).ok_or_bad_request("invalid hash")?;
let hash = Hash::try_from(&hash).ok_or_bad_request("invalid hash")?;
let block_refs = garage
.block_ref_table
.get_range(&hash, None, None, 10000, Default::default())
.await?;
for br in block_refs {
if let Some(version) = garage.version_table.get(&br.version, &EmptyKey).await? {
handle_block_purge_version_backlink(
garage,
&version,
&mut obj_dels,
&mut mpu_dels,
)
.await?;
if !version.deleted.get() {
let deleted_version = Version::new(version.uuid, version.backlink, true);
garage.version_table.insert(&deleted_version).await?;
ver_dels += 1;
}
}
}
}
Ok(LocalPurgeBlocksResponse {
blocks_purged: self.0.len() as u64,
versions_deleted: ver_dels,
objects_deleted: obj_dels,
uploads_deleted: mpu_dels,
})
}
}
fn find_block_hash_by_prefix(garage: &Arc<Garage>, prefix: &str) -> Result<Hash, Error> {
if prefix.len() < 4 {
return Err(Error::bad_request(
"Please specify at least 4 characters of the block hash",
));
}
let prefix_bin = hex::decode(&prefix[..prefix.len() & !1]).ok_or_bad_request("invalid hash")?;
let iter = garage
.block_ref_table
.data
.store
.range(&prefix_bin[..]..)
.map_err(GarageError::from)?;
let mut found = None;
for item in iter {
let (k, _v) = item.map_err(GarageError::from)?;
let hash = Hash::try_from(&k[..32]).unwrap();
if &hash.as_slice()[..prefix_bin.len()] != prefix_bin {
break;
}
if hex::encode(hash.as_slice()).starts_with(prefix) {
match &found {
Some(x) if *x == hash => (),
Some(_) => {
return Err(Error::bad_request(format!(
"Several blocks match prefix `{}`",
prefix
)));
}
None => {
found = Some(hash);
}
}
}
}
found.ok_or_else(|| Error::NoSuchBlock(prefix.to_string()))
}
async fn handle_block_purge_version_backlink(
garage: &Arc<Garage>,
version: &Version,
obj_dels: &mut u64,
mpu_dels: &mut u64,
) -> Result<(), Error> {
let (bucket_id, key, ov_id) = match &version.backlink {
VersionBacklink::Object { bucket_id, key } => (*bucket_id, key.clone(), version.uuid),
VersionBacklink::MultipartUpload { upload_id } => {
if let Some(mut mpu) = garage.mpu_table.get(upload_id, &EmptyKey).await? {
if !mpu.deleted.get() {
mpu.parts.clear();
mpu.deleted.set();
garage.mpu_table.insert(&mpu).await?;
*mpu_dels += 1;
}
(mpu.bucket_id, mpu.key.clone(), *upload_id)
} else {
return Ok(());
}
}
};
if let Some(object) = garage.object_table.get(&bucket_id, &key).await? {
let ov = object.versions().iter().rev().find(|v| v.is_complete());
if let Some(ov) = ov {
if ov.uuid == ov_id {
let del_uuid = gen_uuid();
let deleted_object = Object::new(
bucket_id,
key,
vec![ObjectVersion {
uuid: del_uuid,
timestamp: ov.timestamp + 1,
state: ObjectVersionState::Complete(ObjectVersionData::DeleteMarker),
}],
);
garage.object_table.insert(&deleted_object).await?;
*obj_dels += 1;
}
}
}
Ok(())
}

File diff suppressed because it is too large Load diff

View file

@ -1,276 +1,411 @@
use std::collections::HashMap;
use std::fmt::Write;
use std::net::SocketAddr;
use std::sync::Arc;
use format_table::format_table_to_string;
use hyper::{body::Incoming as IncomingBody, Request, Response};
use serde::{Deserialize, Serialize};
use garage_util::crdt::*;
use garage_util::data::*;
use garage_rpc::layout;
use garage_rpc::layout::PARTITION_BITS;
use garage_model::garage::Garage;
use crate::api::*;
use garage_api_common::helpers::{json_ok_response, parse_json_body};
use crate::api_server::ResBody;
use crate::error::*;
use crate::{Admin, RequestHandler};
impl RequestHandler for GetClusterStatusRequest {
type Response = GetClusterStatusResponse;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<GetClusterStatusResponse, Error> {
let layout = garage.system.cluster_layout();
let mut nodes = garage
.system
.get_known_nodes()
.into_iter()
.map(|i| {
(
i.id,
NodeResp {
id: hex::encode(i.id),
garage_version: i.status.garage_version,
addr: i.addr,
hostname: i.status.hostname,
is_up: i.is_up,
last_seen_secs_ago: i.last_seen_secs_ago,
data_partition: i.status.data_disk_avail.map(|(avail, total)| {
FreeSpaceResp {
available: avail,
total,
}
pub async fn handle_get_cluster_status(garage: &Arc<Garage>) -> Result<Response<ResBody>, Error> {
let layout = garage.system.cluster_layout();
let mut nodes = garage
.system
.get_known_nodes()
.into_iter()
.map(|i| {
(
i.id,
NodeResp {
id: hex::encode(i.id),
addr: i.addr,
hostname: i.status.hostname,
is_up: i.is_up,
last_seen_secs_ago: i.last_seen_secs_ago,
data_partition: i
.status
.data_disk_avail
.map(|(avail, total)| FreeSpaceResp {
available: avail,
total,
}),
metadata_partition: i.status.meta_disk_avail.map(|(avail, total)| {
FreeSpaceResp {
available: avail,
total,
}
}),
..Default::default()
},
)
})
.collect::<HashMap<_, _>>();
metadata_partition: i.status.meta_disk_avail.map(|(avail, total)| {
FreeSpaceResp {
available: avail,
total,
}
}),
..Default::default()
},
)
})
.collect::<HashMap<_, _>>();
for (id, _, role) in layout.current().roles.items().iter() {
for (id, _, role) in layout.current().roles.items().iter() {
if let layout::NodeRoleV(Some(r)) = role {
let role = NodeRoleResp {
id: hex::encode(id),
zone: r.zone.to_string(),
capacity: r.capacity,
tags: r.tags.clone(),
};
match nodes.get_mut(id) {
None => {
nodes.insert(
*id,
NodeResp {
id: hex::encode(id),
role: Some(role),
..Default::default()
},
);
}
Some(n) => {
n.role = Some(role);
}
}
}
}
for ver in layout.versions().iter().rev().skip(1) {
for (id, _, role) in ver.roles.items().iter() {
if let layout::NodeRoleV(Some(r)) = role {
let role = NodeAssignedRole {
zone: r.zone.to_string(),
capacity: r.capacity,
tags: r.tags.clone(),
};
match nodes.get_mut(id) {
None => {
if r.capacity.is_some() {
if let Some(n) = nodes.get_mut(id) {
if n.role.is_none() {
n.draining = true;
}
} else {
nodes.insert(
*id,
NodeResp {
id: hex::encode(id),
role: Some(role),
draining: true,
..Default::default()
},
);
}
Some(n) => {
n.role = Some(role);
}
}
}
}
}
for ver in layout.versions().iter().rev().skip(1) {
for (id, _, role) in ver.roles.items().iter() {
if let layout::NodeRoleV(Some(r)) = role {
if r.capacity.is_some() {
if let Some(n) = nodes.get_mut(id) {
if n.role.is_none() {
n.draining = true;
}
} else {
nodes.insert(
*id,
NodeResp {
id: hex::encode(id),
draining: true,
..Default::default()
},
);
}
}
}
}
}
let mut nodes = nodes.into_values().collect::<Vec<_>>();
nodes.sort_by(|x, y| x.id.cmp(&y.id));
let mut nodes = nodes.into_values().collect::<Vec<_>>();
nodes.sort_by(|x, y| x.id.cmp(&y.id));
let res = GetClusterStatusResponse {
node: hex::encode(garage.system.id),
garage_version: garage_util::version::garage_version(),
garage_features: garage_util::version::garage_features(),
rust_version: garage_util::version::rust_version(),
db_engine: garage.db.engine(),
layout_version: layout.current().version,
nodes,
};
Ok(GetClusterStatusResponse {
layout_version: layout.current().version,
nodes,
Ok(json_ok_response(&res)?)
}
pub async fn handle_get_cluster_health(garage: &Arc<Garage>) -> Result<Response<ResBody>, Error> {
use garage_rpc::system::ClusterHealthStatus;
let health = garage.system.health();
let health = ClusterHealth {
status: match health.status {
ClusterHealthStatus::Healthy => "healthy",
ClusterHealthStatus::Degraded => "degraded",
ClusterHealthStatus::Unavailable => "unavailable",
},
known_nodes: health.known_nodes,
connected_nodes: health.connected_nodes,
storage_nodes: health.storage_nodes,
storage_nodes_ok: health.storage_nodes_ok,
partitions: health.partitions,
partitions_quorum: health.partitions_quorum,
partitions_all_ok: health.partitions_all_ok,
};
Ok(json_ok_response(&health)?)
}
pub async fn handle_connect_cluster_nodes(
garage: &Arc<Garage>,
req: Request<IncomingBody>,
) -> Result<Response<ResBody>, Error> {
let req = parse_json_body::<Vec<String>, _, Error>(req).await?;
let res = futures::future::join_all(req.iter().map(|node| garage.system.connect(node)))
.await
.into_iter()
.map(|r| match r {
Ok(()) => ConnectClusterNodesResponse {
success: true,
error: None,
},
Err(e) => ConnectClusterNodesResponse {
success: false,
error: Some(format!("{}", e)),
},
})
.collect::<Vec<_>>();
Ok(json_ok_response(&res)?)
}
pub async fn handle_get_cluster_layout(garage: &Arc<Garage>) -> Result<Response<ResBody>, Error> {
let res = format_cluster_layout(garage.system.cluster_layout().inner());
Ok(json_ok_response(&res)?)
}
fn format_cluster_layout(layout: &layout::LayoutHistory) -> GetClusterLayoutResponse {
let roles = layout
.current()
.roles
.items()
.iter()
.filter_map(|(k, _, v)| v.0.clone().map(|x| (k, x)))
.map(|(k, v)| NodeRoleResp {
id: hex::encode(k),
zone: v.zone.clone(),
capacity: v.capacity,
tags: v.tags.clone(),
})
.collect::<Vec<_>>();
let staged_role_changes = layout
.staging
.get()
.roles
.items()
.iter()
.filter(|(k, _, v)| layout.current().roles.get(k) != Some(v))
.map(|(k, _, v)| match &v.0 {
None => NodeRoleChange {
id: hex::encode(k),
action: NodeRoleChangeEnum::Remove { remove: true },
},
Some(r) => NodeRoleChange {
id: hex::encode(k),
action: NodeRoleChangeEnum::Update {
zone: r.zone.clone(),
capacity: r.capacity,
tags: r.tags.clone(),
},
},
})
.collect::<Vec<_>>();
GetClusterLayoutResponse {
version: layout.current().version,
roles,
staged_role_changes,
}
}
impl RequestHandler for GetClusterHealthRequest {
type Response = GetClusterHealthResponse;
// ----
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<GetClusterHealthResponse, Error> {
use garage_rpc::system::ClusterHealthStatus;
let health = garage.system.health();
let health = GetClusterHealthResponse {
status: match health.status {
ClusterHealthStatus::Healthy => "healthy",
ClusterHealthStatus::Degraded => "degraded",
ClusterHealthStatus::Unavailable => "unavailable",
}
.to_string(),
known_nodes: health.known_nodes,
connected_nodes: health.connected_nodes,
storage_nodes: health.storage_nodes,
storage_nodes_ok: health.storage_nodes_ok,
partitions: health.partitions,
partitions_quorum: health.partitions_quorum,
partitions_all_ok: health.partitions_all_ok,
#[derive(Debug, Clone, Copy, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct ClusterHealth {
status: &'static str,
known_nodes: usize,
connected_nodes: usize,
storage_nodes: usize,
storage_nodes_ok: usize,
partitions: usize,
partitions_quorum: usize,
partitions_all_ok: usize,
}
#[derive(Serialize)]
#[serde(rename_all = "camelCase")]
struct GetClusterStatusResponse {
node: String,
garage_version: &'static str,
garage_features: Option<&'static [&'static str]>,
rust_version: &'static str,
db_engine: String,
layout_version: u64,
nodes: Vec<NodeResp>,
}
#[derive(Serialize)]
#[serde(rename_all = "camelCase")]
struct ApplyClusterLayoutResponse {
message: Vec<String>,
layout: GetClusterLayoutResponse,
}
#[derive(Serialize)]
#[serde(rename_all = "camelCase")]
struct ConnectClusterNodesResponse {
success: bool,
error: Option<String>,
}
#[derive(Serialize)]
#[serde(rename_all = "camelCase")]
struct GetClusterLayoutResponse {
version: u64,
roles: Vec<NodeRoleResp>,
staged_role_changes: Vec<NodeRoleChange>,
}
#[derive(Serialize)]
#[serde(rename_all = "camelCase")]
struct NodeRoleResp {
id: String,
zone: String,
capacity: Option<u64>,
tags: Vec<String>,
}
#[derive(Serialize, Default)]
#[serde(rename_all = "camelCase")]
struct FreeSpaceResp {
available: u64,
total: u64,
}
#[derive(Serialize, Default)]
#[serde(rename_all = "camelCase")]
struct NodeResp {
id: String,
role: Option<NodeRoleResp>,
addr: Option<SocketAddr>,
hostname: Option<String>,
is_up: bool,
last_seen_secs_ago: Option<u64>,
draining: bool,
#[serde(skip_serializing_if = "Option::is_none")]
data_partition: Option<FreeSpaceResp>,
#[serde(skip_serializing_if = "Option::is_none")]
metadata_partition: Option<FreeSpaceResp>,
}
// ---- update functions ----
pub async fn handle_update_cluster_layout(
garage: &Arc<Garage>,
req: Request<IncomingBody>,
) -> Result<Response<ResBody>, Error> {
let updates = parse_json_body::<UpdateClusterLayoutRequest, _, Error>(req).await?;
let mut layout = garage.system.cluster_layout().inner().clone();
let mut roles = layout.current().roles.clone();
roles.merge(&layout.staging.get().roles);
for change in updates {
let node = hex::decode(&change.id).ok_or_bad_request("Invalid node identifier")?;
let node = Uuid::try_from(&node).ok_or_bad_request("Invalid node identifier")?;
let new_role = match change.action {
NodeRoleChangeEnum::Remove { remove: true } => None,
NodeRoleChangeEnum::Update {
zone,
capacity,
tags,
} => Some(layout::NodeRole {
zone,
capacity,
tags,
}),
_ => return Err(Error::bad_request("Invalid layout change")),
};
Ok(health)
layout
.staging
.get_mut()
.roles
.merge(&roles.update_mutator(node, layout::NodeRoleV(new_role)));
}
garage
.system
.layout_manager
.update_cluster_layout(&layout)
.await?;
let res = format_cluster_layout(&layout);
Ok(json_ok_response(&res)?)
}
impl RequestHandler for GetClusterStatisticsRequest {
type Response = GetClusterStatisticsResponse;
pub async fn handle_apply_cluster_layout(
garage: &Arc<Garage>,
req: Request<IncomingBody>,
) -> Result<Response<ResBody>, Error> {
let param = parse_json_body::<ApplyLayoutRequest, _, Error>(req).await?;
// FIXME: return this as a JSON struct instead of text
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<GetClusterStatisticsResponse, Error> {
let mut ret = String::new();
let layout = garage.system.cluster_layout().inner().clone();
let (layout, msg) = layout.apply_staged_changes(Some(param.version))?;
// Gather storage node and free space statistics for current nodes
let layout = &garage.system.cluster_layout();
let mut node_partition_count = HashMap::<Uuid, u64>::new();
for short_id in layout.current().ring_assignment_data.iter() {
let id = layout.current().node_id_vec[*short_id as usize];
*node_partition_count.entry(id).or_default() += 1;
}
let node_info = garage
.system
.get_known_nodes()
.into_iter()
.map(|n| (n.id, n))
.collect::<HashMap<_, _>>();
garage
.system
.layout_manager
.update_cluster_layout(&layout)
.await?;
let mut table = vec![" ID\tHostname\tZone\tCapacity\tPart.\tDataAvail\tMetaAvail".into()];
for (id, parts) in node_partition_count.iter() {
let info = node_info.get(id);
let status = info.map(|x| &x.status);
let role = layout.current().roles.get(id).and_then(|x| x.0.as_ref());
let hostname = status.and_then(|x| x.hostname.as_deref()).unwrap_or("?");
let zone = role.map(|x| x.zone.as_str()).unwrap_or("?");
let capacity = role
.map(|x| x.capacity_string())
.unwrap_or_else(|| "?".into());
let avail_str = |x| match x {
Some((avail, total)) => {
let pct = (avail as f64) / (total as f64) * 100.;
let avail = bytesize::ByteSize::b(avail);
let total = bytesize::ByteSize::b(total);
format!("{}/{} ({:.1}%)", avail, total, pct)
}
None => "?".into(),
};
let data_avail = avail_str(status.and_then(|x| x.data_disk_avail));
let meta_avail = avail_str(status.and_then(|x| x.meta_disk_avail));
table.push(format!(
" {:?}\t{}\t{}\t{}\t{}\t{}\t{}",
id, hostname, zone, capacity, parts, data_avail, meta_avail
));
}
write!(
&mut ret,
"Storage nodes:\n{}",
format_table_to_string(table)
)
.unwrap();
let meta_part_avail = node_partition_count
.iter()
.filter_map(|(id, parts)| {
node_info
.get(id)
.and_then(|x| x.status.meta_disk_avail)
.map(|c| c.0 / *parts)
})
.collect::<Vec<_>>();
let data_part_avail = node_partition_count
.iter()
.filter_map(|(id, parts)| {
node_info
.get(id)
.and_then(|x| x.status.data_disk_avail)
.map(|c| c.0 / *parts)
})
.collect::<Vec<_>>();
if !meta_part_avail.is_empty() && !data_part_avail.is_empty() {
let meta_avail =
bytesize::ByteSize(meta_part_avail.iter().min().unwrap() * (1 << PARTITION_BITS));
let data_avail =
bytesize::ByteSize(data_part_avail.iter().min().unwrap() * (1 << PARTITION_BITS));
writeln!(
&mut ret,
"\nEstimated available storage space cluster-wide (might be lower in practice):"
)
.unwrap();
if meta_part_avail.len() < node_partition_count.len()
|| data_part_avail.len() < node_partition_count.len()
{
ret += &format_table_to_string(vec![
format!(" data: < {}", data_avail),
format!(" metadata: < {}", meta_avail),
]);
writeln!(&mut ret, "A precise estimate could not be given as information is missing for some storage nodes.").unwrap();
} else {
ret += &format_table_to_string(vec![
format!(" data: {}", data_avail),
format!(" metadata: {}", meta_avail),
]);
}
}
Ok(GetClusterStatisticsResponse { freeform: ret })
}
let res = ApplyClusterLayoutResponse {
message: msg,
layout: format_cluster_layout(&layout),
};
Ok(json_ok_response(&res)?)
}
impl RequestHandler for ConnectClusterNodesRequest {
type Response = ConnectClusterNodesResponse;
pub async fn handle_revert_cluster_layout(
garage: &Arc<Garage>,
) -> Result<Response<ResBody>, Error> {
let layout = garage.system.cluster_layout().inner().clone();
let layout = layout.revert_staged_changes()?;
garage
.system
.layout_manager
.update_cluster_layout(&layout)
.await?;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<ConnectClusterNodesResponse, Error> {
let res = futures::future::join_all(self.0.iter().map(|node| garage.system.connect(node)))
.await
.into_iter()
.map(|r| match r {
Ok(()) => ConnectNodeResponse {
success: true,
error: None,
},
Err(e) => ConnectNodeResponse {
success: false,
error: Some(format!("{}", e)),
},
})
.collect::<Vec<_>>();
Ok(ConnectClusterNodesResponse(res))
}
let res = format_cluster_layout(&layout);
Ok(json_ok_response(&res)?)
}
// ----
type UpdateClusterLayoutRequest = Vec<NodeRoleChange>;
#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct ApplyLayoutRequest {
version: u64,
}
// ----
#[derive(Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
struct NodeRoleChange {
id: String,
#[serde(flatten)]
action: NodeRoleChangeEnum,
}
#[derive(Serialize, Deserialize)]
#[serde(untagged)]
enum NodeRoleChangeEnum {
#[serde(rename_all = "camelCase")]
Remove { remove: bool },
#[serde(rename_all = "camelCase")]
Update {
zone: String,
capacity: Option<u64>,
tags: Vec<String>,
},
}

View file

@ -1,8 +1,8 @@
use std::convert::TryFrom;
use err_derive::Error;
use hyper::header::HeaderValue;
use hyper::{HeaderMap, StatusCode};
use thiserror::Error;
pub use garage_model::helper::error::Error as HelperError;
@ -16,36 +16,17 @@ use garage_api_common::helpers::*;
/// Errors of this crate
#[derive(Debug, Error)]
pub enum Error {
#[error(display = "{}", _0)]
#[error("{0}")]
/// Error from common error
Common(#[error(source)] CommonError),
Common(#[from] CommonError),
// Category: cannot process
/// The admin API token does not exist
#[error(display = "Admin token not found: {}", _0)]
NoSuchAdminToken(String),
/// The API access key does not exist
#[error(display = "Access key not found: {}", _0)]
#[error("Access key not found: {0}")]
NoSuchAccessKey(String),
/// The requested block does not exist
#[error(display = "Block not found: {}", _0)]
NoSuchBlock(String),
/// The requested worker does not exist
#[error(display = "Worker not found: {}", _0)]
NoSuchWorker(u64),
/// The object requested don't exists
#[error(display = "Key not found")]
NoSuchKey,
/// In Import key, the key already exists
#[error(
display = "Key {} already exists in data store. Even if it is deleted, we can't let you create a new key with the same ID. Sorry.",
_0
)]
#[error("Key {0} already exists in data store. Even if it is deleted, we can't let you create a new key with the same ID. Sorry.")]
KeyAlreadyExists(String),
}
@ -65,15 +46,11 @@ impl From<HelperError> for Error {
}
impl Error {
pub fn code(&self) -> &'static str {
fn code(&self) -> &'static str {
match self {
Error::Common(c) => c.aws_code(),
Error::NoSuchAdminToken(_) => "NoSuchAdminToken",
Error::NoSuchAccessKey(_) => "NoSuchAccessKey",
Error::NoSuchWorker(_) => "NoSuchWorker",
Error::NoSuchBlock(_) => "NoSuchBlock",
Error::KeyAlreadyExists(_) => "KeyAlreadyExists",
Error::NoSuchKey => "NoSuchKey",
}
}
}
@ -83,11 +60,7 @@ impl ApiError for Error {
fn http_status_code(&self) -> StatusCode {
match self {
Error::Common(c) => c.http_status_code(),
Error::NoSuchAdminToken(_)
| Error::NoSuchAccessKey(_)
| Error::NoSuchWorker(_)
| Error::NoSuchBlock(_)
| Error::NoSuchKey => StatusCode::NOT_FOUND,
Error::NoSuchAccessKey(_) => StatusCode::NOT_FOUND,
Error::KeyAlreadyExists(_) => StatusCode::CONFLICT,
}
}

View file

@ -1,190 +1,173 @@
use std::collections::HashMap;
use std::sync::Arc;
use chrono::DateTime;
use hyper::{body::Incoming as IncomingBody, Request, Response, StatusCode};
use serde::{Deserialize, Serialize};
use garage_table::*;
use garage_util::time::now_msec;
use garage_model::garage::Garage;
use garage_model::key_table::*;
use crate::api::*;
use garage_api_common::helpers::*;
use crate::api_server::ResBody;
use crate::error::*;
use crate::{Admin, RequestHandler};
impl RequestHandler for ListKeysRequest {
type Response = ListKeysResponse;
async fn handle(self, garage: &Arc<Garage>, _admin: &Admin) -> Result<ListKeysResponse, Error> {
let now = now_msec();
let res = garage
.key_table
.get_range(
&EmptyKey,
None,
Some(KeyFilter::Deleted(DeletedFilter::NotDeleted)),
10000,
EnumerationOrder::Forward,
)
.await?
.iter()
.map(|k| {
let p = k.params().unwrap();
ListKeysResponseItem {
id: k.key_id.to_string(),
name: p.name.get().clone(),
created: p.created.map(|x| {
DateTime::from_timestamp_millis(x as i64)
.expect("invalid timestamp stored in db")
}),
expiration: p.expiration.get().map(|x| {
DateTime::from_timestamp_millis(x as i64)
.expect("invalid timestamp stored in db")
}),
expired: p.is_expired(now),
}
})
.collect::<Vec<_>>();
Ok(ListKeysResponse(res))
}
}
impl RequestHandler for GetKeyInfoRequest {
type Response = GetKeyInfoResponse;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<GetKeyInfoResponse, Error> {
let key = match (self.id, self.search) {
(Some(id), None) => garage.key_helper().get_existing_key(&id).await?,
(None, Some(search)) => {
let candidates = garage
.key_table
.get_range(
&EmptyKey,
None,
Some(KeyFilter::MatchesAndNotDeleted(search.to_string())),
10,
EnumerationOrder::Forward,
)
.await?
.into_iter()
.collect::<Vec<_>>();
if candidates.len() != 1 {
return Err(Error::bad_request(format!(
"{} matching keys",
candidates.len()
)));
}
candidates.into_iter().next().unwrap()
}
_ => {
return Err(Error::bad_request(
"Either id or search must be provided (but not both)",
));
}
};
Ok(key_info_results(garage, key, self.show_secret_key).await?)
}
}
impl RequestHandler for CreateKeyRequest {
type Response = CreateKeyResponse;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<CreateKeyResponse, Error> {
let mut key = Key::new("Unnamed key");
apply_key_updates(&mut key, self.0)?;
garage.key_table.insert(&key).await?;
Ok(CreateKeyResponse(
key_info_results(garage, key, true).await?,
))
}
}
impl RequestHandler for ImportKeyRequest {
type Response = ImportKeyResponse;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<ImportKeyResponse, Error> {
let prev_key = garage.key_table.get(&EmptyKey, &self.access_key_id).await?;
if prev_key.is_some() {
return Err(Error::KeyAlreadyExists(self.access_key_id.to_string()));
}
let imported_key = Key::import(
&self.access_key_id,
&self.secret_access_key,
self.name.as_deref().unwrap_or("Imported key"),
pub async fn handle_list_keys(garage: &Arc<Garage>) -> Result<Response<ResBody>, Error> {
let res = garage
.key_table
.get_range(
&EmptyKey,
None,
Some(KeyFilter::Deleted(DeletedFilter::NotDeleted)),
10000,
EnumerationOrder::Forward,
)
.ok_or_bad_request("Invalid key format")?;
garage.key_table.insert(&imported_key).await?;
.await?
.iter()
.map(|k| ListKeyResultItem {
id: k.key_id.to_string(),
name: k.params().unwrap().name.get().clone(),
})
.collect::<Vec<_>>();
Ok(ImportKeyResponse(
key_info_results(garage, imported_key, false).await?,
))
}
Ok(json_ok_response(&res)?)
}
impl RequestHandler for UpdateKeyRequest {
type Response = UpdateKeyResponse;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<UpdateKeyResponse, Error> {
let mut key = garage.key_helper().get_existing_key(&self.id).await?;
apply_key_updates(&mut key, self.body)?;
garage.key_table.insert(&key).await?;
Ok(UpdateKeyResponse(
key_info_results(garage, key, false).await?,
))
}
#[derive(Serialize)]
#[serde(rename_all = "camelCase")]
struct ListKeyResultItem {
id: String,
name: String,
}
impl RequestHandler for DeleteKeyRequest {
type Response = DeleteKeyResponse;
pub async fn handle_get_key_info(
garage: &Arc<Garage>,
id: Option<String>,
search: Option<String>,
show_secret_key: bool,
) -> Result<Response<ResBody>, Error> {
let key = if let Some(id) = id {
garage.key_helper().get_existing_key(&id).await?
} else if let Some(search) = search {
garage
.key_helper()
.get_existing_matching_key(&search)
.await?
} else {
unreachable!();
};
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<DeleteKeyResponse, Error> {
let helper = garage.locked_helper().await;
key_info_results(garage, key, show_secret_key).await
}
let mut key = helper.key().get_existing_key(&self.id).await?;
pub async fn handle_create_key(
garage: &Arc<Garage>,
req: Request<IncomingBody>,
) -> Result<Response<ResBody>, Error> {
let req = parse_json_body::<CreateKeyRequest, _, Error>(req).await?;
helper.delete_key(&mut key).await?;
let key = Key::new(req.name.as_deref().unwrap_or("Unnamed key"));
garage.key_table.insert(&key).await?;
Ok(DeleteKeyResponse)
key_info_results(garage, key, true).await
}
#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct CreateKeyRequest {
name: Option<String>,
}
pub async fn handle_import_key(
garage: &Arc<Garage>,
req: Request<IncomingBody>,
) -> Result<Response<ResBody>, Error> {
let req = parse_json_body::<ImportKeyRequest, _, Error>(req).await?;
let prev_key = garage.key_table.get(&EmptyKey, &req.access_key_id).await?;
if prev_key.is_some() {
return Err(Error::KeyAlreadyExists(req.access_key_id.to_string()));
}
let imported_key = Key::import(
&req.access_key_id,
&req.secret_access_key,
req.name.as_deref().unwrap_or("Imported key"),
)
.ok_or_bad_request("Invalid key format")?;
garage.key_table.insert(&imported_key).await?;
key_info_results(garage, imported_key, false).await
}
#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct ImportKeyRequest {
access_key_id: String,
secret_access_key: String,
name: Option<String>,
}
pub async fn handle_update_key(
garage: &Arc<Garage>,
id: String,
req: Request<IncomingBody>,
) -> Result<Response<ResBody>, Error> {
let req = parse_json_body::<UpdateKeyRequest, _, Error>(req).await?;
let mut key = garage.key_helper().get_existing_key(&id).await?;
let key_state = key.state.as_option_mut().unwrap();
if let Some(new_name) = req.name {
key_state.name.update(new_name);
}
if let Some(allow) = req.allow {
if allow.create_bucket {
key_state.allow_create_bucket.update(true);
}
}
if let Some(deny) = req.deny {
if deny.create_bucket {
key_state.allow_create_bucket.update(false);
}
}
garage.key_table.insert(&key).await?;
key_info_results(garage, key, false).await
}
#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct UpdateKeyRequest {
name: Option<String>,
allow: Option<KeyPerm>,
deny: Option<KeyPerm>,
}
pub async fn handle_delete_key(
garage: &Arc<Garage>,
id: String,
) -> Result<Response<ResBody>, Error> {
let helper = garage.locked_helper().await;
let mut key = helper.key().get_existing_key(&id).await?;
helper.delete_key(&mut key).await?;
Ok(Response::builder()
.status(StatusCode::NO_CONTENT)
.body(empty_body())?)
}
async fn key_info_results(
garage: &Arc<Garage>,
key: Key,
show_secret: bool,
) -> Result<GetKeyInfoResponse, Error> {
) -> Result<Response<ResBody>, Error> {
let mut relevant_buckets = HashMap::new();
let key_state = key.state.as_option().unwrap();
@ -210,15 +193,8 @@ async fn key_info_results(
}
}
let res = GetKeyInfoResponse {
let res = GetKeyInfoResult {
name: key_state.name.get().clone(),
created: key_state.created.map(|x| {
DateTime::from_timestamp_millis(x as i64).expect("invalid timestamp stored in db")
}),
expiration: key_state.expiration.get().map(|x| {
DateTime::from_timestamp_millis(x as i64).expect("invalid timestamp stored in db")
}),
expired: key_state.is_expired(now_msec()),
access_key_id: key.key_id.clone(),
secret_access_key: if show_secret {
Some(key_state.secret_key.clone())
@ -232,7 +208,7 @@ async fn key_info_results(
.into_values()
.map(|bucket| {
let state = bucket.state.as_option().unwrap();
KeyInfoBucketResponse {
KeyInfoBucketResult {
id: hex::encode(bucket.id),
global_aliases: state
.aliases
@ -262,39 +238,43 @@ async fn key_info_results(
.collect::<Vec<_>>(),
};
Ok(res)
Ok(json_ok_response(&res)?)
}
fn apply_key_updates(key: &mut Key, updates: UpdateKeyRequestBody) -> Result<(), Error> {
if updates.never_expires && updates.expiration.is_some() {
return Err(Error::bad_request(
"cannot specify `expiration` and `never_expires`",
));
}
let key_state = key.state.as_option_mut().unwrap();
if let Some(new_name) = updates.name {
key_state.name.update(new_name);
}
if let Some(expiration) = updates.expiration {
key_state
.expiration
.update(Some(expiration.timestamp_millis() as u64));
}
if updates.never_expires {
key_state.expiration.update(None);
}
if let Some(allow) = updates.allow {
if allow.create_bucket {
key_state.allow_create_bucket.update(true);
}
}
if let Some(deny) = updates.deny {
if deny.create_bucket {
key_state.allow_create_bucket.update(false);
}
}
Ok(())
#[derive(Serialize)]
#[serde(rename_all = "camelCase")]
struct GetKeyInfoResult {
name: String,
access_key_id: String,
#[serde(skip_serializing_if = "is_default")]
secret_access_key: Option<String>,
permissions: KeyPerm,
buckets: Vec<KeyInfoBucketResult>,
}
#[derive(Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
struct KeyPerm {
#[serde(default)]
create_bucket: bool,
}
#[derive(Serialize)]
#[serde(rename_all = "camelCase")]
struct KeyInfoBucketResult {
id: String,
global_aliases: Vec<String>,
local_aliases: Vec<String>,
permissions: ApiBucketKeyPerm,
}
#[derive(Serialize, Deserialize, Default)]
#[serde(rename_all = "camelCase")]
pub(crate) struct ApiBucketKeyPerm {
#[serde(default)]
pub(crate) read: bool,
#[serde(default)]
pub(crate) write: bool,
#[serde(default)]
pub(crate) owner: bool,
}

View file

@ -1,406 +0,0 @@
use std::sync::Arc;
use garage_util::crdt::*;
use garage_util::data::*;
use garage_util::error::Error as GarageError;
use garage_rpc::layout;
use garage_model::garage::Garage;
use crate::api::*;
use crate::error::*;
use crate::{Admin, RequestHandler};
impl RequestHandler for GetClusterLayoutRequest {
type Response = GetClusterLayoutResponse;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<GetClusterLayoutResponse, Error> {
Ok(format_cluster_layout(
garage.system.cluster_layout().inner(),
))
}
}
fn format_cluster_layout(layout: &layout::LayoutHistory) -> GetClusterLayoutResponse {
let current = layout.current();
let roles = current
.roles
.items()
.iter()
.filter_map(|(k, _, v)| v.0.clone().map(|x| (k, x)))
.map(|(k, v)| {
let stored_partitions = current.get_node_usage(k).ok().map(|x| x as u64);
LayoutNodeRole {
id: hex::encode(k),
zone: v.zone.clone(),
capacity: v.capacity,
stored_partitions,
usable_capacity: stored_partitions.map(|x| x * current.partition_size),
tags: v.tags.clone(),
}
})
.collect::<Vec<_>>();
let staged_role_changes = layout
.staging
.get()
.roles
.items()
.iter()
.filter(|(k, _, v)| current.roles.get(k) != Some(v))
.map(|(k, _, v)| match &v.0 {
None => NodeRoleChange {
id: hex::encode(k),
action: NodeRoleChangeEnum::Remove { remove: true },
},
Some(r) => NodeRoleChange {
id: hex::encode(k),
action: NodeRoleChangeEnum::Update(NodeAssignedRole {
zone: r.zone.clone(),
capacity: r.capacity,
tags: r.tags.clone(),
}),
},
})
.collect::<Vec<_>>();
let staged_parameters = if *layout.staging.get().parameters.get() != current.parameters {
Some((*layout.staging.get().parameters.get()).into())
} else {
None
};
GetClusterLayoutResponse {
version: current.version,
roles,
partition_size: current.partition_size,
parameters: current.parameters.into(),
staged_role_changes,
staged_parameters,
}
}
impl RequestHandler for GetClusterLayoutHistoryRequest {
type Response = GetClusterLayoutHistoryResponse;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<GetClusterLayoutHistoryResponse, Error> {
let layout_helper = garage.system.cluster_layout();
let layout = layout_helper.inner();
let min_stored = layout.min_stored();
let versions = layout
.versions
.iter()
.rev()
.chain(layout.old_versions.iter().rev())
.map(|ver| {
let status = if ver.version == layout.current().version {
ClusterLayoutVersionStatus::Current
} else if ver.version >= min_stored {
ClusterLayoutVersionStatus::Draining
} else {
ClusterLayoutVersionStatus::Historical
};
ClusterLayoutVersion {
version: ver.version,
status,
storage_nodes: ver
.roles
.items()
.iter()
.filter(
|(_, _, x)| matches!(x, layout::NodeRoleV(Some(c)) if c.capacity.is_some()),
)
.count() as u64,
gateway_nodes: ver
.roles
.items()
.iter()
.filter(
|(_, _, x)| matches!(x, layout::NodeRoleV(Some(c)) if c.capacity.is_none()),
)
.count() as u64,
}
})
.collect::<Vec<_>>();
let all_nodes = layout.get_all_nodes();
let min_ack = layout_helper.ack_map_min();
let update_trackers = if layout.versions.len() > 1 {
Some(
all_nodes
.iter()
.map(|node| {
(
hex::encode(&node),
NodeUpdateTrackers {
ack: layout.update_trackers.ack_map.get(node, min_stored),
sync: layout.update_trackers.sync_map.get(node, min_stored),
sync_ack: layout.update_trackers.sync_ack_map.get(node, min_stored),
},
)
})
.collect(),
)
} else {
None
};
Ok(GetClusterLayoutHistoryResponse {
current_version: layout.current().version,
min_ack,
versions,
update_trackers,
})
}
}
// ----
// ---- update functions ----
impl RequestHandler for UpdateClusterLayoutRequest {
type Response = UpdateClusterLayoutResponse;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<UpdateClusterLayoutResponse, Error> {
let mut layout = garage.system.cluster_layout().inner().clone();
let mut roles = layout.current().roles.clone();
roles.merge(&layout.staging.get().roles);
for change in self.roles {
let node = hex::decode(&change.id).ok_or_bad_request("Invalid node identifier")?;
let node = Uuid::try_from(&node).ok_or_bad_request("Invalid node identifier")?;
let new_role = match change.action {
NodeRoleChangeEnum::Remove { remove: true } => None,
NodeRoleChangeEnum::Update(NodeAssignedRole {
zone,
capacity,
tags,
}) => {
if matches!(capacity, Some(cap) if cap < 1024) {
return Err(Error::bad_request("Capacity should be at least 1K (1024)"));
}
Some(layout::NodeRole {
zone,
capacity,
tags,
})
}
_ => return Err(Error::bad_request("Invalid layout change")),
};
layout
.staging
.get_mut()
.roles
.merge(&roles.update_mutator(node, layout::NodeRoleV(new_role)));
}
if let Some(param) = self.parameters {
if let ZoneRedundancy::AtLeast(r_int) = param.zone_redundancy {
if r_int > layout.current().replication_factor {
return Err(Error::bad_request(format!(
"The zone redundancy must be smaller or equal to the replication factor ({}).",
layout.current().replication_factor
)));
} else if r_int < 1 {
return Err(Error::bad_request(
"The zone redundancy must be at least 1.",
));
}
}
layout.staging.get_mut().parameters.update(param.into());
}
garage
.system
.layout_manager
.update_cluster_layout(&layout)
.await?;
let res = format_cluster_layout(&layout);
Ok(UpdateClusterLayoutResponse(res))
}
}
impl RequestHandler for PreviewClusterLayoutChangesRequest {
type Response = PreviewClusterLayoutChangesResponse;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<PreviewClusterLayoutChangesResponse, Error> {
let layout = garage.system.cluster_layout().inner().clone();
let new_ver = layout.current().version + 1;
match layout.apply_staged_changes(new_ver) {
Err(GarageError::Message(error)) => {
Ok(PreviewClusterLayoutChangesResponse::Error { error })
}
Err(e) => Err(e.into()),
Ok((new_layout, msg)) => Ok(PreviewClusterLayoutChangesResponse::Success {
message: msg,
new_layout: format_cluster_layout(&new_layout),
}),
}
}
}
impl RequestHandler for ApplyClusterLayoutRequest {
type Response = ApplyClusterLayoutResponse;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<ApplyClusterLayoutResponse, Error> {
let layout = garage.system.cluster_layout().inner().clone();
let (layout, msg) = layout.apply_staged_changes(self.version)?;
garage
.system
.layout_manager
.update_cluster_layout(&layout)
.await?;
Ok(ApplyClusterLayoutResponse {
message: msg,
layout: format_cluster_layout(&layout),
})
}
}
impl RequestHandler for RevertClusterLayoutRequest {
type Response = RevertClusterLayoutResponse;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<RevertClusterLayoutResponse, Error> {
let layout = garage.system.cluster_layout().inner().clone();
let layout = layout.revert_staged_changes()?;
garage
.system
.layout_manager
.update_cluster_layout(&layout)
.await?;
let res = format_cluster_layout(&layout);
Ok(RevertClusterLayoutResponse(res))
}
}
impl RequestHandler for ClusterLayoutSkipDeadNodesRequest {
type Response = ClusterLayoutSkipDeadNodesResponse;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<ClusterLayoutSkipDeadNodesResponse, Error> {
let status = garage.system.get_known_nodes();
let mut layout = garage.system.cluster_layout().inner().clone();
let mut ack_updated = vec![];
let mut sync_updated = vec![];
if layout.versions.len() == 1 {
return Err(Error::bad_request(
"This command cannot be called when there is only one live cluster layout version",
));
}
let min_v = layout.min_stored();
if self.version <= min_v || self.version > layout.current().version {
return Err(Error::bad_request(format!(
"Invalid version, you may use the following version numbers: {}",
(min_v + 1..=layout.current().version)
.map(|x| x.to_string())
.collect::<Vec<_>>()
.join(" ")
)));
}
let all_nodes = layout.get_all_nodes();
for node in all_nodes.iter() {
// Update ACK tracker for dead nodes or for all nodes if --allow-missing-data
if self.allow_missing_data || !status.iter().any(|x| x.id == *node && x.is_up) {
if layout.update_trackers.ack_map.set_max(*node, self.version) {
ack_updated.push(hex::encode(node));
}
}
// If --allow-missing-data, update SYNC tracker for all nodes.
if self.allow_missing_data {
if layout.update_trackers.sync_map.set_max(*node, self.version) {
sync_updated.push(hex::encode(node));
}
}
}
garage
.system
.layout_manager
.update_cluster_layout(&layout)
.await?;
Ok(ClusterLayoutSkipDeadNodesResponse {
ack_updated,
sync_updated,
})
}
}
// ----
impl From<layout::ZoneRedundancy> for ZoneRedundancy {
fn from(x: layout::ZoneRedundancy) -> Self {
match x {
layout::ZoneRedundancy::Maximum => ZoneRedundancy::Maximum,
layout::ZoneRedundancy::AtLeast(x) => ZoneRedundancy::AtLeast(x),
}
}
}
impl Into<layout::ZoneRedundancy> for ZoneRedundancy {
fn into(self) -> layout::ZoneRedundancy {
match self {
ZoneRedundancy::Maximum => layout::ZoneRedundancy::Maximum,
ZoneRedundancy::AtLeast(x) => layout::ZoneRedundancy::AtLeast(x),
}
}
}
impl From<layout::LayoutParameters> for LayoutParameters {
fn from(x: layout::LayoutParameters) -> Self {
LayoutParameters {
zone_redundancy: x.zone_redundancy.into(),
}
}
}
impl Into<layout::LayoutParameters> for LayoutParameters {
fn into(self) -> layout::LayoutParameters {
layout::LayoutParameters {
zone_redundancy: self.zone_redundancy.into(),
}
}
}

View file

@ -3,44 +3,9 @@ extern crate tracing;
pub mod api_server;
mod error;
mod macros;
pub mod api;
pub mod openapi;
mod router_v0;
mod router_v1;
mod router_v2;
mod admin_token;
mod bucket;
mod cluster;
mod key;
mod layout;
mod special;
mod block;
mod node;
mod repair;
mod worker;
use std::sync::Arc;
use garage_model::garage::Garage;
pub use api_server::AdminApiServer as Admin;
pub enum Authorization {
None,
MetricsToken,
AdminToken,
}
pub trait RequestHandler {
type Response;
fn handle(
self,
garage: &Arc<Garage>,
admin: &Admin,
) -> impl std::future::Future<Output = Result<Self::Response, error::Error>> + Send;
}

View file

@ -1,206 +0,0 @@
macro_rules! admin_endpoints {
[
$(@special $special_endpoint:ident,)*
$($endpoint:ident,)*
] => {
paste! {
#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum AdminApiRequest {
$(
$special_endpoint( [<$special_endpoint Request>] ),
)*
$(
$endpoint( [<$endpoint Request>] ),
)*
}
#[derive(Debug, Clone, Serialize)]
#[serde(untagged)]
pub enum AdminApiResponse {
$(
$endpoint( [<$endpoint Response>] ),
)*
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum TaggedAdminApiResponse {
$(
$endpoint( [<$endpoint Response>] ),
)*
}
impl AdminApiRequest {
pub fn name(&self) -> &'static str {
match self {
$(
Self::$special_endpoint(_) => stringify!($special_endpoint),
)*
$(
Self::$endpoint(_) => stringify!($endpoint),
)*
}
}
}
impl AdminApiResponse {
pub fn tagged(self) -> TaggedAdminApiResponse {
match self {
$(
Self::$endpoint(res) => TaggedAdminApiResponse::$endpoint(res),
)*
}
}
}
$(
impl From< [< $endpoint Request >] > for AdminApiRequest {
fn from(req: [< $endpoint Request >]) -> AdminApiRequest {
AdminApiRequest::$endpoint(req)
}
}
impl TryFrom<TaggedAdminApiResponse> for [< $endpoint Response >] {
type Error = TaggedAdminApiResponse;
fn try_from(resp: TaggedAdminApiResponse) -> Result< [< $endpoint Response >], TaggedAdminApiResponse> {
match resp {
TaggedAdminApiResponse::$endpoint(v) => Ok(v),
x => Err(x),
}
}
}
)*
impl RequestHandler for AdminApiRequest {
type Response = AdminApiResponse;
async fn handle(self, garage: &Arc<Garage>, admin: &Admin) -> Result<AdminApiResponse, Error> {
Ok(match self {
$(
AdminApiRequest::$special_endpoint(_) => panic!(
concat!(stringify!($special_endpoint), " needs to go through a special handler")
),
)*
$(
AdminApiRequest::$endpoint(req) => AdminApiResponse::$endpoint(req.handle(garage, admin).await?),
)*
})
}
}
}
};
}
macro_rules! local_admin_endpoints {
[
$($endpoint:ident,)*
] => {
paste! {
#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum LocalAdminApiRequest {
$(
$endpoint( [<Local $endpoint Request>] ),
)*
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum LocalAdminApiResponse {
$(
$endpoint( [<Local $endpoint Response>] ),
)*
}
$(
pub type [< $endpoint Request >] = MultiRequest< [< Local $endpoint Request >] >;
pub type [< $endpoint RequestBody >] = [< Local $endpoint Request >];
pub type [< $endpoint Response >] = MultiResponse< [< Local $endpoint Response >] >;
impl From< [< Local $endpoint Request >] > for LocalAdminApiRequest {
fn from(req: [< Local $endpoint Request >]) -> LocalAdminApiRequest {
LocalAdminApiRequest::$endpoint(req)
}
}
impl TryFrom<LocalAdminApiResponse> for [< Local $endpoint Response >] {
type Error = LocalAdminApiResponse;
fn try_from(resp: LocalAdminApiResponse) -> Result< [< Local $endpoint Response >], LocalAdminApiResponse> {
match resp {
LocalAdminApiResponse::$endpoint(v) => Ok(v),
x => Err(x),
}
}
}
impl RequestHandler for [< $endpoint Request >] {
type Response = [< $endpoint Response >];
async fn handle(self, garage: &Arc<Garage>, admin: &Admin) -> Result<Self::Response, Error> {
let to = find_matching_nodes(garage, self.node.as_str())?;
let resps = garage.system.rpc_helper().call_many(&admin.endpoint,
&to,
AdminRpc::Internal(self.body.into()),
RequestStrategy::with_priority(PRIO_NORMAL),
).await?;
let mut ret = [< $endpoint Response >] {
success: HashMap::new(),
error: HashMap::new(),
};
for (node, resp) in resps {
match resp {
Ok(AdminRpcResponse::InternalApiOkResponse(r)) => {
match [< Local $endpoint Response >]::try_from(r) {
Ok(r) => {
ret.success.insert(hex::encode(node), r);
}
Err(_) => {
ret.error.insert(hex::encode(node), "returned invalid value".to_string());
}
}
}
Ok(AdminRpcResponse::ApiErrorResponse{error_code, http_code, message}) => {
ret.error.insert(hex::encode(node), format!("{} ({}): {}", error_code, http_code, message));
}
Ok(_) => {
ret.error.insert(hex::encode(node), "returned invalid value".to_string());
}
Err(e) => {
ret.error.insert(hex::encode(node), e.to_string());
}
}
}
Ok(ret)
}
}
)*
impl LocalAdminApiRequest {
pub fn name(&self) -> &'static str {
match self {
$(
Self::$endpoint(_) => stringify!($endpoint),
)*
}
}
}
impl RequestHandler for LocalAdminApiRequest {
type Response = LocalAdminApiResponse;
async fn handle(self, garage: &Arc<Garage>, admin: &Admin) -> Result<LocalAdminApiResponse, Error> {
Ok(match self {
$(
LocalAdminApiRequest::$endpoint(req) => LocalAdminApiResponse::$endpoint(req.handle(garage, admin).await?),
)*
})
}
}
}
};
}
pub(crate) use admin_endpoints;
pub(crate) use local_admin_endpoints;

View file

@ -1,144 +0,0 @@
use std::fmt::Write;
use std::sync::Arc;
use format_table::format_table_to_string;
use garage_util::error::Error as GarageError;
use garage_table::replication::*;
use garage_table::*;
use garage_model::garage::Garage;
use crate::api::*;
use crate::error::Error;
use crate::{Admin, RequestHandler};
impl RequestHandler for LocalGetNodeInfoRequest {
type Response = LocalGetNodeInfoResponse;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<LocalGetNodeInfoResponse, Error> {
Ok(LocalGetNodeInfoResponse {
node_id: hex::encode(garage.system.id),
garage_version: garage_util::version::garage_version().to_string(),
garage_features: garage_util::version::garage_features()
.map(|features| features.iter().map(ToString::to_string).collect()),
rust_version: garage_util::version::rust_version().to_string(),
db_engine: garage.db.engine(),
})
}
}
impl RequestHandler for LocalCreateMetadataSnapshotRequest {
type Response = LocalCreateMetadataSnapshotResponse;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<LocalCreateMetadataSnapshotResponse, Error> {
garage_model::snapshot::async_snapshot_metadata(garage).await?;
Ok(LocalCreateMetadataSnapshotResponse)
}
}
impl RequestHandler for LocalGetNodeStatisticsRequest {
type Response = LocalGetNodeStatisticsResponse;
// FIXME: return this as a JSON struct instead of text
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<LocalGetNodeStatisticsResponse, Error> {
let sys_status = garage.system.local_status();
let mut ret = format_table_to_string(vec![
format!("Node ID:\t{:?}", garage.system.id),
format!("Hostname:\t{}", sys_status.hostname.unwrap_or_default(),),
format!(
"Garage version:\t{}",
garage_util::version::garage_version(),
),
format!(
"Garage features:\t{}",
garage_util::version::garage_features()
.map(|list| list.join(", "))
.unwrap_or_else(|| "(unknown)".into()),
),
format!(
"Rust compiler version:\t{}",
garage_util::version::rust_version(),
),
format!("Database engine:\t{}", garage.db.engine()),
]);
// Gather table statistics
let mut table = vec![" Table\tItems\tMklItems\tMklTodo\tInsQueue\tGcTodo".into()];
table.push(gather_table_stats(&garage.admin_token_table)?);
table.push(gather_table_stats(&garage.bucket_table)?);
table.push(gather_table_stats(&garage.bucket_alias_table)?);
table.push(gather_table_stats(&garage.key_table)?);
table.push(gather_table_stats(&garage.object_table)?);
table.push(gather_table_stats(&garage.object_counter_table.table)?);
table.push(gather_table_stats(&garage.mpu_table)?);
table.push(gather_table_stats(&garage.mpu_counter_table.table)?);
table.push(gather_table_stats(&garage.version_table)?);
table.push(gather_table_stats(&garage.block_ref_table)?);
#[cfg(feature = "k2v")]
{
table.push(gather_table_stats(&garage.k2v.item_table)?);
table.push(gather_table_stats(&garage.k2v.counter_table.table)?);
}
write!(
&mut ret,
"\nTable stats:\n{}",
format_table_to_string(table)
)
.unwrap();
// Gather block manager statistics
writeln!(&mut ret, "\nBlock manager stats:").unwrap();
let rc_len = garage.block_manager.rc_len()?.to_string();
ret += &format_table_to_string(vec![
format!(" number of RC entries:\t{} (~= number of blocks)", rc_len),
format!(
" resync queue length:\t{}",
garage.block_manager.resync.queue_len()?
),
format!(
" blocks with resync errors:\t{}",
garage.block_manager.resync.errors_len()?
),
]);
Ok(LocalGetNodeStatisticsResponse { freeform: ret })
}
}
fn gather_table_stats<F, R>(t: &Arc<Table<F, R>>) -> Result<String, Error>
where
F: TableSchema + 'static,
R: TableReplication + 'static,
{
let data_len = t.data.store.len().map_err(GarageError::from)?.to_string();
let mkl_len = t.merkle_updater.merkle_tree_len()?.to_string();
Ok(format!(
" {}\t{}\t{}\t{}\t{}\t{}",
F::TABLE_NAME,
data_len,
mkl_len,
t.merkle_updater.todo_len()?,
t.data.insert_queue_len()?,
t.data.gc_todo_len()?
))
}

View file

@ -1,924 +0,0 @@
#![allow(dead_code)]
#![allow(non_snake_case)]
use utoipa::{Modify, OpenApi};
use crate::api::*;
// **********************************************
// Special endpoints
// **********************************************
#[utoipa::path(get,
path = "/metrics",
tag = "Special endpoints",
description = "Prometheus metrics endpoint",
security((), ("bearerAuth" = [])),
responses(
(status = 200, description = "Garage daemon metrics exported in Prometheus format"),
),
)]
fn Metrics() -> () {}
#[utoipa::path(get,
path = "/health",
tag = "Special endpoints",
description = "
Check cluster health. The status code returned by this function indicates
whether this Garage daemon can answer API requests.
Garage will return `200 OK` even if some storage nodes are disconnected,
as long as it is able to have a quorum of nodes for read and write operations.
",
security(()),
responses(
(status = 200, description = "Garage is able to answer requests"),
(status = 503, description = "This Garage daemon is not able to handle requests")
),
)]
fn Health() -> () {}
#[utoipa::path(get,
path = "/check",
tag = "Special endpoints",
description = "
Static website domain name check. Checks whether a bucket is configured to serve
a static website for the requested domain. This is used by reverse proxies such
as Caddy or Tricot, to avoid requesting TLS certificates for domain names that
do not correspond to an actual website.
",
params(
("domain", description = "The domain name to check for"),
),
security(()),
responses(
(status = 200, description = "The domain name redirects to a static website bucket"),
(status = 400, description = "No static website bucket exists for this domain")
),
)]
fn CheckDomain() -> () {}
// **********************************************
// Cluster operations
// **********************************************
#[utoipa::path(get,
path = "/v2/GetClusterStatus",
tag = "Cluster",
description = "
Returns the cluster's current status, including:
- ID of the node being queried and its version of the Garage daemon
- Live nodes
- Currently configured cluster layout
- Staged changes to the cluster layout
*Capacity is given in bytes*
",
responses(
(status = 200, description = "Cluster status report", body = GetClusterStatusResponse),
(status = 500, description = "Internal server error")
),
)]
fn GetClusterStatus() -> () {}
#[utoipa::path(get,
path = "/v2/GetClusterHealth",
tag = "Cluster",
description = "Returns the global status of the cluster, the number of connected nodes (over the number of known ones), the number of healthy storage nodes (over the declared ones), and the number of healthy partitions (over the total).",
responses(
(status = 200, description = "Cluster health report", body = GetClusterHealthResponse),
),
)]
fn GetClusterHealth() -> () {}
#[utoipa::path(get,
path = "/v2/GetClusterStatistics",
tag = "Cluster",
description = "
Fetch global cluster statistics.
*Note: do not try to parse the `freeform` field of the response, it is given as a string specifically because its format is not stable.*
",
responses(
(status = 200, description = "Global cluster statistics", body = GetClusterStatisticsResponse),
(status = 500, description = "Internal server error")
),
)]
fn GetClusterStatistics() -> () {}
#[utoipa::path(post,
path = "/v2/ConnectClusterNodes",
tag = "Cluster",
description = "Instructs this Garage node to connect to other Garage nodes at specified `<node_id>@<net_address>`. `node_id` is generated automatically on node start.",
request_body=ConnectClusterNodesRequest,
responses(
(status = 200, description = "The request has been handled correctly but it does not mean that all connection requests succeeded; some might have fail, you need to check the body!", body = ConnectClusterNodesResponse),
(status = 500, description = "Internal server error")
),
)]
fn ConnectClusterNodes() -> () {}
// **********************************************
// Admin API token operations
// **********************************************
#[utoipa::path(get,
path = "/v2/ListAdminTokens",
tag = "Admin API token",
description = "Returns all admin API tokens in the cluster.",
responses(
(status = 200, description = "Returns info about all admin API tokens", body = ListAdminTokensResponse),
(status = 500, description = "Internal server error")
),
)]
fn ListAdminTokens() -> () {}
#[utoipa::path(get,
path = "/v2/GetAdminTokenInfo",
tag = "Admin API token",
description = "
Return information about a specific admin API token.
You can search by specifying the exact token identifier (`id`) or by specifying a pattern (`search`).
",
params(GetAdminTokenInfoRequest),
responses(
(status = 200, description = "Information about the admin token", body = GetAdminTokenInfoResponse),
(status = 500, description = "Internal server error")
),
)]
fn GetAdminTokenInfo() -> () {}
#[utoipa::path(post,
path = "/v2/CreateAdminToken",
tag = "Admin API token",
description = "Creates a new admin API token",
request_body = UpdateAdminTokenRequestBody,
responses(
(status = 200, description = "Admin token has been created", body = CreateAdminTokenResponse),
(status = 500, description = "Internal server error")
),
)]
fn CreateAdminToken() -> () {}
#[utoipa::path(post,
path = "/v2/UpdateAdminToken",
tag = "Admin API token",
description = "
Updates information about the specified admin API token.
",
request_body = UpdateAdminTokenRequestBody,
params(
("id", description = "Admin API token ID"),
),
responses(
(status = 200, description = "Admin token has been updated", body = UpdateAdminTokenResponse),
(status = 500, description = "Internal server error")
),
)]
fn UpdateAdminToken() -> () {}
#[utoipa::path(post,
path = "/v2/DeleteAdminToken",
tag = "Admin API token",
description = "Delete an admin API token from the cluster, revoking all its permissions.",
params(
("id", description = "Admin API token ID"),
),
responses(
(status = 200, description = "Admin token has been deleted"),
(status = 500, description = "Internal server error")
),
)]
fn DeleteAdminToken() -> () {}
// **********************************************
// Layout operations
// **********************************************
#[utoipa::path(get,
path = "/v2/GetClusterLayout",
tag = "Cluster layout",
description = "
Returns the cluster's current layout, including:
- Currently configured cluster layout
- Staged changes to the cluster layout
*Capacity is given in bytes*
",
responses(
(status = 200, description = "Current cluster layout", body = GetClusterLayoutResponse),
(status = 500, description = "Internal server error")
),
)]
fn GetClusterLayout() -> () {}
#[utoipa::path(get,
path = "/v2/GetClusterLayoutHistory",
tag = "Cluster layout",
description = "
Returns the history of layouts in the cluster
",
responses(
(status = 200, description = "Cluster layout history", body = GetClusterLayoutHistoryResponse),
(status = 500, description = "Internal server error")
),
)]
fn GetClusterLayoutHistory() -> () {}
#[utoipa::path(post,
path = "/v2/UpdateClusterLayout",
tag = "Cluster layout",
description = "
Send modifications to the cluster layout. These modifications will be included in the staged role changes, visible in subsequent calls of `GET /GetClusterHealth`. Once the set of staged changes is satisfactory, the user may call `POST /ApplyClusterLayout` to apply the changed changes, or `POST /RevertClusterLayout` to clear all of the staged changes in the layout.
Setting the capacity to `null` will configure the node as a gateway.
Otherwise, capacity must be now set in bytes (before Garage 0.9 it was arbitrary weights).
For example to declare 100GB, you must set `capacity: 100000000000`.
Garage uses internally the International System of Units (SI), it assumes that 1kB = 1000 bytes, and displays storage as kB, MB, GB (and not KiB, MiB, GiB that assume 1KiB = 1024 bytes).
",
request_body(
content=UpdateClusterLayoutRequest,
description="
To add a new node to the layout or to change the configuration of an existing node, simply set the values you want (`zone`, `capacity`, and `tags`).
To remove a node, simply pass the `remove: true` field.
This logic is represented in OpenAPI with a 'One Of' object.
Contrary to the CLI that may update only a subset of the fields capacity, zone and tags, when calling this API all of these values must be specified.
"
),
responses(
(status = 200, description = "Proposed changes have been added to the list of pending changes", body = UpdateClusterLayoutResponse),
(status = 500, description = "Internal server error")
),
)]
fn UpdateClusterLayout() -> () {}
#[utoipa::path(post,
path = "/v2/PreviewClusterLayoutChanges",
tag = "Cluster layout",
description = "
Computes a new layout taking into account the staged parameters, and returns it with detailed statistics. The new layout is not applied in the cluster.
*Note: do not try to parse the `message` field of the response, it is given as an array of string specifically because its format is not stable.*
",
responses(
(status = 200, description = "Information about the new layout", body = PreviewClusterLayoutChangesResponse),
(status = 500, description = "Internal server error")
),
)]
fn PreviewClusterLayoutChanges() -> () {}
#[utoipa::path(post,
path = "/v2/ApplyClusterLayout",
tag = "Cluster layout",
description = "
Applies to the cluster the layout changes currently registered as staged layout changes.
*Note: do not try to parse the `message` field of the response, it is given as an array of string specifically because its format is not stable.*
",
request_body=ApplyClusterLayoutRequest,
responses(
(status = 200, description = "The updated cluster layout has been applied in the cluster", body = ApplyClusterLayoutResponse),
(status = 500, description = "Internal server error")
),
)]
fn ApplyClusterLayout() -> () {}
#[utoipa::path(post,
path = "/v2/RevertClusterLayout",
tag = "Cluster layout",
description = "Clear staged layout changes",
responses(
(status = 200, description = "All pending changes to the cluster layout have been erased", body = RevertClusterLayoutResponse),
(status = 500, description = "Internal server error")
),
)]
fn RevertClusterLayout() -> () {}
#[utoipa::path(post,
path = "/v2/ClusterLayoutSkipDeadNodes",
tag = "Cluster layout",
description = "Force progress in layout update trackers",
request_body = ClusterLayoutSkipDeadNodesRequest,
responses(
(status = 200, description = "Request has been taken into account", body = ClusterLayoutSkipDeadNodesResponse),
(status = 500, description = "Internal server error")
),
)]
fn ClusterLayoutSkipDeadNodes() -> () {}
// **********************************************
// Access key operations
// **********************************************
#[utoipa::path(get,
path = "/v2/ListKeys",
tag = "Access key",
description = "Returns all API access keys in the cluster.",
responses(
(status = 200, description = "Returns the key identifier (aka `AWS_ACCESS_KEY_ID`) and its associated, human friendly, name if any (otherwise return an empty string)", body = ListKeysResponse),
(status = 500, description = "Internal server error")
),
)]
fn ListKeys() -> () {}
#[utoipa::path(get,
path = "/v2/GetKeyInfo",
tag = "Access key",
description = "
Return information about a specific key like its identifiers, its permissions and buckets on which it has permissions.
You can search by specifying the exact key identifier (`id`) or by specifying a pattern (`search`).
For confidentiality reasons, the secret key is not returned by default: you must pass the `showSecretKey` query parameter to get it.
",
params(GetKeyInfoRequest),
responses(
(status = 200, description = "Information about the access key", body = GetKeyInfoResponse),
(status = 500, description = "Internal server error")
),
)]
fn GetKeyInfo() -> () {}
#[utoipa::path(post,
path = "/v2/CreateKey",
tag = "Access key",
description = "Creates a new API access key.",
request_body = CreateKeyRequest,
responses(
(status = 200, description = "Access key has been created", body = CreateKeyResponse),
(status = 500, description = "Internal server error")
),
)]
fn CreateKey() -> () {}
#[utoipa::path(post,
path = "/v2/ImportKey",
tag = "Access key",
description = "
Imports an existing API key. This feature must only be used for migrations and backup restore.
**Do not use it to generate custom key identifiers or you will break your Garage cluster.**
",
request_body = ImportKeyRequest,
responses(
(status = 200, description = "Access key has been imported", body = ImportKeyResponse),
(status = 500, description = "Internal server error")
),
)]
fn ImportKey() -> () {}
#[utoipa::path(post,
path = "/v2/UpdateKey",
tag = "Access key",
description = "
Updates information about the specified API access key.
*Note: the secret key is not returned in the response, `null` is sent instead.*
",
request_body = UpdateKeyRequestBody,
params(
("id", description = "Access key ID"),
),
responses(
(status = 200, description = "Access key has been updated", body = UpdateKeyResponse),
(status = 500, description = "Internal server error")
),
)]
fn UpdateKey() -> () {}
#[utoipa::path(post,
path = "/v2/DeleteKey",
tag = "Access key",
description = "Delete a key from the cluster. Its access will be removed from all the buckets. Buckets are not automatically deleted and can be dangling. You should manually delete them before. ",
params(
("id", description = "Access key ID"),
),
responses(
(status = 200, description = "Access key has been deleted"),
(status = 500, description = "Internal server error")
),
)]
fn DeleteKey() -> () {}
// **********************************************
// Bucket operations
// **********************************************
#[utoipa::path(get,
path = "/v2/ListBuckets",
tag = "Bucket",
description = "List all the buckets on the cluster with their UUID and their global and local aliases.",
responses(
(status = 200, description = "Returns the UUID of all the buckets and all their aliases", body = ListBucketsResponse),
(status = 500, description = "Internal server error")
),
)]
fn ListBuckets() -> () {}
#[utoipa::path(get,
path = "/v2/GetBucketInfo",
tag = "Bucket",
description = "
Given a bucket identifier (`id`) or a global alias (`alias`), get its information.
It includes its aliases, its web configuration, keys that have some permissions
on it, some statistics (number of objects, size), number of dangling multipart uploads,
and its quotas (if any).
",
params(GetBucketInfoRequest),
responses(
(status = 200, description = "Returns exhaustive information about the bucket", body = GetBucketInfoResponse),
(status = 500, description = "Internal server error")
),
)]
fn GetBucketInfo() -> () {}
#[utoipa::path(post,
path = "/v2/CreateBucket",
tag = "Bucket",
description = "
Creates a new bucket, either with a global alias, a local one, or no alias at all.
Technically, you can also specify both `globalAlias` and `localAlias` and that would create two aliases.
",
request_body = CreateBucketRequest,
responses(
(status = 200, description = "Returns exhaustive information about the bucket", body = CreateBucketResponse),
(status = 500, description = "Internal server error")
),
)]
fn CreateBucket() -> () {}
#[utoipa::path(post,
path = "/v2/UpdateBucket",
tag = "Bucket",
description = "
All fields (`websiteAccess` and `quotas`) are optional.
If they are present, the corresponding modifications are applied to the bucket, otherwise nothing is changed.
In `websiteAccess`: if `enabled` is `true`, `indexDocument` must be specified.
The field `errorDocument` is optional, if no error document is set a generic
error message is displayed when errors happen. Conversely, if `enabled` is
`false`, neither `indexDocument` nor `errorDocument` must be specified.
In `quotas`: new values of `maxSize` and `maxObjects` must both be specified, or set to `null`
to remove the quotas. An absent value will be considered the same as a `null`. It is not possible
to change only one of the two quotas.
",
params(
("id", description = "ID of the bucket to update"),
),
request_body = UpdateBucketRequestBody,
responses(
(status = 200, description = "Bucket has been updated", body = UpdateBucketResponse),
(status = 404, description = "Bucket not found"),
(status = 500, description = "Internal server error")
),
)]
fn UpdateBucket() -> () {}
#[utoipa::path(post,
path = "/v2/DeleteBucket",
tag = "Bucket",
description = "
Deletes a storage bucket. A bucket cannot be deleted if it is not empty.
**Warning:** this will delete all aliases associated with the bucket!
",
params(
("id", description = "ID of the bucket to delete"),
),
responses(
(status = 200, description = "Bucket has been deleted"),
(status = 400, description = "Bucket is not empty"),
(status = 404, description = "Bucket not found"),
(status = 500, description = "Internal server error")
),
)]
fn DeleteBucket() -> () {}
#[utoipa::path(post,
path = "/v2/CleanupIncompleteUploads",
tag = "Bucket",
description = "Removes all incomplete multipart uploads that are older than the specified number of seconds.",
request_body = CleanupIncompleteUploadsRequest,
responses(
(status = 200, description = "The bucket was cleaned up successfully", body = CleanupIncompleteUploadsResponse),
(status = 500, description = "Internal server error")
),
)]
fn CleanupIncompleteUploads() -> () {}
#[utoipa::path(get,
path = "/v2/InspectObject",
tag = "Bucket",
description = "
Returns detailed information about an object in a bucket, including its internal state in Garage.
This API call can be used to list the data blocks referenced by an object,
as well as to view metadata associated to the object.
This call may return a list of more than one version for the object, for instance in the
case where there is a currently stored version of the object, and a newer version whose
upload is in progress and not yet finished.
",
params(InspectObjectRequest),
responses(
(status = 200, description = "Returns exhaustive information about the object", body = InspectObjectResponse),
(status = 404, description = "Object not found"),
(status = 500, description = "Internal server error")
),
)]
fn InspectObject() -> () {}
// **********************************************
// Operations on permissions for keys on buckets
// **********************************************
#[utoipa::path(post,
path = "/v2/AllowBucketKey",
tag = "Permission",
description = "
**DISCLAIMER**: Garage's developers are aware that this endpoint has an unconventional semantic. Be extra careful when implementing it, its behavior is not obvious.
Allows a key to do read/write/owner operations on a bucket.
Flags in permissions which have the value true will be activated. Other flags will remain unchanged (ie. they will keep their internal value).
For example, if you set read to true, the key will be allowed to read the bucket.
If you set it to false, the key will keeps its previous read permission.
If you want to disallow read for the key, check the DenyBucketKey operation.
",
request_body = AllowBucketKeyRequest,
responses(
(status = 200, description = "Returns exhaustive information about the bucket", body = AllowBucketKeyResponse),
(status = 500, description = "Internal server error")
),
)]
fn AllowBucketKey() -> () {}
#[utoipa::path(post,
path = "/v2/DenyBucketKey",
tag = "Permission",
description = "
**DISCLAIMER**: Garage's developers are aware that this endpoint has an unconventional semantic. Be extra careful when implementing it, its behavior is not obvious.
Denies a key from doing read/write/owner operations on a bucket.
Flags in permissions which have the value true will be deactivated. Other flags will remain unchanged.
For example, if you set read to true, the key will be denied from reading.
If you set read to false, the key will keep its previous permissions.
If you want the key to have the reading permission, check the AllowBucketKey operation.
",
request_body = DenyBucketKeyRequest,
responses(
(status = 200, description = "Returns exhaustive information about the bucket", body = DenyBucketKeyResponse),
(status = 500, description = "Internal server error")
),
)]
fn DenyBucketKey() -> () {}
// **********************************************
// Operations on bucket aliases
// **********************************************
#[utoipa::path(post,
path = "/v2/AddBucketAlias",
tag = "Bucket alias",
description = "Add an alias for the target bucket. This can be either a global or a local alias, depending on which fields are specified.",
request_body = AddBucketAliasRequest,
responses(
(status = 200, description = "Returns exhaustive information about the bucket", body = AddBucketAliasResponse),
(status = 500, description = "Internal server error")
),
)]
fn AddBucketAlias() -> () {}
#[utoipa::path(post,
path = "/v2/RemoveBucketAlias",
tag = "Bucket alias",
description = "Remove an alias for the target bucket. This can be either a global or a local alias, depending on which fields are specified.",
request_body = RemoveBucketAliasRequest,
responses(
(status = 200, description = "Returns exhaustive information about the bucket", body = RemoveBucketAliasResponse),
(status = 500, description = "Internal server error")
),
)]
fn RemoveBucketAlias() -> () {}
// **********************************************
// Node operations
// **********************************************
#[utoipa::path(get,
path = "/v2/GetNodeInfo",
tag = "Node",
description = "
Return information about the Garage daemon running on one or several nodes.
",
params(
("node", description = "Node ID to query, or `*` for all nodes, or `self` for the node responding to the request"),
),
responses(
(status = 200, description = "Responses from individual cluster nodes", body = MultiResponse<LocalGetNodeInfoResponse>),
(status = 500, description = "Internal server error")
),
)]
fn GetNodeInfo() -> () {}
#[utoipa::path(get,
path = "/v2/GetNodeStatistics",
tag = "Node",
description = "
Fetch statistics for one or several Garage nodes.
*Note: do not try to parse the `freeform` field of the response, it is given as a string specifically because its format is not stable.*
",
params(
("node", description = "Node ID to query, or `*` for all nodes, or `self` for the node responding to the request"),
),
responses(
(status = 200, description = "Responses from individual cluster nodes", body = MultiResponse<LocalGetNodeStatisticsResponse>),
(status = 500, description = "Internal server error")
),
)]
fn GetNodeStatistics() -> () {}
#[utoipa::path(post,
path = "/v2/CreateMetadataSnapshot",
tag = "Node",
description = "
Instruct one or several nodes to take a snapshot of their metadata databases.
",
params(
("node", description = "Node ID to query, or `*` for all nodes, or `self` for the node responding to the request"),
),
responses(
(status = 200, description = "Responses from individual cluster nodes", body = MultiResponse<LocalCreateMetadataSnapshotResponse>),
(status = 500, description = "Internal server error")
),
)]
fn CreateMetadataSnapshot() -> () {}
#[utoipa::path(post,
path = "/v2/LaunchRepairOperation",
tag = "Node",
description = "
Launch a repair operation on one or several cluster nodes.
",
params(
("node", description = "Node ID to query, or `*` for all nodes, or `self` for the node responding to the request"),
),
request_body = LocalLaunchRepairOperationRequest,
responses(
(status = 200, description = "Responses from individual cluster nodes", body = MultiResponse<LocalLaunchRepairOperationResponse>),
(status = 500, description = "Internal server error")
),
)]
fn LaunchRepairOperation() -> () {}
// **********************************************
// Worker operations
// **********************************************
#[utoipa::path(post,
path = "/v2/ListWorkers",
tag = "Worker",
description = "
List background workers currently running on one or several cluster nodes.
",
params(
("node", description = "Node ID to query, or `*` for all nodes, or `self` for the node responding to the request"),
),
request_body = LocalListWorkersRequest,
responses(
(status = 200, description = "Responses from individual cluster nodes", body = MultiResponse<LocalListWorkersResponse>),
(status = 500, description = "Internal server error")
),
)]
fn ListWorkers() -> () {}
#[utoipa::path(post,
path = "/v2/GetWorkerInfo",
tag = "Worker",
description = "
Get information about the specified background worker on one or several cluster nodes.
",
params(
("node", description = "Node ID to query, or `*` for all nodes, or `self` for the node responding to the request"),
),
request_body = LocalGetWorkerInfoRequest,
responses(
(status = 200, description = "Responses from individual cluster nodes", body = MultiResponse<LocalGetWorkerInfoResponse>),
(status = 500, description = "Internal server error")
),
)]
fn GetWorkerInfo() -> () {}
#[utoipa::path(post,
path = "/v2/GetWorkerVariable",
tag = "Worker",
description = "
Fetch values of one or several worker variables, from one or several cluster nodes.
",
params(
("node", description = "Node ID to query, or `*` for all nodes, or `self` for the node responding to the request"),
),
request_body = LocalGetWorkerVariableRequest,
responses(
(status = 200, description = "Responses from individual cluster nodes", body = MultiResponse<LocalGetWorkerVariableResponse>),
(status = 500, description = "Internal server error")
),
)]
fn GetWorkerVariable() -> () {}
#[utoipa::path(post,
path = "/v2/SetWorkerVariable",
tag = "Worker",
description = "
Set the value for a worker variable, on one or several cluster nodes.
",
params(
("node", description = "Node ID to query, or `*` for all nodes, or `self` for the node responding to the request"),
),
request_body = LocalSetWorkerVariableRequest,
responses(
(status = 200, description = "Responses from individual cluster nodes", body = MultiResponse<LocalSetWorkerVariableResponse>),
(status = 500, description = "Internal server error")
),
)]
fn SetWorkerVariable() -> () {}
// **********************************************
// Block operations
// **********************************************
#[utoipa::path(get,
path = "/v2/ListBlockErrors",
tag = "Block",
description = "
List data blocks that are currently in an errored state on one or several Garage nodes.
",
params(
("node", description = "Node ID to query, or `*` for all nodes, or `self` for the node responding to the request"),
),
responses(
(status = 200, description = "Responses from individual cluster nodes", body = MultiResponse<LocalListBlockErrorsResponse>),
(status = 500, description = "Internal server error")
),
)]
fn ListBlockErrors() -> () {}
#[utoipa::path(post,
path = "/v2/GetBlockInfo",
tag = "Block",
description = "
Get detailed information about a data block stored on a Garage node, including all object versions and in-progress multipart uploads that contain a reference to this block.
",
params(
("node", description = "Node ID to query, or `*` for all nodes, or `self` for the node responding to the request"),
),
request_body = LocalGetBlockInfoRequest,
responses(
(status = 200, description = "Detailed block information", body = MultiResponse<LocalGetBlockInfoResponse>),
(status = 500, description = "Internal server error")
),
)]
fn GetBlockInfo() -> () {}
#[utoipa::path(post,
path = "/v2/RetryBlockResync",
tag = "Block",
description = "
Instruct Garage node(s) to retry the resynchronization of one or several missing data block(s).
",
params(
("node", description = "Node ID to query, or `*` for all nodes, or `self` for the node responding to the request"),
),
request_body = LocalRetryBlockResyncRequest,
responses(
(status = 200, description = "Responses from individual cluster nodes", body = MultiResponse<LocalRetryBlockResyncResponse>),
(status = 500, description = "Internal server error")
),
)]
fn RetryBlockResync() -> () {}
#[utoipa::path(post,
path = "/v2/PurgeBlocks",
tag = "Block",
description = "
Purge references to one or several missing data blocks.
This will remove all objects and in-progress multipart uploads that contain the specified data block(s). The objects will be permanently deleted from the buckets in which they appear. Use with caution.
",
params(
("node", description = "Node ID to query, or `*` for all nodes, or `self` for the node responding to the request"),
),
request_body = LocalPurgeBlocksRequest,
responses(
(status = 200, description = "Responses from individual cluster nodes", body = MultiResponse<LocalPurgeBlocksResponse>),
(status = 500, description = "Internal server error")
),
)]
fn PurgeBlocks() -> () {}
// **********************************************
// **********************************************
// **********************************************
struct SecurityAddon;
impl Modify for SecurityAddon {
fn modify(&self, openapi: &mut utoipa::openapi::OpenApi) {
use utoipa::openapi::security::*;
let components = openapi.components.as_mut().unwrap(); // we can unwrap safely since there already is components registered.
components.add_security_scheme(
"bearerAuth",
SecurityScheme::Http(Http::builder().scheme(HttpAuthScheme::Bearer).build()),
)
}
}
#[derive(OpenApi)]
#[openapi(
info(
version = "v2.0.0",
title = "Garage administration API",
description = "Administrate your Garage cluster programatically, including status, layout, keys, buckets, and maintainance tasks.
*Disclaimer: This API may change in future Garage versions. Read the changelog and upgrade your scripts before upgrading. Additionnaly, this specification is early stage and can contain bugs, so be careful and please report any issues on our issue tracker.*",
contact(
name = "The Garage team",
email = "garagehq@deuxfleurs.fr",
url = "https://garagehq.deuxfleurs.fr/",
),
),
modifiers(&SecurityAddon),
security(("bearerAuth" = [])),
paths(
// Special ops
Metrics,
Health,
CheckDomain,
// Cluster operations
GetClusterHealth,
GetClusterStatus,
GetClusterStatistics,
ConnectClusterNodes,
// Admin token operations
ListAdminTokens,
GetAdminTokenInfo,
CreateAdminToken,
UpdateAdminToken,
DeleteAdminToken,
// Layout operations
GetClusterLayout,
GetClusterLayoutHistory,
UpdateClusterLayout,
PreviewClusterLayoutChanges,
ApplyClusterLayout,
RevertClusterLayout,
ClusterLayoutSkipDeadNodes,
// Key operations
ListKeys,
GetKeyInfo,
CreateKey,
ImportKey,
UpdateKey,
DeleteKey,
// Bucket operations
ListBuckets,
GetBucketInfo,
CreateBucket,
UpdateBucket,
DeleteBucket,
CleanupIncompleteUploads,
InspectObject,
// Operations on permissions
AllowBucketKey,
DenyBucketKey,
// Operations on aliases
AddBucketAlias,
RemoveBucketAlias,
// Node operations
GetNodeInfo,
GetNodeStatistics,
CreateMetadataSnapshot,
LaunchRepairOperation,
// Worker operations
ListWorkers,
GetWorkerInfo,
GetWorkerVariable,
SetWorkerVariable,
// Block operations
ListBlockErrors,
GetBlockInfo,
RetryBlockResync,
PurgeBlocks,
),
servers(
(url = "http://localhost:3903/", description = "A local server")
),
)]
pub struct ApiDoc;

View file

@ -7,6 +7,12 @@ use garage_api_common::router_macros::*;
use crate::error::*;
use crate::router_v0;
pub enum Authorization {
None,
MetricsToken,
AdminToken,
}
router_match! {@func
/// List of all Admin API endpoints.
@ -205,6 +211,15 @@ impl Endpoint {
))),
}
}
/// Get the kind of authorization which is required to perform the operation.
pub fn authorization_type(&self) -> Authorization {
match self {
Self::Health => Authorization::None,
Self::CheckDomain => Authorization::None,
Self::Metrics => Authorization::MetricsToken,
_ => Authorization::AdminToken,
}
}
}
generateQueryParameters! {

View file

@ -1,275 +0,0 @@
use std::borrow::Cow;
use hyper::body::Incoming as IncomingBody;
use hyper::{Method, Request};
use paste::paste;
use garage_api_common::helpers::*;
use garage_api_common::router_macros::*;
use crate::api::*;
use crate::error::*;
use crate::router_v1;
use crate::Authorization;
impl AdminApiRequest {
/// Determine which S3 endpoint a request is for using the request, and a bucket which was
/// possibly extracted from the Host header.
/// Returns Self plus bucket name, if endpoint is not Endpoint::ListBuckets
pub async fn from_request(req: Request<IncomingBody>) -> Result<Self, Error> {
let uri = req.uri().clone();
let path = uri.path();
let query = uri.query();
let method = req.method().clone();
let mut query = QueryParameters::from_query(query.unwrap_or_default())?;
let res = router_match!(@gen_path_parser_v2 (&method, path, "/v2/", query, req) [
@special OPTIONS _ => Options (),
@special GET "/check" => CheckDomain (query::domain),
@special GET "/health" => Health (),
@special GET "/metrics" => Metrics (),
// Cluster endpoints
GET GetClusterStatus (),
GET GetClusterHealth (),
POST ConnectClusterNodes (body),
// Admin token endpoints
GET ListAdminTokens (),
GET GetAdminTokenInfo (query_opt::id, query_opt::search),
POST CreateAdminToken (body),
POST UpdateAdminToken (body_field, query::id),
POST DeleteAdminToken (query::id),
// Layout endpoints
GET GetClusterLayout (),
GET GetClusterLayoutHistory (),
POST UpdateClusterLayout (body),
POST PreviewClusterLayoutChanges (),
POST ApplyClusterLayout (body),
POST RevertClusterLayout (),
POST ClusterLayoutSkipDeadNodes (body),
// API key endpoints
GET GetKeyInfo (query_opt::id, query_opt::search, parse_default(false)::show_secret_key),
POST UpdateKey (body_field, query::id),
POST CreateKey (body),
POST ImportKey (body),
POST DeleteKey (query::id),
GET ListKeys (),
// Bucket endpoints
GET GetBucketInfo (query_opt::id, query_opt::global_alias, query_opt::search),
GET ListBuckets (),
POST CreateBucket (body),
POST DeleteBucket (query::id),
POST UpdateBucket (body_field, query::id),
POST CleanupIncompleteUploads (body),
GET InspectObject (query::bucket_id, query::key),
// Bucket-key permissions
POST AllowBucketKey (body),
POST DenyBucketKey (body),
// Bucket aliases
POST AddBucketAlias (body),
POST RemoveBucketAlias (body),
// Node APIs
GET GetNodeInfo (default::body, query::node),
POST CreateMetadataSnapshot (default::body, query::node),
GET GetNodeStatistics (default::body, query::node),
GET GetClusterStatistics (),
POST LaunchRepairOperation (body_field, query::node),
// Worker APIs
POST ListWorkers (body_field, query::node),
POST GetWorkerInfo (body_field, query::node),
POST GetWorkerVariable (body_field, query::node),
POST SetWorkerVariable (body_field, query::node),
// Block APIs
GET ListBlockErrors (default::body, query::node),
POST GetBlockInfo (body_field, query::node),
POST RetryBlockResync (body_field, query::node),
POST PurgeBlocks (body_field, query::node),
]);
if let Some(message) = query.nonempty_message() {
debug!("Unused query parameter: {}", message)
}
Ok(res)
}
/// Some endpoints work exactly the same in their v2/ version as they did in their v1/ version.
/// For these endpoints, we can convert a v1/ call to its equivalent as if it was made using
/// its v2/ URL.
pub async fn from_v1(
v1_endpoint: router_v1::Endpoint,
req: Request<IncomingBody>,
) -> Result<Self, Error> {
use router_v1::Endpoint;
match v1_endpoint {
// GetClusterStatus semantics changed:
// info about local node is no longer returned
Endpoint::GetClusterHealth => {
Ok(AdminApiRequest::GetClusterHealth(GetClusterHealthRequest))
}
Endpoint::ConnectClusterNodes => {
let req = parse_json_body::<ConnectClusterNodesRequest, _, Error>(req).await?;
Ok(AdminApiRequest::ConnectClusterNodes(req))
}
// Layout
Endpoint::GetClusterLayout => {
Ok(AdminApiRequest::GetClusterLayout(GetClusterLayoutRequest))
}
// UpdateClusterLayout semantics changed
Endpoint::ApplyClusterLayout => {
let param = parse_json_body::<ApplyClusterLayoutRequest, _, Error>(req).await?;
Ok(AdminApiRequest::ApplyClusterLayout(param))
}
Endpoint::RevertClusterLayout => Ok(AdminApiRequest::RevertClusterLayout(
RevertClusterLayoutRequest,
)),
// Keys
Endpoint::ListKeys => Ok(AdminApiRequest::ListKeys(ListKeysRequest)),
Endpoint::GetKeyInfo {
id,
search,
show_secret_key,
} => {
let show_secret_key = show_secret_key.map(|x| x == "true").unwrap_or(false);
Ok(AdminApiRequest::GetKeyInfo(GetKeyInfoRequest {
id,
search,
show_secret_key,
}))
}
Endpoint::CreateKey => {
let req = parse_json_body::<CreateKeyRequest, _, Error>(req).await?;
Ok(AdminApiRequest::CreateKey(req))
}
Endpoint::ImportKey => {
let req = parse_json_body::<ImportKeyRequest, _, Error>(req).await?;
Ok(AdminApiRequest::ImportKey(req))
}
Endpoint::UpdateKey { id } => {
let body = parse_json_body::<UpdateKeyRequestBody, _, Error>(req).await?;
Ok(AdminApiRequest::UpdateKey(UpdateKeyRequest { id, body }))
}
// DeleteKey semantics changed:
// - in v1/ : HTTP DELETE => HTTP 204 No Content
// - in v2/ : HTTP POST => HTTP 200 Ok
// Endpoint::DeleteKey { id } => Ok(AdminApiRequest::DeleteKey(DeleteKeyRequest { id })),
// Buckets
Endpoint::ListBuckets => Ok(AdminApiRequest::ListBuckets(ListBucketsRequest)),
Endpoint::GetBucketInfo { id, global_alias } => {
Ok(AdminApiRequest::GetBucketInfo(GetBucketInfoRequest {
id,
global_alias,
search: None,
}))
}
Endpoint::CreateBucket => {
let req = parse_json_body::<CreateBucketRequest, _, Error>(req).await?;
Ok(AdminApiRequest::CreateBucket(req))
}
// DeleteBucket semantics changed::
// - in v1/ : HTTP DELETE => HTTP 204 No Content
// - in v2/ : HTTP POST => HTTP 200 Ok
// Endpoint::DeleteBucket { id } => {
// Ok(AdminApiRequest::DeleteBucket(DeleteBucketRequest { id }))
// }
Endpoint::UpdateBucket { id } => {
let body = parse_json_body::<UpdateBucketRequestBody, _, Error>(req).await?;
Ok(AdminApiRequest::UpdateBucket(UpdateBucketRequest {
id,
body,
}))
}
// Bucket-key permissions
Endpoint::BucketAllowKey => {
let req = parse_json_body::<BucketKeyPermChangeRequest, _, Error>(req).await?;
Ok(AdminApiRequest::AllowBucketKey(AllowBucketKeyRequest(req)))
}
Endpoint::BucketDenyKey => {
let req = parse_json_body::<BucketKeyPermChangeRequest, _, Error>(req).await?;
Ok(AdminApiRequest::DenyBucketKey(DenyBucketKeyRequest(req)))
}
// Bucket aliasing
Endpoint::GlobalAliasBucket { id, alias } => {
Ok(AdminApiRequest::AddBucketAlias(AddBucketAliasRequest {
bucket_id: id,
alias: BucketAliasEnum::Global {
global_alias: alias,
},
}))
}
Endpoint::GlobalUnaliasBucket { id, alias } => Ok(AdminApiRequest::RemoveBucketAlias(
RemoveBucketAliasRequest {
bucket_id: id,
alias: BucketAliasEnum::Global {
global_alias: alias,
},
},
)),
Endpoint::LocalAliasBucket {
id,
access_key_id,
alias,
} => Ok(AdminApiRequest::AddBucketAlias(AddBucketAliasRequest {
bucket_id: id,
alias: BucketAliasEnum::Local {
local_alias: alias,
access_key_id,
},
})),
Endpoint::LocalUnaliasBucket {
id,
access_key_id,
alias,
} => Ok(AdminApiRequest::RemoveBucketAlias(
RemoveBucketAliasRequest {
bucket_id: id,
alias: BucketAliasEnum::Local {
local_alias: alias,
access_key_id,
},
},
)),
// For endpoints that have different body content syntax, issue
// deprecation warning
_ => Err(Error::bad_request(format!(
"v1/ endpoint is no longer supported: {}",
v1_endpoint.name()
))),
}
}
/// Get the kind of authorization which is required to perform the operation.
pub fn authorization_type(&self) -> Authorization {
match self {
Self::Options(_) | Self::Health(_) | Self::CheckDomain(_) => Authorization::None,
Self::Metrics(_) => Authorization::MetricsToken,
_ => Authorization::AdminToken,
}
}
}
generateQueryParameters! {
keywords: [],
fields: [
"node" => node,
"domain" => domain,
"format" => format,
"id" => id,
"search" => search,
"globalAlias" => global_alias,
"alias" => alias,
"accessKeyId" => access_key_id,
"showSecretKey" => show_secret_key,
"bucketId" => bucket_id,
"key" => key
]
}

View file

@ -1,173 +0,0 @@
use std::sync::Arc;
use http::header::{
ACCESS_CONTROL_ALLOW_HEADERS, ACCESS_CONTROL_ALLOW_METHODS, ACCESS_CONTROL_ALLOW_ORIGIN, ALLOW,
};
use hyper::{Response, StatusCode};
#[cfg(feature = "metrics")]
use prometheus::{Encoder, TextEncoder};
use garage_model::garage::Garage;
use garage_rpc::system::ClusterHealthStatus;
use garage_api_common::helpers::*;
use crate::api::{CheckDomainRequest, HealthRequest, MetricsRequest, OptionsRequest};
use crate::api_server::ResBody;
use crate::error::*;
use crate::{Admin, RequestHandler};
impl RequestHandler for OptionsRequest {
type Response = Response<ResBody>;
async fn handle(
self,
_garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<Response<ResBody>, Error> {
Ok(Response::builder()
.status(StatusCode::OK)
.header(ALLOW, "OPTIONS,GET,POST")
.header(ACCESS_CONTROL_ALLOW_METHODS, "OPTIONS,GET,POST")
.header(ACCESS_CONTROL_ALLOW_HEADERS, "authorization,content-type")
.header(ACCESS_CONTROL_ALLOW_ORIGIN, "*")
.body(empty_body())?)
}
}
impl RequestHandler for MetricsRequest {
type Response = Response<ResBody>;
async fn handle(
self,
_garage: &Arc<Garage>,
admin: &Admin,
) -> Result<Response<ResBody>, Error> {
#[cfg(feature = "metrics")]
{
use opentelemetry::trace::Tracer;
let mut buffer = vec![];
let encoder = TextEncoder::new();
let tracer = opentelemetry::global::tracer("garage");
let metric_families = tracer.in_span("admin/gather_metrics", |_| {
admin.exporter.registry().gather()
});
encoder
.encode(&metric_families, &mut buffer)
.ok_or_internal_error("Could not serialize metrics")?;
Ok(Response::builder()
.status(StatusCode::OK)
.header(http::header::CONTENT_TYPE, encoder.format_type())
.body(bytes_body(buffer.into()))?)
}
#[cfg(not(feature = "metrics"))]
Err(Error::bad_request(
"Garage was built without the metrics feature".to_string(),
))
}
}
impl RequestHandler for HealthRequest {
type Response = Response<ResBody>;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<Response<ResBody>, Error> {
let health = garage.system.health();
let (status, status_str) = match health.status {
ClusterHealthStatus::Healthy => (StatusCode::OK, "Garage is fully operational"),
ClusterHealthStatus::Degraded => (
StatusCode::OK,
"Garage is operational but some storage nodes are unavailable",
),
ClusterHealthStatus::Unavailable => (
StatusCode::SERVICE_UNAVAILABLE,
"Quorum is not available for some/all partitions, reads and writes will fail",
),
};
let status_str = format!(
"{}\nConsult the full health check API endpoint at /v2/GetClusterHealth for more details\n",
status_str
);
Ok(Response::builder()
.status(status)
.header(http::header::CONTENT_TYPE, "text/plain")
.body(string_body(status_str))?)
}
}
impl RequestHandler for CheckDomainRequest {
type Response = Response<ResBody>;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<Response<ResBody>, Error> {
if check_domain(garage, &self.domain).await? {
Ok(Response::builder()
.status(StatusCode::OK)
.body(string_body(format!(
"Domain '{}' is managed by Garage",
self.domain
)))?)
} else {
Err(Error::bad_request(format!(
"Domain '{}' is not managed by Garage",
self.domain
)))
}
}
}
async fn check_domain(garage: &Arc<Garage>, domain: &str) -> Result<bool, Error> {
// Resolve bucket from domain name, inferring if the website must be activated for the
// domain to be valid.
let (bucket_name, must_check_website) = if let Some(bname) = garage
.config
.s3_api
.root_domain
.as_ref()
.and_then(|rd| host_to_bucket(domain, rd))
{
(bname.to_string(), false)
} else if let Some(bname) = garage
.config
.s3_web
.as_ref()
.and_then(|sw| host_to_bucket(domain, sw.root_domain.as_str()))
{
(bname.to_string(), true)
} else {
(domain.to_string(), true)
};
let bucket = match garage
.bucket_helper()
.resolve_global_bucket_fast(&bucket_name)?
{
Some(b) => b,
None => return Ok(false),
};
if !must_check_website {
return Ok(true);
}
let bucket_state = bucket.state.as_option().unwrap();
let bucket_website_config = bucket_state.website_config.get();
match bucket_website_config {
Some(_v) => Ok(true),
None => Ok(false),
}
}

View file

@ -1,118 +0,0 @@
use std::collections::HashMap;
use std::sync::Arc;
use garage_util::background::*;
use garage_util::time::now_msec;
use garage_model::garage::Garage;
use crate::api::*;
use crate::error::Error;
use crate::{Admin, RequestHandler};
impl RequestHandler for LocalListWorkersRequest {
type Response = LocalListWorkersResponse;
async fn handle(
self,
_garage: &Arc<Garage>,
admin: &Admin,
) -> Result<LocalListWorkersResponse, Error> {
let workers = admin.background.get_worker_info();
let info = workers
.into_iter()
.filter(|(_, w)| {
(!self.busy_only
|| matches!(w.state, WorkerState::Busy | WorkerState::Throttled(_)))
&& (!self.error_only || w.errors > 0)
})
.map(|(id, w)| worker_info_to_api(id as u64, w))
.collect::<Vec<_>>();
Ok(LocalListWorkersResponse(info))
}
}
impl RequestHandler for LocalGetWorkerInfoRequest {
type Response = LocalGetWorkerInfoResponse;
async fn handle(
self,
_garage: &Arc<Garage>,
admin: &Admin,
) -> Result<LocalGetWorkerInfoResponse, Error> {
let info = admin
.background
.get_worker_info()
.get(&(self.id as usize))
.ok_or(Error::NoSuchWorker(self.id))?
.clone();
Ok(LocalGetWorkerInfoResponse(worker_info_to_api(
self.id, info,
)))
}
}
impl RequestHandler for LocalGetWorkerVariableRequest {
type Response = LocalGetWorkerVariableResponse;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<LocalGetWorkerVariableResponse, Error> {
let mut res = HashMap::new();
if let Some(k) = self.variable {
res.insert(k.clone(), garage.bg_vars.get(&k)?);
} else {
let vars = garage.bg_vars.get_all();
for (k, v) in vars.iter() {
res.insert(k.to_string(), v.to_string());
}
}
Ok(LocalGetWorkerVariableResponse(res))
}
}
impl RequestHandler for LocalSetWorkerVariableRequest {
type Response = LocalSetWorkerVariableResponse;
async fn handle(
self,
garage: &Arc<Garage>,
_admin: &Admin,
) -> Result<LocalSetWorkerVariableResponse, Error> {
garage.bg_vars.set(&self.variable, &self.value)?;
Ok(LocalSetWorkerVariableResponse {
variable: self.variable,
value: self.value,
})
}
}
// ---- helper functions ----
fn worker_info_to_api(id: u64, info: WorkerInfo) -> WorkerInfoResp {
WorkerInfoResp {
id,
name: info.name,
state: match info.state {
WorkerState::Busy => WorkerStateResp::Busy,
WorkerState::Throttled(t) => WorkerStateResp::Throttled { duration_secs: t },
WorkerState::Idle => WorkerStateResp::Idle,
WorkerState::Done => WorkerStateResp::Done,
},
errors: info.errors as u64,
consecutive_errors: info.consecutive_errors as u64,
last_error: info.last_error.map(|(message, t)| WorkerLastError {
message,
secs_ago: now_msec().saturating_sub(t) / 1000,
}),
tranquility: info.status.tranquility,
progress: info.status.progress,
queue_length: info.status.queue_length,
persistent_errors: info.status.persistent_errors,
freeform: info.status.freeform,
}
}

View file

@ -1,6 +1,6 @@
[package]
name = "garage_api_common"
version = "2.0.0"
version = "1.3.1"
authors = ["Alex Auvolat <alex@adnab.me>"]
edition = "2018"
license = "AGPL-3.0"
@ -23,13 +23,11 @@ bytes.workspace = true
chrono.workspace = true
crc32fast.workspace = true
crc32c.workspace = true
crc64fast-nvme.workspace = true
crypto-common.workspace = true
err-derive.workspace = true
thiserror.workspace = true
hex.workspace = true
hmac.workspace = true
md-5.workspace = true
idna.workspace = true
tracing.workspace = true
nom.workspace = true
pin-project.workspace = true

View file

@ -1,7 +1,7 @@
use std::convert::TryFrom;
use err_derive::Error;
use hyper::StatusCode;
use thiserror::Error;
use garage_util::error::Error as GarageError;
@ -12,48 +12,48 @@ use garage_model::helper::error::Error as HelperError;
pub enum CommonError {
// ---- INTERNAL ERRORS ----
/// Error related to deeper parts of Garage
#[error(display = "Internal error: {}", _0)]
InternalError(#[error(source)] GarageError),
#[error("Internal error: {0}")]
InternalError(#[from] GarageError),
/// Error related to Hyper
#[error(display = "Internal error (Hyper error): {}", _0)]
Hyper(#[error(source)] hyper::Error),
#[error("Internal error (Hyper error): {0}")]
Hyper(#[from] hyper::Error),
/// Error related to HTTP
#[error(display = "Internal error (HTTP error): {}", _0)]
Http(#[error(source)] http::Error),
#[error("Internal error (HTTP error): {0}")]
Http(#[from] http::Error),
// ---- GENERIC CLIENT ERRORS ----
/// Proper authentication was not provided
#[error(display = "Forbidden: {}", _0)]
#[error("Forbidden: {0}")]
Forbidden(String),
/// Generic bad request response with custom message
#[error(display = "Bad request: {}", _0)]
#[error("Bad request: {0}")]
BadRequest(String),
/// The client sent a header with invalid value
#[error(display = "Invalid header value: {}", _0)]
InvalidHeader(#[error(source)] hyper::header::ToStrError),
#[error("Invalid header value: {0}")]
InvalidHeader(#[from] hyper::header::ToStrError),
// ---- SPECIFIC ERROR CONDITIONS ----
// These have to be error codes referenced in the S3 spec here:
// https://docs.aws.amazon.com/AmazonS3/latest/API/ErrorResponses.html#ErrorCodeList
/// The bucket requested don't exists
#[error(display = "Bucket not found: {}", _0)]
#[error("Bucket not found: {0}")]
NoSuchBucket(String),
/// Tried to create a bucket that already exist
#[error(display = "Bucket already exists")]
#[error("Bucket already exists")]
BucketAlreadyExists,
/// Tried to delete a non-empty bucket
#[error(display = "Tried to delete a non-empty bucket")]
#[error("Tried to delete a non-empty bucket")]
BucketNotEmpty,
// Category: bad request
/// Bucket name is not valid according to AWS S3 specs
#[error(display = "Invalid bucket name: {}", _0)]
#[error("Invalid bucket name: {0}")]
InvalidBucketName(String),
}

View file

@ -9,7 +9,9 @@ use hyper::{body::Body, body::Incoming as IncomingBody, Request, Response, Statu
use garage_model::bucket_table::{BucketParams, CorsRule as GarageCorsRule};
use garage_model::garage::Garage;
use crate::common_error::{CommonError, OkOrBadRequest, OkOrInternalError};
use crate::common_error::{
helper_error_as_internal, CommonError, OkOrBadRequest, OkOrInternalError,
};
use crate::helpers::*;
pub fn find_matching_cors_rule<'a, B>(
@ -74,7 +76,7 @@ pub fn add_cors_headers(
Ok(())
}
pub fn handle_options_api(
pub async fn handle_options_api(
garage: Arc<Garage>,
req: &Request<IncomingBody>,
bucket_name: Option<String>,
@ -91,8 +93,16 @@ pub fn handle_options_api(
// OPTIONS calls are not auhtenticated).
if let Some(bn) = bucket_name {
let helper = garage.bucket_helper();
let bucket_opt = helper.resolve_global_bucket_fast(&bn)?;
if let Some(bucket) = bucket_opt {
let bucket_id = helper
.resolve_global_bucket_name(&bn)
.await
.map_err(helper_error_as_internal)?;
if let Some(id) = bucket_id {
let bucket = garage
.bucket_helper()
.get_existing_bucket(id)
.await
.map_err(helper_error_as_internal)?;
let bucket_params = bucket.state.into_option().unwrap();
handle_options_for_bucket(req, &bucket_params)
} else {

View file

@ -1,4 +1,3 @@
use std::borrow::Cow;
use std::convert::Infallible;
use std::fs::{self, Permissions};
use std::os::unix::fs::PermissionsExt;
@ -36,7 +35,7 @@ use garage_util::socket_address::UnixOrTCPSocketAddress;
use crate::helpers::{BoxBody, ErrorBody};
pub trait ApiEndpoint: Send + Sync + 'static {
fn name(&self) -> Cow<'static, str>;
fn name(&self) -> &'static str;
fn add_span_attributes(&self, span: SpanRef<'_>);
}
@ -59,6 +58,12 @@ pub trait ApiHandler: Send + Sync + 'static {
req: Request<IncomingBody>,
endpoint: Self::Endpoint,
) -> impl Future<Output = Result<Response<BoxBody<Self::Error>>, Self::Error>> + Send;
/// Returns the key id used to authenticate this request. The ID returned must be safe to
/// log.
fn key_id_from_request(&self, _req: &Request<IncomingBody>) -> Option<String> {
None
}
}
pub struct ApiServer<A: ApiHandler> {
@ -143,19 +148,20 @@ impl<A: ApiHandler> ApiServer<A> {
) -> Result<Response<BoxBody<A::Error>>, http::Error> {
let uri = req.uri().clone();
if let Ok(forwarded_for_ip_addr) =
let source = if let Ok(forwarded_for_ip_addr) =
forwarded_headers::handle_forwarded_for_headers(req.headers())
{
info!(
"{} (via {}) {} {}",
forwarded_for_ip_addr,
addr,
req.method(),
uri
);
format!("{forwarded_for_ip_addr} (via {addr})")
} else {
info!("{} {} {}", addr, req.method(), uri);
}
format!("{addr}")
};
// we only do this to log the access key, so we can discard any error
let key = self
.api_handler
.key_id_from_request(&req)
.map(|k| format!("(key {k}) "))
.unwrap_or_default();
info!("{source} {key}{} {uri}", req.method());
debug!("{:?}", req);
let tracer = opentelemetry::global::tracer("garage");
@ -344,7 +350,11 @@ where
while !*must_exit.borrow() {
let (stream, client_addr) = tokio::select! {
acc = listener.accept() => acc?,
acc = listener.accept() => match acc {
Ok(r) => r,
Err(e) if e.kind() == std::io::ErrorKind::ConnectionAborted => continue,
Err(e) => return Err(e.into()),
},
_ = must_exit.changed() => continue,
};

View file

@ -8,7 +8,6 @@ use hyper::{
body::{Body, Bytes},
Request, Response,
};
use idna::domain_to_unicode;
use serde::{Deserialize, Serialize};
use garage_model::bucket_table::BucketParams;
@ -97,7 +96,7 @@ pub fn authority_to_host(authority: &str) -> Result<String, Error> {
authority
))),
};
authority.map(|h| domain_to_unicode(h).0)
authority.map(|h| h.to_ascii_lowercase())
}
/// Extract the bucket name and the key name from an HTTP path and possibly a bucket provided in

View file

@ -45,68 +45,6 @@ macro_rules! router_match {
}
}
}};
(@gen_path_parser_v2 ($method:expr, $reqpath:expr, $pathprefix:literal, $query:expr, $req:expr)
[
$(@special $spec_meth:ident $spec_path:pat => $spec_api:ident $spec_params:tt,)*
$($meth:ident $api:ident $params:tt,)*
]) => {{
{
#[allow(unused_parens)]
match ($method, $reqpath) {
$(
(&Method::$spec_meth, $spec_path) => AdminApiRequest::$spec_api (
router_match!(@@gen_parse_request $spec_api, $spec_params, $query, $req)
),
)*
$(
(&Method::$meth, concat!($pathprefix, stringify!($api)))
=> AdminApiRequest::$api (
router_match!(@@gen_parse_request $api, $params, $query, $req)
),
)*
(m, p) => {
return Err(Error::bad_request(format!(
"Unknown API endpoint: {} {}",
m, p
)))
}
}
}
}};
(@@gen_parse_request $api:ident, (), $query: expr, $req:expr) => {{
paste!(
[< $api Request >]
)
}};
(@@gen_parse_request $api:ident, (body), $query: expr, $req:expr) => {{
paste!({
parse_json_body::< [<$api Request>], _, Error>($req).await?
})
}};
(@@gen_parse_request $api:ident, (body_field, $($conv:ident $(($conv_arg:expr))? :: $param:ident),*), $query: expr, $req:expr)
=>
{{
paste!({
let body = parse_json_body::< [<$api RequestBody>], _, Error>($req).await?;
[< $api Request >] {
body,
$(
$param: router_match!(@@parse_param $query, $conv $(($conv_arg))?, $param),
)+
}
})
}};
(@@gen_parse_request $api:ident, ($($conv:ident $(($conv_arg:expr))? :: $param:ident),*), $query: expr, $req:expr)
=>
{{
paste!({
[< $api Request >] {
$(
$param: router_match!(@@parse_param $query, $conv $(($conv_arg))?, $param),
)+
}
})
}};
(@gen_parser ($keyword:expr, $key:ident, $query:expr, $header:expr),
key: [$($kw_k:ident $(if $required_k:ident)? $(header $header_k:expr)? => $api_k:ident $(($($conv_k:ident :: $param_k:ident),*))?,)*],
no_key: [$($kw_nk:ident $(if $required_nk:ident)? $(if_header $header_nk:expr)? => $api_nk:ident $(($($conv_nk:ident :: $param_nk:ident),*))?,)*]) => {{
@ -141,19 +79,13 @@ macro_rules! router_match {
}
}};
(@@parse_param $query:expr, default, $param:ident) => {{
Default::default()
}};
(@@parse_param $query:expr, query_opt, $param:ident) => {{
// extract optional query parameter
$query.$param.take().map(|param| param.into_owned())
}};
(@@parse_param $query:expr, query, $param:ident) => {{
// extract mendatory query parameter
$query.$param.take()
.ok_or_bad_request(
format!("Missing argument `{}` for endpoint", stringify!($param))
)?.into_owned()
$query.$param.take().ok_or_bad_request("Missing argument for endpoint")?.into_owned()
}};
(@@parse_param $query:expr, opt_parse, $param:ident) => {{
// extract and parse optional query parameter
@ -167,22 +99,10 @@ macro_rules! router_match {
(@@parse_param $query:expr, parse, $param:ident) => {{
// extract and parse mandatory query parameter
// both missing and un-parseable parameters are reported as errors
$query.$param.take()
.ok_or_bad_request(
format!("Missing argument `{}` for endpoint", stringify!($param))
)?
$query.$param.take().ok_or_bad_request("Missing argument for endpoint")?
.parse()
.map_err(|_| Error::bad_request("Failed to parse query parameter"))?
}};
(@@parse_param $query:expr, parse_default($default:expr), $param:ident) => {{
// extract and parse optional query parameter
// using provided value as default if paramter is missing
$query.$param.take().map(|x| x
.parse()
.map_err(|_| Error::bad_request("Failed to parse query parameter")))
.transpose()?
.unwrap_or($default)
}};
(@func
$(#[$doc:meta])*
pub enum Endpoint {
@ -267,7 +187,6 @@ macro_rules! generateQueryParameters {
},
)*
$(
// FIXME: remove if !v.is_empty() ?
$f_param => if !v.is_empty() {
if res.$f_name.replace(v).is_some() {
return Err(Error::bad_request(format!(

View file

@ -4,7 +4,6 @@ use std::hash::Hasher;
use base64::prelude::*;
use crc32c::Crc32cHasher as Crc32c;
use crc32fast::Hasher as Crc32;
use crc64fast_nvme::Digest as Crc64Nvme;
use md5::{Digest, Md5};
use sha1::Sha1;
use sha2::Sha256;
@ -24,14 +23,11 @@ pub const X_AMZ_CHECKSUM_ALGORITHM: HeaderName =
pub const X_AMZ_CHECKSUM_MODE: HeaderName = HeaderName::from_static("x-amz-checksum-mode");
pub const X_AMZ_CHECKSUM_CRC32: HeaderName = HeaderName::from_static("x-amz-checksum-crc32");
pub const X_AMZ_CHECKSUM_CRC32C: HeaderName = HeaderName::from_static("x-amz-checksum-crc32c");
pub const X_AMZ_CHECKSUM_CRC64NVME: HeaderName =
HeaderName::from_static("x-amz-checksum-crc64nvme");
pub const X_AMZ_CHECKSUM_SHA1: HeaderName = HeaderName::from_static("x-amz-checksum-sha1");
pub const X_AMZ_CHECKSUM_SHA256: HeaderName = HeaderName::from_static("x-amz-checksum-sha256");
pub type Crc32Checksum = [u8; 4];
pub type Crc32cChecksum = [u8; 4];
pub type Crc64NvmeChecksum = [u8; 8];
pub type Md5Checksum = [u8; 16];
pub type Sha1Checksum = [u8; 20];
pub type Sha256Checksum = [u8; 32];
@ -49,7 +45,6 @@ pub struct ExpectedChecksums {
pub struct Checksummer {
pub crc32: Option<Crc32>,
pub crc32c: Option<Crc32c>,
pub crc64nvme: Option<Crc64Nvme>,
pub md5: Option<Md5>,
pub sha1: Option<Sha1>,
pub sha256: Option<Sha256>,
@ -59,7 +54,6 @@ pub struct Checksummer {
pub struct Checksums {
pub crc32: Option<Crc32Checksum>,
pub crc32c: Option<Crc32cChecksum>,
pub crc64nvme: Option<Crc64NvmeChecksum>,
pub md5: Option<Md5Checksum>,
pub sha1: Option<Sha1Checksum>,
pub sha256: Option<Sha256Checksum>,
@ -70,7 +64,6 @@ impl Checksummer {
Self {
crc32: None,
crc32c: None,
crc64nvme: None,
md5: None,
sha1: None,
sha256: None,
@ -103,9 +96,6 @@ impl Checksummer {
if matches!(&expected.extra, Some(ChecksumValue::Crc32c(_))) {
self.crc32c = Some(Crc32c::default());
}
if matches!(&expected.extra, Some(ChecksumValue::Crc64Nvme(_))) {
self.crc64nvme = Some(Crc64Nvme::default());
}
if matches!(&expected.extra, Some(ChecksumValue::Sha1(_))) {
self.sha1 = Some(Sha1::new());
}
@ -119,9 +109,6 @@ impl Checksummer {
Some(ChecksumAlgorithm::Crc32c) => {
self.crc32c = Some(Crc32c::default());
}
Some(ChecksumAlgorithm::Crc64Nvme) => {
self.crc64nvme = Some(Crc64Nvme::default());
}
Some(ChecksumAlgorithm::Sha1) => {
self.sha1 = Some(Sha1::new());
}
@ -140,9 +127,6 @@ impl Checksummer {
if let Some(crc32c) = &mut self.crc32c {
crc32c.write(bytes);
}
if let Some(crc64nvme) = &mut self.crc64nvme {
crc64nvme.write(bytes);
}
if let Some(md5) = &mut self.md5 {
md5.update(bytes);
}
@ -160,7 +144,6 @@ impl Checksummer {
crc32c: self
.crc32c
.map(|x| u32::to_be_bytes(u32::try_from(x.finish()).unwrap())),
crc64nvme: self.crc64nvme.map(|x| u64::to_be_bytes(x.sum64())),
md5: self.md5.map(|x| x.finalize()[..].try_into().unwrap()),
sha1: self.sha1.map(|x| x.finalize()[..].try_into().unwrap()),
sha256: self.sha256.map(|x| x.finalize()[..].try_into().unwrap()),
@ -207,9 +190,6 @@ impl Checksums {
None => None,
Some(ChecksumAlgorithm::Crc32) => Some(ChecksumValue::Crc32(self.crc32.unwrap())),
Some(ChecksumAlgorithm::Crc32c) => Some(ChecksumValue::Crc32c(self.crc32c.unwrap())),
Some(ChecksumAlgorithm::Crc64Nvme) => {
Some(ChecksumValue::Crc64Nvme(self.crc64nvme.unwrap()))
}
Some(ChecksumAlgorithm::Sha1) => Some(ChecksumValue::Sha1(self.sha1.unwrap())),
Some(ChecksumAlgorithm::Sha256) => Some(ChecksumValue::Sha256(self.sha256.unwrap())),
}
@ -222,7 +202,6 @@ pub fn parse_checksum_algorithm(algo: &str) -> Result<ChecksumAlgorithm, Error>
match algo {
"CRC32" => Ok(ChecksumAlgorithm::Crc32),
"CRC32C" => Ok(ChecksumAlgorithm::Crc32c),
"CRC64NVME" => Ok(ChecksumAlgorithm::Crc64Nvme),
"SHA1" => Ok(ChecksumAlgorithm::Sha1),
"SHA256" => Ok(ChecksumAlgorithm::Sha256),
_ => Err(Error::bad_request("invalid checksum algorithm")),
@ -246,7 +225,6 @@ pub fn request_trailer_checksum_algorithm(
None => Ok(None),
Some(x) if x == X_AMZ_CHECKSUM_CRC32 => Ok(Some(ChecksumAlgorithm::Crc32)),
Some(x) if x == X_AMZ_CHECKSUM_CRC32C => Ok(Some(ChecksumAlgorithm::Crc32c)),
Some(x) if x == X_AMZ_CHECKSUM_CRC64NVME => Ok(Some(ChecksumAlgorithm::Crc64Nvme)),
Some(x) if x == X_AMZ_CHECKSUM_SHA1 => Ok(Some(ChecksumAlgorithm::Sha1)),
Some(x) if x == X_AMZ_CHECKSUM_SHA256 => Ok(Some(ChecksumAlgorithm::Sha256)),
_ => Err(Error::bad_request("invalid checksum algorithm")),
@ -265,12 +243,6 @@ pub fn request_checksum_value(
if headers.contains_key(X_AMZ_CHECKSUM_CRC32C) {
ret.push(extract_checksum_value(headers, ChecksumAlgorithm::Crc32c)?);
}
if headers.contains_key(X_AMZ_CHECKSUM_CRC64NVME) {
ret.push(extract_checksum_value(
headers,
ChecksumAlgorithm::Crc64Nvme,
)?);
}
if headers.contains_key(X_AMZ_CHECKSUM_SHA1) {
ret.push(extract_checksum_value(headers, ChecksumAlgorithm::Sha1)?);
}
@ -309,14 +281,6 @@ pub fn extract_checksum_value(
.ok_or_bad_request("invalid x-amz-checksum-crc32c header")?;
Ok(ChecksumValue::Crc32c(crc32c))
}
ChecksumAlgorithm::Crc64Nvme => {
let crc64nvme = headers
.get(X_AMZ_CHECKSUM_CRC64NVME)
.and_then(|x| BASE64_STANDARD.decode(&x).ok())
.and_then(|x| x.try_into().ok())
.ok_or_bad_request("invalid x-amz-checksum-crc64nvme header")?;
Ok(ChecksumValue::Crc64Nvme(crc64nvme))
}
ChecksumAlgorithm::Sha1 => {
let sha1 = headers
.get(X_AMZ_CHECKSUM_SHA1)
@ -347,9 +311,6 @@ pub fn add_checksum_response_headers(
Some(ChecksumValue::Crc32c(crc32c)) => {
resp = resp.header(X_AMZ_CHECKSUM_CRC32C, BASE64_STANDARD.encode(&crc32c));
}
Some(ChecksumValue::Crc64Nvme(crc64nvme)) => {
resp = resp.header(X_AMZ_CHECKSUM_CRC64NVME, BASE64_STANDARD.encode(&crc64nvme));
}
Some(ChecksumValue::Sha1(sha1)) => {
resp = resp.header(X_AMZ_CHECKSUM_SHA1, BASE64_STANDARD.encode(&sha1));
}

View file

@ -1,4 +1,4 @@
use err_derive::Error;
use thiserror::Error;
use crate::common_error::CommonError;
pub use crate::common_error::{CommonErrorDerivative, OkOrBadRequest, OkOrInternalError};
@ -6,21 +6,21 @@ pub use crate::common_error::{CommonErrorDerivative, OkOrBadRequest, OkOrInterna
/// Errors of this crate
#[derive(Debug, Error)]
pub enum Error {
#[error(display = "{}", _0)]
#[error("{0}")]
/// Error from common error
Common(CommonError),
/// Authorization Header Malformed
#[error(display = "Authorization header malformed, unexpected scope: {}", _0)]
#[error("Authorization header malformed, unexpected scope: {0}")]
AuthorizationHeaderMalformed(String),
// Category: bad request
/// The request contained an invalid UTF-8 sequence in its path or in other parameters
#[error(display = "Invalid UTF-8: {}", _0)]
InvalidUtf8Str(#[error(source)] std::str::Utf8Error),
#[error("Invalid UTF-8: {0}")]
InvalidUtf8Str(#[from] std::str::Utf8Error),
/// The provided digest (checksum) value was invalid
#[error(display = "Invalid digest: {}", _0)]
#[error("Invalid digest: {0}")]
InvalidDigest(String),
}

View file

@ -64,12 +64,12 @@ pub struct VerifiedRequest {
pub content_sha256_header: ContentSha256Header,
}
pub fn verify_request(
pub async fn verify_request(
garage: &Garage,
mut req: Request<IncomingBody>,
service: &'static str,
) -> Result<VerifiedRequest, Error> {
let checked_signature = payload::check_payload_signature(&garage, &mut req, service)?;
let checked_signature = payload::check_payload_signature(&garage, &mut req, service).await?;
let request = streaming::parse_streaming_body(
req,

View file

@ -9,7 +9,6 @@ use sha2::{Digest, Sha256};
use garage_table::*;
use garage_util::data::Hash;
use garage_util::time::now_msec;
use garage_model::garage::Garage;
use garage_model::key_table::*;
@ -33,7 +32,7 @@ pub struct CheckedSignature {
pub signature_header: Option<String>,
}
pub fn check_payload_signature(
pub async fn check_payload_signature(
garage: &Garage,
request: &mut Request<IncomingBody>,
service: &'static str,
@ -44,9 +43,9 @@ pub fn check_payload_signature(
// We check for presigned-URL-style authentication first, because
// the browser or something else could inject an Authorization header
// that is totally unrelated to AWS signatures.
check_presigned_signature(garage, service, request, query)
check_presigned_signature(garage, service, request, query).await
} else if request.headers().contains_key(AUTHORIZATION) {
check_standard_signature(garage, service, request, query)
check_standard_signature(garage, service, request, query).await
} else {
// Unsigned (anonymous) request
let content_sha256 = request
@ -94,7 +93,7 @@ fn parse_x_amz_content_sha256(header: Option<&str>) -> Result<ContentSha256Heade
}
}
fn check_standard_signature(
async fn check_standard_signature(
garage: &Garage,
service: &'static str,
request: &Request<IncomingBody>,
@ -105,7 +104,7 @@ fn check_standard_signature(
// Verify that all necessary request headers are included in signed_headers
// The following must be included for all signatures:
// - the Host header (mandatory)
// - all x-amz-* headers used in the request
// - all x-amz-* headers used in the request (except x-amz-content-sha256)
// AWS also indicates that the Content-Type header should be signed if
// it is used, but Minio client doesn't sign it so we don't check it for compatibility.
let signed_headers = split_signed_headers(&authorization)?;
@ -129,7 +128,7 @@ fn check_standard_signature(
trace!("canonical request:\n{}", canonical_request);
trace!("string to sign:\n{}", string_to_sign);
let key = verify_v4(garage, service, &authorization, string_to_sign.as_bytes())?;
let key = verify_v4(garage, service, &authorization, string_to_sign.as_bytes()).await?;
let content_sha256_header = parse_x_amz_content_sha256(Some(&authorization.content_sha256))?;
@ -140,7 +139,7 @@ fn check_standard_signature(
})
}
fn check_presigned_signature(
async fn check_presigned_signature(
garage: &Garage,
service: &'static str,
request: &mut Request<IncomingBody>,
@ -152,7 +151,7 @@ fn check_presigned_signature(
// Verify that all necessary request headers are included in signed_headers
// For AWSv4 pre-signed URLs, the following must be included:
// - the Host header (mandatory)
// - all x-amz-* headers used in the request
// - all x-amz-* headers used in the request (except x-amz-content-sha256)
let signed_headers = split_signed_headers(&authorization)?;
verify_signed_headers(request.headers(), &signed_headers)?;
@ -179,7 +178,7 @@ fn check_presigned_signature(
trace!("canonical request (presigned url):\n{}", canonical_request);
trace!("string to sign (presigned url):\n{}", string_to_sign);
let key = verify_v4(garage, service, &authorization, string_to_sign.as_bytes())?;
let key = verify_v4(garage, service, &authorization, string_to_sign.as_bytes()).await?;
// In the page on presigned URLs, AWS specifies that if a signed query
// parameter and a signed header of the same name have different values,
@ -269,7 +268,9 @@ fn verify_signed_headers(headers: &HeaderMap, signed_headers: &[HeaderName]) ->
return Err(Error::bad_request("Header `Host` should be signed"));
}
for (name, _) in headers.iter() {
if name.as_str().starts_with("x-amz-") {
// Enforce signature of all x-amz-* headers, except x-amz-content-sh256
// because it is included in the canonical request in all cases
if name.as_str().starts_with("x-amz-") && name != X_AMZ_CONTENT_SHA256 {
if !signed_headers.contains(name) {
return Err(Error::bad_request(format!(
"Header `{}` should be signed",
@ -379,7 +380,7 @@ pub fn parse_date(date: &str) -> Result<DateTime<Utc>, Error> {
Ok(Utc.from_utc_datetime(&date))
}
pub fn verify_v4(
pub async fn verify_v4(
garage: &Garage,
service: &str,
auth: &Authorization,
@ -392,18 +393,12 @@ pub fn verify_v4(
let key = garage
.key_table
.get_local(&EmptyKey, &auth.key_id)?
.get(&EmptyKey, &auth.key_id)
.await?
.filter(|k| !k.state.is_deleted())
.ok_or_else(|| Error::forbidden(format!("No such key: {}", &auth.key_id)))?;
let key_p = key.params().unwrap();
if key_p.is_expired(now_msec()) {
return Err(Error::forbidden(format!(
"Access key {} has expired",
key.key_id
)));
}
let mut hmac = signing_hmac(
&auth.date,
&key_p.secret_key,
@ -424,7 +419,7 @@ pub fn verify_v4(
// ============ Authorization header, or X-Amz-* query params =========
pub struct Authorization {
key_id: String,
pub key_id: String,
scope: String,
signed_headers: String,
signature: String,
@ -433,7 +428,7 @@ pub struct Authorization {
}
impl Authorization {
fn parse_header(headers: &HeaderMap) -> Result<Self, Error> {
pub fn parse_header(headers: &HeaderMap) -> Result<Self, Error> {
let authorization = headers
.get(AUTHORIZATION)
.ok_or_bad_request("Missing authorization header")?
@ -475,8 +470,7 @@ impl Authorization {
let date = headers
.get(X_AMZ_DATE)
.ok_or_bad_request("Missing X-Amz-Date field")
.map_err(Error::from)?
.ok_or_bad_request("Missing X-Amz-Date field")?
.to_str()?;
let date = parse_date(date)?;

View file

@ -1,6 +1,6 @@
[package]
name = "garage_api_k2v"
version = "2.0.0"
version = "1.3.1"
authors = ["Alex Auvolat <alex@adnab.me>"]
edition = "2018"
license = "AGPL-3.0"
@ -20,7 +20,7 @@ garage_util = { workspace = true, features = [ "k2v" ] }
garage_api_common.workspace = true
base64.workspace = true
err-derive.workspace = true
thiserror.workspace = true
tracing.workspace = true
futures.workspace = true

View file

@ -1,4 +1,3 @@
use std::borrow::Cow;
use std::sync::Arc;
use hyper::{body::Incoming as IncomingBody, Method, Request, Response};
@ -77,19 +76,25 @@ impl ApiHandler for K2VApiServer {
// The OPTIONS method is processed early, before we even check for an API key
if let Endpoint::Options = endpoint {
let options_res = handle_options_api(garage, &req, Some(bucket_name))
.await
.ok_or_bad_request("Error handling OPTIONS")?;
return Ok(options_res.map(|_empty_body: EmptyBody| empty_body()));
}
let verified_request = verify_request(&garage, req, "k2v")?;
let verified_request = verify_request(&garage, req, "k2v").await?;
let req = verified_request.request;
let api_key = verified_request.access_key;
let bucket_id = garage
.bucket_helper()
.resolve_bucket(&bucket_name, &api_key)
.await
.map_err(pass_helper_error)?;
let bucket = garage
.bucket_helper()
.resolve_bucket_fast(&bucket_name, &api_key)
.map_err(pass_helper_error)?;
let bucket_id = bucket.id;
.get_existing_bucket(bucket_id)
.await
.map_err(helper_error_as_internal)?;
let bucket_params = bucket.state.into_option().unwrap();
let allowed = match endpoint.authorization_type() {
@ -171,11 +176,17 @@ impl ApiHandler for K2VApiServer {
Ok(resp_ok)
}
fn key_id_from_request(&self, req: &Request<IncomingBody>) -> Option<String> {
garage_api_common::signature::payload::Authorization::parse_header(req.headers())
.map(|auth| auth.key_id)
.ok()
}
}
impl ApiEndpoint for K2VApiEndpoint {
fn name(&self) -> Cow<'static, str> {
Cow::Borrowed(self.endpoint.name())
fn name(&self) -> &'static str {
self.endpoint.name()
}
fn add_span_attributes(&self, span: SpanRef<'_>) {

View file

@ -1,9 +1,9 @@
use err_derive::Error;
use hyper::header::HeaderValue;
use hyper::{HeaderMap, StatusCode};
use thiserror::Error;
pub(crate) use garage_api_common::common_error::pass_helper_error;
use garage_api_common::common_error::{commonErrorDerivative, CommonError};
pub(crate) use garage_api_common::common_error::{helper_error_as_internal, pass_helper_error};
pub use garage_api_common::common_error::{
CommonErrorDerivative, OkOrBadRequest, OkOrInternalError,
};
@ -14,38 +14,38 @@ use garage_api_common::signature::error::Error as SignatureError;
/// Errors of this crate
#[derive(Debug, Error)]
pub enum Error {
#[error(display = "{}", _0)]
#[error("{0}")]
/// Error from common error
Common(#[error(source)] CommonError),
Common(#[from] CommonError),
// Category: cannot process
/// Authorization Header Malformed
#[error(display = "Authorization header malformed, unexpected scope: {}", _0)]
#[error("Authorization header malformed, unexpected scope: {0}")]
AuthorizationHeaderMalformed(String),
/// The provided digest (checksum) value was invalid
#[error(display = "Invalid digest: {}", _0)]
#[error("Invalid digest: {0}")]
InvalidDigest(String),
/// The object requested don't exists
#[error(display = "Key not found")]
#[error("Key not found")]
NoSuchKey,
/// Some base64 encoded data was badly encoded
#[error(display = "Invalid base64: {}", _0)]
InvalidBase64(#[error(source)] base64::DecodeError),
#[error("Invalid base64: {0}")]
InvalidBase64(#[from] base64::DecodeError),
/// Invalid causality token
#[error(display = "Invalid causality token")]
#[error("Invalid causality token")]
InvalidCausalityToken,
/// The client asked for an invalid return format (invalid Accept header)
#[error(display = "Not acceptable: {}", _0)]
#[error("Not acceptable: {0}")]
NotAcceptable(String),
/// The request contained an invalid UTF-8 sequence in its path or in other parameters
#[error(display = "Invalid UTF-8: {}", _0)]
InvalidUtf8Str(#[error(source)] std::str::Utf8Error),
#[error("Invalid UTF-8: {0}")]
InvalidUtf8Str(#[from] std::str::Utf8Error),
}
commonErrorDerivative!(Error);

View file

@ -1,6 +1,6 @@
[package]
name = "garage_api_s3"
version = "2.0.0"
version = "1.3.1"
authors = ["Alex Auvolat <alex@adnab.me>"]
edition = "2018"
license = "AGPL-3.0"
@ -29,10 +29,8 @@ bytes.workspace = true
chrono.workspace = true
crc32fast.workspace = true
crc32c.workspace = true
crc64fast-nvme.workspace = true
err-derive.workspace = true
thiserror.workspace = true
hex.workspace = true
hmac.workspace = true
tracing.workspace = true
md-5.workspace = true
pin-project.workspace = true

View file

@ -1,4 +1,3 @@
use std::borrow::Cow;
use std::sync::Arc;
use hyper::header;
@ -118,11 +117,11 @@ impl ApiHandler for S3ApiServer {
return handle_post_object(garage, req, bucket_name.unwrap()).await;
}
if let Endpoint::Options = endpoint {
let options_res = handle_options_api(garage, &req, bucket_name)?;
let options_res = handle_options_api(garage, &req, bucket_name).await?;
return Ok(options_res.map(|_empty_body: EmptyBody| empty_body()));
}
let verified_request = verify_request(&garage, req, "s3")?;
let verified_request = verify_request(&garage, req, "s3").await?;
let req = verified_request.request;
let api_key = verified_request.access_key;
@ -140,11 +139,15 @@ impl ApiHandler for S3ApiServer {
return handle_create_bucket(&garage, req, &api_key.key_id, bucket_name).await;
}
let bucket_id = garage
.bucket_helper()
.resolve_bucket(&bucket_name, &api_key)
.await
.map_err(pass_helper_error)?;
let bucket = garage
.bucket_helper()
.resolve_bucket_fast(&bucket_name, &api_key)
.map_err(pass_helper_error)?;
let bucket_id = bucket.id;
.get_existing_bucket(bucket_id)
.await?;
let bucket_params = bucket.state.into_option().unwrap();
let allowed = match endpoint.authorization_type() {
@ -223,6 +226,7 @@ impl ApiHandler for S3ApiServer {
Endpoint::DeleteBucket {} => handle_delete_bucket(ctx).await,
Endpoint::GetBucketLocation {} => handle_get_bucket_location(ctx),
Endpoint::GetBucketVersioning {} => handle_get_bucket_versioning(),
Endpoint::GetBucketAcl {} => handle_get_bucket_acl(ctx),
Endpoint::ListObjects {
delimiter,
encoding_type,
@ -339,11 +343,17 @@ impl ApiHandler for S3ApiServer {
Ok(resp_ok)
}
fn key_id_from_request(&self, req: &Request<IncomingBody>) -> Option<String> {
garage_api_common::signature::payload::Authorization::parse_header(req.headers())
.map(|auth| auth.key_id)
.ok()
}
}
impl ApiEndpoint for S3ApiEndpoint {
fn name(&self) -> Cow<'static, str> {
Cow::Borrowed(self.endpoint.name())
fn name(&self) -> &'static str {
self.endpoint.name()
}
fn add_span_attributes(&self, span: SpanRef<'_>) {

View file

@ -5,7 +5,7 @@ use hyper::{Request, Response, StatusCode};
use garage_model::bucket_alias_table::*;
use garage_model::bucket_table::Bucket;
use garage_model::garage::Garage;
use garage_model::key_table::Key;
use garage_model::key_table::{Key, KeyParams};
use garage_model::permission::BucketKeyPerm;
use garage_table::util::*;
use garage_util::crdt::*;
@ -44,6 +44,55 @@ pub fn handle_get_bucket_versioning() -> Result<Response<ResBody>, Error> {
.body(string_body(xml))?)
}
pub fn handle_get_bucket_acl(ctx: ReqCtx) -> Result<Response<ResBody>, Error> {
let ReqCtx {
bucket_id, api_key, ..
} = ctx;
let key_p = api_key.params().ok_or_internal_error(
"Key should not be in deleted state at this point (in handle_get_bucket_acl)",
)?;
let mut grants: Vec<s3_xml::Grant> = vec![];
let kp = api_key.bucket_permissions(&bucket_id);
if kp.allow_owner {
grants.push(s3_xml::Grant {
grantee: create_grantee(&key_p, &api_key),
permission: s3_xml::Value("FULL_CONTROL".to_string()),
});
} else {
if kp.allow_read {
grants.push(s3_xml::Grant {
grantee: create_grantee(&key_p, &api_key),
permission: s3_xml::Value("READ".to_string()),
});
grants.push(s3_xml::Grant {
grantee: create_grantee(&key_p, &api_key),
permission: s3_xml::Value("READ_ACP".to_string()),
});
}
if kp.allow_write {
grants.push(s3_xml::Grant {
grantee: create_grantee(&key_p, &api_key),
permission: s3_xml::Value("WRITE".to_string()),
});
}
}
let access_control_policy = s3_xml::AccessControlPolicy {
xmlns: (),
owner: None,
acl: s3_xml::AccessControlList { entries: grants },
};
let xml = s3_xml::to_xml_with_header(&access_control_policy)?;
trace!("xml: {}", xml);
Ok(Response::builder()
.header("Content-Type", "application/xml")
.body(string_body(xml))?)
}
pub async fn handle_list_buckets(
garage: &Garage,
api_key: &Key,
@ -143,16 +192,21 @@ pub async fn handle_create_bucket(
let api_key = helper.key().get_existing_key(api_key_id).await?;
let key_params = api_key.params().unwrap();
let existing_bucket = helper
.bucket()
.resolve_bucket(&bucket_name, &api_key.key_id)
.await?;
let existing_bucket = if let Some(Some(bucket_id)) = key_params.local_aliases.get(&bucket_name)
{
Some(*bucket_id)
} else {
helper
.bucket()
.resolve_global_bucket_name(&bucket_name)
.await?
};
if let Some(bucket) = existing_bucket {
if let Some(bucket_id) = existing_bucket {
// Check we have write or owner permission on the bucket,
// in that case it's fine, return 200 OK, bucket exists;
// otherwise return a forbidden error.
let kp = api_key.bucket_permissions(&bucket.id);
let kp = api_key.bucket_permissions(&bucket_id);
if !(kp.allow_write || kp.allow_owner) {
return Err(CommonError::BucketAlreadyExists.into());
}
@ -167,7 +221,7 @@ pub async fn handle_create_bucket(
}
// Create the bucket!
if !is_valid_bucket_name(&bucket_name) {
if !is_valid_bucket_name(&bucket_name, garage.config.allow_punycode) {
return Err(Error::bad_request(format!(
"{}: {}",
bucket_name, INVALID_BUCKET_NAME_MESSAGE
@ -236,11 +290,11 @@ pub async fn handle_delete_bucket(ctx: ReqCtx) -> Result<Response<ResBody>, Erro
// 1. delete bucket alias
if is_local_alias {
helper
.unset_local_bucket_alias(*bucket_id, &api_key.key_id, bucket_name)
.purge_local_bucket_alias(*bucket_id, &api_key.key_id, bucket_name)
.await?;
} else {
helper
.unset_global_bucket_alias(*bucket_id, bucket_name)
.purge_global_bucket_alias(*bucket_id, bucket_name)
.await?;
}
@ -306,6 +360,15 @@ fn parse_create_bucket_xml(xml_bytes: &[u8]) -> Option<Option<String>> {
Some(ret)
}
fn create_grantee(key_params: &KeyParams, api_key: &Key) -> s3_xml::Grantee {
s3_xml::Grantee {
xmlns_xsi: (),
typ: "CanonicalUser".to_string(),
display_name: Some(s3_xml::Value(key_params.name.get().to_string())),
id: Some(s3_xml::Value(api_key.key_id.to_string())),
}
}
#[cfg(test)]
mod tests {
use super::*;

View file

@ -24,11 +24,12 @@ use garage_api_common::helpers::*;
use garage_api_common::signature::checksum::*;
use crate::api_server::{ReqBody, ResBody};
use crate::encryption::{EncryptionParams, OekDerivationInfo};
use crate::encryption::EncryptionParams;
use crate::error::*;
use crate::get::{full_object_byte_stream, PreconditionHeaders};
use crate::get::{check_version_not_deleted, full_object_byte_stream, PreconditionHeaders};
use crate::multipart;
use crate::put::{extract_metadata_headers, save_stream, ChecksumMode, SaveStreamResult};
use crate::website::X_AMZ_WEBSITE_REDIRECT_LOCATION;
use crate::xml::{self as s3_xml, xmlns_tag};
pub const X_AMZ_COPY_SOURCE_IF_MATCH: HeaderName =
@ -65,18 +66,8 @@ pub async fn handle_copy(
&ctx.garage,
req.headers(),
&source_version_meta.encryption,
OekDerivationInfo::for_object(&source_object, source_version),
)?;
let dest_uuid = gen_uuid();
let dest_encryption = EncryptionParams::new_from_headers(
&ctx.garage,
req.headers(),
OekDerivationInfo {
bucket_id: ctx.bucket_id,
version_id: dest_uuid,
object_key: dest_key,
},
)?;
let dest_encryption = EncryptionParams::new_from_headers(&ctx.garage, req.headers())?;
// Extract source checksum info before source_object_meta_inner is consumed
let source_checksum = source_object_meta_inner.checksum;
@ -94,7 +85,18 @@ pub async fn handle_copy(
Some(v) if v == hyper::header::HeaderValue::from_static("REPLACE") => {
extract_metadata_headers(req.headers())?
}
_ => source_object_meta_inner.into_owned().headers,
_ => {
// The x-amz-website-redirect-location header is not copied, instead
// it is replaced by the value from the request (or removed if no
// value was specified)
let is_redirect =
|(key, _): &(String, String)| key == X_AMZ_WEBSITE_REDIRECT_LOCATION.as_str();
let mut headers: Vec<_> = source_object_meta_inner.headers.clone();
headers.retain(|h| !is_redirect(h));
let new_headers = extract_metadata_headers(req.headers())?;
headers.extend(new_headers.into_iter().filter(is_redirect));
headers
}
},
checksum: source_checksum,
};
@ -125,7 +127,6 @@ pub async fn handle_copy(
handle_copy_metaonly(
ctx,
dest_key,
dest_uuid,
dest_object_meta,
dest_encryption,
source_version,
@ -149,7 +150,6 @@ pub async fn handle_copy(
handle_copy_reencrypt(
ctx,
dest_key,
dest_uuid,
dest_object_meta,
dest_encryption,
source_version,
@ -181,7 +181,6 @@ pub async fn handle_copy(
async fn handle_copy_metaonly(
ctx: ReqCtx,
dest_key: &str,
dest_uuid: Uuid,
dest_object_meta: ObjectVersionMetaInner,
dest_encryption: EncryptionParams,
source_version: &ObjectVersion,
@ -195,6 +194,7 @@ async fn handle_copy_metaonly(
} = ctx;
// Generate parameters for copied object
let new_uuid = gen_uuid();
let new_timestamp = now_msec();
let new_meta = ObjectVersionMeta {
@ -204,7 +204,7 @@ async fn handle_copy_metaonly(
};
let res = SaveStreamResult {
version_uuid: dest_uuid,
version_uuid: new_uuid,
version_timestamp: new_timestamp,
etag: new_meta.etag.clone(),
};
@ -216,7 +216,7 @@ async fn handle_copy_metaonly(
// bytes is either plaintext before&after or encrypted with the
// same keys, so it's ok to just copy it as is
let dest_object_version = ObjectVersion {
uuid: dest_uuid,
uuid: new_uuid,
timestamp: new_timestamp,
state: ObjectVersionState::Complete(ObjectVersionData::Inline(
new_meta,
@ -237,12 +237,13 @@ async fn handle_copy_metaonly(
.get(&source_version.uuid, &EmptyKey)
.await?;
let source_version = source_version.ok_or(Error::NoSuchKey)?;
check_version_not_deleted(&source_version)?;
// Write an "uploading" marker in Object table
// This holds a reference to the object in the Version table
// so that it won't be deleted, e.g. by repair_versions.
let tmp_dest_object_version = ObjectVersion {
uuid: dest_uuid,
uuid: new_uuid,
timestamp: new_timestamp,
state: ObjectVersionState::Uploading {
encryption: new_meta.encryption.clone(),
@ -262,7 +263,7 @@ async fn handle_copy_metaonly(
// marked as deleted (they are marked as deleted only if the Version
// doesn't exist or is marked as deleted).
let mut dest_version = Version::new(
dest_uuid,
new_uuid,
VersionBacklink::Object {
bucket_id: dest_bucket_id,
key: dest_key.to_string(),
@ -281,7 +282,7 @@ async fn handle_copy_metaonly(
.iter()
.map(|b| BlockRef {
block: b.1.hash,
version: dest_uuid,
version: new_uuid,
deleted: false.into(),
})
.collect::<Vec<_>>();
@ -297,7 +298,7 @@ async fn handle_copy_metaonly(
// with the stuff before, the block's reference counts could be decremented before
// they are incremented again for the new version, leading to data being deleted.
let dest_object_version = ObjectVersion {
uuid: dest_uuid,
uuid: new_uuid,
timestamp: new_timestamp,
state: ObjectVersionState::Complete(ObjectVersionData::FirstBlock(
new_meta,
@ -319,7 +320,6 @@ async fn handle_copy_metaonly(
async fn handle_copy_reencrypt(
ctx: ReqCtx,
dest_key: &str,
dest_uuid: Uuid,
dest_object_meta: ObjectVersionMetaInner,
dest_encryption: EncryptionParams,
source_version: &ObjectVersion,
@ -339,7 +339,6 @@ async fn handle_copy_reencrypt(
save_stream(
&ctx,
dest_uuid,
dest_object_meta,
dest_encryption,
source_stream.map_err(|e| Error::from(GarageError::from(e))),
@ -363,7 +362,7 @@ pub async fn handle_upload_part_copy(
let dest_upload_id = multipart::decode_upload_id(upload_id)?;
let dest_key = dest_key.to_string();
let (source_object, (dest_object, dest_version, mut dest_mpu)) = futures::try_join!(
let (source_object, (_, dest_version, mut dest_mpu)) = futures::try_join!(
get_copy_source(&ctx, req),
multipart::get_upload(&ctx, &dest_key, &dest_upload_id)
)?;
@ -381,10 +380,7 @@ pub async fn handle_upload_part_copy(
&garage,
req.headers(),
&source_version_meta.encryption,
OekDerivationInfo::for_object(&source_object, source_object_version),
)?;
let dest_oek_params = OekDerivationInfo::for_object(&dest_object, &dest_version);
let (dest_object_encryption, dest_object_checksum_algorithm) = match dest_version.state {
ObjectVersionState::Uploading {
encryption,
@ -393,12 +389,8 @@ pub async fn handle_upload_part_copy(
} => (encryption, checksum_algorithm),
_ => unreachable!(),
};
let (dest_encryption, _) = EncryptionParams::check_decrypt(
&garage,
req.headers(),
&dest_object_encryption,
dest_oek_params,
)?;
let (dest_encryption, _) =
EncryptionParams::check_decrypt(&garage, req.headers(), &dest_object_encryption)?;
let same_encryption = EncryptionParams::is_same(&source_encryption, &dest_encryption);
// Check source range is valid
@ -437,6 +429,7 @@ pub async fn handle_upload_part_copy(
.get(&source_object_version.uuid, &EmptyKey)
.await?
.ok_or(Error::NoSuchKey)?;
check_version_not_deleted(&source_version)?;
// We want to reuse blocks from the source version as much as possible.
// However, we still need to get the data from these blocks
@ -568,6 +561,7 @@ pub async fn handle_upload_part_copy(
let mut current_offset = 0;
let mut next_block = defragmenter.next().await?;
let mut blocks_to_dup = dest_version.clone();
// TODO this could be optimized similarly to read_and_put_blocks
// low priority because uploadpartcopy is rarely used
@ -597,8 +591,7 @@ pub async fn handle_upload_part_copy(
.unwrap()?;
checksummer = checksummer_updated;
dest_version.blocks.clear();
dest_version.blocks.put(
let (version_block_key, version_block) = (
VersionBlockKey {
part_number,
offset: current_offset,
@ -610,37 +603,56 @@ pub async fn handle_upload_part_copy(
);
current_offset += data_len;
let block_ref = BlockRef {
block: final_hash,
version: dest_version_id,
deleted: false.into(),
let next = if let Some(final_data) = data_to_upload {
dest_version.blocks.clear();
dest_version.blocks.put(version_block_key, version_block);
let block_ref = BlockRef {
block: final_hash,
version: dest_version_id,
deleted: false.into(),
};
let (_, _, _, next) = futures::try_join!(
// Thing 1: if the block is not exactly a block that existed before,
// we need to insert that data as a new block.
garage.block_manager.rpc_put_block(
final_hash,
final_data,
dest_encryption.is_encrypted(),
None
),
// Thing 2: we need to insert the block in the version
garage.version_table.insert(&dest_version),
// Thing 3: we need to add a block reference
garage.block_ref_table.insert(&block_ref),
// Thing 4: we need to read the next block
defragmenter.next(),
)?;
next
} else {
blocks_to_dup.blocks.put(version_block_key, version_block);
defragmenter.next().await?
};
let (_, _, _, next) = futures::try_join!(
// Thing 1: if the block is not exactly a block that existed before,
// we need to insert that data as a new block.
async {
if let Some(final_data) = data_to_upload {
garage
.block_manager
.rpc_put_block(final_hash, final_data, dest_encryption.is_encrypted(), None)
.await
} else {
Ok(())
}
},
// Thing 2: we need to insert the block in the version
garage.version_table.insert(&dest_version),
// Thing 3: we need to add a block reference
garage.block_ref_table.insert(&block_ref),
// Thing 4: we need to read the next block
defragmenter.next(),
)?;
next_block = next;
}
assert_eq!(current_offset, source_range.length);
// Put the duplicated blocks into the version & block_refs tables
let block_refs_to_put = blocks_to_dup
.blocks
.items()
.iter()
.map(|b| BlockRef {
block: b.1.hash,
version: dest_version_id,
deleted: false.into(),
})
.collect::<Vec<_>>();
futures::try_join!(
garage.version_table.insert(&blocks_to_dup),
garage.block_ref_table.insert_many(&block_refs_to_put[..]),
)?;
let checksums = checksummer.finalize();
let etag = dest_encryption.etag_from_md5(&checksums.md5);
let checksum = checksums.extract(dest_object_checksum_algorithm);
@ -683,15 +695,16 @@ async fn get_copy_source(ctx: &ReqCtx, req: &Request<ReqBody>) -> Result<Object,
let copy_source = percent_encoding::percent_decode_str(copy_source).decode_utf8()?;
let (source_bucket, source_key) = parse_bucket_key(&copy_source, None)?;
let source_bucket = garage
let source_bucket_id = garage
.bucket_helper()
.resolve_bucket_fast(&source_bucket.to_string(), api_key)
.resolve_bucket(&source_bucket.to_string(), api_key)
.await
.map_err(pass_helper_error)?;
if !api_key.allow_read(&source_bucket.id) {
if !api_key.allow_read(&source_bucket_id) {
return Err(Error::forbidden(format!(
"Reading from bucket {:?} not allowed for this key",
source_bucket.id
"Reading from bucket {} not allowed for this key",
source_bucket
)));
}
@ -699,7 +712,7 @@ async fn get_copy_source(ctx: &ReqCtx, req: &Request<ReqBody>) -> Result<Object,
let source_object = garage
.object_table
.get(&source_bucket.id, &source_key.to_string())
.get(&source_bucket_id, &source_key.to_string())
.await?
.ok_or(Error::NoSuchKey)?;

View file

@ -88,7 +88,9 @@ pub async fn handle_put_cors(
pub struct CorsConfiguration {
#[serde(serialize_with = "xmlns_tag", skip_deserializing)]
pub xmlns: (),
#[serde(rename = "CORSRule")]
// "default" is required to be able to parse an empty list of rules,
// cf https://docs.rs/quick-xml/latest/quick_xml/de/#sequences-xsall-and-xssequence-xml-schema-types
#[serde(rename = "CORSRule", default)]
pub cors_rules: Vec<CorsRule>,
}
@ -270,4 +272,26 @@ mod tests {
Ok(())
}
#[test]
fn test_deserialize_norules() -> Result<(), Error> {
let message = r#"<?xml version="1.0" encoding="UTF-8"?>
<CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/" />"#;
let conf: CorsConfiguration = from_str(message).unwrap();
let ref_value = CorsConfiguration {
xmlns: (),
cors_rules: vec![],
};
assert_eq! {
ref_value,
conf
};
let message2 = to_xml_with_header(&ref_value)?;
let cleanup = |c: &str| c.replace(char::is_whitespace, "");
assert_eq!(cleanup(message), cleanup(&message2));
Ok(())
}
}

View file

@ -11,7 +11,6 @@ use aes_gcm::{
};
use base64::prelude::*;
use bytes::Bytes;
use sha2::Sha256;
use futures::stream::Stream;
use futures::task;
@ -22,12 +21,12 @@ use http::header::{HeaderMap, HeaderName, HeaderValue};
use garage_net::bytes_buf::BytesBuf;
use garage_net::stream::{stream_asyncread, ByteStream};
use garage_rpc::rpc_helper::OrderTag;
use garage_util::data::{Hash, Uuid};
use garage_util::data::Hash;
use garage_util::error::Error as GarageError;
use garage_util::migrate::Migrate;
use garage_model::garage::Garage;
use garage_model::s3::object_table::*;
use garage_model::s3::object_table::{ObjectVersionEncryption, ObjectVersionMetaInner};
use garage_api_common::common_error::*;
use garage_api_common::signature::checksum::Md5Checksum;
@ -65,45 +64,32 @@ const STREAM_ENC_CYPER_CHUNK_SIZE: usize = STREAM_ENC_PLAIN_CHUNK_SIZE + 16;
pub enum EncryptionParams {
Plaintext,
SseC {
/// the value of x-amz-server-side-encryption-customer-key
client_key: Key<Aes256Gcm>,
/// the value of x-amz-server-side-encryption-customer-key-md5
client_key_md5: Md5Output,
/// the object encryption key, for uploads created in garage v2+
object_key: Option<Key<Aes256Gcm>>,
/// the compression level used for compressing data blocks
compression_level: Option<i32>,
},
}
#[derive(Clone, Copy)]
pub struct OekDerivationInfo<'a> {
pub bucket_id: Uuid,
pub version_id: Uuid,
pub object_key: &'a str,
}
impl EncryptionParams {
pub fn is_encrypted(&self) -> bool {
!matches!(self, Self::Plaintext)
}
pub fn is_same(a: &Self, b: &Self) -> bool {
// This function is used in CopyObject and UploadPartCopy to determine
// whether the object must be re-encrypted. If this returns true,
// data blocks are reused as-is. Since Garage v2, we are using
// object-specific encryption keys, so we know that if both source
// and destination are encrypted, it can't be with the same key.
match (a, b) {
(Self::Plaintext, Self::Plaintext) => true,
_ => false,
}
let relevant_info = |x: &Self| match x {
Self::Plaintext => None,
Self::SseC {
client_key,
compression_level,
..
} => Some((*client_key, compression_level.is_some())),
};
relevant_info(a) == relevant_info(b)
}
pub fn new_from_headers(
garage: &Garage,
headers: &HeaderMap,
oek_info: OekDerivationInfo<'_>,
) -> Result<EncryptionParams, Error> {
let key = parse_request_headers(
headers,
@ -115,7 +101,6 @@ impl EncryptionParams {
Some((client_key, client_key_md5)) => Ok(EncryptionParams::SseC {
client_key,
client_key_md5,
object_key: Some(oek_info.derive_oek(&client_key)),
compression_level: garage.config.compression_level,
}),
None => Ok(EncryptionParams::Plaintext),
@ -141,7 +126,6 @@ impl EncryptionParams {
garage: &Garage,
headers: &HeaderMap,
obj_enc: &'a ObjectVersionEncryption,
oek_info: OekDerivationInfo<'_>,
) -> Result<(Self, Cow<'a, ObjectVersionMetaInner>), Error> {
let key = parse_request_headers(
headers,
@ -149,14 +133,13 @@ impl EncryptionParams {
&X_AMZ_SERVER_SIDE_ENCRYPTION_CUSTOMER_KEY,
&X_AMZ_SERVER_SIDE_ENCRYPTION_CUSTOMER_KEY_MD5,
)?;
Self::check_decrypt_common(garage, key, obj_enc, oek_info)
Self::check_decrypt_common(garage, key, obj_enc)
}
pub fn check_decrypt_for_copy_source<'a>(
garage: &Garage,
headers: &HeaderMap,
obj_enc: &'a ObjectVersionEncryption,
oek_info: OekDerivationInfo<'_>,
) -> Result<(Self, Cow<'a, ObjectVersionMetaInner>), Error> {
let key = parse_request_headers(
headers,
@ -164,32 +147,22 @@ impl EncryptionParams {
&X_AMZ_COPY_SOURCE_SERVER_SIDE_ENCRYPTION_CUSTOMER_KEY,
&X_AMZ_COPY_SOURCE_SERVER_SIDE_ENCRYPTION_CUSTOMER_KEY_MD5,
)?;
Self::check_decrypt_common(garage, key, obj_enc, oek_info)
Self::check_decrypt_common(garage, key, obj_enc)
}
fn check_decrypt_common<'a>(
garage: &Garage,
key: Option<(Key<Aes256Gcm>, Md5Output)>,
obj_enc: &'a ObjectVersionEncryption,
oek_info: OekDerivationInfo<'_>,
) -> Result<(Self, Cow<'a, ObjectVersionMetaInner>), Error> {
match (key, &obj_enc) {
(
Some((client_key, client_key_md5)),
ObjectVersionEncryption::SseC {
inner,
compressed,
use_oek,
},
ObjectVersionEncryption::SseC { inner, compressed },
) => {
let enc = Self::SseC {
client_key,
client_key_md5,
object_key: if *use_oek {
Some(oek_info.derive_oek(&client_key))
} else {
None
},
compression_level: if *compressed {
Some(garage.config.compression_level.unwrap_or(1))
} else {
@ -220,16 +193,13 @@ impl EncryptionParams {
) -> Result<ObjectVersionEncryption, Error> {
match self {
Self::SseC {
compression_level,
object_key,
..
compression_level, ..
} => {
let plaintext = meta.encode().map_err(GarageError::from)?;
let ciphertext = self.encrypt_blob(&plaintext)?;
Ok(ObjectVersionEncryption::SseC {
inner: ciphertext.into_owned(),
compressed: compression_level.is_some(),
use_oek: object_key.is_some(),
})
}
Self::Plaintext => Ok(ObjectVersionEncryption::Plaintext { inner: meta }),
@ -258,37 +228,24 @@ impl EncryptionParams {
// This is used for encrypting object metadata and inlined data for small objects.
// This does not compress anything.
fn cipher(&self) -> Option<Aes256Gcm> {
match self {
Self::SseC {
object_key: Some(oek),
..
} => Some(Aes256Gcm::new(&oek)),
Self::SseC {
client_key,
object_key: None,
..
} => Some(Aes256Gcm::new(&client_key)),
Self::Plaintext => None,
}
}
pub fn encrypt_blob<'a>(&self, blob: &'a [u8]) -> Result<Cow<'a, [u8]>, Error> {
match self.cipher() {
Some(cipher) => {
match self {
Self::SseC { client_key, .. } => {
let cipher = Aes256Gcm::new(&client_key);
let nonce = Aes256Gcm::generate_nonce(&mut OsRng);
let ciphertext = cipher
.encrypt(&nonce, blob)
.ok_or_internal_error("Encryption failed")?;
Ok(Cow::Owned([nonce.to_vec(), ciphertext].concat()))
}
None => Ok(Cow::Borrowed(blob)),
Self::Plaintext => Ok(Cow::Borrowed(blob)),
}
}
pub fn decrypt_blob<'a>(&self, blob: &'a [u8]) -> Result<Cow<'a, [u8]>, Error> {
match self.cipher() {
Some(cipher) => {
match self {
Self::SseC { client_key, .. } => {
let cipher = Aes256Gcm::new(&client_key);
let nonce_size = <Aes256Gcm as AeadCore>::NonceSize::to_usize();
let nonce = Nonce::from_slice(
blob.get(..nonce_size)
@ -301,7 +258,7 @@ impl EncryptionParams {
)?;
Ok(Cow::Owned(plaintext))
}
None => Ok(Cow::Borrowed(blob)),
Self::Plaintext => Ok(Cow::Borrowed(blob)),
}
}
@ -327,12 +284,10 @@ impl EncryptionParams {
Self::Plaintext => stream,
Self::SseC {
client_key,
object_key,
compression_level,
..
} => {
let key = object_key.as_ref().unwrap_or(client_key);
let plaintext = DecryptStream::new(stream, *key);
let plaintext = DecryptStream::new(stream, *client_key);
if compression_level.is_some() {
let reader = stream_asyncread(Box::pin(plaintext));
let reader = BufReader::new(reader);
@ -352,12 +307,9 @@ impl EncryptionParams {
Self::Plaintext => Ok(block),
Self::SseC {
client_key,
object_key,
compression_level,
..
} => {
let key = object_key.as_ref().unwrap_or(client_key);
let block = if let Some(level) = compression_level {
Cow::Owned(
garage_block::zstd_encode(block.as_ref(), *level)
@ -373,7 +325,7 @@ impl EncryptionParams {
OsRng.fill_bytes(&mut nonce);
ret.extend_from_slice(nonce.as_slice());
let mut cipher = EncryptorLE31::<Aes256Gcm>::new(key, &nonce);
let mut cipher = EncryptorLE31::<Aes256Gcm>::new(&client_key, &nonce);
let mut iter = block.chunks(STREAM_ENC_PLAIN_CHUNK_SIZE).peekable();
if iter.peek().is_none() {
@ -409,13 +361,6 @@ impl EncryptionParams {
}
}
pub fn has_encryption_header(headers: &HeaderMap) -> bool {
match headers.get(X_AMZ_SERVER_SIDE_ENCRYPTION_CUSTOMER_ALGORITHM) {
Some(h) => h.as_bytes() == CUSTOMER_ALGORITHM_AES256,
None => false,
}
}
fn parse_request_headers(
headers: &HeaderMap,
alg_header: &HeaderName,
@ -475,30 +420,6 @@ fn parse_request_headers(
}
}
impl<'a> OekDerivationInfo<'a> {
pub fn for_object<'b>(object: &'a Object, version: &'b ObjectVersion) -> Self {
Self {
bucket_id: object.bucket_id,
version_id: version.uuid,
object_key: &object.key,
}
}
fn derive_oek(&self, client_key: &Key<Aes256Gcm>) -> Key<Aes256Gcm> {
use hmac::{Hmac, Mac};
// info = bucket_id + object_name + version_uuid + "garage-object-encryption-key"
// oek = hmac_sha256(ssec_key, info)
let mut hmac = <Hmac<Sha256> as Mac>::new_from_slice(client_key.as_slice())
.expect("create hmac-sha256");
hmac.update(b"garage-object-encryption-key");
hmac.update(self.bucket_id.as_slice());
hmac.update(self.version_id.as_slice());
hmac.update(self.object_key.as_bytes());
hmac.finalize().into_bytes()
}
}
// ---- encrypt & decrypt streams ----
#[pin_project::pin_project]
@ -648,7 +569,6 @@ mod tests {
let enc = EncryptionParams::SseC {
client_key: Aes256Gcm::generate_key(&mut OsRng),
client_key_md5: Default::default(), // not needed
object_key: Some(Aes256Gcm::generate_key(&mut OsRng)),
compression_level,
};

View file

@ -1,8 +1,8 @@
use std::convert::TryInto;
use err_derive::Error;
use hyper::header::HeaderValue;
use hyper::{HeaderMap, StatusCode};
use thiserror::Error;
use garage_model::helper::error::Error as HelperError;
@ -25,67 +25,67 @@ use crate::xml as s3_xml;
/// Errors of this crate
#[derive(Debug, Error)]
pub enum Error {
#[error(display = "{}", _0)]
#[error("{0}")]
/// Error from common error
Common(#[error(source)] CommonError),
Common(#[from] CommonError),
// Category: cannot process
/// Authorization Header Malformed
#[error(display = "Authorization header malformed, unexpected scope: {}", _0)]
#[error("Authorization header malformed, unexpected scope: {0}")]
AuthorizationHeaderMalformed(String),
/// The object requested don't exists
#[error(display = "Key not found")]
#[error("Key not found")]
NoSuchKey,
/// The multipart upload requested don't exists
#[error(display = "Upload not found")]
#[error("Upload not found")]
NoSuchUpload,
/// Precondition failed (e.g. x-amz-copy-source-if-match)
#[error(display = "At least one of the preconditions you specified did not hold")]
#[error("At least one of the preconditions you specified did not hold")]
PreconditionFailed,
/// Parts specified in CMU request do not match parts actually uploaded
#[error(display = "Parts given to CompleteMultipartUpload do not match uploaded parts")]
#[error("Parts given to CompleteMultipartUpload do not match uploaded parts")]
InvalidPart,
/// Parts given to CompleteMultipartUpload were not in ascending order
#[error(display = "Parts given to CompleteMultipartUpload were not in ascending order")]
#[error("Parts given to CompleteMultipartUpload were not in ascending order")]
InvalidPartOrder,
/// In CompleteMultipartUpload: not enough data
/// (here we are more lenient than AWS S3)
#[error(display = "Proposed upload is smaller than the minimum allowed object size")]
#[error("Proposed upload is smaller than the minimum allowed object size")]
EntityTooSmall,
// Category: bad request
/// The request contained an invalid UTF-8 sequence in its path or in other parameters
#[error(display = "Invalid UTF-8: {}", _0)]
InvalidUtf8Str(#[error(source)] std::str::Utf8Error),
#[error("Invalid UTF-8: {0}")]
InvalidUtf8Str(#[from] std::str::Utf8Error),
/// The request used an invalid path
#[error(display = "Invalid UTF-8: {}", _0)]
InvalidUtf8String(#[error(source)] std::string::FromUtf8Error),
#[error("Invalid UTF-8: {0}")]
InvalidUtf8String(#[from] std::string::FromUtf8Error),
/// The client sent invalid XML data
#[error(display = "Invalid XML: {}", _0)]
#[error("Invalid XML: {0}")]
InvalidXml(String),
/// The client sent a range header with invalid value
#[error(display = "Invalid HTTP range: {:?}", _0)]
InvalidRange(#[error(from)] (http_range::HttpRangeParseError, u64)),
#[error("Invalid HTTP range: {0:?}")]
InvalidRange((http_range::HttpRangeParseError, u64)),
/// The client sent a range header with invalid value
#[error(display = "Invalid encryption algorithm: {:?}, should be AES256", _0)]
#[error("Invalid encryption algorithm: {0:?}, should be AES256")]
InvalidEncryptionAlgorithm(String),
/// The provided digest (checksum) value was invalid
#[error(display = "Invalid digest: {}", _0)]
#[error("Invalid digest: {0}")]
InvalidDigest(String),
/// The client sent a request for an action not supported by garage
#[error(display = "Unimplemented action: {}", _0)]
#[error("Unimplemented action: {0}")]
NotImplemented(String),
}
@ -99,6 +99,12 @@ impl From<HelperError> for Error {
}
}
impl From<(http_range::HttpRangeParseError, u64)> for Error {
fn from(err: (http_range::HttpRangeParseError, u64)) -> Error {
Error::InvalidRange(err)
}
}
impl From<roxmltree::Error> for Error {
fn from(err: roxmltree::Error) -> Self {
Self::InvalidXml(format!("{}", err))

View file

@ -19,18 +19,19 @@ use garage_net::stream::ByteStream;
use garage_rpc::rpc_helper::OrderTag;
use garage_table::EmptyKey;
use garage_util::data::*;
use garage_util::error::OkOrMessage;
use garage_util::error::{Error as UtilError, OkOrMessage};
use garage_model::garage::Garage;
use garage_model::s3::object_table::*;
use garage_model::s3::version_table::*;
use garage_api_common::common_error::CommonError;
use garage_api_common::helpers::*;
use garage_api_common::signature::checksum::{add_checksum_response_headers, X_AMZ_CHECKSUM_MODE};
use crate::api_server::ResBody;
use crate::copy::*;
use crate::encryption::{EncryptionParams, OekDerivationInfo};
use crate::encryption::EncryptionParams;
use crate::error::*;
const X_AMZ_MP_PARTS_COUNT: HeaderName = HeaderName::from_static("x-amz-mp-parts-count");
@ -181,12 +182,8 @@ pub async fn handle_head_without_ctx(
return Ok(res);
}
let (encryption, headers) = EncryptionParams::check_decrypt(
&garage,
req.headers(),
&version_meta.encryption,
OekDerivationInfo::for_object(&object, object_version),
)?;
let (encryption, headers) =
EncryptionParams::check_decrypt(&garage, req.headers(), &version_meta.encryption)?;
let checksum_mode = checksum_mode(&req);
@ -219,6 +216,7 @@ pub async fn handle_head_without_ctx(
.get(&object_version.uuid, &EmptyKey)
.await?
.ok_or(Error::NoSuchKey)?;
check_version_not_deleted(&version)?;
let (part_offset, part_end) =
calculate_part_bounds(&version, pn).ok_or(Error::InvalidPart)?;
@ -307,12 +305,8 @@ pub async fn handle_get_without_ctx(
return Ok(res);
}
let (enc, headers) = EncryptionParams::check_decrypt(
&garage,
req.headers(),
&last_v_meta.encryption,
OekDerivationInfo::for_object(&object, last_v),
)?;
let (enc, headers) =
EncryptionParams::check_decrypt(&garage, req.headers(), &last_v_meta.encryption)?;
let checksum_mode = checksum_mode(&req);
@ -373,6 +367,21 @@ pub async fn handle_get_without_ctx(
}
}
pub(crate) fn check_version_not_deleted(version: &Version) -> Result<(), Error> {
if version.deleted.get() {
// the version was deleted between when the object_table was consulted
// and now, this could mean the object was deleted, or overriden.
// Rather than say the key doesn't exist, return a transient error
// to signal the client to try again.
return Err(CommonError::InternalError(UtilError::Message(
"conflict/inconsistency between object and version state, version is deleted"
.to_string(),
))
.into());
}
Ok(())
}
async fn handle_get_full(
garage: Arc<Garage>,
version: &ObjectVersion,
@ -439,6 +448,7 @@ pub fn full_object_byte_stream(
.ok_or_message("channel closed")?;
let version = version_fut.await.unwrap()?.ok_or(Error::NoSuchKey)?;
check_version_not_deleted(&version)?;
for (i, (_, vb)) in version.blocks.items().iter().enumerate().skip(1) {
let stream_block_i = encryption
.get_block(&garage, &vb.hash, Some(order_stream.order(i as u64)))
@ -454,6 +464,14 @@ pub fn full_object_byte_stream(
{
Ok(()) => (),
Err(e) => {
// TODO i think this is a bad idea, we should log
// an error and stop there. If the error happens to
// be exactly the size of what hasn't been streamed
// yet, the client will see the request as a
// success
// instead truncating the output notify the client
// something happened with their download, so that
// they can retry it
let _ = tx.send(error_stream_item(e)).await;
}
}
@ -505,7 +523,7 @@ async fn handle_get_range(
.get(&version.uuid, &EmptyKey)
.await?
.ok_or(Error::NoSuchKey)?;
check_version_not_deleted(&version)?;
let body =
body_from_blocks_range(garage, encryption, version.blocks.items(), begin, end);
Ok(resp_builder.body(body)?)
@ -556,6 +574,8 @@ async fn handle_get_part(
.await?
.ok_or(Error::NoSuchKey)?;
check_version_not_deleted(&version)?;
let (begin, end) =
calculate_part_bounds(&version, part_number).ok_or(Error::InvalidPart)?;
@ -825,7 +845,9 @@ impl PreconditionHeaders {
}
fn check(&self, v: &ObjectVersion, etag: &str) -> Result<Option<StatusCode>, Error> {
let v_date = UNIX_EPOCH + Duration::from_millis(v.timestamp);
// we store date with ms precision, but headers are precise to the second: truncate
// the timestamp to handle the same-second edge case
let v_date = UNIX_EPOCH + Duration::from_secs(v.timestamp / 1000);
// Implemented from https://datatracker.ietf.org/doc/html/rfc7232#section-6

View file

@ -27,7 +27,7 @@ pub async fn handle_get_lifecycle(ctx: ReqCtx) -> Result<Response<ResBody>, Erro
.body(string_body(xml))?)
} else {
Ok(Response::builder()
.status(StatusCode::NO_CONTENT)
.status(StatusCode::NOT_FOUND)
.body(empty_body())?)
}
}

View file

@ -17,7 +17,7 @@ use garage_api_common::encoding::*;
use garage_api_common::helpers::*;
use crate::api_server::{ReqBody, ResBody};
use crate::encryption::{EncryptionParams, OekDerivationInfo};
use crate::encryption::EncryptionParams;
use crate::error::*;
use crate::multipart as s3_multipart;
use crate::xml as s3_xml;
@ -285,16 +285,8 @@ pub async fn handle_list_parts(
ObjectVersionState::Uploading { encryption, .. } => encryption,
_ => unreachable!(),
};
let encryption_res = EncryptionParams::check_decrypt(
&ctx.garage,
req.headers(),
&object_encryption,
OekDerivationInfo {
bucket_id: ctx.bucket_id,
version_id: upload_id,
object_key: &query.key,
},
);
let encryption_res =
EncryptionParams::check_decrypt(&ctx.garage, req.headers(), &object_encryption);
let (info, next) = fetch_part_info(query, &mpu)?;
@ -334,12 +326,6 @@ pub async fn handle_list_parts(
}
_ => None,
},
checksum_crc64nvme: match &checksum {
Some(ChecksumValue::Crc64Nvme(x)) => {
Some(s3_xml::Value(BASE64_STANDARD.encode(&x)))
}
_ => None,
},
checksum_sha1: match &checksum {
Some(ChecksumValue::Sha1(x)) => {
Some(s3_xml::Value(BASE64_STANDARD.encode(&x)))

View file

@ -6,7 +6,6 @@ use std::sync::Arc;
use base64::prelude::*;
use crc32c::Crc32cHasher as Crc32c;
use crc32fast::Hasher as Crc32;
use crc64fast_nvme::Digest as Crc64Nvme;
use futures::prelude::*;
use hyper::{Request, Response};
use md5::{Digest, Md5};
@ -27,7 +26,7 @@ use garage_api_common::helpers::*;
use garage_api_common::signature::checksum::*;
use crate::api_server::{ReqBody, ResBody};
use crate::encryption::{has_encryption_header, EncryptionParams, OekDerivationInfo};
use crate::encryption::EncryptionParams;
use crate::error::*;
use crate::put::*;
use crate::xml as s3_xml;
@ -57,15 +56,7 @@ pub async fn handle_create_multipart_upload(
};
// Determine whether object should be encrypted, and if so the key
let encryption = EncryptionParams::new_from_headers(
&garage,
req.headers(),
OekDerivationInfo {
bucket_id: *bucket_id,
version_id: upload_id,
object_key: &key,
},
)?;
let encryption = EncryptionParams::new_from_headers(&garage, req.headers())?;
let object_encryption = encryption.encrypt_meta(meta)?;
let checksum_algorithm = request_checksum_algorithm(req.headers())?;
@ -129,7 +120,8 @@ pub async fn handle_put_part(
// Before we stream the body, configure the needed checksums.
req_body.add_expected_checksums(expected_checksums.clone());
if !has_encryption_header(&req_head.headers) {
// TODO: avoid parsing encryption headers twice...
if !EncryptionParams::new_from_headers(&garage, &req_head.headers)?.is_encrypted() {
// For non-encrypted objects, we need to compute the md5sum in all cases
// (even if content-md5 is not set), because it is used as an etag of the
// part, which is in turn used in the etag computation of the whole object
@ -142,11 +134,10 @@ pub async fn handle_put_part(
let mut chunker = StreamChunker::new(stream, garage.config.block_size);
// Read first chuck, and at the same time try to get object to see if it exists
let ((object, object_version, mut mpu), first_block) =
let ((_, object_version, mut mpu), first_block) =
futures::try_join!(get_upload(&ctx, &key, &upload_id), chunker.next(),)?;
// Check encryption params
let oek_params = OekDerivationInfo::for_object(&object, &object_version);
let (object_encryption, checksum_algorithm) = match object_version.state {
ObjectVersionState::Uploading {
encryption,
@ -155,12 +146,8 @@ pub async fn handle_put_part(
} => (encryption, checksum_algorithm),
_ => unreachable!(),
};
let (encryption, _) = EncryptionParams::check_decrypt(
&garage,
&req_head.headers,
&object_encryption,
oek_params,
)?;
let (encryption, _) =
EncryptionParams::check_decrypt(&garage, &req_head.headers, &object_encryption)?;
// Check object is valid and part can be accepted
let first_block = first_block.ok_or_bad_request("Empty body")?;
@ -310,7 +297,6 @@ pub async fn handle_complete_multipart_upload(
return Err(Error::bad_request("No data was uploaded"));
}
let oek_params = OekDerivationInfo::for_object(&object, &object_version);
let (object_encryption, checksum_algorithm) = match object_version.state {
ObjectVersionState::Uploading {
encryption,
@ -431,12 +417,8 @@ pub async fn handle_complete_multipart_upload(
let object_encryption = match checksum_algorithm {
None => object_encryption,
Some(_) => {
let (encryption, meta) = EncryptionParams::check_decrypt(
&garage,
&req_head.headers,
&object_encryption,
oek_params,
)?;
let (encryption, meta) =
EncryptionParams::check_decrypt(&garage, &req_head.headers, &object_encryption)?;
let new_meta = ObjectVersionMetaInner {
headers: meta.into_owned().headers,
checksum: checksum_extra,
@ -482,10 +464,6 @@ pub async fn handle_complete_multipart_upload(
Some(ChecksumValue::Crc32c(x)) => Some(s3_xml::Value(BASE64_STANDARD.encode(&x))),
_ => None,
},
checksum_crc64nvme: match &checksum_extra {
Some(ChecksumValue::Crc64Nvme(x)) => Some(s3_xml::Value(BASE64_STANDARD.encode(&x))),
_ => None,
},
checksum_sha1: match &checksum_extra {
Some(ChecksumValue::Sha1(x)) => Some(s3_xml::Value(BASE64_STANDARD.encode(&x))),
_ => None,
@ -609,15 +587,6 @@ fn parse_complete_multipart_upload_body(
.try_into()
.ok()?,
))
} else if let Some(crc64nvme) = item
.children()
.find(|e| e.has_tag_name("ChecksumCRC64NVME"))
{
Some(ChecksumValue::Crc64Nvme(
BASE64_STANDARD.decode(crc64nvme.text()?).ok()?[..]
.try_into()
.ok()?,
))
} else if let Some(sha1) = item.children().find(|e| e.has_tag_name("ChecksumSHA1")) {
Some(ChecksumValue::Sha1(
BASE64_STANDARD.decode(sha1.text()?).ok()?[..]
@ -658,7 +627,6 @@ pub(crate) struct MultipartChecksummer {
pub(crate) enum MultipartExtraChecksummer {
Crc32(Crc32),
Crc32c(Crc32c),
Crc64Nvme(Crc64Nvme),
Sha1(Sha1),
Sha256(Sha256),
}
@ -675,9 +643,6 @@ impl MultipartChecksummer {
Some(ChecksumAlgorithm::Crc32c) => {
Some(MultipartExtraChecksummer::Crc32c(Crc32c::default()))
}
Some(ChecksumAlgorithm::Crc64Nvme) => {
Some(MultipartExtraChecksummer::Crc64Nvme(Crc64Nvme::default()))
}
Some(ChecksumAlgorithm::Sha1) => Some(MultipartExtraChecksummer::Sha1(Sha1::new())),
Some(ChecksumAlgorithm::Sha256) => {
Some(MultipartExtraChecksummer::Sha256(Sha256::new()))
@ -707,12 +672,6 @@ impl MultipartChecksummer {
) => {
crc32c.write(&x);
}
(
Some(MultipartExtraChecksummer::Crc64Nvme(ref mut crc64nvme)),
Some(ChecksumValue::Crc64Nvme(x)),
) => {
crc64nvme.write(&x);
}
(Some(MultipartExtraChecksummer::Sha1(ref mut sha1)), Some(ChecksumValue::Sha1(x))) => {
sha1.update(&x);
}
@ -742,9 +701,6 @@ impl MultipartChecksummer {
Some(MultipartExtraChecksummer::Crc32c(crc32c)) => Some(ChecksumValue::Crc32c(
u32::to_be_bytes(u32::try_from(crc32c.finish()).unwrap()),
)),
Some(MultipartExtraChecksummer::Crc64Nvme(crc64nvme)) => Some(
ChecksumValue::Crc64Nvme(u64::to_be_bytes(crc64nvme.sum64())),
),
Some(MultipartExtraChecksummer::Sha1(sha1)) => {
Some(ChecksumValue::Sha1(sha1.finalize()[..].try_into().unwrap()))
}

View file

@ -15,7 +15,6 @@ use serde::Deserialize;
use garage_model::garage::Garage;
use garage_model::s3::object_table::*;
use garage_util::data::gen_uuid;
use garage_api_common::cors::*;
use garage_api_common::helpers::*;
@ -23,7 +22,7 @@ use garage_api_common::signature::checksum::*;
use garage_api_common::signature::payload::{verify_v4, Authorization};
use crate::api_server::ResBody;
use crate::encryption::{EncryptionParams, OekDerivationInfo};
use crate::encryption::EncryptionParams;
use crate::error::*;
use crate::put::{extract_metadata_headers, save_stream, ChecksumMode};
use crate::xml as s3_xml;
@ -104,18 +103,22 @@ pub async fn handle_post_object(
key.to_owned()
};
let api_key = verify_v4(&garage, "s3", &authorization, policy.as_bytes())?;
let api_key = verify_v4(&garage, "s3", &authorization, policy.as_bytes()).await?;
let bucket = garage
let bucket_id = garage
.bucket_helper()
.resolve_bucket_fast(&bucket_name, &api_key)
.resolve_bucket(&bucket_name, &api_key)
.await
.map_err(pass_helper_error)?;
let bucket_id = bucket.id;
if !api_key.allow_write(&bucket_id) {
return Err(Error::forbidden("Operation is not allowed for this key."));
}
let bucket = garage
.bucket_helper()
.get_existing_bucket(bucket_id)
.await?;
let bucket_params = bucket.state.into_option().unwrap();
let matching_cors_rule = find_matching_cors_rule(
&bucket_params,
@ -138,10 +141,26 @@ pub async fn handle_post_object(
let mut conditions = decoded_policy.into_conditions()?;
// If there are conditions on the bucket name, check these against the actual bucket_name rather
// than the one in params, which is allowed to be absent.
if let Some(conds) = conditions.params.remove("bucket") {
for cond in conds {
let ok = match cond {
Operation::Equal(s) => s.as_str() == bucket_name,
Operation::StartsWith(s) => bucket_name.starts_with(&s),
};
if !ok {
return Err(Error::bad_request(
"Key 'bucket' has value not allowed in policy",
));
}
}
}
for (param_key, value) in params.iter() {
let param_key = param_key.as_str();
match param_key {
"policy" | "x-amz-signature" => (), // this is always accepted, as it's required to validate other fields
"policy" | "x-amz-signature" | "bucket" => (), // this is always accepted, as it's required to validate other fields
"content-type" => {
let conds = conditions.params.remove("content-type").ok_or_else(|| {
Error::bad_request(format!("Key '{}' is not allowed in policy", param_key))
@ -228,22 +247,12 @@ pub async fn handle_post_object(
.transpose()?,
};
let version_uuid = gen_uuid();
let meta = ObjectVersionMetaInner {
headers,
checksum: expected_checksums.extra,
};
let encryption = EncryptionParams::new_from_headers(
&garage,
&params,
OekDerivationInfo {
bucket_id,
version_id: version_uuid,
object_key: &key,
},
)?;
let encryption = EncryptionParams::new_from_headers(&garage, &params)?;
let stream = file_field.map(|r| r.map_err(Into::into));
let ctx = ReqCtx {
@ -256,7 +265,6 @@ pub async fn handle_post_object(
let res = save_stream(
&ctx,
version_uuid,
meta,
encryption,
StreamLimiter::new(stream, conditions.content_length),

View file

@ -35,12 +35,10 @@ use garage_api_common::signature::body::StreamingChecksumReceiver;
use garage_api_common::signature::checksum::*;
use crate::api_server::{ReqBody, ResBody};
use crate::encryption::{EncryptionParams, OekDerivationInfo};
use crate::encryption::EncryptionParams;
use crate::error::*;
use crate::website::X_AMZ_WEBSITE_REDIRECT_LOCATION;
const PUT_BLOCKS_MAX_PARALLEL: usize = 3;
pub(crate) struct SaveStreamResult {
pub(crate) version_uuid: Uuid,
pub(crate) version_timestamp: u64,
@ -62,10 +60,6 @@ pub async fn handle_put(
req: Request<ReqBody>,
key: &String,
) -> Result<Response<ResBody>, Error> {
// Generate version uuid now, because it is necessary to compute SSE-C
// encryption parameters
let version_uuid = gen_uuid();
// Retrieve interesting headers from request
let headers = extract_metadata_headers(req.headers())?;
debug!("Object headers: {:?}", headers);
@ -86,15 +80,7 @@ pub async fn handle_put(
};
// Determine whether object should be encrypted, and if so the key
let encryption = EncryptionParams::new_from_headers(
&ctx.garage,
req.headers(),
OekDerivationInfo {
bucket_id: ctx.bucket_id,
version_id: version_uuid,
object_key: &key,
},
)?;
let encryption = EncryptionParams::new_from_headers(&ctx.garage, req.headers())?;
// The request body is a special ReqBody object (see garage_api_common::signature::body)
// which supports calculating checksums while streaming the data.
@ -112,7 +98,6 @@ pub async fn handle_put(
let res = save_stream(
&ctx,
version_uuid,
meta,
encryption,
stream,
@ -134,7 +119,6 @@ pub async fn handle_put(
pub(crate) async fn save_stream<S: Stream<Item = Result<Bytes, Error>> + Unpin>(
ctx: &ReqCtx,
version_uuid: Uuid,
mut meta: ObjectVersionMetaInner,
encryption: EncryptionParams,
body: S,
@ -154,6 +138,7 @@ pub(crate) async fn save_stream<S: Stream<Item = Result<Bytes, Error>> + Unpin>(
let first_block = first_block_opt.unwrap_or_default();
// Generate identity of new version
let version_uuid = gen_uuid();
let version_timestamp = next_timestamp(existing_object.as_ref());
let mut checksummer = match &checksum_mode {
@ -506,7 +491,7 @@ pub(crate) async fn read_and_put_blocks<S: Stream<Item = Result<Bytes, Error>> +
};
let recv_next = async {
// If more than a maximum number of writes are in progress, don't add more for now
if currently_running >= PUT_BLOCKS_MAX_PARALLEL {
if currently_running >= ctx.garage.config.block_max_concurrent_writes_per_request {
futures::future::pending().await
} else {
block_rx3.recv().await

View file

@ -3,7 +3,7 @@ use quick_xml::de::from_reader;
use hyper::{header::HeaderName, Request, Response, StatusCode};
use serde::{Deserialize, Serialize};
use garage_model::bucket_table::{self, *};
use garage_model::bucket_table::*;
use garage_api_common::helpers::*;
@ -26,28 +26,7 @@ pub async fn handle_get_website(ctx: ReqCtx) -> Result<Response<ResBody>, Error>
suffix: Value(website.index_document.to_string()),
}),
redirect_all_requests_to: None,
routing_rules: RoutingRules {
rules: website
.routing_rules
.clone()
.into_iter()
.map(|rule| RoutingRule {
condition: rule.condition.map(|cond| Condition {
http_error_code: cond.http_error_code.map(|c| IntValue(c as i64)),
prefix: cond.prefix.map(Value),
}),
redirect: Redirect {
hostname: rule.redirect.hostname.map(Value),
http_redirect_code: Some(IntValue(
rule.redirect.http_redirect_code as i64,
)),
protocol: rule.redirect.protocol.map(Value),
replace_full: rule.redirect.replace_key.map(Value),
replace_prefix: rule.redirect.replace_key_prefix.map(Value),
},
})
.collect(),
},
routing_rules: None,
};
let xml = to_xml_with_header(&wc)?;
Ok(Response::builder()
@ -118,28 +97,18 @@ pub struct WebsiteConfiguration {
pub index_document: Option<Suffix>,
#[serde(rename = "RedirectAllRequestsTo")]
pub redirect_all_requests_to: Option<Target>,
#[serde(
rename = "RoutingRules",
default,
skip_serializing_if = "RoutingRules::is_empty"
)]
pub routing_rules: RoutingRules,
}
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, PartialOrd, Ord, Default)]
pub struct RoutingRules {
#[serde(rename = "RoutingRule")]
pub rules: Vec<RoutingRule>,
}
impl RoutingRules {
fn is_empty(&self) -> bool {
self.rules.is_empty()
}
#[serde(rename = "RoutingRules")]
pub routing_rules: Option<Vec<RoutingRule>>,
}
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, PartialOrd, Ord)]
pub struct RoutingRule {
#[serde(rename = "RoutingRule")]
pub inner: RoutingRuleInner,
}
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, PartialOrd, Ord)]
pub struct RoutingRuleInner {
#[serde(rename = "Condition")]
pub condition: Option<Condition>,
#[serde(rename = "Redirect")]
@ -193,7 +162,7 @@ impl WebsiteConfiguration {
if self.redirect_all_requests_to.is_some()
&& (self.error_document.is_some()
|| self.index_document.is_some()
|| !self.routing_rules.is_empty())
|| self.routing_rules.is_some())
{
return Err(Error::bad_request(
"Bad XML: can't have RedirectAllRequestsTo and other fields",
@ -208,15 +177,10 @@ impl WebsiteConfiguration {
if let Some(ref rart) = self.redirect_all_requests_to {
rart.validate()?;
}
for rr in &self.routing_rules.rules {
rr.validate()?;
}
if self.routing_rules.rules.len() > 1000 {
// we will do linear scans, best to avoid overly long configuration. The
// limit was choosen arbitrarily
return Err(Error::bad_request(
"Bad XML: RoutingRules can't have more than 1000 child elements",
));
if let Some(ref rrs) = self.routing_rules {
for rr in rrs {
rr.inner.validate()?;
}
}
Ok(())
@ -225,7 +189,11 @@ impl WebsiteConfiguration {
pub fn into_garage_website_config(self) -> Result<WebsiteConfig, Error> {
if self.redirect_all_requests_to.is_some() {
Err(Error::NotImplemented(
"RedirectAllRequestsTo is not currently implemented in Garage, however its effect can be emulated using a single inconditional RoutingRule.".into(),
"S3 website redirects are not currently implemented in Garage.".into(),
))
} else if self.routing_rules.map(|x| !x.is_empty()).unwrap_or(false) {
Err(Error::NotImplemented(
"S3 routing rules are not currently implemented in Garage.".into(),
))
} else {
Ok(WebsiteConfig {
@ -234,36 +202,6 @@ impl WebsiteConfiguration {
.map(|x| x.suffix.0)
.unwrap_or_else(|| "index.html".to_string()),
error_document: self.error_document.map(|x| x.key.0),
redirect_all: None,
routing_rules: self
.routing_rules
.rules
.into_iter()
.map(|rule| {
bucket_table::RoutingRule {
condition: rule.condition.map(|condition| {
bucket_table::RedirectCondition {
http_error_code: condition.http_error_code.map(|c| c.0 as u16),
prefix: condition.prefix.map(|p| p.0),
}
}),
redirect: bucket_table::Redirect {
hostname: rule.redirect.hostname.map(|h| h.0),
protocol: rule.redirect.protocol.map(|p| p.0),
// aws default to 301, which i find punitive in case of
// missconfiguration (can be permanently cached on the
// user agent)
http_redirect_code: rule
.redirect
.http_redirect_code
.map(|c| c.0 as u16)
.unwrap_or(302),
replace_key_prefix: rule.redirect.replace_prefix.map(|k| k.0),
replace_key: rule.redirect.replace_full.map(|k| k.0),
},
}
})
.collect(),
})
}
}
@ -304,69 +242,37 @@ impl Target {
}
}
impl RoutingRule {
impl RoutingRuleInner {
pub fn validate(&self) -> Result<(), Error> {
if let Some(condition) = &self.condition {
condition.validate()?;
}
self.redirect.validate()
}
}
impl Condition {
pub fn validate(&self) -> Result<bool, Error> {
if let Some(ref error_code) = self.http_error_code {
// TODO do other error codes make sense? Aws only allows 4xx and 5xx
if error_code.0 != 404 {
return Err(Error::bad_request(
"Bad XML: HttpErrorCodeReturnedEquals must be 404 or absent",
));
}
}
Ok(self.prefix.is_some())
let has_prefix = self
.condition
.as_ref()
.and_then(|c| c.prefix.as_ref())
.is_some();
self.redirect.validate(has_prefix)
}
}
impl Redirect {
pub fn validate(&self) -> Result<(), Error> {
if self.replace_prefix.is_some() && self.replace_full.is_some() {
return Err(Error::bad_request(
"Bad XML: both ReplaceKeyPrefixWith and ReplaceKeyWith are set",
));
pub fn validate(&self, has_prefix: bool) -> Result<(), Error> {
if self.replace_prefix.is_some() {
if self.replace_full.is_some() {
return Err(Error::bad_request(
"Bad XML: both ReplaceKeyPrefixWith and ReplaceKeyWith are set",
));
}
if !has_prefix {
return Err(Error::bad_request(
"Bad XML: ReplaceKeyPrefixWith is set, but KeyPrefixEquals isn't",
));
}
}
if let Some(ref protocol) = self.protocol {
if protocol.0 != "http" && protocol.0 != "https" {
return Err(Error::bad_request("Bad XML: invalid protocol"));
}
}
if let Some(ref http_redirect_code) = self.http_redirect_code {
match http_redirect_code.0 {
// aws allows all 3xx except 300, but some are non-sensical (not modified,
// use proxy...)
301 | 302 | 303 | 307 | 308 => {
if self.hostname.is_none() && self.protocol.is_some() {
return Err(Error::bad_request(
"Bad XML: HostName must be set if Protocol is set",
));
}
}
// aws doesn't allow these codes, but netlify does, and it seems like a
// cool feature (change the page seen without changing the url shown by the
// user agent)
200 | 404 => {
if self.hostname.is_some() || self.protocol.is_some() {
// hostname would mean different bucket, protocol doesn't make
// sense
return Err(Error::bad_request(
"Bad XML: an HttpRedirectCode of 200 is not acceptable alongside HostName or Protocol",
));
}
}
_ => {
return Err(Error::bad_request("Bad XML: invalid HttpRedirectCode"));
}
}
}
// TODO there are probably more invalid cases, but which ones?
Ok(())
}
}
@ -405,15 +311,6 @@ mod tests {
<ReplaceKeyWith>fullkey</ReplaceKeyWith>
</Redirect>
</RoutingRule>
<RoutingRule>
<Condition>
<KeyPrefixEquals></KeyPrefixEquals>
</Condition>
<Redirect>
<HttpRedirectCode>404</HttpRedirectCode>
<ReplaceKeyWith>missing</ReplaceKeyWith>
</Redirect>
</RoutingRule>
</RoutingRules>
</WebsiteConfiguration>"#;
let conf: WebsiteConfiguration = from_str(message).unwrap();
@ -429,36 +326,21 @@ mod tests {
hostname: Value("garage.tld".to_owned()),
protocol: Some(Value("https".to_owned())),
}),
routing_rules: RoutingRules {
rules: vec![
RoutingRule {
condition: Some(Condition {
http_error_code: Some(IntValue(404)),
prefix: Some(Value("prefix1".to_owned())),
}),
redirect: Redirect {
hostname: Some(Value("gara.ge".to_owned())),
protocol: Some(Value("http".to_owned())),
http_redirect_code: Some(IntValue(303)),
replace_prefix: Some(Value("prefix2".to_owned())),
replace_full: Some(Value("fullkey".to_owned())),
},
routing_rules: Some(vec![RoutingRule {
inner: RoutingRuleInner {
condition: Some(Condition {
http_error_code: Some(IntValue(404)),
prefix: Some(Value("prefix1".to_owned())),
}),
redirect: Redirect {
hostname: Some(Value("gara.ge".to_owned())),
protocol: Some(Value("http".to_owned())),
http_redirect_code: Some(IntValue(303)),
replace_prefix: Some(Value("prefix2".to_owned())),
replace_full: Some(Value("fullkey".to_owned())),
},
RoutingRule {
condition: Some(Condition {
http_error_code: None,
prefix: Some(Value("".to_owned())),
}),
redirect: Redirect {
hostname: None,
protocol: None,
http_redirect_code: Some(IntValue(404)),
replace_prefix: None,
replace_full: Some(Value("missing".to_owned())),
},
},
],
},
},
}]),
};
assert_eq! {
ref_value,

View file

@ -13,6 +13,10 @@ pub fn xmlns_tag<S: Serializer>(_v: &(), s: S) -> Result<S::Ok, S::Error> {
s.serialize_str("http://s3.amazonaws.com/doc/2006-03-01/")
}
pub fn xmlns_xsi_tag<S: Serializer>(_v: &(), s: S) -> Result<S::Ok, S::Error> {
s.serialize_str("http://www.w3.org/2001/XMLSchema-instance")
}
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, PartialOrd, Ord)]
pub struct Value(#[serde(rename = "$value")] pub String);
@ -135,8 +139,6 @@ pub struct CompleteMultipartUploadResult {
pub checksum_crc32: Option<Value>,
#[serde(rename = "ChecksumCRC32C")]
pub checksum_crc32c: Option<Value>,
#[serde(rename = "ChecksumCR64NVME")]
pub checksum_crc64nvme: Option<Value>,
#[serde(rename = "ChecksumSHA1")]
pub checksum_sha1: Option<Value>,
#[serde(rename = "ChecksumSHA256")]
@ -211,8 +213,6 @@ pub struct PartItem {
pub checksum_crc32: Option<Value>,
#[serde(rename = "ChecksumCRC32C")]
pub checksum_crc32c: Option<Value>,
#[serde(rename = "ChecksumCRC64NVME")]
pub checksum_crc64nvme: Option<Value>,
#[serde(rename = "ChecksumSHA1")]
pub checksum_sha1: Option<Value>,
#[serde(rename = "ChecksumSHA256")]
@ -323,6 +323,42 @@ pub struct PostObject {
pub etag: Value,
}
#[derive(Debug, Serialize, PartialEq, Eq)]
pub struct Grantee {
#[serde(rename = "xmlns:xsi", serialize_with = "xmlns_xsi_tag")]
pub xmlns_xsi: (),
#[serde(rename = "xsi:type")]
pub typ: String,
#[serde(rename = "DisplayName")]
pub display_name: Option<Value>,
#[serde(rename = "ID")]
pub id: Option<Value>,
}
#[derive(Debug, Serialize, PartialEq, Eq)]
pub struct Grant {
#[serde(rename = "Grantee")]
pub grantee: Grantee,
#[serde(rename = "Permission")]
pub permission: Value,
}
#[derive(Debug, Serialize, PartialEq, Eq)]
pub struct AccessControlList {
#[serde(rename = "Grant")]
pub entries: Vec<Grant>,
}
#[derive(Debug, Serialize, PartialEq, Eq)]
pub struct AccessControlPolicy {
#[serde(serialize_with = "xmlns_tag")]
pub xmlns: (),
#[serde(rename = "Owner")]
pub owner: Option<Owner>,
#[serde(rename = "AccessControlList")]
pub acl: AccessControlList,
}
#[cfg(test)]
mod tests {
use super::*;
@ -431,6 +467,43 @@ mod tests {
Ok(())
}
#[test]
fn get_bucket_acl_result() -> Result<(), ApiError> {
let grant = Grant {
grantee: Grantee {
xmlns_xsi: (),
typ: "CanonicalUser".to_string(),
display_name: Some(Value("owner_name".to_string())),
id: Some(Value("qsdfjklm".to_string())),
},
permission: Value("FULL_CONTROL".to_string()),
};
let get_bucket_acl = AccessControlPolicy {
xmlns: (),
owner: None,
acl: AccessControlList {
entries: vec![grant],
},
};
assert_eq!(
to_xml_with_header(&get_bucket_acl)?,
"<?xml version=\"1.0\" encoding=\"UTF-8\"?>\
<AccessControlPolicy xmlns=\"http://s3.amazonaws.com/doc/2006-03-01/\">\
<AccessControlList>\
<Grant>\
<Grantee xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xsi:type=\"CanonicalUser\">\
<DisplayName>owner_name</DisplayName>\
<ID>qsdfjklm</ID>\
</Grantee>\
<Permission>FULL_CONTROL</Permission>\
</Grant>\
</AccessControlList>\
</AccessControlPolicy>"
);
Ok(())
}
#[test]
fn delete_result() -> Result<(), ApiError> {
let delete_result = DeleteResult {
@ -522,7 +595,6 @@ mod tests {
etag: Value("\"3858f62230ac3c915f300c664312c11f-9\"".to_string()),
checksum_crc32: None,
checksum_crc32c: None,
checksum_crc64nvme: None,
checksum_sha1: Some(Value("ZJAnHyG8PeKz9tI8UTcHrJos39A=".into())),
checksum_sha256: None,
};
@ -808,7 +880,6 @@ mod tests {
size: IntValue(10485760),
checksum_crc32: None,
checksum_crc32c: None,
checksum_crc64nvme: None,
checksum_sha256: Some(Value(
"5RQ3A5uk0w7ojNjvegohch4JRBBGN/cLhsNrPzfv/hA=".into(),
)),
@ -822,7 +893,6 @@ mod tests {
checksum_sha256: None,
checksum_crc32c: None,
checksum_crc32: Some(Value("ZJAnHyG8=".into())),
checksum_crc64nvme: None,
checksum_sha1: None,
},
],

View file

@ -1,6 +1,6 @@
[package]
name = "garage_block"
version = "2.0.0"
version = "1.3.1"
authors = ["Alex Auvolat <alex@adnab.me>"]
edition = "2018"
license = "AGPL-3.0"
@ -18,6 +18,7 @@ garage_db.workspace = true
garage_net.workspace = true
garage_rpc.workspace = true
garage_util.workspace = true
garage_table.workspace = true
opentelemetry.workspace = true

View file

@ -33,6 +33,8 @@ use garage_rpc::rpc_helper::OrderTag;
use garage_rpc::system::System;
use garage_rpc::*;
use garage_table::replication::{TableReplication, TableShardedReplication};
use crate::block::*;
use crate::layout::*;
use crate::metrics::*;
@ -48,6 +50,8 @@ pub const INLINE_THRESHOLD: usize = 3072;
// to delete the block locally.
pub(crate) const BLOCK_GC_DELAY: Duration = Duration::from_secs(600);
const BLOCK_READ_SEMAPHORE_TIMEOUT: Duration = Duration::from_secs(15);
/// RPC messages used to share blocks of data between nodes
#[derive(Debug, Serialize, Deserialize)]
pub enum BlockRpc {
@ -72,8 +76,8 @@ impl Rpc for BlockRpc {
/// The block manager, handling block exchange between nodes, and block storage on local node
pub struct BlockManager {
/// Quorum of nodes for write operations
pub write_quorum: usize,
/// Replication strategy, allowing to find on which node blocks should be located
pub replication: TableShardedReplication,
/// Data layout
pub(crate) data_layout: ArcSwap<DataLayout>,
@ -85,6 +89,7 @@ pub struct BlockManager {
disable_scrub: bool,
mutation_lock: Vec<Mutex<BlockManagerLocked>>,
read_semaphore: Semaphore,
pub rc: BlockRc,
pub resync: BlockResyncManager,
@ -120,7 +125,7 @@ impl BlockManager {
pub fn new(
db: &db::Db,
config: &Config,
write_quorum: usize,
replication: TableShardedReplication,
system: Arc<System>,
) -> Result<Arc<Self>, Error> {
// Load or compute layout, i.e. assignment of data blocks to the different data directories
@ -164,7 +169,7 @@ impl BlockManager {
let scrub_persister = PersisterShared::new(&system.metadata_dir, "scrub_info");
let block_manager = Arc::new(Self {
write_quorum,
replication,
data_layout: ArcSwap::new(Arc::new(data_layout)),
data_layout_persister,
data_fsync: config.data_fsync,
@ -174,6 +179,8 @@ impl BlockManager {
.iter()
.map(|_| Mutex::new(BlockManagerLocked()))
.collect::<Vec<_>>(),
read_semaphore: Semaphore::new(config.block_max_concurrent_reads),
rc,
resync,
system,
@ -336,19 +343,6 @@ impl BlockManager {
Err(err)
}
/// Returns the set of nodes that should store a copy of a given block.
/// These are the nodes assigned to the block's hash in the current
/// layout version only: since blocks are immutable, we don't need to
/// do complex logic when several layour versions are active at once,
/// just move them directly to the new nodes.
pub(crate) fn storage_nodes_of(&self, hash: &Hash) -> Vec<Uuid> {
self.system
.cluster_layout()
.current()
.nodes_of(hash)
.collect()
}
// ---- Public interface ----
/// Ask nodes that might have a block for it, return it as a stream
@ -381,7 +375,7 @@ impl BlockManager {
prevent_compression: bool,
order_tag: Option<OrderTag>,
) -> Result<(), Error> {
let who = self.storage_nodes_of(&hash);
let who = self.system.cluster_layout().current_storage_nodes_of(&hash);
let compression_level = self.compression_level.filter(|_| !prevent_compression);
let (header, bytes) = DataBlock::from_buffer(data, compression_level)
@ -411,7 +405,7 @@ impl BlockManager {
put_block_rpc,
RequestStrategy::with_priority(PRIO_NORMAL | PRIO_SECONDARY)
.with_drop_on_completion(permit)
.with_quorum(self.write_quorum),
.with_quorum(self.replication.write_quorum()),
)
.await?;
@ -419,8 +413,8 @@ impl BlockManager {
}
/// Get number of items in the refcount table
pub fn rc_len(&self) -> Result<usize, Error> {
Ok(self.rc.rc_table.len()?)
pub fn rc_approximate_len(&self) -> Result<usize, Error> {
Ok(self.rc.rc_table.approximate_len()?)
}
/// Send command to start/stop/manager scrub worker
@ -438,7 +432,7 @@ impl BlockManager {
/// List all resync errors
pub fn list_resync_errors(&self) -> Result<Vec<BlockResyncErrorInfo>, Error> {
let mut blocks = Vec::with_capacity(self.resync.errors.len()?);
let mut blocks = Vec::with_capacity(self.resync.errors.approximate_len()?);
for ent in self.resync.errors.iter()? {
let (hash, cnt) = ent?;
let cnt = ErrorCounter::decode(&cnt);
@ -568,9 +562,6 @@ impl BlockManager {
match self.find_block(hash).await {
Some(p) => self.read_block_from(hash, &p).await,
None => {
// Not found but maybe we should have had it ??
self.resync
.put_to_resync(hash, 2 * self.system.rpc_helper().rpc_timeout())?;
return Err(Error::Message(format!(
"block {:?} not found on node",
hash
@ -592,6 +583,15 @@ impl BlockManager {
) -> Result<DataBlock, Error> {
let (header, path) = block_path.as_parts_ref();
let permit = tokio::select! {
sem = self.read_semaphore.acquire() => sem.ok_or_message("acquire read semaphore")?,
_ = tokio::time::sleep(BLOCK_READ_SEMAPHORE_TIMEOUT) => {
self.metrics.block_read_semaphore_timeouts.add(1);
debug!("read block {:?}: read_semaphore acquire timeout", hash);
return Err(Error::Message("read block: read_semaphore acquire timeout".into()));
}
};
let mut f = fs::File::open(&path).await?;
let mut data = vec![];
f.read_to_end(&mut data).await?;
@ -616,6 +616,8 @@ impl BlockManager {
return Err(Error::CorruptData(*hash));
}
drop(permit);
Ok(data)
}
@ -781,6 +783,7 @@ impl BlockManagerLocked {
let mut f = fs::File::create(&path_tmp).await?;
f.write_all(data).await?;
f.flush().await?;
mgr.metrics.bytes_written.add(data.len() as u64);
if mgr.data_fsync {

View file

@ -22,6 +22,7 @@ pub struct BlockManagerMetrics {
pub(crate) bytes_read: BoundCounter<u64>,
pub(crate) block_read_duration: BoundValueRecorder<f64>,
pub(crate) block_read_semaphore_timeouts: BoundCounter<u64>,
pub(crate) bytes_written: BoundCounter<u64>,
pub(crate) block_write_duration: BoundValueRecorder<f64>,
pub(crate) delete_counter: BoundCounter<u64>,
@ -50,7 +51,7 @@ impl BlockManagerMetrics {
.init(),
_rc_size: meter
.u64_value_observer("block.rc_size", move |observer| {
if let Ok(value) = rc_tree.len() {
if let Ok(value) = rc_tree.approximate_len() {
observer.observe(value as u64, &[])
}
})
@ -58,7 +59,7 @@ impl BlockManagerMetrics {
.init(),
_resync_queue_len: meter
.u64_value_observer("block.resync_queue_length", move |observer| {
if let Ok(value) = resync_queue.len() {
if let Ok(value) = resync_queue.approximate_len() {
observer.observe(value as u64, &[]);
}
})
@ -68,7 +69,7 @@ impl BlockManagerMetrics {
.init(),
_resync_errored_blocks: meter
.u64_value_observer("block.resync_errored_blocks", move |observer| {
if let Ok(value) = resync_errors.len() {
if let Ok(value) = resync_errors.approximate_len() {
observer.observe(value as u64, &[]);
}
})
@ -119,6 +120,11 @@ impl BlockManagerMetrics {
.with_description("Duration of block read operations")
.init()
.bind(&[]),
block_read_semaphore_timeouts: meter
.u64_counter("block.read_semaphore_timeouts")
.with_description("Number of block reads that failed due to semaphore acquire timeout")
.init()
.bind(&[]),
bytes_written: meter
.u64_counter("block.bytes_written")
.with_description("Number of bytes written to disk")

View file

@ -27,6 +27,8 @@ use garage_util::tranquilizer::Tranquilizer;
use garage_rpc::system::System;
use garage_rpc::*;
use garage_table::replication::TableReplication;
use crate::manager::*;
// The delay between the time where a resync operation fails
@ -104,13 +106,13 @@ impl BlockResyncManager {
}
/// Get length of resync queue
pub fn queue_len(&self) -> Result<usize, Error> {
Ok(self.queue.len()?)
pub fn queue_approximate_len(&self) -> Result<usize, Error> {
Ok(self.queue.approximate_len()?)
}
/// Get number of blocks that have an error
pub fn errors_len(&self) -> Result<usize, Error> {
Ok(self.errors.len()?)
pub fn errors_approximate_len(&self) -> Result<usize, Error> {
Ok(self.errors.approximate_len()?)
}
/// Clear the error counter for a block and put it in queue immediately
@ -131,6 +133,14 @@ impl BlockResyncManager {
)))
}
/// Clear the entire resync queue and list of errored blocks
/// Corresponds to `garage repair clear-resync-queue`
pub fn clear_resync_queue(&self) -> Result<(), Error> {
self.queue.clear()?;
self.errors.clear()?;
Ok(())
}
pub fn register_bg_vars(&self, vars: &mut vars::BgVars) {
let notify = self.notify.clone();
vars.register_rw(
@ -375,8 +385,11 @@ impl BlockResyncManager {
info!("Resync block {:?}: offloading and deleting", hash);
let existing_path = existing_path.unwrap();
let mut who = manager.storage_nodes_of(hash);
if who.len() < manager.write_quorum {
let mut who = manager
.system
.cluster_layout()
.current_storage_nodes_of(hash);
if who.len() < manager.replication.write_quorum() {
return Err(Error::Message("Not trying to offload block because we don't have a quorum of nodes to write to".to_string()));
}
who.retain(|id| *id != manager.system.id);
@ -458,7 +471,10 @@ impl BlockResyncManager {
// First, check whether we are still supposed to store that
// block in the latest cluster layout version.
let storage_nodes = manager.storage_nodes_of(&hash);
let storage_nodes = manager
.system
.cluster_layout()
.current_storage_nodes_of(&hash);
if !storage_nodes.contains(&manager.system.id) {
info!(
@ -540,9 +556,11 @@ impl Worker for ResyncWorker {
}
WorkerStatus {
queue_length: Some(self.manager.resync.queue_len().unwrap_or(0) as u64),
queue_length: Some(self.manager.resync.queue_approximate_len().unwrap_or(0) as u64),
tranquility: Some(tranquility),
persistent_errors: Some(self.manager.resync.errors_len().unwrap_or(0) as u64),
persistent_errors: Some(
self.manager.resync.errors_approximate_len().unwrap_or(0) as u64
),
..Default::default()
}
}

View file

@ -1,6 +1,6 @@
[package]
name = "garage_db"
version = "2.0.0"
version = "1.3.1"
authors = ["Alex Auvolat <alex@adnab.me>"]
edition = "2018"
license = "AGPL-3.0"
@ -12,14 +12,18 @@ readme = "../../README.md"
path = "lib.rs"
[dependencies]
err-derive.workspace = true
thiserror.workspace = true
tracing.workspace = true
heed = { workspace = true, optional = true }
rusqlite = { workspace = true, optional = true, features = ["backup"] }
r2d2 = { workspace = true, optional = true }
r2d2_sqlite = { workspace = true, optional = true }
fjall = { workspace = true, optional = true }
parking_lot = { workspace = true, optional = true }
[dev-dependencies]
mktemp.workspace = true
@ -27,4 +31,5 @@ mktemp.workspace = true
default = [ "lmdb", "sqlite" ]
bundled-libs = [ "rusqlite?/bundled" ]
lmdb = [ "heed" ]
fjall = [ "dep:fjall", "dep:parking_lot" ]
sqlite = [ "rusqlite", "r2d2", "r2d2_sqlite" ]

453
src/db/fjall_adapter.rs Normal file
View file

@ -0,0 +1,453 @@
use core::ops::Bound;
use std::path::PathBuf;
use std::sync::Arc;
use parking_lot::{MappedRwLockReadGuard, RwLock, RwLockReadGuard};
use fjall::{
PartitionCreateOptions, PersistMode, TransactionalKeyspace, TransactionalPartitionHandle,
WriteTransaction,
};
use crate::{
open::{Engine, OpenOpt},
Db, Error, IDb, ITx, ITxFn, OnCommit, Result, TxError, TxFnResult, TxOpError, TxOpResult,
TxResult, TxValueIter, Value, ValueIter,
};
pub use fjall;
// --
pub(crate) fn open_db(path: &PathBuf, opt: &OpenOpt) -> Result<Db> {
info!("Opening Fjall database at: {}", path.display());
if opt.fsync {
return Err(Error(
"metadata_fsync is not supported with the Fjall database engine".into(),
));
}
let mut config = fjall::Config::new(path);
if let Some(block_cache_size) = opt.fjall_block_cache_size {
config = config.cache_size(block_cache_size as u64);
}
let keyspace = config.open_transactional()?;
Ok(FjallDb::init(keyspace))
}
// -- err
impl From<fjall::Error> for Error {
fn from(e: fjall::Error) -> Error {
Error(format!("fjall: {}", e).into())
}
}
impl From<fjall::LsmError> for Error {
fn from(e: fjall::LsmError) -> Error {
Error(format!("fjall lsm_tree: {}", e).into())
}
}
impl From<fjall::Error> for TxOpError {
fn from(e: fjall::Error) -> TxOpError {
TxOpError(e.into())
}
}
// -- db
pub struct FjallDb {
keyspace: TransactionalKeyspace,
trees: RwLock<Vec<(String, TransactionalPartitionHandle)>>,
}
type ByteRefRangeBound<'r> = (Bound<&'r [u8]>, Bound<&'r [u8]>);
impl FjallDb {
pub fn init(keyspace: TransactionalKeyspace) -> Db {
let s = Self {
keyspace,
trees: RwLock::new(Vec::new()),
};
Db(Arc::new(s))
}
fn get_tree(
&self,
i: usize,
) -> Result<MappedRwLockReadGuard<'_, TransactionalPartitionHandle>> {
RwLockReadGuard::try_map(self.trees.read(), |trees: &Vec<_>| {
trees.get(i).map(|tup| &tup.1)
})
.map_err(|_| Error("invalid tree id".into()))
}
}
impl IDb for FjallDb {
fn engine(&self) -> String {
"Fjall (EXPERIMENTAL!)".into()
}
fn open_tree(&self, name: &str) -> Result<usize> {
let mut trees = self.trees.write();
let safe_name = encode_name(name)?;
if let Some(i) = trees.iter().position(|(name, _)| *name == safe_name) {
Ok(i)
} else {
let tree = self
.keyspace
.open_partition(&safe_name, PartitionCreateOptions::default())?;
let i = trees.len();
trees.push((safe_name, tree));
Ok(i)
}
}
fn list_trees(&self) -> Result<Vec<String>> {
Ok(self
.keyspace
.list_partitions()
.iter()
.map(|n| decode_name(&n))
.collect::<Result<Vec<_>>>()?)
}
fn snapshot(&self, base_path: &PathBuf) -> Result<()> {
std::fs::create_dir_all(base_path)?;
let path = Engine::Fjall.db_path(base_path);
let source_state = self.keyspace.read_tx();
let copy_keyspace = fjall::Config::new(path).open()?;
for partition_name in self.keyspace.list_partitions() {
let source_partition = self
.keyspace
.open_partition(&partition_name, PartitionCreateOptions::default())?;
let copy_partition =
copy_keyspace.open_partition(&partition_name, PartitionCreateOptions::default())?;
for entry in source_state.iter(&source_partition) {
let (key, value) = entry?;
copy_partition.insert(key, value)?;
}
}
copy_keyspace.persist(PersistMode::SyncAll)?;
Ok(())
}
// ----
fn get(&self, tree_idx: usize, key: &[u8]) -> Result<Option<Value>> {
let tree = self.get_tree(tree_idx)?;
let tx = self.keyspace.read_tx();
let val = tx.get(&tree, key)?;
match val {
None => Ok(None),
Some(v) => Ok(Some(v.to_vec())),
}
}
fn approximate_len(&self, tree_idx: usize) -> Result<usize> {
let tree = self.get_tree(tree_idx)?;
Ok(tree.approximate_len())
}
fn is_empty(&self, tree_idx: usize) -> Result<bool> {
let tree = self.get_tree(tree_idx)?;
let tx = self.keyspace.read_tx();
Ok(tx.is_empty(&tree)?)
}
fn insert(&self, tree_idx: usize, key: &[u8], value: &[u8]) -> Result<()> {
let tree = self.get_tree(tree_idx)?;
let mut tx = self.keyspace.write_tx();
tx.insert(&tree, key, value);
tx.commit()?;
Ok(())
}
fn remove(&self, tree_idx: usize, key: &[u8]) -> Result<()> {
let tree = self.get_tree(tree_idx)?;
let mut tx = self.keyspace.write_tx();
tx.remove(&tree, key);
tx.commit()?;
Ok(())
}
fn clear(&self, tree_idx: usize) -> Result<()> {
let mut trees = self.trees.write();
if tree_idx >= trees.len() {
return Err(Error("invalid tree id".into()));
}
let (name, tree) = trees.remove(tree_idx);
self.keyspace.delete_partition(tree)?;
let tree = self
.keyspace
.open_partition(&name, PartitionCreateOptions::default())?;
trees.insert(tree_idx, (name, tree));
Ok(())
}
fn iter(&self, tree_idx: usize) -> Result<ValueIter<'_>> {
let tree = self.get_tree(tree_idx)?;
let tx = self.keyspace.read_tx();
Ok(Box::new(tx.iter(&tree).map(iterator_remap)))
}
fn iter_rev(&self, tree_idx: usize) -> Result<ValueIter<'_>> {
let tree = self.get_tree(tree_idx)?;
let tx = self.keyspace.read_tx();
Ok(Box::new(tx.iter(&tree).rev().map(iterator_remap)))
}
fn range<'r>(
&self,
tree_idx: usize,
low: Bound<&'r [u8]>,
high: Bound<&'r [u8]>,
) -> Result<ValueIter<'_>> {
let tree = self.get_tree(tree_idx)?;
let tx = self.keyspace.read_tx();
Ok(Box::new(
tx.range::<&'r [u8], ByteRefRangeBound>(&tree, (low, high))
.map(iterator_remap),
))
}
fn range_rev<'r>(
&self,
tree_idx: usize,
low: Bound<&'r [u8]>,
high: Bound<&'r [u8]>,
) -> Result<ValueIter<'_>> {
let tree = self.get_tree(tree_idx)?;
let tx = self.keyspace.read_tx();
Ok(Box::new(
tx.range::<&'r [u8], ByteRefRangeBound>(&tree, (low, high))
.rev()
.map(iterator_remap),
))
}
// ----
fn transaction(&self, f: &dyn ITxFn) -> TxResult<OnCommit, ()> {
let trees = self.trees.read();
let mut tx = FjallTx {
trees: &trees[..],
tx: self.keyspace.write_tx(),
};
let res = f.try_on(&mut tx);
match res {
TxFnResult::Ok(on_commit) => {
tx.tx.commit().map_err(Error::from).map_err(TxError::Db)?;
Ok(on_commit)
}
TxFnResult::Abort => {
tx.tx.rollback();
Err(TxError::Abort(()))
}
TxFnResult::DbErr => {
tx.tx.rollback();
Err(TxError::Db(Error(
"(this message will be discarded)".into(),
)))
}
}
}
}
// ----
struct FjallTx<'a> {
trees: &'a [(String, TransactionalPartitionHandle)],
tx: WriteTransaction<'a>,
}
impl<'a> FjallTx<'a> {
fn get_tree(&self, i: usize) -> TxOpResult<&TransactionalPartitionHandle> {
self.trees.get(i).map(|tup| &tup.1).ok_or_else(|| {
TxOpError(Error(
"invalid tree id (it might have been openned after the transaction started)".into(),
))
})
}
}
impl<'a> ITx for FjallTx<'a> {
fn get(&self, tree_idx: usize, key: &[u8]) -> TxOpResult<Option<Value>> {
let tree = self.get_tree(tree_idx)?;
match self.tx.get(tree, key)? {
Some(v) => Ok(Some(v.to_vec())),
None => Ok(None),
}
}
fn len(&self, tree_idx: usize) -> TxOpResult<usize> {
let tree = self.get_tree(tree_idx)?;
Ok(self.tx.len(tree)? as usize)
}
fn insert(&mut self, tree_idx: usize, key: &[u8], value: &[u8]) -> TxOpResult<()> {
let tree = self.get_tree(tree_idx)?.clone();
self.tx.insert(&tree, key, value);
Ok(())
}
fn remove(&mut self, tree_idx: usize, key: &[u8]) -> TxOpResult<()> {
let tree = self.get_tree(tree_idx)?.clone();
self.tx.remove(&tree, key);
Ok(())
}
fn clear(&mut self, _tree_idx: usize) -> TxOpResult<()> {
unimplemented!("LSM tree clearing in cross-partition transaction is not supported")
}
fn iter(&self, tree_idx: usize) -> TxOpResult<TxValueIter<'_>> {
let tree = self.get_tree(tree_idx)?.clone();
Ok(Box::new(self.tx.iter(&tree).map(iterator_remap_tx)))
}
fn iter_rev(&self, tree_idx: usize) -> TxOpResult<TxValueIter<'_>> {
let tree = self.get_tree(tree_idx)?.clone();
Ok(Box::new(self.tx.iter(&tree).rev().map(iterator_remap_tx)))
}
fn range<'r>(
&self,
tree_idx: usize,
low: Bound<&'r [u8]>,
high: Bound<&'r [u8]>,
) -> TxOpResult<TxValueIter<'_>> {
let tree = self.get_tree(tree_idx)?;
let low = clone_bound(low);
let high = clone_bound(high);
Ok(Box::new(
self.tx
.range::<Vec<u8>, ByteVecRangeBounds>(&tree, (low, high))
.map(iterator_remap_tx),
))
}
fn range_rev<'r>(
&self,
tree_idx: usize,
low: Bound<&'r [u8]>,
high: Bound<&'r [u8]>,
) -> TxOpResult<TxValueIter<'_>> {
let tree = self.get_tree(tree_idx)?;
let low = clone_bound(low);
let high = clone_bound(high);
Ok(Box::new(
self.tx
.range::<Vec<u8>, ByteVecRangeBounds>(&tree, (low, high))
.rev()
.map(iterator_remap_tx),
))
}
}
// -- maps fjall's (k, v) to ours
fn iterator_remap(r: fjall::Result<(fjall::Slice, fjall::Slice)>) -> Result<(Value, Value)> {
r.map(|(k, v)| (k.to_vec(), v.to_vec()))
.map_err(|e| e.into())
}
fn iterator_remap_tx(r: fjall::Result<(fjall::Slice, fjall::Slice)>) -> TxOpResult<(Value, Value)> {
r.map(|(k, v)| (k.to_vec(), v.to_vec()))
.map_err(|e| e.into())
}
// -- utils to deal with Garage's tightness on Bound lifetimes
type ByteVecBound = Bound<Vec<u8>>;
type ByteVecRangeBounds = (ByteVecBound, ByteVecBound);
fn clone_bound(bound: Bound<&[u8]>) -> ByteVecBound {
let value = match bound {
Bound::Excluded(v) | Bound::Included(v) => v.to_vec(),
Bound::Unbounded => vec![],
};
match bound {
Bound::Included(_) => Bound::Included(value),
Bound::Excluded(_) => Bound::Excluded(value),
Bound::Unbounded => Bound::Unbounded,
}
}
// -- utils to encode table names --
fn encode_name(s: &str) -> Result<String> {
let base = 'A' as u32;
let mut ret = String::with_capacity(s.len() + 10);
for c in s.chars() {
if c.is_alphanumeric() || c == '_' || c == '-' || c == '#' {
ret.push(c);
} else if c <= u8::MAX as char {
ret.push('$');
let c_hi = c as u32 / 16;
let c_lo = c as u32 % 16;
ret.push(char::from_u32(base + c_hi).unwrap());
ret.push(char::from_u32(base + c_lo).unwrap());
} else {
return Err(Error(
format!("table name {} could not be safely encoded", s).into(),
));
}
}
Ok(ret)
}
fn decode_name(s: &str) -> Result<String> {
use std::convert::TryFrom;
let errfn = || Error(format!("encoded table name {} is invalid", s).into());
let c_map = |c: char| {
let c = c as u32;
let base = 'A' as u32;
if (base..base + 16).contains(&c) {
Some(c - base)
} else {
None
}
};
let mut ret = String::with_capacity(s.len());
let mut it = s.chars();
while let Some(c) = it.next() {
if c == '$' {
let c_hi = it.next().and_then(c_map).ok_or_else(errfn)?;
let c_lo = it.next().and_then(c_map).ok_or_else(errfn)?;
let c_dec = char::try_from(c_hi * 16 + c_lo).map_err(|_| errfn())?;
ret.push(c_dec);
} else {
ret.push(c);
}
}
Ok(ret)
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_encdec_name() {
for name in [
"testname",
"test_name",
"test name",
"test$name",
"test:name@help.me$get/this**right",
] {
let encname = encode_name(name).unwrap();
assert!(!encname.contains(' '));
assert!(!encname.contains('.'));
assert!(!encname.contains('*'));
assert_eq!(*name, decode_name(&encname).unwrap());
}
}
}

View file

@ -1,6 +1,8 @@
#[macro_use]
extern crate tracing;
#[cfg(feature = "fjall")]
pub mod fjall_adapter;
#[cfg(feature = "lmdb")]
pub mod lmdb_adapter;
#[cfg(feature = "sqlite")]
@ -18,7 +20,7 @@ use std::cell::Cell;
use std::path::PathBuf;
use std::sync::Arc;
use err_derive::Error;
use thiserror::Error;
pub use open::*;
@ -42,7 +44,7 @@ pub type TxValueIter<'a> = Box<dyn std::iter::Iterator<Item = TxOpResult<(Value,
// ----
#[derive(Debug, Error)]
#[error(display = "{}", _0)]
#[error("{0}")]
pub struct Error(pub Cow<'static, str>);
impl From<std::io::Error> for Error {
@ -54,7 +56,7 @@ impl From<std::io::Error> for Error {
pub type Result<T> = std::result::Result<T, Error>;
#[derive(Debug, Error)]
#[error(display = "{}", _0)]
#[error("{0}")]
pub struct TxOpError(pub(crate) Error);
pub type TxOpResult<T> = std::result::Result<T, TxOpError>;
@ -104,32 +106,44 @@ impl Db {
result: Cell::new(None),
};
let tx_res = self.0.transaction(&f);
let ret = f
.result
.into_inner()
.expect("Transaction did not store result");
let fn_res = f.result.into_inner();
match tx_res {
Ok(on_commit) => match ret {
Ok(value) => {
on_commit.into_iter().for_each(|f| f());
Ok(value)
}
_ => unreachable!(),
},
Err(TxError::Abort(())) => match ret {
Err(TxError::Abort(e)) => Err(TxError::Abort(e)),
_ => unreachable!(),
},
Err(TxError::Db(e2)) => match ret {
// Ok was stored -> the error occurred when finalizing
// transaction
Ok(_) => Err(TxError::Db(e2)),
// An error was already stored: that's the one we want to
// return
Err(TxError::Db(e)) => Err(TxError::Db(e)),
_ => unreachable!(),
},
match (tx_res, fn_res) {
(Ok(on_commit), Some(Ok(value))) => {
// Transaction succeeded
// TxFn stored the value to return to the user in fn_res
// tx_res contains the on_commit list of callbacks, run them now
on_commit.into_iter().for_each(|f| f());
Ok(value)
}
(Err(TxError::Abort(())), Some(Err(TxError::Abort(e)))) => {
// Transaction was aborted by user code
// The abort error value is stored in fn_res
Err(TxError::Abort(e))
}
(Err(TxError::Db(_tx_e)), Some(Err(TxError::Db(fn_e)))) => {
// Transaction encountered a DB error in user code
// The error value encountered is the one in fn_res,
// tx_res contains only a dummy error message
Err(TxError::Db(fn_e))
}
(Err(TxError::Db(tx_e)), None) => {
// Transaction encounterred a DB error when initializing the transaction,
// before user code was called
Err(TxError::Db(tx_e))
}
(Err(TxError::Db(tx_e)), Some(Ok(_))) => {
// Transaction encounterred a DB error when commiting the transaction,
// after user code was called
Err(TxError::Db(tx_e))
}
(tx_res, fn_res) => {
panic!(
"unexpected error case: tx_res={:?}, fn_res={:?}",
tx_res.map(|_| "..."),
fn_res.map(|x| x.map(|_| "...").map_err(|_| "..."))
);
}
}
}
@ -152,7 +166,7 @@ impl Db {
let tree_names = other.list_trees()?;
for name in tree_names {
let tree = self.open_tree(&name)?;
if tree.len()? > 0 {
if !tree.is_empty()? {
return Err(Error(format!("tree {} already contains data", name).into()));
}
@ -194,8 +208,12 @@ impl Tree {
self.0.get(self.1, key.as_ref())
}
#[inline]
pub fn len(&self) -> Result<usize> {
self.0.len(self.1)
pub fn approximate_len(&self) -> Result<usize> {
self.0.approximate_len(self.1)
}
#[inline]
pub fn is_empty(&self) -> Result<bool> {
self.0.is_empty(self.1)
}
#[inline]
@ -333,7 +351,8 @@ pub(crate) trait IDb: Send + Sync {
fn snapshot(&self, path: &PathBuf) -> Result<()>;
fn get(&self, tree: usize, key: &[u8]) -> Result<Option<Value>>;
fn len(&self, tree: usize) -> Result<usize>;
fn approximate_len(&self, tree: usize) -> Result<usize>;
fn is_empty(&self, tree: usize) -> Result<bool>;
fn insert(&self, tree: usize, key: &[u8], value: &[u8]) -> Result<()>;
fn remove(&self, tree: usize, key: &[u8]) -> Result<()>;

View file

@ -1,8 +1,8 @@
use core::ops::Bound;
use core::ptr::NonNull;
use std::collections::HashMap;
use std::convert::TryInto;
use std::marker::PhantomPinned;
use std::path::PathBuf;
use std::pin::Pin;
use std::sync::{Arc, RwLock};
@ -11,12 +11,55 @@ use heed::types::ByteSlice;
use heed::{BytesDecode, Env, RoTxn, RwTxn, UntypedDatabase as Database};
use crate::{
open::{Engine, OpenOpt},
Db, Error, IDb, ITx, ITxFn, OnCommit, Result, TxError, TxFnResult, TxOpError, TxOpResult,
TxResult, TxValueIter, Value, ValueIter,
};
pub use heed;
// ---- top-level open function
pub(crate) fn open_db(path: &PathBuf, opt: &OpenOpt) -> Result<Db> {
info!("Opening LMDB database at: {}", path.display());
if let Err(e) = std::fs::create_dir_all(&path) {
return Err(Error(
format!("Unable to create LMDB data directory: {}", e).into(),
));
}
let map_size = match opt.lmdb_map_size {
None => recommended_map_size(),
Some(v) => v - (v % 4096),
};
let mut env_builder = heed::EnvOpenOptions::new();
env_builder.max_dbs(100);
env_builder.map_size(map_size);
env_builder.max_readers(2048);
unsafe {
env_builder.flag(heed::flags::Flags::MdbNoRdAhead);
env_builder.flag(heed::flags::Flags::MdbNoMetaSync);
if !opt.fsync {
env_builder.flag(heed::flags::Flags::MdbNoSync);
}
}
match env_builder.open(&path) {
Err(heed::Error::Io(e)) if e.kind() == std::io::ErrorKind::OutOfMemory => {
return Err(Error(
"OutOfMemory error while trying to open LMDB database. This can happen \
if your operating system is not allowing you to use sufficient virtual \
memory address space. Please check that no limit is set (ulimit -v). \
You may also try to set a smaller `lmdb_map_size` configuration parameter. \
On 32-bit machines, you should probably switch to another database engine."
.into(),
))
}
Err(e) => Err(Error(format!("Cannot open LMDB database: {}", e).into())),
Ok(db) => Ok(LmdbDb::init(db)),
}
}
// -- err
impl From<heed::Error> for Error {
@ -104,12 +147,11 @@ impl IDb for LmdbDb {
Ok(ret2)
}
fn snapshot(&self, to: &PathBuf) -> Result<()> {
std::fs::create_dir_all(to)?;
let mut path = to.clone();
path.push("data.mdb");
fn snapshot(&self, base_path: &PathBuf) -> Result<()> {
std::fs::create_dir_all(base_path)?;
let path = Engine::Lmdb.db_path(base_path);
self.db
.copy_to_path(path, heed::CompactionOption::Disabled)?;
.copy_to_path(path, heed::CompactionOption::Enabled)?;
Ok(())
}
@ -126,11 +168,16 @@ impl IDb for LmdbDb {
}
}
fn len(&self, tree: usize) -> Result<usize> {
fn approximate_len(&self, tree: usize) -> Result<usize> {
let tree = self.get_tree(tree)?;
let tx = self.db.read_txn()?;
Ok(tree.len(&tx)?.try_into().unwrap())
}
fn is_empty(&self, tree: usize) -> Result<bool> {
let tree = self.get_tree(tree)?;
let tx = self.db.read_txn()?;
Ok(tree.is_empty(&tx)?)
}
fn insert(&self, tree: usize, key: &[u8], value: &[u8]) -> Result<()> {
let tree = self.get_tree(tree)?;
@ -159,13 +206,15 @@ impl IDb for LmdbDb {
fn iter(&self, tree: usize) -> Result<ValueIter<'_>> {
let tree = self.get_tree(tree)?;
let tx = self.db.read_txn()?;
TxAndIterator::make(tx, |tx| Ok(tree.iter(tx)?))
// Safety: the cloture does not store its argument anywhere,
unsafe { TxAndIterator::make(tx, |tx| Ok(tree.iter(tx)?)) }
}
fn iter_rev(&self, tree: usize) -> Result<ValueIter<'_>> {
let tree = self.get_tree(tree)?;
let tx = self.db.read_txn()?;
TxAndIterator::make(tx, |tx| Ok(tree.rev_iter(tx)?))
// Safety: the cloture does not store its argument anywhere,
unsafe { TxAndIterator::make(tx, |tx| Ok(tree.rev_iter(tx)?)) }
}
fn range<'r>(
@ -176,7 +225,8 @@ impl IDb for LmdbDb {
) -> Result<ValueIter<'_>> {
let tree = self.get_tree(tree)?;
let tx = self.db.read_txn()?;
TxAndIterator::make(tx, |tx| Ok(tree.range(tx, &(low, high))?))
// Safety: the cloture does not store its argument anywhere,
unsafe { TxAndIterator::make(tx, |tx| Ok(tree.range(tx, &(low, high))?)) }
}
fn range_rev<'r>(
&self,
@ -186,7 +236,8 @@ impl IDb for LmdbDb {
) -> Result<ValueIter<'_>> {
let tree = self.get_tree(tree)?;
let tx = self.db.read_txn()?;
TxAndIterator::make(tx, |tx| Ok(tree.rev_range(tx, &(low, high))?))
// Safety: the cloture does not store its argument anywhere,
unsafe { TxAndIterator::make(tx, |tx| Ok(tree.rev_range(tx, &(low, high))?)) }
}
// ----
@ -316,28 +367,41 @@ where
{
tx: RoTxn<'a>,
iter: Option<I>,
_pin: PhantomPinned,
}
impl<'a, I> TxAndIterator<'a, I>
where
I: Iterator<Item = IteratorItem<'a>> + 'a,
{
fn make<F>(tx: RoTxn<'a>, iterfun: F) -> Result<ValueIter<'a>>
fn iter(self: Pin<&mut Self>) -> &mut Option<I> {
// Safety: iter is not structural
unsafe { &mut self.get_unchecked_mut().iter }
}
/// Safety: iterfun must not store its argument anywhere but in its result.
unsafe fn make<F>(tx: RoTxn<'a>, iterfun: F) -> Result<ValueIter<'a>>
where
F: FnOnce(&'a RoTxn<'a>) -> Result<I>,
{
let res = TxAndIterator { tx, iter: None };
let res = TxAndIterator {
tx,
iter: None,
_pin: PhantomPinned,
};
let mut boxed = Box::pin(res);
// This unsafe allows us to bypass lifetime checks
let tx = unsafe { NonNull::from(&boxed.tx).as_ref() };
let iter = iterfun(tx)?;
let tx_lifetime_overextended: &'a RoTxn<'a> = {
let tx = &boxed.tx;
// Safety: Artificially extending the lifetime because
// this reference will only be stored and accessed from the
// returned ValueIter which guarantees that it is destroyed
// before the tx it is pointing to.
unsafe { &*&raw const *tx }
};
let iter = iterfun(&tx_lifetime_overextended)?;
let mut_ref = Pin::as_mut(&mut boxed);
// This unsafe allows us to write in a field of the pinned struct
unsafe {
Pin::get_unchecked_mut(mut_ref).iter = Some(iter);
}
*boxed.as_mut().iter() = Some(iter);
Ok(Box::new(TxAndIteratorPin(boxed)))
}
@ -348,8 +412,10 @@ where
I: Iterator<Item = IteratorItem<'a>> + 'a,
{
fn drop(&mut self) {
// ensure the iterator is dropped before the RoTxn it references
drop(self.iter.take());
// Safety: `new_unchecked` is okay because we know this value is never
// used again after being dropped.
let this = unsafe { Pin::new_unchecked(self) };
drop(this.iter().take());
}
}
@ -365,13 +431,12 @@ where
fn next(&mut self) -> Option<Self::Item> {
let mut_ref = Pin::as_mut(&mut self.0);
// This unsafe allows us to mutably access the iterator field
let next = unsafe { Pin::get_unchecked_mut(mut_ref).iter.as_mut()?.next() };
match next {
None => None,
Some(Err(e)) => Some(Err(e.into())),
Some(Ok((k, v))) => Some(Ok((k.to_vec(), v.to_vec()))),
}
let next = mut_ref.iter().as_mut()?.next()?;
let res = match next {
Err(e) => Err(e.into()),
Ok((k, v)) => Ok((k.to_vec(), v.to_vec())),
};
Some(res)
}
}

View file

@ -11,6 +11,7 @@ use crate::{Db, Error, Result};
pub enum Engine {
Lmdb,
Sqlite,
Fjall,
}
impl Engine {
@ -19,8 +20,26 @@ impl Engine {
match self {
Self::Lmdb => "lmdb",
Self::Sqlite => "sqlite",
Self::Fjall => "fjall",
}
}
/// Return engine-specific DB path from base path
pub fn db_path(&self, base_path: &PathBuf) -> PathBuf {
let mut ret = base_path.clone();
match self {
Self::Lmdb => {
ret.push("db.lmdb");
}
Self::Sqlite => {
ret.push("db.sqlite");
}
Self::Fjall => {
ret.push("db.fjall");
}
}
ret
}
}
impl std::fmt::Display for Engine {
@ -36,10 +55,11 @@ impl std::str::FromStr for Engine {
match text {
"lmdb" | "heed" => Ok(Self::Lmdb),
"sqlite" | "sqlite3" | "rusqlite" => Ok(Self::Sqlite),
"fjall" => Ok(Self::Fjall),
"sled" => Err(Error("Sled is no longer supported as a database engine. Converting your old metadata db can be done using an older Garage binary (e.g. v0.9.4).".into())),
kind => Err(Error(
format!(
"Invalid DB engine: {} (options are: lmdb, sqlite)",
"Invalid DB engine: {} (options are: lmdb, sqlite, fjall)",
kind
)
.into(),
@ -51,6 +71,7 @@ impl std::str::FromStr for Engine {
pub struct OpenOpt {
pub fsync: bool,
pub lmdb_map_size: Option<usize>,
pub fjall_block_cache_size: Option<usize>,
}
impl Default for OpenOpt {
@ -58,6 +79,7 @@ impl Default for OpenOpt {
Self {
fsync: false,
lmdb_map_size: None,
fjall_block_cache_size: None,
}
}
}
@ -66,53 +88,15 @@ pub fn open_db(path: &PathBuf, engine: Engine, opt: &OpenOpt) -> Result<Db> {
match engine {
// ---- Sqlite DB ----
#[cfg(feature = "sqlite")]
Engine::Sqlite => {
info!("Opening Sqlite database at: {}", path.display());
let manager = r2d2_sqlite::SqliteConnectionManager::file(path);
Ok(crate::sqlite_adapter::SqliteDb::new(manager, opt.fsync)?)
}
Engine::Sqlite => crate::sqlite_adapter::open_db(path, opt),
// ---- LMDB DB ----
#[cfg(feature = "lmdb")]
Engine::Lmdb => {
info!("Opening LMDB database at: {}", path.display());
if let Err(e) = std::fs::create_dir_all(&path) {
return Err(Error(
format!("Unable to create LMDB data directory: {}", e).into(),
));
}
Engine::Lmdb => crate::lmdb_adapter::open_db(path, opt),
let map_size = match opt.lmdb_map_size {
None => crate::lmdb_adapter::recommended_map_size(),
Some(v) => v - (v % 4096),
};
let mut env_builder = heed::EnvOpenOptions::new();
env_builder.max_dbs(100);
env_builder.map_size(map_size);
env_builder.max_readers(2048);
unsafe {
env_builder.flag(crate::lmdb_adapter::heed::flags::Flags::MdbNoRdAhead);
env_builder.flag(crate::lmdb_adapter::heed::flags::Flags::MdbNoMetaSync);
if !opt.fsync {
env_builder.flag(heed::flags::Flags::MdbNoSync);
}
}
match env_builder.open(&path) {
Err(heed::Error::Io(e)) if e.kind() == std::io::ErrorKind::OutOfMemory => {
return Err(Error(
"OutOfMemory error while trying to open LMDB database. This can happen \
if your operating system is not allowing you to use sufficient virtual \
memory address space. Please check that no limit is set (ulimit -v). \
You may also try to set a smaller `lmdb_map_size` configuration parameter. \
On 32-bit machines, you should probably switch to another database engine."
.into(),
))
}
Err(e) => Err(Error(format!("Cannot open LMDB database: {}", e).into())),
Ok(db) => Ok(crate::lmdb_adapter::LmdbDb::init(db)),
}
}
// ---- Fjall DB ----
#[cfg(feature = "fjall")]
Engine::Fjall => crate::fjall_adapter::open_db(path, opt),
// Pattern is unreachable when all supported DB engines are compiled into binary. The allow
// attribute is added so that we won't have to change this match in case stop building

View file

@ -11,12 +11,23 @@ use r2d2_sqlite::SqliteConnectionManager;
use rusqlite::{params, Rows, Statement, Transaction};
use crate::{
open::{Engine, OpenOpt},
Db, Error, IDb, ITx, ITxFn, OnCommit, Result, TxError, TxFnResult, TxOpError, TxOpResult,
TxResult, TxValueIter, Value, ValueIter,
};
pub use rusqlite;
// ---- top-level open function
pub(crate) fn open_db(path: &PathBuf, opt: &OpenOpt) -> Result<Db> {
info!("Opening Sqlite database at: {}", path.display());
let manager = r2d2_sqlite::SqliteConnectionManager::file(path);
Ok(SqliteDb::new(manager, opt.fsync)?)
}
// ----
type Connection = r2d2::PooledConnection<SqliteConnectionManager>;
// --- err
@ -139,17 +150,18 @@ impl IDb for SqliteDb {
Ok(trees)
}
fn snapshot(&self, to: &PathBuf) -> Result<()> {
fn progress(p: rusqlite::backup::Progress) {
let percent = (p.pagecount - p.remaining) * 100 / p.pagecount;
info!("Sqlite snapshot progress: {}%", percent);
}
std::fs::create_dir_all(to)?;
let mut path = to.clone();
path.push("db.sqlite");
self.db
.get()?
.backup(rusqlite::DatabaseName::Main, path, Some(progress))?;
fn snapshot(&self, base_path: &PathBuf) -> Result<()> {
std::fs::create_dir_all(base_path)?;
let path = Engine::Sqlite
.db_path(&base_path)
.into_os_string()
.into_string()
.map_err(|_| Error("invalid sqlite path string".into()))?;
info!("Start sqlite VACUUM INTO `{}`", path);
self.db.get()?.execute("VACUUM INTO ?1", params![path])?;
info!("Finished sqlite VACUUM INTO `{}`", path);
Ok(())
}
@ -160,7 +172,7 @@ impl IDb for SqliteDb {
self.internal_get(&self.db.get()?, &tree, key)
}
fn len(&self, tree: usize) -> Result<usize> {
fn approximate_len(&self, tree: usize) -> Result<usize> {
let tree = self.get_tree(tree)?;
let db = self.db.get()?;
@ -172,6 +184,10 @@ impl IDb for SqliteDb {
}
}
fn is_empty(&self, tree: usize) -> Result<bool> {
Ok(self.approximate_len(tree)? == 0)
}
fn insert(&self, tree: usize, key: &[u8], value: &[u8]) -> Result<()> {
let tree = self.get_tree(tree)?;
let db = self.db.get()?;

View file

@ -1,7 +1,7 @@
use crate::*;
fn test_suite(db: Db) {
let tree = db.open_tree("tree").unwrap();
let tree = db.open_tree("tree:this_is_a_tree").unwrap();
let ka: &[u8] = &b"test"[..];
let kb: &[u8] = &b"zwello"[..];
@ -14,7 +14,7 @@ fn test_suite(db: Db) {
assert!(tree.insert(ka, va).is_ok());
assert_eq!(tree.get(ka).unwrap().unwrap(), va);
assert_eq!(tree.len().unwrap(), 1);
assert_eq!(tree.iter().unwrap().count(), 1);
// ---- test transaction logic ----
@ -148,3 +148,15 @@ fn test_sqlite_db() {
let db = SqliteDb::new(manager, false).unwrap();
test_suite(db);
}
#[test]
#[cfg(feature = "fjall")]
fn test_fjall_db() {
use crate::fjall_adapter::{fjall, FjallDb};
let path = mktemp::Temp::new_dir().unwrap();
let config = fjall::Config::new(path).temporary(true);
let keyspace = config.open_transactional().unwrap();
let db = FjallDb::init(keyspace);
test_suite(db);
}

View file

@ -1,6 +1,6 @@
[package]
name = "garage"
version = "2.0.0"
version = "1.3.1"
authors = ["Alex Auvolat <alex@adnab.me>"]
edition = "2018"
license = "AGPL-3.0"
@ -26,7 +26,6 @@ garage_db.workspace = true
garage_api_admin.workspace = true
garage_api_s3.workspace = true
garage_api_k2v = { workspace = true, optional = true }
garage_api_common.workspace = true
garage_block.workspace = true
garage_model.workspace = true
garage_net.workspace = true
@ -38,7 +37,6 @@ garage_web.workspace = true
backtrace.workspace = true
bytes.workspace = true
bytesize.workspace = true
chrono.workspace = true
timeago.workspace = true
parse_duration.workspace = true
hex.workspace = true
@ -49,8 +47,8 @@ sha1.workspace = true
sodiumoxide.workspace = true
structopt.workspace = true
git-version.workspace = true
utoipa.workspace = true
serde_json.workspace = true
serde.workspace = true
futures.workspace = true
tokio.workspace = true
@ -59,6 +57,7 @@ opentelemetry.workspace = true
opentelemetry-prometheus = { workspace = true, optional = true }
opentelemetry-otlp = { workspace = true, optional = true }
syslog-tracing = { workspace = true, optional = true }
tracing-journald = { workspace = true, optional = true }
[dev-dependencies]
garage_api_common.workspace = true
@ -87,11 +86,12 @@ k2v-client.workspace = true
[features]
default = [ "bundled-libs", "metrics", "lmdb", "sqlite", "k2v" ]
k2v = [ "garage_util/k2v", "garage_api_k2v", "garage_api_admin/k2v" ]
k2v = [ "garage_util/k2v", "garage_api_k2v" ]
# Database engines
lmdb = [ "garage_model/lmdb" ]
sqlite = [ "garage_model/sqlite" ]
fjall = [ "garage_model/fjall" ]
# Automatic registration and discovery via Consul API
consul-discovery = [ "garage_rpc/consul-discovery" ]
@ -103,6 +103,8 @@ metrics = [ "garage_api_admin/metrics", "opentelemetry-prometheus" ]
telemetry-otlp = [ "opentelemetry-otlp" ]
# Logging to syslog
syslog = [ "syslog-tracing" ]
# Logging to journald
journald = [ "tracing-journald" ]
# NOTE: bundled-libs and system-libs should be treat as mutually exclusive;
# exactly one of them should be enabled.

243
src/garage/admin/block.rs Normal file
View file

@ -0,0 +1,243 @@
use garage_util::data::*;
use garage_table::*;
use garage_model::helper::error::{Error, OkOrBadRequest};
use garage_model::s3::object_table::*;
use garage_model::s3::version_table::*;
use crate::cli::*;
use super::*;
impl AdminRpcHandler {
pub(super) async fn handle_block_cmd(&self, cmd: &BlockOperation) -> Result<AdminRpc, Error> {
match cmd {
BlockOperation::ListErrors => Ok(AdminRpc::BlockErrorList(
self.garage.block_manager.list_resync_errors()?,
)),
BlockOperation::Info { hash } => self.handle_block_info(hash).await,
BlockOperation::RetryNow { all, blocks } => {
self.handle_block_retry_now(*all, blocks).await
}
BlockOperation::Purge { yes, blocks } => self.handle_block_purge(*yes, blocks).await,
}
}
async fn handle_block_info(&self, hash: &String) -> Result<AdminRpc, Error> {
let hash = self.find_block_hash_by_prefix(hash)?;
let refcount = self.garage.block_manager.get_block_rc(&hash)?;
let block_refs = self
.garage
.block_ref_table
.get_range(&hash, None, None, 10000, Default::default())
.await?;
let mut versions = vec![];
let mut uploads = vec![];
for br in block_refs {
if let Some(v) = self
.garage
.version_table
.get(&br.version, &EmptyKey)
.await?
{
if let VersionBacklink::MultipartUpload { upload_id } = &v.backlink {
if let Some(u) = self.garage.mpu_table.get(upload_id, &EmptyKey).await? {
uploads.push(u);
}
}
versions.push(Ok(v));
} else {
versions.push(Err(br.version));
}
}
Ok(AdminRpc::BlockInfo {
hash,
refcount,
versions,
uploads,
})
}
async fn handle_block_retry_now(
&self,
all: bool,
blocks: &[String],
) -> Result<AdminRpc, Error> {
if all {
if !blocks.is_empty() {
return Err(Error::BadRequest(
"--all was specified, cannot also specify blocks".into(),
));
}
let blocks = self.garage.block_manager.list_resync_errors()?;
for b in blocks.iter() {
self.garage.block_manager.resync.clear_backoff(&b.hash)?;
}
Ok(AdminRpc::Ok(format!(
"{} blocks returned in queue for a retry now (check logs to see results)",
blocks.len()
)))
} else {
for hash in blocks {
let hash = hex::decode(hash).ok_or_bad_request("invalid hash")?;
let hash = Hash::try_from(&hash).ok_or_bad_request("invalid hash")?;
self.garage.block_manager.resync.clear_backoff(&hash)?;
}
Ok(AdminRpc::Ok(format!(
"{} blocks returned in queue for a retry now (check logs to see results)",
blocks.len()
)))
}
}
async fn handle_block_purge(&self, yes: bool, blocks: &[String]) -> Result<AdminRpc, Error> {
if !yes {
return Err(Error::BadRequest(
"Pass the --yes flag to confirm block purge operation.".into(),
));
}
let mut obj_dels = 0;
let mut mpu_dels = 0;
let mut ver_dels = 0;
let mut br_dels = 0;
for hash in blocks {
let hash = hex::decode(hash).ok_or_bad_request("invalid hash")?;
let hash = Hash::try_from(&hash).ok_or_bad_request("invalid hash")?;
let block_refs = self
.garage
.block_ref_table
.get_range(&hash, None, None, 10000, Default::default())
.await?;
for br in block_refs {
if let Some(version) = self
.garage
.version_table
.get(&br.version, &EmptyKey)
.await?
{
self.handle_block_purge_version_backlink(
&version,
&mut obj_dels,
&mut mpu_dels,
)
.await?;
if !version.deleted.get() {
let deleted_version = Version::new(version.uuid, version.backlink, true);
self.garage.version_table.insert(&deleted_version).await?;
ver_dels += 1;
}
}
if !br.deleted.get() {
let mut br = br;
br.deleted.set();
self.garage.block_ref_table.insert(&br).await?;
br_dels += 1;
}
}
}
Ok(AdminRpc::Ok(format!(
"Purged {} blocks: marked {} block refs, {} versions, {} objects and {} multipart uploads as deleted",
blocks.len(),
br_dels,
ver_dels,
obj_dels,
mpu_dels,
)))
}
async fn handle_block_purge_version_backlink(
&self,
version: &Version,
obj_dels: &mut usize,
mpu_dels: &mut usize,
) -> Result<(), Error> {
let (bucket_id, key, ov_id) = match &version.backlink {
VersionBacklink::Object { bucket_id, key } => (*bucket_id, key.clone(), version.uuid),
VersionBacklink::MultipartUpload { upload_id } => {
if let Some(mut mpu) = self.garage.mpu_table.get(upload_id, &EmptyKey).await? {
if !mpu.deleted.get() {
mpu.parts.clear();
mpu.deleted.set();
self.garage.mpu_table.insert(&mpu).await?;
*mpu_dels += 1;
}
(mpu.bucket_id, mpu.key.clone(), *upload_id)
} else {
return Ok(());
}
}
};
if let Some(object) = self.garage.object_table.get(&bucket_id, &key).await? {
let ov = object.versions().iter().rev().find(|v| v.is_complete());
if let Some(ov) = ov {
if ov.uuid == ov_id {
let del_uuid = gen_uuid();
let deleted_object = Object::new(
bucket_id,
key,
vec![ObjectVersion {
uuid: del_uuid,
timestamp: ov.timestamp + 1,
state: ObjectVersionState::Complete(ObjectVersionData::DeleteMarker),
}],
);
self.garage.object_table.insert(&deleted_object).await?;
*obj_dels += 1;
}
}
}
Ok(())
}
// ---- helper function ----
fn find_block_hash_by_prefix(&self, prefix: &str) -> Result<Hash, Error> {
if prefix.len() < 4 {
return Err(Error::BadRequest(
"Please specify at least 4 characters of the block hash".into(),
));
}
let prefix_bin =
hex::decode(&prefix[..prefix.len() & !1]).ok_or_bad_request("invalid hash")?;
let iter = self
.garage
.block_ref_table
.data
.store
.range(&prefix_bin[..]..)
.map_err(GarageError::from)?;
let mut found = None;
for item in iter {
let (k, _v) = item.map_err(GarageError::from)?;
let hash = Hash::try_from(&k[..32]).unwrap();
if &hash.as_slice()[..prefix_bin.len()] != prefix_bin {
break;
}
if hex::encode(hash.as_slice()).starts_with(prefix) {
match &found {
Some(x) if *x == hash => (),
Some(_) => {
return Err(Error::BadRequest(format!(
"Several blocks match prefix `{}`",
prefix
)));
}
None => {
found = Some(hash);
}
}
}
}
found.ok_or_else(|| Error::BadRequest("No matching block found".into()))
}
}

500
src/garage/admin/bucket.rs Normal file
View file

@ -0,0 +1,500 @@
use std::collections::HashMap;
use std::fmt::Write;
use garage_util::crdt::*;
use garage_util::time::*;
use garage_table::*;
use garage_model::bucket_alias_table::*;
use garage_model::bucket_table::*;
use garage_model::helper::error::{Error, OkOrBadRequest};
use garage_model::permission::*;
use crate::cli::*;
use super::*;
impl AdminRpcHandler {
pub(super) async fn handle_bucket_cmd(&self, cmd: &BucketOperation) -> Result<AdminRpc, Error> {
match cmd {
BucketOperation::List => self.handle_list_buckets().await,
BucketOperation::Info(query) => self.handle_bucket_info(query).await,
BucketOperation::Create(query) => self.handle_create_bucket(&query.name).await,
BucketOperation::Delete(query) => self.handle_delete_bucket(query).await,
BucketOperation::Alias(query) => self.handle_alias_bucket(query).await,
BucketOperation::Unalias(query) => self.handle_unalias_bucket(query).await,
BucketOperation::Allow(query) => self.handle_bucket_allow(query).await,
BucketOperation::Deny(query) => self.handle_bucket_deny(query).await,
BucketOperation::Website(query) => self.handle_bucket_website(query).await,
BucketOperation::SetQuotas(query) => self.handle_bucket_set_quotas(query).await,
BucketOperation::CleanupIncompleteUploads(query) => {
self.handle_bucket_cleanup_incomplete_uploads(query).await
}
}
}
async fn handle_list_buckets(&self) -> Result<AdminRpc, Error> {
let buckets = self
.garage
.bucket_table
.get_range(
&EmptyKey,
None,
Some(DeletedFilter::NotDeleted),
10000,
EnumerationOrder::Forward,
)
.await?;
Ok(AdminRpc::BucketList(buckets))
}
async fn handle_bucket_info(&self, query: &BucketOpt) -> Result<AdminRpc, Error> {
let bucket_id = self
.garage
.bucket_helper()
.admin_get_existing_matching_bucket(&query.name)
.await?;
let bucket = self
.garage
.bucket_helper()
.get_existing_bucket(bucket_id)
.await?;
let counters = self
.garage
.object_counter_table
.table
.get(&bucket_id, &EmptyKey)
.await?
.map(|x| x.filtered_values(&self.garage.system.cluster_layout()))
.unwrap_or_default();
let mpu_counters = self
.garage
.mpu_counter_table
.table
.get(&bucket_id, &EmptyKey)
.await?
.map(|x| x.filtered_values(&self.garage.system.cluster_layout()))
.unwrap_or_default();
let mut relevant_keys = HashMap::new();
for (k, _) in bucket
.state
.as_option()
.unwrap()
.authorized_keys
.items()
.iter()
{
if let Some(key) = self
.garage
.key_table
.get(&EmptyKey, k)
.await?
.filter(|k| !k.is_deleted())
{
relevant_keys.insert(k.clone(), key);
}
}
for ((k, _), _, _) in bucket
.state
.as_option()
.unwrap()
.local_aliases
.items()
.iter()
{
if relevant_keys.contains_key(k) {
continue;
}
if let Some(key) = self.garage.key_table.get(&EmptyKey, k).await? {
relevant_keys.insert(k.clone(), key);
}
}
Ok(AdminRpc::BucketInfo {
bucket,
relevant_keys,
counters,
mpu_counters,
})
}
#[allow(clippy::ptr_arg)]
async fn handle_create_bucket(&self, name: &String) -> Result<AdminRpc, Error> {
if !is_valid_bucket_name(name, self.garage.config.allow_punycode) {
return Err(Error::BadRequest(format!(
"{}: {}",
name, INVALID_BUCKET_NAME_MESSAGE
)));
}
let helper = self.garage.locked_helper().await;
if let Some(alias) = self.garage.bucket_alias_table.get(&EmptyKey, name).await? {
if alias.state.get().is_some() {
return Err(Error::BadRequest(format!("Bucket {} already exists", name)));
}
}
// ---- done checking, now commit ----
let bucket = Bucket::new();
self.garage.bucket_table.insert(&bucket).await?;
helper.set_global_bucket_alias(bucket.id, name).await?;
Ok(AdminRpc::Ok(format!("Bucket {} was created.", name)))
}
async fn handle_delete_bucket(&self, query: &DeleteBucketOpt) -> Result<AdminRpc, Error> {
let helper = self.garage.locked_helper().await;
let bucket_id = helper
.bucket()
.admin_get_existing_matching_bucket(&query.name)
.await?;
// Get the alias, but keep in minde here the bucket name
// given in parameter can also be directly the bucket's ID.
// In that case bucket_alias will be None, and
// we can still delete the bucket if it has zero aliases
// (a condition which we try to prevent but that could still happen somehow).
// We just won't try to delete an alias entry because there isn't one.
let bucket_alias = self
.garage
.bucket_alias_table
.get(&EmptyKey, &query.name)
.await?;
// Check bucket doesn't have other aliases
let mut bucket = helper.bucket().get_existing_bucket(bucket_id).await?;
let bucket_state = bucket.state.as_option().unwrap();
if bucket_state
.aliases
.items()
.iter()
.filter(|(_, _, active)| *active)
.any(|(name, _, _)| name != &query.name)
{
return Err(Error::BadRequest(format!("Bucket {} still has other global aliases. Use `bucket unalias` to delete them one by one.", query.name)));
}
if bucket_state
.local_aliases
.items()
.iter()
.any(|(_, _, active)| *active)
{
return Err(Error::BadRequest(format!("Bucket {} still has other local aliases. Use `bucket unalias` to delete them one by one.", query.name)));
}
// Check bucket is empty
if !helper.bucket().is_bucket_empty(bucket_id).await? {
return Err(Error::BadRequest(format!(
"Bucket {} is not empty",
query.name
)));
}
if !query.yes {
return Err(Error::BadRequest(
"Add --yes flag to really perform this operation".to_string(),
));
}
// --- done checking, now commit ---
// 1. delete authorization from keys that had access
for (key_id, _) in bucket.authorized_keys() {
helper
.set_bucket_key_permissions(bucket.id, key_id, BucketKeyPerm::NO_PERMISSIONS)
.await?;
}
// 2. delete bucket alias
if bucket_alias.is_some() {
helper
.purge_global_bucket_alias(bucket_id, &query.name)
.await?;
}
// 3. delete bucket
bucket.state = Deletable::delete();
self.garage.bucket_table.insert(&bucket).await?;
Ok(AdminRpc::Ok(format!("Bucket {} was deleted.", query.name)))
}
async fn handle_alias_bucket(&self, query: &AliasBucketOpt) -> Result<AdminRpc, Error> {
let helper = self.garage.locked_helper().await;
let bucket_id = helper
.bucket()
.admin_get_existing_matching_bucket(&query.existing_bucket)
.await?;
if let Some(key_pattern) = &query.local {
let key = helper.key().get_existing_matching_key(key_pattern).await?;
helper
.set_local_bucket_alias(bucket_id, &key.key_id, &query.new_name)
.await?;
Ok(AdminRpc::Ok(format!(
"Alias {} now points to bucket {:?} in namespace of key {}",
query.new_name, bucket_id, key.key_id
)))
} else {
helper
.set_global_bucket_alias(bucket_id, &query.new_name)
.await?;
Ok(AdminRpc::Ok(format!(
"Alias {} now points to bucket {:?}",
query.new_name, bucket_id
)))
}
}
async fn handle_unalias_bucket(&self, query: &UnaliasBucketOpt) -> Result<AdminRpc, Error> {
let helper = self.garage.locked_helper().await;
if let Some(key_pattern) = &query.local {
let key = helper.key().get_existing_matching_key(key_pattern).await?;
let bucket_id = key
.state
.as_option()
.unwrap()
.local_aliases
.get(&query.name)
.cloned()
.flatten()
.ok_or_bad_request("Bucket not found")?;
helper
.unset_local_bucket_alias(bucket_id, &key.key_id, &query.name)
.await?;
Ok(AdminRpc::Ok(format!(
"Alias {} no longer points to bucket {:?} in namespace of key {}",
&query.name, bucket_id, key.key_id
)))
} else {
let bucket_id = helper
.bucket()
.resolve_global_bucket_name(&query.name)
.await?
.ok_or_bad_request("Bucket not found")?;
helper
.unset_global_bucket_alias(bucket_id, &query.name)
.await?;
Ok(AdminRpc::Ok(format!(
"Alias {} no longer points to bucket {:?}",
&query.name, bucket_id
)))
}
}
async fn handle_bucket_allow(&self, query: &PermBucketOpt) -> Result<AdminRpc, Error> {
let helper = self.garage.locked_helper().await;
let bucket_id = helper
.bucket()
.admin_get_existing_matching_bucket(&query.bucket)
.await?;
let key = helper
.key()
.get_existing_matching_key(&query.key_pattern)
.await?;
let allow_read = query.read || key.allow_read(&bucket_id);
let allow_write = query.write || key.allow_write(&bucket_id);
let allow_owner = query.owner || key.allow_owner(&bucket_id);
helper
.set_bucket_key_permissions(
bucket_id,
&key.key_id,
BucketKeyPerm {
timestamp: now_msec(),
allow_read,
allow_write,
allow_owner,
},
)
.await?;
Ok(AdminRpc::Ok(format!(
"New permissions for {} on {}: read {}, write {}, owner {}.",
&key.key_id, &query.bucket, allow_read, allow_write, allow_owner
)))
}
async fn handle_bucket_deny(&self, query: &PermBucketOpt) -> Result<AdminRpc, Error> {
let helper = self.garage.locked_helper().await;
let bucket_id = helper
.bucket()
.admin_get_existing_matching_bucket(&query.bucket)
.await?;
let key = helper
.key()
.get_existing_matching_key(&query.key_pattern)
.await?;
let allow_read = !query.read && key.allow_read(&bucket_id);
let allow_write = !query.write && key.allow_write(&bucket_id);
let allow_owner = !query.owner && key.allow_owner(&bucket_id);
helper
.set_bucket_key_permissions(
bucket_id,
&key.key_id,
BucketKeyPerm {
timestamp: now_msec(),
allow_read,
allow_write,
allow_owner,
},
)
.await?;
Ok(AdminRpc::Ok(format!(
"New permissions for {} on {}: read {}, write {}, owner {}.",
&key.key_id, &query.bucket, allow_read, allow_write, allow_owner
)))
}
async fn handle_bucket_website(&self, query: &WebsiteOpt) -> Result<AdminRpc, Error> {
let bucket_id = self
.garage
.bucket_helper()
.admin_get_existing_matching_bucket(&query.bucket)
.await?;
let mut bucket = self
.garage
.bucket_helper()
.get_existing_bucket(bucket_id)
.await?;
let bucket_state = bucket.state.as_option_mut().unwrap();
if !(query.allow ^ query.deny) {
return Err(Error::BadRequest(
"You must specify exactly one flag, either --allow or --deny".to_string(),
));
}
let website = if query.allow {
Some(WebsiteConfig {
index_document: query.index_document.clone(),
error_document: query.error_document.clone(),
})
} else {
None
};
bucket_state.website_config.update(website);
self.garage.bucket_table.insert(&bucket).await?;
let msg = if query.allow {
format!("Website access allowed for {}", &query.bucket)
} else {
format!("Website access denied for {}", &query.bucket)
};
Ok(AdminRpc::Ok(msg))
}
async fn handle_bucket_set_quotas(&self, query: &SetQuotasOpt) -> Result<AdminRpc, Error> {
let bucket_id = self
.garage
.bucket_helper()
.admin_get_existing_matching_bucket(&query.bucket)
.await?;
let mut bucket = self
.garage
.bucket_helper()
.get_existing_bucket(bucket_id)
.await?;
let bucket_state = bucket.state.as_option_mut().unwrap();
if query.max_size.is_none() && query.max_objects.is_none() {
return Err(Error::BadRequest(
"You must specify either --max-size or --max-objects (or both) for this command to do something.".to_string(),
));
}
let mut quotas = bucket_state.quotas.get().clone();
match query.max_size.as_ref().map(String::as_ref) {
Some("none") => quotas.max_size = None,
Some(v) => {
let bs = v
.parse::<bytesize::ByteSize>()
.ok_or_bad_request(format!("Invalid size specified: {}", v))?;
quotas.max_size = Some(bs.as_u64());
}
_ => (),
}
match query.max_objects.as_ref().map(String::as_ref) {
Some("none") => quotas.max_objects = None,
Some(v) => {
let mo = v
.parse::<u64>()
.ok_or_bad_request(format!("Invalid number specified: {}", v))?;
quotas.max_objects = Some(mo);
}
_ => (),
}
bucket_state.quotas.update(quotas);
self.garage.bucket_table.insert(&bucket).await?;
Ok(AdminRpc::Ok(format!(
"Quotas updated for {}",
&query.bucket
)))
}
async fn handle_bucket_cleanup_incomplete_uploads(
&self,
query: &CleanupIncompleteUploadsOpt,
) -> Result<AdminRpc, Error> {
let mut bucket_ids = vec![];
for b in query.buckets.iter() {
bucket_ids.push(
self.garage
.bucket_helper()
.admin_get_existing_matching_bucket(b)
.await?,
);
}
let duration = parse_duration::parse::parse(&query.older_than)
.ok_or_bad_request("Invalid duration passed for --older-than parameter")?;
let mut ret = String::new();
for bucket in bucket_ids {
let count = self
.garage
.bucket_helper()
.cleanup_incomplete_uploads(&bucket, duration)
.await?;
writeln!(
&mut ret,
"Bucket {:?}: {} incomplete uploads aborted",
bucket, count
)
.unwrap();
}
Ok(AdminRpc::Ok(ret))
}
}

161
src/garage/admin/key.rs Normal file
View file

@ -0,0 +1,161 @@
use std::collections::HashMap;
use garage_table::*;
use garage_model::helper::error::*;
use garage_model::key_table::*;
use crate::cli::*;
use super::*;
impl AdminRpcHandler {
pub(super) async fn handle_key_cmd(&self, cmd: &KeyOperation) -> Result<AdminRpc, Error> {
match cmd {
KeyOperation::List => self.handle_list_keys().await,
KeyOperation::Info(query) => self.handle_key_info(query).await,
KeyOperation::Create(query) => self.handle_create_key(query).await,
KeyOperation::Rename(query) => self.handle_rename_key(query).await,
KeyOperation::Delete(query) => self.handle_delete_key(query).await,
KeyOperation::Allow(query) => self.handle_allow_key(query).await,
KeyOperation::Deny(query) => self.handle_deny_key(query).await,
KeyOperation::Import(query) => self.handle_import_key(query).await,
}
}
async fn handle_list_keys(&self) -> Result<AdminRpc, Error> {
let key_ids = self
.garage
.key_table
.get_range(
&EmptyKey,
None,
Some(KeyFilter::Deleted(DeletedFilter::NotDeleted)),
10000,
EnumerationOrder::Forward,
)
.await?
.iter()
.map(|k| (k.key_id.to_string(), k.params().unwrap().name.get().clone()))
.collect::<Vec<_>>();
Ok(AdminRpc::KeyList(key_ids))
}
async fn handle_key_info(&self, query: &KeyInfoOpt) -> Result<AdminRpc, Error> {
let mut key = self
.garage
.key_helper()
.get_existing_matching_key(&query.key_pattern)
.await?;
if !query.show_secret {
key.state.as_option_mut().unwrap().secret_key = "(redacted)".into();
}
self.key_info_result(key).await
}
async fn handle_create_key(&self, query: &KeyNewOpt) -> Result<AdminRpc, Error> {
let key = Key::new(&query.name);
self.garage.key_table.insert(&key).await?;
self.key_info_result(key).await
}
async fn handle_rename_key(&self, query: &KeyRenameOpt) -> Result<AdminRpc, Error> {
let mut key = self
.garage
.key_helper()
.get_existing_matching_key(&query.key_pattern)
.await?;
key.params_mut()
.unwrap()
.name
.update(query.new_name.clone());
self.garage.key_table.insert(&key).await?;
self.key_info_result(key).await
}
async fn handle_delete_key(&self, query: &KeyDeleteOpt) -> Result<AdminRpc, Error> {
let helper = self.garage.locked_helper().await;
let mut key = helper
.key()
.get_existing_matching_key(&query.key_pattern)
.await?;
if !query.yes {
return Err(Error::BadRequest(
"Add --yes flag to really perform this operation".to_string(),
));
}
helper.delete_key(&mut key).await?;
Ok(AdminRpc::Ok(format!(
"Key {} was deleted successfully.",
key.key_id
)))
}
async fn handle_allow_key(&self, query: &KeyPermOpt) -> Result<AdminRpc, Error> {
let mut key = self
.garage
.key_helper()
.get_existing_matching_key(&query.key_pattern)
.await?;
if query.create_bucket {
key.params_mut().unwrap().allow_create_bucket.update(true);
}
self.garage.key_table.insert(&key).await?;
self.key_info_result(key).await
}
async fn handle_deny_key(&self, query: &KeyPermOpt) -> Result<AdminRpc, Error> {
let mut key = self
.garage
.key_helper()
.get_existing_matching_key(&query.key_pattern)
.await?;
if query.create_bucket {
key.params_mut().unwrap().allow_create_bucket.update(false);
}
self.garage.key_table.insert(&key).await?;
self.key_info_result(key).await
}
async fn handle_import_key(&self, query: &KeyImportOpt) -> Result<AdminRpc, Error> {
if !query.yes {
return Err(Error::BadRequest("This command is intended to re-import keys that were previously generated by Garage. If you want to create a new key, use `garage key new` instead. Add the --yes flag if you really want to re-import a key.".to_string()));
}
let prev_key = self.garage.key_table.get(&EmptyKey, &query.key_id).await?;
if prev_key.is_some() {
return Err(Error::BadRequest(format!("Key {} already exists in data store. Even if it is deleted, we can't let you create a new key with the same ID. Sorry.", query.key_id)));
}
let imported_key = Key::import(&query.key_id, &query.secret_key, &query.name)
.ok_or_bad_request("Invalid key format")?;
self.garage.key_table.insert(&imported_key).await?;
self.key_info_result(imported_key).await
}
async fn key_info_result(&self, key: Key) -> Result<AdminRpc, Error> {
let mut relevant_buckets = HashMap::new();
for (id, _) in key
.state
.as_option()
.unwrap()
.authorized_buckets
.items()
.iter()
{
if let Some(b) = self.garage.bucket_table.get(&EmptyKey, id).await? {
relevant_buckets.insert(*id, b);
}
}
Ok(AdminRpc::KeyInfo(key, relevant_buckets))
}
}

545
src/garage/admin/mod.rs Normal file
View file

@ -0,0 +1,545 @@
mod block;
mod bucket;
mod key;
use std::collections::HashMap;
use std::fmt::Write;
use std::future::Future;
use std::sync::Arc;
use futures::future::FutureExt;
use serde::{Deserialize, Serialize};
use format_table::format_table_to_string;
use garage_util::background::BackgroundRunner;
use garage_util::data::*;
use garage_util::error::Error as GarageError;
use garage_table::replication::*;
use garage_table::*;
use garage_rpc::layout::PARTITION_BITS;
use garage_rpc::*;
use garage_block::manager::BlockResyncErrorInfo;
use garage_model::bucket_table::*;
use garage_model::garage::Garage;
use garage_model::helper::error::{Error, OkOrBadRequest};
use garage_model::key_table::*;
use garage_model::s3::mpu_table::MultipartUpload;
use garage_model::s3::version_table::Version;
use crate::cli::*;
use crate::repair::online::launch_online_repair;
pub const ADMIN_RPC_PATH: &str = "garage/admin_rpc.rs/Rpc";
#[derive(Debug, Serialize, Deserialize)]
#[allow(clippy::large_enum_variant)]
pub enum AdminRpc {
BucketOperation(BucketOperation),
KeyOperation(KeyOperation),
LaunchRepair(RepairOpt),
Stats(StatsOpt),
Worker(WorkerOperation),
BlockOperation(BlockOperation),
MetaOperation(MetaOperation),
// Replies
Ok(String),
BucketList(Vec<Bucket>),
BucketInfo {
bucket: Bucket,
relevant_keys: HashMap<String, Key>,
counters: HashMap<String, i64>,
mpu_counters: HashMap<String, i64>,
},
KeyList(Vec<(String, String)>),
KeyInfo(Key, HashMap<Uuid, Bucket>),
WorkerList(
HashMap<usize, garage_util::background::WorkerInfo>,
WorkerListOpt,
),
WorkerVars(Vec<(Uuid, String, String)>),
WorkerInfo(usize, garage_util::background::WorkerInfo),
BlockErrorList(Vec<BlockResyncErrorInfo>),
BlockInfo {
hash: Hash,
refcount: u64,
versions: Vec<Result<Version, Uuid>>,
uploads: Vec<MultipartUpload>,
},
}
impl Rpc for AdminRpc {
type Response = Result<AdminRpc, Error>;
}
pub struct AdminRpcHandler {
garage: Arc<Garage>,
background: Arc<BackgroundRunner>,
endpoint: Arc<Endpoint<AdminRpc, Self>>,
}
impl AdminRpcHandler {
pub fn new(garage: Arc<Garage>, background: Arc<BackgroundRunner>) -> Arc<Self> {
let endpoint = garage.system.netapp.endpoint(ADMIN_RPC_PATH.into());
let admin = Arc::new(Self {
garage,
background,
endpoint,
});
admin.endpoint.set_handler(admin.clone());
admin
}
// ================ REPAIR COMMANDS ====================
async fn handle_launch_repair(self: &Arc<Self>, opt: RepairOpt) -> Result<AdminRpc, Error> {
if !opt.yes {
return Err(Error::BadRequest(
"Please provide the --yes flag to initiate repair operations.".to_string(),
));
}
if opt.all_nodes {
let mut opt_to_send = opt.clone();
opt_to_send.all_nodes = false;
let mut failures = vec![];
let all_nodes = self.garage.system.cluster_layout().all_nodes().to_vec();
for node in all_nodes.iter() {
let node = (*node).into();
let resp = self
.endpoint
.call(
&node,
AdminRpc::LaunchRepair(opt_to_send.clone()),
PRIO_NORMAL,
)
.await;
if !matches!(resp, Ok(Ok(_))) {
failures.push(node);
}
}
if failures.is_empty() {
Ok(AdminRpc::Ok("Repair launched on all nodes".to_string()))
} else {
Err(Error::BadRequest(format!(
"Could not launch repair on nodes: {:?} (launched successfully on other nodes)",
failures
)))
}
} else {
launch_online_repair(&self.garage, &self.background, opt).await?;
Ok(AdminRpc::Ok(format!(
"Repair launched on {:?}",
self.garage.system.id
)))
}
}
// ================ STATS COMMANDS ====================
async fn handle_stats(&self, opt: StatsOpt) -> Result<AdminRpc, Error> {
if opt.all_nodes {
let mut ret = String::new();
let mut all_nodes = self.garage.system.cluster_layout().all_nodes().to_vec();
for node in self.garage.system.get_known_nodes().iter() {
if node.is_up && !all_nodes.contains(&node.id) {
all_nodes.push(node.id);
}
}
for node in all_nodes.iter() {
let mut opt = opt.clone();
opt.all_nodes = false;
opt.skip_global = true;
writeln!(&mut ret, "\n======================").unwrap();
writeln!(&mut ret, "Stats for node {:?}:", node).unwrap();
let node_id = (*node).into();
match self
.endpoint
.call(&node_id, AdminRpc::Stats(opt), PRIO_NORMAL)
.await
{
Ok(Ok(AdminRpc::Ok(s))) => writeln!(&mut ret, "{}", s).unwrap(),
Ok(Ok(x)) => writeln!(&mut ret, "Bad answer: {:?}", x).unwrap(),
Ok(Err(e)) => writeln!(&mut ret, "Remote error: {}", e).unwrap(),
Err(e) => writeln!(&mut ret, "Network error: {}", e).unwrap(),
}
}
writeln!(&mut ret, "\n======================").unwrap();
write!(
&mut ret,
"Cluster statistics:\n\n{}",
self.gather_cluster_stats()
)
.unwrap();
Ok(AdminRpc::Ok(ret))
} else {
Ok(AdminRpc::Ok(self.gather_stats_local(opt)?))
}
}
fn gather_stats_local(&self, opt: StatsOpt) -> Result<String, Error> {
let mut ret = String::new();
writeln!(
&mut ret,
"\nGarage version: {} [features: {}]\nRust compiler version: {}",
garage_util::version::garage_version(),
garage_util::version::garage_features()
.map(|list| list.join(", "))
.unwrap_or_else(|| "(unknown)".into()),
garage_util::version::rust_version(),
)
.unwrap();
writeln!(&mut ret, "\nDatabase engine: {}", self.garage.db.engine()).unwrap();
// Gather table statistics
let mut table = vec![" Table\tItems\tMklItems\tMklTodo\tGcTodo".into()];
table.push(self.gather_table_stats(&self.garage.bucket_table)?);
table.push(self.gather_table_stats(&self.garage.key_table)?);
table.push(self.gather_table_stats(&self.garage.object_table)?);
table.push(self.gather_table_stats(&self.garage.version_table)?);
table.push(self.gather_table_stats(&self.garage.block_ref_table)?);
write!(
&mut ret,
"\nTable stats:\n{}",
format_table_to_string(table)
)
.unwrap();
// Gather block manager statistics
writeln!(&mut ret, "\nBlock manager stats:").unwrap();
let rc_len = self.garage.block_manager.rc_approximate_len()?.to_string();
writeln!(
&mut ret,
" number of RC entries (~= number of blocks): {}",
rc_len
)
.unwrap();
writeln!(
&mut ret,
" resync queue length: {}",
self.garage.block_manager.resync.queue_approximate_len()?
)
.unwrap();
writeln!(
&mut ret,
" blocks with resync errors: {}",
self.garage.block_manager.resync.errors_approximate_len()?
)
.unwrap();
if !opt.skip_global {
write!(&mut ret, "\n{}", self.gather_cluster_stats()).unwrap();
}
Ok(ret)
}
fn gather_cluster_stats(&self) -> String {
let mut ret = String::new();
// Gather storage node and free space statistics for current nodes
let layout = &self.garage.system.cluster_layout();
let mut node_partition_count = HashMap::<Uuid, u64>::new();
for short_id in layout.current().ring_assignment_data.iter() {
let id = layout.current().node_id_vec[*short_id as usize];
*node_partition_count.entry(id).or_default() += 1;
}
let node_info = self
.garage
.system
.get_known_nodes()
.into_iter()
.map(|n| (n.id, n))
.collect::<HashMap<_, _>>();
let mut table = vec![" ID\tHostname\tZone\tCapacity\tPart.\tDataAvail\tMetaAvail".into()];
for (id, parts) in node_partition_count.iter() {
let info = node_info.get(id);
let status = info.map(|x| &x.status);
let role = layout.current().roles.get(id).and_then(|x| x.0.as_ref());
let hostname = status.and_then(|x| x.hostname.as_deref()).unwrap_or("?");
let zone = role.map(|x| x.zone.as_str()).unwrap_or("?");
let capacity = role
.map(|x| x.capacity_string())
.unwrap_or_else(|| "?".into());
let avail_str = |x| match x {
Some((avail, total)) => {
let pct = (avail as f64) / (total as f64) * 100.;
let avail = bytesize::ByteSize::b(avail);
let total = bytesize::ByteSize::b(total);
format!("{}/{} ({:.1}%)", avail, total, pct)
}
None => "?".into(),
};
let data_avail = avail_str(status.and_then(|x| x.data_disk_avail));
let meta_avail = avail_str(status.and_then(|x| x.meta_disk_avail));
table.push(format!(
" {:?}\t{}\t{}\t{}\t{}\t{}\t{}",
id, hostname, zone, capacity, parts, data_avail, meta_avail
));
}
write!(
&mut ret,
"Storage nodes:\n{}",
format_table_to_string(table)
)
.unwrap();
let meta_part_avail = node_partition_count
.iter()
.filter_map(|(id, parts)| {
node_info
.get(id)
.and_then(|x| x.status.meta_disk_avail)
.map(|c| c.0 / *parts)
})
.collect::<Vec<_>>();
let data_part_avail = node_partition_count
.iter()
.filter_map(|(id, parts)| {
node_info
.get(id)
.and_then(|x| x.status.data_disk_avail)
.map(|c| c.0 / *parts)
})
.collect::<Vec<_>>();
if !meta_part_avail.is_empty() && !data_part_avail.is_empty() {
let meta_avail =
bytesize::ByteSize(meta_part_avail.iter().min().unwrap() * (1 << PARTITION_BITS));
let data_avail =
bytesize::ByteSize(data_part_avail.iter().min().unwrap() * (1 << PARTITION_BITS));
writeln!(
&mut ret,
"\nEstimated available storage space cluster-wide (might be lower in practice):"
)
.unwrap();
if meta_part_avail.len() < node_partition_count.len()
|| data_part_avail.len() < node_partition_count.len()
{
writeln!(&mut ret, " data: < {}", data_avail).unwrap();
writeln!(&mut ret, " metadata: < {}", meta_avail).unwrap();
writeln!(&mut ret, "A precise estimate could not be given as information is missing for some storage nodes.").unwrap();
} else {
writeln!(&mut ret, " data: {}", data_avail).unwrap();
writeln!(&mut ret, " metadata: {}", meta_avail).unwrap();
}
}
ret
}
fn gather_table_stats<F, R>(&self, t: &Arc<Table<F, R>>) -> Result<String, Error>
where
F: TableSchema + 'static,
R: TableReplication + 'static,
{
let data_len = t
.data
.store
.approximate_len()
.map_err(GarageError::from)?
.to_string();
let mkl_len = t.merkle_updater.merkle_tree_approximate_len()?.to_string();
Ok(format!(
" {}\t{}\t{}\t{}\t{}",
F::TABLE_NAME,
data_len,
mkl_len,
t.merkle_updater.todo_approximate_len()?,
t.data.gc_todo_approximate_len()?
))
}
// ================ WORKER COMMANDS ====================
async fn handle_worker_cmd(&self, cmd: &WorkerOperation) -> Result<AdminRpc, Error> {
match cmd {
WorkerOperation::List { opt } => {
let workers = self.background.get_worker_info();
Ok(AdminRpc::WorkerList(workers, *opt))
}
WorkerOperation::Info { tid } => {
let info = self
.background
.get_worker_info()
.get(tid)
.ok_or_bad_request(format!("No worker with TID {}", tid))?
.clone();
Ok(AdminRpc::WorkerInfo(*tid, info))
}
WorkerOperation::Get {
all_nodes,
variable,
} => self.handle_get_var(*all_nodes, variable).await,
WorkerOperation::Set {
all_nodes,
variable,
value,
} => self.handle_set_var(*all_nodes, variable, value).await,
}
}
async fn handle_get_var(
&self,
all_nodes: bool,
variable: &Option<String>,
) -> Result<AdminRpc, Error> {
if all_nodes {
let mut ret = vec![];
let all_nodes = self.garage.system.cluster_layout().all_nodes().to_vec();
for node in all_nodes.iter() {
let node = (*node).into();
match self
.endpoint
.call(
&node,
AdminRpc::Worker(WorkerOperation::Get {
all_nodes: false,
variable: variable.clone(),
}),
PRIO_NORMAL,
)
.await??
{
AdminRpc::WorkerVars(v) => ret.extend(v),
m => return Err(GarageError::unexpected_rpc_message(m).into()),
}
}
Ok(AdminRpc::WorkerVars(ret))
} else {
#[allow(clippy::collapsible_else_if)]
if let Some(v) = variable {
Ok(AdminRpc::WorkerVars(vec![(
self.garage.system.id,
v.clone(),
self.garage.bg_vars.get(v)?,
)]))
} else {
let mut vars = self.garage.bg_vars.get_all();
vars.sort();
Ok(AdminRpc::WorkerVars(
vars.into_iter()
.map(|(k, v)| (self.garage.system.id, k.to_string(), v))
.collect(),
))
}
}
}
async fn handle_set_var(
&self,
all_nodes: bool,
variable: &str,
value: &str,
) -> Result<AdminRpc, Error> {
if all_nodes {
let mut ret = vec![];
let all_nodes = self.garage.system.cluster_layout().all_nodes().to_vec();
for node in all_nodes.iter() {
let node = (*node).into();
match self
.endpoint
.call(
&node,
AdminRpc::Worker(WorkerOperation::Set {
all_nodes: false,
variable: variable.to_string(),
value: value.to_string(),
}),
PRIO_NORMAL,
)
.await??
{
AdminRpc::WorkerVars(v) => ret.extend(v),
m => return Err(GarageError::unexpected_rpc_message(m).into()),
}
}
Ok(AdminRpc::WorkerVars(ret))
} else {
self.garage.bg_vars.set(variable, value)?;
Ok(AdminRpc::WorkerVars(vec![(
self.garage.system.id,
variable.to_string(),
value.to_string(),
)]))
}
}
// ================ META DB COMMANDS ====================
async fn handle_meta_cmd(self: &Arc<Self>, mo: &MetaOperation) -> Result<AdminRpc, Error> {
match mo {
MetaOperation::Snapshot { all: true } => {
let to = self.garage.system.cluster_layout().all_nodes().to_vec();
let resps = futures::future::join_all(to.iter().map(|to| async move {
let to = (*to).into();
self.endpoint
.call(
&to,
AdminRpc::MetaOperation(MetaOperation::Snapshot { all: false }),
PRIO_NORMAL,
)
.await?
}))
.await;
let mut ret = vec![];
for (to, resp) in to.iter().zip(resps.iter()) {
let res_str = match resp {
Ok(_) => "ok".to_string(),
Err(e) => format!("error: {}", e),
};
ret.push(format!("{:?}\t{}", to, res_str));
}
if resps.iter().any(Result::is_err) {
Err(GarageError::Message(format_table_to_string(ret)).into())
} else {
Ok(AdminRpc::Ok(format_table_to_string(ret)))
}
}
MetaOperation::Snapshot { all: false } => {
garage_model::snapshot::async_snapshot_metadata(&self.garage).await?;
Ok(AdminRpc::Ok("Snapshot has been saved.".into()))
}
}
}
}
impl EndpointHandler<AdminRpc> for AdminRpcHandler {
fn handle(
self: &Arc<Self>,
message: &AdminRpc,
_from: NodeID,
) -> impl Future<Output = Result<AdminRpc, Error>> + Send {
let self2 = self.clone();
async move {
match message {
AdminRpc::BucketOperation(bo) => self2.handle_bucket_cmd(bo).await,
AdminRpc::KeyOperation(ko) => self2.handle_key_cmd(ko).await,
AdminRpc::LaunchRepair(opt) => self2.handle_launch_repair(opt.clone()).await,
AdminRpc::Stats(opt) => self2.handle_stats(opt.clone()).await,
AdminRpc::Worker(wo) => self2.handle_worker_cmd(wo).await,
AdminRpc::BlockOperation(bo) => self2.handle_block_cmd(bo).await,
AdminRpc::MetaOperation(mo) => self2.handle_meta_cmd(mo).await,
m => Err(GarageError::unexpected_rpc_message(m).into()),
}
}
.boxed()
}
}

280
src/garage/cli/cmd.rs Normal file
View file

@ -0,0 +1,280 @@
use std::collections::{HashMap, HashSet};
use std::time::Duration;
use format_table::format_table;
use garage_util::error::*;
use garage_rpc::layout::*;
use garage_rpc::system::*;
use garage_rpc::*;
use garage_model::helper::error::Error as HelperError;
use crate::admin::*;
use crate::cli::*;
pub async fn cli_command_dispatch(
cmd: Command,
system_rpc_endpoint: &Endpoint<SystemRpc, ()>,
admin_rpc_endpoint: &Endpoint<AdminRpc, ()>,
rpc_host: NodeID,
) -> Result<(), HelperError> {
match cmd {
Command::Status => Ok(cmd_status(system_rpc_endpoint, rpc_host).await?),
Command::Node(NodeOperation::Connect(connect_opt)) => {
Ok(cmd_connect(system_rpc_endpoint, rpc_host, connect_opt).await?)
}
Command::Layout(layout_opt) => {
Ok(cli_layout_command_dispatch(layout_opt, system_rpc_endpoint, rpc_host).await?)
}
Command::Bucket(bo) => {
cmd_admin(admin_rpc_endpoint, rpc_host, AdminRpc::BucketOperation(bo)).await
}
Command::Key(ko) => {
cmd_admin(admin_rpc_endpoint, rpc_host, AdminRpc::KeyOperation(ko)).await
}
Command::Repair(ro) => {
cmd_admin(admin_rpc_endpoint, rpc_host, AdminRpc::LaunchRepair(ro)).await
}
Command::Stats(so) => cmd_admin(admin_rpc_endpoint, rpc_host, AdminRpc::Stats(so)).await,
Command::Worker(wo) => cmd_admin(admin_rpc_endpoint, rpc_host, AdminRpc::Worker(wo)).await,
Command::Block(bo) => {
cmd_admin(admin_rpc_endpoint, rpc_host, AdminRpc::BlockOperation(bo)).await
}
Command::Meta(mo) => {
cmd_admin(admin_rpc_endpoint, rpc_host, AdminRpc::MetaOperation(mo)).await
}
_ => unreachable!(),
}
}
pub async fn cmd_status(rpc_cli: &Endpoint<SystemRpc, ()>, rpc_host: NodeID) -> Result<(), Error> {
let status = fetch_status(rpc_cli, rpc_host).await?;
let layout = fetch_layout(rpc_cli, rpc_host).await?;
println!("==== HEALTHY NODES ====");
let mut healthy_nodes =
vec!["ID\tHostname\tAddress\tTags\tZone\tCapacity\tDataAvail".to_string()];
for adv in status.iter().filter(|adv| adv.is_up) {
let host = adv.status.hostname.as_deref().unwrap_or("?");
let addr = match adv.addr {
Some(addr) => addr.to_string(),
None => "N/A".to_string(),
};
if let Some(NodeRoleV(Some(cfg))) = layout.current().roles.get(&adv.id) {
let data_avail = match &adv.status.data_disk_avail {
_ if cfg.capacity.is_none() => "N/A".into(),
Some((avail, total)) => {
let pct = (*avail as f64) / (*total as f64) * 100.;
let avail = bytesize::ByteSize::b(*avail);
format!("{} ({:.1}%)", avail, pct)
}
None => "?".into(),
};
healthy_nodes.push(format!(
"{id:?}\t{host}\t{addr}\t[{tags}]\t{zone}\t{capacity}\t{data_avail}",
id = adv.id,
host = host,
addr = addr,
tags = cfg.tags.join(","),
zone = cfg.zone,
capacity = cfg.capacity_string(),
data_avail = data_avail,
));
} else {
let prev_role = layout
.versions
.iter()
.rev()
.find_map(|x| match x.roles.get(&adv.id) {
Some(NodeRoleV(Some(cfg))) => Some(cfg),
_ => None,
});
if let Some(cfg) = prev_role {
healthy_nodes.push(format!(
"{id:?}\t{host}\t{addr}\t[{tags}]\t{zone}\tdraining metadata...",
id = adv.id,
host = host,
addr = addr,
tags = cfg.tags.join(","),
zone = cfg.zone,
));
} else {
let new_role = match layout.staging.get().roles.get(&adv.id) {
Some(NodeRoleV(Some(_))) => "pending...",
_ => "NO ROLE ASSIGNED",
};
healthy_nodes.push(format!(
"{id:?}\t{h}\t{addr}\t\t\t{new_role}",
id = adv.id,
h = host,
addr = addr,
new_role = new_role,
));
}
}
}
format_table(healthy_nodes);
// Determine which nodes are unhealthy and print that to stdout
let status_map = status
.iter()
.map(|adv| (adv.id, adv))
.collect::<HashMap<_, _>>();
let tf = timeago::Formatter::new();
let mut drain_msg = false;
let mut failed_nodes = vec!["ID\tHostname\tTags\tZone\tCapacity\tLast seen".to_string()];
let mut listed = HashSet::new();
for ver in layout.versions.iter().rev() {
for (node, _, role) in ver.roles.items().iter() {
let cfg = match role {
NodeRoleV(Some(role)) if role.capacity.is_some() => role,
_ => continue,
};
if listed.contains(node) {
continue;
}
listed.insert(*node);
let adv = status_map.get(node);
if adv.map(|x| x.is_up).unwrap_or(false) {
continue;
}
// Node is in a layout version, is not a gateway node, and is not up:
// it is in a failed state, add proper line to the output
let (host, last_seen) = match adv {
Some(adv) => (
adv.status.hostname.as_deref().unwrap_or("?"),
adv.last_seen_secs_ago
.map(|s| tf.convert(Duration::from_secs(s)))
.unwrap_or_else(|| "never seen".into()),
),
None => ("??", "never seen".into()),
};
let capacity = if ver.version == layout.current().version {
cfg.capacity_string()
} else {
drain_msg = true;
"draining metadata...".to_string()
};
failed_nodes.push(format!(
"{id:?}\t{host}\t[{tags}]\t{zone}\t{capacity}\t{last_seen}",
id = node,
host = host,
tags = cfg.tags.join(","),
zone = cfg.zone,
capacity = capacity,
last_seen = last_seen,
));
}
}
if failed_nodes.len() > 1 {
println!("\n==== FAILED NODES ====");
format_table(failed_nodes);
if drain_msg {
println!();
println!("Your cluster is expecting to drain data from nodes that are currently unavailable.");
println!("If these nodes are definitely dead, please review the layout history with");
println!(
"`garage layout history` and use `garage layout skip-dead-nodes` to force progress."
);
}
}
if print_staging_role_changes(&layout) {
println!();
println!("Please use `garage layout show` to check the proposed new layout and apply it.");
println!();
}
Ok(())
}
pub async fn cmd_connect(
rpc_cli: &Endpoint<SystemRpc, ()>,
rpc_host: NodeID,
args: ConnectNodeOpt,
) -> Result<(), Error> {
match rpc_cli
.call(&rpc_host, SystemRpc::Connect(args.node), PRIO_NORMAL)
.await??
{
SystemRpc::Ok => {
println!("Success.");
Ok(())
}
m => Err(Error::unexpected_rpc_message(m)),
}
}
pub async fn cmd_admin(
rpc_cli: &Endpoint<AdminRpc, ()>,
rpc_host: NodeID,
args: AdminRpc,
) -> Result<(), HelperError> {
match rpc_cli.call(&rpc_host, args, PRIO_NORMAL).await?? {
AdminRpc::Ok(msg) => {
println!("{}", msg);
}
AdminRpc::BucketList(bl) => {
print_bucket_list(bl);
}
AdminRpc::BucketInfo {
bucket,
relevant_keys,
counters,
mpu_counters,
} => {
print_bucket_info(&bucket, &relevant_keys, &counters, &mpu_counters);
}
AdminRpc::KeyList(kl) => {
print_key_list(kl);
}
AdminRpc::KeyInfo(key, rb) => {
print_key_info(&key, &rb);
}
AdminRpc::WorkerList(wi, wlo) => {
print_worker_list(wi, wlo);
}
AdminRpc::WorkerVars(wv) => {
print_worker_vars(wv);
}
AdminRpc::WorkerInfo(tid, wi) => {
print_worker_info(tid, wi);
}
AdminRpc::BlockErrorList(el) => {
print_block_error_list(el);
}
AdminRpc::BlockInfo {
hash,
refcount,
versions,
uploads,
} => {
print_block_info(hash, refcount, versions, uploads);
}
r => {
error!("Unexpected response: {:?}", r);
}
}
Ok(())
}
// ---- utility ----
pub async fn fetch_status(
rpc_cli: &Endpoint<SystemRpc, ()>,
rpc_host: NodeID,
) -> Result<Vec<KnownNodeInfo>, Error> {
match rpc_cli
.call(&rpc_host, SystemRpc::GetKnownNodes, PRIO_NORMAL)
.await??
{
SystemRpc::ReturnKnownNodes(nodes) => Ok(nodes),
resp => Err(Error::unexpected_rpc_message(resp)),
}
}

Some files were not shown because too many files have changed in this diff Show more