The UI now lives under /ui so reverse proxies can apply different
access rules to it (e.g. require auth) while leaving the package
endpoints (/npm, /pypi, /v2, ...) open to build machines.
- GET / redirects to /ui/
- /api/browse and /api/compare move to /ui/api/browse and
/ui/api/compare since only the browser JS calls them
- /health, /stats, /metrics, /openapi.json and /api/* stay at root
* config: add Health.StorageProbeInterval
* metrics: add proxy_health_probe_failures_total counter
* server: add storageProbe with happy-path test
* server: add storageProbe failure-mode tests
* server: add healthCache with TTL, single-flight, transition logging
* server: wire storage probe into /health
* server: update TestHealthEndpoint for JSON; wire healthCache into newTestServer
Also fix Windows file-locking issue in storageProbe: close the reader
explicitly before Delete so the file handle is released prior to os.Remove.
* server: clean up stale comment in storageProbe
* docs: document storage health probe and new metric
* docs: regenerate Swagger for /health JSON response
* server: simplify rc.Close error handling in storageProbe
* server: defer probe cleanup so size/open/read/verify failures don't leak objects
Previously, storageProbe only called Delete on the success path. Any
failure between Store and the final Delete (size mismatch, Open error,
mid-stream read failure, content mismatch) left the probe object orphaned
in the storage backend. With caching disabled and Kubernetes-rate probing,
the leak could accumulate noticeably on backends like S3.
Use a named return + defer to attempt Delete after every successful Store.
The earlier-step failure remains the primary error; Delete failure only
surfaces as step="delete" when nothing else went wrong. Add a table-driven
test that asserts cleanup runs for each non-delete failure path.
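
The named-return-plus-defer shape described above can be sketched as follows. This is a minimal illustration, not the actual storageProbe: probeOps and probeError are hypothetical names, and Verify stands in for the size-check, open, read and compare steps.

```go
package main

import "fmt"

// probeOps is a hypothetical stand-in for the storage calls the probe makes.
// Verify collapses the size-check, open, read and content-compare steps.
type probeOps struct {
	Store  func() error
	Verify func() error
	Delete func() error
}

// probeError records which step failed (assumed shape of the step label).
type probeError struct {
	Step string
	Err  error
}

func (e *probeError) Error() string { return fmt.Sprintf("storage probe %s: %v", e.Step, e.Err) }
func (e *probeError) Unwrap() error { return e.Err }

// storageProbe uses a named return so the deferred cleanup can inspect the
// result and replace it only when nothing else went wrong.
func storageProbe(ops probeOps) (err error) {
	if storeErr := ops.Store(); storeErr != nil {
		// Nothing was written, so there is nothing to clean up.
		return &probeError{Step: "store", Err: storeErr}
	}
	defer func() {
		delErr := ops.Delete()
		if err == nil && delErr != nil {
			err = &probeError{Step: "delete", Err: delErr}
		}
	}()
	if verifyErr := ops.Verify(); verifyErr != nil {
		return &probeError{Step: "verify", Err: verifyErr}
	}
	return nil
}
```

Because the defer is registered only after Store succeeds, a failed Store never triggers a Delete for an object that was not written.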
Reported by Copilot on #119.
* config: validate health.storage_probe_interval in Config.Validate
The new duration field was only validated at use time in newHealthCache.
The existing codebase already validates other duration fields
(MetadataTTL, DirectServeTTL, Gradle.MaxAge, Gradle.SweepInterval) in
Config.Validate() so misconfiguration fails fast at startup with a
config-key-specific error.
Match that pattern. The parse-at-use code in newHealthCache stays as
a safety net, mirroring the MetadataTTL precedent.
Reported by Copilot on #119.
* docs: lowercase "counter" in metrics table for consistency
Other rows in the table use lowercase type names (counter/gauge/histogram).
Match that style.
Reported by Copilot on #119.
* docs: include size-check step in /health probe description
The probe is write → size-check → read → verify → delete; the
architecture note was missing the size-check step.
Reported by Copilot on #119.
* server: address andrew's review on #119
- Drop unused callerCtx parameter from healthCache.Check (Check is now
parameter-less; the comment-only "accepted for symmetry" justification
wasn't carrying its weight).
- Emit "storage": {"status": "skipped"} on DB short-circuit instead of
omitting the key, so monitors expecting a fixed key set keep working.
- Reject negative storage_probe_interval at config validation time
(previously parsed and silently behaved like "0").
- Extract HealthConfig.Validate to keep Config.Validate under the
gocognit threshold and match the existing GradleBuildCacheConfig pattern.
- README Health Check section: note that /health is intended as a
readiness probe rather than a liveness probe (Check holds a mutex
for up to the 10s probe timeout).
- cmd/proxy/main.go godoc: column-align the new env var with the
surrounding Gradle entries.
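
For illustration, the interval validation from the points above might look roughly like this. It is a sketch under assumptions: the field is taken to be a duration string, an empty value is taken to mean "use the default", and the error wording is made up.

```go
package main

import (
	"fmt"
	"time"
)

// HealthConfig here is a reduced, assumed shape of the real config struct.
type HealthConfig struct {
	StorageProbeInterval string
}

// Validate fails fast on unparseable or negative intervals. "0" parses to a
// zero duration and is accepted.
func (h HealthConfig) Validate() error {
	if h.StorageProbeInterval == "" {
		return nil // assumption: empty means "use the default"
	}
	d, err := time.ParseDuration(h.StorageProbeInterval)
	if err != nil {
		return fmt.Errorf("health.storage_probe_interval: %w", err)
	}
	if d < 0 {
		return fmt.Errorf("health.storage_probe_interval: must not be negative, got %s", d)
	}
	return nil
}
```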
Reported by andrew on #119.
- Implement /julia/* handler for the Pkg server protocol
(registries, registry, package, artifact, meta)
- Resolve package UUIDs to names by parsing Registry.toml from
the General registry tarball, with a hash-guarded background
refresh on registry updates
- Wire into router, ecosystem list, install page, badge styles
- Update README and architecture docs
- Bump github.com/git-pkgs/registries to v0.6.0: the fetcher now
honours HTTP_PROXY, gates dialled IPs against the safehttp block
list, and Version.Integrity is populated for pub, julia and nuget
- Replace internal/cooldown with github.com/git-pkgs/cooldown v0.1.1
(identical surface, lifted from this repo)
- Update docs/architecture.md to point at the external package
Bake the extended linter set into a project config so plain
golangci-lint run matches what we check locally, with goconst tuned
to ignore tests and bare lowercase words to drop ~200 ecosystem-name
and test-literal false positives.
Clear the remaining real findings: extract GradleBuildCacheConfig.Validate
from Config.Validate, pull the eviction sort comparator into
sortOldestFirst (zero time.Time already sorts first via Before so the
switch was redundant), add headerAcceptEncoding and SQL column-type
constants, and drop a dead empty-key recheck in the gradle handler.
* add Gradle Build Cache support with handler and tests
* linting issue
* MR Suggestions: Add Gradle HTTP Build Cache configuration to README
* implement minor stuff: Refactor Gradle handler to remove unnecessary URL parameter and update related tests
Co-authored-by: Copilot <copilot@github.com>
* Add Gradle build cache configuration and eviction support
- Introduced configuration options for Gradle build cache in config files and documentation.
- Implemented read-only mode and upload size limits for the Gradle build cache.
- Added cache eviction logic based on age and size, with corresponding tests.
- Enhanced storage interfaces to support listing objects by prefix.
* implement minor stuff: Refactor Gradle handler to remove unnecessary URL parameter and update related tests
* last finding fix
* fix tests and implement PR suggestions
Co-authored-by: Copilot <copilot@github.com>
* unify path
---------
Co-authored-by: Mateusz (Mati) Kepa <m.kepa@sportradar.com>
Co-authored-by: Copilot <copilot@github.com>
checkCache opened the storage reader and streamed it to the client
without checking that the bytes still matched what was originally
stored, or what the upstream registry declared. Disk corruption,
accidental overwrites, or local tampering would go unnoticed.
Wrap the storage reader in a verifyingReader that computes SHA256
(against artifact.content_hash) and, when version.integrity holds an
SRI string, the corresponding sha256/384/512 digest as bytes flow
through. At EOF the digests are compared; on mismatch we log at
error level, bump proxy_integrity_failures_total, and clear the
artifact's cache entry so the next request refetches from upstream.
Verification is skipped when the stream was not fully consumed
(client disconnect) to avoid evicting good artifacts on partial
reads. The DirectServe presigned-URL path is unverified since the
proxy never sees those bytes.
Refs #42 (part 1)
* Structured JSON error responses for API endpoints
API handlers returned errors via http.Error (text/plain) with ad-hoc
strings, while the mirror API used a different {"error": "..."} shape
and leaked internal err.Error() text to clients.
Add ErrorResponse{Code, Message} with stable codes (BAD_REQUEST,
NOT_FOUND, UPSTREAM_ERROR, INTERNAL_ERROR) and writeError/badRequest/
notFound/internalError helpers. Convert all JSON API handlers in
api.go, browse.go, mirror_api.go and the /stats endpoint. Enrichment
failures now report 502 UPSTREAM_ERROR rather than 500.
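
As a rough sketch of the helper layer: the JSON field names, the encodeError helper and the fixed message in internalError are assumptions for illustration; only the type name, the codes and the helper names come from the description above.

```go
package main

import (
	"encoding/json"
	"net/http"
)

// ErrorResponse is the shared error body; the JSON field names are assumed.
type ErrorResponse struct {
	Code    string `json:"code"`
	Message string `json:"message"`
}

// encodeError is a hypothetical helper that renders the JSON body.
func encodeError(code, message string) []byte {
	body, err := json.Marshal(ErrorResponse{Code: code, Message: message})
	if err != nil {
		return []byte(`{"code":"INTERNAL_ERROR","message":"internal error"}`)
	}
	return body
}

func writeError(w http.ResponseWriter, status int, code, message string) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	_, _ = w.Write(encodeError(code, message))
}

func badRequest(w http.ResponseWriter, message string) {
	writeError(w, http.StatusBadRequest, "BAD_REQUEST", message)
}

func notFound(w http.ResponseWriter, message string) {
	writeError(w, http.StatusNotFound, "NOT_FOUND", message)
}

// internalError deliberately takes no error value so err.Error() text
// cannot leak to clients; the caller is expected to log the cause.
func internalError(w http.ResponseWriter) {
	writeError(w, http.StatusInternalServerError, "INTERNAL_ERROR", "internal error")
}
```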
Protocol handlers in internal/handler/ are deliberately unchanged
since npm/pip/cargo clients expect their own response formats, not
JSON. HTML page handlers in server.go also keep text/plain.
Swagger @Failure annotations updated and docs regenerated.
Fixes #76
* Convert validatePackagePath errors to JSON in API handlers
The wildcard package routes (/packages/{ecosystem}/*, /api/package/*,
/api/vulns/*, /api/browse/*, /api/compare/*) only checked for an empty
path before passing user input to GetPackageByEcosystemName and the
enrichment service.
Add validatePackagePath as a coarse first-line filter: reject null
bytes, other control characters, and paths over 512 bytes. Wired into
all five entry handlers immediately after the chi wildcard is read.
This is the generic layer; ecosystem-specific name format rules (npm
scoped name shape, Maven coordinate structure, etc.) can be added on
top per #75.
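
A minimal sketch of such a coarse filter, assuming the function returns an error and that "control characters" means what unicode.IsControl reports; the error messages and the constant name are illustrative.

```go
package main

import (
	"errors"
	"unicode"
)

// maxPackagePathBytes is the 512-byte cap from the description above.
const maxPackagePathBytes = 512

// validatePackagePath rejects empty paths, oversized paths, and any control
// character (which includes the null byte). It does not know about
// ecosystem-specific name rules.
func validatePackagePath(p string) error {
	if p == "" {
		return errors.New("package path is empty")
	}
	if len(p) > maxPackagePathBytes {
		return errors.New("package path is too long")
	}
	for _, r := range p {
		if unicode.IsControl(r) {
			return errors.New("package path contains a control character")
		}
	}
	return nil
}
```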
Fixes #75
containsPathTraversal only checked literal ".." segments separated by
forward slashes. Encoded forms like %2e%2e%2f or backslash separators
would slip past if a caller ever passed a raw or Windows-style path.
The check now URL-decodes the input and treats backslashes as
separators before splitting. Go's stdlib already decodes r.URL.Path so
the encoded case is mostly belt-and-braces for cache keys and other
non-router inputs, but the storage layer guard from #106 makes this
worth locking in with tests.
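
The decode-then-split check could be sketched like this. The fallback to the raw input when the percent-encoding is malformed is an assumption, as is the single decoding pass.

```go
package main

import (
	"net/url"
	"strings"
)

// containsPathTraversal reports whether any segment of p is "..", after
// percent-decoding once and treating backslashes as separators.
func containsPathTraversal(p string) bool {
	decoded, err := url.PathUnescape(p)
	if err != nil {
		// Assumption: on malformed escapes, inspect the raw input instead.
		decoded = p
	}
	decoded = strings.ReplaceAll(decoded, `\`, "/")
	for _, segment := range strings.Split(decoded, "/") {
		if segment == ".." {
			return true
		}
	}
	return false
}
```

Names that merely contain dots, such as "..bar", are not flagged because only whole segments are compared.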
Fixes #74
The browse and compare handlers buffer the full artifact into memory for
prefix detection. Without a cap, a single request for a large cached
artifact could exhaust server memory.
These file types were served with executable content types (text/html,
image/svg+xml) allowing stored XSS via package archive contents.
Also adds Content-Security-Policy: sandbox and X-Content-Type-Options:
nosniff headers to all browse file responses.
Closes #99. The max_size storage config was parsed and validated but
never enforced. This adds a background eviction loop that periodically
checks total cache size and evicts least recently used artifacts when
the limit is exceeded.
When the proxy reaches storage at an internal address (127.0.0.1, a
Docker service name) the presigned URLs it generates point there too,
which is useless to external clients. This adds an optional base URL
that replaces the scheme and host of signed URLs before they're returned,
keeping the signed path and query intact.
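
The rewrite amounts to swapping two URL fields. A minimal sketch, assuming a helper named rewriteSignedURL and that the base URL must carry both a scheme and a host:

```go
package main

import (
	"fmt"
	"net/url"
)

// rewriteSignedURL replaces the scheme and host of a presigned URL with
// those of publicBase, leaving the signed path and query untouched.
func rewriteSignedURL(signed, publicBase string) (string, error) {
	u, err := url.Parse(signed)
	if err != nil {
		return "", fmt.Errorf("parse signed URL: %w", err)
	}
	base, err := url.Parse(publicBase)
	if err != nil {
		return "", fmt.Errorf("parse base URL: %w", err)
	}
	if base.Scheme == "" || base.Host == "" {
		return "", fmt.Errorf("base URL %q needs a scheme and host", publicBase)
	}
	u.Scheme = base.Scheme
	u.Host = base.Host
	return u.String(), nil
}
```

The example.com host and query values used to exercise this are placeholders, not values from the project.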
When storage.direct_serve is enabled and the backend supports it (S3,
Azure), cached artifact downloads return a 302 redirect to a presigned
URL instead of streaming bytes through the proxy. Falls back to
streaming when the backend can't sign (fileblob, local filesystem) or
signing fails.
Adds the azureblob driver so azblob:// storage URLs work.
Cache-hit accounting already happened before io.Copy so redirects are
counted correctly; the metrics calls are pulled into a helper so both
paths share them.
Closes #96