Closes #99. The max_size storage config was parsed and validated but
never enforced. This adds a background eviction loop that periodically
checks total cache size and evicts least recently used artifacts when
the limit is exceeded.
When the proxy reaches storage at an internal address (e.g. 127.0.0.1 or a
Docker service name), the presigned URLs it generates point there too,
which is useless to external clients. This adds an optional base URL
that replaces the scheme and host of signed URLs before they're returned,
keeping the signed path and query intact.
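The rewrite amounts to swapping the scheme and host while leaving everything else untouched; a minimal sketch (the `rewriteBase` name is illustrative, and the real option comes from the storage config):

```go
package main

import (
	"fmt"
	"net/url"
)

// rewriteBase replaces the scheme and host of a presigned URL with those of
// an externally reachable base URL, keeping the signed path and query intact.
func rewriteBase(signed, base string) (string, error) {
	u, err := url.Parse(signed)
	if err != nil {
		return "", err
	}
	b, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Scheme = b.Scheme
	u.Host = b.Host
	return u.String(), nil
}

func main() {
	out, _ := rewriteBase(
		"http://127.0.0.1:9000/bucket/artifact.tgz?X-Amz-Signature=abc",
		"https://cache.example.com")
	fmt.Println(out) // https://cache.example.com/bucket/artifact.tgz?X-Amz-Signature=abc
}
```

Leaving the path and query alone is what keeps the signature valid: the backend signed those parts, so only the address the client dials changes.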
When `storage.direct_serve` is enabled and the backend supports it (S3,
Azure), cached artifact downloads return a 302 redirect to a presigned
URL instead of streaming bytes through the proxy. Falls back to
streaming when the backend can't sign (fileblob, local filesystem) or
signing fails.
Adds the azureblob driver so azblob:// storage URLs work.
Cache-hit accounting already happened before io.Copy so redirects are
counted correctly; the metrics calls are pulled into a helper so both
paths share them.
Closes #96
Cached metadata is now served directly within a configurable TTL window
(default 5m) without contacting upstream, reducing latency and upstream
load. When upstream is unreachable and the cache is past its TTL, stale
content is served with a Warning: 110 header per RFC 7234.
New config: `metadata_ttl` (YAML) / `PROXY_METADATA_TTL` (env).
Set to "0" to always revalidate with upstream.
- Fix race where runJob could overwrite canceled state set by Cancel()
- Fix Debian ecosystem name inconsistency ("deb" -> "debian")
- Stream metadata responses when caching is disabled to avoid buffering
- Add metadata_cache table to initial schema strings for consistency
- Gate mirror API behind mirror_api config flag (disabled by default)
- Fix goconst lint in metadata_cache_test.go
Add a `proxy mirror` CLI command and `/api/mirror` API endpoints that
pre-populate the cache from various input sources: individual PURLs,
SBOM files (CycloneDX and SPDX), or full registry enumeration.
The mirror reuses the existing handler.Proxy.GetOrFetchArtifact()
pipeline so cached artifacts are identical to those fetched on demand.
A bounded worker pool controls download parallelism.
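The bounded pool can be sketched as below; `fetchAll` is an illustrative name, and `fetch` stands in for the real GetOrFetchArtifact pipeline:

```go
package main

import (
	"fmt"
	"sync"
)

// fetchAll downloads every PURL with at most `workers` concurrent fetches,
// collecting the per-PURL result.
func fetchAll(purls []string, workers int, fetch func(string) error) map[string]error {
	jobs := make(chan string)
	var mu sync.Mutex
	results := make(map[string]error, len(purls))

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range jobs {
				err := fetch(p)
				mu.Lock()
				results[p] = err
				mu.Unlock()
			}
		}()
	}
	for _, p := range purls {
		jobs <- p
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	res := fetchAll(
		[]string{"pkg:npm/left-pad@1.3.0", "pkg:pypi/requests@2.31.0"}, 4,
		func(p string) error { return nil })
	fmt.Println(len(res)) // 2
}
```

Fixing the worker count rather than spawning one goroutine per PURL is what keeps a full registry enumeration from overwhelming the upstream or the local disk.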
Metadata caching is opt-in via `cache_metadata: true` in config (or
PROXY_CACHE_METADATA=true). The mirror command always enables it. When
enabled, upstream metadata responses are stored for offline fallback
with ETag-based conditional revalidation.
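ETag-based revalidation means the refresh request carries the cached ETag so upstream can answer 304 Not Modified instead of resending the body. A minimal sketch (the `conditionalRequest` helper is illustrative, not the actual code):

```go
package main

import "net/http"

// conditionalRequest builds the upstream revalidation request: when a cached
// ETag exists, If-None-Match lets upstream skip the body on a match.
func conditionalRequest(url, etag string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	if etag != "" {
		req.Header.Set("If-None-Match", etag)
	}
	return req, nil
}

func main() {}
```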
New internal/mirror package with Source interface, PURLSource,
SBOMSource, RegistrySource, and async JobStore. New metadata_cache
database table for offline metadata serving.
* Fix all golangci-lint issues across the codebase
Resolve 77 lint issues reported by golangci-lint with gocritic, gocognit,
gocyclo, maintidx, dupl, mnd, unparam, ireturn, goconst, and errcheck
enabled. Net reduction of ~175 lines through shared helpers and
deduplication.
* Suppress staticcheck SA1019 for intentional deprecated field usage
The Storage.Path field is deprecated but still read for backwards
compatibility with existing configs that haven't migrated to the URL field.
Hides package versions published too recently from metadata responses,
giving the community time to spot malicious releases. Configurable
per-ecosystem and per-package with duration overrides. Supported for
npm, PyPI, pub.dev, and Composer.
Replace raw database/sql with jmoiron/sqlx for cleaner query handling.
Support both SQLite (default) and PostgreSQL as configurable backends.
Configuration via:
- CLI flags: -database-driver, -database-path, -database-url
- Environment: PROXY_DATABASE_DRIVER, PROXY_DATABASE_PATH, PROXY_DATABASE_URL
- Config file: database.driver, database.path, database.url
Tests run against both databases when PROXY_DATABASE_URL is set.
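Driver selection reduces to mapping the config onto the (driver, DSN) pair handed to sqlx.Connect. This sketch assumes the conventional driver names for mattn/go-sqlite3 and lib/pq; the `openDSN` helper and those names are assumptions, not taken from the diff:

```go
package main

import "fmt"

// openDSN maps the database config onto the driver name and data source
// string that sqlx.Connect expects.
func openDSN(driver, path, url string) (string, string, error) {
	switch driver {
	case "", "sqlite":
		return "sqlite3", path, nil // SQLite is the default backend
	case "postgres":
		return "postgres", url, nil
	default:
		return "", "", fmt.Errorf("unsupported database driver %q", driver)
	}
}

func main() {
	d, dsn, _ := openDSN("postgres", "", "postgres://proxy@db:5432/proxy")
	fmt.Println(d, dsn)
}
```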