50 commits

Author SHA1 Message Date
a947a7546a Sort the ecosystems list for presentation in the UI
In the page footer and the 'select' list on the packages page, the
list of ecosystems should be sorted in a predictable order.
2026-04-06 18:06:16 -04:00
Andrew Nesbitt
e36a92433e
Clean up review feedback: use path.Ext for extension checks, remove dead getStripPrefix, add openArchive tests 2026-04-06 19:06:48 +01:00
Andrew Nesbitt
941ed51f76
Auto-detect and strip single top-level directory prefix when browsing archives
GitHub zipballs wrap all files in a repo-hash/ directory. Instead of
hardcoding prefixes per ecosystem, open the archive once to check if all
files share a single root directory and strip it automatically. The npm
package/ prefix is still handled as a special case.
2026-04-06 17:14:15 +01:00
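The root-directory detection described in 941ed51f76 can be sketched roughly as below. This is an illustration only, not the project's code: `singleRoot` is a hypothetical name, and the sketch works on a plain slice of entry names rather than an opened archive.

```go
package main

import (
	"fmt"
	"strings"
)

// singleRoot returns the shared top-level directory (with a trailing slash)
// when every entry sits under the same one, and "" otherwise.
// Hypothetical helper; the real implementation inspects an opened archive.
func singleRoot(names []string) string {
	root := ""
	for _, name := range names {
		dir, _, found := strings.Cut(name, "/")
		if !found {
			return "" // a file at the top level means there is nothing to strip
		}
		if root == "" {
			root = dir
		} else if dir != root {
			return "" // more than one top-level directory
		}
	}
	if root == "" {
		return ""
	}
	return root + "/"
}

func main() {
	names := []string{"repo-abc123/README.md", "repo-abc123/src/main.go"}
	root := singleRoot(names)
	for _, n := range names {
		fmt.Println(strings.TrimPrefix(n, root))
	}
}
```

Checking the whole listing before stripping avoids hardcoding a prefix per ecosystem, at the cost of one extra pass over the entry names.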
Andrew Nesbitt
b68184cbab
Fix composer dist URL rewriting and browse source for extensionless filenames
GitHub zipball URLs end in a bare commit hash with no file extension.
rewriteDistURL now appends .zip when the filename has no extension and
the dist type is zip. expandMinifiedVersions deep copies inherited
values so in-place URL rewriting no longer corrupts shared references.
browse.go infers .zip for extensionless filenames so existing cached
artifacts can still be opened.
2026-04-06 17:07:20 +01:00
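A minimal sketch of the extension inference described in b68184cbab, assuming the check is a plain `path.Ext` test. `distFilename` is a hypothetical name and does not reproduce `rewriteDistURL` itself.

```go
package main

import (
	"fmt"
	"path"
)

// distFilename appends ".zip" when the dist type is zip and the filename has
// no extension, as with a GitHub zipball URL ending in a bare commit hash.
// Hypothetical helper for illustration.
func distFilename(filename, distType string) string {
	if distType == "zip" && path.Ext(filename) == "" {
		return filename + ".zip"
	}
	return filename
}

func main() {
	fmt.Println(distFilename("0a1b2c3d4e5f", "zip"))
	fmt.Println(distFilename("lib-1.0.0.zip", "zip"))
}
```

Note that `path.Ext` treats anything after the last dot as an extension, so a name such as `v1.2` would be left unchanged by this sketch.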
Andrew Nesbitt
bcbb883d1b
Add failing tests for composer dist URL and shared reference bugs
GitHub zipball URLs produce filenames without .zip extension, breaking
browse source. Minified version expansion shares nested map references,
causing dist URL corruption when versions inherit unchanged dist fields.
2026-04-06 17:07:20 +01:00
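The shared-reference problem these tests target can be illustrated with a small expansion routine. This sketch is not the project's `expandMinifiedVersions`; the `expand` and `deepCopy` names are hypothetical, and the `"__unset"` marker is an assumption about the minified format that the commit text does not state.

```go
package main

import "fmt"

// deepCopy clones nested maps and slices so expanded versions share nothing.
func deepCopy(v any) any {
	switch t := v.(type) {
	case map[string]any:
		out := make(map[string]any, len(t))
		for k, val := range t {
			out[k] = deepCopy(val)
		}
		return out
	case []any:
		out := make([]any, len(t))
		for i, val := range t {
			out[i] = deepCopy(val)
		}
		return out
	default:
		return v
	}
}

// expand makes every entry self-contained: each entry inherits the fields of
// the previous one and overrides only what it lists.
func expand(minified []map[string]any) []map[string]any {
	expanded := make([]map[string]any, 0, len(minified))
	current := map[string]any{}
	for _, entry := range minified {
		for k, v := range entry {
			if s, ok := v.(string); ok && s == "__unset" { // assumed removal marker
				delete(current, k)
				continue
			}
			current[k] = v
		}
		// Copy deeply so a later in-place rewrite of one version's dist
		// cannot leak into versions that inherited the same nested map.
		expanded = append(expanded, deepCopy(current).(map[string]any))
	}
	return expanded
}

func main() {
	versions := expand([]map[string]any{
		{"name": "acme/lib", "version": "2.0.0", "dist": map[string]any{"url": "https://example.test/a"}},
		{"version": "1.0.0"},
	})
	versions[0]["dist"].(map[string]any)["url"] = "rewritten"
	fmt.Println(versions[1]["dist"].(map[string]any)["url"]) // still the original URL
}
```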
Andrew Nesbitt
43a164ed72
Add cooldown support for Hex
Decode the Hex registry protobuf format, filter releases by fetching
timestamps from the Hex HTTP API (hex.pm/api/packages/{name}), and
re-encode without the original signature.

The protobuf handling uses protowire for low-level encoding/decoding
of the Signed wrapper, Package, and Release messages. Timestamps come
from the inserted_at field in the JSON API response.

Since the proxy re-encodes the payload without the original signature,
users need to disable registry signature verification.
2026-04-06 13:18:57 +01:00
Andrew Nesbitt
cb9bbbc385
Add cooldown support for RubyGems
Filter versions from the compact index (/info/{name}) by fetching
timestamps from the versions API (/api/v1/versions/{name}.json).
Both requests run concurrently to minimize latency. If the versions
API is unavailable, the compact index is proxied unfiltered.

Handles platform-specific versions (e.g. 1.0.0-java) by matching
the compact index format.
2026-04-06 13:16:26 +01:00
Andrew Nesbitt
75ff85f2f0
Add cooldown support for Conda (#68)
* Add cooldown support for Conda

Filter entries from Conda repodata.json based on the timestamp field
(milliseconds since epoch). Filters both packages and packages.conda
sections. When cooldown is disabled, repodata requests are proxied
directly without parsing.

* Update README table to mark Conda cooldown support
2026-04-06 13:16:00 +01:00
Andrew Nesbitt
70fe686953
Add cooldown support for NuGet (#67)
* Add cooldown support for NuGet

Filter versions from NuGet registration pages based on the
catalogEntry.published timestamp. Handles both RFC3339 and NuGet's
fractional-second timestamp formats. When cooldown is disabled,
registration requests are proxied directly without parsing.

* Update README table to mark NuGet cooldown support
2026-04-06 13:12:18 +01:00
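Parsing more than one timestamp format, as 70fe686953 describes, can be done by trying layouts in turn. The second layout below (no zone offset, read as UTC) is an assumption for illustration; the commit does not spell out the exact fractional-second format it handles.

```go
package main

import (
	"fmt"
	"time"
)

// layouts are tried in order. The zone-less layout is an assumed example.
var layouts = []string{
	time.RFC3339Nano,
	"2006-01-02T15:04:05.999999999",
}

// parsePublished returns the first successful parse, normalised to UTC.
func parsePublished(s string) (time.Time, error) {
	var lastErr error
	for _, layout := range layouts {
		t, err := time.Parse(layout, s)
		if err == nil {
			return t.UTC(), nil
		}
		lastErr = err
	}
	return time.Time{}, lastErr
}

func main() {
	for _, s := range []string{"2024-03-01T12:30:45Z", "2024-03-01T12:30:45.1234567"} {
		t, err := parsePublished(s)
		fmt.Println(t, err)
	}
}
```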
Andrew Nesbitt
24d5e77443
Fix cross-device link error when running in Docker with volumes (#66)
`fileblob` creates temp files in `os.TempDir()` (`/tmp`) by default,
then uses `os.Rename` to move them to the final path. When the storage
directory is on a different filesystem (e.g. a Docker volume mount at
`/data`), the rename fails with "invalid cross-device link".

Set `no_tmp_dir=true` on file:// bucket URLs so fileblob creates temp
files next to the final destination instead.

Fixes #65
2026-04-06 13:07:31 +01:00
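Adding the `no_tmp_dir=true` parameter to a `file://` bucket URL, as 24d5e77443 describes, can be sketched with the standard `net/url` package. `withNoTmpDir` is a hypothetical helper name; only the parameter itself comes from the commit message.

```go
package main

import (
	"fmt"
	"net/url"
)

// withNoTmpDir sets no_tmp_dir=true on file:// URLs and leaves every other
// scheme untouched.
func withNoTmpDir(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	if u.Scheme != "file" {
		return raw, nil
	}
	q := u.Query()
	q.Set("no_tmp_dir", "true")
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	for _, raw := range []string{"file:///data/artifacts", "s3://bucket-name"} {
		out, err := withNoTmpDir(raw)
		fmt.Println(out, err)
	}
}
```

With the temp file created next to its destination, the final rename stays on one filesystem and does not hit the cross-device error.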
Andrew Nesbitt
15c133f1fa
Fix Composer minified metadata expansion and namespaced package routing (#63)
* Fix Composer minified metadata expansion and namespaced package routing

Packagist serves metadata in a minified format where only the first version
entry has all fields and subsequent entries inherit from the previous one.
The proxy was passing this through without expanding it, which meant cooldown
filtering could break the inheritance chain (losing fields like `name`) and
`~dev` sentinel markers were silently dropped.

The proxy now expands the minified format before filtering and rewriting,
ensuring every version entry is self-contained.

Web UI and API routes used single-segment chi URL params for package names,
which broke for Composer's `vendor/name` format. `/package/composer/monolog/monolog`
would match the version show route instead of the package show route.

All `/package/` and related API routes now use wildcard paths with a
`resolvePackageName` helper that tries increasingly longer path prefixes as
package names via DB lookup, correctly handling namespaced packages across
all endpoints (show, version, browse, compare, vulns).

Fixes #61, fixes #62

* Add namespaced package routing tests for all affected ecosystems

Verifies the wildcard routing handles slashes in package names for
npm (@babel/core), Go modules (github.com/stretchr/testify),
OCI images (library/nginx), Conda (conda-forge/numpy), and
Conan (zlib/1.2.13@demo/stable).

* Regenerate swagger docs after route refactor

The swagger annotations for the old per-endpoint handlers were removed
during the wildcard routing refactor. Regenerate to match current state.
2026-04-06 13:07:02 +01:00
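The prefix-probing idea behind `resolvePackageName` in 15c133f1fa can be sketched as follows. This is not the project's implementation: `resolveName` is a hypothetical stand-in, and the `exists` callback replaces the database lookup.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveName tries increasingly longer path prefixes as the package name and
// returns the first one that exists, plus whatever remains of the path.
func resolveName(path string, exists func(name string) bool) (name, rest string, ok bool) {
	segments := strings.Split(strings.Trim(path, "/"), "/")
	for i := 1; i <= len(segments); i++ {
		candidate := strings.Join(segments[:i], "/")
		if exists(candidate) {
			return candidate, strings.Join(segments[i:], "/"), true
		}
	}
	return "", "", false
}

func main() {
	known := map[string]bool{"monolog/monolog": true, "@babel/core": true}
	exists := func(name string) bool { return known[name] }

	name, rest, ok := resolveName("monolog/monolog/3.5.0", exists)
	fmt.Println(name, rest, ok)
}
```

Because the shortest existing prefix wins in this sketch, a package whose name is itself a prefix of another package's name would shadow the longer one; how the project handles that case is not stated in the commit.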
Andrew Nesbitt
e45706d808
Track applied migrations to skip column checks on startup (#60)
* Track applied migrations to skip column checks on startup

Add a migrations table that records which migrations have been applied.
On boot, load the set of applied names in one query and only run new ones.
A fully migrated database now does 1 query instead of ~12 HasColumn/HasTable
checks.

Fresh databases created via CreateSchema record all migrations as already
applied. Old databases get the migrations table on first MigrateSchema call
and each migration is recorded after it runs.

Closes #54

* Add benchmark for MigrateSchema on fully migrated database

* Optimize MigrateSchema to single query for fully migrated databases

Skip HasTable/HasColumn checks when the migrations table already exists.
A fully migrated database now does one SELECT instead of ~12 individual
column and table checks.

* Add migration docs and link from architecture

* Add test for upgrade from fully migrated database without migrations table
2026-04-06 13:06:45 +01:00
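The bookkeeping described in e45706d808 reduces to "load the applied set once, run only what is missing, record each success". A minimal in-memory sketch, with a hypothetical `applyPending` and a `record` callback in place of the real migrations table:

```go
package main

import "fmt"

// migration pairs a unique name with the work it performs.
type migration struct {
	name string
	run  func() error
}

// applyPending runs migrations whose names are not in applied and records
// each one after it succeeds. It returns how many migrations ran.
func applyPending(all []migration, applied map[string]bool, record func(name string) error) (int, error) {
	ran := 0
	for _, m := range all {
		if applied[m.name] {
			continue // already recorded: no column or table checks needed
		}
		if err := m.run(); err != nil {
			return ran, fmt.Errorf("migration %s: %w", m.name, err)
		}
		if err := record(m.name); err != nil {
			return ran, err
		}
		applied[m.name] = true
		ran++
	}
	return ran, nil
}

func main() {
	applied := map[string]bool{"001_create_packages": true}
	all := []migration{
		{name: "001_create_packages", run: func() error { return nil }},
		{name: "002_add_column", run: func() error { return nil }},
	}
	ran, err := applyPending(all, applied, func(name string) error {
		fmt.Println("recorded", name)
		return nil
	})
	fmt.Println(ran, err)
}
```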
Andrew Nesbitt
34009bad98
Lazy-load HTML templates behind sync.Once (#59)
Templates are parsed on first Render call instead of at server startup.
API-only traffic never pays the ~780µs parsing cost.

Closes #53
2026-04-06 13:06:25 +01:00
Kevin P. Fleming
ec9c437498
Correct ecosystem name in UI for Go (golang). (#64) 2026-04-05 16:20:57 +01:00
Andrew Nesbitt
beddf8357a
Fix startup message and add connectivity check for S3 storage (#57)
* Fix startup message and add connectivity check for S3 storage

When S3 storage is configured, the startup log incorrectly showed the
default local path (./cache/artifacts) instead of the actual S3 URL.
This also adds a lightweight connectivity check at startup so bad
credentials or endpoints fail immediately rather than on first request.

Add URL() and Close() to the Storage interface so all backends report
their URL and can be cleaned up properly. Rename the stats JSON field
from storage_path to storage_url. Close storage in error paths and
during graceful shutdown.

Fixes #49

* Fix Windows test assertion for file:// URL format

OpenBucket normalizes Windows paths to file:///C:/path (three slashes)
but the test expected file://C:/path (two slashes).
2026-04-03 14:06:51 +01:00
Lily Young
922d44b34e
Add Cargo cooldown support (#48)
* Add Cargo cooldown support

- Added support for cooldowns for cargo
- Added a test to test cooldowns with cargo

* Update README.md

Add Cargo to the list of registries that support cooldowns


* Apply suggestion from @andrew

Co-authored-by: Andrew Nesbitt <andrewnez@gmail.com>
2026-04-01 18:15:07 +01:00
Andrew Nesbitt
5e04182bbd
Add upstream URL tests for all ecosystem download handlers (#51)
Adds regression test for the PyPI double-packages bug fixed in #50,
and adds fetchedURL assertions to every ecosystem that constructs
upstream download URLs (Conda, CRAN, Maven, NuGet, Conan, Debian, RPM).
2026-04-01 15:22:52 +01:00
Andrew Nesbitt
bdc246dc10
Fix container blob caching by passing auth token to fetcher (#44)
* Fix container blob caching by passing auth token to fetcher

The container handler was calling GetOrFetchArtifactFromURL without
authentication headers, causing Docker Hub to return 401. The fallback
proxyBlobWithAuth path had auth but bypassed the cache entirely.

Now passes the Bearer token through GetOrFetchArtifactFromURLWithHeaders
so blobs are both authenticated and cached.

Fixes git-pkgs/proxy#43

* Update registries to v0.4.0

Replace pre-release pseudo-version with the released v0.4.0 now that
git-pkgs/registries#13 has been merged.
2026-04-01 15:22:39 +01:00
Kevin P. Fleming
03ddad10ec
Fix paths for files.pythonhosted.org (#50)
The URLs constructed for downloading package assets from PyPI had
'packages' twice, resulting in 404s.
2026-04-01 14:58:08 +01:00
Andrew Nesbitt
599fe9e254
Fix all golangci-lint issues across the codebase (#32)
* Fix all golangci-lint issues across the codebase

Resolve 77 lint issues reported by golangci-lint with gocritic, gocognit,
gocyclo, maintidx, dupl, mnd, unparam, ireturn, goconst, and errcheck
enabled. Net reduction of ~175 lines through shared helpers and
deduplication.

* Suppress staticcheck SA1019 for intentional deprecated field usage

The Storage.Path field is deprecated but still read for backwards
compatibility with existing configs that haven't migrated to the URL field.
2026-03-18 10:59:29 +00:00
Andrew Nesbitt
3afa5e050d
Add handler download flow and server utility tests
Covers HTTP download paths for gem, hex, go, conda, cran, and maven
handlers with cache hit, invalid input, and upstream proxy scenarios.
Adds server tests for formatTimeAgo, formatSize, categorizeLicense,
LoggerMiddleware, search/pagination, and API packages list endpoint.
2026-03-17 20:31:54 +00:00
Andrew Nesbitt
d820f75fa6
Add direct tests for handler core methods, template rendering, and query validation 2026-03-13 17:05:14 +00:00
Andrew Nesbitt
240a61c537
Merge pull request #28 from git-pkgs/add-handler-tests
Add tests for Conan and NuGet handlers
2026-03-13 08:23:20 +00:00
Andrew Nesbitt
e2a683c7a6
Route handler metadata requests through Proxy.HTTPClient instead of http.DefaultClient
All handler metadata and proxy requests were using http.DefaultClient directly,
bypassing any timeout or transport configuration. Added an HTTPClient field to
the Proxy struct with a 30-second default timeout, and updated every handler
to use it for upstream HTTP requests.
2026-03-13 07:46:28 +00:00
Andrew Nesbitt
06483d2d5c
Add tests for Conan and NuGet handlers
Tests cover ping endpoints, upstream proxying, service index rewriting,
URL rewriting, file caching decisions, header forwarding, error handling,
query string preservation, status code passthrough, and input validation.
2026-03-13 07:43:28 +00:00
Andrew Nesbitt
0e1a06c5e6
Add size limits on request bodies and upstream metadata reads
POST endpoints (/api/outdated, /api/bulk) now reject bodies over 1 MB
using http.MaxBytesReader. Upstream metadata reads (npm, pypi, composer,
nuget, pub) now use io.LimitReader capped at 50 MB to prevent OOM from
unexpectedly large responses.
2026-03-13 07:28:20 +00:00
Andrew Nesbitt
38213d9631
Merge pull request #23 from git-pkgs/fix-browse-xss
Fix XSS in browse source file tree
2026-03-13 07:25:57 +00:00
Andrew Nesbitt
3ec353c624
Merge pull request #24 from git-pkgs/fix-error-disclosure
Stop leaking internal error details to clients
2026-03-13 07:25:44 +00:00
Andrew Nesbitt
bf7e83efe3
Reject path traversal in debian and rpm handlers
The debian and rpm handlers take the request path and pass it directly
to the upstream URL without checking for ".." segments. This could let
a client craft a request that reaches unintended upstream paths.

Add a containsPathTraversal check at the entry point of both handlers
and return 400 for any path containing ".." segments.
2026-03-12 12:05:52 +00:00
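One plausible shape for the `containsPathTraversal` check named in bf7e83efe3 is a segment-by-segment comparison. The function name comes from the commit; the body below is an assumption, not the project's code.

```go
package main

import (
	"fmt"
	"strings"
)

// containsPathTraversal reports whether any slash-separated segment is "..".
// Names that merely contain two dots, such as "a..b", are not flagged.
func containsPathTraversal(p string) bool {
	for _, segment := range strings.Split(p, "/") {
		if segment == ".." {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(containsPathTraversal("dists/../../etc/passwd"))
	fmt.Println(containsPathTraversal("pool/main/a..b/pkg_1.0.deb"))
}
```

A handler would call this on the request path at its entry point and respond with 400 when it returns true.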
Andrew Nesbitt
3d6ebc9522
Stop leaking internal error messages in API and health responses
Replace err.Error() in HTTP error responses with generic messages.
Internal details like database driver errors and enrichment failures
were being sent directly to clients.
2026-03-12 12:01:29 +00:00
Andrew Nesbitt
9e97a3316a
Escape user-controlled strings in browse source JavaScript
File paths from archive contents were interpolated directly into onclick
handlers and innerHTML via template literals. A crafted filename containing
quotes could break out of the string context and execute arbitrary JS.

Add an escapeHTML helper and use it on all interpolated path and URL values
in the browse source page.
2026-03-12 11:59:14 +00:00
Andrew Nesbitt
82443e137f
Add generated OpenAPI docs support 2026-03-12 11:49:31 +00:00
Andrew Nesbitt
fe32236a57
Remove hard-coded ecosystems from templates 2026-03-11 17:25:47 +00:00
Andrew Nesbitt
4f8f63f354
Add version cooldown to filter recently published packages
Hides package versions published too recently from metadata responses,
giving the community time to spot malicious releases. Configurable
per-ecosystem and per-package with duration overrides. Supported for
npm, PyPI, pub.dev, and Composer.
2026-03-04 19:00:31 +00:00
Andrew Nesbitt
364549ad14
Replace inline PURL construction with purl library
Uses purl.MakePURLString() instead of fmt.Sprintf("pkg:...") for
correct namespace handling (npm scopes, Go module paths, Maven group
IDs) and percent-encoding. Replaces hand-rolled extractEcosystem and
inline PURL parsing in the bulk lookup fallback with purl.Parse().
2026-03-04 09:20:16 +00:00
Andrew Nesbitt
07778d9727
Replace internal/diff with archives/diff
The diff package has been extracted into the archives module where it
belongs, since it operates on archives.Reader. This removes the internal
copy and imports from github.com/git-pkgs/archives/diff instead.
2026-02-27 10:55:10 +00:00
Andrew Nesbitt
be8c4b9860
Replace internal/upstream with registries/fetch
Use the new client/ and fetch/ sub-packages from git-pkgs/registries
instead of the local upstream package. The fetcher, circuit breaker, and
resolver now live in registries where they can be shared across projects.

Depends on git-pkgs/registries#8.
2026-02-20 17:31:12 +00:00
Andrew Nesbitt
e6645f38c9
Fix staticcheck QF1012 lint warnings in diff package 2026-02-20 07:53:24 +00:00
Andrew Nesbitt
4d5098e044
Remove internal/archive package, replaced by git-pkgs/archives dependency 2026-02-16 11:11:48 +00:00
Andrew Nesbitt
8ddf07587c
Fix golangci-lint errors (errcheck, staticcheck, unused) 2026-02-16 10:53:21 +00:00
Andrew Nesbitt
e35394bee3
Use shared github.com/git-pkgs/enrichment module 2026-02-06 10:37:00 +00:00
Andrew Nesbitt
2d7cb8eae5
Refactoring and features 2026-02-03 22:40:40 +00:00
Andrew Nesbitt
9c974a0a81
Fix Windows CI test failures
- Use proper file:/// URL format for Windows paths in blob storage tests
- Accept both text/javascript and application/javascript MIME types
2026-01-29 21:12:13 +00:00
Andrew Nesbitt
658e9621d8
Add Container, Debian, RPM handlers and enrichment API
Adds proxy support for Docker/OCI container registries, Debian/APT
repositories, and RPM/Yum repositories. Includes a new enrichment API
for package metadata, vulnerability scanning, and outdated detection.

Updates the dashboard with Tailwind CSS, dark mode support, and a
security overview section showing vulnerability counts.
2026-01-29 19:35:15 +00:00
Andrew Nesbitt
1eb9d71bd7
Align database schema with git-pkgs for compatibility
The proxy can now use an existing git-pkgs database as a starting point.
Packages and versions tables match git-pkgs schema, using PURL-based
references instead of integer IDs. The proxy adds its own artifacts
table for caching functionality.
2026-01-29 16:44:01 +00:00
Andrew Nesbitt
fcc5289f97
Add auth pass-through for upstream registries
Configure authentication per URL prefix in config:

  upstream:
    auth:
      "https://registry.npmjs.org":
        type: bearer
        token: "${NPM_TOKEN}"

Supports bearer tokens, basic auth, and custom headers.
Credentials can reference environment variables with ${VAR_NAME} syntax.
The longest matching URL prefix wins when multiple patterns match.
2026-01-29 16:33:09 +00:00
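The longest-prefix rule and `${VAR_NAME}` expansion from fcc5289f97 can be sketched as below. `authRule` and `matchAuth` are hypothetical names, and `os.Expand` is a stand-in for whatever expansion the project uses (it also expands the bare `$VAR` form, which the commit does not mention).

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// authRule is a hypothetical, simplified auth entry keyed by URL prefix.
type authRule struct {
	Type  string
	Token string
}

// matchAuth picks the rule with the longest prefix matching target and
// expands ${VAR_NAME} references in its token using lookup.
func matchAuth(rules map[string]authRule, target string, lookup func(string) string) (authRule, bool) {
	best, found := "", false
	for prefix := range rules {
		if strings.HasPrefix(target, prefix) && (!found || len(prefix) > len(best)) {
			best, found = prefix, true
		}
	}
	if !found {
		return authRule{}, false
	}
	rule := rules[best]
	rule.Token = os.Expand(rule.Token, lookup)
	return rule, true
}

func main() {
	rules := map[string]authRule{
		"https://registry.npmjs.org": {Type: "bearer", Token: "${NPM_TOKEN}"},
	}
	rule, ok := matchAuth(rules, "https://registry.npmjs.org/lodash", os.Getenv)
	fmt.Println(rule.Type, ok)
}
```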
Andrew Nesbitt
ba754f8a79
Add gocloud.dev/blob for S3 and filesystem storage
Replace custom filesystem storage with gocloud.dev/blob for unified
storage backend support.

Supported backends:
- file:///path/to/dir - Local filesystem (default)
- s3://bucket-name - Amazon S3
- s3://bucket?endpoint=http://localhost:9000 - S3-compatible (MinIO)

Configuration via:
- CLI flag: -storage-url
- Environment: PROXY_STORAGE_URL
- Config file: storage.url

The old storage.path config is deprecated but still supported.
2026-01-29 16:13:16 +00:00
Andrew Nesbitt
41aa11ab66
Add sqlx with SQLite default and PostgreSQL option
Replace raw database/sql with jmoiron/sqlx for cleaner query handling.
Support both SQLite (default) and PostgreSQL as configurable backends.

Configuration via:
- CLI flags: -database-driver, -database-path, -database-url
- Environment: PROXY_DATABASE_DRIVER, PROXY_DATABASE_PATH, PROXY_DATABASE_URL
- Config file: database.driver, database.path, database.url

Tests run against both databases when PROXY_DATABASE_URL is set.
2026-01-29 16:06:56 +00:00
Andrew Nesbitt
7b1bdf75f7
Extract checkCache helper to reduce duplication 2026-01-21 22:47:23 +00:00
Andrew Nesbitt
7b22638ef7
Hello world 2026-01-20 22:00:31 +00:00