* chore: merge authorization contexts
Instead of 2 auth contexts from apikey and dbauthz, merge them to
just use dbauthz. It is annoying to have two.
* fixup authorization reference
pbkdf2 is too expensive to run in init, so this change makes it load
lazily. I introduced a lazy package that I hope to use more in my
`GODEBUG=inittrace=1` adventure.
Benchmark results:
```
$ hyperfine "coder --help" "coder-new --help"
Benchmark 1: coder --help
Time (mean ± σ): 82.1 ms ± 3.8 ms [User: 93.3 ms, System: 30.4 ms]
Range (min … max): 72.2 ms … 90.7 ms 35 runs
Benchmark 2: coder-new --help
Time (mean ± σ): 52.0 ms ± 4.3 ms [User: 62.4 ms, System: 30.8 ms]
Range (min … max): 41.9 ms … 62.2 ms 52 runs
Summary
coder-new --help ran
1.58 ± 0.15 times faster than coder --help
```
Currently, importing `codersdk` just to interact with the API requires
importing tailscale, which causes builds to fail unless manually using
our fork.
This cleans up `root.go` a bit, adds tests for middleware HTTP transport
functions, and removes two HTTP requests we always always performed previously
when executing *any* client command.
It should improve CLI performance (especially for users with higher latency).
This PR updates the tests in `insights_test.go` to enable commented-out scenarios. This behavior was fixed by previous PRs in this stack. Note that the updated golden files are correct since they are "second template only" meaning that the newly introduced data is considered as expected. In other golden files there is no change since "only count once" is applied.
This PR updates the `*ByTempalte` insights queries used for generating Prometheus metrics to behave the same way as the new rollup query and re-written insights queries that utilize the rolled up data.
Add `dbrollup` service that runs the `UpsertTemplateUsageStats` query
every 5 minutes, on the minute. This allows us to have fairly real-time
insights data when viewing "today".
Add `template_usage_stats` table for aggregating tempalte usage data.
Data is rolled up by the `UpsertTemplateUsageStats` query, which fetches
data from the `workspace_agent_stats` and `workspace_app_stats` tables.
- Updates existing tests under workspaceapps/apptest to not reuse existing appDetails as assertWorkspaceLastUsed(Not)?Updated calls FlushStats() which was causing racy test behaviour and incorrect test assertions.
- Expands scope of assertWorkspaceLastUsedAtUpdated and its counterpart to ProxySubdomain tests.
This more closely aligns with GitHub's label search style. Actual search params need to be converted to allow this format, by default they will throw an error if they do not support listing.
This PR updates the coder port-forward command to periodically inform coderd that the workspace is being used:
- Adds workspaceusage.Tracker which periodically batch-updates workspace LastUsedAt
- Adds coderd endpoint to signal workspace usage
- Updates coder port-forward to periodically hit this endpoint
- Modifies BatchUpdateWorkspacesLastUsedAt to avoid overwriting with stale data
Co-authored-by: Danny Kopping <danny@coder.com>
* chore: remove max_ttl from templates
Completely removing max_ttl as a feature on template scheduling. Must use other template scheduling features to achieve autostop.
* chore: add org ID as optional param to AcquireJob
* chore: plumb through organization id to provisioner daemons
* add org id to provisioner domain key
* enforce org id argument
* dbgen provisioner jobs defaults to default org
* fix: separate signals for passive, active, and forced shutdown
`SIGTERM`: Passive shutdown stopping provisioner daemons from accepting new
jobs but waiting for existing jobs to successfully complete.
`SIGINT` (old existing behavior): Notify provisioner daemons to cancel in-flight jobs, wait 5s for jobs to be exited, then force quit.
`SIGKILL`: Untouched from before, will force-quit.
* Revert dramatic signal changes
* Rename
* Fix shutdown behavior for provisioner daemons
* Add test for graceful shutdown
Apptest requires a port without a listening server to test failure
cases. This port was chosen and had a chance of actually being
provisioned. To prevent this accident, a port <1k is chosen,
since those will never be allocated.
fixes#11950https://github.com/coder/coder/issues/11950#issuecomment-1987756088 explains the bug
We were also calling into `Unlisten()` and `Close()` while holding the mutex. I don't believe that `Close()` depends on the notification loop being unblocked, but it's hard to be sure, and the safest thing to do is assume it could block.
So, I added a unit test that fakes out `pq.Listener` and sends a bunch of notifies every time we call into it to hopefully prevent regression where we hold the mutex while calling into these functions.
It also removes the use of a `context.Context` to stop the PubSub -- it must be explicitly `Closed()`. This simplifies a bunch of the logic, and is how we use the pubsub anyway.
* coderd: add test to reproduce trailing directory issue
* coderd: add trailing path separator to dir entries when converting to zip
* provisionersdk: add trailing path separator to directory entries
This fixes a vulnerability with the `CODER_OIDC_EMAIL_DOMAIN` option,
where users with a superset of the allowed email domain would be allowed
to login. For example, given `CODER_OIDC_EMAIL_DOMAIN=google.com`, a
user would be permitted entry if their email domain was
`colin-google.com`.
This adds the ability for `TunnelAuth` to also authorize incoming wireguard node IPs, preventing agents from reporting anything other than their static IP generated from the agent ID.
- Adds more testcases to TestAcquirer_MatchTags
- Adds functionality to generate a table from above test
- Update provisioner tag documentation with generated table
- Apply other feedback from #12315
DERP mesh key setup would do a SELECT and then an INSERT on failure, without a lock. During some testing with multiple replicas, I managed to cause a replica to crash due to them initializing simultaneously.
Fixes:
Encountered an error running "coder server"
create coder API: insert mesh key: pq: duplicate key value violates unique constraint "site_configs_key_key"
Co-authored-by: Cian Johnston <cian@coder.com>
* fix(coderd): mark provisioner daemon psk as secret
Marks provisioner daemon PSK with the secret annotation.
This ensures it will be scrubbed from API requests to
/api/v2/deployment/config.
* make gen
* chore: drop github per user rate limit tracking
Rate limits for authenticated requests are per user.
This would be an excessive number of prometheus labels,
so we only track the unauthorized limit.
Moves healthcheck report-related structs from coderd/healthcheck to codersdk
This prevents an import cycle when adding a codersdk.Client method to hit /api/v2/debug/health.
In anticipation of needing the `LogSender` to run on a context that doesn't get immediately canceled when you `Close()` the agent, I've undertaken a little refactor to manage the goroutines that get run against the Tailnet and Agent API connection.
This handles controlling two contexts, one that gets canceled right away at the start of graceful shutdown, and another that stays up to allow graceful shutdown to complete.
Alternative solution to #6442
Modifies the behaviour of AcquireProvisionerJob and adds a special case for 'un-tagged' jobs such that they can only be picked up by 'un-tagged' provisioners.
Also adds comprehensive test coverage for AcquireJob given various combinations of tags.