Commit Graph

580 Commits

Author SHA1 Message Date
Colin Adler 205c43da99
fix(enterprise): mark nodes from unhealthy coordinators as lost (#13123)
Instead of removing the mappings of unhealthy coordinators entirely,
mark them as lost instead. This prevents peers from disappearing from
other peers if a coordinator misses a heartbeat.
2024-05-03 14:07:29 -05:00
Steven Masley 09f00c08df
chore: shutdown provisioner should stop waiting on client (#13118)
* chore: shutdown provisioner should stop waiting on client
* chore: add unit test that replicates failed client conn
2024-05-03 10:15:17 -05:00
Garrett Delfosse c550d0641d
feat: move shared ports out of experiment (#13120) 2024-05-02 14:11:33 -04:00
Steven Masley 845407fe7a
chore: cover deadline crossing autostart border on start (#13115)
When starting a workspace, if the deadline crosses an autostart boundary, the deadline is set to autostart + TTL. 
This copies the behavior in `ActivityBumpWorkspace`, but does not require activity.
2024-05-01 10:43:04 -05:00
Kyle Carberry 1bda8a0856
feat: add `deployment_id` to the ui and licenses (#13096)
* feat: expose `deployment_id` in the user dropdown

* feat: add license deployment_id verification

* Ignore wireguard.com from mlc config
2024-04-29 16:50:11 -04:00
Kayla Washburn-Love 74f27719b8
feat: specify a custom "terms of service" link (#13068) 2024-04-25 16:36:51 -06:00
Kyle Carberry 227e632053
fix: add grace period before showing replicas license error (#12989)
Fixes #8665.
2024-04-17 13:30:32 -04:00
Colin Adler 777dfbe965
feat(enterprise): add ready for handshake support to pgcoord (#12935) 2024-04-16 15:01:10 -05:00
Kayla Washburn-Love 2ad7fcc0b7
fix: show template autostop setting when it overrides the workspace setting (#12910) 2024-04-11 13:08:51 -06:00
Colin Adler e801e878ba
feat: add agent acks to in-memory coordinator (#12786)
When an agent receives a node, it responds with an ACK which is relayed
to the client. After the client receives the ACK, it's allowed to begin
pinging.
2024-04-10 17:15:33 -05:00
Spike Curtis 06eae954c9
fix: stop sending DeleteTailnetPeer when coordinator is unhealthy (#12925)
fixes #12923

Prevents Coordinate peer connections from generating spurious database queries like DeleteTailnetPeer when the coordinator is unhealthy.

It does this by checking the health of the querier before accepting a connection, rather than unconditionally accepting it only for it to get swatted down later.
2024-04-10 22:49:13 +04:00
Steven Masley a607d5610e
chore: disable pgcoord (HA) when --in-memory (#12919)
* chore: disable pgcoord (HA) when --in-memory

HA does not make any sense while using in-memory database
2024-04-10 11:05:55 -05:00
Steven Masley 838e8df5be
chore: merge apikey/token session config values (#12817)
* chore: merge apikey/token session config values

There is a confusing difference between an apikey and a token. This
difference leaks into our configs. This change does not resolve the
difference. It only groups the config values to try and manage any
bloat that occurs from adding more similar config values
2024-04-10 10:34:49 -05:00
Steven Masley eeb3d63be6
chore: merge authorization contexts (#12816)
* chore: merge authorization contexts

Instead of 2 auth contexts from apikey and dbauthz, merge them to
just use dbauthz. It is annoying to have two.

* fixup authorization reference
2024-03-29 10:14:27 -05:00
Colin Adler dc8cf3eea5
fix: nil ptr dereference when removing a license (#12785) 2024-03-27 15:59:35 -05:00
Colin Adler 4d5a7b2d56
chore(codersdk): move all tailscale imports out of `codersdk` (#12735)
Currently, importing `codersdk` just to interact with the API requires
importing tailscale, which causes builds to fail unless manually using
our fork.
2024-03-26 12:44:31 -05:00
Asher 40e5ad5499
feat: make OAuth2 provider not enterprise-only (#12732) 2024-03-25 11:52:22 -08:00
Kyle Carberry 03ab37b343
chore: remove middleware to request version and entitlement warnings (#12750)
This cleans up `root.go` a bit, adds tests for middleware HTTP transport
functions, and removes two HTTP requests we always always performed previously
when executing *any* client command.

It should improve CLI performance (especially for users with higher latency).
2024-03-25 15:01:42 -04:00
Colin Adler 37a05372fa
fix: disable relay if built-in DERP is disabled (#12654)
Fixes https://github.com/coder/coder/issues/12493
2024-03-21 16:53:41 -05:00
Cian Johnston 5454f4997b
chore(ci): clean up databases after test finishes in CI (#12702) 2024-03-21 14:53:16 +00:00
Steven Masley bd6ad88077
chore: nolint always return error function (#12701) 2024-03-21 09:35:10 -05:00
Steven Masley 131d0bd2ba
chore: fix linting issue in main(#12697) 2024-03-20 20:15:01 -05:00
Dean Sheather 2b773f9034
fix: allow proxy version mismatch (with warning) (#12433) 2024-03-20 18:24:18 +00:00
Garrett Delfosse 4d9fe05f5a
feat: add awsiamrds db auth driver (#12566) 2024-03-20 13:14:43 -04:00
Steven Masley d789a60d47
chore: remove max_ttl from templates (#12644)
* chore: remove max_ttl from templates

Completely removing max_ttl as a feature on template scheduling. Must use other template scheduling features to achieve autostop.
2024-03-20 10:37:57 -05:00
Steven Masley f0f9569d51
chore: enforce that provisioners can only acquire jobs in their own organization (#12600)
* chore: add org ID as optional param to AcquireJob
* chore: plumb through organization id to provisioner daemons
* add org id to provisioner domain key
* enforce org id argument
* dbgen provisioner jobs defaults to default org
2024-03-18 12:48:13 -05:00
Marcin Tojek 0e8ebb9b22
fix: fix flaky TestWorkspaceProxy_Server_PrometheusEnabled (#12645) 2024-03-18 16:43:41 +01:00
Ammar Bandukwala e5cc17af92
chore(cli): hide --organization (#12626)
It's potentially confusing to users since we aren't fleshing out organizations right now.
2024-03-18 09:09:26 -05:00
Dean Sheather cf50461ab4
fix: prevent single replica proxies from staying unhealthy (#12641)
In the peer healthcheck code, when an error pinging peers is detected we
write a "replicaErr" string with the error reason. However, if there are
no peer replicas to ping we returned early without setting the string to
empty. This would cause replicas that had peers (which were failing) and
then the peers left to permanently show an error until a new peer
appeared.

Also demotes DERP replica checking to a "warning" rather than an "error"
which should prevent the primary from removing the proxy from the region
map if DERP meshing is non-functional. This can happen without causing
problems if the peer is shutting down so we don't want to disrupt
everything if there isn't an issue.
2024-03-18 23:45:25 +10:00
Ammar Bandukwala b4c0fa80d8
chore(cli): rename Cmd to Command (#12616)
I think Command is cleaner and my original decision to use "Cmd"
a mistake.

Plus this creates better parity with cobra.
2024-03-17 09:45:26 -05:00
Ammar Bandukwala 496232446d
chore(cli): replace clibase with external `coder/serpent` (#12252) 2024-03-15 11:24:38 -05:00
Kyle Carberry 895df54051
fix: separate signals for passive, active, and forced shutdown (#12358)
* fix: separate signals for passive, active, and forced shutdown

`SIGTERM`: Passive shutdown stopping provisioner daemons from accepting new
jobs but waiting for existing jobs to successfully complete.

`SIGINT` (old existing behavior): Notify provisioner daemons to cancel in-flight jobs, wait 5s for jobs to be exited, then force quit.

`SIGKILL`: Untouched from before, will force-quit.

* Revert dramatic signal changes

* Rename

* Fix shutdown behavior for provisioner daemons

* Add test for graceful shutdown
2024-03-15 13:16:36 +00:00
Garrett Delfosse 0723dd3abf
fix: ensure agent token is from latest build in middleware (#12443) 2024-03-14 12:27:32 -04:00
Danny Kopping 7a7105ad66
feat: make agent stats' cardinality configurable (#12535) 2024-03-13 12:03:36 +02:00
Cian Johnston 8f40ee3465
Revert "feat: make agent stats' cardinality configurable (#12468)" (#12533)
This reverts commit 21d1873d97.
2024-03-11 14:33:36 +00:00
Danny Kopping 21d1873d97
feat: make agent stats' cardinality configurable (#12468)
Closes #12221
2024-03-11 16:04:08 +02:00
Dean Sheather d2a5b31b2b
feat: add derp mesh health checking in workspace proxies (#12222) 2024-03-08 16:31:40 +10:00
Colin Adler 6b0b87eb27
fix: add `--block-direct-connections` to wsproxies (#12182) 2024-03-07 23:45:59 -06:00
Colin Adler 66154f937e
fix(coderd): pass block endpoints into servertailnet (#12149) 2024-03-08 05:29:54 +00:00
Dean Sheather 586586e9dd
fix: do not set max deadline for workspaces on template update (#12446)
* fix: do not set max deadline for workspaces on template update

When templates are updated and schedule data is changed, we update all
running workspaces to have up-to-date scheduling information that sticks
to the new policy.

When updating the max_deadline for existing running workspaces, if the
max_deadline was before now()+2h we would set the max_deadline to
now()+2h.

Builds that don't/shouldn't have a max_deadline have it set to 0, which
is always before now()+2h, and thus would always have the max_deadline
updated.

* test: add unit test to excercise template schedule bug
---------

Co-authored-by: Steven Masley <stevenmasley@gmail.com>
2024-03-07 09:42:50 -06:00
Steven Masley b5f866c1cb
chore: add organization_id column to provisioner daemons (#12356)
* chore: add organization_id column to provisioner daemons
* Update upsert to include organization id on set
2024-03-06 12:04:50 -06:00
Dean Sheather 46a2ff1061
feat: allow setting port share protocol (#12383)
Co-authored-by: Garrett Delfosse <garrett@coder.com>
2024-03-06 09:23:57 -05:00
Garrett Delfosse 61bd341a36
chore: change max share level on existing port shares (#12411) 2024-03-05 13:47:01 -05:00
Dean Sheather 0016b0200b
chore: add test for workspace proxy derp meshing (#12220)
- Reworks the proxy registration loop into a struct (so I can add a `RegisterNow` method)
- Changes the proxy registration loop interval to 15s (previously 30s)
- Adds test which tests bidirectional DERP meshing on all possible paths between 6 workspace proxy replicas

Related to https://github.com/coder/customers/issues/438
2024-03-04 23:40:15 -08:00
Steven Masley 5c6974e55f
feat: implement provisioner auth middleware and proper org params (#12330)
* feat: provisioner auth in mw to allow ExtractOrg

Step to enable org scoped provisioner daemons

* chore: handle default org handling for provisioner daemons
2024-03-04 15:15:41 -06:00
Mathias Fredriksson 4ce1448bbe
fix(cli): generate correctly named file in DumpHandler (#12409) 2024-03-04 18:35:33 +02:00
Colin Adler e5d911462f
fix(tailnet): enforce valid agent and client addresses (#12197)
This adds the ability for `TunnelAuth` to also authorize incoming wireguard node IPs, preventing agents from reporting anything other than their static IP generated from the agent ID.
2024-03-01 09:02:33 -06:00
Cian Johnston eba8cd7c07
chore: consolidate various randomPort() implementations (#12362)
Consolidates our existing randomPort() implementations to package testutil
2024-02-29 12:51:44 +00:00
Cian Johnston 2bf3c72948
chore: add test for enterprise server cli (#12353) 2024-02-29 10:25:50 +00:00
Marcin Tojek 30d9d84758
fix: use flag to enable Prometheus (#12345) 2024-02-28 17:58:03 +01:00