Commit Graph

181 Commits

Author SHA1 Message Date
Kayla Washburn-Love d8e0be6ee6
feat: add support for multiple banners (#13081) 2024-05-08 15:40:43 -06:00
Spike Curtis d51c6912a7
fix: make handleManifest always signal dependents (#13141)
Fixes #13139

Using a bare channel to signal dependent goroutines means that we can only signal success, not failure, which leads to deadlock if we fail in a way that doesn't cause the whole `apiConnRoutineManager` to tear down routines.

Instead, we use a new object called a `checkpoint` that signals success or failure, so that dependent routines get unblocked if the routine they depend on fails.
2024-05-06 14:47:41 +04:00
Spike Curtis 2efb46a10e
chore: remove superfluous context.Canceled handling (#13140)
Removes a check for `context.Canceled` inside the `handleManifest` routine.  This checking is handled in the `apiConnRoutineManager`, so checking inside the handler is redundant.
2024-05-06 14:33:16 +04:00
Cian Johnston 99dda4a43a
fix(agent): keep track of lastReportIndex between invocations of reportLifecycle() (#13075) 2024-04-25 16:54:51 +01:00
Jon Ayers 426e9f2b96
feat: support adjusting child proc oom scores (#12655) 2024-04-03 09:42:03 -05:00
Colin Adler 4d5a7b2d56
chore(codersdk): move all tailscale imports out of `codersdk` (#12735)
Currently, importing `codersdk` just to interact with the API requires
importing tailscale, which causes builds to fail unless manually using
our fork.
2024-03-26 12:44:31 -05:00
Cian Johnston b0c4e7504c
feat(support): add client magicsock and agent prometheus metrics to support bundle (#12604)
* feat(codersdk): add ability to fetch prometheus metrics directly from agent
* feat(support): add client magicsock and agent prometheus metrics to support bundle
* refactor(support): simplify AgentInfo control flow

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
2024-03-15 15:33:49 +00:00
Cian Johnston 653ddccd8e
fix(agent): remove unused token debug handler (#12602) 2024-03-15 09:43:36 +00:00
Cian Johnston 63696d762f
feat(codersdk): add debug handlers for logs, manifest, and token to agent (#12593)
* feat(codersdk): add debug handlers for logs, manifest, and token to agent

* add more logging

* use io.LimitReader instead of seeking
2024-03-14 15:36:12 +00:00
Cian Johnston 3b406878e0
feat(agent): expose HTTP debug server over tailnet API (#12582) 2024-03-14 10:02:01 +00:00
Spike Curtis b0afffbafb
feat: use v2 API for agent metadata updates (#12281)
Switches the agent to report metadata over the v2 API.

Fixes #10534
2024-02-26 09:50:19 +04:00
Spike Curtis aa7a9f5cc4
feat: use v2 API for agent lifecycle updates (#12278)
Agent uses the v2 API to post lifecycle updates.

Part of #10534
2024-02-23 15:24:28 +04:00
Spike Curtis 4cc132cea0
feat: switch agent to use v2 API for sending logs (#12068)
Changes the agent to use the new v2 API for sending logs, via the logSender component.

We keep the PatchLogs function around, but deprecate it so that we can test the v1 endpoint.
2024-02-23 11:27:15 +04:00
Spike Curtis af3fdc68c3
chore: refactor agent routines that use the v2 API (#12223)
In anticipation of needing the `LogSender` to run on a context that doesn't get immediately canceled when you `Close()` the agent, I've undertaken a little refactor to manage the goroutines that get run against the Tailnet and Agent API connection.

This handles controlling two contexts, one that gets canceled right away at the start of graceful shutdown, and another that stays up to allow graceful shutdown to complete.
2024-02-23 11:04:23 +04:00
Mathias Fredriksson b1c0b39d88
feat(agent): add script data dir for binaries and files (#12205)
The agent is extended with a `--script-data-dir` flag, defaulting to the
OS temp dir. This dir is used for storing `coder-script-data/bin` and
`coder-script/[script uuid]`. The former is a place for all scripts to
place executable binaries that will be available by other scripts, SSH
sessions, etc. The latter is a place for the script to store files.

Since we default to OS temp dir, files are ephemeral by default. In the
future, we may consider adding new env vars or changing the default
storage location. Workspace startup speed could potentially benefit from
scripts being able to skip steps that require downloading software. We
may also extend this with more env variables (e.g. persistent storage in
HOME).

Fixes #11131
2024-02-20 13:26:18 +02:00
Mathias Fredriksson c63f569174
refactor(agent/agentssh): move envs to agent and add agentssh config struct (#12204)
This commit refactors where custom environment variables are set in the
workspace and decouples agent specific configs from the `agentssh.Server`.
To reproduce all functionality, `agentssh.Config` is introduced.

The custom environment variables are now configured in `agent/agent.go`
and the agent retains control of the final state. This will allow for
easier extension in the future and keep other modules decoupled.
2024-02-19 16:30:00 +02:00
Spike Curtis 1cf4b62867
feat: change agent to use v2 API for reporting stats (#12024)
Modifies the agent to use the v2 API to report its statistics, using the `statsReporter` subcomponent.
2024-02-07 15:26:41 +04:00
Spike Curtis 1aa117b9ec
chore: rename client Listen to ConnectRPC (#11916)
ConnectRPC seems more appropriate for this function
2024-02-01 14:44:11 +04:00
Spike Curtis 0fc177203e
feat: use agent v2 API to update app health (#11889)
Use the Agent v2 API to update App Health
2024-01-30 11:35:12 +04:00
Spike Curtis 2599850e54
feat: use agent v2 API to post startup (#11877)
Uses the v2 Agent API to post startup information.
2024-01-30 11:23:28 +04:00
Spike Curtis da8bb1c198
feat: use agent v2 API to fetch manifest (#11832)
Agent uses the v2 API to obtain the manifest, instead of the HTTP API.
2024-01-30 10:11:28 +04:00
Spike Curtis 0eff646c31
chore: move proto to sdk conversion to agentsdk (#11831)
`agentsdk` depends on `agent/proto` because it needs to get the version to dial.

Therefore, the conversion routines need to live in `agentsdk` so that we can convert to and from the Manifest.

I briefly considered refactoring the agent to only reference `proto.Manifest`, but decided against it because we might have multiple protocol versions in the future, its useful to have a protocol-independent data structure.
2024-01-30 09:04:56 +04:00
Spike Curtis 13e24f21e4
feat: use Agent v2 API for Service Banner (#11806)
Agent uses the v2 API for the service banner, rather than the v1 HTTP API.

One of several for #10534
2024-01-30 07:44:47 +04:00
Spike Curtis 059e533544
feat: agent uses Tailnet v2 API for DERPMap updates (#11698)
Switches the Agent to use Tailnet v2 API to get DERPMap updates.

Subsequent PRs will do the same for the CLI (`codersdk`) and `wsproxy`.
2024-01-23 14:42:07 +04:00
Spike Curtis f01cab9894
feat: use tailnet v2 API for coordination (#11638)
This one is huge, and I'm sorry.

The problem is that once I change `tailnet.Conn` to start doing v2 behavior, I kind of have to change it everywhere, including in CoderSDK (CLI), the agent, wsproxy, and ServerTailnet.

There is still a bit more cleanup to do, and I need to add code so that when we lose connection to the Coordinator, we mark all peers as LOST, but that will be in a separate PR since this is big enough!
2024-01-22 11:07:50 +04:00
Spike Curtis 4071f1713b
feat: add logging to agent stats and JetBrains tracking (#11364)
Adds logging so we can hope to diagnose #11363
2024-01-02 13:34:49 +04:00
Steven Masley b7bdb17460
feat: add metrics to workspace agent scripts (#11132)
* push startup script metrics to agent
2023-12-13 11:45:43 -06:00
Dean Sheather a9c0c01629
chore: fix flake in listening ports test (#10833) 2023-11-22 09:30:51 +00:00
Mathias Fredriksson 7fecd39e23
fix(agent/agentscripts): display informative error for ErrWaitDelay (#10407)
Fixes #10400
2023-10-27 19:07:26 +03:00
Mathias Fredriksson 1a2aea3a6b
fix(agent): prevent metadata from being discarded if report is slow (#10386) 2023-10-23 17:02:54 +00:00
Mathias Fredriksson 76c65b1e1b
fix(agent): send metadata in batches (#10225)
Fixes #9782

---

I recommend reviewing with ignore whitespace.
2023-10-13 17:48:25 +03:00
Mathias Fredriksson 4857d4bd55
feat(codersdk/agentsdk): use new agent metadata batch endpoint (#10224)
Part of #9782
2023-10-13 17:32:28 +03:00
Mathias Fredriksson 7eeba15d16
feat(coderd): add support for sending batched agent metadata (#10223)
Part of #9782
2023-10-13 16:37:55 +03:00
Jon Ayers 4452a1484d
fix: fix log spam related to skipping custom nice scores (#10206) 2023-10-11 02:32:50 -05:00
Spike Curtis 54fd350913
feat: improve logging for speedtest connections
part of #7963

improve connection logging for speedtest connections
2023-10-09 20:48:28 +04:00
Spike Curtis 17e889af16
feat: improve logging for reconnectingPTY connections
part of #7963

improves connection logging on reconnectingPTY
2023-10-09 20:35:50 +04:00
Kyle Carberry 1262eef2c0
feat: add support for `coder_script` (#9584)
* Add basic migrations

* Improve schema

* Refactor agent scripts into it's own package

* Support legacy start and stop script format

* Pipe the scripts!

* Finish the piping

* Fix context usage

* It works!

* Fix sql query

* Fix SQL query

* Rename `LogSourceID` -> `SourceID`

* Fix the FE

* fmt

* Rename migrations

* Fix log tests

* Fix lint err

* Fix gen

* Fix story type

* Rename source to script

* Fix schema jank

* Uncomment test

* Rename proto to TimeoutSeconds

* Fix comments

* Fix comments

* Fix legacy endpoint without specified log_source

* Fix non-blocking by default in agent

* Fix resources tests

* Fix dbfake

* Fix resources

* Fix linting I think

* Add fixtures

* fmt

* Fix startup script behavior

* Fix comments

* Fix context

* Fix cancel

* Fix SQL tests

* Fix e2e tests

* Interrupt on Windows

* Fix agent leaking script process

* Fix migrations

* Fix stories

* Fix duplicate logs appearing

* Gen

* Fix log location

* Fix tests

* Fix tests

* Fix log output

* Show display name in output

* Fix print

* Return timeout on start context

* Gen

* Fix fixture

* Fix the agent status

* Fix startup timeout msg

* Fix command using shared context

* Fix timeout draining

* Change signal type

* Add deterministic colors to startup script logs

---------

Co-authored-by: Muhammad Atif Ali <atif@coder.com>
2023-09-25 16:47:17 -05:00
Jon Ayers 7311ffbd9d
feat: implement agent process management (#9461)
- An opt-in feature has been added to the agent to allow
   deprioritizing non coder-related processes for CPU by setting their
   niceness level to 10.
- Opting in to the feature requires setting CODER_PROC_PRIO_MGMT to a non-empty value.
2023-09-14 19:45:05 -05:00
Mathias Fredriksson 19d7da3d24
refactor(coderd/database): split `Time` and `Now` into `dbtime` package (#9482)
Ref: #9380
2023-09-01 16:50:12 +00:00
Mathias Fredriksson 702b064cac
refactor: split coderd/gitauth into two, add cli/gitauth (#9479)
* refactor: split coderd/gitauth into two, add cli/gitauth

Ref: #9380
2023-09-01 15:41:22 +00:00
Dean Sheather 64df076328
feat: add server flag to force DERP to use always websockets (#9238) 2023-08-24 17:22:31 +00:00
Kyle Carberry 22e781eced
chore: add /v2 to import module path (#9072)
* chore: add /v2 to import module path

go mod requires semantic versioning with versions greater than 1.x

This was a mechanical update by running:
```
go install github.com/marwan-at-work/mod/cmd/mod@latest
mod upgrade
```

Migrate generated files to import /v2

* Fix gen
2023-08-18 18:55:43 +00:00
Mathias Fredriksson bbaa057e15
fix(agent): log correct script timeout for startup script (#9190) 2023-08-18 17:35:49 +00:00
Colin Adler 344d32b2f1
feat(coderd): expire agents from server tailnet (#9092) 2023-08-14 20:38:37 -05:00
Asher a08f7b8fb9
fix: catch missing output with reconnecting PTY (#9094)
I forgot that waiting on the cond releases the lock so it was possible
to get pty output after writing the buffer but before adding the pty to
the map.  To fix, add the pty to the map while under the same lock where
we read from the buffer.

The rest does not need to be behind the lock so I moved it out of
doAttach, and that also means we no longer need
waitForStateOrContextLocked.

Also, this can hit a logger error saying the attach failed which fails
the tests however it is not that the attach failed, just that the
process already ran and exited, so when the process exits do not
set an error, instead for now assume this is an expected close.
2023-08-14 15:54:23 -08:00
Asher b993cab49a
fix: use screen for reconnecting terminal sessions on Linux if available (#8640)
* Add screen backend for reconnecting ptys

The screen portion is a port from wsep.  There is an interface that lets
you choose between screen and the previous method.  By default it will
choose screen if it is installed but this can be overidden (mostly for
tests).

The tests use a scanner instead of a reader now because the reader will
loop infinitely at the end of a stream.

Replace /bin/bash with bash since bash is not always in /bin.

* Remove connection_id from reconnecting PTY logger

This serves multiple connections so it makes no sense to scope it to a
single connection.

Also lets us use "connection_id" when logging write errors instead of
"other_conn_id".

* Use PATH to test buffered reconnecting pty
2023-08-14 11:19:13 -08:00
Dean Sheather 07fd73c4a0
chore: allow multiple agent subsystems, add exectrace (#8933) 2023-08-08 22:10:28 -07:00
Dean Sheather 3c52b01850
chore: add tailscale magicsock debug logging controls (#8982) 2023-08-08 17:56:08 +00:00
Dean Sheather b955c5fefc
fix: avoid agent runLoop exiting due to ws ping (#8852) 2023-08-02 07:25:07 +00:00
Dean Sheather c575292ba6
fix: fix tailnet netcheck issues (#8802) 2023-08-02 01:50:43 +10:00