coder

Commit Graph

Author	SHA1	Message	Date
Cian Johnston	99dda4a43a	fix(agent): keep track of lastReportIndex between invocations of reportLifecycle() (#13075 )	2024-04-25 16:54:51 +01:00
Jon Ayers	426e9f2b96	feat: support adjusting child proc oom scores (#12655 )	2024-04-03 09:42:03 -05:00
Colin Adler	4d5a7b2d56	chore(codersdk): move all tailscale imports out of `codersdk` (#12735 ) Currently, importing `codersdk` just to interact with the API requires importing tailscale, which causes builds to fail unless manually using our fork.	2024-03-26 12:44:31 -05:00
Cian Johnston	b0c4e7504c	feat(support): add client magicsock and agent prometheus metrics to support bundle (#12604 ) * feat(codersdk): add ability to fetch prometheus metrics directly from agent * feat(support): add client magicsock and agent prometheus metrics to support bundle * refactor(support): simplify AgentInfo control flow Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>	2024-03-15 15:33:49 +00:00
Cian Johnston	653ddccd8e	fix(agent): remove unused token debug handler (#12602 )	2024-03-15 09:43:36 +00:00
Cian Johnston	63696d762f	feat(codersdk): add debug handlers for logs, manifest, and token to agent (#12593 ) * feat(codersdk): add debug handlers for logs, manifest, and token to agent * add more logging * use io.LimitReader instead of seeking	2024-03-14 15:36:12 +00:00
Cian Johnston	3b406878e0	feat(agent): expose HTTP debug server over tailnet API (#12582 )	2024-03-14 10:02:01 +00:00
Colin Adler	e5d911462f	fix(tailnet): enforce valid agent and client addresses (#12197 ) This adds the ability for `TunnelAuth` to also authorize incoming wireguard node IPs, preventing agents from reporting anything other than their static IP generated from the agent ID.	2024-03-01 09:02:33 -06:00
Cian Johnston	eba8cd7c07	chore: consolidate various randomPort() implementations (#12362 ) Consolidates our existing randomPort() implementations to package testutil	2024-02-29 12:51:44 +00:00
Mathias Fredriksson	32691e67e6	test(agent/agentscripts): fix test flake in `TestEnv` (#12326 )	2024-02-27 17:58:10 +00:00
Spike Curtis	b0afffbafb	feat: use v2 API for agent metadata updates (#12281 ) Switches the agent to report metadata over the v2 API. Fixes #10534	2024-02-26 09:50:19 +04:00
Spike Curtis	aa7a9f5cc4	feat: use v2 API for agent lifecycle updates (#12278 ) Agent uses the v2 API to post lifecycle updates. Part of #10534	2024-02-23 15:24:28 +04:00
Spike Curtis	4cc132cea0	feat: switch agent to use v2 API for sending logs (#12068 ) Changes the agent to use the new v2 API for sending logs, via the logSender component. We keep the PatchLogs function around, but deprecate it so that we can test the v1 endpoint.	2024-02-23 11:27:15 +04:00
Spike Curtis	af3fdc68c3	chore: refactor agent routines that use the v2 API (#12223 ) In anticipation of needing the `LogSender` to run on a context that doesn't get immediately canceled when you `Close()` the agent, I've undertaken a little refactor to manage the goroutines that get run against the Tailnet and Agent API connection. This handles controlling two contexts, one that gets canceled right away at the start of graceful shutdown, and another that stays up to allow graceful shutdown to complete.	2024-02-23 11:04:23 +04:00
Mathias Fredriksson	b1c0b39d88	feat(agent): add script data dir for binaries and files (#12205 ) The agent is extended with a `--script-data-dir` flag, defaulting to the OS temp dir. This dir is used for storing `coder-script-data/bin` and `coder-script/[script uuid]`. The former is a place for all scripts to place executable binaries that will be available by other scripts, SSH sessions, etc. The latter is a place for the script to store files. Since we default to OS temp dir, files are ephemeral by default. In the future, we may consider adding new env vars or changing the default storage location. Workspace startup speed could potentially benefit from scripts being able to skip steps that require downloading software. We may also extend this with more env variables (e.g. persistent storage in HOME). Fixes #11131	2024-02-20 13:26:18 +02:00
Spike Curtis	081e37d7d9	chore: move LogSender to agentsdk (#12158 ) Moves the LogSender to agentsdk and deprecates LogsSender based on the v1 API.	2024-02-20 10:44:20 +04:00
Mathias Fredriksson	c63f569174	refactor(agent/agentssh): move envs to agent and add agentssh config struct (#12204 ) This commit refactors where custom environment variables are set in the workspace and decouples agent specific configs from the `agentssh.Server`. To reproduce all functionality, `agentssh.Config` is introduced. The custom environment variables are now configured in `agent/agent.go` and the agent retains control of the final state. This will allow for easier extension in the future and keep other modules decoupled.	2024-02-19 16:30:00 +02:00
Mathias Fredriksson	0442ee5fa8	fix(agent/reconnectingpty): fix screen startup speed by disabling messages (#12190 )	2024-02-16 22:37:02 +02:00
Spike Curtis	2aff014e5d	feat: add logSender for sending logs on agent v2 API (#12046 ) Adds a new subcomponent of the agent for queueing up logs until they can be sent over the Agent API. Subsequent PR will change the agent to use this instead of the HTTP API for posting logs. Relates to #10534	2024-02-15 16:57:17 +04:00
Dean Sheather	2fc3064653	chore: add tests for app ID copy in app healths (#12088 )	2024-02-12 05:49:48 +00:00
Spike Curtis	92b2e26a48	feat: send log limit exceeded in response, not error (#12078 ) When we exceed the db-imposed limit of logs, we need to communicate that back to the agent. In v1 we did it with a 4xx-level HTTP status, but with dRPC, the errors are delivered as strings, which feels fragile to me for something we want to gracefully handle. So, this PR adds the log limit exceeded as a field on the response message, and fixes the API handler to set it as appropriate instead of an error.	2024-02-09 16:17:20 +04:00
Colin Adler	ec8e41f516	chore: add logging around agent app health reporting (#12071 )	2024-02-08 23:37:44 -06:00
Spike Curtis	1cf4b62867	feat: change agent to use v2 API for reporting stats (#12024 ) Modifies the agent to use the v2 API to report its statistics, using the `statsReporter` subcomponent.	2024-02-07 15:26:41 +04:00
Mathias Fredriksson	f2aef0726b	fix(agent/agentssh): allow scp to exit with zero status (#12028 ) Fixes #11786	2024-02-07 10:22:31 +02:00
Spike Curtis	1aa117b9ec	chore: rename client Listen to ConnectRPC (#11916 ) ConnectRPC seems more appropriate for this function	2024-02-01 14:44:11 +04:00
Spike Curtis	eb03e4490a	feat: add statsReporter for reporting stats on agent v2 API (#11920 ) Adds a new statsReporter subcomponent of the agent, which in a later PR will be used to report stats over the v2 API. Refactors the logic a bit so that we can handle starting and stopping stats reporting if the agent API connection drops and reconnects.	2024-02-01 08:21:01 +04:00
Spike Curtis	0fc177203e	feat: use agent v2 API to update app health (#11889 ) Use the Agent v2 API to update App Health	2024-01-30 11:35:12 +04:00
Spike Curtis	2599850e54	feat: use agent v2 API to post startup (#11877 ) Uses the v2 Agent API to post startup information.	2024-01-30 11:23:28 +04:00
Spike Curtis	da8bb1c198	feat: use agent v2 API to fetch manifest (#11832 ) Agent uses the v2 API to obtain the manifest, instead of the HTTP API.	2024-01-30 10:11:28 +04:00
Spike Curtis	9cf4e7f15a	fix: prevent agent_test.go from failing on error logs (#11909 ) We're failing tests on error logs like this: https://github.com/coder/coder/actions/runs/7706053882/job/21000984583 Unfortunately, the error we hit, when the underlying connection is closed, is unexported, so we can't specifically ignore it. Part of the issue is that agent.Close() doesn't wait for these goroutines to complete before returning, so the test harness proceeds to close the connection. This looks to our product code like the network connection failing. It would be possible to fix this, but just doesn't seem worth it for the extra insurance of catching other error logs in these tests.	2024-01-30 10:04:01 +04:00
Spike Curtis	0eff646c31	chore: move proto to sdk conversion to agentsdk (#11831 ) `agentsdk` depends on `agent/proto` because it needs to get the version to dial. Therefore, the conversion routines need to live in `agentsdk` so that we can convert to and from the Manifest. I briefly considered refactoring the agent to only reference `proto.Manifest`, but decided against it because we might have multiple protocol versions in the future, its useful to have a protocol-independent data structure.	2024-01-30 09:04:56 +04:00
Spike Curtis	13e24f21e4	feat: use Agent v2 API for Service Banner (#11806 ) Agent uses the v2 API for the service banner, rather than the v1 HTTP API. One of several for #10534	2024-01-30 07:44:47 +04:00
Spike Curtis	207328ca50	feat: use appearance.Fetcher in agentapi (#11770 ) This PR updates the Agent API to use the appearance.Fetcher, which is set by entitlement code in Enterprise coderd. This brings the agentapi into compliance with the Enterprise feature.	2024-01-29 21:22:50 +04:00
Steven Masley	081fbef097	fix: code-server path based forwarding, defer to code-server (#11759 ) Do not attempt to construct a path based port forward url. Always defer to code server, as it has it's own proxy method.	2024-01-23 11:36:44 -06:00
Spike Curtis	059e533544	feat: agent uses Tailnet v2 API for DERPMap updates (#11698 ) Switches the Agent to use Tailnet v2 API to get DERPMap updates. Subsequent PRs will do the same for the CLI (`codersdk`) and `wsproxy`.	2024-01-23 14:42:07 +04:00
Spike Curtis	3e0e7f8739	feat: check agent API version on connection (#11696 ) fixes #10531 Adds a check for `version` on connection to the Agent API websocket endpoint. This is primarily for future-proofing, so that up-level agents get a sensible error if they connect to a back-level Coderd. It also refactors the location of the `CurrentVersion` variables, to be part of the `proto` packages, since the versions refer to the APIs defined therein.	2024-01-23 14:27:49 +04:00
Spike Curtis	f01cab9894	feat: use tailnet v2 API for coordination (#11638 ) This one is huge, and I'm sorry. The problem is that once I change `tailnet.Conn` to start doing v2 behavior, I kind of have to change it everywhere, including in CoderSDK (CLI), the agent, wsproxy, and ServerTailnet. There is still a bit more cleanup to do, and I need to add code so that when we lose connection to the Coordinator, we mark all peers as LOST, but that will be in a separate PR since this is big enough!	2024-01-22 11:07:50 +04:00
Asher	72d9ec07aa	fix: detect JetBrains running on local ipv6 (#11676 )	2024-01-17 14:08:15 -09:00
Spike Curtis	b173195e0d	Revert "fix: detect JetBrains running on local ipv6 (#11653 )" (#11664 ) This reverts commit `2d61d5332a`.	2024-01-17 15:38:39 +04:00
Asher	2d61d5332a	fix: detect JetBrains running on local ipv6 (#11653 )	2024-01-16 15:53:41 -09:00
Mathias Fredriksson	385d58caf6	fix(agent/agentssh): allow remote forwarding a socket multiple times (#11631 ) * fix(agent/agentssh): allow remote forwarding a socket multiple times Fixes #11198 Fixes https://github.com/coder/customers/issues/407	2024-01-16 21:26:13 +02:00
Mathias Fredriksson	b1d53a68c2	fix(agent/agentssh): fix X11 forwarding by improving Xauthority management (#11550 ) Fixes #11531	2024-01-10 19:04:44 +02:00
Spike Curtis	fdd60d316e	fix: fix MetricsAggregator check for metric sameness (#11508 ) Fixes #11451 A refactor of the Agent API passes metrics as protobufs, which include pointers to label name/value pairs. The aggregator tested for sameness by doing a shallow compare of label values, which for different stats reports would compare unequal because the pointers would be different. This fix does a deep compare. While testing I also noted that we neglect to compare template names. This is unlikely to have caused any issue in practice, since the combination of username/workspace is unique, but in the context of comparing metric labels we should do the comparison. If a user creates a workspace, deletes it, then recreates from a different template, we could in principle have reported incorrect stats for the old template.	2024-01-09 15:21:30 +04:00
Steven Masley	dd05a6b13a	chore: mockgen archived, moved to new location (#11415 ) * chore: mockgen archived, moved to new location	2024-01-04 18:35:56 -06:00
Mathias Fredriksson	df3c310379	feat(cli): add `coder open vscode` (#11191 ) Fixes #7667	2024-01-02 20:46:18 +02:00
Spike Curtis	520c3a8ff7	fix: use TSMP for pings and checking reachability (#11306 ) We're seeing some flaky tests related to agent connectivity - https://github.com/coder/coder/actions/runs/7286675441/job/19856270998 I'm pretty sure what happened in this one is that the client opened a connection while the wgengine was in the process of reconfiguring the wireguard device, so the fact that the peer became "active" as a result of traffic being sent was not noticed. The test calls `AwaitReachable()` but this only tests the disco layer, so it doesn't wait for wireguard to come up. I think we should be using TSMP for pinging and reachability, since this operates at the IP layer, and therefore requires that wireguard comes up before being successful. This should also help with the problems we have seen where a TCP connection starts before wireguard is up and the initial round trip has to wait for the 5 second wireguard handshake retry. fixes: #11294	2024-01-02 15:53:52 +04:00
Spike Curtis	4071f1713b	feat: add logging to agent stats and JetBrains tracking (#11364 ) Adds logging so we can hope to diagnose #11363	2024-01-02 13:34:49 +04:00
Spike Curtis	25f2abf9ab	chore: remove tailnet from agent API and rename client API to tailnet (#11303 ) Refactors our DRPC service definitions slightly. In the previous version, I inserted the RPCs from the tailnet proto directly into the Agent service. This makes things hard to deal with because DRPC then generates a new set of methods with new interfaces with the `DRPCAgent_` prefixed. Since you can't have a single method that takes different argument types, we couldn't reuse the implementation of those RFCs without a lot of extra classes and pass-thru methods. Instead, the "right" way to do it is to integrate at the DRPC layer. So, we have two DRPC services available over the Agent websocket, and register them both on the DRPC `mux`. Since the tailnet proto RPC service is now for both clients and agents, I renamed some things to clarify and shorten. This PR also removes the `TailnetAPI` implementation from the `agentapi` package, and the next PR in the stack replaces it with the implementation from the `tailnet` package.	2024-01-02 10:02:45 +04:00
Spike Curtis	db9104c02e	fix: avoid panic on nil connection (#11305 ) Related to https://github.com/coder/coder/actions/runs/7286675441/job/19855871305 Fixes a panic if the listener returns an error, which can obfuscate the underlying problem and cause unrelated tests to be marked failed.	2023-12-21 14:26:11 +04:00
Kayla Washburn	3ab4800a18	chore: clean up lint (#11270 )	2023-12-18 14:59:39 -07:00

1 2 3 4 5 ...

299 Commits