coder

Commit Graph

Author	SHA1	Message	Date
Ammar Bandukwala	b4c0fa80d8	chore(cli): rename Cmd to Command (#12616 ) I think Command is cleaner and my original decision to use "Cmd" was a mistake. Plus this creates better parity with cobra.	2024-03-17 09:45:26 -05:00
Ammar Bandukwala	496232446d	chore(cli): replace clibase with external `coder/serpent` (#12252 )	2024-03-15 11:24:38 -05:00
Kyle Carberry	895df54051	fix: separate signals for passive, active, and forced shutdown (#12358 ) * fix: separate signals for passive, active, and forced shutdown `SIGTERM`: Passive shutdown stopping provisioner daemons from accepting new jobs but waiting for existing jobs to successfully complete. `SIGINT` (old existing behavior): Notify provisioner daemons to cancel in-flight jobs, wait 5s for jobs to be exited, then force quit. `SIGKILL`: Untouched from before, will force-quit. * Revert dramatic signal changes * Rename * Fix shutdown behavior for provisioner daemons * Add test for graceful shutdown	2024-03-15 13:16:36 +00:00
Spike Curtis	b96f6b48a4	fix: ensure ssh cleanup happens on cmd error I noticed in my logs that sometimes `coder ssh` doesn't gracefully disconnect from the coordinator. The cause is the `closerStack` construct we use in that function. It has two paths to start closing things down: 1. explicit `close()` which we do in `defer` 2. context cancellation, which happens if the cli function returns an error sometimes the ssh remote command returns an error, and this triggers context cancellation of the `closerStack`. That is fine in and of itself, but we still want the explicit `close()` to wait until everything is closed before returning, since that's where we do cleanup, including the graceful disconnect. Prior to this fix the `close()` just immediately exits if another goroutine is closing the stack. Here we add a wait until everything is done.	2024-03-07 17:26:49 +04:00
Spike Curtis	4e7beee102	feat: show tailnet peer diagnostics after coder ping (#12314 ) Beginnings of a solution to #12297 Doesn't cover disco or definitively display whether we successfully connected to DERP, but shows some checklist diagnostics for connecting to an agent. For this first PR, I just added it to `coder ping` to see how we like it, but could be incorporated into `coder ssh` _et al._ after a timeout. ``` $ coder ping dogfood2 p2p connection established in 147ms pong from dogfood2 p2p via 95.217.xxx.yyy:42631 in 147ms pong from dogfood2 p2p via 95.217.xxx.yyy:42631 in 140ms pong from dogfood2 p2p via 95.217.xxx.yyy:42631 in 140ms ✔ preferred DERP region 999 (Council Bluffs, Iowa) ✔ sent local data to Coder networking coodinator ✔ received remote agent data from Coder networking coordinator preferred DERP 10013 (Europe Fly.io (Paris)) endpoints: 95.217.xxx.yyy:42631, 95.217.xxx.yyy:37576, 172.17.0.1:37576, 172.20.0.10:37576 ✔ Wireguard handshake 11s ago ```	2024-02-27 22:04:46 +04:00
Mathias Fredriksson	e659957b65	fix(cli/ssh): prevent reads/writes to stdin/stdout in stdio mode (#12045 ) Fixes #11530	2024-02-08 13:09:42 +02:00
Marcin Tojek	77a4792ecd	fix(cli): ssh: auto-update workspace (#11773 )	2024-01-23 18:01:44 +01:00
Mathias Fredriksson	200a87e7d4	feat(cli/ssh): allow multiple remote forwards and allow missing local file (#11648 )	2024-01-19 15:21:10 +02:00
Mathias Fredriksson	df3c310379	feat(cli): add `coder open vscode` (#11191 ) Fixes #7667	2024-01-02 20:46:18 +02:00
Jon Ayers	37f6b38d53	fix: return 403 when rebuilding workspace with require_active_version (#11114 )	2023-12-08 23:03:46 -06:00
Steven Masley	cb89bc1729	feat: restart stopped workspaces on ssh command (#11050 ) * feat: autostart workspaces on ssh & port forward This is opt out by default. VScode ssh does not have this behavior	2023-12-08 10:01:13 -06:00
Mathias Fredriksson	61be4dfe5a	fix: improve exit codes for agent/agentssh and cli/ssh (#10850 )	2023-11-24 14:35:56 +02:00
Spike Curtis	f20cc66c04	fix: give SSH stdio sessions a chance to close before closing netstack (#10815 ) Man, graceful shutdown is hard. Even after my changes, we were still hitting a graceful shutdown race: https://github.com/coder/coder/runs/18886842123 The problem was that while we attempt a graceful shutdown at the SSH layer by closing the session for writing, we were not giving it a chance to complete before continuing to tear down the stack of closers, including one that closes the netstack, and thus drop the TCP connection before it closes.	2023-11-22 13:11:21 +04:00
Spike Curtis	3dd35e019b	fix: close ssh sessions gracefully (#10732 ) Re-enables TestSSH/RemoteForward_Unix_Signal and addresses the underlying race: we were not closing the remote forward on context expiry, only the session and connection. However, there is still a more fundamental issue in that we don't have the ability to ensure that TCP sessions are properly terminated before tearing down the Tailnet conn. This is due to the assumption in the sockets API, that the underlying IP interface is long lived compared with the TCP socket, and thus closing a socket returns immediately and does not wait for the TCP termination handshake --- that is handled async in the tcpip stack. However, this assumption does not hold for us and tailnet, since on shutdown, we also tear down the tailnet connection, and this can race with the TCP termination. Closing the remote forward explicitly should prevent forward state from accumulating, since the Close() function waits for a reply from the remote SSH server. I've also attempted to workaround the TCP/tailnet issue for `--stdio` by using `CloseWrite()` instead of `Close()`. By closing the write side of the connection, half-close the TCP connection, and the server detects this and closes the other direction, which then triggers our read loop to exit only after the server has had a chance to process the close. TODO in a stacked PR is to implement this logic for `vscodessh` as well.	2023-11-17 12:43:20 +04:00
Spike Curtis	4894eda711	feat: capture cli logs in tests (#10669 ) Adds a Logger to cli Invocation and standardizes CLI commands to use it. clitest creates a test logger by default so that CLI command logs are captured in the test logs. CLI commands that do their own log configuration are modified to add sinks to the existing logger, rather than create a new one. This ensures we still capture logs in CLI tests.	2023-11-14 22:56:27 +04:00
Spike Curtis	dc4b1ef406	fix: lock log sink against concurrent write and close (#10668 ) fixes #10663	2023-11-14 16:38:34 +04:00
Spike Curtis	f400d8a0c5	fix: handle SIGHUP from OpenSSH (#10638 ) Fixes an issue where remote forwards are not correctly torn down when using OpenSSH with `coder ssh --stdio`. OpenSSH sends a disconnect signal, but then also sends SIGHUP to `coder`. Previously, we just exited when we got SIGHUP, and this raced against properly disconnecting. Fixes https://github.com/coder/customers/issues/327	2023-11-13 15:14:42 +04:00
Kyle Carberry	1262eef2c0	feat: add support for `coder_script` (#9584 ) * Add basic migrations * Improve schema * Refactor agent scripts into its own package * Support legacy start and stop script format * Pipe the scripts! * Finish the piping * Fix context usage * It works! * Fix sql query * Fix SQL query * Rename `LogSourceID` -> `SourceID` * Fix the FE * fmt * Rename migrations * Fix log tests * Fix lint err * Fix gen * Fix story type * Rename source to script * Fix schema jank * Uncomment test * Rename proto to TimeoutSeconds * Fix comments * Fix comments * Fix legacy endpoint without specified log_source * Fix non-blocking by default in agent * Fix resources tests * Fix dbfake * Fix resources * Fix linting I think * Add fixtures * fmt * Fix startup script behavior * Fix comments * Fix context * Fix cancel * Fix SQL tests * Fix e2e tests * Interrupt on Windows * Fix agent leaking script process * Fix migrations * Fix stories * Fix duplicate logs appearing * Gen * Fix log location * Fix tests * Fix tests * Fix log output * Show display name in output * Fix print * Return timeout on start context * Gen * Fix fixture * Fix the agent status * Fix startup timeout msg * Fix command using shared context * Fix timeout draining * Change signal type * Add deterministic colors to startup script logs --------- Co-authored-by: Muhammad Atif Ali <atif@coder.com>	2023-09-25 16:47:17 -05:00
Ammar Bandukwala	6ba92ef924	ci: enable gocognit (#9359 ) And, bring the server under 300: * Removed the undocumented "disable" STUN address in favor of the --disable-direct flag.	2023-08-27 14:46:44 -05:00
Kyle Carberry	22e781eced	chore: add /v2 to import module path (#9072 ) * chore: add /v2 to import module path go mod requires semantic versioning with versions greater than 1.x This was a mechanical update by running: ``` go install github.com/marwan-at-work/mod/cmd/mod@latest mod upgrade ``` Migrate generated files to import /v2 * Fix gen	2023-08-18 18:55:43 +00:00
Kyle Carberry	bd944e0d21	chore: rename startup logs to agent logs (#8649 ) * chore: rename startup logs to agent logs This also adds a `source` property to every agent log. It should allow us to group logs and display them nicer in the UI as they stream in. * Fix migration order * Fix naming * Rename the frontend * Fix tests * Fix down migration * Match enums for workspace agent logs * Fix inserting log source * Fix migration order * Fix logs tests * Fix psql insert	2023-07-28 15:57:23 +00:00
Marcin Tojek	9689bca5d2	feat(cli): implement ssh remote forward (#8515 )	2023-07-20 12:05:39 +02:00
Colin Adler	1c3bfacca3	fix(cli): ensure `cliui.Agent` doesn't fetch infinitely (#8446 )	2023-07-12 10:21:54 -05:00
Cian Johnston	7fcf319e01	fix(cli)!: protect client Logger and refactor cli scaletest tests (#8317 ) - (breaking) Protects Logger and LogBodies fields of codersdk.Client with its mutex. This addresses a data race in cli/scaletest. - Fillets the existing cli/createworkspaces unit test and moves the testing logic there into the tests under scaletest/createworkspaces. - Adds testutil.RaceEnabled bool const and conditionally skips previously-skipped tests under scaletest/ if the race detector is enabled. This is unfortunate and sad, but I would prefer to have these tests at least running without the race detector than not running at all. - Adds IgnoreErrors option to fake in-memory agent loggers; having the agents fail the test immediately when they encounter any sort of error isn't really helpful.	2023-07-06 09:43:39 +01:00
Mathias Fredriksson	d3c39b60c9	feat: add agent log streaming and follow provisioner format (#8170 )	2023-06-28 10:54:13 +02:00
Dean Sheather	24b95e16c4	feat: add --disable-direct flag to CLI (#8131 )	2023-06-21 20:22:43 +00:00
Marcin Tojek	b1d1b63113	chore: ensure logs consistency across Coder (#8083 )	2023-06-20 12:30:45 +02:00
Mathias Fredriksson	4bc4e63637	fix(cli/ssh): fix lint error (#7974 )	2023-06-12 16:17:41 +00:00
Ammar Bandukwala	5de1084639	feat(cli/ssh): simplify log file flags (#7863 ) And, fix a race condition.	2023-06-12 09:18:33 +04:00
Mathias Fredriksson	94aa9be33a	feat(cli/ssh): implement wait options and deprecate no-wait (#7894 ) Fixes #7768 Refs #7893	2023-06-08 16:52:44 +03:00
Mathias Fredriksson	660bbb8d38	refactor: deprecate `login_before_ready` in favor of `startup_script_behavior` (#7837 ) Fixes #7758	2023-06-06 11:58:07 +03:00
Marcin Tojek	a7366a8b76	feat!: drop support for legacy parameters (#7663 )	2023-06-02 11:16:46 +02:00
Spike Curtis	6a1e7ee1d0	feat: add file logger to coder ssh (#7646 ) * coder ssh can log to file Signed-off-by: Spike Curtis <spike@coder.com> * Update golden file Signed-off-by: Spike Curtis <spike@coder.com> * generate CLI docs Signed-off-by: Spike Curtis <spike@coder.com> * Fix imports, typo Signed-off-by: Spike Curtis <spike@coder.com> * log more things! Signed-off-by: Spike Curtis <spike@coder.com> --------- Signed-off-by: Spike Curtis <spike@coder.com>	2023-05-25 05:07:39 +00:00
Mathias Fredriksson	b6c8e5be48	fix(cli/ssh): Fetch up-to-date build info to avoid ws has no agents (#7650 ) Fixes #5836	2023-05-24 12:37:22 +03:00
Mathias Fredriksson	c2871e12aa	fix(cli/ssh): Avoid connection hang when workspace is stopped (#7201 ) * fix(cli/ssh): Avoid connection hang when workspace is stopped Two issues are addressed here: 1. We were not detecting disconnects due to waiting for Stdin to close (disconnect would only propagate after entering input and failing to write to the connection). 2. In other scenarios, where the connection drop is not detected, we now also watch workspace status and drop the connection when a workspace reaches the stopped state. Fixes: https://github.com/coder/jetbrains-coder/issues/199 Refs: #6180, #6175	2023-04-19 21:32:28 +03:00
Mathias Fredriksson	0224426e5b	refactor(agent): Move SSH server into agentssh package (#7004 ) Refs: #6177	2023-04-06 19:39:22 +03:00
Ammar Bandukwala	b439c3e167	fix: permit SSH by default when startup script fails (#6798 )	2023-03-27 14:59:58 +00:00
Ammar Bandukwala	2bd6d2908e	feat: convert entire CLI to clibase (#6491 ) I'm sorry.	2023-03-23 17:42:20 -05:00
Kyle Carberry	2c2bbcc019	chore: update tests to support fish (#6023 ) * fix: update tests to add fish support * Track connections for SSH sessions to prevent leaks * Revert SSH conn handling	2023-02-03 12:25:11 -06:00
Kyle Carberry	8487127f5c	chore: skip reconnecting pty scale tests (#5908 ) * fix: close reconnecting pty conn when exiting agent Fixes https://github.com/coder/coder/actions/runs/4038282899/jobs/6942170850 * Fix conpty * Fix contrib * Skip runner tests for being flakes * Fix gpg key test * Fix golden files * Fix comments	2023-01-29 14:53:49 -06:00
Mathias Fredriksson	981cac5e28	chore: Invert `delay_login_until_ready`, now `login_before_ready` (#5893 )	2023-01-27 20:07:47 +00:00
Mathias Fredriksson	a753703e47	feat(cli): Add support for `delay_login_until_ready` (#5851 )	2023-01-27 19:05:40 +02:00
Kyle Carberry	9f6edab53b	feat: replace vscodeipc with vscodessh (#5645 ) The VS Code extension has been refactored to use VS Code Remote SSH instead of using the private API. This changes the structure to continue using SSH, but output network information periodically to a file.	2023-01-10 04:23:17 +00:00
Dean Sheather	f1fe2b5c06	feat: add GPG forwarding to coder ssh (#5482 )	2023-01-06 07:52:19 +00:00
Kyle Carberry	82f494c99c	fix: Improve tailnet connections by reducing timeouts (#5043 ) * fix: Improve tailnet connections by reducing timeouts This awaits connection ping before running a dial. Before, we were hitting the TCP retransmission and handshake timeouts, which could intermittently add 1 or 5 seconds to a connection being initialized. * Update Tailscale	2022-11-13 11:33:05 -06:00
Garrett Delfosse	766a2ad590	chore: refactor workspace count to single route (#4809 ) Co-authored-by: Presley Pizzo <presley@coder.com>	2022-11-10 13:25:46 -05:00
Dean Sheather	d82364b9b5	feat: make trace provider in loadtest, add tracing to sdk (#4939 )	2022-11-09 08:10:48 +10:00
Marcin Tojek	641aacf793	feat: show banner when workspace is outdated (#4926 ) * feat: show banner when workspace is outdated * Address PR comments * Fix: writer	2022-11-07 19:12:39 +01:00
Kyle Carberry	30281852d6	feat: Add buffering to provisioner job logs (#4918 ) * feat: Add buffering to provisioner job logs This should improve overall build performance, and especially under load. It removes the old `id` column on the `provisioner_job_logs` table and replaces it with an auto-incrementing big integer to preserve order. Funny enough, we never had to care about order before because inserts would at minimum be 1ms different. Now they aren't, so the order needs to be preserved. * Fix log buffering * Fix frontend log streaming * Fix JS test	2022-11-06 20:50:34 -06:00
Kyle Carberry	2ba4a62a0d	feat: Add high availability for multiple replicas (#4555 ) * feat: HA tailnet coordinator * fixup! feat: HA tailnet coordinator * fixup! feat: HA tailnet coordinator * remove printlns * close all connections on coordinator * implement high availability feature * fixup! implement high availability feature * fixup! implement high availability feature * fixup! implement high availability feature * fixup! implement high availability feature * Add replicas * Add DERP meshing to arbitrary addresses * Move packages to highavailability folder * Move coordinator to high availability package * Add flags for HA * Rename to replicasync * Denest packages for replicas * Add test for multiple replicas * Fix coordination test * Add HA to the helm chart * Rename function pointer * Add warnings for HA * Add the ability to block endpoints * Add flag to disable P2P connections * Wow, I made the tests pass * Add replicas endpoint * Ensure close kills replica * Update sql * Add database latency to high availability * Pipe TLS to DERP mesh * Fix DERP mesh with TLS * Add tests for TLS * Fix replica sync TLS * Fix RootCA for replica meshing * Remove ID from replicasync * Fix getting certificates for meshing * Remove excessive locking * Fix linting * Store mesh key in the database * Fix replica key for tests * Fix types gen * Fix unlocking unlocked * Fix race in tests * Update enterprise/derpmesh/derpmesh.go Co-authored-by: Colin Adler <colin1adler@gmail.com> * Rename to syncReplicas * Reuse http client * Delete old replicas on a CRON * Fix race condition in connection tests * Fix linting * Fix nil type * Move pubsub to in-memory for twenty test * Add comment for configuration tweaking * Fix leak with transport * Fix close leak in derpmesh * Fix race when creating server * Remove handler update * Skip test on Windows * Fix DERP mesh test * Wrap HTTP handler replacement in mutex * Fix error message for relay * Fix API handler for normal tests * Fix speedtest * Fix replica resend * Fix derpmesh send * Ping async * Increase wait time of template version jobd * Fix race when closing replica sync * Add name to client * Log the derpmap being used * Don't connect if DERP is empty * Improve agent coordinator logging * Fix lock in coordinator * Fix relay addr * Fix race when updating durations * Fix client publish race * Run pubsub loop in a queue * Store agent nodes in order * Fix coordinator locking * Check for closed pipe Co-authored-by: Colin Adler <colin1adler@gmail.com>	2022-10-17 13:43:30 +00:00

91 Commits