Commit Graph

64 Commits

Author SHA1 Message Date
Colin Adler 1cd5f38cb0
feat: add debug server for tailnet coordinators (#5861)
Implements a Tailscale-like debug server for our in-memory coordinator. This should provide some visibility into why connections could be failing.
Resolves: https://github.com/coder/coder/issues/5845

![image](https://user-images.githubusercontent.com/6332295/214680832-2724d633-2d54-44d6-a7ce-5841e5824ee5.png)
2023-01-25 21:27:36 +00:00
Mathias Fredriksson 138887de7e
feat: Add workspace agent lifecycle state reporting (#5785) 2023-01-24 14:24:27 +02:00
Cian Johnston 73afdd7c09
chore: agent_test.go: use ptty.Peek() instead of expecting caret in TestAgent_SessionTTYShell (#5821) 2023-01-23 11:23:25 +00:00
Dean Sheather f1fe2b5c06
feat: add GPG forwarding to coder ssh (#5482) 2023-01-06 07:52:19 +00:00
Mathias Fredriksson 760419a965
chore: Refactor agent tests to avoid `t.Run` when not needed (#5376)
It turns out that writing tests that contain subtests should probably be
limited to table-based tests and tests that share a common setup shared
between tests.

Writing tests with a subtest like this:

```
func TestSomething(t *testing.T) {
	t.Run("Subtest", func(t *testing.t) {})
}
```

Has the following disadvantages:

- It can lead to multiple tests failing with `(unknown)` status when
  only one of the subtests hang (never exit)
- In Go 1.20rc1, using `t.Setenv` is no longer allowed if the parent
  test is parallel
2022-12-12 22:20:46 +02:00
Mathias Fredriksson 88bb901283
fix: Close tailnet if agent is closed during creation (#5375) 2022-12-12 11:26:49 +00:00
Mathias Fredriksson 05130db571
fix: Improve closing of services in agent tests (#5355) 2022-12-09 12:22:27 +02:00
Mathias Fredriksson eff99f78fa
feat: Add support for MOTD file in coder agents (#5147) 2022-11-24 12:22:20 +00:00
Colin Adler ae38bbeab6
chore: refactor agent stats streaming (#5112) 2022-11-18 16:46:53 -06:00
Dean Sheather 69e8c9e7b4
feat: add reconnectingpty loadtest (#5083) 2022-11-17 16:57:15 +00:00
Ammar Bandukwala 73f91e4690
ci: use big runners (#4990)
* chore: Close idle connections on test cleanup

It's possible that this was the source of a leak on Windows...

* ci: use big runners

* fix: Improve tailnet connections by reducing timeouts

This awaits connection ping before running a dial. Before,
we were hitting the TCP retransmission and handshake timeouts,
which could intermittently add 1 or 5 seconds to a connection
being initialized.

* Add logging to Startupscript test

* Add better logging

* Write startup script logs to fs dir

* Fix startup script test

* Fix startup script test

* Reduce test timeout

* Use central tmp dir in agent

* Adjust output

* Skip startup script test on Windows

Co-authored-by: Kyle Carberry <kyle@carberry.com>
2022-11-13 14:23:23 -06:00
Kyle Carberry 82f494c99c
fix: Improve tailnet connections by reducing timeouts (#5043)
* fix: Improve tailnet connections by reducing timeouts

This awaits connection ping before running a dial. Before,
we were hitting the TCP retransmission and handshake timeouts,
which could intermittently add 1 or 5 seconds to a connection
being initialized.

* Update Tailscale
2022-11-13 11:33:05 -06:00
Kyle Carberry 16e9b1eb1a
fix: Add timeouts to every tailnet ping (#4986)
A ping isn't guaranteed to deliver, so these need to have a
tight timeout for tests to not flake.
2022-11-09 20:12:51 +00:00
Dean Sheather d82364b9b5
feat: make trace provider in loadtest, add tracing to sdk (#4939) 2022-11-09 08:10:48 +10:00
Kyle Carberry 8e743d28c8
fix: Use instance identity session token for git subcommands (#4884)
This broke using gitssh with instance identity!
2022-11-04 09:44:36 -07:00
Kyle Carberry 104d6608d9
feat: Add `VSCODE_PROXY_URI` to surface code-server ports (#4798)
* feat: Add `VSCODE_PROXY_URI` to surface code-server ports

Fixes #4776.

* Check if app host is provided
2022-11-04 04:45:43 +00:00
Mathias Fredriksson a0bdb4fca2
fix: Remove pkg/sftp fork, fix SFTP test (#4759) 2022-10-26 16:02:06 +03:00
Kyle Carberry eec406b739
feat: Add Git auth for GitHub, GitLab, Azure DevOps, and BitBucket (#4670)
* Add scaffolding

* Move migration

* Add endpoints for gitauth

* Add configuration files and tests!

* Update typesgen

* Convert configuration format for git auth

* Fix unclosed database conn

* Add overriding VS Code configuration

* Fix Git screen

* Write VS Code special configuration if providers exist

* Enable automatic cloning from VS Code

* Add tests for gitaskpass

* Fix feature visibiliy

* Add banner for too many configurations

* Fix update loop for oauth token

* Jon comments

* Add deployment config page
2022-10-24 19:46:24 -05:00
Kyle Carberry bf3224e373
fix: Refactor agent to consume API client (#4715)
* fix: Refactor agent to consume API client

This simplifies a lot of code by creating an interface for
the codersdk client into the agent. It also moves agent
authentication code so instance identity will work between
restarts.

Fixes #3485 and #4082.

* Fix client reconnections
2022-10-23 22:35:08 -05:00
Mathias Fredriksson 173b7a2c83
fix: Start SFTP sessions in user home (working directory) (#4549)
* fix: Start SFTP sessions in user home (working directory)

This commit switches to our fork of `pkg/sftp` which includes a Server
option for changing the current working directory.

Attempt to upstream: https://github.com/pkg/sftp/pull/528

Supercedes and closes #4420

Fixes #3620

* Update fork
2022-10-21 09:54:06 -05:00
Kyle Carberry 2ba4a62a0d
feat: Add high availability for multiple replicas (#4555)
* feat: HA tailnet coordinator

* fixup! feat: HA tailnet coordinator

* fixup! feat: HA tailnet coordinator

* remove printlns

* close all connections on coordinator

* impelement high availability feature

* fixup! impelement high availability feature

* fixup! impelement high availability feature

* fixup! impelement high availability feature

* fixup! impelement high availability feature

* Add replicas

* Add DERP meshing to arbitrary addresses

* Move packages to highavailability folder

* Move coordinator to high availability package

* Add flags for HA

* Rename to replicasync

* Denest packages for replicas

* Add test for multiple replicas

* Fix coordination test

* Add HA to the helm chart

* Rename function pointer

* Add warnings for HA

* Add the ability to block endpoints

* Add flag to disable P2P connections

* Wow, I made the tests pass

* Add replicas endpoint

* Ensure close kills replica

* Update sql

* Add database latency to high availability

* Pipe TLS to DERP mesh

* Fix DERP mesh with TLS

* Add tests for TLS

* Fix replica sync TLS

* Fix RootCA for replica meshing

* Remove ID from replicasync

* Fix getting certificates for meshing

* Remove excessive locking

* Fix linting

* Store mesh key in the database

* Fix replica key for tests

* Fix types gen

* Fix unlocking unlocked

* Fix race in tests

* Update enterprise/derpmesh/derpmesh.go

Co-authored-by: Colin Adler <colin1adler@gmail.com>

* Rename to syncReplicas

* Reuse http client

* Delete old replicas on a CRON

* Fix race condition in connection tests

* Fix linting

* Fix nil type

* Move pubsub to in-memory for twenty test

* Add comment for configuration tweaking

* Fix leak with transport

* Fix close leak in derpmesh

* Fix race when creating server

* Remove handler update

* Skip test on Windows

* Fix DERP mesh test

* Wrap HTTP handler replacement in mutex

* Fix error message for relay

* Fix API handler for normal tests

* Fix speedtest

* Fix replica resend

* Fix derpmesh send

* Ping async

* Increase wait time of template version jobd

* Fix race when closing replica sync

* Add name to client

* Log the derpmap being used

* Don't connect if DERP is empty

* Improve agent coordinator logging

* Fix lock in coordinator

* Fix relay addr

* Fix race when updating durations

* Fix client publish race

* Run pubsub loop in a queue

* Store agent nodes in order

* Fix coordinator locking

* Check for closed pipe

Co-authored-by: Colin Adler <colin1adler@gmail.com>
2022-10-17 13:43:30 +00:00
Kyle Carberry 39cf329404
fix: Replace access URL for built-in DERP servers (#4197)
Fixes #4195.
2022-09-26 12:56:04 -05:00
Garrett Delfosse 4c8be34d81
feat: add health check monitoring to workspace apps (#4114) 2022-09-23 15:51:04 -04:00
Kyle Carberry a7ee8b31e0
fix: Don't use StatusAbnormalClosure (#4155) 2022-09-22 18:26:05 +00:00
Kyle Carberry 714c366d16
chore: Remove WebRTC networking (#3881)
* chore: Remove WebRTC networking

* Fix race condition

* Fix WebSocket not closing
2022-09-19 19:46:29 -05:00
Kyle Carberry 0f8c2f592e
feat: Use Tailscale networking by default (#4003)
* feat: Use Tailscale networking by default

Removal of WebRTC code will happen in another PR, but it
felt dangerious to default and remove in a single commit.

Ideally, we can release this version and collect final
thoughts and  feedback before a full commitment.

* Remove UNIX forwarding

Tailscale doesn't support this, and adding support
for it shouldn't block our rollout. Customers can
always forward over SSH.

* Update cli/portforward_test.go

Co-authored-by: Dean Sheather <dean@deansheather.com>

Co-authored-by: Dean Sheather <dean@deansheather.com>
2022-09-13 15:55:56 -05:00
Kyle Carberry 00104096c2
fix: Resolve CI flakes for tailnet agent (#3924) 2022-09-07 09:24:58 -05:00
Kyle Carberry 1254e7a902
feat: Add speedtest command for tailnet (#3874) 2022-09-05 17:15:49 -05:00
Ammar Bandukwala 04b03792cb
feat: add last used to Workspaces page (#3816) 2022-09-02 00:08:51 +00:00
Ammar Bandukwala 30f8fd9b95
Daily Active User Metrics (#3735)
* agent: add StatsReporter

* Stabilize protoc
2022-09-01 14:58:23 -05:00
Kyle Carberry 9bd83e5ec7
feat: Add Tailscale networking (#3505)
* fix: Add coder user to docker group on installation

This makes for a simpler setup, and reduces the likelihood
a user runs into a strange issue.

* Add wgnet

* Add ping

* Add listening

* Finish refactor to make this work

* Add interface for swapping

* Fix conncache with interface

* chore: update gvisor

* fix tailscale types

* linting

* more linting

* Add coordinator

* Add coordinator tests

* Fix coordination

* It compiles!

* Move all connection negotiation in-memory

* Migrate coordinator to use net.conn

* Add closed func

* Fix close listener func

* Make reconnecting PTY work

* Fix reconnecting PTY

* Update CI to Go 1.19

* Add CLI flags for DERP mapping

* Fix Tailnet test

* Rename ConnCoordinator to TailnetCoordinator

* Remove print statement from workspace agent test

* Refactor wsconncache to use tailnet

* Remove STUN from unit tests

* Add migrate back to dump

* chore: Upgrade to Go 1.19

This is required as part of #3505.

* Fix reconnecting PTY tests

* fix: update wireguard-go to fix devtunnel

* fix migration numbers

* linting

* Return early for status if endpoints are empty

* Update cli/server.go

Co-authored-by: Colin Adler <colin1adler@gmail.com>

* Update cli/server.go

Co-authored-by: Colin Adler <colin1adler@gmail.com>

* Fix frontend entites

* Fix agent bicopy

* Fix race condition for the last node

* Fix down migration

* Fix connection RBAC

* Fix migration numbers

* Fix forwarding TCP to a local port

* Implement ping for tailnet

* Rename to ForceHTTP

* Add external derpmapping

* Expose DERP region names to the API

* Add global option to enable Tailscale networking for web

* Mark DERP flags hidden while testing

* Update DERP map on reconnect

* Add close func to workspace agents

* Fix race condition in upstream dependency

* Fix feature columns race condition

Co-authored-by: Colin Adler <colin1adler@gmail.com>
2022-08-31 20:09:44 -05:00
Mathias Fredriksson e44f7adb7e
feat: Set SSH env vars: `SSH_CLIENT`, `SSH_CONNECTION` and `SSH_TTY` (#3622)
Fixes #2339
2022-08-23 21:19:57 +03:00
Mathias Fredriksson f7ccfa2ab9
feat: Set `CODER=true` in workspaces (#3637)
Fixes #2340
2022-08-23 14:29:01 +03:00
Mathias Fredriksson 4730c589fe
chore: Use standardized test timeouts and delays (#3291) 2022-08-01 15:45:05 +03:00
Spike Curtis 36ffdce065
Return proper exit code on ssh with TTY (#3192)
* Return proper exit code on ssh with TTY

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix revive lint

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix Windows exit code for missing command

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix close error handling on agent TTY

Signed-off-by: Spike Curtis <spike@coder.com>
2022-07-27 14:23:28 -05:00
Mathias Fredriksson 0b86c8047c
fix: Close connections in agent tests (#3196) 2022-07-26 13:24:54 +03:00
Mathias Fredriksson 51dd1fde3b
fix: Remove use of `require` in `require.Eventually` in tests (#3110)
* fix: Remove use of `require` in `require.Eventually` in tests

Because require uses `t.FailNow()` and `require.Eventually` runs the
function in a goroutine, which is not allowed.

* feat: Add ruleguard for require.Eventually

Co-authored-by: Cian Johnston <cian@coder.com>
2022-07-22 20:02:49 +03:00
Mathias Fredriksson 7d07e670ca
chore: Improve test cleanup (#3112) 2022-07-22 15:14:45 +03:00
Kyle Carberry d9da96cad0
fix: Add test for SCP (#2692)
* fix: Elongate agent disconnect timeout in tests

This will fix the flake seen here:
https://github.com/coder/coder/runs/7071719863?check_suite_focus=true

* fix: Add test for SCP

This was hanging due to the stdin pipe never being closed.
A test has been added to make sure it works!
2022-06-27 17:41:53 +01:00
Kyle Carberry b9f3fe49cb
fix: Start login shells on macOS and Linux (#2437)
This appends `-l` to the shell command on macOS and Linux.
It also adds environment variable expansion to allow for
chaining from `coder_agent.env`.
2022-06-17 05:54:45 +00:00
Kyle Carberry d0ed107b08
fix: Add command to reconnecting PTY (#1860)
This fixes #1708 and opens the door for PTYs to execute
non-shell commands!
2022-05-27 14:51:20 -05:00
Garrett Delfosse 781f3d0641
fix: use dir over full path for coder bin (#1795) 2022-05-26 19:05:46 +00:00
Garrett Delfosse 3052a6d88e
Add coder executable to PATH (#1771) 2022-05-26 12:59:41 -05:00
Colin Adler 4543a3b277
fix: log after test exit in `TestAgent/StartupScript` (#1726)
```
$ go test ./agent/ -v -run TestAgent/StartupScript -count 1
=== RUN   TestAgent
=== PAUSE TestAgent
=== CONT  TestAgent
=== RUN   TestAgent/StartupScript
=== PAUSE TestAgent/StartupScript
=== CONT  TestAgent/StartupScript
    t.go:56: 2022-05-24 20:22:39.648 [INFO]	<agent.go:112>	connected
--- PASS: TestAgent (0.00s)
    --- PASS: TestAgent/StartupScript (0.17s)
PASS
panic: Log in goroutine after TestAgent/StartupScript has completed: 2022-05-24 20:22:39.651 [WARN]	<agent.go:130>	agent script failed ...
"error": run:
             github.com/coder/coder/agent.(*agent).runStartupScript
                 /home/colin/Projects/coder/coder/agent/agent.go:183
           - signal: killed
```
2022-05-24 16:03:42 -05:00
Cian Johnston c2f74f3cc2
chore: avoid concurrent usage of t.FailNow (#1683)
* chore: golangci: add linter rule to report usage of t.FailNow inside goroutines
* chore: avoid t.FailNow in goroutines to appease the race detector
2022-05-24 08:58:39 +01:00
David Wahler a50a6e8638
fix: Make TestAgent and TestWorkspaceAgentPTY less flaky (#1562) 2022-05-18 17:06:17 +00:00
Dean Sheather 9141be3656
feat: add port-forward subcommand (#1350) 2022-05-19 00:10:40 +10:00
Kyle Carberry 81577f120a
feat: Add web terminal with reconnecting TTYs (#1186)
* feat: Add web terminal with reconnecting TTYs

This adds a web terminal that can reconnect to resume sessions!
No more disconnects, and no more bad bufferring!

* Add xstate service

* Add the webpage for accessing a web terminal

* Add terminal page tests

* Use Ticker instead of Timer

* Active Windows mode on Windows
2022-04-29 17:30:10 -05:00
Kyle Carberry 744a00a55d
feat: Add GIT_COMMITTER information to agent env vars (#1171)
This makes setting up git a bit simpler, and users
can always override these values!

We'll probably add a way to disable our Git integration
anyways, so these could be part of that.
2022-04-25 20:03:54 -05:00
Kyle Carberry 64348882a0
test: Loop if content length is zero for Windows (#1151)
On Windows it seems the file can be created before the
content has been written, which results in comparing
an empty string.
2022-04-25 19:01:49 +00:00