* feat: Add workspace agent for SSH
This adds the initial agent that supports TTY
and execution over SSH. It functions across macOS,
Windows, and Linux.
This does not handle the coderd interaction yet,
but does set up a simple path forward.
* Fix pty tests on Windows
* Fix log race
* Lock around dial error to fix log output
* Fix context return early
* fix: Leaking yamux session after HTTP handler is closed
Closes #317. We depended on the context canceling the yamux connection,
but this isn't a sync operation. Explicitly calling close ensures the
handler waits for yamux to complete before exit.
* Lock around close return
* Force failure with log
* Fix failed handler
* Upgrade dep
* Fix defer inside loops
* Fix context cancel for HTTP requests
* Fix resize
* Switch from memfs to built-in fstest.MapFS
* Ensure httptest servers are closed during test cleanup
* Swap ordering of expected/actual values in assertion functions
* Use http.StatusOK constants for status codes
* Add a 1-second context timeout for each request
* Use the test httptest.Server.Client() so that we can handle
TLS server certificates if desired
Fixes #210 - this PR implements `coder login` in the case where the default user is already created.
This change adds:
- A prompt in the case where there is not an initial user that opens the server URL + requests a session token
  - This ports over some code from v1 for the `openURL` and `isWSL` functions to support opening the browser
- A `/api/v2/api-keys` endpoint that can be `POST`'d to in order to request a new api key for a user
  - This route was inspired by the v1 functionality
- A `cli-auth` route + page that shows the generated api key
- Tests for the new code + storybook for the new UI
The `/cli-auth` route, like in v1, is very minimal:
<img width="624" alt="Screen Shot 2022-02-16 at 5 05 07 PM" src="https://user-images.githubusercontent.com/88213859/154384627-78ab9841-27bf-490f-9bbe-23f8173c9e97.png">
And the terminal UX looks like this:
![2022-02-16 17 13 29](https://user-images.githubusercontent.com/88213859/154385225-509c78d7-840c-4cab-8f1e-074fede8f97e.gif)
* Initial agent
* fix: Use buffered reader in peer to fix ShortBuffer
This prevents a io.ErrShortBuffer from occurring when the byte
slice being read is smaller than the chunks sent from the opposite
pipe.
This makes sense for unordered connections, where transmission is
not guaranteed, but does not make sense for TCP-like connections.
We use a bufio.Reader when ordered to ensure data isn't lost.
* SSH server works!
* Start Windows support
* Something works
* Refactor pty package to support Windows spawn
* SSH server now works on Windows
* Fix non-Windows
* Fix Linux PTY render
* Fix Linux build tests
* Remove agent and wintest
* Add test for Windows resize
* Fix linting errors
* Add Windows environment variables
* Add strings import
* Add comment for attrs
* Add goleak
* Add require import
* Refactor parameter parsing to return nil values if none computed
* Refactor parameter to allow for hiding redisplay
* Refactor parameters to enable schema matching
* Refactor provisionerd to dynamically update parameter schemas
* Refactor job update for provisionerd
* Handle multiple states correctly when provisioning a project
* Add project import job resource table
* Basic creation flow works!
* Create project fully works!!!
* Only show job status if completed
* Add create workspace support
* Replace Netflix/go-expect with ActiveState
* Fix linting errors
* Use forked chzyer/readline
* Add create workspace CLI
* Add CLI test
* Move jobs to their own APIs
* Remove go-expect
* Fix requested changes
* Skip workspacecreate test on windows
An issue came up last week... our `embed.go` strategy doesn't handle dynamic NextJS-style routes! This is a blocker, because I'm aiming to set up CD on Monday, and the v2 UI makes heavy use of dynamic routing.
As a potential solution, this implements a go pkg `nextrouter` that serves `html` files, but respecting the dynamic routing behavior of NextJS:
- Files that have square brackets - ie `[providers]` provide a single-level dynamic route
- Files that have `[[...` prefix - ie `[[...any]]` - are catch-all routes.
- Files should be preferred over folders (ie, `providers.html` is preferred over `/providers`)
- Fixes the trailing-slash bug we hit in the previous `embed` strategy
This also integrates with `slog.Logger` for tracing, and handles injecting template parameters - a feature we need in v1 and v2 to be able to inject stuff like CSRF tokens.
This implements testing by using an in-memory file-system, so that we can exercise all of these cases.
In addition, this adjusts V2's `embed.go` strategy to use `nextrouter`, which simplifies that file considerably. I'm tempted to factor out the `secureheaders` logic into a separate package, too.
If this works OK, it could be used for V1 too (although that scenario is more complex due to our hybrid-routing strategy). Based on our FE variety meeting, there's always a chance we could move away from NextJS in v1 - if that's the case, this router will still work and be more tested than our previous strategy (it just won't make use of dynamic routing). So I figured this was worth doing to make sure we can make forward progress in V2.
* chore: Rename ProjectHistory to ProjectVersion
Version more accurately represents version storage. This
forks from the WorkspaceHistory name, but I think it's
easier to understand Workspace history.
* Rename files
* Standardize tests a bit more
* Remove Server struct from coderdtest
* Improve test coverage for workspace history
* Fix linting errors
* Fix coderd test leak
* Fix coderd test leak
* Improve workspace history logs
* Standardize test structure for codersdk
* Fix linting errors
* Fix WebSocket compression
* Update coderd/workspaces.go
Co-authored-by: Bryan <bryan@coder.com>
* Add test for listing project parameters
* Cache npm dependencies with setup node
* Remove windows npm cache key
Co-authored-by: Bryan <bryan@coder.com>
* feat: Add history middleware parameters
These will be used for streaming logs, checking status,
and other operations related to workspace and project
history.
* refactor: Move all HTTP routes to top-level struct
Nesting all structs behind their respective structures
is leaky, and promotes naming conflicts between handlers.
Our HTTP routes cannot have conflicts, so neither should
function naming.
* Add provisioner daemon routes
* Add periodic updates
* Skip pubsub if short
* Return jobs with WorkspaceHistory
* Add endpoints for extracting singular history
* The full end-to-end operation works
* fix: Disable compression for websocket dRPC transport (#145)
There is a race condition in the interop between the websocket and `dRPC`: https://github.com/coder/coder/runs/5038545709?check_suite_focus=true#step:7:117 - it seems both the websocket and dRPC feel like they own the `[]byte` being sent between them. This can lead to data races, in which both `dRPC` and the websocket are writing.
This is just tracking some experimentation to fix that race condition
## Run results: ##
- Run 1: peer test failure
- Run 2: peer test failure
- Run 3: `TestWorkspaceHistory/CreateHistory` - https://github.com/coder/coder/runs/5040858460?check_suite_focus=true#step:8:45
```
status code 412: The provided project history is running. Wait for it to complete importing!
```
- Run 4: `TestWorkspaceHistory/CreateHistory` - https://github.com/coder/coder/runs/5040957999?check_suite_focus=true#step:7:176
```
workspacehistory_test.go:122:
Error Trace: workspacehistory_test.go:122
Error: Condition never satisfied
Test: TestWorkspaceHistory/CreateHistory
```
- Run 5: peer failure
- Run 6: Pass ✅
- Run 7: Peer failure
## Open Questions: ##
### Is `dRPC` or `websocket` at fault for the data race?
It looks like this condition is specifically happening when `dRPC` decides to `SendError`. This constructs a new byte payload from [`MarshalError`](f6e369438f/drpcwire/error.go (L15)) - so `dRPC` has created this buffer and owns it.
From `dRPC`'s perspective, the callstack looks like this:
- [`sendPacket`](f6e369438f/drpcstream/stream.go (L253))
- [`writeFrame`](f6e369438f/drpcwire/writer.go (L65))
- [`AppendFrame`](f6e369438f/drpcwire/packet.go (L128))
- with finally the data race happening here:
```go
// AppendFrame appends a marshaled form of the frame to the provided buffer.
func AppendFrame(buf []byte, fr Frame) []byte {
...
out := buf
out = append(out, control) // <---------
```
This should be fine, since `dRPC` created this buffer, and is taking the byte buffer constructed from `MarshalError` and tacking a bunch of headers on it to create a proper frame.
Once `dRPC` is done writing, it __hangs onto the buffer and resets it here__: f6e369438f/drpcwire/writer.go (L73)
However... the websocket implementation, once it gets the buffer, runs a `statelessDeflate` [here](8dee580a7f/write.go (L180)), which compresses the buffer on the fly. This functionality actually [mutates the buffer in place](a1a9cfc821/flate/stateless.go (L94)), which is where we get our race.
In the case where the `[]byte` isn't being manipulated anywhere else, this compress-in-place operation would be safe, and that's probably the case for most over-the-wire usages. In this case, though, where we're plumbing `dRPC` -> websocket, they both are manipulating it (`dRPC` is reusing the buffer for the next `write`, and `websocket` is compressing on the fly).
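The ownership problem can be illustrated without any concurrency. In this sketch `frameWriter` and `transformInPlace` are hypothetical stand-ins for the `dRPC` writer and the in-place deflate step, not the real APIs: handing out the reusable buffer lets the consumer clobber it, while handing out a copy does not.

```go
package main

import (
	"bytes"
	"fmt"
)

// frameWriter keeps one buffer and reuses it for every frame it builds.
type frameWriter struct {
	buf []byte
}

// frame builds the next frame in the reusable buffer and returns that buffer.
func (w *frameWriter) frame(payload []byte) []byte {
	w.buf = append(w.buf[:0], payload...)
	return w.buf
}

// transformInPlace stands in for a compressor that mutates its input.
func transformInPlace(b []byte) {
	for i := range b {
		b[i] ^= 0xFF
	}
}

// sendShared passes the writer's own buffer to the mutating consumer and
// reports whether the writer's buffer still holds the original payload.
func sendShared(w *frameWriter, payload []byte) bool {
	transformInPlace(w.frame(payload))
	return bytes.Equal(w.buf, payload)
}

// sendCopied passes a private copy instead, leaving the writer's buffer alone.
func sendCopied(w *frameWriter, payload []byte) bool {
	private := append([]byte(nil), w.frame(payload)...)
	transformInPlace(private)
	return bytes.Equal(w.buf, payload)
}

func main() {
	payload := []byte("frame")
	fmt.Println("shared buffer intact:", sendShared(&frameWriter{}, payload))
	fmt.Println("copied buffer intact:", sendCopied(&frameWriter{}, payload))
}
```

With two goroutines involved, the shared case becomes the data race described above. Copying is only one possible remedy; the update further down takes a different route and disables compression.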
### Why does cloning on `Read` fail?
Get a bunch of errors like:
```
2022/02/02 19:26:10 [WARN] yamux: frame for missing stream: Vsn:0 Type:0 Flags:0 StreamID:0 Length:0
2022/02/02 19:26:25 [ERR] yamux: Failed to read header: unexpected EOF
2022/02/02 19:26:25 [ERR] yamux: Failed to read header: unexpected EOF
2022/02/02 19:26:25 [WARN] yamux: frame for missing stream: Vsn:0 Type:0 Flags:0 StreamID:0 Length:0
```
# UPDATE:
We decided we could disable websocket compression, which would avoid the race because the in-place `deflate` operation would no longer be run. Trying that out now:
- Run 1: ✅
- Run 2: https://github.com/coder/coder/runs/5042645522?check_suite_focus=true#step:8:338
- Run 3: ✅
- Run 4: https://github.com/coder/coder/runs/5042988758?check_suite_focus=true#step:7:168
- Run 5: ✅
* fix: Remove race condition with acquiredJobDone channel (#148)
Found another data race while running the tests: https://github.com/coder/coder/runs/5044320845?check_suite_focus=true#step:7:83
__Issue:__ There is a race in the p.acquiredJobDone chan - in particular, there can be a case where we're waiting on the channel to finish (in close) with <-p.acquiredJobDone, but in parallel, an acquireJob could've been started, which would create a new channel for p.acquiredJobDone. There is a similar race in `close(..)`ing the channel, which also came up in test runs.
__Fix:__ Instead of recreating the channel every time, we can use `sync.WaitGroup` to accomplish the same functionality - a semaphore to make close wait for the current job to wrap up.
* fix: Bump up workspace history timeout (#149)
This is an attempted fix for failures like: https://github.com/coder/coder/runs/5043435263?check_suite_focus=true#step:7:32
Looking at the timing of the test:
```
t.go:56: 2022-02-02 21:33:21.964 [DEBUG] (terraform-provisioner) <provision.go:139> ran apply
t.go:56: 2022-02-02 21:33:21.991 [DEBUG] (provisionerd) <provisionerd.go:162> skipping acquire; job is already running
t.go:56: 2022-02-02 21:33:22.050 [DEBUG] (provisionerd) <provisionerd.go:162> skipping acquire; job is already running
t.go:56: 2022-02-02 21:33:22.090 [DEBUG] (provisionerd) <provisionerd.go:162> skipping acquire; job is already running
t.go:56: 2022-02-02 21:33:22.140 [DEBUG] (provisionerd) <provisionerd.go:162> skipping acquire; job is already running
t.go:56: 2022-02-02 21:33:22.195 [DEBUG] (provisionerd) <provisionerd.go:162> skipping acquire; job is already running
t.go:56: 2022-02-02 21:33:22.240 [DEBUG] (provisionerd) <provisionerd.go:162> skipping acquire; job is already running
workspacehistory_test.go:122:
Error Trace: workspacehistory_test.go:122
Error: Condition never satisfied
Test: TestWorkspaceHistory/CreateHistory
```
It appears that the `terraform apply` job had just finished - with less than a second to spare until our `require.Eventually` completes - but there's still work to be done (ie, collecting the state files). So my suspicion is that terraform might, in some cases, exceed our 5s timeout.
Note that in the setup for this test - there is a similar project history wait that waits for 15s, so I borrowed that here.
In the future - we can look at potentially using a simple echo provider to exercise this in the unit test, in a way that is more reliable in terms of timing. I'll log an issue to track that.
Co-authored-by: Bryan <bryan@coder.com>
This brings an async service that parses and
provisions to life! It's separated from coderd
intentionally to allow for simpler testing.
Integration with coderd will come in another PR!
* fix: Synchronize peer logging with a channel
We were depending on the close mutex to properly
report connection state. This ensures the RTC
connection is properly closed before returning.
* Disable pion logging
* Remove buffer
* Try ICE servers
* Remove flushed
* Add diagram explaining handshake
* Fix candidate accept ordering
* Add debug logging to peerbroker
* Fix send ordering
* Lock adding ICE candidate
* Add test for negotiating out of order
* Reduce connection to a single negotiation channel
* Improve test times by pre-installing Terraform
* Lock remote session description being applied
* Organize conn
* Revert to multi-channel setup
* Properly close ICE gatherer
* Improve comments
* Try removing buffered candidates
* Buffer local and remote messages
* Log dTLS transport state
* Add pion logging
* chore: Update pion/ice fork to resolve goroutine leak
* Flush remote too
* Add logs for setting the description
* Try locking only on remote
* Remove local buffering in favor of remote
* Remove unused flush func
* Set candidates flushed to true
* Defer flush until the end of negotiation
* Buffer ICE candidates
* Add comment clarifying channel buffer
* Flush after handshake
* Move away from fork
* Ignore pion/ice leaks
* chore: Fix race in collecting ICE Candidates
This logic was flawed previously. ICE Candidates could collect
before a negotiation was triggered, which led to a race where
candidates would be lost. Candidates can no longer be lost,
and we removed some code 😎.
* Add comment describing fix
* Use upstream dependency to fix goroutine leak
* Use upstream dependency to fix goroutine leak
* feat: Add authentication and personal user endpoint
This contribution adds a lot of scaffolding for the database fake
and testability of coderd.
A new endpoint "/user" is added to return the currently authenticated
user to the requester.
* Use TestMain to catch leak instead
* Add userpassword package
* Add WIP
* Add user auth
* Fix test
* Add comments
* Fix login response
* Fix order
* Fix generated code
* Update httpapi/httpapi.go
Co-authored-by: Bryan <bryan@coder.com>
* feat: Add cryptorand package for random string and number generation
This package is taken from the monorepo, and was renamed from crand
for improved clarity. It will be used for API key generation.
* Remove "Must" functions
There is little precedent for functions leading with Must being
idiomatic in Go code. Ignoring errors in favor of a panic is
dangerous in highly-reliable code.
* Remove unused must.go
This change bundles the static assets like we have for v1 - using the [`embed`](https://pkg.go.dev/embed) go package. Fixes #22
In addition, it sets up a development script that runs `coderd` locally and serves the front-end, with hot-reloading. The script used is `./develop.sh`:
![2022-01-14 17 30 14](https://user-images.githubusercontent.com/88213859/149603926-f673d3d3-ba12-4eda-bcdd-427252405480.gif)
> NOTE: The UI is still placeholder, of course. Need to start testing out a simple, placeholder flow for the new v2 world as a next step
Summary of changes:
- Add build steps for `go` in the `Makefile`
- Add a step for production build, in which we use the `embed` tag
- Add a step for development, which doesn't need the `embed` tag - so we don't need to build the front-end twice
- Add `next export` build step to output front-end artifacts in `out`
- Add a `site` package for `go`
- Add `embed_static.go` and `embed.go`. This is mostly brought in as-is from v1, except removing some intercom/sentry CSP entries that we aren't using.
- Add a [next development server](https://nextjs.org/docs/advanced-features/custom-server)
- Add a `v2-dev` script, that runs `coderd` and the `next` dev server side-by-side
- Use the `site` package as the fallback handler.
- Add `.gitignore` entries for additional build collateral
* feat: Add v1 schema types
This adds compatibility for sharing data with Coder v1. Since the tables are the same, all CRUD operations should function as expected.
* Add license table
* feat: Add Coder Daemon to serve the API
coderd is a public package which will be consumed by v1 to support running both at the same time. The frontend will need to be compiled and statically served as part of this eventually.
* Fix initial migration
* Move to /api/v2
* Increase peer disconnectedTimeout to reduce flakes on slow machines
* Reduce timeout again
* Fix version for pion/ice
* feat: Create provisioner abstraction
Creates a provisioner abstraction that takes prior art from the Terraform plugin system. It's safe to assume this code will change a lot when it becomes integrated with provisionerd.
Closes #10.
* Ignore generated files in diff view
* Check for unstaged file changes
* Install protoc-gen-go
* Use proper drpc plugin version
* Fix serve closed pipe
* Install sqlc with curl for speed
* Fix install command
* Format CI action
* Add linguist-generated and closed pipe test
* Cleanup code from comments
* Add dRPC comment
* Add Terraform installer for cross-platform
* Build provisioner tests on Linux only
This package was pulled straight from github.com/coder/m. Nothing has been changed.
It will be used for networking clients<->workspaces, and coderd<->provisionerd.
* chore: Initial database scaffolding
This implements migrations and code generation for interfacing with a PostgreSQL database.
A dependency is added for the "postgres" binary on the host, but that seems like an acceptable requirement considering it's our primary database.
An in-memory database object can be created for simple cross-OS and fast testing.
* Run tests in CI
* Use Docker instead of binaries on the host
* Skip database tests on non-Linux operating systems
* chore: Add golangci-lint and codecov
* Use consistent file names