Commit Graph

157 Commits

Author SHA1 Message Date
Spike Curtis f01cab9894
feat: use tailnet v2 API for coordination (#11638)
This one is huge, and I'm sorry.

The problem is that once I change `tailnet.Conn` to start doing v2 behavior, I kind of have to change it everywhere, including in CoderSDK (CLI), the agent, wsproxy, and ServerTailnet.

There is still a bit more cleanup to do, and I need to add code so that when we lose connection to the Coordinator, we mark all peers as LOST, but that will be in a separate PR since this is big enough!
2024-01-22 11:07:50 +04:00
Spike Curtis 4071f1713b
feat: add logging to agent stats and JetBrains tracking (#11364)
Adds logging so we can hope to diagnose #11363
2024-01-02 13:34:49 +04:00
Steven Masley b7bdb17460
feat: add metrics to workspace agent scripts (#11132)
* push startup script metrics to agent
2023-12-13 11:45:43 -06:00
Dean Sheather a9c0c01629
chore: fix flake in listening ports test (#10833) 2023-11-22 09:30:51 +00:00
Mathias Fredriksson 7fecd39e23
fix(agent/agentscripts): display informative error for ErrWaitDelay (#10407)
Fixes #10400
2023-10-27 19:07:26 +03:00
Mathias Fredriksson 1a2aea3a6b
fix(agent): prevent metadata from being discarded if report is slow (#10386) 2023-10-23 17:02:54 +00:00
Mathias Fredriksson 76c65b1e1b
fix(agent): send metadata in batches (#10225)
Fixes #9782

---

I recommend reviewing with ignore whitespace.
2023-10-13 17:48:25 +03:00
Mathias Fredriksson 4857d4bd55
feat(codersdk/agentsdk): use new agent metadata batch endpoint (#10224)
Part of #9782
2023-10-13 17:32:28 +03:00
Mathias Fredriksson 7eeba15d16
feat(coderd): add support for sending batched agent metadata (#10223)
Part of #9782
2023-10-13 16:37:55 +03:00
Jon Ayers 4452a1484d
fix: fix log spam related to skipping custom nice scores (#10206) 2023-10-11 02:32:50 -05:00
Spike Curtis 54fd350913
feat: improve logging for speedtest connections
part of #7963

improve connection logging for speedtest connections
2023-10-09 20:48:28 +04:00
Spike Curtis 17e889af16
feat: improve logging for reconnectingPTY connections
part of #7963

improves connection logging on reconnectingPTY
2023-10-09 20:35:50 +04:00
Kyle Carberry 1262eef2c0
feat: add support for `coder_script` (#9584)
* Add basic migrations

* Improve schema

* Refactor agent scripts into it's own package

* Support legacy start and stop script format

* Pipe the scripts!

* Finish the piping

* Fix context usage

* It works!

* Fix sql query

* Fix SQL query

* Rename `LogSourceID` -> `SourceID`

* Fix the FE

* fmt

* Rename migrations

* Fix log tests

* Fix lint err

* Fix gen

* Fix story type

* Rename source to script

* Fix schema jank

* Uncomment test

* Rename proto to TimeoutSeconds

* Fix comments

* Fix comments

* Fix legacy endpoint without specified log_source

* Fix non-blocking by default in agent

* Fix resources tests

* Fix dbfake

* Fix resources

* Fix linting I think

* Add fixtures

* fmt

* Fix startup script behavior

* Fix comments

* Fix context

* Fix cancel

* Fix SQL tests

* Fix e2e tests

* Interrupt on Windows

* Fix agent leaking script process

* Fix migrations

* Fix stories

* Fix duplicate logs appearing

* Gen

* Fix log location

* Fix tests

* Fix tests

* Fix log output

* Show display name in output

* Fix print

* Return timeout on start context

* Gen

* Fix fixture

* Fix the agent status

* Fix startup timeout msg

* Fix command using shared context

* Fix timeout draining

* Change signal type

* Add deterministic colors to startup script logs

---------

Co-authored-by: Muhammad Atif Ali <atif@coder.com>
2023-09-25 16:47:17 -05:00
Jon Ayers 7311ffbd9d
feat: implement agent process management (#9461)
- An opt-in feature has been added to the agent to allow
   deprioritizing non coder-related processes for CPU by setting their
   niceness level to 10.
- Opting in to the feature requires setting CODER_PROC_PRIO_MGMT to a non-empty value.
2023-09-14 19:45:05 -05:00
Mathias Fredriksson 19d7da3d24
refactor(coderd/database): split `Time` and `Now` into `dbtime` package (#9482)
Ref: #9380
2023-09-01 16:50:12 +00:00
Mathias Fredriksson 702b064cac
refactor: split coderd/gitauth into two, add cli/gitauth (#9479)
* refactor: split coderd/gitauth into two, add cli/gitauth

Ref: #9380
2023-09-01 15:41:22 +00:00
Dean Sheather 64df076328
feat: add server flag to force DERP to use always websockets (#9238) 2023-08-24 17:22:31 +00:00
Kyle Carberry 22e781eced
chore: add /v2 to import module path (#9072)
* chore: add /v2 to import module path

go mod requires semantic versioning with versions greater than 1.x

This was a mechanical update by running:
```
go install github.com/marwan-at-work/mod/cmd/mod@latest
mod upgrade
```

Migrate generated files to import /v2

* Fix gen
2023-08-18 18:55:43 +00:00
Mathias Fredriksson bbaa057e15
fix(agent): log correct script timeout for startup script (#9190) 2023-08-18 17:35:49 +00:00
Colin Adler 344d32b2f1
feat(coderd): expire agents from server tailnet (#9092) 2023-08-14 20:38:37 -05:00
Asher a08f7b8fb9
fix: catch missing output with reconnecting PTY (#9094)
I forgot that waiting on the cond releases the lock so it was possible
to get pty output after writing the buffer but before adding the pty to
the map.  To fix, add the pty to the map while under the same lock where
we read from the buffer.

The rest does not need to be behind the lock so I moved it out of
doAttach, and that also means we no longer need
waitForStateOrContextLocked.

Also, this can hit a logger error saying the attach failed which fails
the tests however it is not that the attach failed, just that the
process already ran and exited, so when the process exits do not
set an error, instead for now assume this is an expected close.
2023-08-14 15:54:23 -08:00
Asher b993cab49a
fix: use screen for reconnecting terminal sessions on Linux if available (#8640)
* Add screen backend for reconnecting ptys

The screen portion is a port from wsep.  There is an interface that lets
you choose between screen and the previous method.  By default it will
choose screen if it is installed but this can be overidden (mostly for
tests).

The tests use a scanner instead of a reader now because the reader will
loop infinitely at the end of a stream.

Replace /bin/bash with bash since bash is not always in /bin.

* Remove connection_id from reconnecting PTY logger

This serves multiple connections so it makes no sense to scope it to a
single connection.

Also lets us use "connection_id" when logging write errors instead of
"other_conn_id".

* Use PATH to test buffered reconnecting pty
2023-08-14 11:19:13 -08:00
Dean Sheather 07fd73c4a0
chore: allow multiple agent subsystems, add exectrace (#8933) 2023-08-08 22:10:28 -07:00
Dean Sheather 3c52b01850
chore: add tailscale magicsock debug logging controls (#8982) 2023-08-08 17:56:08 +00:00
Dean Sheather b955c5fefc
fix: avoid agent runLoop exiting due to ws ping (#8852) 2023-08-02 07:25:07 +00:00
Dean Sheather c575292ba6
fix: fix tailnet netcheck issues (#8802) 2023-08-02 01:50:43 +10:00
Kyle Carberry bd944e0d21
chore: rename startup logs to agent logs (#8649)
* chore: rename startup logs to agent logs

This also adds a `source` property to every agent log. It
should allow us to group logs and display them nicer in
the UI as they stream in.

* Fix migration order

* Fix naming

* Rename the frontend

* Fix tests

* Fix down migration

* Match enums for workspace agent logs

* Fix inserting log source

* Fix migration order

* Fix logs tests

* Fix psql insert
2023-07-28 15:57:23 +00:00
Ammar Bandukwala 25e30c6f41
feat(cli): support fine-grained server log filtering (#8748) 2023-07-26 16:46:22 -05:00
Dean Sheather 2f0a9996e7
chore: add derpserver to wsproxy, add proxies to derpmap (#7311) 2023-07-27 02:21:04 +10:00
Colin Adler 71d4e4e6e8
fix(agent): check agent metadata every second instead of minute (#8614) 2023-07-20 14:02:58 -05:00
Colin Adler c8d65de4b7
test(agent): fix `TestAgent_Metadata/Once` flake (#8613) 2023-07-20 18:49:44 +00:00
Mathias Fredriksson 5fd77ad7cf
test(agent): fix service banner and metadata intervals (#8516) 2023-07-14 16:10:26 +03:00
Colin Adler c47b78c44b
chore: replace wsconncache with a single tailnet (#8176) 2023-07-12 17:37:31 -05:00
Mathias Fredriksson 3f058f28e7
test(agent): use afero for motd tests to allow parallel execution (#8329) 2023-07-06 10:57:51 +03:00
Asher 6015319e9d
feat: show service banner in SSH/TTY sessions (#8186)
* Allow workspace agents to get appearance
* Poll for service banner every two minutes
* Show service banner before MOTD if not quiet
2023-06-30 10:41:29 -08:00
Mathias Fredriksson 3b9b06fe5a
feat(codersdk/agentsdk): add `StartupLogsSender` and `StartupLogsWriter` (#8129)
This commit adds two new `agentsdk` functions, `StartupLogsSender` and
`StartupLogsWriter` that can be used by any client looking to send
startup logs to coderd.

We also refactor the `agent` to use these new functions.

As a bonus, agent startup logs are separated into "info" and "error"
levels to separate stdout and stderr.

---------

Co-authored-by: Marcin Tojek <mtojek@users.noreply.github.com>
2023-06-22 23:28:59 +03:00
Dean Sheather a28d422c35
feat: add flag to disable all direct connections (#7936) 2023-06-21 22:02:05 +00:00
Marcin Tojek 4fb4c9b270
chore: add more rules to ensure logs consistency (#8104) 2023-06-21 12:00:38 +02:00
Mathias Fredriksson ea4b7d60d7
fix(agent): refactor `trackScriptLogs` to avoid deadlock (#8084)
During agent close it was possible for the startup script logs consumer
to enter a deadlock state where by agent close was waiting via
`a.trackConnGoroutine` and the log reader for a flush event.

This refactor removes the mutex in favor of channel communication and
relies on two goroutines without shared state.
2023-06-20 18:05:11 +03:00
Mathias Fredriksson 8dac0356ed
refactor: replace startup script logs EOF with starting/ready time (#8082)
This commit reverts some of the changes in #8029 and implements an
alternative method of keeping track of when the startup script has ended
and there will be no more logs.

This is achieved by adding new agent fields for tracking when the agent
enters the "starting" and "ready"/"start_error" lifecycle states. The
timestamps simplify logic since we don't need understand if the current
state is before or after the state we're interested in. They can also be
used to show data like how long the startup script took to execute. This
also allowed us to remove the EOF field from the logs as the
implementation was problematic when we returned the EOF log entry in the
response since requesting _after_ that ID would give no logs and the API
would thus lose track of EOF.
2023-06-20 14:41:55 +03:00
Marcin Tojek b1d1b63113
chore: ensure logs consistency across Coder (#8083) 2023-06-20 12:30:45 +02:00
Mathias Fredriksson 0c5077464b
fix: avoid missed logs when streaming startup logs (#8029)
* feat(coderd,agent): send startup log eof at the end

* fix(coderd): fix edge case in startup log pubsub

* fix(coderd): ensure startup logs are closed on lifecycle state change (fallback)

* fix(codersdk): fix startup log channel shared memory bug

* fix(site): remove the EOF log line
2023-06-16 17:14:22 +03:00
Mathias Fredriksson c916a9e67f
fix(agent): guard against multiple rpty race for same id (#7998)
* fix(agent): guard against multiple rpty race for same id
* fix(agent): ensure pty is closed on error
2023-06-13 15:14:07 +00:00
Marcin Tojek 14efdadd3c
feat: Collect agent SSH metrics (#7584) 2023-05-25 12:52:36 +02:00
Jon Ayers 00a2413c03
feat: add telemetry support for workspace agent subsystem (#7579) 2023-05-17 22:49:25 -05:00
Spike Curtis 9c030a8888
fix: pty.Start respects context on Windows too (#7373)
* fix: pty.Start respects context on Windows too

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix windows imports; rename ToExec -> AsExec

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix import in windows test

Signed-off-by: Spike Curtis <spike@coder.com>

---------

Signed-off-by: Spike Curtis <spike@coder.com>
2023-05-03 11:43:05 +04:00
Ammar Bandukwala 465fe8658d
chore: skip timing-sensistive AgentMetadata test in the standard suite (#7237)
* chore: skip timing-sensistive AgentMetadata test in the standard suite

* Add test-timing target

* fix windows?

* Works on my Windows desktop?

* Use tag system

* fixup! Use tag system
2023-05-02 10:41:41 +00:00
Marcin Tojek bb0a38b161
feat: Implement aggregator for agent metrics (#7259) 2023-04-27 12:34:00 +02:00
Spike Curtis b6666cf1cf
chore: tailnet debug logging (#7260)
* Enable discovery (disco) debug

Signed-off-by: Spike Curtis <spike@coder.com>

* Better debug on reconnectingPTY

Signed-off-by: Spike Curtis <spike@coder.com>

* Agent logging in appstest

Signed-off-by: Spike Curtis <spike@coder.com>

* More reconnectingPTY logging

Signed-off-by: Spike Curtis <spike@coder.com>

* Add logging to coordinator

Signed-off-by: Spike Curtis <spike@coder.com>

* Update agent/agent.go

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>

* Update agent/agent.go

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>

* Update agent/agent.go

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>

* Update agent/agent.go

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>

* Clarify logs; remove unrelated changes

Signed-off-by: Spike Curtis <spike@coder.com>

---------

Signed-off-by: Spike Curtis <spike@coder.com>
Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
2023-04-27 13:59:01 +04:00
Colin Adler 3eb7f06bf1
feat(agent): add http debug routes for magicsock (#7287) 2023-04-26 13:01:49 -05:00