This one is huge, and I'm sorry.
The problem is that once I change `tailnet.Conn` to start doing v2 behavior, I kind of have to change it everywhere, including in CoderSDK (CLI), the agent, wsproxy, and ServerTailnet.
There is still a bit more cleanup to do, and I need to add code so that when we lose connection to the Coordinator, we mark all peers as LOST, but that will be in a separate PR since this is big enough!
- An opt-in feature has been added to the agent to allow
deprioritizing non coder-related processes for CPU by setting their
niceness level to 10.
- Opting in to the feature requires setting CODER_PROC_PRIO_MGMT to a non-empty value.
* chore: add /v2 to import module path
go mod requires semantic versioning with versions greater than 1.x
This was a mechanical update by running:
```
go install github.com/marwan-at-work/mod/cmd/mod@latest
mod upgrade
```
Migrate generated files to import /v2
* Fix gen
I forgot that waiting on the cond releases the lock so it was possible
to get pty output after writing the buffer but before adding the pty to
the map. To fix, add the pty to the map while under the same lock where
we read from the buffer.
The rest does not need to be behind the lock so I moved it out of
doAttach, and that also means we no longer need
waitForStateOrContextLocked.
Also, this can hit a logger error saying the attach failed which fails
the tests however it is not that the attach failed, just that the
process already ran and exited, so when the process exits do not
set an error, instead for now assume this is an expected close.
* Add screen backend for reconnecting ptys
The screen portion is a port from wsep. There is an interface that lets
you choose between screen and the previous method. By default it will
choose screen if it is installed but this can be overidden (mostly for
tests).
The tests use a scanner instead of a reader now because the reader will
loop infinitely at the end of a stream.
Replace /bin/bash with bash since bash is not always in /bin.
* Remove connection_id from reconnecting PTY logger
This serves multiple connections so it makes no sense to scope it to a
single connection.
Also lets us use "connection_id" when logging write errors instead of
"other_conn_id".
* Use PATH to test buffered reconnecting pty
* chore: rename startup logs to agent logs
This also adds a `source` property to every agent log. It
should allow us to group logs and display them nicer in
the UI as they stream in.
* Fix migration order
* Fix naming
* Rename the frontend
* Fix tests
* Fix down migration
* Match enums for workspace agent logs
* Fix inserting log source
* Fix migration order
* Fix logs tests
* Fix psql insert
This commit adds two new `agentsdk` functions, `StartupLogsSender` and
`StartupLogsWriter` that can be used by any client looking to send
startup logs to coderd.
We also refactor the `agent` to use these new functions.
As a bonus, agent startup logs are separated into "info" and "error"
levels to separate stdout and stderr.
---------
Co-authored-by: Marcin Tojek <mtojek@users.noreply.github.com>
During agent close it was possible for the startup script logs consumer
to enter a deadlock state where by agent close was waiting via
`a.trackConnGoroutine` and the log reader for a flush event.
This refactor removes the mutex in favor of channel communication and
relies on two goroutines without shared state.
This commit reverts some of the changes in #8029 and implements an
alternative method of keeping track of when the startup script has ended
and there will be no more logs.
This is achieved by adding new agent fields for tracking when the agent
enters the "starting" and "ready"/"start_error" lifecycle states. The
timestamps simplify logic since we don't need understand if the current
state is before or after the state we're interested in. They can also be
used to show data like how long the startup script took to execute. This
also allowed us to remove the EOF field from the logs as the
implementation was problematic when we returned the EOF log entry in the
response since requesting _after_ that ID would give no logs and the API
would thus lose track of EOF.
* feat(coderd,agent): send startup log eof at the end
* fix(coderd): fix edge case in startup log pubsub
* fix(coderd): ensure startup logs are closed on lifecycle state change (fallback)
* fix(codersdk): fix startup log channel shared memory bug
* fix(site): remove the EOF log line
* chore: skip timing-sensistive AgentMetadata test in the standard suite
* Add test-timing target
* fix windows?
* Works on my Windows desktop?
* Use tag system
* fixup! Use tag system