Commit Graph

6577 Commits

Author SHA1 Message Date
Spike Curtis 04991f425a
fix: set node callback each time we reinit the coordinator in servertailnet (#12140)
I think this will resolve #12136 but lets get a proper test at the system level before closing.

Before this change, we only register the node callback at start of day for the server tailnet.  If the coordinator changes, like we know happens when we are licensed for the PGCoordinator, we close the connection to the old coord, and open a new one to the new coord.

The callback is designed to direct the updates to the new coordinator, but there is nothing that specifically triggers it to fire after we connect to the new coordinator.

If we have STUN, then period re-STUNs will generally get it to fire eventually, but without STUN it we could go indefinitely without a callback.

This PR changes the servertailnet to re-register the callback each time we reconnect to the coordinator.  Registering a callback (even if it's the same callback) triggers an immediate call with our node information, so the new coordinator will have it.
2024-02-14 20:45:31 +04:00
Spike Curtis 5a0d240bc3
feat: expose DERP server debug metrics (#12135)
Adds some debug endpoints for looking into the DERP server.

The `api/v2/debug/derp/traffic` endpoint requires the `ss` utility to be present in order to function.  I have *not* added the `iproute2` package to our base image as it adds 11MB, so this endpoint won't be useful by default.  However, in a debugging situation, we could exec into the container and then `apk add iproute2`, or build a special debug image.

The `api/v2/debug/expvar` handler contains DERP metrics as well as commandline and memstats.

Example:

```
{
"alert_failed": 0,
"alert_generated": 0,
"cmdline": ["/Users/spike/repos/coder/build/coder_darwin_arm64","--global-config","/Users/spike/repos/coder/.coderv2","server","--http-address","0.0.0.0:3000","--swagger-enable","--access-url","http://127.0.0.1:3000","--dangerous-allow-cors-requests=true"],
"derp": {"accepts": 1, "average_queue_duration_ms": 0, "bytes_received": 0, "bytes_sent": 0, "counter_packets_dropped_reason": {"gone_disconnected": 0, "gone_not_here": 0, "queue_head": 0, "queue_tail": 0, "unknown_dest": 0, "unknown_dest_on_fwd": 0, "write_error": 0}, "counter_packets_dropped_type": {"disco": 0, "other": 0}, "counter_packets_received_kind": {"disco": 0, "other": 0}, "counter_tcp_rtt": {}, "counter_total_dup_client_conns": 0, "gauge_clients_local": 1, "gauge_clients_remote": 0, "gauge_clients_total": 1, "gauge_current_connections": 1, "gauge_current_dup_client_conns": 0, "gauge_current_dup_client_keys": 0, "gauge_current_file_descriptors": 0, "gauge_current_home_connections": 1, "gauge_memstats_sys0": 20874504, "gauge_watchers": 0, "got_ping": 0, "home_moves_in": 0, "home_moves_out": 0, "multiforwarder_created": 0, "multiforwarder_deleted": 0, "packet_forwarder_delete_other_value": 0, "packets_dropped": 0, "packets_forwarded_in": 0, "packets_forwarded_out": 0, "packets_received": 0, "packets_sent": 0, "peer_gone_disconnected_frames": 0, "peer_gone_not_here_frames": 0, "sent_pong": 0, "unknown_frames": 0, "version": "1.47.0-dev20240214-t64db8c604"},
"memstats": {"Alloc":286506256,"TotalAlloc":297594632,"Sys":310621512,"Lookups":0,"Mallocs":304204,"Frees":171570,"HeapAlloc":286506256,"HeapSys":294060032,"HeapIdle":3694592,"HeapInuse":290365440,"HeapReleased":3620864,"HeapObjects":132634,"StackInuse":3735552,"StackSys":3735552,"MSpanInuse":347256,"MSpanSys":358512,"MCacheInuse":9600,"MCacheSys":15600,"BuckHashSys":1469877,"GCSys":9434896,"OtherSys":1547043,"NextGC":551867656,"LastGC":1707892877408883000,"PauseTotalNs":1247000,"PauseNs":[200333,229375,239875,209542,106958,203792,57125,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],"PauseEnd":[1707892876217481000,1707892876219726000,1707892876222273000,1707892876226151000,1707892876234815000,1707892877398146000,1707892877408883000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],"NumGC":7,"NumForcedGC":0,"GCCPUFraction":0.0022425810335762954,"EnableGC":true,"DebugGC":false,"BySize":[{"Size":0,"Mallocs":0,"Frees":0},{"Size":8,"Mallocs":14396,"Frees":9143},{"Size":16,"Mallocs":89090,"Frees":50507},{"Size":24,"Mallocs":40839,"Frees":24456},{"Size":32,"Mallocs":22404,"Frees":12379},{"Size":48,"Mallocs":51174,"Frees":23718},{"Size":64,"Mallocs":15406,"Frees":3501},{"Size":80,"Mallocs":6688,"Frees":2352},{"Size":96,"Mallocs":2567,"Frees":374},{"Size":112,"Mallocs":19371,"Frees":16883},{"Size":128,"Mallocs":2873,"Frees":1061},{"Size":144,"Mallocs":5600,"Frees":2742},{"Size":160,"Mallocs":2159,"Frees":622},{"Size":176,"Mallocs":454,"Frees":86},{"Size":192,"Mallocs":227,"Frees":128},{"Size":208,"Mallocs":1407,"Frees":732},{"Size":224,"Mallocs":1365,"Frees":1090},{"Size":240,"Mallocs":82,"Frees":48},{"Size":256,"Mallocs":310,"Frees":162},{"Size":288,"Mallocs":1945,"Frees":562},{"Size":320,"Mallocs":1200,"Frees":458},{"Size":352,"Mallocs":133,"Frees":33},{"Size":384,"Mallocs":582,"Frees":51},{"Size":416,"Mallocs":747,"Frees":200},{"Size":448,"Mallocs":113,"Frees":22},{"Size":480,"Mallocs":34,"Frees":21},{"Size":512,"Mallocs":951,"Frees":91},{"Size":576,"Mallocs":364,"Frees":122},{"Size":640,"Mallocs":532,"Frees":270},{"Size":704,"Mallocs":93,"Frees":39},{"Size":768,"Mallocs":83,"Frees":35},{"Size":896,"Mallocs":308,"Frees":175},{"Size":1024,"Mallocs":226,"Frees":122},{"Size":1152,"Mallocs":198,"Frees":100},{"Size":1280,"Mallocs":314,"Frees":171},{"Size":1408,"Mallocs":77,"Frees":47},{"Size":1536,"Mallocs":80,"Frees":54},{"Size":1792,"Mallocs":199,"Frees":107},{"Size":2048,"Mallocs":112,"Frees":48},{"Size":2304,"Mallocs":71,"Frees":32},{"Size":2688,"Mallocs":206,"Frees":81},{"Size":3072,"Mallocs":39,"Frees":15},{"Size":3200,"Mallocs":16,"Frees":7},{"Size":3456,"Mallocs":44,"Frees":29},{"Size":4096,"Mallocs":192,"Frees":83},{"Size":4864,"Mallocs":44,"Frees":25},{"Size":5376,"Mallocs":105,"Frees":43},{"Size":6144,"Mallocs":25,"Frees":5},{"Size":6528,"Mallocs":22,"Frees":7},{"Size":6784,"Mallocs":3,"Frees":0},{"Size":6912,"Mallocs":4,"Frees":2},{"Size":8192,"Mallocs":59,"Frees":10},{"Size":9472,"Mallocs":31,"Frees":12},{"Size":9728,"Mallocs":5,"Frees":2},{"Size":10240,"Mallocs":5,"Frees":0},{"Size":10880,"Mallocs":27,"Frees":11},{"Size":12288,"Mallocs":4,"Frees":1},{"Size":13568,"Mallocs":4,"Frees":2},{"Size":14336,"Mallocs":9,"Frees":2},{"Size":16384,"Mallocs":10,"Frees":2},{"Size":18432,"Mallocs":4,"Frees":2}]},
"warning_failed": 0,
"warning_generated": 0
}
```

If we find the DERP metrics useful we could consider how to include them in Prometheus scrapes based on the tailnet `varz` package.  That's for a later PR if at all.
2024-02-14 15:11:45 +04:00
Muhammad Atif Ali 53c55439be
chore (examples/templates/incus): fix a typo (#12123) 2024-02-13 19:16:33 +00:00
Steven Masley 5d483a7ea1
fix: do not query user_link for deleted accounts (#12112) 2024-02-13 13:02:21 -06:00
Steven Masley 06f3ab1206
chore: add database test fixture to insert non-unique linked_ids (#12111)
* chore: add database test fixture to insert non-unique linked_ids
2024-02-13 12:06:47 -06:00
Kayla Washburn-Love d37b131426
feat: add activity status and autostop reason to workspace overview (#11987) 2024-02-13 10:50:17 -07:00
Muhammad Atif Ali e53d8bdb50
docs: update modules docs (#11911) 2024-02-13 15:35:09 +00:00
Cian Johnston 68641f9e2f
chore(examples/templates/incus): fix incus group name in README (#12120) 2024-02-13 15:31:07 +00:00
dependabot[bot] e938690b1e
chore: bump golang.org/x/mod from 0.14.0 to 0.15.0 (#12094)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-13 18:25:26 +03:00
Muhammad Danish 3c536aa880
ci: use repo secret for syncing winget-pkgs fork (#12108) 2024-02-13 18:25:13 +03:00
dependabot[bot] 28bbdee655
chore: bump github.com/go-playground/validator/v10 (#12096)
Bumps [github.com/go-playground/validator/v10](https://github.com/go-playground/validator) from 10.17.0 to 10.18.0.
- [Release notes](https://github.com/go-playground/validator/releases)
- [Commits](https://github.com/go-playground/validator/compare/v10.17.0...v10.18.0)

---
updated-dependencies:
- dependency-name: github.com/go-playground/validator/v10
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-13 17:54:13 +03:00
dependabot[bot] 4760e85c15
chore: bump golang.org/x/net from 0.20.0 to 0.21.0 (#12097)
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.20.0 to 0.21.0.
- [Commits](https://github.com/golang/net/compare/v0.20.0...v0.21.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-13 17:53:16 +03:00
dependabot[bot] 9560d9a68b
ci: bump the github-actions group with 2 updates (#12091)
Bumps the github-actions group with 2 updates: [crate-ci/typos](https://github.com/crate-ci/typos) and [aquasecurity/trivy-action](https://github.com/aquasecurity/trivy-action).


Updates `crate-ci/typos` from 1.18.0 to 1.18.2
- [Release notes](https://github.com/crate-ci/typos/releases)
- [Changelog](https://github.com/crate-ci/typos/blob/master/CHANGELOG.md)
- [Commits](https://github.com/crate-ci/typos/compare/v1.18.0...v1.18.2)

Updates `aquasecurity/trivy-action` from 0.16.1 to 0.17.0
- [Release notes](https://github.com/aquasecurity/trivy-action/releases)
- [Commits](d43c1f16c0...84384bd6e7)

---
updated-dependencies:
- dependency-name: crate-ci/typos
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: github-actions
- dependency-name: aquasecurity/trivy-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: github-actions
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-13 17:52:12 +03:00
Garrett Delfosse 3ab3a62bef
feat: add port-sharing backend (#11939) 2024-02-13 09:31:20 -05:00
Cian Johnston c939416702
chore(examples): add sample Incus template (#12114)
Adds sample incus template created for FOSDEM 2024; there's enough intricacy involved to make it worth persisting
2024-02-13 14:30:31 +00:00
Dean Sheather e1e352d8c1
feat: add template activity_bump property (#11734)
Allows template admins to configure the activity bump duration. Defaults to 1h.
2024-02-13 07:00:35 +00:00
Dean Sheather fead57f304
fix: allow access to unhealthy/initializing apps (#12086) 2024-02-13 16:30:49 +10:00
Cian Johnston ec25fb8bbc
fix(docs/networking/stun): convert svg diagrams to png 2024-02-12 17:27:53 +00:00
Cian Johnston 2fabc9499a
fix(docs): remove inline mermaid diagrams (#12107) 2024-02-12 15:56:37 +00:00
Cian Johnston 1cc51b009a
chore(examples): remove deprecated startup_script_timeout and shutdown_script_timeout (#12104)
Removes deprecated startup_script_timeout and shutdown_script_timeout from our example templates.

Co-authored-by: Muhammad Atif Ali <atif@coder.com>
2024-02-12 14:29:41 +00:00
Marcin Tojek 3e68650791
feat: support `order` property of `coder_app` resource (#12077) 2024-02-12 15:11:31 +01:00
Cian Johnston 1e9a3c952f
chore(docs/networking/stun): fix diagram in section 2 (#12103) 2024-02-12 12:33:41 +00:00
Cian Johnston d1a522a8fc
chore(docs): add requirements re ports and stun server to docs (#12026)
Adds documentation on port requirements and a short overview of STUN with some example scenarios.

Co-authored-by: Dean Sheather <dean@deansheather.com>
Co-authored-by: Spike Curtis <spike@coder.com>
2024-02-12 11:42:27 +00:00
Dean Sheather 2fc3064653
chore: add tests for app ID copy in app healths (#12088) 2024-02-12 05:49:48 +00:00
Colin Adler 06254a167f
chore(docs): add `v2.8.2` changelog (#12089) 2024-02-12 05:48:34 +00:00
Dean Sheather 429144da22
fix: copy app ID in healthcheck (#12087) 2024-02-12 05:01:16 +00:00
Eric Paulsen bb308851f5
docs: fix jetbrains reconnect faq (#12073)
* docs: fix jetbrains reconnect faq

* make: fmt

* add asher feedback
2024-02-09 23:44:33 +00:00
Bruno Quaresma 390217b396
feat(site): add create template from scratch (#12082) 2024-02-09 14:42:26 +00:00
Cian Johnston 2b307c7c4e
fix(cli/server): do not redirect /healthz (#12080) 2024-02-09 13:44:47 +00:00
Spike Curtis 92b2e26a48
feat: send log limit exceeded in response, not error (#12078)
When we exceed the db-imposed limit of logs, we need to communicate that back to the agent.  In v1 we did it with a 4xx-level HTTP status, but with dRPC, the errors are delivered as strings, which feels fragile to me for something we want to gracefully handle.

So, this PR adds the log limit exceeded as a field on the response message, and fixes the API handler to set it as appropriate instead of an error.
2024-02-09 16:17:20 +04:00
Spike Curtis 1f5a6d59ba
chore: consolidate websocketNetConn implementations (#12065)
Consolidates websocketNetConn from multiple packages in favor of a central one in codersdk
2024-02-09 11:39:08 +04:00
Colin Adler ec8e41f516
chore: add logging around agent app health reporting (#12071) 2024-02-08 23:37:44 -06:00
Marcin Tojek c0e169ebf9
feat: support custom order of agent metadata (#12066) 2024-02-08 17:29:34 +01:00
Mathias Fredriksson e659957b65
fix(cli/ssh): prevent reads/writes to stdin/stdout in stdio mode (#12045)
Fixes #11530
2024-02-08 13:09:42 +02:00
Spike Curtis 151aaadc23
fix: allow startup scripts larger than 32k (#12060)
Fixes #12057 and adds a regression test.
2024-02-07 22:26:42 +04:00
Bruno Quaresma 4d63a473b2
fix(site): fix infinity loading when template has no previous version (#12059) 2024-02-07 14:56:09 -03:00
Mathias Fredriksson 040ce40ed8
fix(dogfood): add ability to synchronize with startup script via done file (#12058) 2024-02-07 19:16:18 +02:00
Bruno Quaresma d8a8070986
fix(site): enable submit when auto start and stop are both disabled (#12055) 2024-02-07 14:06:48 -03:00
Bruno Quaresma 4b1bac31b6
feat(site): allow any file extension on template editor (#12000) 2024-02-07 13:24:28 -03:00
Marcin Tojek 4e7b208068
fix(site): e2e: print API backend calls (#12051) 2024-02-07 15:50:07 +01:00
Eric Paulsen 1abe0cfa1a
docs: fix /audit & /insights params (#12043) 2024-02-07 08:38:54 -05:00
Spike Curtis 1cf4b62867
feat: change agent to use v2 API for reporting stats (#12024)
Modifies the agent to use the v2 API to report its statistics, using the `statsReporter` subcomponent.
2024-02-07 15:26:41 +04:00
Muhammad Atif Ali 70ad833b02
ci: fix GH_TOKEN in release.yaml (#12044) 2024-02-07 13:37:11 +03:00
Mathias Fredriksson f2aef0726b
fix(agent/agentssh): allow scp to exit with zero status (#12028)
Fixes #11786
2024-02-07 10:22:31 +02:00
Josh Vawdrey d3ccb07361
feat(cli): support header and header-command in config-ssh (#10413)
Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
2024-02-07 10:21:26 +02:00
Ben Potter d6cdaae8b1
docs: add v2.8.0 changelog (#12042)
* docs: add v2.8.0 changelog

* fmt
2024-02-07 00:14:17 +00:00
Cian Johnston 36808f19dc
feat!: update terraform to version 1.6.x, relax max version constraint (#12027)
* feat(provisioner): relax max terraform version constraint

* feat!(scripts/Dockerfile.base): update bundled terraform to 1.6.x

* bump terraform version in Dogfood image

* fix over-zealous rename
2024-02-06 17:58:26 -06:00
Kayla Washburn-Love b8e32a37de
fix: use `replace` when redirecting from /health (#12039)
`pushHistory` will break the back button, so we need to use `replaceHistory` instead
2024-02-06 14:27:32 -07:00
Marcin Tojek 3f04e98cfa
feat(cli): pull templates in zip format (#12032) 2024-02-06 19:17:29 +01:00
Spike Curtis 213ae69bee
fix: start timer before subscribing to avoid test race (#12031)
Fixes #12030

This is a good example of the kind of thing I'd like to address with a time-testing lib.  The problem is that there is a race between the watchdog starting it's timer and the test incrementing the time.  What would make this easier is if the time-testing library could wait for and assert the call to start the timer before incrementing the time.
2024-02-06 20:21:23 +04:00