docs: describe reference architectures (#12609)

Marcin Tojek 2024-03-15 17:01:45 +01:00 committed by GitHub
parent b0c4e7504c
commit bed2545636
6 changed files with 544 additions and 7 deletions


@ -0,0 +1,51 @@
# Reference Architecture: up to 1,000 users
The 1,000 users architecture is designed to cover a wide range of workflows.
Examples of organizations that might use this architecture include medium-sized
tech startups, educational institutions, or small to mid-sized enterprises.
**Target load**: API: up to 180 RPS
**High Availability**: non-essential for small deployments
## Hardware recommendations
### Coderd nodes
| Users | Node capacity | Replicas | GCP | AWS | Azure |
| ----------- | ------------------- | ------------------- | --------------- | ---------- | ----------------- |
| Up to 1,000 | 2 vCPU, 8 GB memory | 1-2 / 1 coderd each | `n1-standard-2` | `t3.large` | `Standard_D2s_v3` |
**Footnotes**:
- For small deployments (ca. 100 users, 10 concurrent workspace builds), it is
acceptable to deploy provisioners on `coderd` nodes.
### Provisioner nodes
| Users | Node capacity | Replicas | GCP | AWS | Azure |
| ----------- | -------------------- | ------------------------------ | ---------------- | ------------ | ----------------- |
| Up to 1,000 | 8 vCPU, 32 GB memory | 2 nodes / 30 provisioners each | `t2d-standard-8` | `t3.2xlarge` | `Standard_D8s_v3` |
**Footnotes**:
- An external provisioner is deployed as a Kubernetes pod.
### Workspace nodes
| Users | Node capacity | Replicas | GCP | AWS | Azure |
| ----------- | -------------------- | ----------------------- | ---------------- | ------------ | ----------------- |
| Up to 1,000 | 8 vCPU, 32 GB memory | 64 / 16 workspaces each | `t2d-standard-8` | `t3.2xlarge` | `Standard_D8s_v3` |
**Footnotes**:
- Assumes that a workspace user needs at least 2 GB of memory to work effectively. We
recommend against over-provisioning memory for developer workloads, as this may
lead to OOMKiller invocations.
- Maximum number of Kubernetes workspace pods per node: 256
### Database nodes
| Users | Node capacity | Replicas | Storage | GCP | AWS | Azure |
| ----------- | ------------------- | -------- | ------- | ------------------ | ------------- | ----------------- |
| Up to 1,000 | 2 vCPU, 8 GB memory | 1 | 512 GB | `db-custom-2-7680` | `db.t3.large` | `Standard_D2s_v3` |


@ -0,0 +1,59 @@
# Reference Architecture: up to 2,000 users
In the 2,000 users architecture, there is a moderate increase in traffic,
suggesting a growing user base or expanding operations. This setup is
well-suited for mid-sized companies experiencing growth or for universities
seeking to accommodate their expanding user populations.
Users can be evenly distributed between two regions or attached to different
clusters.
**Target load**: API: up to 300 RPS
**High Availability**: The mode is _enabled_; multiple replicas provide higher
deployment reliability under load.
## Hardware recommendations
### Coderd nodes
| Users | Node capacity | Replicas | GCP | AWS | Azure |
| ----------- | -------------------- | ----------------------- | --------------- | ----------- | ----------------- |
| Up to 2,000 | 4 vCPU, 16 GB memory | 2 nodes / 1 coderd each | `n1-standard-4` | `t3.xlarge` | `Standard_D4s_v3` |
### Provisioner nodes
| Users | Node capacity | Replicas | GCP | AWS | Azure |
| ----------- | -------------------- | ------------------------------ | ---------------- | ------------ | ----------------- |
| Up to 2,000 | 8 vCPU, 32 GB memory | 4 nodes / 30 provisioners each | `t2d-standard-8` | `t3.2xlarge` | `Standard_D8s_v3` |
**Footnotes**:
- An external provisioner is deployed as a Kubernetes pod.
- It is not recommended to run provisioner daemons on `coderd` nodes.
- Consider separating provisioners into different namespaces to support
zero-trust or multi-cloud deployments.
### Workspace nodes
| Users | Node capacity | Replicas | GCP | AWS | Azure |
| ----------- | -------------------- | ------------------------ | ---------------- | ------------ | ----------------- |
| Up to 2,000 | 8 vCPU, 32 GB memory | 128 / 16 workspaces each | `t2d-standard-8` | `t3.2xlarge` | `Standard_D8s_v3` |
**Footnotes**:
- Assumes that a workspace user needs 2 GB of memory to work effectively
- Maximum number of Kubernetes workspace pods per node: 256
- Nodes can be distributed across two regions, not necessarily split evenly,
depending on developer team sizes
### Database nodes
| Users | Node capacity | Replicas | Storage | GCP | AWS | Azure |
| ----------- | -------------------- | -------- | ------- | ------------------- | -------------- | ----------------- |
| Up to 2,000 | 4 vCPU, 16 GB memory | 1 | 1 TB | `db-custom-4-15360` | `db.t3.xlarge` | `Standard_D4s_v3` |
**Footnotes**:
- Consider adding more replicas if workspace activity exceeds 500 workspace
builds per day or if you need higher RPS.


@ -0,0 +1,62 @@
# Reference Architecture: up to 3,000 users
The 3,000 users architecture targets large-scale enterprises, possibly with both
on-premises and cloud deployments.
**Target load**: API: up to 550 RPS
**High Availability**: Typically, such scale requires a fully-managed HA
PostgreSQL service, and all Coder observability features enabled for operational
purposes.
**Observability**: Deploy monitoring solutions to gather Prometheus metrics and
visualize them with Grafana to gain detailed insights into infrastructure and
application behavior. This allows operators to respond quickly to incidents and
continuously improve the reliability and performance of the platform.
## Hardware recommendations
### Coderd nodes
| Users | Node capacity | Replicas | GCP | AWS | Azure |
| ----------- | -------------------- | ----------------- | --------------- | ----------- | ----------------- |
| Up to 3,000 | 8 vCPU, 32 GB memory | 4 / 1 coderd each | `n1-standard-4` | `t3.xlarge` | `Standard_D4s_v3` |
### Provisioner nodes
| Users | Node capacity | Replicas | GCP | AWS | Azure |
| ----------- | -------------------- | ------------------------ | ---------------- | ------------ | ----------------- |
| Up to 3,000 | 8 vCPU, 32 GB memory | 8 / 30 provisioners each | `t2d-standard-8` | `t3.2xlarge` | `Standard_D8s_v3` |
**Footnotes**:
- An external provisioner is deployed as a Kubernetes pod.
- It is strongly discouraged to run provisioner daemons on `coderd` nodes at
this level of scale.
- Separate provisioners into different namespaces to support zero-trust or
multi-cloud deployments.
### Workspace nodes
| Users | Node capacity | Replicas | GCP | AWS | Azure |
| ----------- | -------------------- | ------------------------------ | ---------------- | ------------ | ----------------- |
| Up to 3,000 | 8 vCPU, 32 GB memory | 256 nodes / 12 workspaces each | `t2d-standard-8` | `t3.2xlarge` | `Standard_D8s_v3` |
**Footnotes**:
- Assumes that a workspace user needs 2 GB of memory to work effectively
- Maximum number of Kubernetes workspace pods per node: 256
- Since workspace nodes can be distributed across regions, on-premises networks,
and cloud environments, consider using separate namespaces to support zero-trust
or multi-cloud deployments.
### Database nodes
| Users | Node capacity | Replicas | Storage | GCP | AWS | Azure |
| ----------- | -------------------- | -------- | ------- | ------------------- | --------------- | ----------------- |
| Up to 3,000 | 8 vCPU, 32 GB memory | 2 | 1.5 TB | `db-custom-8-30720` | `db.t3.2xlarge` | `Standard_D8s_v3` |
**Footnotes**:
- Consider adding more replicas if workspace activity exceeds 1,500 workspace
builds per day or if you need higher RPS.


@ -0,0 +1,344 @@
# Reference Architectures
This document provides prescriptive solutions and reference architectures to
support successful deployments of up to 3,000 users, and outlines at a high level
the methodology currently used to scale-test Coder.
## General concepts
This section outlines core concepts and terminology essential for understanding
Coder's architecture and deployment strategies.
### Administrator
An administrator is a user role within the Coder platform with elevated
privileges. Admins have access to administrative functions such as user
management, template definitions, insights, and deployment configuration.
### Coder
Coder, also known as _coderd_, is the main service recommended for deployment
with multiple replicas to ensure high availability. It provides an API for
managing workspaces and templates. Each _coderd_ replica has the capability to
host multiple [provisioners](#provisioner).
### User
A user is an individual who utilizes the Coder platform to develop, test, and
deploy applications using workspaces. Users can select available templates to
provision workspaces. They interact with Coder using the web interface, the CLI
tool, or directly calling API methods.
### Workspace
A workspace refers to an isolated development environment where users can write,
build, and run code. Workspaces are fully configurable and can be tailored to
specific project requirements, providing developers with a consistent and
efficient development environment. Workspaces can be autostarted and
autostopped, enabling efficient resource management.
Users can connect to workspaces using SSH or via workspace applications like
`code-server`, facilitating collaboration and remote access. Additionally,
workspaces can be parameterized, allowing users to customize settings and
configurations based on their unique needs. Workspaces are instantiated using
Coder templates and deployed on resources created by provisioners.
### Template
A template in Coder is a predefined configuration for creating workspaces.
Templates streamline the process of workspace creation by providing
pre-configured settings, tooling, and dependencies. They are built by template
administrators on top of Terraform, allowing for efficient management of
infrastructure resources. Additionally, templates can utilize Coder modules to
leverage existing features shared with other templates, enhancing flexibility
and consistency across deployments. Templates describe provisioning rules for
infrastructure resources offered by Terraform providers.
### Workspace Proxy
A workspace proxy serves as a relay connection option for developers connecting
to their workspace over SSH, a workspace app, or through port forwarding. It
helps reduce network latency for geo-distributed teams by minimizing the
distance network traffic needs to travel. Notably, workspace proxies do not
handle dashboard connections or API calls.
### Provisioner
Provisioners in Coder execute Terraform during workspace and template builds.
While the platform includes built-in provisioner daemons by default, there are
advantages to employing external provisioners. These external daemons provide
secure build environments and reduce server load, improving performance and
scalability. Each provisioner can handle a single concurrent workspace build,
allowing for efficient resource allocation and workload management.
### Registry
The Coder Registry is a platform where you can find starter templates and
_Modules_ for various cloud services and platforms.
Templates help create self-service development environments using
Terraform-defined infrastructure, while _Modules_ simplify template creation by
providing common features like workspace applications, third-party integrations,
or helper scripts.
Please note that the Registry is a hosted service and isn't available for
offline use.
## Scale-testing methodology
Scaling Coder involves planning and testing to ensure it can handle more load
without compromising service. This process encompasses infrastructure setup,
traffic projections, and aggressive testing to identify and mitigate potential
bottlenecks.
A dedicated Kubernetes cluster for Coder is a Kubernetes cluster specifically
configured to host and manage Coder workloads. Kubernetes provides container
orchestration capabilities, allowing Coder to efficiently deploy, scale, and
manage workspaces across a distributed infrastructure. This ensures high
availability, fault tolerance, and scalability for Coder deployments. Coder is
deployed on this cluster using the
[Helm chart](../install/kubernetes#install-coder-with-helm).
Our scale tests include the following stages:
1. Prepare environment: create expected users and provision workspaces.
2. SSH connections: establish user connections with agents, verifying their
ability to echo back received content.
3. Web Terminal: verify the PTY connection used for communication with Web
Terminal.
4. Workspace application traffic: assess the handling of user connections with
specific workspace apps, confirming their capability to echo back received
content effectively.
5. Dashboard evaluation: verify the responsiveness and stability of Coder
dashboards under varying load conditions. This is achieved by simulating user
interactions using instances of headless Chromium browsers.
6. Cleanup: delete workspaces and users created in step 1.
### Infrastructure and setup requirements
The scale test runner can distribute the workload so that individual scenarios
overlap, based on the workflow configuration:
| | T0 | T1 | T2 | T3 | T4 | T5 | T6 |
| -------------------- | --- | --- | --- | --- | --- | --- | --- |
| SSH connections | X | X | X | X | | | |
| Web Terminal (PTY) | | X | X | X | X | | |
| Workspace apps | | | X | X | X | X | |
| Dashboard (headless) | | | | X | X | X | X |
This pattern closely reflects how our customers naturally use the system. SSH
connections are heavily utilized because they're the primary communication
channel for IDEs using the VS Code and JetBrains plugins.
The basic setup of the scale test environment involves:
1. Scale tests runner (32 vCPU, 128 GB RAM)
2. Coder: 2 replicas (4 vCPU, 16 GB RAM)
3. Database: 1 instance (2 vCPU, 32 GB RAM)
4. Provisioner: 50 instances (0.5 vCPU, 512 MB RAM)
The test is deemed successful if users do not experience interruptions in their
workflows, `coderd` does not crash or require restarts, and no other internal
errors are observed.
### Traffic Projections
In our scale tests, we simulate activity from 2000 users, 2000 workspaces, and
2000 agents, with two items of workspace agent metadata being sent every 10
seconds. Here are the resulting metrics:
Coder:
- Median CPU usage for _coderd_: 3 vCPU, peaking at 3.7 vCPU while all tests are
running concurrently.
- Median API request rate: 350 RPS during dashboard tests, 250 RPS during Web
Terminal and workspace apps tests.
- 2000 agent API connections with latency: p90 at 60 ms, p95 at 220 ms.
- On average, 2400 WebSocket connections during dashboard tests.
Provisionerd:
- Median CPU usage is 0.35 vCPU during workspace provisioning.
Database:
- Median CPU utilization is 80%, with a significant portion dedicated to writing
workspace agent metadata.
- Memory utilization averages at 40%.
- `write_ops_count` between 6.7 and 8.4 operations per second.
## Available reference architectures
- [Up to 1,000 users](1k-users.md)
- [Up to 2,000 users](2k-users.md)
- [Up to 3,000 users](3k-users.md)
## Hardware recommendations
### Control plane: coderd
To ensure stability and reliability of the Coder control plane, it's essential
to focus on node sizing, resource limits, and the number of replicas. We
recommend referencing public cloud providers such as AWS, GCP, and Azure for
guidance on optimal configurations. A reasonable approach involves using scaling
formulas based on factors like CPU, memory, and the number of users.
While the minimum requirements specify 1 CPU core and 2 GB of memory per
`coderd` replica, it is recommended to allocate additional resources depending
on the workload size to ensure deployment stability.
#### CPU and memory usage
Enabling [agent stats collection](../../cli.md#--prometheus-collect-agent-stats)
(optional) may increase memory consumption.
Enabling direct connections between users and workspace agents (apps or SSH
traffic) can help prevent an increase in CPU usage. It is recommended to keep
[this option enabled](../../cli.md#--disable-direct-connections) unless there
are compelling reasons to disable it.
Inactive users do not consume Coder resources.
#### Scaling formula
When determining scaling requirements, consider the following factors:
- `1 vCPU x 2 GB memory x 250 users`: A reasonable formula to determine resource
allocation based on the number of users and their expected usage patterns.
- API latency/response time: Monitor API latency and response times to ensure
optimal performance under varying loads.
- Average number of HTTP requests: Track the average number of HTTP requests to
gauge system usage and identify potential bottlenecks.
- The number of proxied connections: For a very high number of proxied
connections, more memory is required.
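As a quick planning aid, the formula can be expressed as a small calculation. The
following is a minimal sketch (in Python) assuming the `1 vCPU x 2 GB memory x
250 users` ratio and a two-replica deployment; the rounding and per-replica
minimums are assumptions, not prescriptive values:

```python
import math

def coderd_resources(users: int, replicas: int = 2) -> dict:
    """Estimate coderd resources from the 1 vCPU x 2 GB memory x 250 users rule."""
    units = math.ceil(users / 250)          # each unit serves up to 250 users
    total_vcpu = max(units, 1)              # 1 vCPU per 250 users
    total_memory_gb = 2 * total_vcpu        # 2 GB of memory per vCPU unit
    return {
        "total_vcpu": total_vcpu,
        "total_memory_gb": total_memory_gb,
        # Split the totals across replicas, keeping the 1 vCPU / 2 GB per-replica minimum.
        "per_replica_vcpu": max(math.ceil(total_vcpu / replicas), 1),
        "per_replica_memory_gb": max(math.ceil(total_memory_gb / replicas), 2),
    }

print(coderd_resources(users=2000))
# {'total_vcpu': 8, 'total_memory_gb': 16, 'per_replica_vcpu': 4, 'per_replica_memory_gb': 8}
```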
**HTTP API latency**
For a reliable Coder deployment dealing with medium to high loads, it's
important that API calls for workspace/template queries and workspace build
operations respond within 300 ms. However, API template insights calls, which
involve browsing workspace agent stats and user activity data, may require more
time. Moreover, the Coder API exposes long-lived WebSocket connections for the
Web Terminal (bidirectional) and workspace events/logs (unidirectional).
If the Coder deployment expects traffic from developers spread across the globe,
be aware that customer-facing latency might be higher because of the distance
between users and the load balancer. Fortunately, the latency can be improved
with a deployment of Coder [workspace proxies](../workspace-proxies.md).
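To spot-check the 300 ms target from a client's perspective, a simple probe
against the API can be enough. Below is a minimal sketch using only the Python
standard library; the deployment URL and token are placeholders, and the
`/api/v2/workspaces` endpoint and `Coder-Session-Token` header should be
verified against your Coder version:

```python
import time
import urllib.request

CODER_URL = "https://coder.example.com"   # placeholder deployment URL
SESSION_TOKEN = "your-session-token"      # placeholder API token

def median_latency_ms(path: str = "/api/v2/workspaces", samples: int = 10) -> float:
    """Return the median wall-clock latency (in ms) of an authenticated GET request."""
    timings = []
    for _ in range(samples):
        req = urllib.request.Request(
            CODER_URL + path,
            headers={"Coder-Session-Token": SESSION_TOKEN},
        )
        start = time.monotonic()
        with urllib.request.urlopen(req) as resp:
            resp.read()
        timings.append((time.monotonic() - start) * 1000)
    timings.sort()
    return timings[len(timings) // 2]

if __name__ == "__main__":
    print(f"median latency: {median_latency_ms():.0f} ms (target: < 300 ms)")
```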
**Node Autoscaling**
We recommend disabling autoscaling for `coderd` nodes, as autoscaling can cause
interruptions for user connections. See [Autoscaling](../scale.md#autoscaling)
for more details.
### Control plane: provisionerd
Each external provisioner can run a single concurrent workspace build. For
example, running 10 provisioner containers will allow 10 users to start
workspaces at the same time.
By default, the Coder server runs 3 built-in provisioner daemons, but the
_Enterprise_ Coder release allows running external provisioners to offload
workspace provisioning from the `coderd` nodes.
#### Scaling formula
When determining scaling requirements, consider the following factors:
- `1 vCPU x 1 GB memory x 2 concurrent workspace builds`: A formula to determine
resource allocation based on the number of concurrent workspace builds and the
standard complexity of a Terraform template. _Rule of thumb_: the more
provisioners are free/available, the more concurrent workspace builds can be
performed.
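Applied to a target build concurrency, the formula yields the number of
provisioners and their aggregate resources. A minimal sketch, assuming one
concurrent build per provisioner and the `1 vCPU x 1 GB memory x 2 concurrent
workspace builds` ratio; rounding is an assumption:

```python
import math

def provisioner_resources(concurrent_builds: int) -> dict:
    """Estimate external provisioner count and resources for a build concurrency target."""
    return {
        "provisioners": concurrent_builds,                    # one concurrent build per provisioner
        "total_vcpu": math.ceil(concurrent_builds / 2),       # 1 vCPU per 2 concurrent builds
        "total_memory_gb": math.ceil(concurrent_builds / 2),  # 1 GB per 2 concurrent builds
    }

print(provisioner_resources(concurrent_builds=60))
# {'provisioners': 60, 'total_vcpu': 30, 'total_memory_gb': 30}
```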
**Node Autoscaling**
Autoscaling provisioners is not an easy problem to solve unless you can predict
when the number of concurrent workspace builds will increase.
We recommend disabling autoscaling and adjusting the number of provisioners to
developer needs based on workspace build queuing times.
### Data plane: Workspaces
To determine workspace resource limits and keep the best developer experience
for workspace users, administrators must be aware of a few assumptions.
- Workspace pods run on the same Kubernetes cluster, but possibly in a different
namespace or on a separate set of nodes.
- Workspace limits (per workspace user):
  - Evaluate the workspace utilization pattern. For instance, web application
    development does not require high CPU capacity at all times, but will spike
    during builds or testing.
  - Evaluate the minimal limits for a single workspace. Include in the
    calculation the requirements for the Coder agent running in an idle
    workspace: 0.1 vCPU and 256 MB of memory. For instance, developers can
    choose between 0.5-8 vCPUs and 1-16 GB of memory.
#### Scaling formula
When determining scaling requirements, consider the following factors:
- `1 vCPU x 2 GB memory x 1 workspace`: A formula to determine resource
allocation based on the minimal requirements for an idle workspace with a
running Coder agent and occasional CPU and memory bursts for building
projects.
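The same baseline can be used to estimate how many workspace nodes a deployment
needs. A minimal sketch, assuming `2 GB` of memory per idle workspace as the
binding resource (CPU treated as burstable) and the 256 pods-per-node cap from
the reference architectures; any extra headroom is an assumption to add per
deployment:

```python
import math

def workspace_nodes(workspaces: int, node_memory_gb: int = 32,
                    memory_per_workspace_gb: int = 2,
                    max_pods_per_node: int = 256) -> int:
    """Estimate workspace node count, treating memory as the binding resource.

    CPU is assumed to be burstable and shared between workspaces, matching the
    utilization pattern described above; adjust if your workloads are CPU-bound.
    """
    workspaces_per_node = min(node_memory_gb // memory_per_workspace_gb, max_pods_per_node)
    return math.ceil(workspaces / workspaces_per_node)

print(workspace_nodes(1000))  # 63 nodes of 32 GB at 16 workspaces each
```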
**Node Autoscaling**
Workspace nodes can be set to operate in autoscaling mode to mitigate the risk
of prolonged high resource utilization.
One approach is to scale up workspace nodes when total CPU usage or memory
consumption reaches 80%. Another option is to scale based on metrics such as the
number of workspaces or active users. It's important to note that as new users
onboard, the autoscaling configuration should account for ongoing workspaces.
Scaling down workspace nodes to zero is not recommended, as it will result in
longer wait times for workspace provisioning by users. However, this may be
necessary for workspaces with special resource requirements (e.g. GPUs) that
incur significant cost overheads.
### Data plane: External database
While running in production, Coder requires access to an external PostgreSQL
database. Depending on the scale of the user base, workspace activity, and High
Availability requirements, the CPU and memory resources required by Coder's
database may vary.
#### Scaling formula
When determining scaling requirements, take into account the following
considerations:
- `2 vCPU x 8 GB RAM x 512 GB storage`: A baseline for database requirements for
a Coder deployment with fewer than 1,000 users and a low activity level (30%
active users). This capacity should be sufficient to support 100 external
provisioners.
- Storage size depends on user activity, workspace builds, log verbosity,
overhead on database encryption, etc.
- Allocate two additional CPU cores to the database instance for every 1,000
active users.
- Enable _High Availability_ mode for the database engine for large-scale
deployments.
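These considerations can be combined into a rough sizing estimate. A minimal
sketch, assuming the `2 vCPU x 8 GB RAM x 512 GB storage` baseline, two extra
cores per 1,000 active users, and a memory-to-core ratio matching the baseline;
storage growth is left as a deployment-specific input:

```python
def database_resources(active_users: int) -> dict:
    """Estimate database sizing from the 2 vCPU / 8 GB / 512 GB baseline."""
    extra_cores = 2 * (active_users // 1000)  # two additional cores per 1,000 active users
    return {
        "vcpu": 2 + extra_cores,
        "memory_gb": 8 + 4 * extra_cores,     # assumed 4 GB per core, matching the baseline ratio
        "storage_gb": 512,                    # grows with build activity, log verbosity, encryption overhead
    }

print(database_resources(active_users=2000))
# {'vcpu': 6, 'memory_gb': 24, 'storage_gb': 512}
```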
If you enable [database encryption](../encryption.md) in Coder, consider
allocating an additional CPU core to every `coderd` replica.
#### Performance optimization guidelines
We provide the following general recommendations for PostgreSQL settings:
- Increase the number of vCPUs if CPU utilization or database latency is high.
- Allocate extra memory if database performance is poor, CPU utilization is low,
and memory utilization is high.
- Utilize faster disk options (higher IOPS) such as SSDs or NVMe drives to
improve performance and possibly reduce database load.
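One practical signal behind the memory and IOPS recommendations is the buffer
cache hit ratio. A minimal sketch, assuming the `psycopg2` driver and access to
PostgreSQL's standard `pg_stat_database` view; the connection string is a
placeholder:

```python
import psycopg2

# Placeholder DSN - point it at Coder's PostgreSQL database.
conn = psycopg2.connect("postgresql://coder:password@db.example.com:5432/coder")

with conn, conn.cursor() as cur:
    cur.execute(
        """
        SELECT blks_hit, blks_read
        FROM pg_stat_database
        WHERE datname = current_database();
        """
    )
    blks_hit, blks_read = cur.fetchone()
    hit_ratio = blks_hit / max(blks_hit + blks_read, 1)
    # A ratio well below ~0.99 suggests frequent disk reads: consider more
    # memory (e.g. shared_buffers) or faster disks with higher IOPS.
    print(f"buffer cache hit ratio: {hit_ratio:.4f}")
```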


@ -106,12 +106,13 @@ For example, to support 120 concurrent workspace builds:
> Note: the below information is for reference purposes only and is not
> intended to be used as guidelines for infrastructure sizing.
| Environment | Coder CPU | Coder RAM | Coder Replicas | Database | Users | Concurrent builds | Concurrent connections (Terminal/SSH) | Coder Version | Last tested |
| ---------------- | --------- | --------- | -------------- | ----------------- | ----- | ----------------- | ------------------------------------- | ------------- | ------------ |
| Kubernetes (GKE) | 3 cores | 12 GB | 1 | db-f1-micro | 200 | 3 | 200 simulated | `v0.24.1` | Jun 26, 2023 |
| Kubernetes (GKE) | 4 cores | 8 GB | 1 | db-custom-1-3840 | 1500 | 20 | 1,500 simulated | `v0.24.1` | Jun 27, 2023 |
| Kubernetes (GKE) | 2 cores | 4 GB | 1 | db-custom-1-3840 | 500 | 20 | 500 simulated | `v0.27.2` | Jul 27, 2023 |
| Kubernetes (GKE) | 2 cores | 8 GB | 2 | db-custom-2-7680 | 1000 | 20 | 1000 simulated | `v2.2.1` | Oct 9, 2023 |
| Kubernetes (GKE) | 4 cores | 16 GB | 2 | db-custom-8-30720 | 2000 | 50 | 2000 simulated | `v2.8.4` | Feb 28, 2024 |
> Note: a simulated connection reads and writes random data at 40KB/s per
> connection.


@ -375,10 +375,30 @@
},
{
"title": "Scaling Coder",
"description": "Reference architecture and load testing tools",
"description": "Learn how to use load testing tools",
"path": "./admin/scale.md",
"icon_path": "./images/icons/scale.svg"
},
{
"title": "Reference Architectures",
"description": "Learn about reference architectures for Coder",
"path": "./admin/architectures/index.md",
"icon_path": "./images/icons/scale.svg",
"children": [
{
"title": "Up to 1,000 users",
"path": "./admin/architectures/1k-users.md"
},
{
"title": "Up to 2,000 users",
"path": "./admin/architectures/2k-users.md"
},
{
"title": "Up to 3,000 users",
"path": "./admin/architectures/3k-users.md"
}
]
},
{
"title": "External Provisioners",
"description": "Run provisioners isolated from the Coder server",