6.9. Large environments#

This section describes how to configure Nubus for Kubernetes for use in large environments with a high number of concurrent users. It assumes that you are familiar with the standard Kubernetes metrics. A good option to visualize these metrics is the kube-prometheus-stack.

See also

Kubernetes metrics

for information about Kubernetes metrics.

kube-prometheus-stack at GitHub

for information about the kube-prometheus-stack, a collection of Kubernetes manifests, Grafana dashboards, and Prometheus rules.

6.9.1. Login performance#

As an identity and access management platform, Nubus places a strong focus on login performance. Nubus for Kubernetes sustains between 100 and 120 user logins per second, which equates to 6,000 to 7,200 user logins per minute. Univention internal load tests, with Nubus configured as described in Configuration for large environments, produced the performance results shown in Table 6.1. The columns in the table have the following meaning:

Concurrent logins:

Number of simultaneous login attempts during the test.

Logins per second:

Number of completed logins per second.

Median login duration:

Median time it took a login to complete.

P95 login duration:

Ninety-fifth percentile of the login duration, meaning 95% of logins finished within this time.

The performance tests measured the login duration from the redirect to the Keycloak identity provider until the Portal Frontend had loaded the personalized Portal in the user’s browser.

Table 6.1 Login Performance Results#

Concurrent logins | Logins per second | Median login duration | P95 login duration
------------------|-------------------|-----------------------|-------------------
400               | 110               | 3.25 seconds          | 3.9 seconds
500               | 117               | 4.0 seconds           | 4.6 seconds
600               | 120               | 4.6 seconds           | 5.4 seconds

A higher load of 1,000 to 2,000 concurrent login attempts doesn’t significantly affect the overall logins per second. However, the individual login duration increases significantly to 30 to 60 seconds.

See also

Scalability

for configuration settings to scale components in Nubus for Kubernetes.

6.9.2. Focus areas#

The following areas require attention to optimize Nubus for Kubernetes for high login volumes.

6.9.2.1. Authentication flow#

Nubus for Kubernetes uses SAML for authentication. The authentication flow involves the following components.

Keycloak

Keycloak provides the Identity Provider in Nubus for Kubernetes. Scale it to the supported maximum of 5 replicas. Expect a load of 0.5 to 1 virtual CPU per pod.

LDAP Secondary

The LDAP Secondary pods in the Identity Store and Directory Service serve as read replicas for the user and group directory database.

Scale the LDAP Secondary pods to 8 replicas. The authentication flow puts a high request load on the LDAP Secondary pods. Consequently, a usage of 5 to 10 virtual CPUs is normal and doesn’t indicate a bottleneck.

For more information on LDAP scalability, see Directory service high availability and scalability.

UMC Server

The UMC Server acts as the SAML service provider and manages the users’ browser sessions. Deploy 128 pod replicas to prevent bottlenecks and latency spikes. To mitigate the deployment, scaling, and update latency caused by the high replica count, set the nubusUmcServer.podManagementPolicy Helm chart value to Parallel so that Kubernetes starts the pods in parallel rather than sequentially, as shown in the following excerpt.
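
The relevant excerpt from Listing 6.28 sets both values on the nubusUmcServer component:

nubusUmcServer:
  # Start the 128 UMC Server pods in parallel instead of sequentially
  podManagementPolicy: "Parallel"
  replicaCount: 128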

See also

For architectural information about the involved components, see Univention Nubus for Kubernetes - Architecture Manual [2], in particular its sections about these components.

6.9.2.2. Pod distribution and scheduling constraints#

It’s recommended to configure Kubernetes pod scheduling constraints to control the pod distribution in the Kubernetes cluster. This is important for both performance and high availability.

topologySpreadConstraints

Running all 8 LDAP Secondary pods on the same cluster node would cause performance problems. Configure topologySpreadConstraints to distribute the pod replicas of a Kubernetes workload equally across the cluster nodes. You can additionally configure Kubernetes (anti)affinity rules to further optimize pod scheduling; see the sketch after this list.

resources.requests

Set resources.requests to match the expected CPU and memory consumption of your pods. The Kubernetes scheduler reserves these resources on the node, which prevents nodes from becoming overcommitted.

resources.limits

The resources.limits settings cap the runtime resource usage of a pod, but don’t influence pod scheduling. Adjust the virtual CPU requests to align with your specific usage patterns and cluster size.

For examples of how to use topologySpreadConstraints, resources.requests, and resources.limits to influence pod scheduling, see Configuration for large environments.
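
Listing 6.28 doesn’t contain (anti)affinity rules. As a minimal sketch only, a preferred podAntiAffinity rule that discourages scheduling several LDAP Secondary pods on the same node could look like the following. The affinity value path and the app.kubernetes.io/name: ldap-server label are assumptions; adjust them to the values and labels your Helm charts actually use.

nubusLdapServer:
  # Assumed value path; check whether your chart exposes an affinity setting
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            topologyKey: "kubernetes.io/hostname"
            labelSelector:
              matchLabels:
                # Assumed label of the LDAP Secondary pods
                app.kubernetes.io/name: ldap-server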

The scheduling constraints mentioned in this section only apply to newly created pods. When the cluster layout changes, for example when you add worker nodes to the cluster, it’s advisable to optimize the cluster by redeploying pods. You can redeploy pods manually, for example with the kubectl rollout restart command. Another option with automation in mind is the Descheduler project, which evicts pods that no longer meet the scheduling constraints.
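
As a rough illustration, a Descheduler policy that evicts pods violating their topology spread constraints could look like the following sketch. It assumes the v1alpha1 policy format; check the Descheduler documentation for the schema matching the version you deploy.

apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
strategies:
  # Evict pods that no longer satisfy their topologySpreadConstraints
  "RemovePodsViolatingTopologySpreadConstraint":
    enabled: true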

See also

Kubernetes affinity rules

in the Kubernetes Documentation [1], for information about Kubernetes affinity.

kubectl rollout restart

for information about manually redeploying pods.

Descheduler project at GitHub

for information about Descheduler for Kubernetes.

6.9.2.3. Portal#

In addition to optimizing the authentication flow, it’s important to ensure that the Portal Service loads quickly. The following components affect the performance of the Portal.

Ingress Controller

To distribute external network traffic equally across all worker nodes in the Kubernetes cluster, you can use a cloud load balancer. Deploy at least one ingress controller on every Kubernetes worker node, for example by deploying the ingress-nginx Helm chart as a DaemonSet; see the sketch after this list. The ingress controller isn’t part of Nubus for Kubernetes.

Portal Frontend

The portal-frontend serves most static HTML, JavaScript, and CSS files for the Portal. Scale it to 6 replicas that are equally spread across worker nodes.

Portal Server

The portal-server pod provides the information that the Portal Frontend needs to personalize the Portal for the signed-in user. This component is single-threaded, but can serve many requests concurrently. Ten replicas are sufficient to avoid bottlenecks.

UMC Gateway

The umc-gateway serves additional static HTML, JavaScript, and CSS files for the Portal. Scale it to 40 replicas to prevent request latency spikes of multiple seconds.

UMC Server

The umc-server pod supports the personalization of the portal with authorization decisions. It doesn’t need scaling beyond the 128 pods mentioned above.
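
For the Ingress Controller mentioned above, a minimal sketch of ingress-nginx Helm values that run the controller as a DaemonSet behind a cloud load balancer could look like this. These values apply to the upstream ingress-nginx chart, not to the Nubus umbrella chart, and your cloud provider may require additional service annotations.

controller:
  # Run one ingress controller pod on every worker node
  kind: DaemonSet
  service:
    # Expose the controllers through the cloud provider's load balancer
    type: LoadBalancer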

See also

LoadBalancer - Kubernetes Service type

for information about the type: LoadBalancer Kubernetes Service.

For architectural information about the involved components, see Univention Nubus for Kubernetes - Architecture Manual [2], in particular its sections about these components.

6.9.3. Configuration for large environments#

To optimize your Nubus for Kubernetes deployment, integrate the values in Listing 6.28 into your Nubus for Kubernetes configuration. You can download the file at large-environments.yaml. Merge the YAML file with your existing custom-values.yaml or deploy the Nubus umbrella Helm chart with multiple values files.

To apply the configuration, follow the steps in Apply configuration.

Listing 6.28 Configuration optimized for login performance#
# SPDX-License-Identifier: AGPL-3.0-only
# SPDX-FileCopyrightText: 2025 Univention GmbH
---
yamlAnchors:
  resources: &baselineResources
    requests:
      cpu: "0.1"
      memory: "500Mi"
    limits:
      cpu: "8"
      memory: "8Gi"

keycloak:
  replicaCount: 5
  resources:
    requests:
      cpu: 1
      memory: "4Gi"
    limits:
      cpu: 4
      memory: "8Gi"

nubusLdapServer:
  resourcesPrimary:
    requests:
      cpu: 3
      memory: "2Gi"
    limits:
      cpu: 16
      memory: "4Gi"
  resourcesSecondary:
    requests:
      cpu: 5
      memory: "2Gi"
    limits:
      cpu: 16
      memory: "4Gi"
  resourcesProxy:
    requests:
      cpu: 3
      memory: "2Gi"
    limits:
      cpu: 16
      memory: "4Gi"
  replicaCountPrimary: 2
  replicaCountSecondary: 8
  replicaCountProxy: 0

nubusUmcServer:
  # Deploy Pods in parallel instead of sequentially
  podManagementPolicy: "Parallel"
  resources:
    requests:
      cpu: "0.15"
      memory: "500M"
    limits:
      cpu: 2
      memory: "2Gi"
  replicaCount: 128
  proxy:
    replicaCount: 4
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: "kubernetes.io/hostname"
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchExpressions:
          - key: app.kubernetes.io/name
            operator: In
            values:
              - umc-server
              - nubus-umc-server-proxy

nubusUmcGateway:
  resources: *baselineResources
  replicaCount: 40
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: "kubernetes.io/hostname"
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          app.kubernetes.io/name: umc-gateway

nubusPortalServer:
  # Disable the /me endpoint
  portalServer:
    featureToggles:
      api_me: false
  resources:
    requests:
      cpu: "0.4"
      memory: "500M"
    limits:
      cpu: 2
      memory: "2Gi"
  replicaCount: 10
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: "kubernetes.io/hostname"
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          app.kubernetes.io/name: portal-server

nubusPortalFrontend:
  resources: *baselineResources
  replicaCount: 6
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: "kubernetes.io/hostname"
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          app.kubernetes.io/name: portal-frontend

nubusNotificationsApi:
  requests:
    cpu: "300m"
    memory: "500Mi"
  limits:
    cpu: "8"
    memory: "8Gi"
  replicaCount: 2

postgresql:
  resources:
    limits:
      cpu: "8"
      memory: "8Gi"
    requests:
      cpu: "1000m"
      memory: "500Mi"
  primary:
    resources: *baselineResources
    extendedConfiguration: |
      max_connections = 1200

minio:
  resources:
    limits:
      cpu: "8"
      memory: "8Gi"
    requests:
      cpu: "1.5"
      memory: "500Mi"
  networkPolicy:
    resources: *baselineResources
  tls:
    resources: *baselineResources
  provisioning:
    resources: *baselineResources
    cleanupAfterFinished:
      resources: *baselineResources

nubusGuardian:
  enabled: false

nubusTwofaHelpdesk:
  enabled: false

nubusLdapNotifier:
  resources: *baselineResources

nubusPortalConsumer:
  resources: *baselineResources

nubusProvisioning:
  resources:
    dispatcher: *baselineResources
    prefill: *baselineResources
    api: *baselineResources
  nats:
    resources: *baselineResources
    reloader:
      resources: *baselineResources
    natsBox:
      resources: *baselineResources

nubusUdmListener:
  resources: *baselineResources

nubusSelfServiceConsumer:
  resources: *baselineResources

nubusKeycloakBootstrap:
  resources: *baselineResources

nubusKeycloakExtensions:
  resources: *baselineResources

nubusStackDataUms:
  resources: *baselineResources

nubusUdmRestApi:
  resources: *baselineResources
  replicaCount: 2