Kubernetes density testing

Change-Id: I350fa5797c15880290a9ff31b322e67e55c8b9b5
2016-12-20 16:11:51 +04:00 · 2016-12-20 16:11:51 +04:00 · 72de796900
parent 09b76b1148
commit 72de796900
23 changed files with 29405 additions and 1 deletions
--- a/doc/source/test_plans/kubernetes_density/plan.rst
+++ b/doc/source/test_plans/kubernetes_density/plan.rst
@@ -0,0 +1,77 @@
+.. _Kubernetes_density_test_plan:
+
+**************************
+Kubernetes density testing
+**************************
+
+:status: **ready**
+:version: 1.0
+
+:Abstract:
+
+  This test plan covers density testing scenarios for Kubernetes.
+
+Test Plan
+=========
+
+Test Environment
+----------------
+
+Preparation
+^^^^^^^^^^^
+
+The test plan is executed against Kubernetes deployed on bare-metal nodes.
+
+Environment description
+^^^^^^^^^^^^^^^^^^^^^^^
+
+The environment description includes the hardware specification of servers,
+network parameters, operating system and OpenStack deployment characteristics.
+
+
+Test Case #1: Maximum pods per node
+-----------------------------------
+
+Description
+^^^^^^^^^^^
+By default, Kubernetes limits the number of pods that can run on a node. The
+value is chosen by the community and, since version 1.4, is 110 (k8s_max_pods_).
+
+The goal of this test is to investigate system behavior at the default limit
+and to find out whether the limit can be increased. In particular, we are
+interested in the following metrics: pod startup time during a mass start (e.g.
+when a replication controller is scaled up) and the node's average load.
+
+Manual experiments show that a pod starts functioning before the
+Kubernetes API reports it to be in the running state. In this test case we
+also investigate how long it takes for Kubernetes to update the pod
+status.
+
+List of performance metrics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. table:: List of test metrics to be collected during this test
+
+  +-------------------------+---------------------------------------------+
+  | Parameter               | Description                                 |
+  +=========================+=============================================+
+  | POD_COUNT               | Number of pods                              |
+  +-------------------------+---------------------------------------------+
+  | POD_FIRST_REPORT        | Time taken by pod to start and report       |
+  +-------------------------+---------------------------------------------+
+  | KUBECTL_RUN             | Time for all pods to be reported as running |
+  +-------------------------+---------------------------------------------+
+  | KUBECTL_TERMINATE       | Time to terminate all pods                  |
+  +-------------------------+---------------------------------------------+
+
+
+Reports
+=======
+
+Test plan execution reports:
+ * :ref:`Kubernetes_density_test_report`
+
+
+.. references:
+
+.. _k8s_max_pods: https://github.com/kubernetes/kubernetes/blob/v1.5.0/pkg/apis/componentconfig/v1alpha1/defaults.go#L290
--- a/doc/source/test_results/container_cluster_systems/kubernetes/density/100-start.svg
+++ b/doc/source/test_results/container_cluster_systems/kubernetes/density/100-start.svg
--- a/doc/source/test_results/container_cluster_systems/kubernetes/density/100-term.svg
+++ b/doc/source/test_results/container_cluster_systems/kubernetes/density/100-term.svg
--- a/doc/source/test_results/container_cluster_systems/kubernetes/density/200-start.svg
+++ b/doc/source/test_results/container_cluster_systems/kubernetes/density/200-start.svg
--- a/doc/source/test_results/container_cluster_systems/kubernetes/density/200-term.svg
+++ b/doc/source/test_results/container_cluster_systems/kubernetes/density/200-term.svg
--- a/doc/source/test_results/container_cluster_systems/kubernetes/density/400-start.svg
+++ b/doc/source/test_results/container_cluster_systems/kubernetes/density/400-start.svg
--- a/doc/source/test_results/container_cluster_systems/kubernetes/density/400-term.svg
+++ b/doc/source/test_results/container_cluster_systems/kubernetes/density/400-term.svg
--- a/doc/source/test_results/container_cluster_systems/kubernetes/density/50-start.svg
+++ b/doc/source/test_results/container_cluster_systems/kubernetes/density/50-start.svg
--- a/doc/source/test_results/container_cluster_systems/kubernetes/density/50-term.svg
+++ b/doc/source/test_results/container_cluster_systems/kubernetes/density/50-term.svg
--- a/doc/source/test_results/container_cluster_systems/kubernetes/density/N-cpu-system.png
+++ b/doc/source/test_results/container_cluster_systems/kubernetes/density/N-cpu-system.png
--- a/doc/source/test_results/container_cluster_systems/kubernetes/density/N-cpu-user.png
+++ b/doc/source/test_results/container_cluster_systems/kubernetes/density/N-cpu-user.png
--- a/doc/source/test_results/container_cluster_systems/kubernetes/density/N-disk-io.png
+++ b/doc/source/test_results/container_cluster_systems/kubernetes/density/N-disk-io.png
--- a/doc/source/test_results/container_cluster_systems/kubernetes/density/N-mem-used.png
+++ b/doc/source/test_results/container_cluster_systems/kubernetes/density/N-mem-used.png
--- a/doc/source/test_results/container_cluster_systems/kubernetes/density/N-start-term.svg
+++ b/doc/source/test_results/container_cluster_systems/kubernetes/density/N-start-term.svg
--- a/doc/source/test_results/container_cluster_systems/kubernetes/density/data/data.tar.bz2
+++ b/doc/source/test_results/container_cluster_systems/kubernetes/density/data/data.tar.bz2
--- a/doc/source/test_results/container_cluster_systems/kubernetes/density/index.rst
+++ b/doc/source/test_results/container_cluster_systems/kubernetes/density/index.rst
@@ -0,0 +1,183 @@
+.. _Kubernetes_density_test_report:
+
+******************************
+Kubernetes density test report
+******************************
+
+:Abstract:
+
+  This document is the report for :ref:`Kubernetes_density_test_plan`
+
+
+Environment description
+=======================
+
+The data in this report was collected on the hardware described in
+:ref:`intel_mirantis_performance_lab_1`.
+
+
+Software
+~~~~~~~~
+
+Kubernetes is installed with the `Kargo`_ deployment tool on Ubuntu 16.04.1.
+
+Node roles:
+ - node1: minion+master+etcd
+ - node2: minion+master+etcd
+ - node3: minion+etcd
+ - node4: minion
+
+Software versions:
+ - OS: Ubuntu 16.04.1 LTS (Xenial Xerus)
+ - Kernel: 4.4.0-47
+ - Docker: 1.12.1
+ - Kubernetes: 1.4.3
+
+Reports
+=======
+
+Test Case #1: Maximum pods per node
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Pod startup time is measured with the help of the
+`MMM(MySQL/Master/Minions) testing suite`_. To schedule all pods on a single
+node, the original replication controller for minions is updated with a
+scheduler hint. To do so, add the following lines to the template's spec:
+
+.. code-block:: yaml
+
+      nodeSelector:
+        kubernetes.io/hostname: node4
+
+Pod status as seen by Kubernetes is retrieved with the kubectl tool.
+The process is automated with
+:download:`kubectl_mon.py <kubectl-mon/kubectl_mon.py>`, which produces
+output in CSV format. Charts are created by the
+:download:`pod_stats.py <kubectl-mon/pod_stats.py>` script.
+
+Every measurement starts with an empty namespace. Then a Kubernetes
+replication controller is created with the specified number of pods. We collect
+each pod's report time and kubectl stats. The summary data is presented below.
+
+.. image:: summary.png
+
+.. list-table:: POD stats
+    :header-rows: 1
+
+    *
+      - POD_COUNT
+      - POD_FIRST_REPORT, s
+      - KUBECTL_RUN, s
+      - KUBECTL_TERMINATE, s
+    *
+      - 50
+      - 12
+      - 44
+      - 30
+    *
+      - 100
+      - 27
+      - 131
+      - 87
+    *
+      - 200
+      - 61
+      - 450
+      - 153
+    *
+      - 400
+      - 208
+      - ∞ (not finished)
+      - 390
+
+
+
+Detailed Stats
+--------------
+
+50 pods
+^^^^^^^
+
+Start replication controller with 50 pods
+
+.. image:: 50-start.svg
+    :width: 100%
+
+Terminate replication controller with 50 pods
+
+.. image:: 50-term.svg
+    :width: 100%
+
+100 pods
+^^^^^^^^
+
+Start replication controller with 100 pods
+
+.. image:: 100-start.svg
+    :width: 100%
+
+Terminate replication controller with 100 pods
+
+.. image:: 100-term.svg
+    :width: 100%
+
+200 pods
+^^^^^^^^
+
+Start replication controller with 200 pods
+
+.. image:: 200-start.svg
+    :width: 100%
+
+Terminate replication controller with 200 pods
+
+.. image:: 200-term.svg
+    :width: 100%
+
+400 pods
+^^^^^^^^
+
+Start replication controller with 400 pods.
+
+Note: In this experiment all pods reported successfully; however, from the
+Kubernetes API point of view fewer than 60 pods were in the running state. The
+number of pods reported as running increased slowly over time, but too slowly
+to treat the process as successful.
+
+.. image:: 400-start.svg
+    :width: 100%
+
+Terminate replication controller with 400 pods.
+
+.. image:: 400-term.svg
+    :width: 100%
+
+Scale by 100 pods steps
+^^^^^^^^^^^^^^^^^^^^^^^
+
+In this experiment we scale the replication controller up in steps of 100
+pods. Each step is started after all pods are reported as running. On step 3
+(201-300 pods) the process became significantly slower, so we started scaling
+the replication controller down. The full cycle is visualized below.
+
+.. image:: N-start-term.svg
+    :width: 100%
+
+System metrics from the API nodes and the minion are shown below:
+
+.. image:: N-cpu-user.png
+
+.. image:: N-cpu-system.png
+
+.. image:: N-mem-used.png
+
+.. image:: N-disk-io.png
+
+Full `Kubernetes stats`_ are available online.
+
+
+.. references:
+
+.. _Kargo: https://github.com/kubespray/kargo
+.. _MMM(MySQL/Master/Minions) testing suite: https://github.com/AleksandrNull/MMM
+.. _Kubernetes stats: https://snapshot.raintank.io/dashboard/snapshot/YCtAh7jHhYpmWk8nsfda0EAIRRnG4TV9
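The summary table in the report above is easier to compare across runs when expressed as time per pod. The following is a minimal sketch in Python 3 that uses only the figures from that table; the names `SUMMARY` and `seconds_per_pod` are illustrative and not part of the test suite. The 400-pod start is excluded from KUBECTL_RUN because it did not finish.

```python
# Figures copied from the "POD stats" table in the report.
# POD_COUNT -> (POD_FIRST_REPORT, KUBECTL_RUN, KUBECTL_TERMINATE), in seconds.
# None marks the 400-pod start, which never finished.
SUMMARY = {
    50: (12, 44, 30),
    100: (27, 131, 87),
    200: (61, 450, 153),
    400: (208, None, 390),
}


def seconds_per_pod(column):
    """Return {pod_count: seconds per pod} for one column index (0..2)."""
    result = {}
    for pods, row in sorted(SUMMARY.items()):
        if row[column] is not None:
            result[pods] = round(row[column] / float(pods), 2)
    return result


if __name__ == '__main__':
    for name, column in (('POD_FIRST_REPORT', 0),
                         ('KUBECTL_RUN', 1),
                         ('KUBECTL_TERMINATE', 2)):
        print(name, seconds_per_pod(column))
```

For KUBECTL_RUN the per-pod time grows from 0.88 s at 50 pods to 2.25 s at 200 pods, so the time for Kubernetes to report all pods as running grows faster than linearly with the pod count.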
--- a/doc/source/test_results/container_cluster_systems/kubernetes/density/kubectl-mon/kubectl_mon.py
+++ b/doc/source/test_results/container_cluster_systems/kubernetes/density/kubectl-mon/kubectl_mon.py
@@ -0,0 +1,27 @@
+#!/usr/bin/env python
+
+import re
+import subprocess
+import time
+
+KUBECTL_CMD = 'kubectl --namespace minions get pods -l k8s-app=minion'
+
+
+def main():
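+    print('time,name,status')  # CSV header, printed once
+    while True:
+        start = time.time()
+        # universal_newlines=True makes the output text on Python 2 and 3
+        stdout = subprocess.Popen(KUBECTL_CMD, shell=True,
+                                  stdout=subprocess.PIPE,
+                                  universal_newlines=True).communicate()[0]
+        for line in stdout.split('\n')[1:]:  # skip the kubectl header row
+            tokens = re.split(r'\s+', line.strip())
+            if len(tokens) >= 3:  # NAME READY STATUS ...
+                print('%f,%s,%s' % (start, tokens[0], tokens[2]))
+
+        # poll about once per second; never pass a negative value to sleep()
+        time.sleep(max(0.0, 1 - (time.time() - start)))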
+
+if __name__ == '__main__':
+    main()
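The parsing step in kubectl_mon.py relies on the default column layout of `kubectl get pods` (NAME, READY, STATUS, RESTARTS, AGE), taking `tokens[0]` as the pod name and `tokens[2]` as the status. Below is a minimal sketch of that step as a pure function, so it can be exercised without a cluster. The function name `parse_pods` and the pod names in the sample are invented for illustration.

```python
import re


def parse_pods(kubectl_output):
    """Extract (name, status) pairs from `kubectl get pods` text output."""
    pods = []
    # The first line is the column header: NAME READY STATUS RESTARTS AGE
    for line in kubectl_output.split('\n')[1:]:
        tokens = re.split(r'\s+', line.strip())
        if len(tokens) >= 3:
            pods.append((tokens[0], tokens[2]))
    return pods


# Hypothetical sample; the pod names are invented for this example.
SAMPLE = """NAME           READY     STATUS              RESTARTS   AGE
minion-0a1b2   1/1       Running             0          1m
minion-3c4d5   0/1       ContainerCreating   0          5s
minion-6e7f8   0/1       Pending             0          2s
"""

if __name__ == '__main__':
    for name, status in parse_pods(SAMPLE):
        print('%s,%s' % (name, status))
```

Keeping the parsing separate from the polling loop makes it possible to check the column handling against captured output before running a long measurement.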
--- a/doc/source/test_results/container_cluster_systems/kubernetes/density/kubectl-mon/pod_stats.py
+++ b/doc/source/test_results/container_cluster_systems/kubernetes/density/kubectl-mon/pod_stats.py
@@ -0,0 +1,85 @@
+#!/usr/bin/env python
+
+import argparse
+import operator
+import itertools
+import numpy as np
+import matplotlib.patches as mpatches
+import matplotlib.pyplot as plt
+import matplotlib.ticker as mticker
+
+COLORS = {
+    'Pending': '#ffb624',
+    'ContainerCreating': '#ebeb00',
+    'Running': '#50c878',
+    'Terminating': '#a366ff',
+    'Error': '#cc0000',
+}
+
+def main():
+    parser = argparse.ArgumentParser(prog='pod-stats')
+    parser.add_argument('data', nargs='+')
+    args = parser.parse_args()
+
+    source = args.data[0]
+    data = np.genfromtxt(source, dtype=None, delimiter=',',
+                         skip_header=1, skip_footer=0,
+                         names=['time', 'name', 'status'])
+    categories = list(set(x[2] for x in data))
+    categories.sort()
+
+    processed = []
+    t = []
+    base_time = data[0][0]
+
+    for k, g in itertools.groupby(data, key=operator.itemgetter(0)):
+        r = dict((c, 0) for c in categories)
+        for x in g:
+            r[x[2]] += 1
+
+        v = [r[c] for c in categories]
+        processed.append(v)
+        t.append(k - base_time)
+
+    figure = plt.figure()
+    plot = figure.add_subplot(111)
+
+    colors = [COLORS.get(c, '#808080') for c in categories]  # grey if unknown
+
+    plot.stackplot(t, np.transpose(processed), colors=colors)
+
+    if len(args.data) > 1:
+        y = []
+        x = []
+        with open(args.data[1]) as fd:
+            cnt = fd.read()
+        for i, p in enumerate(cnt.strip().split('\n')):
+            x.append(int(p) / 1000.0 - base_time)
+            y.append(i)
+        plot.plot(x, y, 'b.')
+
+    plot.grid(True)
+    plot.set_xlabel('time, s')
+    plot.set_ylabel('pods')
+
+    ax = figure.gca()
+    ax.yaxis.set_major_locator(mticker.MaxNLocator(integer=True))
+
+    # add legend
+    patches = [mpatches.Patch(color=c) for c in colors]
+    texts = list(categories)  # copy, so the append below leaves categories intact
+
+    if len(args.data) > 1:
+        patches.append(mpatches.Patch(color='blue'))
+        texts.append('Pod report')
+
+    legend = plot.legend(patches, texts, loc='right', shadow=True)
+
+    for label in legend.get_texts():
+        label.set_fontsize('small')
+
+    figure.savefig('figure.svg')  # save first; show() blocks until closed
+    plt.show()
+
+if __name__ == '__main__':
+    main()
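The core of pod_stats.py is its grouping step: samples that share a timestamp are counted per status, and each group becomes one column of the stacked chart. Below is a minimal sketch of that step using only the standard library, without numpy or matplotlib. The function name `count_by_status` and the sample rows are assumptions made for illustration.

```python
import itertools
import operator


def count_by_status(rows):
    """Group (time, name, status) rows by time and count pods per status.

    Rows must already be ordered by time, as kubectl_mon.py emits them,
    because itertools.groupby only merges adjacent items.
    Returns (categories, times, counts); times are relative to the first row.
    """
    categories = sorted(set(row[2] for row in rows))
    base_time = rows[0][0]
    times, counts = [], []
    for t, group in itertools.groupby(rows, key=operator.itemgetter(0)):
        tally = dict((c, 0) for c in categories)
        for row in group:
            tally[row[2]] += 1
        times.append(t - base_time)
        counts.append([tally[c] for c in categories])
    return categories, times, counts


# Invented sample: two pods observed at two consecutive polls.
ROWS = [
    (100.0, 'minion-a', 'Pending'),
    (100.0, 'minion-b', 'Pending'),
    (101.0, 'minion-a', 'Running'),
    (101.0, 'minion-b', 'ContainerCreating'),
]

if __name__ == '__main__':
    print(count_by_status(ROWS))
```

Every status gets a zero entry in each group, so all rows of `counts` have the same length and can be transposed directly for a stacked plot.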
--- a/doc/source/test_results/container_cluster_systems/kubernetes/density/minion/Dockerfile
+++ b/doc/source/test_results/container_cluster_systems/kubernetes/density/minion/Dockerfile
@@ -0,0 +1,10 @@
+FROM debian:latest
+ENV DEBIAN_FRONTEND noninteractive
+
+RUN apt-get update && apt-get -y upgrade
+RUN apt-get -y --no-install-recommends install python
+
+ADD minion.py /opt/minion/minion
+RUN chmod 0777 /opt/minion/minion
+
+ENTRYPOINT ["/opt/minion/minion"]
--- a/doc/source/test_results/container_cluster_systems/kubernetes/density/minion/minion.py
+++ b/doc/source/test_results/container_cluster_systems/kubernetes/density/minion/minion.py
@@ -0,0 +1,35 @@
+#!/usr/bin/python2
+
+import httplib
+import signal
+import sys
+import time
+import uuid
+
+
+class GracefulKiller:
+    kill_now = False
+
+    def __init__(self):
+        signal.signal(signal.SIGINT, self.exit_gracefully)
+        signal.signal(signal.SIGTERM, self.exit_gracefully)
+
+    def exit_gracefully(self, signum, frame):
+        print('Signal caught')
+        self.kill_now = True
+        sys.exit(0)
+
+
+if __name__ == '__main__':
+    killer = GracefulKiller()
+
+    t = int(time.time() * (10 ** 3))  # pod start time, ms since the epoch
+    u = str(uuid.uuid4())
+    e = ''
+    c = httplib.HTTPConnection('172.20.9.7:8000')
+    q = '%s' % t
+    print(q)
+    c.request('GET', q)
+    r = c.getresponse()
+    print(r.status)
+    time.sleep(2 << 20)  # stay alive (about 24 days) until terminated
--- a/doc/source/test_results/container_cluster_systems/kubernetes/density/minion/rc.yaml
+++ b/doc/source/test_results/container_cluster_systems/kubernetes/density/minion/rc.yaml
@@ -0,0 +1,25 @@
+apiVersion: v1
+kind: ReplicationController
+metadata:
+  name: minion
+  namespace: minions
+
+spec:
+  replicas: 1
+  selector:
+    k8s-app: minion
+    version: v0
+  template:
+    metadata:
+      labels:
+        k8s-app: minion
+        version: v0
+    spec:
+      containers:
+      - name: minion
+        image: 127.0.0.1:31500/qa/minion:latest
+        env:
+        - name: MINION_RC
+          value: "1"
+      nodeSelector:
+        kubernetes.io/hostname: node4
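The stepwise experiment in the report scales the replication controller defined above by 100 pods at a time. Below is a minimal sketch of how such scaling commands could be generated. It assumes the standard `kubectl scale` subcommand with its `--replicas` flag and the global `--namespace` flag; the helper name `scale_commands` is hypothetical, and the report does not state which commands were actually used.

```python
def scale_commands(max_pods, step=100, name='minion', namespace='minions'):
    """Build kubectl command lines that scale an RC up in fixed steps."""
    commands = []
    for replicas in range(step, max_pods + 1, step):
        commands.append(['kubectl', '--namespace', namespace, 'scale', 'rc',
                         name, '--replicas=%d' % replicas])
    return commands


if __name__ == '__main__':
    # Print only. To run a step, pass its list to subprocess.check_call() and
    # wait until all pods are reported as running before the next step.
    for cmd in scale_commands(300):
        print(' '.join(cmd))
```

Building the commands as argument lists rather than shell strings avoids quoting issues if the controller name or namespace ever changes.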
--- a/doc/source/test_results/container_cluster_systems/kubernetes/density/summary.png
+++ b/doc/source/test_results/container_cluster_systems/kubernetes/density/summary.png
--- a/doc/source/test_results/container_cluster_systems/kubernetes/index.rst
+++ b/doc/source/test_results/container_cluster_systems/kubernetes/index.rst
@@ -5,4 +5,5 @@ Kubernetes system performance
    :numbered:
    :maxdepth: 4

-    API_testing/index
+    API_testing/index
+    density/index