Support Healing Kubernetes Master/Worker-nodes with Mgmtdriver

In this spec, we propose the Heal operation for Kubernetes Master
or Worker-nodes when the VNF is composed of Kubernetes cluster.

Proposed changes for the Heal in ETSI-NFV-SOL002:
* Mgmtdriver supports ``heal_start`` for healing worker nodes to
evacuate Pods and ``heal_end`` for worker and master nodes to join
the existing Kubernetes cluster.
* A sample script is provided to configure Kubernetes cluster.

Proposed changes for the Heal in ETSI-NFV-SOL003:
* None.

Change-Id: Ia9d49a724f4874a7ac632db97d1e417d91af37ce
Blueprint: mgmt-driver-for-k8s-heal
This commit is contained in:
Yoshito Ito 2020-10-27 08:21:46 +09:00
parent 30d72f6439
commit 8d26f30698
1 changed files with 824 additions and 0 deletions

View File

@ -0,0 +1,824 @@
==============================================================
Support Healing Kubernetes Master/Worker-nodes with Mgmtdriver
==============================================================
https://blueprints.launchpad.net/tacker/+spec/mgmt-driver-for-k8s-heal
This specification describes enhancement of Heal operation for the VNF which
includes Kubernetes cluster.
Problem description
===================
The ETSI-compliant VNF Heal operation has been implemented in ``Ussuri``
release and updated to support Notification and Grant operations in
``Victoria`` release. For the use case with OS container based VNF with
Kubernetes cluster, it's important to support the cluster management. In this
spec, we propose the Heal operation for Kubernetes Master or Worker-nodes
when the VNF is composed of Kubernetes cluster with the spec
`mgmt-driver-for-k8s-cluster`_.
To support healing Kubernetes cluster nodes, MgmtDriver needs to be extended.
After deploying a Kubernetes cluster using MgmtDriver as the VNF Lifecycle
Management interface with ETSI NFV-SOL 003 [#SOL003]_, the heal related
operations are required.
When healing a Master or Worker-node, you need not only the default Heal
operation, but also the ``heal_end`` operation to register the node to an
existing Kubernetes cluster.
If you want to heal a Master-node, you need to configure etcd and HA proxy.
Also, when healing a Worker-node, you need to apply the required configuration
and to make it to leave and join the cluster with the same way in scale-in
and scale-out.
Proposed change
===============
The two patterns of Heal operations are supported:
#. Heal a single node (Master / Worker) in Kubernetes cluster
based on ETSI NFV-SOL002 [#SOL002]_.
+ Heal Master-node of Kubernetes cluster
Delete a failed Master-node, create a new VM, and incorporate a new
Master-node into the existing Kubernetes cluster. The etcd cluster and
the load balancer configured as a Master-node are also repaired by
MgmtDriver.
+ Heal Worker-node of Kubernetes cluster
Delete the failed Worker-node, create a new VM, and incorporate the new
Worker-node into the existing Kubernetes cluster.
#. Heal the entire Kubernetes cluster based on ETSI NFV-SOL003 [#SOL003]_.
.. note:: The Healing of the Pods in the Kubernetes cluster is
out of scope of this specification.
.. note:: Failure detection is also out of scope.
It is assumed that the user performs a Heal operation to the known
node being failed.
.. note:: Kubernetes v1.16.0 and Kubernetes python client v11.0 are supported
for Kubernetes VIM.
VNFD for Healing operation
--------------------------
VNFD needs to have ``heal_start`` and ``heal_end`` definitions in interfaces
as the following sample:
.. code-block:: yaml
node_templates:
VNF:
...
interfaces:
Vnflcm:
...
heal_start:
implementation: mgmt-drivers-kubernetes
heal_end:
implementation: mgmt-drivers-kubernetes
artifacts:
mgmt-drivers-kubernetes:
description: Management driver for kubernetes cluster
type: tosca.artifacts.Implementation.Python
file: /.../mgmt_drivers/kubernetes_mgmt.py
MasterVDU:
type: tosca.nodes.nfv.Vdu.Compute
...
WorkerVDU:
type: tosca.nodes.nfv.Vdu.Compute
...
Request data for Healing operation
----------------------------------
User gives following heal parameter to "POST /vnf_instances/{id}/heal" as
``HealVnfRequest`` data type. It is not the same in SOL002 and SOL003.
In ETSI NFV-SOL002 v2.6.1 [#SOL002]_:
+------------------+---------------------------------------------------------+
| Attribute name | Parameter description |
+==================+=========================================================+
| vnfcInstanceId | User specify heal target, user can know "vnfcInstanceId"|
| | by ``InstantiatedVnfInfo.vnfcResourceInfo`` that |
| | contained in the response of "GET /vnf_instances/{id}". |
+------------------+---------------------------------------------------------+
| cause | Not needed |
+------------------+---------------------------------------------------------+
| additionalParams | Not needed |
+------------------+---------------------------------------------------------+
| healScript | Not needed |
+------------------+---------------------------------------------------------+
In ETSI NFV-SOL003 v2.6.1 [#SOL003]_:
+------------------+---------------------------------------------------------+
| Attribute name | Parameter description |
+==================+=========================================================+
| cause | Not needed |
+------------------+---------------------------------------------------------+
| additionalParams | Not needed |
+------------------+---------------------------------------------------------+
Tacker in ``Ussuri`` release, ``vnfcInstanceId``, ``cause``, and
``additionalParams`` are supported for both of SOL002 and SOL003.
If the vnfcInstanceId parameter is null, this means that healing operation is
required for the entire Kubernetes cluster, which is the case in SOL003.
Following is a sample of healing request body for SOL002:
.. code-block::
{
"vnfcInstanceId": "311485f3-45df-41fe-85d9-306154ff4c8d"
}
Healing a node in Kubernetes cluster with SOL002
------------------------------------------------
Healing a Master-node
^^^^^^^^^^^^^^^^^^^^^
The following changes are needed:
+ Extend MgmtDriver to support ``heal_start`` and ``heal_end``
+ In ``heal_start``, the target node is removed from the Kubernetes cluster
by deleting it from the etcd cluster and disabling the load balance in HA
proxy. These are executed in a sample script invoked in MgmtDriver.
+ In ``heal_end``, MgmtDriver invokes a sample script to install and
configure the new Master-node, to add it to the etcd cluster, and to
register in HA proxy for load balancing.
.. note:: It is assumed that the required information for Heal operation in
MgmtDriver is stored in the additional parameter of the Instantiate
and/or Heal request. MgmtDriver passes them to the sample script.
The diagram below shows the Heal VNF operation for a Master-node:
::
+---------------+
| Heal |
| Request with |
| Additional |
| Params |
+---+-----------+
|
+----------------+-----------+
| v VNFM |
| +-------------------+ |
| | Tacker-server | |
| +---------+---------+ |
| | |
4. heal_end | v |
Kubernetes cluster | +----------------------+ |
(Master) | | +-------------+ | |
+----------------------+--+----| MgmtDriver | | |
| | | +-----------+-+ | |
| | | | | |
| | | 1. heal_start | | |
| | | Kubernetes | | |
+--------------------+----------+ | | cluster | | |
| | | | | (Master)| | |
| | | | | | | |
| | | | | | | |
| +----------+ +---+------+ | | | | | |
| | | | v | | 3. Create | | +-----------+ | | |
| | +------+ | | +------+ | | new VM | | |OpenStack | | | |
| | |Worker| | | |Master| |<--------------+--+-|InfraDriver| | | |
| | +------+ | | +------+ | | | | +------+----+ | | |
| | VM | | VM | | | | | | | |
| +----------+ +----------+ | | | | | | |
| +----------+ +----------+ | 2. Delete | | | | | |
| | +------+ | | +------+ | | failed VM | | | | | |
| | |Worker| | | |Master| |<--+-----------+--+--------+ | | |
| | +------+ | | +------+ | | | | | | |
| | VM | | VM |<--+-----------+--+----------------+ | |
| +----------+ +----------+ | | | | |
+-------------------------------+ | | Tacker-conductor| |
+-------------------------------+ | +----------------------+ |
| Hardware Resources | | |
+-------------------------------+ +----------------------------+
The diagram shows related component of this spec proposal and an overview of
the following processing:
#. MgmtDriver pre-processes to remove Master-node from the existing etcd cluster
in ``heal_start`` before deleting the failed VM.
#. Remove the failed Master-node from etcd cluster by notifying the other
active Master-node of the failure.
#. Remove the failed VM from a load balancing target by changing the HA Proxy
setting.
#. OpenStackInfraDriver deletes failed VM.
#. OpenStackInfraDriver creates a new VM as a replacement for a failed VM.
#. MgmtDriver heals the Kubernetes cluster in ``heal_end``.
#. MgmtDriver setup new Master-node on new VMs.
This setup procedure can be implemented with the shell script or the
python script including installation and configuration tasks.
#. Invoke the script for repairing etcd cluster if the Heal target is
Master-node.
#. Invoke the script for changing HA proxy configuration.
.. note:: The failed VMs are expected to be identified by checking
vnfcInstanceId parameter in the Heal request. In case of this
parameter is null, it means that entire rebuild is required.
Those procedure is defined in the specification of
etsi-nfv-sol-rest-api-for-VNF-deployment [#VNF-deployment]_.
.. note:: To identify the VMs newly created, it is assumed that information
such as the number of Worker-node VMs before Heal and the creation
time of each VM will need to be referenced.
Following sequence diagram describes the components involved and the flow of
Healing Master-node operation:
.. seqdiag::
seqdiag {
node_width = 80;
edge_length = 100;
"Client"
"Tacker-server"
"Tacker-conductor"
"VnfLcmDriver"
"OpenstackDriver"
"Heat"
"MgmtDriver"
"VnfInstance(Tacker DB)"
"RemoteCommandExecutor"
Client -> "Tacker-server"
[label = "POST /vnf_instances/{vnfInstanceId}/heal"];
Client <-- "Tacker-server"
[label = "Response 202 Accepted"];
"Tacker-server" -> "Tacker-conductor"
[label = "trigger asynchronous task"];
"Tacker-conductor" -> "MgmtDriver"
[label = "heal_start"];
"MgmtDriver" -> "VnfInstance(Tacker DB)"
[label = "get the vim connection info"];
"MgmtDriver" <-- "VnfInstance(Tacker DB)"
[label = ""];
"MgmtDriver" -> "MgmtDriver"
[label = "get the vm info to be deleted by the request data"];
"MgmtDriver" -> "RemoteCommandExecutor"
[label = "remove the failed Master-node from etcd cluster by notifying
the other active Master-node of the failure"];
"MgmtDriver" <-- "RemoteCommandExecutor"
[label = ""];
"MgmtDriver" -> "RemoteCommandExecutor"
[label = "remove the failed Master-node from HA proxy"];
"MgmtDriver" <-- "RemoteCommandExecutor"
[label = ""];
"Tacker-conductor" <-- "MgmtDriver"
[label = ""];
"Tacker-conductor" -> "VnfLcmDriver"
[label = "execute VnfLcmDriver"];
"VnfLcmDriver" -> "OpenstackDriver"
[label = "execute OpenstackDriver"];
"OpenstackDriver" -> "Heat"
[label = "resource signal"];
"OpenstackDriver" -> "Heat"
[label = "update stack"];
"OpenstackDriver" <-- "Heat"
[label = ""];
"VnfLcmDriver" <-- "OpenstackDriver"
[label = ""];
"Tacker-conductor" <-- "VnfLcmDriver"
[label = ""];
"Tacker-conductor" -> "MgmtDriver"
[label = "heal_end"];
"MgmtDriver" -> "VnfInstance(Tacker DB)"
[label = "get the vim connection info"];
"MgmtDriver" <-- "VnfInstance(Tacker DB)"
[label = ""];
"MgmtDriver" -> "Heat"
[label = "get the new vm info created by Heal operation."];
"MgmtDriver" <-- "Heat"
[label = ""];
"MgmtDriver" -> "RemoteCommandExecutor"
[label = "Install Kubernetes on the new Master-node"];
"MgmtDriver" <-- "RemoteCommandExecutor"
[label = ""];
"MgmtDriver" -> "RemoteCommandExecutor"
[label = "Repairs etcd cluster by invoking shell script"];
"MgmtDriver" <-- "RemoteCommandExecutor"
[label = ""];
"MgmtDriver" -> "RemoteCommandExecutor"
[label = "Changes HA proxy configuration"];
"MgmtDriver" <-- "RemoteCommandExecutor"
[label = ""];
"Tacker-conductor" <-- "MgmtDriver"
[label = ""];
}
The procedure consists of the following steps as illustrated in above sequence:
#. Client sends a POST request to the Heal VNF Instance resource.
#. Basically the same sequence as described in the "3) Flow of Heal of a VNF
instance "chapter of spec `etsi-nfv-sol-rest-api-for-VNF-deployment`_,
except for the MgmtDriver.
#. The following processes are performed in ``heal_start``.
#. MgmtDriver gets Vim connection information in order to get failed
Kubernetes cluster information such as auth_url.
#. MgmtDriver gets the failed VM information by checking vnfcInstanceId
parameter in the Heal request.
#. MgmtDriver remove the failed Master-node from etcd cluster by notifying
the other active Master-node of the failure.
#. MgmtDriver Remove the failed VM from a load balancing target by changing
the HA Proxy setting.
#. OpenStack Driver uses Heat to delete failed VM.
#. OpenStack Driver uses Heat to create a new VM.
#. The following processes are performed in ``heal_end``.
#. MgmtDriver gets Vim connection information in order to get existing
Kubernetes cluster information such as auth_url.
#. MgmtDriver gets the new VM information from the stack resource in Heat.
#. MgmtDriver install and configure the Master-node on the VM by invoking a
shell script using RemoteCommandExecutor.
#. MgmtDriver adds the new Master-node to etcd cluster by invoking shell
script using RemoteCommandExecutor.
#. MgmtDriver changes HA proxy configuration by invoking shell script using
RemoteCommandExecutor.
Healing a Worker-node
^^^^^^^^^^^^^^^^^^^^^
The following changes are needed:
+ Extend MgmtDriver to support ``heal_start`` and ``heal_end``
+ In ``heal_start``
+ Try evacuating the pod running on the failed VM to another worker node.
+ Remove the failed Worker-node by notifying the existing Master-node of
the failure.
+ In ``heal_end``
+ the same as ``scale_end`` described in the specification of
`mgmt-driver-for-k8s-scale`_.
MgmtDriver setup new Worker-node on new VMs in ``heal_end``.
This setup procedure can be implemented with the shell script or the
python script including installation and configuration tasks.
.. note:: It is assumed that the required information for Heal operation in
MgmtDriver is stored in the additional parameter of the Instantiate
and/or Heal request. MgmtDriver passes them to the sample script.
The diagram below shows VNF Heal operation for a Worker-node:
::
+---------------+
| Heal |
| Request with |
| Additional |
| Params |
+---+-----------+
|
+----------------+-----------+
| v VNFM |
| +-------------------+ |
| | Tacker-server | |
| +---------+---------+ |
| | |
4. heal_end | v |
Kubernetes cluster | +----------------------+ |
(Worker) | | +-------------+ | |
+----------------------+--+----| MgmtDriver | | |
| | | +-----------+-+ | |
| | | | | |
| | | 1. heal_start | | |
| | | Kubernetes | | |
+--------------------+----------+ | | cluster | | |
| | | | | (Worker)| | |
| | | | | | | |
| | | | | | | |
| +----------+ +---+------+ | | | | | |
| | | | v | | 3. Create | | +-----------+ | | |
| | +------+ | | +------+ | | new VM | | |OpenStack | | | |
| | |Master| | | |Worker| |<--------------+--+-|InfraDriver| | | |
| | +------+ | | +------+ | | | | +------+----+ | | |
| | VM | | VM | | | | | | | |
| +----------+ +----------+ | | | | | | |
| +----------+ +----------+ | 2. Delete | | | | | |
| | +------+ | | +------+ | | failed VM | | | | | |
| | |Master| | | |Worker| |<--+-----------+--+--------+ | | |
| | +------+ | | +------+ | | | | | | |
| | VM | | VM |<--+-----------+--+----------------+ | |
| +----------+ +----------+ | | | | |
+-------------------------------+ | | Tacker-conductor| |
+-------------------------------+ | +----------------------+ |
| Hardware Resources | | |
+-------------------------------+ +----------------------------+
The diagram shows related component of this spec proposal and an overview of
the following processing:
#. MgmtDriver pre-processes to remove Worker-node from the Kubernetes cluster in
``heal_start`` before deleting the failed VM.
#. Try evacuating the pod running on the failed VM to another worker node.
#. Remove the failed Worker-node by notifying the existing Master-node of
the failure.
#. OpenStackInfraDriver deletes failed VM.
#. OpenStackInfraDriver creates a new VM as a replacement for a failed VM.
#. MgmtDriver heals the Kubernetes cluster in ``heal_end``.
This Heal procedure is basically the same as ``scale_end`` described in the
specification of `mgmt-driver-for-k8s-scale`_.
MgmtDriver setup new Worker-node on new VMs in ``heal_end``.
This setup procedure can be implemented with the shell script or the python
script including installation and configuration tasks.
.. note:: The failed VMs are expected to be identified by checking
vnfcInstanceId parameter in the Heal request. In case of this
parameter is null, it means that entire rebuild is required.
Those procedure is defined in the specification of
etsi-nfv-sol-rest-api-for-VNF-deployment [#VNF-deployment]_.
.. note:: To identify the VMs newly created, it is assumed that information
such as the number of Worker-node VMs before Heal and the creation
time of each VM will need to be referenced.
Following sequence diagram describes the components involved and the flow of
Heal the Worker-node operation:
.. seqdiag::
seqdiag {
node_width = 80;
edge_length = 100;
"Client"
"Tacker-server"
"Tacker-conductor"
"VnfLcmDriver"
"OpenstackDriver"
"Heat"
"MgmtDriver"
"VnfInstance(Tacker DB)"
"RemoteCommandExecutor"
Client -> "Tacker-server"
[label = "POST /vnf_instances/{vnfInstanceId}/heal"];
Client <-- "Tacker-server"
[label = "Response 202 Accepted"];
"Tacker-server" -> "Tacker-conductor"
[label = "trigger asynchronous task"];
"Tacker-conductor" -> "MgmtDriver"
[label = "heal_start"];
"MgmtDriver" -> "VnfInstance(Tacker DB)"
[label = "get the vim connection info"];
"MgmtDriver" <-- "VnfInstance(Tacker DB)"
[label = ""];
"MgmtDriver" -> "MgmtDriver"
[label = "get the vm info to be deleted by the request data"];
"MgmtDriver" -> "RemoteCommandExecutor"
[label = "try evacuating the pod running on the failed VM to another
worker node"];
"MgmtDriver" <-- "RemoteCommandExecutor"
[label = ""];
"MgmtDriver" -> "RemoteCommandExecutor"
[label = "Notify the Master-node of the failed VM"];
"MgmtDriver" <-- "RemoteCommandExecutor"
[label = ""];
"Tacker-conductor" <-- "MgmtDriver"
[label = ""];
"Tacker-conductor" -> "VnfLcmDriver"
[label = "execute VnfLcmDriver"];
"VnfLcmDriver" -> "OpenstackDriver"
[label = "execute OpenstackDriver"];
"OpenstackDriver" -> "Heat"
[label = "resource signal"];
"OpenstackDriver" -> "Heat"
[label = "update stack"];
"OpenstackDriver" <-- "Heat"
[label = ""];
"VnfLcmDriver" <-- "OpenstackDriver"
[label = ""];
"Tacker-conductor" <-- "VnfLcmDriver"
[label = ""];
"Tacker-conductor" -> "MgmtDriver"
[label = "heal_end"];
"MgmtDriver" -> "VnfInstance(Tacker DB)"
[label = "get the vim connection info"];
"MgmtDriver" <-- "VnfInstance(Tacker DB)"
[label = ""];
"MgmtDriver" -> "Heat"
[label = "get the new vm info created by Heal operation."];
"MgmtDriver" <-- "Heat"
[label = ""];
"MgmtDriver" -> "RemoteCommandExecutor"
[label = "Install Kubernetes on the new Worker-node"];
"MgmtDriver" <-- "RemoteCommandExecutor"
[label = ""];
"Tacker-conductor" <-- "MgmtDriver"
[label = ""];
}
The procedure consists of the following steps as illustrated in above sequence:
#. Client sends a POST request to the Heal VNF Instance resource.
#. Basically the same sequence as described in the "3) Flow of Heal of a VNF
instance "chapter of spec `etsi-nfv-sol-rest-api-for-VNF-deployment`_,
except for the MgmtDriver.
#. The following processes are performed in ``heal_start``.
#. MgmtDriver gets Vim connection information in order to get failed
Kubernetes cluster information such as auth_url.
#. MgmtDriver gets the failed VM information by checking vnfcInstanceId
parameter in the Heal request.
#. Try evacuating the pod running on the failed VM to another worker node.
#. Remove the failed Worker-node by notifying the existing Master-node of
the failure.
#. OpenStack Driver uses Heat to delete failed VM.
#. OpenStack Driver uses Heat to create a new VM.
#. The following processes are performed in ``heal_end``.
#. MgmtDriver gets Vim connection information in order to get existing
Kubernetes cluster information such as auth_url.
#. MgmtDriver gets the new VM information from the stack resource in Heat.
#. MgmtDriver setup a Worker-node on the VM by invoking shell script using
RemoteCommandExecutor.
Healing a Kubernetes cluster with SOL003
----------------------------------------
Basically, Heal operation for the entire Kubernetes cluster is based on the
spec of `mgmt-driver-for-k8s-cluster`_, which specifies operations for
instantiation and termination. For deleting an existing resource in a Heal
operation, ``heal_start`` is same as ``terminate_end``, and for generating a
new resource in a Heal operation, ``heal_end`` is same as ``instantiate_end``.
During the Heal of the entire VNF instance, the VIM information must be
deleted and the old Kubernetes cluster information stored in the
``vim_connection_info`` of the VNF Instance table must also be cleared. These
operations are required to be implemented in both ``heal_start`` and
``terminate_end``.
Following sequence diagram describes the components involved and the flow of
Heal operation for an entire VNF instance:
.. seqdiag::
seqdiag {
node_width = 80;
edge_length = 100;
"Client"
"Tacker-server"
"Tacker-conductor"
"VnfLcmDriver"
"OpenstackDriver"
"Heat"
"MgmtDriver"
"VnfInstance(Tacker DB)"
"RemoteCommandExecutor"
"NfvoPlugin"
Client -> "Tacker-server"
[label = "POST /vnf_instances/{vnfInstanceId}/heal"];
Client <-- "Tacker-server"
[label = "Response 202 Accepted"];
"Tacker-server" -> "Tacker-conductor"
[label = "trigger asynchronous task"];
"Tacker-conductor" -> "MgmtDriver"
[label = "heal_start"];
"MgmtDriver" -> "NfvoPlugin"
[label = "delete the failed VIM information"];
"MgmtDriver" <-- "NfvoPlugin"
[label = ""];
"MgmtDriver" -> "VnfInstance(Tacker DB)"
[label = "Clear the old Kubernetes cluster information stored in the
vim_connection_info of the VNF Instance"];
"MgmtDriver" <-- "VnfInstance(Tacker DB)"
"Tacker-conductor" <-- "MgmtDriver"
[label = ""];
"Tacker-conductor" -> "VnfLcmDriver"
[label = "execute VnfLcmDriver"];
"VnfLcmDriver" -> "OpenstackDriver"
[label = "execute OpenstackDriver"];
"OpenstackDriver" -> "Heat"
[label = "resource signal"];
"OpenstackDriver" -> "Heat"
[label = "update stack"];
"OpenstackDriver" <-- "Heat"
[label = ""];
"VnfLcmDriver" <-- "OpenstackDriver"
[label = ""];
"Tacker-conductor" <-- "VnfLcmDriver"
[label = ""];
"Tacker-conductor" -> "MgmtDriver"
[label = "heal_end"];
"MgmtDriver" -> "VnfInstance(Tacker DB)"
[label = "get stack id"];
"MgmtDriver" <-- "VnfInstance(Tacker DB)"
[label = ""];
"MgmtDriver" -> "Heat"
[label = "get ssh ip address and Kubernetes address using stack id"];
"MgmtDriver" <-- "Heat"
[label = ""];
"MgmtDriver" -> "RemoteCommandExecutor"
[label = "install Kubernetes on the new node"];
"MgmtDriver" <-- "RemoteCommandExecutor"
[label = ""];
"MgmtDriver" -> "RemoteCommandExecutor"
[label = "install etcd cluster by invoking shell script"];
"MgmtDriver" <-- "RemoteCommandExecutor"
[label = ""];
"MgmtDriver" -> "NfvoPlugin"
[label = "register Kubernetes VIM to tacker"];
"MgmtDriver" <-- "NfvoPlugin"
[label = ""]
"MgmtDriver" -> "VnfInstance(Tacker DB)"
[label = "append Kubernetes cluster VIM info to VimConnectionInfo"]
"MgmtDriver" <-- "VnfInstance(Tacker DB)"
[label = ""]
"Tacker-conductor" <-- "MgmtDriver"
[label = ""];
}
The procedure consists of the following steps as illustrated in above sequence:
#. Client sends a POST request to the Heal VNF Instance resource.
#. Basically the same sequence as described in the "3) Flow of Heal of a VNF
instance "chapter of spec `etsi-nfv-sol-rest-api-for-VNF-deployment`_,
except for the MgmtDriver.
#. ``heal_start`` works the same as `mgmt-driver-for-k8s-cluster`_ of
terminate_end.
#. OpenStack Driver uses Heat to delete failed VM.
#. OpenStack Driver uses Heat to create a new VM.
#. ``heal_end`` works the same as `mgmt-driver-for-k8s-cluster`_ of
instantiate_end.
Alternatives
------------
None
Data model impact
-----------------
None
REST API impact
---------------
None
Security impact
---------------
None
Notifications impact
--------------------
None
Other end user impact
---------------------
None
Performance Impact
------------------
None
Other deployer impact
---------------------
None
Developer impact
----------------
None
Implementation
==============
Assignee(s)
-----------
Primary assignee:
Yoshito Ito <yoshito.itou.dr@hco.ntt.co.jp>
Other contributors:
Shotaro Banno <banno.shotaro@fujitsu.com>
Ayumu Ueha <ueha.ayumu@fujitsu.com>
Liang Lu <lu.liang@fujitsu.com>
Work Items
----------
+ MgmtDriver will be modified to implement:
+ Support the healing of Kubernetes nodes in ``heal_start`` and ``heal_end``.
+ Provides the following sample script executed by MgmtDriver:
+ to incorporate the new Master-node into the existing etcd cluster.
+ to change HA Proxy settings due to replacement of old and new VMs in the
process of ``heal_start`` and ``heal_end``
+ Add new unit and functional tests.
Dependencies
============
This spec depends on the following specs:
+ `mgmt-driver-for-k8s-cluster`_
+ `mgmt-driver-for-ha-k8s`_
+ `mgmt-driver-for-k8s-scale`_
``heal_end`` referred in "Proposed change" is based on ``instantiate_end``
in the spec of `mgmt-driver-for-k8s-cluster`_.
Healing operation for individual Master-node is performed assuming that the
Kubernetes cluster is an HA configuration, as described in the spec of
`mgmt-driver-for-ha-k8s`_. Also, Healing operation for individual Worker-node
is referring `mgmt-driver-for-k8s-scale`_ for the ``scale_start``.
Testing
=======
Unit and functional tests will be added to cover cases required in the spec.
Documentation Impact
====================
Complete user guide will be added to explain Kubernetes cluster Healing from the
perspective of VNF LCM APIs.
References
==========
.. _support-scale-api-based-on-etsi-nfv-sol:
../victoria/support-scale-api-based-on-etsi-nfv-sol.html
.. _mgmt-driver-for-k8s-cluster:
./mgmt-driver-for-k8s-cluster.html
.. _mgmt-driver-for-k8s-scale:
./mgmt-driver-for-k8s-scale.html
.. _mgmt-driver-for-ha-k8s:
./mgmt-driver-for-ha-k8s.html
.. _etsi-nfv-sol-rest-api-for-VNF-deployment:
https://specs.openstack.org/openstack/tacker-specs/specs/ussuri/etsi-nfv-sol
-rest-api-for-VNF-deployment.html
.. [#SOL002] https://www.etsi.org/deliver/etsi_gs/NFV-SOL/001_099/002/
.. [#SOL003] https://www.etsi.org/deliver/etsi_gs/NFV-SOL/001_099/003/
.. [#VNF-deployment] https://specs.openstack.org/openstack/tacker-specs/specs/ussuri/etsi-nfv-sol-rest-api-for-VNF-deployment