WIP etcd coordination

Introduce etcd and tooz as service coordination.

Change-Id: If2c228c4d2ebaf93d79c4cbf2cc39146f8f74086
Story: 2001842
Task: 30376
Kaifeng Wang 2019-04-08 16:16:32 +08:00
parent 110ec01268
commit cd6c8744df
1 changed files with 162 additions and 0 deletions

specs/etcd-coordination.rst Normal file

@ -0,0 +1,162 @@
..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

========================================
Incorporate ETCD as service coordination
========================================
https://storyboard.openstack.org/#!/story/2001842
This spec is part of the ironic-inspector HA work. To further split the
inspector service, this spec proposes to introduce etcd as the base service
for the coordination between ironic-inspector api and conductor services.
Problem description
===================
From the previous work, the single-process ironic-inspector has been logically
split into two services, both running under ``oslo.service``, namely
``ironic_inspector`` and ``ironic-inspector-conductor``.

Before the two services can be split into separate executables, an existing
functional test issue has to be addressed. The functional tests currently use
the fake messaging driver, which only works within a single process. We can
either add rabbitmq support to the functional test environment or introduce
another messaging mechanism such as ``json-rpc``; the first solution is not
desirable.

Even once the services are split, service coordination remains a challenge:
with multiple inspector conductor services, we need a way to prevent races
between concurrent operations on the same node, or to choose which inspector
conductor a request should be delivered to.

Proposed change
===============
As etcd is already a base service for the OpenStack platform, this spec
proposes to add ``python-etcd3`` and ``tooz`` as project requirements for
service coordination. ``tooz`` encapsulates several features such as group
management and locking. Group management is only implemented for the etcd
API v3, thus ``python-etcd3`` is required.

All proposed work is implemented with tooz interfaces. Each service will
create a coordinator and keep heartbeating. The example workflow for the
ironic-inspector API service:

#. Create a coordinator with the hostname.
#. Create a group "ironic-inspector-service-group"; skip this step if the
   group already exists.
#. Query the group members upon an API request, randomly pick one conductor,
   generate the topic according to its hostname and send the RPC request.
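
The API-side steps above could be sketched roughly as follows. This is only an
illustration under stated assumptions, not the proposed implementation: the
backend URL is supplied by the caller, the base topic name
``ironic_inspector.conductor`` and both helper names are hypothetical, and
member ids are assumed to be the hostname encoded as bytes.

```python
import random
import socket

# Group name taken from the workflow above; tooz identifies groups by bytes.
GROUP = b"ironic-inspector-service-group"


def start_api_coordinator(backend_url, hostname=None):
    """Create a coordinator named after the host and ensure the group exists."""
    # Imported lazily so the pure helper below stays usable without tooz.
    from tooz import coordination

    member_id = (hostname or socket.gethostname()).encode()
    coordinator = coordination.get_coordinator(backend_url, member_id)
    coordinator.start(start_heart=True)  # heartbeat in a background thread
    try:
        coordinator.create_group(GROUP).get()
    except coordination.GroupAlreadyExist:
        pass  # another service created the group first
    return coordinator


def pick_conductor_topic(coordinator,
                         base_topic="ironic_inspector.conductor",  # hypothetical
                         rng=random):
    """Randomly pick one group member and derive its RPC topic."""
    members = sorted(coordinator.get_members(GROUP).get())
    if not members:
        raise RuntimeError("no conductor has joined the service group")
    host = rng.choice(members).decode()
    return "%s.%s" % (base_topic, host)
```

Passing the coordinator in as an argument keeps the selection logic easy to
unit test with a stub, independent of a running etcd service.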

The example workflow for the ironic-inspector conductor service:

#. Create a coordinator with the hostname.
#. Join the group "ironic-inspector-service-group"; create the group first
   and then join if it does not exist.
#. Leave the group explicitly when the service is shut down.
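
A minimal sketch of the conductor-side steps, again under assumptions: the
function names are hypothetical, the backend URL comes from the caller, and
the member id is assumed to be the hostname encoded as bytes.

```python
import socket

# Group name taken from the workflow above; tooz identifies groups by bytes.
GROUP = b"ironic-inspector-service-group"


def join_service_group(backend_url, hostname=None):
    """Start a coordinator and join the group, creating it when missing."""
    # Imported lazily so leave_service_group() stays usable without tooz.
    from tooz import coordination

    member_id = (hostname or socket.gethostname()).encode()
    coordinator = coordination.get_coordinator(backend_url, member_id)
    coordinator.start(start_heart=True)  # heartbeat in a background thread
    try:
        coordinator.join_group(GROUP).get()
    except coordination.GroupNotCreated:
        try:
            coordinator.create_group(GROUP).get()
        except coordination.GroupAlreadyExist:
            pass  # lost the creation race to another service, which is fine
        coordinator.join_group(GROUP).get()
    return coordinator


def leave_service_group(coordinator):
    """Leave the group explicitly on shutdown, then stop the coordinator."""
    try:
        coordinator.leave_group(GROUP).get()
    finally:
        coordinator.stop()
```

Leaving explicitly removes the conductor from the member list right away
rather than waiting for its membership to expire after heartbeats stop.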

ironic-inspector currently has no distributed locking support. This spec will
introduce an abstract lock layer and implement locking support based on tooz.
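
One possible shape for such a lock layer is sketched below. The class names
and the ``ironic_inspector.lock.`` prefix are hypothetical; only
``get_lock()``, ``acquire()`` and ``release()`` are taken from the tooz
interfaces, and the coordinator is assumed to be already started.

```python
import abc
import threading


class BaseLock(abc.ABC):
    """Abstract lock interface that the rest of the service codes against."""

    @abc.abstractmethod
    def acquire(self, blocking=True):
        """Try to take the lock; return True on success."""

    @abc.abstractmethod
    def release(self):
        """Release a lock previously acquired."""

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.release()


class InternalLock(BaseLock):
    """Process-local lock, sufficient for a single-process deployment."""

    def __init__(self):
        self._lock = threading.Lock()

    def acquire(self, blocking=True):
        return self._lock.acquire(blocking)

    def release(self):
        self._lock.release()


class ToozLock(BaseLock):
    """Distributed lock backed by a started tooz coordinator."""

    def __init__(self, coordinator, node_uuid,
                 prefix="ironic_inspector.lock."):  # hypothetical prefix
        # tooz lock names are bytes; one lock per node UUID.
        self._lock = coordinator.get_lock((prefix + node_uuid).encode())

    def acquire(self, blocking=True):
        return self._lock.acquire(blocking=blocking)

    def release(self):
        self._lock.release()
```

With both classes sharing one interface, callers can write
``with ToozLock(coordinator, node_uuid): ...`` and the backend can be swapped
through configuration without touching the calling code.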
Alternatives
------------
Though it is entirely workable to use the database as the coordination
source, just as ironic does, an implementation based on tooz would be much
lighter. tooz also supports multiple backends, which brings more
possibilities in deployment.
Data model impact
-----------------
None.
HTTP API impact
---------------
None.
Client (CLI) impact
-------------------
None.
Ironic python agent impact
--------------------------
None.
Ironic impact
-------------
None.
Performance and scalability impact
----------------------------------
There should be no obvious performance or scalability impact before the
services are actually split.
Security impact
---------------
None.
Deployer impact
---------------
TODO(kaifeng): Add configuration options to support the proposed work:

- etcd: host, port, ca_cert, cert_key, cert_cert, timeout, user, password,
  ?grpc_options?
- group name
- lock prefix

Developer impact
----------------
None.
Upgrades and Backwards Compatibility
------------------------------------
After this spec is implemented, etcd v3 will be a mandatory requirement for
the inspector service to work properly.
Implementation
==============
Assignee(s)
-----------
Primary assignee:
  kaifeng, kaifeng.w@gmail.com

Work Items
----------
Implement proposed work.
Dependencies
============
``python-etcd3`` and ``tooz`` are required libraries.

An etcd v3 service should be running in the same cloud.

Testing
=======
Will be covered by unit tests and bifrost.

References
==========
https://docs.openstack.org/tooz/latest/user/index.html