Introduce etcd for service coordination
The spec introduces etcd and tooz for inspector service coordination, which is a prerequisite for the service split. Group management will be used to decide which ironic-inspector conductor service an RPC request is sent to; distributed locking will help avoid races in a concurrent environment.

Change-Id: If2c228c4d2ebaf93d79c4cbf2cc39146f8f74086
Story: 2001842
Task: 30376
parent 110ec01268
commit 2a157e2630
@@ -0,0 +1,175 @@
..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

========================================
Incorporate ETCD as service coordination
========================================

https://storyboard.openstack.org/#!/story/2001842

This spec is part of the ironic-inspector HA work. To further split the
inspector service, this spec proposes to introduce etcd as the base service
for coordination between the ironic-inspector API and conductor services.

Problem description
===================

Previous work logically split the single-process ironic-inspector into two
services, ``ironic_inspector`` and ``ironic-inspector-conductor``, both
running under ``oslo.service``.

Before the two services can be split into separate executables, an existing
functional test issue must be addressed. The functional tests currently use
the fake messaging driver, which only works within a single process. We can
either add RabbitMQ support to the functional test environment or introduce
another messaging mechanism such as ``json-rpc``; the first option is not
desirable.

Even once the services are split, service coordination remains a challenge.
With multiple inspector conductor services, we need a way to prevent races
between concurrent operations on the same node, and a way to choose which
inspector conductor a request should be delivered to.


Proposed change
===============

As etcd is already a base service for the OpenStack platform, this spec
proposes to add ``python-etcd3`` and ``tooz`` as project requirements for
service coordination. ``tooz`` encapsulates several features such as group
management and locking. In tooz, group management for etcd is only
implemented for the etcd v3 API, thus ``python-etcd3`` is required.

All proposed work is implemented with tooz interfaces. Each service will
create a coordinator and keep heartbeating. The example workflow for the
ironic-inspector API service:

#. Create a coordinator with the hostname.
#. Create the group "ironic-inspector-service-group"; bypass if the group
   already exists.
#. Upon an API request, query the group members, randomly pick one conductor,
   generate the topic according to its hostname and send the RPC request.
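
As a minimal sketch of the API-side steps, the helpers below take an already
started tooz coordinator, make sure the group exists, pick a random member and
derive a topic from its hostname. The base topic name
``ironic-inspector-conductor`` and the ``<base>.<hostname>`` topic layout are
assumptions made for illustration, not something this spec defines. tooz is
imported lazily so the selection logic can be exercised without a backend.

```python
import random

# Group name taken from the workflow above; tooz group ids are byte strings.
GROUP = b"ironic-inspector-service-group"
# Hypothetical base topic, used only for illustration.
BASE_TOPIC = "ironic-inspector-conductor"


def ensure_group(coordinator, group=GROUP):
    """Create the service group, bypassing the error if it already exists."""
    from tooz import coordination  # lazy import: only needed with a real backend

    try:
        coordinator.create_group(group).get()
    except coordination.GroupAlreadyExist:
        pass


def pick_conductor(coordinator, group=GROUP, rng=random):
    """Return the hostname of one randomly chosen group member."""
    members = sorted(coordinator.get_members(group).get())
    if not members:
        raise RuntimeError("no ironic-inspector conductor is available")
    member = rng.choice(members)
    return member.decode("utf-8") if isinstance(member, bytes) else member


def topic_for(hostname, base=BASE_TOPIC):
    """Build the per-conductor RPC topic (assumed layout: <base>.<hostname>)."""
    return "%s.%s" % (base, hostname)
```

With a real backend, the coordinator would come from
``tooz.coordination.get_coordinator()`` followed by
``start(start_heart=True)``. Only conductors join the group in this workflow,
so every member returned is a candidate target.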

The example workflow for the ironic-inspector conductor service:

#. Create a coordinator with the hostname.
#. Join the group "ironic-inspector-service-group"; create the group and then
   join it if it does not exist.
#. Leave the group explicitly when the service shuts down.
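
The conductor-side steps could look roughly like the sketch below, which wraps
the join and leave calls in a context manager so that the group is left
explicitly on shutdown. The function names are hypothetical; the tooz calls
(``start``, ``join_group``, ``create_group``, ``leave_group``, ``stop``) are
used on a coordinator that the caller creates with its hostname as member id.

```python
import contextlib

# Group name taken from the workflow above; tooz group ids are byte strings.
GROUP = b"ironic-inspector-service-group"


def join_service_group(coordinator, group=GROUP):
    """Join the group, creating it first if it does not exist yet."""
    from tooz import coordination  # lazy import: only needed at call time

    try:
        coordinator.join_group(group).get()
    except coordination.GroupNotCreated:
        try:
            coordinator.create_group(group).get()
        except coordination.GroupAlreadyExist:
            pass  # another conductor created it in the meantime
        coordinator.join_group(group).get()


@contextlib.contextmanager
def group_membership(coordinator, group=GROUP):
    """Hold group membership for the lifetime of the conductor service."""
    coordinator.start(start_heart=True)  # keep heartbeating in the background
    try:
        join_service_group(coordinator, group)
        try:
            yield coordinator
        finally:
            coordinator.leave_group(group).get()  # leave explicitly on shutdown
    finally:
        coordinator.stop()
```

A conductor would run its main loop inside ``with group_membership(...)``, so
that a clean shutdown removes it from the member list without waiting for the
heartbeat to expire.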

There is currently no distributed locking support in ironic-inspector. This
spec will introduce an abstract lock layer and implement locking support
based on tooz.
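
One possible shape for such a lock layer is sketched below: an abstract base
class, a process-local implementation and a tooz-backed implementation. The
class names ``BaseLock``, ``InternalLock`` and ``ToozLock`` are hypothetical
and not defined by this spec; the tooz-backed class only relies on
``get_lock()``, ``acquire()`` and ``release()``.

```python
import abc
import threading


class BaseLock(abc.ABC):
    """Hypothetical abstract lock interface; names are illustrative only."""

    @abc.abstractmethod
    def acquire(self, blocking=True):
        """Try to take the lock; return True on success."""

    @abc.abstractmethod
    def release(self):
        """Release the lock."""

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.release()


class InternalLock(BaseLock):
    """Process-local lock, enough for a single-process deployment."""

    def __init__(self):
        self._lock = threading.Lock()

    def acquire(self, blocking=True):
        return self._lock.acquire(blocking)

    def release(self):
        self._lock.release()


class ToozLock(BaseLock):
    """Distributed lock backed by a started tooz coordinator."""

    def __init__(self, coordinator, name):
        # tooz lock names are byte strings.
        self._lock = coordinator.get_lock(name.encode("utf-8"))

    def acquire(self, blocking=True):
        return self._lock.acquire(blocking=blocking)

    def release(self):
        self._lock.release()
```

Keeping both implementations behind one interface lets the node-locking code
stay unchanged whether the service runs as a single process or as several
coordinated conductors.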


Alternatives
------------

Though it is entirely workable to use the database as the coordination
source, as ironic does, the implementation is much lighter with tooz.
tooz also supports multiple backends, which brings more possibilities in
deployment.

Data model impact
-----------------

None.

HTTP API impact
---------------

None.

Client (CLI) impact
-------------------

None.

Ironic python agent impact
--------------------------

None.

Ironic impact
-------------

None.

Performance and scalability impact
----------------------------------

There should be no noticeable performance or scalability impact before the
services are actually split.

Security impact
---------------

None.

Deployer impact
---------------

A new configuration section ``etcd`` will be added with the options below to
support etcd operation:

* ``host`` and ``port``: the etcd service endpoint.
* ``ca_cert``, ``cert_key`` and ``cert_cert``: SSL-related authentication
  settings.
* ``timeout``: connection timeout per request.
* ``user`` and ``password``: the username and password, if etcd authentication
  is required.
* ``group_path``: the name of the service group used to coordinate inspector
  services. It can be a key path, a key prefix or both. The default value is
  ``/openstack/ironic-inspector/service-group``.
* ``lock_prefix``: a string prefix for lock names. For example, locking node
  ``fake-node-uuid`` with prefix ``ironic-inspector`` passes the lock name
  ``ironic-inspector.fake-node-uuid`` to tooz.
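
To make the option list concrete, the sketch below models the proposed
section as a plain dataclass and shows how a lock name is composed from
``lock_prefix``. The class ``EtcdOptions`` is a hypothetical stand-in for the
real configuration registration. The ``None`` defaults and the
``ironic-inspector`` default for ``lock_prefix`` are assumptions; only the
``group_path`` default comes from the text above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EtcdOptions:
    """Plain-Python stand-in for the proposed ``etcd`` config section."""

    host: str
    port: int
    ca_cert: Optional[str] = None      # assumed default
    cert_key: Optional[str] = None     # assumed default
    cert_cert: Optional[str] = None    # assumed default
    timeout: Optional[int] = None      # assumed default
    user: Optional[str] = None         # assumed default
    password: Optional[str] = None     # assumed default
    group_path: str = "/openstack/ironic-inspector/service-group"
    lock_prefix: str = "ironic-inspector"  # assumed default, from the example

    def lock_name(self, node_uuid):
        """Compose the lock name passed to tooz: <lock_prefix>.<node_uuid>."""
        return "%s.%s" % (self.lock_prefix, node_uuid)


opts = EtcdOptions(host="localhost", port=2379)
print(opts.lock_name("fake-node-uuid"))
# prints: ironic-inspector.fake-node-uuid
```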


Developer impact
----------------

None.

Upgrades and Backwards Compatibility
------------------------------------

After this spec is implemented, etcd v3 will be a mandatory requirement for
the inspector service to work properly.

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  kaifeng - kaifeng.w@gmail.com

Other contributors:
  None

Work Items
----------

Implement the proposed work.


Dependencies
============

``python-etcd3`` and ``tooz`` are required libraries.
There should be an etcd v3 service running in the same cloud.

Testing
=======

Will be covered by unit tests and bifrost.

References
==========

https://docs.openstack.org/tooz/latest/user/index.html