-rw-r--r-- specs/stein/magnum-nodegroups.rst 275
1 files changed, 275 insertions, 0 deletions
diff --git a/specs/stein/magnum-nodegroups.rst b/specs/stein/magnum-nodegroups.rst
new file mode 100644
index 0000000..22d0391
--- /dev/null
+++ b/specs/stein/magnum-nodegroups.rst
@@ -0,0 +1,275 @@
Magnum Nodegroups
=================

Launchpad blueprint:

https://blueprints.launchpad.net/magnum/+spec/magnum-nodegroups

This is a proposal to extend the Magnum API by adding support for nodegroups.

Problem Description
-------------------

Currently, Magnum supports the creation of clusters with all the nodes in the
same availability zone. In addition, the user can choose only one flavor for
the master nodes and one for the worker nodes.

The concept of nodegroups gives users the ability to specify groups of nodes
with different properties. Within a group, users can define the labels, image,
flavor, etc., depending on the purpose the nodes will serve.

This proposal addresses the changes needed to support nodegroups in Magnum.

Use Cases
---------

1. As a user, I want to deploy heterogeneous workloads in the same cluster.
   These can include SQL databases with high IOPS requirements, caches
   requiring large amounts of memory, and batch jobs requiring a larger number
   of CPUs or even GPUs.

2. As a user, I want to create highly available clusters with Magnum.

Proposed Changes
----------------

The proposed change includes:

* Add a new '/clusters/{cluster_id}/nodegroups' REST API endpoint to Magnum
  providing management of the given cluster's nodegroups. This includes
  nodegroup creation, update and deletion.

* Add a new object to the data model to represent a nodegroup.

* Change the cluster create procedure to create two default nodegroups, one
  containing the master node(s) of the cluster and one containing the worker
  node(s).

* Adapt the cluster delete procedure to also delete the nodegroups associated
  with the cluster being deleted.

See sections `Data Model Impact`_ and `REST API Impact`_ for more details.

.. note::

   As a first step, users will be able to create nodegroups containing only
   worker nodes. This is because the scripts used for scaling up do not
   support adding new master nodes to the cluster. This change is left as
   future work and will be handled by another spec.

Alternatives
------------

As an alternative to the proposed solution, a user could create multiple
independent clusters and connect them under a single federated control plane,
so that they act as one heterogeneous cluster.

The problem is that there is no feature parity between the cluster and the
federation APIs and, for the time being, cluster federation is supported only
by the Kubernetes COE.

Nodegroups address the matter at hand more completely.

Data Model Impact
-----------------

A new entity, with its corresponding table, will be added:

* **nodegroup**

  * uuid
  * name
  * cluster_uuid (the uuid of the cluster to which the nodegroup belongs)
  * project_id
  * docker_volume_size
  * labels
  * flavor_id
  * image_id
  * node_addresses
  * node_count
  * role (whether the nodegroup contains master or worker nodes, for now)

The project id could be fetched from the cluster, but it is also stored here
for future use, namely the scenario where the master nodes belong to an
operator tenant and the cluster nodegroups belong to different projects.
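
As a rough illustration of the entity above, the fields could be modelled as
a plain Python dataclass. This is only a sketch: the field names come from the
list above, while the Python types, the default values and the choice of a
dictionary for ``labels`` are assumptions that the spec does not state.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class NodeGroup:
    """Illustrative stand-in for the proposed nodegroup entity."""

    uuid: str
    name: str
    cluster_uuid: str  # uuid of the cluster to which the nodegroup belongs
    project_id: str    # duplicated from the cluster for future use
    flavor_id: str
    image_id: str
    role: str          # "master" or "worker" for now
    node_count: int = 1                       # assumed default
    docker_volume_size: Optional[int] = None  # assumed optional
    labels: Dict[str, str] = field(default_factory=dict)
    node_addresses: List[str] = field(default_factory=list)


# Example values are made up for illustration.
workers = NodeGroup(
    uuid="ng-uuid-1",
    name="nodegroup2",
    cluster_uuid="cluster-uuid-1",
    project_id="project-uuid-1",
    flavor_id="flavor-2",
    image_id="image-1",
    role="worker",
    node_count=5,
)
```

Keeping ``project_id`` on the nodegroup, rather than deriving it from the
cluster, is what leaves room for nodegroups owned by different projects.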

Adding the nodegroup entity means that some information currently stored in
the cluster table should be moved to the nodegroup table. The cluster columns
that need to be dropped are the following:

* node_count
* master_count
* node_addresses
* master_addresses

.. note::

   Moving information from the cluster to the nodegroup table will NOT change
   the output of the existing CLIs. The only thing that changes is the way
   this information is stored and subsequently fetched from the database. For
   example, the cluster show output will still contain node_count, but the
   value will be calculated at the API level by summing the node_count of all
   the associated worker nodegroups.
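
The calculation described in the note can be sketched as follows. The
function name and the dictionary layout are hypothetical, and deriving
``master_count`` from the master nodegroups in the same way is an assumption;
the spec only spells out the ``node_count`` case.

```python
from typing import Dict, Iterable


def cluster_node_counts(nodegroups: Iterable[Dict]) -> Dict[str, int]:
    """Derive the cluster-level counts from the cluster's nodegroups."""
    counts = {"node_count": 0, "master_count": 0}
    for ng in nodegroups:
        # Worker nodegroups feed node_count, master nodegroups master_count.
        key = "master_count" if ng["role"] == "master" else "node_count"
        counts[key] += ng["node_count"]
    return counts


nodegroups = [
    {"name": "nodegroup1", "role": "master", "node_count": 3},
    {"name": "nodegroup2", "role": "worker", "node_count": 5},
    {"name": "nodegroup3", "role": "worker", "node_count": 2},
]
print(cluster_node_counts(nodegroups))
# prints {'node_count': 7, 'master_count': 3}
```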

REST API Impact
---------------

This change leads to a minor version increase in the Magnum API, the
addition of a new REST endpoint and a new set of CLI commands.

Below is a description of the commands to manage nodegroups:

* add a new nodegroup to an existing cluster::

    openstack coe node-group create <params> <cluster> <nodegroup>

* delete an existing nodegroup::

    openstack coe node-group delete <cluster> <nodegroup>

* update an existing nodegroup::

    openstack coe node-group update <params> <cluster> <nodegroup>

* list the nodegroups of an existing cluster::

    openstack coe node-group list <cluster>

    +------+-------------+-------------+------------+-----------+
    | uuid | name        | flavor id   | node count | role      |
    +------+-------------+-------------+------------+-----------+
    | ...  | nodegroup1  | flavor-1    | 3          | master    |
    +------+-------------+-------------+------------+-----------+
    | ...  | nodegroup2  | flavor-2    | 5          | worker    |
    +------+-------------+-------------+------------+-----------+

* show details of an existing nodegroup::

    openstack coe node-group show <cluster> <nodegroup>

    +---------------------+-------------------------------------------+
    | Property            | Value                                     |
    +---------------------+-------------------------------------------+
    | uuid                | 5b2ee3b5-2f85-4917-be7c-11a2c82031ad      |
    | name                | nodegroup1                                |
    | cluster uuid        | <uuid-cluster1>                           |
    | project id          | <uuid-project1>                           |
    | docker volume size  | 5                                         |
    | labels              | <label1>, <label2>, <label3>              |
    | flavor id           | flavor1                                   |
    | node count          | 3                                         |
    | node addresses      | <ip-node1>, <ip-node2>, <ip-node3>        |
    | role                | master                                    |
    +---------------------+-------------------------------------------+
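
To make the relation between the CLI commands and the new endpoint more
concrete, the sketch below maps each command to a request against
'/clusters/{cluster_id}/nodegroups'. Only the base path comes from this spec.
The HTTP methods, the per-nodegroup sub-path and the helper name
``build_request`` are assumptions made for illustration.

```python
NODEGROUPS_PATH = "/clusters/{cluster_id}/nodegroups"

# CLI action -> (assumed HTTP method, whether a nodegroup id is required).
ACTIONS = {
    "create": ("POST", False),
    "list": ("GET", False),
    "show": ("GET", True),
    "update": ("PATCH", True),
    "delete": ("DELETE", True),
}


def build_request(action, cluster_id, nodegroup_id=None):
    """Return the (method, path) pair for a nodegroup CLI action."""
    method, needs_id = ACTIONS[action]
    path = NODEGROUPS_PATH.format(cluster_id=cluster_id)
    if needs_id:
        if nodegroup_id is None:
            raise ValueError(f"'{action}' requires a nodegroup id")
        path = f"{path}/{nodegroup_id}"
    return method, path


print(build_request("list", "cluster1"))
# prints ('GET', '/clusters/cluster1/nodegroups')
```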

Backward Compatibility
----------------------

In this section, clusters created before the introduction of Magnum
nodegroups are referred to as "old clusters".

During the upgrade, the existing stacks will not be modified. For this reason,
adding nodegroups to old clusters and deleting nodegroups from them will not
be permitted.
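
A minimal sketch of such a restriction is shown below. The exception class,
the function name and the ``cluster_is_old`` flag are hypothetical; how the
API would actually detect an old cluster is not defined in this spec.

```python
class OperationNotPermitted(Exception):
    """Hypothetical error type, not an existing Magnum exception."""


# Per the text above, only adding and deleting nodegroups is blocked.
BLOCKED_ON_OLD_CLUSTERS = frozenset({"create", "delete"})


def check_nodegroup_operation(operation: str, cluster_is_old: bool) -> None:
    """Raise if the operation is not allowed on this kind of cluster."""
    if cluster_is_old and operation in BLOCKED_ON_OLD_CLUSTERS:
        raise OperationNotPermitted(
            f"nodegroup {operation} is not permitted on clusters created "
            "before the introduction of nodegroups"
        )


check_nodegroup_operation("show", cluster_is_old=True)     # allowed
check_nodegroup_operation("create", cluster_is_old=False)  # allowed
```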

Showing details for a nodegroup in an old cluster should work correctly.

Security Impact
---------------

No keypair is added to the nodegroup object, as all nodegroups will inherit
the one set on the cluster. This approach was chosen in order not to propagate
the use of keypairs to the nodegroup level, which would further complicate
their removal in the future.

Notifications Impact
--------------------

New notifications will be added for:

* nodegroup creation
* nodegroup deletion
* nodegroup update

Other End User Impact
---------------------

New subcommands will be added to the openstack client as described above.

At the same time, some of the existing commands for managing clusters have to
be adapted:

Cluster Create
~~~~~~~~~~~~~~

The existing cluster create CLI will result in a cluster with two default
nodegroups, one for the master node(s) and one for the worker(s).

Cluster Delete
~~~~~~~~~~~~~~

When the user deletes a cluster, all the associated nodegroups will be deleted
as well. There is no point in making the user delete all the nodegroups
separately before deleting the cluster.

Cluster Update
~~~~~~~~~~~~~~

Cluster update should continue working for the already existing clusters and
should be deprecated for the new ones. All scaling operations for new clusters
should be done using the "node-group update" command.

Cluster Show
~~~~~~~~~~~~

First, the node count of the cluster should reflect the sum of the node count
fields of all its nodegroups.

Second, the status of the cluster has to be handled. The cluster show CLI
should summarize the status of the cluster's nodegroups, since each stack has
its own status.
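
One possible way to summarize the nodegroup statuses is sketched below. The
spec does not define the summarizing rule, so the precedence used here (any
failure first, then any operation in progress, then completion) is an
assumption, as is the function name. The status strings follow the
``<ACTION>_<STATE>`` naming used for stack statuses.

```python
from typing import Iterable


def summarize_status(nodegroup_statuses: Iterable[str]) -> str:
    """Collapse per-nodegroup statuses into one cluster-level status."""
    statuses = list(nodegroup_statuses)
    # Assumed precedence: failures, then in-progress work, then completion.
    for suffix in ("_FAILED", "_IN_PROGRESS", "_COMPLETE"):
        for status in statuses:
            if status.endswith(suffix):
                return status
    raise ValueError("no recognizable nodegroup status was given")


print(summarize_status(["CREATE_COMPLETE", "UPDATE_IN_PROGRESS"]))
# prints UPDATE_IN_PROGRESS
```

With this rule, a single failed nodegroup is enough for the whole cluster to
be reported as failed.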

Developer Impact
----------------

None.

Implementation
--------------

The implementation will be done in 4 phases.

1. Add the new API endpoint and data model entity, and the corresponding
   controller implementation linked to each driver. At this point, all drivers
   will declare every nodegroup operation as 'Not Implemented'. In the same
   phase, all the cluster management operations need to be adapted.

2. Implement the nodegroup functionality for all drivers.

3. Add the new command line tools to the openstack client.

4. Implement the Magnum nodegroup notifications for creation, deletion and
   update.
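
The split between phases 1 and 2 could look roughly like the sketch below.
The class and method names are hypothetical and do not reflect the actual
Magnum driver interface. The base class declares every nodegroup operation as
not implemented (phase 1), and a concrete driver later overrides them
(phase 2).

```python
class BaseDriver:
    """Phase 1: every nodegroup operation is declared but not implemented."""

    def create_nodegroup(self, context, cluster, nodegroup):
        raise NotImplementedError("create_nodegroup is not implemented")

    def update_nodegroup(self, context, cluster, nodegroup):
        raise NotImplementedError("update_nodegroup is not implemented")

    def delete_nodegroup(self, context, cluster, nodegroup):
        raise NotImplementedError("delete_nodegroup is not implemented")


class ExampleDriver(BaseDriver):
    """Phase 2: a driver fills in the operations one by one."""

    def create_nodegroup(self, context, cluster, nodegroup):
        # Placeholder body; a real driver would provision the nodes here.
        return f"creating {nodegroup} in {cluster}"


driver = ExampleDriver()
print(driver.create_nodegroup(None, "cluster1", "nodegroup2"))
# prints creating nodegroup2 in cluster1
```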

Assignee(s)
-----------

Primary assignee:
  <ttsiouts>

Work Items
----------

See `Implementation`_.

Testing
-------

A new set of unit and functional tests covering creation, deletion and update
of nodegroups is needed. At the same time, the existing tests for cluster
creation, deletion and update should be adapted.

Documentation Impact
--------------------

New documentation will be added to describe the new API endpoint and its
functionality as well as the changes in the existing cluster API.

References
----------

Magnum Nodegroups Blueprint:
https://blueprints.launchpad.net/magnum/+spec/magnum-nodegroups