author Arne Wiebalck <Arne.Wiebalck@cern.ch> 2018-12-11 15:26:34 +0100
committer Arne Wiebalck <Arne.Wiebalck@cern.ch> 2018-12-18 12:29:46 +0100
commit f4ec59ae861358c491bfbfc9adb660dbc5bec787
tree de3add499d371c52716d0e821e5448390f69a210
parent e5c3ea7b9e8138897105f0f17dc62a3b47ca2c1e
Support for Software RAID

This spec proposes to add support for software RAID to Ironic.

Change-Id: I64fa27eac172016da5588156be3c0f2e3c5c6c31
Notes (review):
    Code-Review+2: Jim Rollenhagen <jim@jimrollenhagen.com>
    Code-Review+2: Julia Kreger <juliaashleykreger@gmail.com>
    Workflow+1: Julia Kreger <juliaashleykreger@gmail.com>
    Verified+2: Zuul
    Submitted-by: Zuul
    Submitted-at: Mon, 31 Dec 2018 15:51:48 +0000
    Reviewed-on: https://review.openstack.org/624413
    Project: openstack/ironic-specs
    Branch: refs/heads/master
-rw-r--r-- specs/approved/software-raid.rst 236
l--------- specs/not-implemented/software-raid.rst 1
2 files changed, 237 insertions, 0 deletions
diff --git a/specs/approved/software-raid.rst b/specs/approved/software-raid.rst
new file mode 100644
index 0000000..3b1284a
--- /dev/null
+++ b/specs/approved/software-raid.rst
@@ -0,0 +1,236 @@
1..
2 This work is licensed under a Creative Commons Attribution 3.0 Unported
3 License.
4
5 http://creativecommons.org/licenses/by/3.0/legalcode
6
7=========================
8Support for Software RAID
9=========================
10
11https://storyboard.openstack.org/#!/story/2004581
12
13This spec proposes to add support for the configuration of software RAIDs.
14
15Analogous to the way hardware RAID is currently set up, the software RAID
16setup shall be done as part of cleaning ("clean-time software RAID"). Admin
17Users define the target RAID config, which is applied whenever the
18node is cleaned, i.e. before it becomes available for instance creation.
19
20In order to allow the End User to provide details on how the software RAID
21shall be configured, the RAID setup should eventually become part of the
22deployment steps. Integrating this into the deployment steps framework,
23however, is beyond the scope of this spec.
24
25
26Problem description
27===================
28
29As it is hardware-agnostic, flexible, reliable, and easy to use, software RAID
30has become a popular choice to protect against disk device failures, also in
31production setups. Large deployments, such as the ones at Oath or CERN, rely
32on software RAID for their various services.
33
34Ironic's current lack of support for such setups forces Deployers and Admins
35to resort to workarounds in order to provide their End Users with physical
36instances based on a software RAID configuration. These workarounds may require
37maintaining an additional installation infrastructure, which is then either
38integrated into the installation process or requires the End User to re-install
39a machine after it has already been provisioned by Ironic in order to
40end up with the desired configuration of the disk devices. This
41increases the complexity for Deployers and Admins, and can also
42decrease the End Users' satisfaction with the overall provisioning and
43installation process.
44
45
46Proposed change
47===============
48
49The proposal is to extend Ironic to support software RAID by:
50
51* using a node's ``target_raid_config`` to specify the desired s/w RAID layout
52 (with some restrictions, see below);
53* adding support in the ``ironic-python-agent`` to understand a software
54 RAID config as specified in a node's ``target_raid_config`` and be able to
55 create and delete such configurations;
56* allowing the ``ironic-python-agent`` to consider s/w RAID devices for
57 deployment, e.g. via root device hints (considering them at all is
58 already addressed in [1]);
59* adding support in Ironic and the ``ironic-python-agent`` to take the
60 necessary steps to boot from a s/w RAID, e.g. installing the boot loader
61 on the correct device(s).
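
To illustrate the root device hint case from the list above, the following is a
minimal sketch of how a hint could select a software RAID device as the deploy
target. The ``match_root_device`` helper and the sample device list are
hypothetical and not part of this spec or of the ``ironic-python-agent``; real
root device hints also support operators, whereas this sketch only does plain
equality matching.

```python
# Hypothetical helper: pick the first block device matching every hint.
def match_root_device(devices, hints):
    """Return the first device whose fields equal every hint, else None."""
    for dev in devices:
        if all(dev.get(key) == value for key, value in hints.items()):
            return dev
    return None


# Made-up inventory: two physical disks plus one software RAID device.
devices = [
    {"name": "/dev/sda", "size": 480, "rotational": False},
    {"name": "/dev/sdb", "size": 480, "rotational": False},
    {"name": "/dev/md0", "size": 100, "rotational": False},
]

# A hint naming the md device selects the software RAID as deploy target.
root = match_root_device(devices, {"name": "/dev/md0"})
```

With a hint such as ``{"name": "/dev/md0"}``, the md device is chosen instead
of one of its holder disks.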
62
63Initially, only the following configurations of the
64``target_raid_config``, as set by the Admin, will be supported:
65
66* a single RAID-1 spanning the available devices and serving as the deploy
67 target device, or
68* a RAID-1 serving as the deploy target device plus a RAID-N where the RAID
69 level N is configurable by the Admin. N can be 0, 1, 5, 6, or 10.
70
71The supported configurations have been limited to these two options in order
72to avoid issues when booting from RAID devices. Having a (small) RAID-1 device
73to boot from is a common approach when setting up more advanced RAID
74configurations: a RAID-1 holder device can look like a standalone disk and does
75not require the bootloader to have any knowledge or capabilities to understand
76more complex RAID configurations.
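
The two supported layouts could be checked along the following lines. This is
only a sketch: the ``logical_disks``, ``size_gb`` and ``raid_level`` keys follow
the existing ``target_raid_config`` format used for hardware RAID, while the
``validate_software_raid`` helper, the assumption that the first logical disk
is the deploy target, and the spelling of the RAID levels (taken from the list
above) are illustrative assumptions, not something this spec defines.

```python
# Levels allowed for the optional second device, spelled as in this spec.
ALLOWED_SECOND_LEVELS = {"0", "1", "5", "6", "10"}


def validate_software_raid(target_raid_config):
    """Return a list of problems; an empty list means the layout is allowed."""
    disks = target_raid_config.get("logical_disks", [])
    problems = []
    if len(disks) not in (1, 2):
        problems.append("expected one or two logical disks, got %d" % len(disks))
        return problems
    # Assumption: the first logical disk is the RAID-1 deploy target.
    if str(disks[0].get("raid_level")) != "1":
        problems.append("first logical disk must be RAID-1")
    # The optional second device may use any level from the allowed set.
    if len(disks) == 2:
        level = str(disks[1].get("raid_level"))
        if level not in ALLOWED_SECOND_LEVELS:
            problems.append("unsupported RAID level for second disk: %s" % level)
    return problems


# A single RAID-1 spanning the available devices.
single = {"logical_disks": [{"size_gb": "MAX", "raid_level": "1"}]}
# A small RAID-1 to boot from plus a RAID-5 for the remaining space.
mixed = {"logical_disks": [{"size_gb": 100, "raid_level": "1"},
                           {"size_gb": "MAX", "raid_level": "5"}]}
# Not supported: the deploy target is not a RAID-1.
bad = {"logical_disks": [{"size_gb": "MAX", "raid_level": "0"}]}
```

Here ``single`` and ``mixed`` pass the check, while ``bad`` is rejected because
its deploy target is not a RAID-1.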
77
78Support for more than one RAID-N, for selecting a subset of
79drives to act as holder devices, and for partitioning the created
80RAID-N device is left for follow-up enhancements and is beyond the scope of
81this specification.
82
83A first prototype very close to the proposal is available from [2][3][4].
84
85Alternatives
86------------
87
88As mentioned above, the alternative is to use other methods to create s/w RAID
89setups on physical nodes and integrate these out-of-band approaches into the
90provisioning workflow of individual deployments. This increases complexity on
91the Deployer/Admin side and can have a negative impact on the user experience
92when creating physical instances which need to have a software RAID setup.
93
94
95Data model impact
96-----------------
97
98None.
99
100
101State Machine Impact
102--------------------
103
104None.
105
106
107REST API impact
108---------------
109
110None.
111
112
113Client (CLI) impact
114-------------------
115
116None.
117
118"ironic" CLI
119~~~~~~~~~~~~
120None.
121
122"openstack baremetal" CLI
123~~~~~~~~~~~~~~~~~~~~~~~~~
124None.
125
126RPC API impact
127--------------
128
129None.
130
131Driver API impact
132-----------------
133
134The proposed functionality could be consolidated into a new RAID interface.
135
136Nova driver impact
137------------------
138
139None.
140
141Ramdisk impact
142--------------
143
144The ``ironic-python-agent`` will need to be able to:

145* set up and clean up software RAID devices
146* consider software RAID devices for deployment
147* configure the holder devices of the RAID-1 device such that they are bootable
148
149This functionality could be consolidated in an additional RAID interface.
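
As a rough illustration of the setup and cleanup tasks, the agent could
assemble ``mdadm`` command lines such as the ones below. This is a sketch under
assumptions: the spec does not define which tool or options the agent uses, the
helper names are hypothetical, and the functions only build argument lists
without executing anything.

```python
def mdadm_create_cmd(md_device, level, holders):
    """Build (but do not run) an mdadm argv list for creating one array."""
    if len(holders) < 2:
        raise ValueError("a RAID needs at least two holder devices")
    return (["mdadm", "--create", md_device,
             "--level=%s" % level,
             "--raid-devices=%d" % len(holders)] + list(holders))


def mdadm_delete_cmds(md_device, holders):
    """Build the argv lists to stop an array and wipe its holders' metadata."""
    cmds = [["mdadm", "--stop", md_device]]
    cmds += [["mdadm", "--zero-superblock", holder] for holder in holders]
    return cmds


# Example device names are made up for illustration.
create = mdadm_create_cmd("/dev/md0", "1", ["/dev/sda1", "/dev/sdb1"])
delete = mdadm_delete_cmds("/dev/md0", ["/dev/sda1", "/dev/sdb1"])
```

Keeping command construction separate from execution makes the logic easy to
unit test without touching real disks.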
150
151Security impact
152---------------
153
154None.
155
156Other end user impact
157---------------------
158
159While the predefined RAID-1 ensures that a system should be able to boot,
160End Users need to be aware that the kernel of the booted image must
161support software RAID devices.
162
163Scalability impact
164------------------
165
166None.
167
168Performance Impact
169------------------
170
171None.
172
173Other deployer impact
174---------------------
175
176Deployers will need to be aware that the configuration and cleanup of
177the RAID-N devices is only done during cleaning, so any changes require
178the node to be cleaned. Also, the config is not configurable by the End
179User, but limited to Admins (as the ``target_raid_config`` is a node
180property). All of this, however, already holds true for hardware RAID
181configurations.
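
For hardware RAID, applying a changed ``target_raid_config`` is typically done
through manual cleaning with the RAID interface's ``delete_configuration`` and
``create_configuration`` steps. The sketch below builds such a request body. It
assumes that software RAID would reuse the same step names, which this spec
does not state; ``build_clean_request`` is a hypothetical helper.

```python
def build_clean_request(recreate=True):
    """Build the body of a manual cleaning request for the RAID interface."""
    # Remove the existing RAID configuration first.
    steps = [{"interface": "raid", "step": "delete_configuration"}]
    if recreate:
        # Then apply the node's current target_raid_config.
        steps.append({"interface": "raid", "step": "create_configuration"})
    return {"target": "clean", "clean_steps": steps}


payload = build_clean_request()
```

The resulting dictionary would be sent as the JSON body of the provision state
change that moves the node into cleaning.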
182
183Developer impact
184----------------
185
186None.
187
188Implementation
189==============
190
191An initial proof-of-concept is available from [2][3][4].
192
193Assignee(s)
194-----------
195
196Primary assignee:
197 None.
198
199Other contributors:
200 Arne.Wiebalck@cern.ch (arne_wiebalck)
201
202Work Items
203----------
204
205This is to be defined once the overall idea is accepted and there's agreement
206on a design.
207
208Dependencies
209============
210
211None.
212
213Testing
214=======
215
216TBD
217
218Upgrades and Backwards Compatibility
219====================================
220
221None.
222
223Documentation Impact
224====================
225
226How to configure a software RAID, along with the limitations outlined in
227'Other deployer impact', needs to be documented.
228
229References
230==========
231
232[1] https://review.openstack.org/#/c/592639
233[2] CERN Hardware Manager: https://github.com/cernops/cern-ironic-hardware-manager/commit/7f6d892ec4848a09000ed1f28f3137bf8ba917f0
234[3] Patched Ironic Python Agent: https://github.com/cernops/ironic-python-agent/commit/bddac76c4d100af0103a6bc08b81dd71681a9c02
235[4] Patched Ironic: https://github.com/cernops/ironic/commit/581e65f1d8986ac3e859678cb9aadd5a5b06ba60
236
diff --git a/specs/not-implemented/software-raid.rst b/specs/not-implemented/software-raid.rst
new file mode 120000
index 0000000..1d15073
--- /dev/null
+++ b/specs/not-implemented/software-raid.rst
@@ -0,0 +1 @@
../approved/software-raid.rst
\ No newline at end of file