Merge "Support for Software RAID"
This commit is contained in:
commit
74a16dd167
|
@ -0,0 +1,236 @@
|
|||
..
|
||||
This work is licensed under a Creative Commons Attribution 3.0 Unported
|
||||
License.
|
||||
|
||||
http://creativecommons.org/licenses/by/3.0/legalcode
|
||||
|
||||
=========================
|
||||
Support for Software RAID
|
||||
=========================
|
||||
|
||||
https://storyboard.openstack.org/#!/story/2004581
|
||||
|
||||
This spec proposes to add support for the configuration of software RAIDs.
|
||||
|
||||
In analogy to the way hardware RAIDs are currently set up, the RAID setup
|
||||
shall be done as part of the cleaning ("clean-time software RAID"). Admin
|
||||
Users define the target RAID config which will be applied whenever the
|
||||
node is cleaned, i.e. before it becomes available for instance creation.
|
||||
|
||||
In order to allow the End User to provide details on how the software RAID
|
||||
shall be configured, the RAID setup should eventually become part of the
|
||||
deployment steps. Integrating this into the deployment steps framework,
|
||||
however, is beyond the scope of this spec.
|
||||
|
||||
|
||||
Problem description
|
||||
===================
|
||||
|
||||
As it is hardware agnostic, flexible, reliable, and easy to use, software RAID
|
||||
has become a popular choice to protect against disk device failures - also in
|
||||
production setups. Large deployments, such as the ones at Oath or CERN, rely
|
||||
on software RAID for their various services.
|
||||
|
||||
Ironic's current lack of support for such setups requires Deployers and Admins
|
||||
to withdraw to workarounds in order to provide their End Users with physical
|
||||
instances based on a software RAID configuration. These workarounds may require
|
||||
to maintain an additional installation infrastructure which is then either
|
||||
integrated into the installation process or requires the End User to re-install
|
||||
a machine a second time after it has been already provisioned by Ironic to
|
||||
eventually end up with the desired configuration of the disk devices. This
|
||||
increases the complexity for Deployers and Admins, and can also lead to a
|
||||
decrease of the End Users' satisfaction with the overall provisioning and
|
||||
installation process.
|
||||
|
||||
|
||||
Proposed change
|
||||
===============
|
||||
|
||||
The proposal is to extend Ironic to support software RAID by:
|
||||
|
||||
* using a node's ``target_raid_config`` to specify the desired s/w RAID layout
|
||||
(with some restrictions, see below);
|
||||
* adding support in the ``ironic-python-agent`` to understand a software
|
||||
RAID config as specified in a node's ``target_raid_config`` and be able to
|
||||
create and delete such configurations;
|
||||
* allow the ``ironic-python-agent`` to consider s/w RAID devices for
|
||||
deployment, e.g. via root device hints (considering them at all is
|
||||
already addressed in [1]);
|
||||
* adding support in Ironic and the ``ironic-python-agent`` to take the
|
||||
necessary steps to boot from a s/w RAID, e.g. installing the boot loader
|
||||
on the correct device(s).
|
||||
|
||||
Initially, only the following configurations will be supported for the
|
||||
``target_raid_config`` as to be set by the Admin:
|
||||
|
||||
* a single RAID-1 spanning the available devices and serving as the deploy
|
||||
target device, or
|
||||
* a RAID-1 serving as the deploy target device plus a RAID-N where the RAID
|
||||
level N is configurable by the Admin. N can be 0, 1, 5, 6, or 10.
|
||||
|
||||
The supported configurations have been limited to these two options in order
|
||||
to avoid issues when booting from RAID devices. Having a (small) RAID-1 device
|
||||
to boot from is a common approach when setting up more advanced RAID
|
||||
configurations: a RAID-1 holder device can look like a standalone disk and does
|
||||
not require the bootloader to have any knowledge or capabilities to understand
|
||||
more complex RAID configurations.
|
||||
|
||||
Support for more than one RAID-N, support for the selection of a subset of
|
||||
drives to act as holder devices, as well as support to partition the created
|
||||
RAID-N device are left for follow-up enhancements and beyond the scope of
|
||||
this specification.
|
||||
|
||||
A first prototype very close to the proposal is available from [2][3][4].
|
||||
|
||||
Alternatives
|
||||
------------
|
||||
|
||||
As mentioned above, the alternative is to use other methods to create s/w RAID
|
||||
setups on physical nodes and integrate these out-of-band approaches into the
|
||||
provisioning workflow of individual deployments. This increases complexity on
|
||||
the Deployer/Admin side and can have a negative impact on the user experience
|
||||
when creating physical instances which need to have a software RAID setup..
|
||||
|
||||
|
||||
Data model impact
|
||||
-----------------
|
||||
|
||||
None.
|
||||
|
||||
|
||||
State Machine Impact
|
||||
--------------------
|
||||
|
||||
None.
|
||||
|
||||
|
||||
REST API impact
|
||||
---------------
|
||||
|
||||
None.
|
||||
|
||||
|
||||
Client (CLI) impact
|
||||
-------------------
|
||||
|
||||
None.
|
||||
|
||||
"ironic" CLI
|
||||
~~~~~~~~~~~~
|
||||
None.
|
||||
|
||||
"openstack baremetal" CLI
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
None.
|
||||
|
||||
RPC API impact
|
||||
--------------
|
||||
|
||||
None.
|
||||
|
||||
Driver API impact
|
||||
-----------------
|
||||
|
||||
The proposed functionality could be consolidated into a new RAID interface.
|
||||
|
||||
Nova driver impact
|
||||
------------------
|
||||
|
||||
None.
|
||||
|
||||
Ramdisk impact
|
||||
--------------
|
||||
|
||||
The ``ironic-python-agent`` will need to be able to:
|
||||
* setup and clean software RAID devices
|
||||
* consider software RAID devices for deployment
|
||||
* configure the holder devices of the RAID-1 device in a way they are bootable
|
||||
|
||||
This functionality could be consolidated in an additional RAID interface.
|
||||
|
||||
Security impact
|
||||
---------------
|
||||
|
||||
None.
|
||||
|
||||
Other end user impact
|
||||
---------------------
|
||||
|
||||
While the predefined RAID-1 ensures that a system should be able to boot,
|
||||
End Users need to be aware that the kernel of the started image needs to
|
||||
be able to understand software RAID devices.
|
||||
|
||||
Scalability impact
|
||||
------------------
|
||||
|
||||
None.
|
||||
|
||||
Performance Impact
|
||||
------------------
|
||||
|
||||
None.
|
||||
|
||||
Other deployer impact
|
||||
---------------------
|
||||
|
||||
Deployers will need to be aware that the configuration and clean up of
|
||||
the RAID-N devices is only done during cleaning, so any changes require
|
||||
the node to be cleaned. Also, the config is not configurable by the End
|
||||
User, but limited to admins (as the target_raid_config) is a node
|
||||
property. All of this, however, already holds true for hardware RAID
|
||||
configurations.
|
||||
|
||||
Developer impact
|
||||
----------------
|
||||
|
||||
None.
|
||||
|
||||
Implementation
|
||||
==============
|
||||
|
||||
An inital proof-of-concept is available from [2][3][4].
|
||||
|
||||
Assignee(s)
|
||||
-----------
|
||||
|
||||
Primary assignee:
|
||||
None.
|
||||
|
||||
Other contributors:
|
||||
Arne.Wiebalck@cern.ch (arne_wiebalck)
|
||||
|
||||
Work Items
|
||||
----------
|
||||
|
||||
This is to be defined once the overall idea is accepted and there's agreement
|
||||
on a design.
|
||||
|
||||
Dependencies
|
||||
============
|
||||
|
||||
None.
|
||||
|
||||
Testing
|
||||
=======
|
||||
|
||||
TBD
|
||||
|
||||
Upgrades and Backwards Compatibility
|
||||
====================================
|
||||
|
||||
None.
|
||||
|
||||
Documentation Impact
|
||||
====================
|
||||
|
||||
Documentation on how to configure a software RAID along with the limitations
|
||||
outlined in 'Deployer's Impact' need to be documented.
|
||||
|
||||
References
|
||||
==========
|
||||
|
||||
[1] https://review.openstack.org/#/c/592639
|
||||
[2] CERN Hardware Manager: https://github.com/cernops/cern-ironic-hardware-manager/commit/7f6d892ec4848a09000ed1f28f3137bf8ba917f0
|
||||
[3] Patched Ironic Python Agent: https://github.com/cernops/ironic-python-agent/commit/bddac76c4d100af0103a6bc08b81dd71681a9c02
|
||||
[4] Patched Ironic: https://github.com/cernops/ironic/commit/581e65f1d8986ac3e859678cb9aadd5a5b06ba60
|
||||
|
|
@ -0,0 +1 @@
|
|||
../approved/software-raid.rst
|
Loading…
Reference in New Issue