diff --git a/specs/approved/agent-http-provisioning.rst b/specs/approved/agent-http-provisioning.rst index 0808937a..a006d03a 100644 --- a/specs/approved/agent-http-provisioning.rst +++ b/specs/approved/agent-http-provisioning.rst @@ -11,8 +11,8 @@ The direct deploy interface provisioning with HTTP server https://storyboard.openstack.org/#!/story/1598852 This spec proposes a mechanism to provision baremetal nodes by hosting custom -HTTP service as an image source provider to ``direct`` deploy interface, when -the Image service is utilized. +HTTP service as an image source provider to the ``direct`` deploy interface, +when the Image service is utilized. Problem description =================== @@ -30,7 +30,7 @@ The problem is the Object Storage service is not always adopted in a deployment due to various reasons, and itself imposes restrictions on deployment. E.g.: -* It profits low for a small cloud but takes more hardware resource. +* It has little benefit for a small cloud but takes more hardware resource. * It requires baremetal nodes to have access to control plane network, which is a restriction to network topology. * It requires the Image service be configured with a backend of swift, which @@ -49,46 +49,104 @@ work. Currently there are two scenarios, if the ``instance_info['image_source']`` indicates it's a glance image, the ``direct`` deploy interface generates tempurl via glance client, and stores it to ``instance_info['image_url']``, -otherwise it will be used directly as ``image_url``. Typically the two -cases represent using the Bare Metal service in the cloud or as a standalone +otherwise it will be directly taken as ``image_url``. The two cases typically +represent using the Bare Metal service in the cloud or as a standalone service, respectively. -Introduces a new string option ``[agent]image_download_source`` to control -which kind of image URL will be generated when the ``image_source`` is a -glance image. 
Allowed values are ``swift`` and ``http``, defaults to ``swift``. +The proposal introduces a new string option ``[agent]image_download_source`` +to control which kind of image URL will be generated when the ``image_source`` +is a glance image. Allowed values are ``swift`` and ``http``; the default is +``swift``. The process of the ``direct`` deploy interface on different configurations is defined as: -* ``swift``: keeps current logic, generates tempurl and update it to +* ``swift``: Keeps current logic, generates tempurl and updates it to ``instance_info['image_url']``. -* ``http``: downloads instance image via ``ImageCache`` before node - deployment, makes the cached image accessible by local HTTP service, - generates proper URL and updates it to ``instance_info['image_url']``. +* ``http``: Downloads instance image via ``InstanceImageCache`` before node + deployment, creates a symbolic link to the downloaded instance image in the + directory accessible by local HTTP service, generates proper URL and updates + it to ``instance_info['image_url']``. -The existing ``[deploy]http_root`` and ``[deploy]http_url`` are reused for -storing instance image symlinks and generating instance image URLs. A new -string option ``[deploy]http_image_path`` is introduced to keep it isolated -with iPXE related scripts. The default value is ``agent_images``. +The existing ``[deploy]http_root`` is reused for storing symbolic links to +downloaded instance images. A new string option ``[deploy]http_image_subdir`` +is introduced to keep them isolated from iPXE-related scripts. The default value +is ``agent_images``. The existing ``[deploy]http_url`` is reused to generate +instance image URLs. The ``direct`` deploy interface will use the same instance cache for image -caching, this will be performed at ``AgentDeploy.deploy``. After an instance -image is cached, the ``direct`` deploy interface creates a soft symlink at -``/`` to reference the instance image. 
It will be -``/httpboot/agent_images/`` if all goes to default. +caching, the caching will be performed at ``AgentDeploy.deploy``. After an +instance image is cached, the ``direct`` deploy interface creates a symbolic +link at ``/`` to reference the instance image. +It will be ``/httpboot/agent_images/`` if all goes to default. The ``direct`` deploy interface generates URL for the instance image and updates it to ``instance_info`` at ``AgentDeploy.prepare``. The corresponding -image URL will be ``//``. If -``[DEFAULT]force_raw_images`` is set to true, checksum will be recalculated -and updated as well. It is highly encouraged to set it false for better -performance. - -The symbolic link will be removed at ``AgentDeploy.deploy`` when a node deploy -is done, or ``AgentDeploy.clean_up`` when a node is teared down from the state +image URL will be ``//``. The symbolic +link will be removed at ``AgentDeploy.deploy`` when a node deploy is done, or +``AgentDeploy.clean_up`` when a node is torn down from the state ``deploy failed``. +Rule to convert image +--------------------- + +Currently the ``iscsi`` deploy interface will convert the image to ``raw`` if +``[DEFAULT]force_raw_images`` is set to True. + +IPA, meanwhile, treats the instance image in two different ways: + +* If the instance image format is ``raw``, ``stream_raw_images`` is True and + the image type is whole disk image, the image will be streamed into the + target disk of the Bare Metal. +* Otherwise the image will be cached into memory before being written to disk. + +To avoid a raw image being cached into the memory of Bare Metal, the ``direct`` +deploy interface will convert image to raw only if the following criteria are met: + +* ``[DEFAULT]force_raw_images`` is set to True, +* ``[agent]stream_raw_images`` is set to True, +* The instance image type is a whole disk image. + +The ``direct`` deploy interface will recalculate the MD5 checksum and update the +necessary fields in ``instance_info`` if image conversion happened. 
+ +Cache sharing +------------- + +The ``iscsi`` and ``direct`` deploy interfaces share the same cache, but apply +different rules to decide whether the image should be converted to raw. This +leads to a cache compatibility issue when both interfaces are in use. + +As an example, suppose we deploy node A (using ``iscsi``) with a partition +image, then deploy node B (using ``direct``) with the same image. The image in +the cache has already been converted to raw, but by its own rule the +``direct`` deploy interface assumes no conversion and passes ``force_raw`` as +false to the image cache. Because of the cache hit, no image action is +performed, so the ``direct`` deploy interface actually provides a raw image +without recalculating the MD5 checksum. + +Conversely, if we reverse the order above, the ``iscsi`` deploy interface may +get a qcow image with ``[DEFAULT]force_raw_images`` set to true. This is +probably not an issue because populate_image will check image format before +writing, but it is still not a consistent behavior. + +To address the issue described above, this spec proposes to update +``ImageCache.fetch_image`` to take the input argument ``force_raw`` into +account for the master image file name: + +* The master file name is not changed if ``force_raw`` is set to ``False``. +* The master file name will have ``.converted`` as file extension if + ``force_raw`` is set to ``True``, e.g.:: + + /var/lib/ironic/master_images/6e2c5132-db24-4e0d-b612-478c3539da1e.converted + +Note that the ``.converted`` extension merely acts as an indicator that the +image downloaded has gone through the conversion logic. For a raw image in +glance, the name of the master image file still has ``.converted`` as long as +the ``force_raw`` argument passed in is True. 
+ + Alternatives ------------ @@ -178,19 +236,21 @@ Scalability impact Instance images will be cached on the ironic conductor node once the ``[agent]image_download_source`` is set to ``http``, it will cost more disk space if the conductor node is using ``direct`` deploy interface before. -The expected space usage basically should be the same with ``iscsi`` -deploy interface. +The expected space usage should be no more than with the ``iscsi`` deploy +interface. IPA downloads instance image directly from the conductor node, which will reduce traffic on the control plane network, by the cost of increasing traffic -on each conductor node. Substantially the consumption should be equivalent -with the ``iscsi`` deploy interface if ``[DEFAULT]force_raw_images`` is set to -true. +on each conductor node. The consumption should be no more than with the +``iscsi`` deploy interface. Performance Impact ------------------ -None +Depending on the hardware and image type, recalculating the MD5 checksum for a +raw image could consume a considerable amount of CPU/IO resources. If +performance on the ironic conductor node is a concern, set +``[DEFAULT]force_raw_images`` to ``False`` (the option is ``True`` by default). Other deployer impact --------------------- @@ -198,6 +258,10 @@ Other deployer impact When using this feature, an HTTP server should be set up and configured on each ironic conductor node. +Each HTTP server should be configured to follow symlinks so that instance +images are accessible to external requests. Refer to ``FollowSymLinks`` for +Apache HTTP server, or ``disable_symlinks`` for Nginx HTTP server. + Developer impact ---------------- @@ -212,6 +276,9 @@ Assignee(s) Primary assignee: kaifeng +Other contributors: + sambetts + Work Items ---------- @@ -235,11 +302,17 @@ Upgrades and Backwards Compatibility ==================================== Two new options ``[agent]image_download_source`` and -``[deploy]http_image_path`` are introduced in this feature. 
+``[deploy]http_image_subdir`` are introduced in this feature. ``[agent]image_download_source`` defaults to ``swift``, which should have no impact on upgrades. +The change to the cache file naming could invalidate some cached instance +images after upgrades. They will be re-cached when used, and images no longer +referenced will be cleaned up eventually. This will have no impact if caching +is disabled before upgrade. + + Documentation Impact ====================