diff --git a/specs/xena/approved/vm-evacuations-for-host-recovery.rst b/specs/xena/approved/vm-evacuations-for-host-recovery.rst
index 6bb1282..b72ca9c 100644
--- a/specs/xena/approved/vm-evacuations-for-host-recovery.rst
+++ b/specs/xena/approved/vm-evacuations-for-host-recovery.rst
@@ -14,40 +14,33 @@ https://blueprints.launchpad.net/masakari/+spec/vm-evacuations-for-host-recovery
 Problem description
 ===================

-If one compute node failed, Masakari will evacuate the instances
-from the failed host.
+If a compute node fails, Masakari will evacuate the instances from
+the failed host.

-Generally, the resources of computing nodes are gradually reduced.
-If a large number of hosts fail at the same time and there are not
-enough computing node resources, the operator needs to set the
-priority of instance evacuation in advance to ensure that the
-evacuation can be carried out in the order of priority.
+If a large number of hosts fail at the same time, the resources of
+computing nodes are dramatically reduced, and there may not be enough
+resources for all instances to recover. It is therefore reasonable
+to evacuate the most important instances first, and to abort
+evacuations once the cloud environment reaches an irreversible
+condition.

-When the failed compute node recovers, in order to make full
-use of the computing node resources and due to some instances
-needing to run on a specific computing node, the operator wants
-to migrate the instance back to the original node.
+When the failed hosts come back, the restored resources may lie
+idle. To make full use of them, instances need to be moved to the
+restored hosts. Sometimes instances are distributed across hosts on
+purpose, and moving VMs automatically (for example by DRS) or
+manually could disturb that distribution. So it is a good idea to
+record the evacuations when a host fails, and to move the instances
+back according to those records when the host is restored.

 Proposed change
 ===============

-This spec is mainly to record instance evacuation information in
-the database, provide two interfaces to support obtaining all
-evacuation information lists and specific evacuation information
-details, and prepare relevant information for supporting the
-migration of the instances back to previously failed hosts.
-
-* Record instance evacuation information in the database, mainly
-  including instance_id, notification_id, source_host, dest_host,
-  status. The status is pending.
-* User can get evacuation information about a specific masakari
-  notification by ``GET /notifications//evacuations``
-  API.
-* User can get detailed information about a specific evacuation record
-  of a particular masakari notification by
-  ``GET /notifications//evacuations/``
-  API.
+This spec proposes recording VM move information in the database,
+mainly instance_uuid, notification_uuid, source_host, dest_host,
+type, status, start_time and end_time. Users can get the VM move
+information of a 'COMPUTE_HOST' type notification through the
+vmove API.

 Alternatives
 ------------
@@ -57,64 +50,85 @@ None
 Data model impact
 -----------------

-The table ``evacuation`` will be added into the Masakari database.
+The table ``vmoves`` will be added to the Masakari database.

 * created_at: Datetime.
 * updated_at: Datetime.
 * deleted_at: Datetime.
 * deleted: Boolean.
-* uuid: UUID. uuid of evacuation record.
-* notification_uuid: UUID. uuid of notification.
-* instance_uuid: UUID. uuid of instance.
-* source_host_name: String. The source compute node before the instance
-  evacuated.
-* dest_host_name: String. The destination compute node after the instance is
-  evacuated.
-* status: String. Represents possible statuses for notifications, such as
-  pending, ongoing, ignored, failed and succeeded.
-* status_details: String. Store the details reason of evacuate failed/ignored.
-* priority: Numeric. Set the evacuation priority and support the
-  evacuation of instances in order. The default value is 1.
+* uuid: UUID. UUID of the vmove.
+* notification_uuid: UUID. UUID of the notification the vmove belongs to.
+* instance_uuid: UUID. UUID of the instance.
+* instance_name: String. Name of the instance.
+* source_host: String. Source host name of the vmove.
+* dest_host: String. Destination host name of the vmove.
+* start_time: Datetime. Start time of the vmove.
+* end_time: Datetime. End time of the vmove.
+* type: String. Type of the vmove, such as migration,
+  live_migration or evacuation.
+* status: String. Status of the vmove, such as pending, ongoing,
+  ignored, failed or succeeded.
+* message: String. Meaningful information shown if the vmove
+  failed or was ignored.

 REST API impact
 ---------------

-Following changes will be introduced in a new API micro-version.
+The following vmove API will be introduced in a new API micro-version.

-* GET /notifications//evacuations
+* GET /notifications//vmoves

   response example::

     {
-      "evacuations": [
+      "vmoves": [
         {
           "uuid": "239f95ca-fd46-44d2-8ff8-35e8a9c94f69",
           "instance_uuid": "33826ebd-af0f-445d-833f-e06340f7ae1c",
+          "instance_name": "vm-1",
           "notification_uuid": "c0fa1a39-c150-4b86-ae97-8fae31700c67",
-          "source_host_name": "node01",
-          "dest_host_name": "node02",
-          "status": "pending",
-          "status_details": "",
-          "priority": "1"
+          "source_host": "node01",
+          "dest_host": "node02",
+          "start_time": "2022-11-22 14:50:22",
+          "end_time": "2022-11-22 14:50:35",
+          "type": "evacuation",
+          "status": "succeeded",
+          "message": null
+        },
+        {
+          "uuid": "65a5da84-5819-4aea-8278-a28d2b489028",
+          "instance_uuid": "e1a5a45b-f251-47cf-9c5f-fa1e66e1286a",
+          "instance_name": "vm-2",
+          "notification_uuid": "c0fa1a39-c150-4b86-ae97-8fae31700c67",
+          "source_host": "node01",
+          "dest_host": "node02",
+          "start_time": "2022-11-22 14:50:23",
+          "end_time": "2022-11-22 14:50:38",
+          "type": "evacuation",
+          "status": "succeeded",
+          "message": null
         }
       ]
     }

-* GET /notifications//evacuations/
+* GET /notifications//vmoves/

   response example::

     {
-      "evacuation":
+      "vmove":
       {
         "uuid": "239f95ca-fd46-44d2-8ff8-35e8a9c94f69",
         "instance_uuid": "33826ebd-af0f-445d-833f-e06340f7ae1c",
+        "instance_name": "vm-1",
         "notification_uuid": "c0fa1a39-c150-4b86-ae97-8fae31700c67",
-        "source_host_name": "node01",
-        "dest_host_name": "node02",
-        "status": "pending",
-        "status_details": "",
-        "priority": "1"
+        "source_host": "node01",
+        "dest_host": "node02",
+        "start_time": "2022-11-22 14:50:22",
+        "end_time": "2022-11-22 14:50:35",
+        "type": "evacuation",
+        "status": "succeeded",
+        "message": null
       }
     }

@@ -131,8 +145,8 @@ None
 Other end user impact
 ---------------------

-The python-masakariclient, masakari-dashboard and openstacksdk will be updated
-to support instance evacuations for host recovery in a new micro-version.
+The masakari-dashboard and openstacksdk will be updated to support
+VM moves for host type notifications in a new micro-version.

 Performance Impact
 ------------------
@@ -157,11 +171,7 @@ Assignee(s)

 Primary assignee:

-* suzhengwei
-
-Historical assignee (pre-Yoga):
-
-* shenxinxin
+* suzhengwei

 Work Items
 ----------
@@ -169,13 +179,12 @@ Work Items

 * Create the object definition, database schema, updating engine to handle
   this.

-* Create a new API microversion to get information for all evacuations
-  and get detailed information about a particular evacuation.
+* Create a new API microversion to get information for all vmoves
+  and get detailed information about a particular vmove.

-* Update docs for instance evacuations for host recovery
+* Update docs about VM moves for host recovery.

-* Update python-masakariclient, masakari-dashboard and openstacksdk to
-  manage instance evacuations for host recovery.
+* Update masakari-dashboard and openstacksdk to manage VM moves.

 * Add unit and functional tests.
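
Review note: the sketch below shows one way a client might consume the ``vmoves`` list response described in the REST API impact section. It is illustrative only. The ``VMove`` dataclass, the ``parse_vmoves`` helper and the ``TIME_FORMAT`` constant are hypothetical names, not part of Masakari or openstacksdk, and the timestamp format is assumed from the response example rather than stated by the spec.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

# Assumption: timestamps use the format shown in the response example.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


@dataclass
class VMove:
    """Hypothetical client-side view of one vmove record."""
    uuid: str
    instance_uuid: str
    instance_name: str
    notification_uuid: str
    source_host: str
    dest_host: str
    start_time: Optional[datetime]
    end_time: Optional[datetime]
    type: str
    status: str
    message: Optional[str]

    def duration_seconds(self) -> Optional[float]:
        """Elapsed time of the move, or None if it has not finished."""
        if self.start_time is None or self.end_time is None:
            return None
        return (self.end_time - self.start_time).total_seconds()


def _parse_time(value: Optional[str]) -> Optional[datetime]:
    # A null or empty timestamp is kept as None.
    return datetime.strptime(value, TIME_FORMAT) if value else None


def parse_vmoves(body: str) -> List[VMove]:
    """Turn the JSON body of the list response into VMove objects."""
    records = json.loads(body)["vmoves"]
    return [
        VMove(**{
            **rec,
            "start_time": _parse_time(rec.get("start_time")),
            "end_time": _parse_time(rec.get("end_time")),
        })
        for rec in records
    ]


# First record of the list response example in the spec.
SAMPLE = """
{"vmoves": [{
  "uuid": "239f95ca-fd46-44d2-8ff8-35e8a9c94f69",
  "instance_uuid": "33826ebd-af0f-445d-833f-e06340f7ae1c",
  "instance_name": "vm-1",
  "notification_uuid": "c0fa1a39-c150-4b86-ae97-8fae31700c67",
  "source_host": "node01",
  "dest_host": "node02",
  "start_time": "2022-11-22 14:50:22",
  "end_time": "2022-11-22 14:50:35",
  "type": "evacuation",
  "status": "succeeded",
  "message": null
}]}
"""

if __name__ == "__main__":
    for vmove in parse_vmoves(SAMPLE):
        print(vmove.instance_name, vmove.status, vmove.duration_seconds())
```

Keeping ``start_time`` and ``end_time`` optional lets the same structure hold records that are still ``pending`` or ``ongoing``, assuming the API returns null for timestamps that are not yet set.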