Enumerate Inspector errors

Propose enumerating Ironic Inspector error messages to make
automation around the Ironic Inspector REST API more robust
and straightforward.

Change-Id: Ib5af833224c33274e23b417da17c71825b26775b
Ilya Etingof 2017-08-17 15:03:02 +02:00
parent 77819e7057
commit 8ff5eaa38b
1 changed files with 247 additions and 0 deletions

@ -0,0 +1,247 @@
..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

==================================
Ironic Inspector Error Enumeration
==================================

https://bugs.launchpad.net/ironic-inspector/+bug/1710945

This blueprint introduces a new field, `error-code`, to the
**Ironic Inspector** API. The new field is intended to make automation
around **Ironic Inspector** easier and more reliable.
Problem description
===================
Currently, if the node inspection process fails for one reason or another,
it may be hard for **Ironic Inspector** REST API consumers to determine
the exact cause of the failure. The only error indication that the
**Ironic Inspector** REST API currently offers (other than the HTTP
status code) is a free-form error message:

.. code-block:: json

    {
        "error": {
            "message": "Diskette drive 0 seek failure"
        }
    }

Proposed change
===============
The proposal is to enumerate **Ironic Inspector** REST API errors by
introducing a new numeric field `error-code` to the
**Ironic Inspector** REST API.
There is probably no need to assign a distinct `error-code` to every
possible `error` message. Instead, a handful of important error classes
may be defined, and each `error` message mapped onto a value from the
resulting `error-code` set.
The collection of generally useful `error-code` values would become
part of a common library consumed by **Ironic Python Agent**,
**Ironic Inspector** and its CLI tool.
Alternatives
------------
Advise **Ironic Inspector** REST API consumers to rely upon the
`error` messages they observe. This would be a somewhat toxic design:
it effectively blocks **Ironic Inspector** developers from changing
error messages (accidental change, rewording, localization) and puts
needless effort on the consumers, while the end product would remain
fragile.
Data model impact
-----------------
The node object in the **Ironic Inspector** database schema would include
a new integer field, `error_code`.
The **Ironic Inspector** REST API would include a new integer
field, `error-code`.
The `error-code` values would encode the exact error (least significant
byte), the more general error class (middle byte) and the severity of
the error (most significant byte):

.. code-block::

    # Severity: most significant byte
    ERROR_SEVERITY_LOW = 0x000000
    ERROR_SEVERITY_HIGH = 0x010000
    ERROR_SEVERITY_FATAL = 0x020000

    # Error class: middle byte
    ERROR_CLASS_NONE = 0x0000
    ERROR_CLASS_IO = 0x0100
    ERROR_CLASS_MEMORY = 0x0200
    ...

    # Exact error: least significant byte
    ERROR_CODE_NONE = 0x00
    ERROR_CODE_BADSECTOR = 0x01
    ERROR_CODE_OOM = 0x02
    ...

    error_code = ERROR_SEVERITY_LOW | ERROR_CLASS_MEMORY | ERROR_CODE_OOM

The existence of the error class would relax the dependency on the exact
error codes among different versions of **Ironic Inspector** and the
surrounding tooling. Even if the client is not aware of the exact
`error-code` it received from **Ironic Inspector**, the client can
still interpret the error class and act according to the encoded
severity.
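
A minimal sketch of how a client might take an `error-code` apart, assuming a three-byte layout with the severity stored in the most significant byte. The mask names and the `split_error_code` helper are hypothetical and are not part of any existing library.

```python
# Hypothetical masks for the assumed three-byte layout.
SEVERITY_MASK = 0xFF0000
CLASS_MASK = 0x00FF00
CODE_MASK = 0x0000FF

# Constants named in this spec; placing the severity in the top byte
# is an assumption of this sketch.
ERROR_SEVERITY_LOW = 0x000000
ERROR_SEVERITY_HIGH = 0x010000
ERROR_SEVERITY_FATAL = 0x020000
ERROR_CLASS_IO = 0x0100
ERROR_CLASS_MEMORY = 0x0200
ERROR_CODE_OOM = 0x02


def split_error_code(error_code):
    """Return the (severity, error class, exact code) parts of error_code."""
    return (error_code & SEVERITY_MASK,
            error_code & CLASS_MASK,
            error_code & CODE_MASK)


packed = ERROR_SEVERITY_HIGH | ERROR_CLASS_MEMORY | ERROR_CODE_OOM
severity, error_class, code = split_error_code(packed)

# A client that has never seen exact code 0x7F can still read the
# class and severity of the value it received.
_, unknown_class, _ = split_error_code(
    ERROR_SEVERITY_LOW | ERROR_CLASS_IO | 0x7F)
```

Keeping each part in its own byte means that new exact codes can be added to a class without older clients losing the ability to classify them.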
Existing databases would have to be migrated to the modified schema.
The initial value of the new `error_code` field would be set to
`<no-error>` (e.g. 0x000000).
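
The effect of such a migration can be illustrated with a small, self-contained sketch. It uses an in-memory SQLite database and a deliberately simplified, hypothetical `nodes` table; the real migration would go through the project's own schema migration tooling and its actual table layout.

```python
import sqlite3

NO_ERROR = 0x000000  # the `<no-error>` value from this spec


def add_error_code_column(conn):
    """Add the integer error_code column, defaulting existing rows to 0."""
    conn.execute(
        "ALTER TABLE nodes "
        "ADD COLUMN error_code INTEGER NOT NULL DEFAULT 0")
    conn.commit()


# Simplified stand-in for the pre-migration schema (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (uuid TEXT PRIMARY KEY, error TEXT)")
conn.execute(
    "INSERT INTO nodes (uuid, error) VALUES (?, ?)",
    ("13211c7a-0402-4a1d-b970-5a44870125f5",
     "Diskette drive 0 seek failure"))

add_error_code_column(conn)

# Rows that existed before the migration now carry the no-error value.
rows = conn.execute("SELECT uuid, error_code FROM nodes").fetchall()
```

Note that rows which already hold an `error` message would still read as `<no-error>` after the migration, since no code can be derived from the free-form text.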
HTTP API impact
---------------
When **Ironic Python Agent** sends the introspection results to
**Ironic Inspector** via the ramdisk callback, the `error-code`
attribute may be present:

.. code-block:: json

    POST /v1/continue
    {
        "inventory":
        {
            ...
        },
        "root_disk": "/dev/sda1",
        "boot_interface": "01-11-22-33-44-55-66",
        "error": "Diskette drive 0 seek failure",
        "error-code": 1234
    }

When **Ironic Inspector** clients (e.g. CLI) retrieve introspection
status, the `error-code` attribute will be present alongside the
existing `error` attribute:

.. code-block:: json

    GET /v1/introspection/13211c7a-0402-4a1d-b970-5a44870125f5
    {
        "finished": true,
        "state": "error",
        "error": "Diskette drive 0 seek failure",
        "error-code": 1234,
        ...
    }

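
As an illustration of the automation this enables, the sketch below decides whether to retry introspection from the status body alone, without parsing the free-form message. The retry policy, the `should_retry` helper and the severity constants (assumed to live in the most significant byte) are hypothetical.

```python
import json

# Assumed layout: severity in the most significant of three bytes.
SEVERITY_MASK = 0xFF0000
ERROR_SEVERITY_FATAL = 0x020000


def should_retry(status_body):
    """Return True if a failed introspection looks worth retrying."""
    status = json.loads(status_body)
    if status.get("state") != "error":
        return False  # nothing failed, nothing to retry
    error_code = status.get("error-code")
    if error_code is None:
        return False  # no classification available; leave to the operator
    # Retry anything below fatal severity.
    return (error_code & SEVERITY_MASK) < ERROR_SEVERITY_FATAL


body = json.dumps({
    "finished": True,
    "state": "error",
    "error": "Diskette drive 0 seek failure",
    "error-code": 1234,
})
retry = should_retry(body)
```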
The **Ironic Python Agent** REST API microversion would have to be bumped.
Client (CLI) impact
-------------------

.. code-block:: bash

    $ openstack baremetal introspection status 13211c7a-0402-4a1d-b970-5a44870125f5
    +-------------+--------------------------------------+
    | Field       | Value                                |
    +-------------+--------------------------------------+
    | error       | Diskette drive 0 seek failure        |
    | error-code  | 1234 (I/O Error)                     |
    | finished    | True                                 |
    | finished_at | 2017-09-01T14:04:58                  |
    | started_at  | 2017-09-01T14:02:12                  |
    | state       | error                                |
    | uuid        | 13211c7a-0402-4a1d-b970-5a44870125f5 |
    +-------------+--------------------------------------+

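
A rough sketch of how the CLI might render the `error-code` value with a human-readable class label. The label table and the `format_error_code` helper are hypothetical; the class values follow the example enumeration in the data model section, and the `1234` in the sample output above is treated as a placeholder rather than a value from that enumeration.

```python
CLASS_MASK = 0x00FF00

# Hypothetical labels keyed by the error class byte.
CLASS_LABELS = {
    0x0000: "No Error",
    0x0100: "I/O Error",
    0x0200: "Memory Error",
}


def format_error_code(error_code):
    """Render an error code as '<number> (<class label>)'."""
    label = CLASS_LABELS.get(error_code & CLASS_MASK, "Unknown Error Class")
    return "%d (%s)" % (error_code, label)


# ERROR_CLASS_IO | ERROR_CODE_BADSECTOR from the example enumeration.
io_cell = format_error_code(0x0100 | 0x01)
```

An unrecognized class falls back to a generic label, so an older CLI can still display codes introduced by a newer server.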
Ironic python agent impact
--------------------------
A new dependency on the common library enumerating error codes
would be introduced.
Performance and scalability impact
----------------------------------
None.
Security impact
---------------
None.
Deployer impact
---------------
Deployers would be able to build automation using the same library
as the ironic-* projects when processing or reporting errors.
Developer impact
----------------
Developers should adhere to the standardized error codes. Introducing
a new error code will require an update to the shared error codes library.
Upgrades and Backwards Compatibility
------------------------------------
The new `error-code` attribute/field adds further detail to the
existing error reporting. This should be a backwards-compatible
change (e.g. older CLI/automation won't be broken).
Implementation
==============
Assignee(s)
-----------
Primary assignee:
<etingof>
Work Items
----------
* Create a common library for error codes
* Adopt the new common library in:

  * IPA
  * inspector
  * inspector client

* Modify **Ironic Python Agent** to report `error-code`
* Modify **Ironic Inspector** to consume, store and report `error-code`

Dependencies
============
A new dependency on the common error codes library would be
introduced. A new OpenStack project might be created to
accommodate the new library.
Testing
=======
The new functionality and the new library would require unit and
integration testing, in the same way as e.g. **Ironic Inspector** itself.
References
==========
None.