This is an MVP of auto-discovery with no extra customization and no new
auto_discovered field from the spec.
Change-Id: I1528096aa08da6af4ac3c45b71d00e86947ed556
Adds a redfish-https boot interface based upon the
redfish-virtual-media boot interface. It substantially copies
some base methods, however, because of the simplification offered to us by
putting the "attach/detach" logic into how the sushy library handles
the application and reset of a URL as a boot setting.
This feature also increases the requirement for the Sushy library
to version 4.7.0 which includes support to set the HttpBootUri
field in the BMC and automatically unset it as well.
Closes-Bug: #2032380
Change-Id: I991611cd67cb91aea21fc30bbae7cd24409dbbfa
Only port creation/updating/deletion logic has been replicated from
ironic-inspector, as well as the add_ports and keep_ports options.
In the future patches, the added code will become a part of processing
hooks.
Change-Id: I69d6a1a53c5bf9e0f41d1a5bce7215edeea54b22
We have both Invalid and BadRequest, which result in HTTP 400 and 500
respectively. The latter is clearly incorrect.
Then we have NotFound and HTTPNotFound, which mean the same thing.
Alias the exceptions in both cases with the intention of dropping one copy.
Finally, NotAuthorized and Unauthorized result in HTTP 403 and 500
respectively. Fortunately, the latter is not used and can be removed.
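A minimal sketch of the aliasing described above. The base class is a simplified stand-in, and which name of each pair survives is an assumption made for illustration, not something the commit message states:

```python
from http import client as http_client


class IronicException(Exception):
    """Simplified stand-in for the real base class (assumption)."""
    code = http_client.INTERNAL_SERVER_ERROR


class Invalid(IronicException):
    code = http_client.BAD_REQUEST


class NotFound(IronicException):
    code = http_client.NOT_FOUND


class NotAuthorized(IronicException):
    code = http_client.FORBIDDEN


# Aliases keep the old names importable while callers migrate. Both
# names refer to the same class, so "except BadRequest" also catches
# Invalid and both report HTTP 400.
BadRequest = Invalid
HTTPNotFound = NotFound
```

Because an alias is the same class object, dropping one name later only requires updating imports, not exception handlers.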
Change-Id: If9d571792a8617dd6ecf17e163dea252cb0f7fae
Adds the following methods to DB API:
* create_firmware_component
* update_firmware_component
* get_firmware_component
* get_firmware_component_list
FirmwareComponent
* create | save | get
FirmwareComponentList
* get_by_node_id | sync_firmware_components
Adds two exceptions:
* FirmwareComponentAlreadyExists
* FirmwareComponentNotFound
Tests for db and objects
Changes were required in models: the class name should match the
object name we will create.
Story: 2010659
Task: 47977
Change-Id: Ie1e2a4150d4ee4521290737612780c02506f4a9e
Follow-up to I6b830e5cc30f1fa1f1900e7c45e6f246fa1ec51c
The original change introduced some errors, such as mismatched
arguments for exceptions.
Story: 2010275
Task: 46204
Change-Id: I550e048ab22a6cd25502b41d1c579819df369249
Follow-up to Ie174904420691be64ce6ca10bca3231f45a5bc58
which enables storage of inventory in Swift, but does not delete
the Swift entry when the node whose inventory is stored is deleted
Story: 2010275
Task: 46204
Change-Id: I74b19f7a42c1326d7ec04e6320176e81639ebfb4
Prepare the ironic database to accommodate node inventory received from
the inspector once the API is implemented.
Story: 2010275
Task: 46204
Change-Id: I6b830e5cc30f1fa1f1900e7c45e6f246fa1ec51c
Provide the ability to limit resource intensive or potentially
wide scale operations which could be a symptom of a highly
destructive and unplanned operation in progress.
The idea behind this change is to help guard the overall deployment
against resource exhaustion, or to prevent an
attacker with valid credentials from putting an entire deployment
into a potentially disastrous cleaning situation, since ironic
otherwise only limits concurrency based upon running tasks by conductor.
Story: 2010007
Task: 45140
Change-Id: I642452cd480e7674ff720b65ca32bce59a4a834a
Ironic has a lot of logic built up around use of images for filesystems,
however several recent additions, such as the ``ramdisk`` and ``anaconda``
deployment interfaces have started to break this mold.
In working with some operators attempting to utilize the anaconda
deployment interface outside the context of full OpenStack, we discovered
some issues which needed to be made simpler to help remove the need to
route around data validation checks for things that are not required.
Standalone users also have the ability to point to a URL with anaconda,
whereas operators using OpenStack can only do so with customized kickstart
files. While this is okay, the disparity in configuration checking
was also creating additional issues.
In this, we discovered we were not really graceful with redirects,
so we're now a little more graceful with them.
Story: 2009939
Story: 2009940
Task: 44834
Task: 44833
Change-Id: I8b0a50751014c6093faa26094d9f99e173dcdd38
In a few places in the codebase, "insufficient" is misspelled as
"insufficent," which includes function names and exception class names.
This can be inconvenient for writing and debugging code, in which case
one would raise an exception/call a function and get an error that is
resolved by intentionally misspelling the function call.
The changes made here are mostly to the names of exceptions and
functions but also include some other instances of this misspelling
in docstrings, policy descriptions, etc. There were also some strings
describing policies in ironic/common/policy.py that were missing
spaces, which were also fixed.
Story: 2010089
Task: 45604
Change-Id: I7b65c449d5d30ca30f537a95a3ffd365492e0274
The NoValidDefaultForInterface exception is a little misleading
in that if one doesn't have the base interface enabled, and they
attempt to enable a hardware type which requires or only supports
disabled interfaces, they will also get an exception. In reality,
we need to suggest that they look at enabling the interfaces
before looking at the default interface overrides, because logically
the brain jumps to setting a default before checking the interface
settings.
Change-Id: I50d4381e11da96cb7ae0ee8cbda18534380bd471
This change adds support for verify steps in Ironic. Verify steps
allow executing actions on transition from "verifying" to "manageable"
state and can perform actions such as cleaning BMC job queue or
resetting the BMC on supported platforms. Verify steps are similar
to deploy and clean steps, just simpler.
Story: 2009025
Task: 42751
Change-Id: Iee27199a0315b8609e629bac272998c28274802b
Adds capability to copy bootloader assets from the system OS
into the network boot folders on conductor startup.
Change-Id: Ica8f9472d0a2409cf78832166c57f2bb96677833
This patch provides basic data model change to support node history.
Batch removal is not included in this patch.
Change-Id: I5c7cebd585ee84b5b57bd4690d4074baf0d05699
Story: 2002980
Task: 22989
InvalidImageRef is a kind of InvalidParameterValue and can happen during
validation, causing a traceback now.
Change-Id: I5f10fe7240e74d337f991bbd1a5220cc4e713de7
The kickstart template is supplied by the user and it needs
to be validated to make sure it includes all the expected
variables and nothing else.
We validate the template by rendering it using expected
variables. If any of the expected variables are not present
in the template, or unexpected variables are defined in the
template, we raise an InvalidKickstartTemplate exception.
Once we render the template into a kickstart file, we
pass the file to the 'ksvalidator' tool, if it is present
on the system, to validate the rendered kickstart file
for correctness.
The 'ksvalidator' tool comes from the pykickstart library, which
is GPLv2 licensed. The GPLv2 license is incompatible with
OpenStack, so we do not explicitly include the library in
requirements.txt and instead rely on it being pre-existing on
the conductor. If the 'ksvalidator' binary is not present
on the system, kickstart validation will be skipped.
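The two-stage check described above can be sketched roughly as follows. This is not the code from the patch: the template engine is swapped for the standard library's string.Template to keep the example self-contained, and the helper names and the variable names in the usage note are hypothetical.

```python
import shutil
import subprocess
from string import Template


class InvalidKickstartTemplate(Exception):
    """Stand-in for the exception named in the commit message."""


def template_variables(text):
    """Return the set of $name / ${name} placeholders used in text."""
    names = set()
    for match in Template.pattern.finditer(text):
        name = match.group('named') or match.group('braced')
        if name:
            names.add(name)
    return names


def validate_variables(text, expected):
    """Raise unless the template uses exactly the expected variables."""
    used = template_variables(text)
    missing = set(expected) - used
    unexpected = used - set(expected)
    if missing or unexpected:
        raise InvalidKickstartTemplate(
            'missing: %s, unexpected: %s'
            % (sorted(missing), sorted(unexpected)))


def run_ksvalidator(path, binary='ksvalidator'):
    """Run the validator if installed; return False when it is skipped."""
    if shutil.which(binary) is None:
        # Tool is not on the conductor: skip rather than fail.
        return False
    subprocess.run([binary, path], check=True)
    return True
```

For example, validate_variables('$image_url $token', {'image_url', 'token'}) passes silently, while a template that also references $extra raises InvalidKickstartTemplate.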
Change-Id: I3e040bbdbcefb8764c93355d0ba7179e2110b9c6
One of the biggest frustrations larger operators have is when they
trigger a massive number of concurrent deployments. As one would
expect, the memory utilization of the conductor goes up. Except,
even with the default number of worker threads, if we're requested
to convert 80 images at the same time, or to perform the write-out
to the remote node at the same time, we will consume a large amount
of system RAM. Or more specifically, qemu-img will consume a large
amount of memory.
If the amount of available memory goes too low, the system can trigger
OOMKiller, which will slay processes using RAM. Ideally, we do not
want this to happen to our conductor process, much less the work
that is being performed, so we need to add some guard rails to help
keep us from entering into situations where we may compromise the
conductor by taking on too much work.
Adds a guard in the conductor to prevent multiple parallel
deployment operations from running the conductor out of memory.
With the defaults, the conductor will attempt to throttle back
automatically and hold worker threads which will slow down the
amount of work also proceeding through the conductor, as we are
in a memory condition where we should be careful about the work.
With the defaults, the conductor waits 15 seconds between
re-checks of available RAM, for a total of six retries.
The minimum default is 1024 (MB), as this is the amount of memory
qemu-img allocates when trying to write images. This quite literally
means no additional qemu-img process can spawn until the default
memory situation has resolved itself.
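The throttling behaviour can be sketched roughly as below. The function and exception names are hypothetical, the memory probe is injected so the example stays self-contained, and raising after the final retry is an assumption about what happens when memory never recovers:

```python
import time


class InsufficientMemory(Exception):
    """Hypothetical exception for when memory never recovers."""


def wait_for_memory(get_available_mb, minimum_mb=1024, wait_seconds=15,
                    retries=6, sleep=time.sleep):
    """Hold the calling worker until enough memory is free.

    Returns the number of re-checks that were needed. The defaults
    mirror the values given in the commit message.
    """
    for attempt in range(retries + 1):
        if get_available_mb() >= minimum_mb:
            return attempt
        if attempt < retries:
            # Holding the worker thread here is what slows down the
            # rest of the work flowing through the conductor.
            sleep(wait_seconds)
    raise InsufficientMemory(
        'less than %d MB available after %d retries'
        % (minimum_mb, retries))
```

A caller would invoke this before spawning a memory-hungry step such as image conversion, passing a probe that reports free system memory in MB.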
Change-Id: I69db0169c564c5b22abd0cb1b890f409c13b0ac2
Prevent each driver from coming online one at a time, so that
/driver returns nothing until all interfaces are registered.
Story: #2008423
Task: #41368
Change-Id: I6ef3e6e36b96106faf4581509d9219e5c535a6d8
A MAC address is not user friendly for port management, and having
a name field also provides feature parity with other resources.
This patch implements the DB related change.
Change-Id: Ibad9a1b6bbfddc0af1950def4e27db3757904cb1
Story: 2003091
Task: 23180
The agent command exec model is based upon an incoming
heartbeat, however heartbeats are independent and
commands can take a long time. For example, software RAID
setup in CI can encounter this.
From an IPA log:
[-] Picked root device /dev/md0 for node c6ca0af2-baec-40d6-879d-cbb5c751aafb
based on root device hints {'name': '/dev/md0'}
[-] Attempting to download image from http://199.204.45.248:3928/agent_images/
c6ca0af2-baec-40d6-879d-cbb5c751aafb
[-] Executing command: standby.get_partition_uuids with args: {} execute_command
/usr/local/lib/python3.6/site-packages/ironic_python_agent/extensions/base.py:255
[-] Tried to execute standby.get_partition_uuids, agent is still executing Command name:
execute_deploy_step, params: {'step': {'interface': 'deploy', 'step': 'write_image',
'args': {'image_info': {'id': 'cb9e199a-af1b-4a6f-b00e-f284008b8046',
'urls': ['http://199.204.45.248:3928/agent_images/c6ca0af2-baec-40d6-879d-cbb5c751aafb'],
'disk_format': 'raw', 'container_format': 'bare', 'stream_raw_images': True, 'os_hash_algo':
'sha512', 'os_hash_value':<trimed>
This was with code built on master, using master images.
Inside the conductor log, it notes that it is likely an out
of date agent because only AgentAPIError is evaluated,
however any API error is evaluated this way. In reality, we need
to explicitly flag *when* we have an error that is because
we've tried too soon, as something is already being worked upon.
The result is to evaluate and return an exception indicating work
is already in flight.
Update - It looks like the original fix to prevent busy agent
recognition did not fully detect all cases, as getting steps is a
command which can get skipped by accident with a busy agent under
certain circumstances.
Change I5d86878b5ed6142ed2630adee78c0867c49b663f in ironic-python-agent
also changed the string that was being checked for the previous
handling, where we really should have just made the string we were
checking lower case in ironic. Oh well! This should fix things
right up.
Story: 2008167
Task: 41175
Change-Id: Ia169640b7084d17d26f22e457c7af512db6d21d6
Allow using IPv6 addresses in the provisioning network.
The IP address based PXE config may not actually be used; in that case we
can remove it and save a few neutron interactions.
Change-Id: Ideef57674550270a87513e039cd030f0bcc1c10e
The header for the file types.py denotes its dual-licensed status as
MIT with copyright to the original WSME authors, plus Apache licensed
as part of Ironic.
Story: 1651346
Task: 10551
Change-Id: I986cc4a936c8679e932463ff3c91d1876a713196
Some unused HTTP param to arg parsing has not been implemented to
reduce code complexity. This includes the following types:
- DictType
- complex types
Asserts are added to confirm these param types are not used in ironic
currently, and to prevent them being used in future development.
Story: 1651346
Task: 10551
Change-Id: Idfcf99216f10e8928fe4ba6202a7d69bfa916459
Currently for install_bootloader we use wait=True with a longer
timeout. As a more robust alternative, poll the agent until
the command completes. This avoids trying to guess how long
the command will actually take.
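A rough sketch of the polling approach, not the patch itself. The status callable is injected, and the 'command_status' key with a 'RUNNING' value is an assumption about the shape of the agent's status response:

```python
import time


def poll_until_done(get_command_status, interval=1.0, sleep=time.sleep):
    """Poll the agent until the command is no longer running.

    get_command_status is any callable returning a dict with a
    'command_status' key (assumed shape). No overall timeout is
    guessed; the loop ends when the agent reports a final state.
    """
    while True:
        status = get_command_status()
        if status['command_status'] != 'RUNNING':
            return status
        sleep(interval)
```

Compared with a single blocking call and a long timeout, the caller only needs to pick a polling interval rather than an upper bound on the command's duration.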
Change-Id: I62e9086441fa2b164aee42f7489d12aed4076f49
Story: #2006963
Introduces [console]port_range configuration option and implements
the feature of automatic port allocation for IPMI based serial console.
The ipmi_terminal_port in driver_info takes precedence if specified,
otherwise ironic will allocate a free port from the configured port range
for the underlying serial proxy tools.
The implementation deviates from the original proposal in that this patch
doesn't validate whether a user specified ipmi_terminal_port falls in the
range, based on the following considerations:
a. ipmi_terminal_port is considered a fallback for backwards compatibility;
we will remove this eventually.
b. different conductors may have different port ranges configured (rare,
but could happen).
c. forcing ipmi_terminal_port into the port range could raise the
possibility of conflicts with ports in the configured range; this is not
a desired result, so leave the choice to the end users.
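The precedence rule can be sketched roughly as follows. The function name, the exception, and the injected is_free check are hypothetical; the configured range is passed as a Python range rather than parsed from the option value, whose format is not given here:

```python
class NoFreeConsolePort(Exception):
    """Hypothetical exception for an exhausted port range."""


def choose_console_port(driver_info, port_range, is_free):
    """Pick the serial console port for a node.

    An explicit ipmi_terminal_port wins and is deliberately not
    checked against port_range, matching the considerations above.
    """
    explicit = driver_info.get('ipmi_terminal_port')
    if explicit is not None:
        return int(explicit)
    for port in port_range:
        if is_free(port):
            return port
    raise NoFreeConsolePort('no free port in %r' % (port_range,))
```

With an empty driver_info, the first port for which is_free returns True is allocated.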
Change-Id: If8722d09dc74878f4da2e4a7f059d9b079c3e472
Story: 2007099
Task: 38135
This change adds support for node retirement: nodes can
have additional properties 'retired' and 'retired_reason'
which change the way the nodes (can) traverse the FSM
and which operations are allowed. In particular:
- retired nodes cannot move from manageable to available;
- upon instance deletion, retired nodes move to manageable
(rather than available).
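The two rules above amount to a pair of small checks. A minimal sketch, with nodes modelled as plain dicts and hypothetical helper names rather than ironic's actual state machine code:

```python
MANAGEABLE = 'manageable'
AVAILABLE = 'available'


def can_provide(node):
    """Retired nodes may not move from manageable to available."""
    return not node.get('retired', False)


def state_after_instance_deletion(node):
    """Retired nodes land in manageable instead of available."""
    return MANAGEABLE if node.get('retired', False) else AVAILABLE
```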
Story: #2005425
Task: #38142
Change-Id: I8113a44c28f62bf83f8e213aeb6704f96055d52b
This change avoids importing a wsgi namespace exception class, and
allows the future option of changing the parent class of
exception.ClientSideError when wsme is no longer processing API
requests.
Change-Id: I8165e094fafb91ff94eaa1dd96baba7671487448
Story: 1651346
Since we've dropped support for Python 2.7, it's time to look at
the bright future that Python 3.x will bring and stop forcing
compatibility with older versions.
This patch removes the six library from requirements, not
looking back.
Change-Id: Ib546f16965475c32b2f8caabd560e2c7d382ac5a
Cisco's Third-Party CI was taken down as a result of the
CTO's office being restructured. Numerous attempts to
re-engage with Cisco directly and address the various
known issues in their drivers have not proven to be
fruitful.
Additionally, the drivers are not Python3 compatible,
and some reports have indicated that the CIMC driver is
no longer compatible with newer versions.
As such, the ironic community has little choice but to
remove the Cisco UCS/CIMC hardware types and driver
interface code.
Story: 2005033
Task: 29522
Change-Id: Ie12eaf7572ce4d66f6a68025b7fe2d294185ce28
The exception modules in ironic and ironic-lib contain
almost identical IronicException classes.
With this patch we directly use the one in ironic-lib.
Updates requirements and lower-constraints to use a compatible
version of ironic-lib.
Also deprecates the duplicated fatal_exception_format_errors
option.
Change-Id: I1ce0d12d912020346425fd658d3b1807607455a4
Story: 1626578
Task: 10515
This patch proposes adding an iBMC driver for deploying the
Huawei 2288H V5, CH121 V5 series servers.
The driver aims to add management and power interfaces using
Huawei iBMC RESTful APIs for those series servers.
Change-Id: Ic5e920e4e58811c6a6dfe927732595950aea64e7
Story: 2004635
Task: 28566
This change contains two fixes for the same issue:
* Do not pass an instance of ImageRefValidationFailed as a message
to an exception constructor.
* Make sure that IronicException.__str__ always returns an str,
even when a non-string is passed as the first argument to __init__.
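The second fix can be illustrated with a simplified stand-in for the base exception. The attribute names here are assumptions for the sketch and are not copied from ironic:

```python
class IronicException(Exception):
    """Simplified stand-in showing the str coercion only."""

    _msg_fmt = 'An unknown exception occurred.'

    def __init__(self, message=None, **kwargs):
        if message is None:
            message = self._msg_fmt % kwargs
        # Coerce eagerly so that passing another exception (or any
        # other object) as the first argument is safe.
        self._message = str(message)
        super().__init__(self._message)

    def __str__(self):
        return self._message
```

With this, str(IronicException(ValueError('boom'))) yields the plain string 'boom' rather than failing because a non-string was stored as the message.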
Change-Id: I96edb28955e64915e9d6a481634857fd27690555
Story: #2003682
Task: #26206
Adds deploy_templates REST API endpoints for retrieving, creating,
updating and deleting deployment templates. Also adds notification
objects for deploy templates.
Bumps the minimum WSME requirement to 0.9.3, since the lower constraints
job was failing with a 500 error when sending data in an unexpected
format to the POST /deploy_templates API.
Change-Id: I0e8c97e600f9b1080c8bdec790e5710e7a92d016
Story: 1722275
Task: 28677
Adds deploy_templates and deploy_template_steps tables to the database,
provides a DB API for these tables, and a DeployTemplate versioned
object.
Change-Id: I5b8b59bbea1594b1220438050b80f1c603dbc346
Story: 1722275
Task: 28674
This change introduces the two RPC calls required for the allocation
API: create_allocation and destroy_allocation.
The nodes RPC is updated to:
* Prevent instance_uuid deletion if a node has an allocation and is
not in an updatable state.
* Delete allocation when instance_uuid is deleted and the node is
in an updatable state.
* Delete allocation when a node is unprovisioned and instance_uuid
is thus cleared.
Change-Id: I45815727f970c3d7fe51bb78d8e162a374d12e04
Story: #2004341
Task: #27987
This change adds the database models and API, as well as RPC objects
for the allocation API. Also the node database API is extended with
query by power state and list of UUIDs.
There is one discrepancy from the initially approved spec: since we
do not have to separately update traits in an allocation, the planned
allocation_traits table was replaced by a simple field.
Change-Id: I6af132e2bfa6e4f7b93bd20f22a668790a22a30e
Story: #2004341
Task: #28367
When handling the "pet" case, some nodes may be critical for the deployment.
For example, in an OpenStack installer like TripleO you may want to make
sure your controllers are not removed by an incorrect operation.
This change introduces a new field "protected" on nodes. When it is
set to True, the "deleted" and "rebuild" provisioning actions fail with
HTTP 403. Deleting such nodes is also not possible.
Also adds "protected_reason" for the operators to specify the reason
a node is protected.
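A minimal sketch of the guard described above, with nodes modelled as plain dicts. The exception class is a stand-in and the helper name is hypothetical:

```python
from http import client as http_client


class NodeProtected(Exception):
    """Stand-in exception that maps to HTTP 403."""
    code = http_client.FORBIDDEN


BLOCKED_ACTIONS = frozenset({'deleted', 'rebuild'})


def check_provision_action(node, action):
    """Refuse destructive provisioning actions on protected nodes."""
    if node.get('protected') and action in BLOCKED_ACTIONS:
        reason = node.get('protected_reason') or 'no reason given'
        raise NodeProtected(
            'node is protected (%s); cannot perform "%s"'
            % (reason, action))
```

Other provisioning actions pass through unchanged, and unprotected nodes are never affected.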
Story: #2003869
Task: #26706
Change-Id: I1950bf6dd65b6596cae69d431ef288e578a89d6e
In accordance with the deprecation of oneview,
it is time to remove the oneview drivers.
This patch removes the oneview interfaces and documentation.
Change-Id: Ided79fa788411f839614813ff033c42a13b88c75
Story: #2001924
Task: #24943
The cleaning operation may fail if an in-band clean step
executes after the completion of an out-of-band clean step that
reboots the node. The failure is caused by a race
condition wherein cleaning is resumed before the Ironic Python
Agent (IPA) is ready to execute clean steps.
Story: #2002731
Task: #22580
Change-Id: Idaacb9fbb1ea3ac82cdb6769df05d8206660c8cb