Reopening the web console may occasionally result in a duplicated
SOL session. The get_console action opens
one console process while another SOL session remains.
This patch adds a "sol deactivate" action before getting the
console, to make sure the current connection always succeeds.
Change-Id: Ie5d9c94a3e9e3561b6aa1a52462d6739662d4eb0
Adds the logic and tests that allow vendor interface methods to
be called as steps, and makes the ipmitool send_raw
vendor passthru method callable as a step.
Change-Id: I741a4173f1d150298008d3190e4c3998402a8b86
The dynamically allocated console port for a node is saved
into database and reused on subsequent console operations.
In certain code paths the port record can't be trusted and
we should do a re-allocation.
This patch fixes the issue by ignoring the previous allocation
record. The extra cleanup in the takeover is no longer required
and is removed as well.
Change-Id: I1a07ea9b30a2c760af7a6a4e39f3ff227df28fff
Story: 2010489
Task: 47061
Restarting the node console may occasionally result in a duplicated
SOL session. In particular, when a cluster is deployed with multiple
ironic-conductor backends, the stop_console action shuts down
only one console process while another SOL session remains.
This patch adds a "sol deactivate" action before starting the node
console, to make sure the current connection always succeeds.
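
The ordering described above can be sketched roughly as follows. This is
only an illustration: `restart_sol_session` and the injectable `run`
callable are hypothetical names, and in a real deployment the activate
side is started by the console proxy rather than called directly.

```python
import subprocess


def restart_sol_session(base_cmd, run=subprocess.run):
    """Deactivate any stale SOL session, then activate a new one.

    ``base_cmd`` is the ipmitool command prefix (binary, host, credentials).
    ``run`` is injectable so the ordering can be exercised without a BMC.
    """
    # The deactivate call may fail when no session is active; that is
    # acceptable here, so its exit status is not checked.
    run(list(base_cmd) + ["sol", "deactivate"], check=False)
    # Only after the stale session is gone do we open the new one.
    return run(list(base_cmd) + ["sol", "activate"], check=True)
```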
Story: 2009762
Task: 44233
Change-Id: I5bc8666ff0b4ceab61ed6a8c794d6882783d6bce
Change the default boot mode to UEFI, as discussed during the end
of the Wallaby release cycle and agreed upon a very long time
ago by the Ironic community.
Change-Id: I6d735604d56d1687f42d0573a2eed765cbb08aec
In some cases the operator can't specify `ipmi_cipher_suite`
for each node and the session problem can still occur:
`Error in open session response message : no matching cipher suite`
This patch adds a new configuration option that will take a list
of possible cipher suite versions that can be used when the error
occurs and the node doesn't have the `ipmi_cipher_suite` set.
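
A minimal sketch of the fallback logic, under stated assumptions:
`IpmiError`, `run_with_cipher_fallback` and the `run` callable are
hypothetical stand-ins, not the driver's actual names. `run(suite)` is
assumed to execute one ipmitool command with the given cipher suite
(or the ipmitool default when `suite` is None).

```python
NO_MATCHING_CIPHER = "no matching cipher suite"


class IpmiError(Exception):
    """Stand-in for the error raised when an ipmitool call fails."""


def run_with_cipher_fallback(run, node_cipher_suite, fallback_suites):
    """Try the configured fallback suites only when the node has none set."""
    if node_cipher_suite is not None:
        # The operator chose a suite for this node; never override it.
        return run(node_cipher_suite)
    last_error = None
    # First attempt uses the default (None), then each configured suite.
    for suite in [None] + list(fallback_suites):
        try:
            return run(suite)
        except IpmiError as exc:
            if NO_MATCHING_CIPHER not in str(exc):
                raise  # unrelated failure, do not mask it
            last_error = exc
    raise last_error
```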
Story: 2008739
Task: 42093
Change-Id: I6788585a83268e20ff6447e570995871bc9c25d5
Get rid of the TODO in the code and prepare for more management
interfaces supporting detect_vendor(). Vendor detection now runs
during the transition to manageable and on power state sync (essentially
the same as before, but for all drivers, not only IPMI).
Update the IPMI implementation to no longer hide exceptions since
they're not handled on the upper level. Simplify the regex and fix
the docstring.
Add the Redfish implementation as a foundation for future
vendor-specific changes.
Change-Id: Ie521cf2295613dde5842cbf9a053540a40be4b9c
Supermicro machines, when in UEFI mode, have a different
device number, in binary, to represent the hard disk than
other vendors such as Fujitsu, which actually has somewhat
similar code in its driver.
This means we need to be somewhat cognizant of the vendor of
the BMC and possibly update the device mapping based upon that
vendor.
This may ultimately fix a number of IPMI-related problems, because
there is a reliance upon the text output of ipmitool, which only
reads the bytes returned by the BMC, which may not be reality after
the next reset, especially if ipmitool doesn't know of the UEFI
operating difference.
Change-Id: Ie19db9e0cf1eafdfc9bb46248f4d457337821f94
Story: 2008241
Task: 41085
Calculating the ipmitool `-N` and `-R` arguments from ironic.conf
[ipmi] `command_retry_timeout` and `min_command_interval` now takes
into account the 1 second interval increment that ipmitool adds on
each retry event.
Failure-path ipmitool run duration will now be just less than
`command_retry_timeout` instead of much longer.
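
To illustrate the arithmetic, here is a rough sketch. It assumes that
ipmitool waits `-N` seconds before the first retry and one second longer
before each subsequent retry; `ipmitool_retries` is a hypothetical helper
and not necessarily the formula used in the driver.

```python
def ipmitool_retries(command_retry_timeout, min_command_interval):
    """Return the largest retry count whose total wait fits the timeout."""
    retries = 0
    elapsed = 0
    interval = min_command_interval
    # Each retry waits one second longer than the previous one.
    while elapsed + interval <= command_retry_timeout:
        elapsed += interval
        interval += 1
        retries += 1
    return retries
```

With a 60 second timeout and a 5 second interval this yields 7 retries
(5+6+7+8+9+10+11 = 56 seconds), whereas the naive 60 / 5 = 12 retries
would take 126 seconds under the same assumption.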
Change-Id: Ia3d8d85497651290c62341ac121e2aa438b4ac50
By default _verify_port() only works for IPv4 networks. In an IPv6 network
the same port can be allocated to multiple nodes, because the port check
passes and the port is then used for other nodes.
This fix passes the socat_address to the port validation and uses the
correct address family to do the socket binding.
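
A simplified sketch of address-family-aware port checking follows. The
helper names `address_family` and `port_is_free` are hypothetical, and the
sketch assumes the address is a literal IP rather than a host name.

```python
import ipaddress
import socket


def address_family(host):
    """Pick AF_INET6 for IPv6 literals, AF_INET otherwise."""
    if host is None:
        return socket.AF_INET
    if ipaddress.ip_address(host).version == 6:
        return socket.AF_INET6
    return socket.AF_INET


def port_is_free(port, host=None):
    """Return True if the port can be bound on the given address."""
    bind_host = host if host is not None else ""
    with socket.socket(address_family(host), socket.SOCK_STREAM) as sock:
        try:
            sock.bind((bind_host, port))
        except OSError:
            # Typically EADDRINUSE: something already holds the port.
            return False
    return True
```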
Story: 2007946
Task: 40412
Change-Id: I1355afaa551baee7b9fd7883d2d29342d059c5a0
For certain BMCs the default of 1 second is too short for the ipmitool
minimum command interval (-N). The configured
``[ipmi]min_command_interval`` should be used.
Story: 2007914
Task: 40317
Change-Id: I07f17a7321582e9829ac422efb51b571a17c5ca8
This change replaces custom Popen-based code with the new argument
(backed by the corresponding stdlib argument).
Story: #2004449
Task: #40283
Change-Id: I6840b1caffd272ef12ab2b259a02376ec185bc3f
A new configuration parameter was introduced in
https://review.opendev.org/#/c/731676/ to work around an
issue with some BMCs that don't support the Cipher
Suites command. This changes the default value to
``False`` so that Ironic will do the retries by default.
Note that this should not be backported to avoid changing
the behaviour in stable branches.
Change-Id: I34bf7e5d79defc23161213aa8942edace4b87b78
Story: 2007632
Task: 39676
Add a new ``[ipmi]use_ipmitool_retries`` option. When set to
``True`` and timing is supported by ipmitool, the number of
retries and command interval will be passed to ipmitool so
that ipmitool will do the retries. When set to ``False``,
ironic will do the retries.
The default is ``True``, so this will not change the current
behaviour which is to have ipmitool do the retries when
timing is supported.
Setting to ``False`` will help with certain BMCs which do
not support the Cipher Suites command. In this case ipmitool
can take up to 10 seconds for each retry which results in a
total time exceeding ``[ipmi]command_retry_timeout``.
Change-Id: I1d0194e7c7ae9fcdd4665e6115ee26d10b14e480
Story: 2007632
Task: 39676
Python 3 has a standard library for mock in the unittest module,
so let's drop the mock requirement and switch tests to unittest mock.
Change-Id: I4f1b3e25c8adbc24cdda51c73da3b66967f7ef23
The IPMI verbose output being turned on by the debug option
is confusing and misleading, and since many operators run
ironic in debug mode anyway, it doesn't make much sense
to spam logs with errors and information that can be
misleading to a less experienced operator.
Also... less logging output.
Change-Id: I0fae7bad5613865dfd4d1c663be08d40debe157a
Introduces the [console]port_range configuration option and implements
automatic port allocation for the IPMI-based serial console.
The ipmi_terminal_port in driver_info takes precedence if specified;
otherwise ironic will allocate a free port from the configured port range
for the underlying serial proxy tools.
This implementation deviates from the original proposal in that this patch
doesn't validate whether a user-specified ipmi_terminal_port falls in the
range, based on the following considerations:
a. ipmi_terminal_port is considered a fallback for backwards compatibility,
and we will remove it eventually.
b. Different conductors may have different port ranges configured (rare,
but could happen).
c. Forcing ipmi_terminal_port into the port range could raise the
possibility of conflicts with ports in the configured range. This is not
a desired result, so the choice is left to the end users.
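
The precedence rule can be sketched as below. This is illustrative only:
`choose_console_port` is a hypothetical helper, the `start:stop` string
format with an exclusive stop is an assumption, and the in-memory
`allocated` set stands in for whatever bookkeeping the conductor uses.

```python
def choose_console_port(driver_info, port_range, allocated):
    """Return the console port for a node.

    An explicit ipmi_terminal_port wins and is deliberately not checked
    against the range; otherwise the first unallocated port is taken.
    """
    explicit = driver_info.get("ipmi_terminal_port")
    if explicit is not None:
        return int(explicit)
    # Assumed format: "<start>:<stop>", stop exclusive.
    start, stop = (int(part) for part in port_range.split(":"))
    for port in range(start, stop):
        if port not in allocated:
            allocated.add(port)
            return port
    raise RuntimeError("no free console port in range %s" % port_range)
```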
Change-Id: If8722d09dc74878f4da2e4a7f059d9b079c3e472
Story: 2007099
Task: 38135
Since we've dropped support for Python 2.7, it's time to look at
the bright future that Python 3.x will bring and stop forcing
compatibility with older versions.
This patch removes the six library from requirements, not
looking back.
Change-Id: Ib546f16965475c32b2f8caabd560e2c7d382ac5a
It seems at some point oslo_service loopingcall started using
eventletutils from oslo_utils to sleep during the loopingcall
retries, and some unit tests started taking up to 40 seconds
to complete. This change mocks out the correct method, offering
a significant speedup to the unit tests' run time.
The EventletEvent class is introduced to eventletutils in version
3.38.0 so lower constraints are bumped as well.
Change-Id: Id7e6ff2a4748b5301e2259acdc760ac7f56b96c3
This change allows configuring additional retryable errors for ipmitool
execution that are specific to the environment it is run in.
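
As a rough sketch of the idea, assuming the extra errors are supplied as a
list of substrings to look for in ipmitool's error output. The names
`BUILTIN_RETRYABLE` and `is_retryable` are hypothetical, and the message
strings are examples only.

```python
# Example built-in entry; the real list lives in the driver.
BUILTIN_RETRYABLE = ("insufficient resources for session",)


def is_retryable(stderr, extra_errors=()):
    """Return True if the ipmitool error output matches a retryable error."""
    patterns = BUILTIN_RETRYABLE + tuple(extra_errors)
    return any(pattern in stderr for pattern in patterns)
```

For example, an operator could add a site-specific message so that a
failure containing it is retried instead of being treated as fatal.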
Task: 36296
Story: 2006410
Change-Id: I4bd06ad405f87f5fb974777fc3d84e4874b4f5bb
When the conductor has debugging enabled, that command is
passed to ipmitool to enable debugging of other commands.
However, this means tons of extra data is dumped as part of
the sensor data collection for ironic, which breaks string
parsing and ultimately metrics collection.
Since we can identify these lines, let's ignore them.
Change-Id: Ife77707210f8289d8f2e0223fb9ee1909d798546
Story: 2005332
Task: 30267
Support for the -y option of ipmitool
Quote from docs:
-y <hex key>
Use supplied Kg key for IPMIv2 authentication. The key is expected
in hexadecimal format and can be used to specify keys with non-printable
characters. E.g. '-k PASSWORD' and '-y 50415353574F5244' are equivalent.
The default is not to use any Kg key.
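
The equivalence quoted above can be checked with a few lines of Python;
`kg_key_to_hex` is a hypothetical helper name used only for illustration.

```python
def kg_key_to_hex(key):
    """Encode a raw Kg key (bytes) as the hex string expected by -y."""
    return key.hex().upper()


# The example from the ipmitool documentation quoted above.
assert kg_key_to_hex(b"PASSWORD") == "50415353574F5244"
```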
Change-Id: Ie6a9fc1a41d924e30eff526b3eae929ce6e085c6
Story: #2005158
Task: #29876
Teach the ipmitool driver about _get_ipmitool_args and use that in all
cases where we want to build an ipmitool command line. This solves
the problem that the serial console drivers were failing to honor the
ipmi_port setting in driver_info, while it was being correctly used
for power state, etc.
Change-Id: Ifbf6a92c2305567985cfbc41dbf76a076ecb8a7b
Story: 2005138
Task: 29826
Look for booleans and string-like booleans in driver_info['ipmi_force_boot_device']
to make setting the option more user-friendly and less error-prone.
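
A stand-alone sketch of such lenient parsing, using only the standard
library. `parse_bool_option` and the accepted spellings are assumptions
for illustration; the driver may use a library helper instead.

```python
TRUE_STRINGS = {"1", "t", "true", "on", "y", "yes"}
FALSE_STRINGS = {"0", "f", "false", "off", "n", "no"}


def parse_bool_option(value, default=False):
    """Accept real booleans as well as common string spellings."""
    if value is None:
        return default
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        normalized = value.strip().lower()
        if normalized in TRUE_STRINGS:
            return True
        if normalized in FALSE_STRINGS:
            return False
    raise ValueError("not a boolean-like value: %r" % (value,))
```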
Change-Id: I2917761055db5286183ce265089c19dea98947ad
Story: 2004444
Some types of BMCs don't support the IPMI option that disables the
boot device timeout behavior, which prevents them from ever booting
from PXE.
This patch extends the fix [1] by adding a configuration option,
which provides the default IPMI behavior.
[1] https://review.openstack.org/#/c/616053
Additionally, revise the variable/setting names based upon review
feedback and discussion that took place during the 20181210 weekly
ironic team meeting.
Change-Id: Ie049bbaf45aeab54c1272d1d561c5a6ca00dc34a
Story: 2002977
Task: 22985
We can't trust ipmitool to terminate in time. We may have to kill
the process if it's running for longer than we asked it to.
On the other hand, abrupt IPMI exchange termination is said to be
dangerous to the state of the BMC being managed. Therefore this patch
only kills a timed-out IPMI "power status" call.
For the purpose of killing a hung `ipmitool`, we inject the time-capped
`popen.wait` call before the uncapped `popen.communicate` is called
internally. Then we just kill the stuck `ipmitool` process and go on.
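
The wait-then-kill pattern can be sketched with the standard library as
follows. `run_with_kill` and `SLEEPER` are hypothetical names; `SLEEPER`
is merely a stand-in for a hung `ipmitool` process.

```python
import subprocess
import sys

# Stand-in for a hung ipmitool: a child process that sleeps for 30 seconds.
SLEEPER = [sys.executable, "-c", "import time; time.sleep(30)"]


def run_with_kill(cmd, timeout):
    """Run cmd, killing it if it does not exit within ``timeout`` seconds."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    timed_out = False
    try:
        # Time-capped wait first. Note: this assumes small output; a child
        # that fills the pipe buffer would block before exiting.
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        timed_out = True
    # The process has exited or been killed, so this returns promptly.
    out, err = proc.communicate()
    return timed_out, proc.returncode, out, err
```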
Story: 2004449
Task: 28127
Change-Id: I7e1eafb334fe3a3337926aca27c14fe559ce0e39
The IPMI driver unconditionally instructed the BMC not to automatically
clear the boot flag valid bit if a Chassis Control command is not received within
the 60-second timeout (the countdown restarts when a Chassis Control command is
received). Some BMCs do not support setting this, and sending the command
aborts the node boot.
A new driver option ``ipmi_disable_timeout`` is added to bypass
sending this command.
Change-Id: I1dda3cf3e4b7b888ed9d8931c8ede3a918dd01f4
Story: 2004266
The option [ipmi]retry_timeout was deprecated in Pike; now it's time
to remove it from the tree.
Change-Id: I921661db2a6f0c85e717e1a80e5f0c8b6c91d369
Story: #2003028
Task: #23052
This change collects boot-related functions into
the `boot_mode_utils.py` module to improve code clarity.
Change-Id: I1a2225d503deb382ba6021a6073c81cd03ca3175
Story: 1734131
Task: 10640
This change extends the list of `ipmitool` errors that
ironic treats as retryable on failure.
Change-Id: I5fddc95404a1725f03bd26da51932c3ece5a5a35
Story: 2001989
Task: 19611
This function does not work properly with hardware types, and is not
needed in most cases where it is used. The enabled_drivers option
is changed instead where needed.
This change covers only a (randomly selected) part of files. Other files
will be updated separately.
Change-Id: Iae40ed6c5d37bb2d2af3219d2f94922a2b32d78d
There is usually no need to use some_dict.keys(). Clean up the code by
removing this unneeded usage of keys().
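
For illustration, membership tests, iteration and sorting all work on the
dictionary directly. The helper names below are hypothetical.

```python
def missing_keys(info, required):
    """Membership tests work on the dict itself; .keys() is not needed."""
    return [key for key in required if key not in info]


def sorted_names(info):
    """Iterating (and therefore sorting) a dict yields its keys."""
    return sorted(info)
```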
Change-Id: I755de2e610ecad568c2265e10bd0a6ff1275888c
Addressing nits.
In commit 43fbc5dc6a there were some issues
pointed out during review. This is a follow-up patch to address
those issues.
Change-Id: I7155a2cdc90b8c4f4162318ec0a5ee291e169379
Commit ee5d4942a1 changed the existing
behavior so that if an ipmitool command fails when attempting to set
the power state it causes a failure. The problem with that approach
is that on some systems if the system is already in the desired power
state, an error will be generated when ipmitool tries to change it to
the desired power state.
Now when doing a reboot command we check beforehand to see if the node
is already off; if so, we don't attempt to power off the node again.
Also optimize ironic/conductor/utils.py node_power_action() so that it
only checks a node's power status if it might perform an action based
on the node's power status.
Change-Id: If838aae871753ebfbdf359e0bbe3afcc54c4b559
Closes-Bug: #1718794
Having a mock patch on the testcase class required 40+
functions to have 'mock_sleep' in their argument list, when we are
only checking the mock about five times. Instead put it into the
setUp() function.
Also fix mistakes introduced by previous patches that did not realize
that:
@mock.patch.object(ipmi, '_make_password_file', _make_password_file_stub)
def some_func(self, mock_pass): ...
The mock.patch.object() is defining the object completely and thus it
(mock_pass) is not actually passed as an argument to the function.
When removing the 'mock_sleep' mock I discovered this issue.
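
The behavior described above can be reproduced with a small stand-alone
example. The `helper` object and function names are made up for the
demonstration and are unrelated to the ipmi module.

```python
import types
from unittest import mock


def _real():
    return "real"


def _stub():
    return "stub"


helper = types.SimpleNamespace(make=_real)


# An explicit replacement object is given: nothing extra is passed in.
@mock.patch.object(helper, "make", _stub)
def with_replacement(*args):
    return len(args), helper.make()


# No replacement given: a MagicMock is created and passed as an argument.
@mock.patch.object(helper, "make")
def with_default(*args):
    return len(args), isinstance(args[0], mock.MagicMock)
```

Calling `with_replacement()` returns `(0, "stub")`, while `with_default()`
returns `(1, True)`.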
Change-Id: Iac5dfa8382d04bead745260598d5228c6114b70d
Set power state methods now use BackoffLoopingCall to wait for the
desired power state. It uses random.SystemRandom gauss distribution
to determine the amount of time to wait for the next check. This
change adds a method to mock the result of the pseudo-random sleep
interval generation, so that the value to wait between power state
checks is 1.
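
One way to pin such a random interval in a test is sketched below, under
the assumption that the interval comes from `random.SystemRandom().gauss`.
`next_wait` is a hypothetical stand-in for the backoff helper, not the
oslo.service implementation.

```python
import random
from unittest import mock


def next_wait(mean=2.0, stddev=0.5):
    """Stand-in for a backoff helper drawing a random sleep interval."""
    return random.SystemRandom().gauss(mean, stddev)


def deterministic_wait():
    """Return the interval seen while gauss() is mocked to return 1."""
    with mock.patch.object(random.SystemRandom, "gauss", return_value=1):
        return next_wait()
```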
Closes-Bug: 1702859
Change-Id: I9270a187fa0f413ba8a8a14f859b0fd65c439b6e
The old code blindly required power status even if the power action
failed. Now, it will retry the power action only when it detects a
retryable failure, and will only poll for power status if the power
action is successful. This patch also moves the logic for handling
waiting for power status into the conductor so that the logic is
standardised between drivers.
Change-Id: Ib48056e05d359848386ac057b58921f40b7bdd60
Co-Authored-By: Sam Betts <sam@code-smash.net>
Related-Bug: #1675529
Closes-Bug: #1692895
After a node is saved to the database, we weren't updating the
Node object to reflect what was saved. This caused a problem
where the node's updated_at field was incorrect. It was fixed
in 065326c0f5 by explicitly
setting node.updated_at. However, that doesn't address other
node fields that may be out of sync.
The more correct fix would be to do something similar to what (most
of) the other Objects do, which is for the node to update itself
via ._from_db_object().
Doing this revealed several incorrect tests and code in the conductor
and agent where changes to the node's dictionaries were incorrectly
being set and thus, not being saved. Those are fixed in this patch.
Change-Id: Ia84cd60c1a4eabcc1ad0a756124c338fa9f644c8
Closes-Bug: #1679297
Related-Bug: #1281638
This patch enhances the ipmitool power driver to support SOFT_REBOOT
and SOFT_POWER_OFF.
Partial-Bug: #1526226
Change-Id: If01721625c22a578b4311b82104cd895139e3a01