Commit Graph

48 Commits

Author SHA1 Message Date
Dmitry Tantsur cb1e856b9f Rename NodeInfo._lock to avoid conflict with Mock._lock in tests
Starting with https://github.com/python/cpython/pull/98797, Python's
Mock has its own _lock. I hope they rename it to something really
private (e.g. __lock), but for now rename our attribute (and hope that
no downstream plugins relied on it, sigh).

Change-Id: I7ba858fb3f259b8e7a3becde94b7ba6b90615287
2023-01-10 14:29:37 +01:00
Anton Arefiev 3fe42b53fd SQLAlchemy 2.0 Support
Primarily remove the workaround added in
Ia6d512ff2ae417bab938cb095fbb0884d195010a which added
continued use of autocommit, which is incompatible with
SQLAlchemy 2.0.

Also set the environment for unit tests to report compatability
warnings, although it appears none are being reported at this time.

Also cuts out the db upgrade cruft to only use the online database
migration code through oslo_db's enginefacade, which has the smarts
to handle online or offline migrations.

And then, retools unit/functional test data storage to utlize sqlite,
and in that re-tooled the queries to prevent locking conditions
which could exist with queries, and some additional refactoring/cleanup.

Also, don't mock and test time.sleep().

Additionally, it looks like we have discovered the root cause of the
memory/connection leakage issue which has been observed, due to the
way lists of nodes are processed/returned.

This change was based upon the work in
I506da42a9891a245831f325e34bec92e0a3f33f0 which is included in
this commit as the entire database structure and interaction
has been modified for ironic-inspector.

Co-Authored-By: aarefiev <aarefiev@mirantis.com>
Story: 2009727
Task: 44132
Change-Id: Ic88eb9dec5fddc924a72d9a23c17a304954ebf46
2022-12-15 09:28:55 -08:00
Tadeas Kot ff93c7799f Add support for state selector in the list introspection
This commit add support for state selector to the list introspection.

* ``GET /v1/introspection?state=[starting, waiting, processing,
                                 finished, error, reapplying,
				 enrolling]``

Story: 1625183
Task: 11350
Change-Id: I2c5222110487a08a4e7b1efbcbc5dc3d552fae3e
2021-09-27 14:03:58 +02:00
Zuul 00c2e99b5a Merge "Handle NodeLocked failures" 2021-08-11 00:58:32 +00:00
Julia Kreger 7f6c4c4378 Handle NodeLocked failures
Some actions can fail due to the node being locked as part of
normal operations. This was previously handled silently by
python-ironicclient, but when inspector was changed to use
openstacksdk, this was no longer handled.

Adds the retry wrapper around critical path methods involving
power/reboot ops and node updates which require locks.

Change-Id: I3d26cf46da02b3e8f3f773c0aeaed6843e0f26e5
Story: 2009107
Task: 42966
2021-08-10 21:15:53 +00:00
Zuul f38d010626 Merge "Ignored error state cache for new requests" 2021-07-02 17:18:21 +00:00
Julia Kreger d972dc93cd Ignored error state cache for new requests
A possibility exists where inspector *can* fail upon inspection if
the database connectivity was lost on a prior action. This is because
the last database transport is potentially bad and fails upon load for
the transaction. The cache can then end up with an "error" state entry,
which upon retrying can fail becasue it is already in error state

Because there really are no guarentees regarding database failures,
the best thing to do is to not trust the prior cache state if it is
in error and to reset it to starting upon new introspection requests.
This prevents operators from *having* to perform process restarts to
force all loads to be from the database unless they manage to have a
multi-inspector cluster and get another inspector node to inspect in
the mean time.

Change-Id: I04ae1d54028862642d043f3a8f3af99405863325
Story: 2008344
Task: 41246
Related: rhbz#1947147
2021-06-25 06:13:46 -07:00
Julia Kreger 948325cae7 Fix SqlAlchemy >1.3.19 support
SQLAlchemy 1.3.19 change the excepted exception and 1.4.x
changed the enum checking behavior such that we were no
longer raising an exception matching the check.

While we had a test for this, We didn't actually
check for it anywhere in the code. And the states are
driven by the code, so the underlying test that was broken
felt redundant. As such, it has been removed.

Change-Id: I39fa3d85978555b6cb9d0884a90625b4765bac28
2021-06-15 14:25:38 -07:00
Zuul 5a9221459e Merge "Follow up to incorrect pxe-enabled was set" 2020-08-17 14:51:19 +00:00
Dmitry Tantsur 06390e3723 Use node.id instead of node.uuid in record_node
This function does not have a compatibility shim that adds 'uuid'.

Change-Id: I131c8189b25ee2de66f08c742a21052ef1b266aa
2020-07-07 14:40:49 +02:00
Kaifeng Wang 6732d5edb9 Follow up to incorrect pxe-enabled was set
Previous change [1] only covered patch requests, this patch adds
coverage for create_ports so that for a fresh introspection we also
don't have incorrect pxe_enabled set.

[1] https://review.opendev.org/#/c/737129/

Change-Id: If809826cbaaee5efaf7b7ad4617fa17e9b232a1f
2020-07-01 17:02:54 +08:00
Dmitry Tantsur cee5922674 Hacking: enforce usage of autospec=True in tests
Using autospec ensures that mocked functions are called with correct
arguments, so it's highly desired to have it.

Change-Id: I9c8395adf852495d2ef6db732d727990e8abd5d7
2020-04-28 12:27:41 +02:00
Sean McGinnis a9f7f67de5
Use unittest.mock instead of third party mock
Now that we no longer support py27, we can use the standard library
unittest.mock module instead of the third party mock lib.

Change-Id: Ic67f09a223ae2d0cb460771a10a4122307afa05b
Signed-off-by: Sean McGinnis <sean.mcginnis@gmail.com>
2020-04-18 16:15:28 -05:00
Riccardo Pittau 3accdfbbc6 Use openstacksdk for ironic module
This patches removes the ironic-client dependency from the
ironic module in favor of openstacksdk.

Increase minimum required version of openstacksdk to use
recent added features.

Change-Id: I31964179835ad454e8205a59f483b419de4fb62d
2020-03-23 14:28:09 +01:00
Riccardo Pittau 9b1450398c Stop using six library
Since we've dropped support for Python 2.7, it's time to look at
the bright future that Python 3.x will bring and stop forcing
compatibility with older versions.
This patch removes the six library from requirements, not
looking back.

Change-Id: Ic443c7e4d5a5a849c4dc220207f8957e4c90bf53
2019-12-17 09:23:01 +01:00
Julia Kreger 0c9447d53b Active node introspection for nodes not in cache
When an active node introspection is performed
where an entry already exists in ironic, but inspector
has never seen the node before, we enter a state where
we should allow introspection to occur, but that we don't
have a cache record.

As a result of no cache entry, we presently fail hard claiming
the node's port already exists. In the case of active nodes,
This will always be the case, so we need to add a cache
entry and allow the process to... somewhat continue as if
normal introspeciton is occuring.

Co-Authored-By: Dmitry Tantsur <dtantsur@redhat.com>
Change-Id: Ib20b4a4e4d46ba23b8a4d87791cf30a957f1f9f9
Story: 2006233
Task: 35834
2019-07-19 17:08:56 +02:00
Kaifeng Wang 1f0be4a6a8 Adds an abstract locking layer
Adds a base lock class and encapsulates current semaphore based lock
into a subclass.

Change-Id: I26eb658ce6fb219bac6df1c2407a8179499358be
2019-07-09 09:14:51 +08:00
Zuul fe8f909619 Merge "Simplify locking code" 2019-06-14 17:26:43 +00:00
Kaifeng Wang d57179411d Simplify locking code
NodeInfo accepts a lock instance when instantiated, it is only used
in the node clean up periodic task, such implementation introduces
complexity on the lock abstraction. This patch moves node locking
out of get_node.

_get_lock_ctx is removed for it only duplicates _get_lock according to
current usage.

Change-Id: I4e946942ce684b27062d1a42c68faf1d92569489
2019-06-10 10:10:10 +08:00
Zane Bitter 9d107900b2 Eliminate SQL injection vulnerability in node_cache
In node_cache.find_node() we were constructing a raw SQL query using
unescaped data that came in on the wire. This presented an SQL injection
vulnerability. To avoid this, use the query builder from SQLAlchemy to
ensure that any input strings are correctly escaped.

Change-Id: I2b0ffa307ec1aa57538733f2e454d2d7e994d656
Story: #2005678
Task: 30992
2019-05-21 10:41:19 +02:00
Dmitry Tantsur ada6b106c5 Pass reset_interfaces when updating a driver from the rules
Otherwise updating a driver requires listing all interfaces with their
new values. This patch is designed to be backportable, so it has
a fall-back to the previous behavior if the required API version and/or
ironicclient versions are not available.

Story: #2005148
Task: #29854
Depends-On: https://review.openstack.org/#/c/643264/
Change-Id: I115b92b9d7bcdbb1c03f0d259dbba6efd2baec2a
2019-03-14 11:02:40 +01:00
Bob Fournier b76f84d4c1 Use processed bool as key in introspection_data DB table
When using store_data=database with sqlalchemy, running introspection results
in a DBDuplicateEntry error.  This happens because the query when adding an
entry uses both the primary key (uuid) and the processed flag, but the
processed flag is not a key.  This change makes it a key so that both
unprocessed and processed data can be stored in the table.

Note - since the previous migration hasn't been released yet this fixes it
without creating a new one.

Change-Id: I052594d529ae363fce50b1726169d86583bb1439
Story: #2004992
Task: #29463
2019-02-13 14:32:53 -05:00
Kaifeng Wang a8c1d06bd0 introspection data backend: implements db
Configurable introspection data storage backend [1] is
proposed to provide flexible extension of introspection
data storage instead of the single support of Swift storage
backend.

This patch adds database support for using ironic-inspector
database as the storage backend.

A table named ``introspection_data`` is created to serve as
the storage for introspected data.

[1] http://specs.openstack.org/openstack/ironic-inspector-specs/specs/configurable-introspection-data-backends.html

Change-Id: I8b29b7b86d90823d29b921ebf64acddbcd2d8d0d
Story: 1726713
Task: 11373
2018-12-04 10:54:32 +08:00
Kaifeng Wang 0c7a52b624 Remove deprecated option node_status_keep_time
[DEFAULT]node_status_keep_time is deprecated long ago [1], this
patch removes it so that inspector will not automatically remove
node status in regards to configuration option.

This also make sense when we use other store backends in the
future.

Change-Id: I8261ce115fdb03ffcfe3a1cc4ca7c8ec747be832
Related-Bug: #1695858
2018-08-10 09:37:23 +08:00
Kaifeng Wang 269d92b19c Fix lock leaks in unit tests
Following message was seen during unit tests:
"BUG: node lock was not released by the moment node info object is deleted."

Though the bug was marked invalid before, apparently it had never been resolved.
Tests involved in this patch were located by saving stack at NodeInfo.__init__ and
printing out at NodeInfo.__del__, without promises, but hopefully covers all tests
in current repo.

Change-Id: I678407b350bd07926eb98db30f24cc83747fa84c
Story: #1533595
Task: #11296
2018-08-06 19:32:15 +08:00
Dmitry Tantsur e7c3218f71 Add manage_boot parameter to introspection API
Adds a new node field "manage_boot" to store this value. When it is set
to False, neither boot device nor power state are touched for this node.
Instead, we expect a 3rd party to handle them.

We still manage the PXE filter because the node may need DHCP.

Change-Id: Id3585bd32138a069dfcfc0ab04ee4f5f10f0a5ea
Story: #1528920
Task: #11338
2018-06-25 12:09:17 +02:00
Zuul 8e1f3ede1c Merge "Terminal state transitions in transactions" 2017-12-22 19:04:46 +00:00
dparalen 7e72ceffd1 Terminal state transitions in transactions
Multiple spots were not using DB transactions when processing the terminal
state transitions (error, abort, finish, timeout). The pattern looked like
this:

    node_info.fsm_event(istate.Events.error)
    # more code
    node_info.finished(error='Oops!')

which led to brief periodes of state inconsistency of NodeInfo records in
the DB.

This patch refactors the NodeInfo.finished() method to require a terminal state
transition to perform as part of the NodeInfo state update:

   NodeInfo().finished(istate.Events.finish)
   NodeInfo().finished(istate.Events.abort, 'Canceled by operator')

This patch also introduces a new state: aborting to allow the inspector to
try call power-off the node before marking the introspection aborted.

There's a new DB migration since the new state implies a schema change too
(Enum).

Closes-Bug: #1721233
Closes-Bug: #1721230
Closes-Bug: #1723384

Change-Id: I0bb051d1956a996ed006d55a5ca2d670d9455047
2017-12-19 18:15:31 +01:00
dparalen d5266f7c0b Unittest node_info is added with a version_id
This is a follow-up on I422473e888e5e49abb3e598fc2cf2f330620bdcd.  TL;DR:
unittest the version_id is indeed set when a node is added thru
node_info.add_node().

Change-Id: I674a01bba221cea9251dbe13269be205262d65c7
2017-10-24 13:29:38 +02:00
Anton Arefiev fcec378594 Use new oslo db EngineFacade
Legacy EngineFacade was deprecated a while ago, using new
EngineFacade system is the recommended way to work with
db engine and session.

The new system has a lot of notable changes[1]:

 * Thread-safe initialization of EngineFacade;
 * Declarative reader/writer transactions separation;
 * Transaction scope defined by developer-specified context.

[1] http://specs.openstack.org/openstack/oslo-specs/specs/kilo/make-enginefacade-a-facade.html

Closes-Bug: #1475182
Change-Id: Ia03f35a1f3b22b2eda74263e319469cf2391e7b5
2017-06-15 14:25:55 +03:00
Dmitry Tantsur dd37dbaae9 Deprecate removing old status and disable it by default
We no longer have to do it, now that we can remove statuses for removed nodes.

Change-Id: Iacc546d265270983c6a360a92073acde9d9b36c7
Closes-Bug: #1695858
2017-06-13 13:43:19 +02:00
Jenkins 2414c8c741 Merge "Allow timeout in active states" 2017-05-16 11:11:57 +00:00
Jenkins f5791da9e2 Merge "Logging ironic port creation" 2017-05-10 14:20:17 +00:00
dparalen ade23652fb Allow timeout in active states
While reviewing the API--Worker split spec we discussed the issues
following from selecting oslo.messaging as the queue implementation.
Because of the at-most-once semantics of oslo.messaging it may well happen
that a task will end up hanging indefinitelly in an active state.

This change allows the node_cache.clean_up() function to pick-up these
"tasks" and "terminate" them by setting the error state.

Change-Id: I696f14d486f10d84aaae5675446138ebdd047ecd
Depends-On: Iaeb99ab1954a1d5303c9bd10b81f7f8d6aa7e731
2017-05-10 15:11:59 +02:00
Anton Arefiev 703b780bc1 Logging ironic port creation
Adds info level logging for ironic port creation.

Change-Id: Ib7c24a4b6415a5ddcad997b2aa2092ec9f353858
2017-05-05 16:01:25 +03:00
Jenkins de5a3fe03b Merge "Add new transaction starting -> error on timeout" 2017-04-27 16:53:31 +00:00
Anton Arefiev 9f125629fe Add new transaction starting -> error on timeout
Exception along the `starting` inspection code may causes
`no defined transition` error as there is no transition for
`starting` state on timeout event.

Also it could happen if small timeout configured, so there
is not enough time for starting -> waiting transition,
e.g. new greenthread spawn is blocked due to missing of
available slots.

Closes-Bug: #1662494
Change-Id: I66cf0374a2ba4ab4692110daafd4a3d2d20d56d6
2017-04-27 13:57:43 +03:00
Dmitry Tantsur 782ee92c45 Set pxe_enabled on new and existing ports on introspection
Set it to True for the PXE-booting port, to False for all the others.
Create an extended functional tests covering various operations with ports.

Change-Id: I435a5c04884b6c4da70cb7260b305fbde23eebc0
Closes-Bug: #1667472
2017-04-18 10:20:49 +00:00
Dmitry Tantsur 0568d5c4c4 Add missing test for NodeInfo.create_ports
I suspect we cover it elsewhere, but it won't hurt to have dedicated
unit tests.

Change-Id: I6fed943409a77afdddb1ec78195ab74329cd1b12
2017-04-06 17:36:05 +02:00
John L. Villalovos ea97d2b733 Use flake8-import-order
Use the flake8 plugin flake8-import-order to check import ordering. It
can do it automatically and don't need reviewers to check it.

Change-Id: I9ced9c297273db0eec6ab3995b663b1e8dffe87d
2017-02-16 10:11:06 -08:00
dparalen 0ce5cdb7c8 Find a node by multiple attributes
This patch modifies current attributes matching from a single
name-value->node hit to a best-match score.

Also using an UUID as the attributes table primary key to allow exposing
attributes in API later (bug 1525231).

Change-Id: I205e31652b21b9e030b9530149e533b29c52a387
Closes-Bug: 1651719
Partial-Bug: 1525231
2017-02-10 17:24:27 +01:00
Moshe Levi 1dce3b12d3 Adding InfiniBand Support
InfiniBand is computer-networking communications standard
used in high-performance computing, features very high
throughput and very low latency.
This patch allow ironic-inspector to add the client_id
to ironic port extra. The client_id option allow pxe boot
from InfiniBand interface.

Closes-Bug: #1532534
Depends-On: Ifad453977e5d3be64b34e544f269835a72b4d73f
Change-Id: I479d54c29bcacb6bd5c1ab20033ae6e428b0e744
2017-02-01 08:34:38 -05:00
Anton Arefiev 71099c9248 Change (started|finished)_at column type
Db column started_at and finished_at are defined as float type, but
float time doesn't fit into default db Float size. This change migrates
columns type to DateTime.

Closes-Bug: 1658975
Change-Id: I13bbb3c9a3e6e8bc744473437a378453ebc81318
2017-01-25 09:23:00 +00:00
Javier Pena 01b83595d5 Fix test when running with SQLite 3.7.17 from CentOS 7
https://review.openstack.org/348943 introduced a test that fails
to run in SQLite 3.7.17, because the failed constraint message in
that version differs from the one expected: "CHECK constraint failed"
is expected, but "(sqlite3.IntegrityError) constraint failed" is
received.

Fixing this by setting the regexp to an error message that matches
in both versions.

Change-Id: I4b09eb4c3804fb9279a5acb3c1409def739e28c3
2016-12-16 11:31:23 +01:00
dparalen 3ddc0615e5 Introducing node introspection state management
Currently, state of a node introspection isn't kept in the database.
This change introduces:

* a new database column to keep the node introspection state
* an automaton to manage the node introspection state
* a decorator to declare a function performing an introspection state
  transition
* a version_id column is added, to enhance database consistency, that
  is consulted whenever node_info is committed

This change is part of the HA_Inspector effort[1]

[1] https://specs.openstack.org/openstack/ironic-inspector-specs/specs/HA_inspector.html

Closes-Bug: #1618835
Partial-Bug: #1525218
Change-Id: I18cb45f0d1194414715ccbe826d8a95610ec718d
2016-12-15 00:20:27 +01:00
dparalen 7cb40d5fec Add API for listing all introspection statuses
This patch introduces an API endpoint to list introspection statuses.  The
endpoint supports pagination with an uuid-marker and a limit query string
fields.  Due to the pagination, this change introduces a new configuration
option: ``api_max_limit``.

APIImpact

Change-Id: I74d02698801d5290619161b2d8d7181ab51a0a5e
Partial-Bug: #1525238
2016-11-21 15:58:24 +01:00
Dmitry Tantsur af6fbf0717 Support Ironic node names in our API
This change drops check on UUID validness from our API.
It also has a subtle effect of doing Ironic node fetching in
the introspection status and data fetch calls, which might make them
slightly longer (but only when name is used).

A new helper common.ironic.get_node is created to unify how we fetch nodes
from Ironic. It also provides nicer exceptions.

Change-Id: I20cf65e57910568b70a62c3f9269a962e78a07e2
Closes-Bug: #1523902
2016-05-09 15:01:31 +02:00
Anton Arefiev 30ae1e72f1 Move unit tests to "unit" directory
This refoctor is needed for tempest test work as tempest tests
will placed in test dir. So move unit tests to separate directory
"unit" under test.

Change-Id: Ic99df6111ef30947148a9e38b9435a54f3d37064
2016-04-05 11:15:29 +03:00