Commit Graph

68 Commits

Author SHA1 Message Date
Anton Arefiev 3fe42b53fd SQLAlchemy 2.0 Support
Primarily remove the workaround added in
Ia6d512ff2ae417bab938cb095fbb0884d195010a which added
continued use of autocommit, which is incompatible with
SQLAlchemy 2.0.

Also set the environment for unit tests to report compatability
warnings, although it appears none are being reported at this time.

Also cuts out the db upgrade cruft to only use the online database
migration code through oslo_db's enginefacade, which has the smarts
to handle online or offline migrations.

And then, retools unit/functional test data storage to utlize sqlite,
and in that re-tooled the queries to prevent locking conditions
which could exist with queries, and some additional refactoring/cleanup.

Also, don't mock and test time.sleep().

Additionally, it looks like we have discovered the root cause of the
memory/connection leakage issue which has been observed, due to the
way lists of nodes are processed/returned.

This change was based upon the work in
I506da42a9891a245831f325e34bec92e0a3f33f0 which is included in
this commit as the entire database structure and interaction
has been modified for ironic-inspector.

Co-Authored-By: aarefiev <aarefiev@mirantis.com>
Story: 2009727
Task: 44132
Change-Id: Ic88eb9dec5fddc924a72d9a23c17a304954ebf46
2022-12-15 09:28:55 -08:00
Riccardo Pittau 3accdfbbc6 Use openstacksdk for ironic module
This patches removes the ironic-client dependency from the
ironic module in favor of openstacksdk.

Increase minimum required version of openstacksdk to use
recent added features.

Change-Id: I31964179835ad454e8205a59f483b419de4fb62d
2020-03-23 14:28:09 +01:00
Dmitry Tantsur 85267e881c Stop trying to power off nodes if manage_boot is False
When manage_boot is False, we do not power on the machines on starting
introspection, nor do we power them off on aborting. We should not
power off on finishing introspection either, let's leave it up to ironic.

Change-Id: If8115f8d592e1b24b07ef52bcd703d10763c1f00
Story: #1528920
2019-10-18 13:53:17 +02:00
Julia Kreger 0c9447d53b Active node introspection for nodes not in cache
When an active node introspection is performed
where an entry already exists in ironic, but inspector
has never seen the node before, we enter a state where
we should allow introspection to occur, but that we don't
have a cache record.

As a result of no cache entry, we presently fail hard claiming
the node's port already exists. In the case of active nodes,
This will always be the case, so we need to add a cache
entry and allow the process to... somewhat continue as if
normal introspeciton is occuring.

Co-Authored-By: Dmitry Tantsur <dtantsur@redhat.com>
Change-Id: Ib20b4a4e4d46ba23b8a4d87791cf30a957f1f9f9
Story: 2006233
Task: 35834
2019-07-19 17:08:56 +02:00
Julia Kreger 46a90d6dad Enable processing of active nodes
Change-Id: Ie369251bb11f36006c3921b3b40800288b49bbc2
Story: 2005896
Task: 33755
2019-07-03 20:10:12 +00:00
Zuul 3e702f65db Merge "Use bmc ipv6 address while processing introspection data" 2019-07-03 20:01:29 +00:00
Nikolay Fedotov fcb0ca8a6e Use bmc ipv6 address while processing introspection data
Change-Id: Id6d56860a4e47f28c875ed6032e1f99a997aea9b
Story: #2005779
Task: #33503
2019-06-19 15:03:07 +03:00
Kaifeng Wang d57179411d Simplify locking code
NodeInfo accepts a lock instance when instantiated, it is only used
in the node clean up periodic task, such implementation introduces
complexity on the lock abstraction. This patch moves node locking
out of get_node.

_get_lock_ctx is removed for it only duplicates _get_lock according to
current usage.

Change-Id: I4e946942ce684b27062d1a42c68faf1d92569489
2019-06-10 10:10:10 +08:00
Kaifeng Wang 29d8515f50 Support reapply with supplied introspection data
This patch adds support to provide unprocessed introspection data
to reapply a node. The provided introspection data will be save to
current introspection storage backend.

Change-Id: I969ae9c32f53f89c006a64a006388ddea9542aa5
Story: 1564863
Task: 11344
2019-04-15 15:04:33 +00:00
Kaifeng Wang 9c6c7c0a57 Follow up to introspection data driver api revision
In https://review.openstack.org/#/c/637673, the get introspection
interface is narrowed down to only accept node uuid, which previously
accepts both uuid and name. But the name to uuid conversion is
missing in the reapply api, which causes feature regression, and
this is the fix :(

Story: 1726713
Task: 11373

Change-Id: I6912853deab77f1365f665ca1e52c13063d2cdf1
2019-02-26 16:26:45 +08:00
Kaifeng Wang 8b2fb384c3 Revise driver api for introspection data backend
The patch revises driver interface for the introspection data
backends. Previously getting introspection data supports node
uuid/name, while saving takes a node_info object, which is not
consistent and makes migration tool looks weird if implemented
based upon it.

For the get() interface, actually only uuid will be passed in,
so it's narrowed down to accept only uuids, logic names will be
converted from api level if there is a need.

The save() interface is changed to accept node uuid instead of
node_info, which is consistent with the get() interface.

Change-Id: I4702ed7372d0e60ed6252879a7496649a0453b84
Story: 1726713
Task: 11373
2019-02-19 11:18:04 +08:00
Kaifeng Wang 9f8f3fc08d Introspection data storage backend follow up
An issue is spotted during implementing [1], we have a solely code
path from api, do_reapply, reapply, to _reapply. Inspector now reads
intrspection data ahead, so there is no need to do so in _reapply,
reading introspection data there is never reached.

The patch removes unecessary code and corresponding tests.

[1] https://storyboard.openstack.org/#!/story/1564863

Change-Id: I5558bce2bc49de3d1c5dba59e203de4824a3addd
Story: 1726713
Task: 11373
2019-01-24 17:33:00 +08:00
space d278bb6f77 introspection data backend: plugin layer
Configurable introspection data storage backend [1] is proposed
to provide flexible extension of introspection data storage
instead of the single support of Swift storage backend.

This patch adds plugin mechanism for loading introspection
storage, creates database backend and moves Swift storage
into a plugin.

[1] http://specs.openstack.org/openstack/ironic-inspector-specs/specs/configurable-introspection-data-backends.html

Story: 1726713
Task: 11373

Co-Authored-By: Kaifeng Wang <kaifeng.w@gmail.com>
Change-Id: Ie4d09dc0afc441b20a1e5e3bd8e742b1df918954
2019-01-07 17:31:15 +08:00
dparalen 7e72ceffd1 Terminal state transitions in transactions
Multiple spots were not using DB transactions when processing the terminal
state transitions (error, abort, finish, timeout). The pattern looked like
this:

    node_info.fsm_event(istate.Events.error)
    # more code
    node_info.finished(error='Oops!')

which led to brief periodes of state inconsistency of NodeInfo records in
the DB.

This patch refactors the NodeInfo.finished() method to require a terminal state
transition to perform as part of the NodeInfo state update:

   NodeInfo().finished(istate.Events.finish)
   NodeInfo().finished(istate.Events.abort, 'Canceled by operator')

This patch also introduces a new state: aborting to allow the inspector to
try call power-off the node before marking the introspection aborted.

There's a new DB migration since the new state implies a schema change too
(Enum).

Closes-Bug: #1721233
Closes-Bug: #1721230
Closes-Bug: #1723384

Change-Id: I0bb051d1956a996ed006d55a5ca2d670d9455047
2017-12-19 18:15:31 +01:00
dparalen 7b27585463 Refactoring the firewall
Adopting the PXE filter interface/driver concept

Related-Bug: 1665666
Change-Id: If83db978080b9c4e5d51ba50bbe8ed26e29abe83
2017-10-19 16:38:15 +02:00
Dmitry Tantsur a1d19d97b5 Do not rely on the older ipmi_address field on lookup
We need to always use bmc_address from inventory, the ipmi_address field is
there only for compatibility with older processing hooks.

Change-Id: Ibf00ecd9278af4ee9318ada44c7c670d13ac22aa
Closes-Bug: #1714944
2017-09-04 14:43:13 +02:00
Dmitry Tantsur a46e708160 [refactoring] simplify signature of _finish function
This is a follow-up to commit e05257035c.

Change-Id: Ia05f533c3c14d926b3c524175260cfcf0bb09e1e
2017-06-06 16:55:38 +02:00
Dmitry Tantsur e05257035c Completely remove support for setting IPMI credentials
This experimental feature was deprecated in the Ocata release,
as it was found unstable, untested and dangerous.

API version is bumped to 1.12 to indicate this change to users.

Change-Id: I1aad6ddfd03946edc19ae510accd6c8daf5fc268
Closes-Bug: #1654318
2017-05-26 14:29:13 +02:00
Dmitry Tantsur 9575d31920 Move port creation to validate_interfaces hook
Currently port logic is placed inconsistently: port creation is in core processing,
while port deletion is in validate_interfaces before_update. This changes moves
port creation there as well. This should only affect deployments that tamper with
validation_interfaces hook, as previously port creation was run just before running
before_update hooks.

This allows deployments to replace port creation logic by replacing the
validate_interfaces hook.

Change-Id: Idd8f748fdf31fc694bd7b554837e509024716c18
Partial-Bug: #1667472
2017-04-11 17:58:56 +02:00
Annie Lezil 1e6b689286 Remove translation of log messages Part-2
The i18n team has decided not to translate the logs because it seems
like it not very useful.

Change-Id: I46c1b0c3efa28c3f887b1a29dc77d47fe749be87
Closes-Bug: #1674374
2017-03-28 01:53:40 +00:00
Jenkins 34eb861913 Merge "Reapply doesn't update started_at time" 2017-03-09 22:58:00 +00:00
Annie Lezil 43c89efcb7 Reapply doesn't update started_at time
The reapply API/Action(openstack baremetal introspection reprocess UUID)
doesn't update the started_at time when Ironic Inspector begins processing
the node.

This adds the started_at time when the reapply API/Action is performed.

Change-Id: Ic79db4ba9305841fb662afcb56f556ad4a57a500
Closes-Bug: #1625180
2017-03-07 21:28:13 +00:00
John L. Villalovos ea97d2b733 Use flake8-import-order
Use the flake8 plugin flake8-import-order to check import ordering. It
can do it automatically and don't need reviewers to check it.

Change-Id: I9ced9c297273db0eec6ab3995b663b1e8dffe87d
2017-02-16 10:11:06 -08:00
Moshe Levi 1dce3b12d3 Adding InfiniBand Support
InfiniBand is computer-networking communications standard
used in high-performance computing, features very high
throughput and very low latency.
This patch allow ironic-inspector to add the client_id
to ironic port extra. The client_id option allow pxe boot
from InfiniBand interface.

Closes-Bug: #1532534
Depends-On: Ifad453977e5d3be64b34e544f269835a72b4d73f
Change-Id: I479d54c29bcacb6bd5c1ab20033ae6e428b0e744
2017-02-01 08:34:38 -05:00
Jenkins 2a80a41c51 Merge "Use oslo_serialization.base64 to follow OpenStack Python3" 2016-12-15 17:47:20 +00:00
Luong Anh Tuan f6d02e1cb9 Use oslo_serialization.base64 to follow OpenStack Python3
This patch replaces python standard base64 library call to
oslo_serialization.base64 to follow OpenStack Python3 porting
standard [1].

Use base64 encoding takes 8-bit binary byte data and encodes it. On
Python3, A string is a sequence of Unicode characters thus base64 has
no idea what to do with Unicode data, it's not 8-bit[2]. We use
oslo_serialization.base64 for python2 and python3.

[1] https://wiki.openstack.org/wiki/Python3
[2] http://stackoverflow.com/questions/8908287/base64-encoding-in-python-3

Change-Id: Ibf24df3a90ecbcdec400a0570f2818f89b78ea0a
2016-12-15 10:35:16 +00:00
dparalen 3ddc0615e5 Introducing node introspection state management
Currently, state of a node introspection isn't kept in the database.
This change introduces:

* a new database column to keep the node introspection state
* an automaton to manage the node introspection state
* a decorator to declare a function performing an introspection state
  transition
* a version_id column is added, to enhance database consistency, that
  is consulted whenever node_info is committed

This change is part of the HA_Inspector effort[1]

[1] https://specs.openstack.org/openstack/ironic-inspector-specs/specs/HA_inspector.html

Closes-Bug: #1618835
Partial-Bug: #1525218
Change-Id: I18cb45f0d1194414715ccbe826d8a95610ec718d
2016-12-15 00:20:27 +01:00
John L. Villalovos ba9adaa2f5 Update to hacking 0.12.0 and use new checks
Use hacking 0.12.0

Use the new checks that are available:
    [H106] Don’t put vim configuration in source files.
    [H203] Use assertIs(Not)None to check for None.
    [H904] Delay string interpolations at logging calls.

Fix code so tests pass.

Change-Id: I7b1cd98aeb3bb8f4f01dd1d69d1afcd839749074
2016-11-09 05:37:10 -08:00
Jenkins de04f7c5a0 Merge "Log hook names as we run them" 2016-09-16 15:49:37 +00:00
Dmitry Tantsur 2853ec72a4 Log hook names as we run them
Change-Id: I477b2896dfce0d0e74a0133b8478adbf4dd2097c
2016-09-08 09:53:31 +02:00
Galyna Zholtkevych f29a71f576 Set node to the error if reapply fails
Setting node to the error status with error message
if reapply fails to get introspection data from swift.

Change-Id: Idccb68847d14d5050c735facf3da7b3ec108adbe
Closes-Bug: #1618833
2016-09-06 16:08:03 +03:00
Zhenguo Niu a97c8cb6a3 Add config to skip power off after introspection
This adds configuration option 'processing.power_off'
defaulting to True, which will prevent powering off the
node after introspection

Change-Id: I16eb6b73fd57e84175bbce81c79e432ed8d1d3fa
Closes-Bug: #1488534
2016-07-19 19:44:06 +08:00
Dmitry Tantsur c98d3f479b Allow customizing ramdisk logs file names and simplify the default
The template for ramdisk logs file names can now be changed via
the configuration. The default now contains only node UUID and datetime.
Also a proper tar.gz extension is appended to avoid confusion.

Depends-On: Ie507e2e5c58cffa255bbfb2fa5ffb95cb98ed8c4
Change-Id: I738f9bd35705d0d11c95b0164186ed0b366b5252
2016-07-01 14:30:44 +02:00
Dmitry Tantsur 223ff38dcf Return HTTP 500 from /v1/continue on unexpected exceptions
Currently it returns HTTP 400 which is obviously wrong.

Closes-Bug: #1590302
Change-Id: I56114aa1c0a3248cc80f79d1a5da3ace6c615e52
2016-06-08 09:56:37 +02:00
Dmitry Tantsur 0b58e31e3e Remove support for the old bash ramdisk
From now on only rely on the IPA inventory and 2 additional fields:
boot_interface and root_device.

Also updated unit tests to use one inventory example.
Also removed duplicating unit tests and checks in test_process.
Also removed devstack support for the old ramdisk.

Change-Id: Ib382328295fc2c1b9143171b1047304febadcaca
2016-06-06 13:36:05 +02:00
Dmitry Tantsur af6fbf0717 Support Ironic node names in our API
This change drops check on UUID validness from our API.
It also has a subtle effect of doing Ironic node fetching in
the introspection status and data fetch calls, which might make them
slightly longer (but only when name is used).

A new helper common.ironic.get_node is created to unify how we fetch nodes
from Ironic. It also provides nicer exceptions.

Change-Id: I20cf65e57910568b70a62c3f9269a962e78a07e2
Closes-Bug: #1523902
2016-05-09 15:01:31 +02:00
Dmitry Tantsur b65ab065ea Store ramdisk logs on all processing failures, not only reported by the ramdisk
Previously the ramdisk logs were only stored if the ramdisk reported an error.
However, we are moving from ramdisk-side validation to server-side, so we need
ramdisk logs to be available if processing fails too.

This change moves storing ramdisk logs from a ramdisk_error plugin to core
processing code. As before, it can be disabled by setting ramdisk_logs_dir to
an empty value.

Change-Id: Ib3742ee1c1d4f2f96d29466626e1121694610dc3
Closes-Bug: #1564448
2016-05-04 12:46:59 +02:00
dparalen f9e9a88d89 Allow rerunning introspection on stored data
As requested in the related bug, this pull request allows to run the
introspection again on previously stored data.  This should make it simple
to correct mistakes in the introspection rules.

For this purpose, new API entry point was introduced:
/v1/introspection/<UUID>/data/unprocessed that supports an empty POST
method to trigger the introspection over stored data.  New function
`reapply` was introduced that takes care about the entry point and carries
out the introspection.  The `process` function was modified to allow
reusing common parts in the new reapply function.  The storage access
methods were updated to allow saving the "raw" memdisk data besides the
processed introspection data.

Following preconditions are checked the reapply function having been
triggered:
* no data is being sent along with the request
* Swift store is configured and enabled and the stored data is present
  for the node UUID
* node_info object is cached for the UUID and it is possible to lock the
  object

Should the preconditions fail, an immediate response is given to the user:
* 400 if the request contained data or in case Swift store is not enabled
  in configuration
* 409 if it wasn't possible to acquire lock for the node_info object
* 404 in case Ironic didn't keep track of related BM node

If the preconditions are met, a background task is executed to carry out
the processing and a 202 Accepted response is returned to the endpoint
user.

As requested, these steps are performed in the background task:
* preprocessing hooks
* post processing hooks, storing result in Swift
* introspection rules

These steps are avoided, based on the RFE:
* not_found_hook is skipped
* power operations

Limitations:
* IMPI credentials are not updated --- ramdisk not running
* there's no way to update the raw data atm.
* the raw data is never cleaned from the store
* check for stored data presence is performed in background;
  missing data situation still results in a 202 response

Change-Id: Ic027c9d15f7f5475fcc3f599d081d1e8d5e244d4
Closes-Bug: #1525237
2016-04-07 10:40:55 +02:00
Dmitry Tantsur b638c70f82 Use all valid MAC's for lookup
Currently we are using only the resulting MAC(s) when doing a node lookup.
In many cases it is the MAC of the PXE-booting NIC. However, it's not necessary
the MAC that people used for enrolling the Ironic node, which will lead to
lookup failures on the virtual environment. This change makes the lookup
procedure use all of the valid MAC's.

Similarly, the enroll node_not_found_hook now checks all MAC's before creating
a node.

Code in the validate_interfaces hook was reordered to ensure we only keep
interfaces with valid MAC's even in the "all_interfaces" list.

Change-Id: Ie7df05d9a7855716fb835c90cfb0ac7fc4cd66df
2016-03-18 13:41:04 +01:00
Jenkins f8b2b7db99 Merge "Don't fail on power off if in enroll state" 2016-03-14 14:16:28 +00:00
Anton Arefiev 2cbcfcd2a4 Don't fail on power off if in enroll state
With enroll hook it's possible to process nodes in enroll state.
As nodes in this state are not expected to have valid power
credentials, we shouldn't fail if power off goes wrong.

Change-Id: I79e502052c4ae1b531b4f0a0bc314b4cf6d29aac
2016-03-11 15:06:04 +02:00
Dmitry Tantsur 5b02024cca Use futurist library for asynchronous tasks
A green thread is now used instead of spawn_n for running asynchronous
operations during introspection, processing and aborting.
The existing periodic tasks are now run using Futurist PeriodicWorker.

Main shut down procedure was split into a separate function for convenience.
Also updated the example.conf to the latest versions (some pending updates from
3rdparty libraries included).

Change-Id: Id0efa31aee68a80ec55e4136c53189484b452559
2016-03-02 15:40:01 +01:00
Dmitry Tantsur 0f8b5de248 Split ironic-related functions from utils to separate common.ironic module
Change-Id: I56c1a5eececb555c14847aecfc153ed40e680863
2016-03-01 16:11:11 +01:00
Dmitry Tantsur e86cec3b07 Stop storing ramdisk logs with the introspection data
Ramdisk logs are huge (and may potentially be VERY huge) and clutter the output
for 'introspection data save'. They might even potentially expose unwanted
information to a user.

Change-Id: I0e7f1f647e60711f84d7883caab5ce8b14d6184f
2016-02-11 16:18:57 +01:00
Jenkins 8a6936773a Merge "Remove deprecated support for passing patches lists into hooks" 2016-02-11 10:29:14 +00:00
dparalen 7a3d93700e Introduce API for aborting introspection
It is not currently possible to stop a running introspection.  This
may be annoying for the operator, considering the amount of time it
takes for the bare metal node to call the continue API request,
especially knowing the introspection will fail/time-out eventually
(such as when debugging).

This patch introduces a REST API endpoint "POST
/v1/introspection/<node-UUID>/abort" in order to fill the gap.  Upon the
abort method call, following preconditions are checked:

* there's a bare metal node matching the UUID
* introspection was not finished for the node
* introspection process is waiting for the node to give the continue call

Following Responses are returned to the caller in case the
preconditions were not met:

* 404 Response in case node wasn't found
* 409 Response (resource busy) in case the introspection process is not
      waiting for the Continue call

Otherwise, a 202 Response is returned.

When the abort method is processed, the node is powered off and it is
black-listed in inspector's firewall to prevent it from booting the
introspection image. This happens asynchronously.

To prevent interference with the continue call processing, the
processing method was updated to give a 400 Response to the
introspection client in case continuing a finished or canceled
introspection.

Limitations:
* IMPI credentials are never updated in case introspection was canceled
* 202 response is returned even if the introspection was already finished
* the endpoint differs from requested "DELETE
  /v1/introspection/<node-UUID>"

Links:
[1] https://bugs.launchpad.net/ironic-inspector/+bug/1525235

Change-Id: If043171f0d292ae2775dc1f26233dd4911599247
Closes-Bug: #1525235
2016-02-09 15:30:46 +01:00
Dmitry Tantsur cc422c0a5e Remove deprecated support for passing patches lists into hooks
Long time ago we had an idea of batching node updates. However,
it fails miserably when a plugin or an introspection rule has to inspect
existing node properties, because it receives possibly stale values.
We've deprecated support for batching patches back in Liberty, this
patch removes the associated bits from the hook interface.

Change-Id: Ia482ff50ca276ce1ffec631f016c6a6b54d5a4ab
Closes-Bug: #1506348
2016-02-09 12:36:07 +01:00
Dmitry Tantsur 0423d93736 Track node identification during the whole processing
Currently our logging in processing is very inconsistent:
some log strings mention node UUID, some - node BMC IP, some nothing.
This change introduces a common prefix for all processing logs
based on as much information as possible.
Only code that actually have some context about the node (either
NodeInfo or introspection data) is updated.

Also logging BMC addresses can be disabled now.

Updates example.conf (a lot of updated comments from oslo).

Change-Id: Ib20f2acdc60bfaceed7a33467557b92857c32798
2016-01-13 12:23:15 +01:00
Dmitry Tantsur 22fddca500 Properly report preprocessing errors to a user
Currently we only return "Data pre-processing failed" and the only option
to figure out what went wrong is to dive into logs.

Also fixed error message for unexpected exceptions during pre-processing
to actually contain some exception information.

Change-Id: I998c0bbd48cc57b7b13fac220f4f7cfe2a5e0681
Closes-Bug: #1523907
2015-12-09 17:31:47 +01:00
Dmitry Tantsur 2b8cc3a93f Lock nodes to avoid simultaneous introspection requests
Now a NodeInfo object has a lock instance attached, which is associated
with a node UUID. New calls acquire_lock and release_lock are added.
NodeInfo.find_node now always acquire a lock, get_node does it optionally.
NodeInfo.finished now always release a lock.

Note that this change does not make NodeInfo thread-safe. Rather it's
preventing simultaneous operations on the same node.

Change-Id: I0c0bd2f4e0622fd7086660b0874c5b8dca82b4d2
Implements: blueprint node-state
2015-11-27 12:05:27 +01:00