Commit Graph

195 Commits

Author SHA1 Message Date
Luca Miccini 8b9c49fd96 Add a configurable delay to Nova Evacuate calls
In case /var/lib/nova/instances resides on NFS we have seen migrations
failing with 'Failed to get "write" lock - Is another process using the
image' errors.

This has been tracked down to grace/lease timeouts not having expired
before attempting the migration/evacuate, so in these cases it might be
desirable to delay the nova evacuate call to give the storage time to
release the locks.

Change-Id: Ie2fe784202d754eda38092479b1ab3ff4d02136a
Resolves: rhbz#1740069
2019-09-25 20:33:45 +02:00
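
A minimal sketch of the idea in 8b9c49fd96, not the agent's actual code: wait a configurable number of seconds before issuing the evacuate call, so that storage lock leases have time to expire. The names `evacuate_with_delay` and `delay_seconds`, and the injected `evacuate` callable, are hypothetical and chosen only for illustration.

```python
import time
from typing import Callable


def evacuate_with_delay(
    evacuate: Callable[[str], bool],
    host: str,
    delay_seconds: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Optionally wait, then evacuate the given host.

    `evacuate` stands in for whatever actually triggers the evacuation;
    it is an assumption of this sketch, not a real Nova API call.
    """
    if delay_seconds > 0:
        # Give the shared storage time to release its locks.
        sleep(delay_seconds)
    return evacuate(host)
```

A delay of 0 keeps the previous behaviour, which is why it is the default in this sketch.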
OpenDev Sysadmins 41387bf09d OpenDev Migration Patch
This commit was bulk generated and pushed by the OpenDev sysadmins
as a part of the Git hosting and code review systems migration
detailed in these mailing list posts:

http://lists.openstack.org/pipermail/openstack-discuss/2019-March/003603.html
http://lists.openstack.org/pipermail/openstack-discuss/2019-April/004920.html

Attempts have been made to correct repository namespaces and
hostnames based on simple pattern matching, but it's possible some
were updated incorrectly or missed entirely. Please reach out to us
via the contact information listed at https://opendev.org/ with any
questions you may have.
2019-04-19 19:52:01 +00:00
Adam Spiers 42bb0c53e3 NovaEvacuate: fix a syntax error
In all the chaos of complicated patch untangling and rebasing,
I forgot to actually even syntax-check the thing :-(

Change-Id: Ib0a31efe2ff75fc55cf67d4de73e74ebafb219b0
2018-01-17 12:17:06 +00:00
Zuul 1102662aad Merge "NovaEvacuate: Allow debug logging to be turned on easily" 2018-01-16 15:32:36 +00:00
Zuul a655cffbc3 Merge "neutron-ha-tool: do not replicate dhcp" 2018-01-16 13:53:18 +00:00
Andrew Beekhof d46f78fda4 NovaEvacuate: Support the new split-out IHA fence agents with backwards compatibility
Change-Id: Ib43294fa1fe3e814d041167aabbfe46140032e24
2018-01-15 16:52:04 +00:00
Andrew Beekhof 7d61c3b0b2 NovaEvacuate: Correctly handle stopped hypervisors
Change-Id: Ia6c62c17945cd0ad00df19b2c5a4c306dd7f4f09
2018-01-15 16:49:43 +00:00
Mate Lakat 08b85da715 neutron-ha-tool: do not replicate dhcp
Neutron already takes care of the HA for dhcp agents. See neutron setting
`dhcp_agents_per_network` in `neutron.conf`. Having the OCF script call dhcp
replication function of neutron-ha-tool would mean that the tool will
try to plug each network to each DHCP agent, which contradicts neutron's
settings.

Change-Id: I87d9f7010092178c1677e14456b5e2606e5830dc
2017-05-09 10:33:52 +02:00
Vincent Untz fe84d75954 NovaCompute: Support parsing host option from /etc/nova/nova.conf.d
Change-Id: Ic08f05d217e1321ee7d3feec4d12bf32593e7982
2017-01-30 18:12:32 +01:00
Vincent Untz 7a01081e73 NovaCompute: Use variable to avoid calling crudini a second time
We don't need to run crudini twice to get the same config item; instead,
just remember the result of the first call.

Change-Id: I7591f5c7d1474447e29861e499d04b4b5bdb2a27
2017-01-30 18:09:49 +01:00
Andrew Beekhof 5b5c080a0b Ensure nova-compute unfences itself after starting
Change-Id: I687d01b346a45b0df96b66b767017356b0cf63c2
2016-11-22 21:07:37 +11:00
Jenkins 597077ea03 Merge "Extract the nova wait functionality into its own agent" 2016-09-09 16:10:42 +00:00
Andrew Beekhof 9c635bfe34 Extract the nova wait functionality into its own agent
Change-Id: I635ef96946e376b4182c15575edc3e02705d02be
2016-06-28 12:34:23 +10:00
Andrew Beekhof b2197fcc13 NovaEvacuate: Allow debug logging to be turned on easily
Change-Id: I334c80ec0f9b2aa78b95decc6e9e7c4a01101248
2016-06-28 12:32:34 +10:00
Andrew Beekhof 4f2c49d7ba NovaEvacuate should use the existing status operation
There is no need to duplicate the same logic, and the duplicate is
slower than looking up one specific host.

Change-Id: I3ae432a8f42da80f8f235f689d5162d87ad2df5f
2016-06-28 12:02:50 +10:00
Dirk Mueller 3d724a29a9 Relicense to Apache-2.0
Apache-2.0 is the recommended license for OpenStack Big Tent
projects (see https://governance.openstack.org/reference/licensing.html)
and this simplifies the licensing of the overall git repo
quite a bit by removing an exception clause.

Change-Id: I827eb91fd18ced1848439d573cfe6df16ed27748
Closes-Bug: #1564844
2016-05-09 13:46:21 +02:00
Adam Spiers fff75c5eb4 neutron-ha-tool: fix monitor return code
When neutron routers need migration, make neutron-ha-tool's monitor
action return OCF_ERR_GENERIC not OCF_NOT_RUNNING.  This is based on the
OCF Resource Agent Developer’s Guide, which says in the section for
OCF_ERR_GENERIC:

    The action returned a generic error. A resource agent should use
    this exit code only when none of the more specific error codes,
    defined below, accurately describes the problem.

    The cluster resource manager interprets this exit code as a soft
    error. This means that unless specifically configured otherwise, the
    resource manager will attempt to recover a resource which failed
    with OCF_ERR_GENERIC in-place — usually by restarting the resource
    on the same node.

      -- http://www.linux-ha.org/doc/dev-guides/_literal_ocf_err_generic_literal_1.html

and also in the section for OCF_NOT_RUNNING:

    If the resource is not running due to an error condition, the
    monitor action should instead return one of the OCF_ERR_ exit codes
    or OCF_FAILED_MASTER.

      -- http://www.linux-ha.org/doc/dev-guides/_literal_ocf_not_running_literal_7.html

Change-Id: I55f78a5c341a8a552e06a252a9c6836877c0cf77
2016-04-01 20:27:11 +01:00
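
The return-code choice in fff75c5eb4 can be sketched as follows. The numeric values are the standard OCF exit codes; the function `monitor_exit_code` and its two boolean inputs are hypothetical simplifications, not the agent's real interface.

```python
# Standard OCF exit codes.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_NOT_RUNNING = 7


def monitor_exit_code(running: bool, routers_need_migration: bool) -> int:
    """Pick the monitor result for a simplified neutron-ha-tool resource."""
    if not running:
        # Cleanly stopped: the only case where OCF_NOT_RUNNING is appropriate.
        return OCF_NOT_RUNNING
    if routers_need_migration:
        # Running but in an error condition: report a soft error so the
        # cluster manager attempts recovery in place.
        return OCF_ERR_GENERIC
    return OCF_SUCCESS
```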
Jenkins 51748eb269 Merge "neutron-ha-tool: make start action retry" 2016-04-01 19:22:13 +00:00
Adam Spiers 75dcff3b9d Clarify risks of not using shared storage
Make it clearer what the risks of not using shared storage are.
Information is based on:

  http://docs.openstack.org/user-guide-admin/cli_nova_evacuate.html

which says "The command rebuilds the instance from the original image or
volume" but later says that "To preserve the user disk data on the
evacuated server, deploy Compute with a shared file system" and then use
--on-shared-storage.

Change-Id: I09600414eb0d7fff1cf301b11b3fa9a76fc08c77
2016-03-28 22:44:31 +01:00
Adam Spiers a0451cbf57 neutron-ha-tool: make start action retry
https://github.com/SUSE-Cloud/cookbook-openstack-network/pull/1
adds (amongst many other things) support for neutron-ha-tool to retry
its connections to neutron-server.  By taking advantage of this in
this OCF RA, we can make failover more robust.

Signed-off-by: Adam Spiers <aspiers@suse.com>
Change-Id: I41c37500f691e2e0ecfd6c31f1720f483513e447
2016-03-25 13:41:30 +00:00
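
As an illustration of the retry behaviour that a0451cbf57 relies on, here is a generic retry loop. It is a sketch under assumptions: the helper name `call_with_retries`, its parameters, and the use of `ConnectionError` are hypothetical and do not reflect how neutron-ha-tool implements its retries.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    action: Callable[[], T],
    attempts: int = 3,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `action`, retrying on ConnectionError up to `attempts` times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except ConnectionError:
            if attempt == attempts:
                # Out of retries: let the caller see the failure.
                raise
            sleep(delay_seconds)
    raise AssertionError("unreachable")
```

Retrying inside the start action means a briefly unavailable server need not turn into a failed start and a further cluster transition.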
Jenkins 734f5f4e60 Merge "Fix neutron-ha-tool for active/passive usage" 2016-03-22 19:06:57 +00:00
Adam Spiers 8ea5709572 Update support email address to "new" OpenStack list
The openstack mailing list moved from launchpad to openstack.org quite a
long time ago.

Change-Id: I8fcc16d223891c3cd12289b5ccd6a6a674bd2255
Signed-off-by: Adam Spiers <aspiers@suse.com>
2016-03-17 19:51:03 +00:00
Adam Spiers 32348b3a11 os_password is no longer a mandatory option
neutron-ha-tool.py is now being maintained in the
neutron-ha-tool-maintenance branch of this fork:

  https://github.com/SUSE-Cloud/cookbook-openstack-network/

One of the new changes in that branch is the option to obtain
os_password from /etc/neutron/os_password instead of from the Pacemaker
CIB.  This is more secure and also avoids quoting issues with crmsh when
the password has unusual characters:

  29e9759937

When we are using that approach, os_password is not set on the
primitive, so we change this parameter to no longer be required,
in order to avoid warnings from crm_verify etc.

Change-Id: I6cd675fc744c7cfb444bf524c6d6d6444f8e4368
Signed-off-by: Adam Spiers <aspiers@suse.com>
2016-03-17 19:45:13 +00:00
Adam Spiers 8bf05cdc53 Update neutron-ha-tool's description to reference new upstream
neutron-ha-tool.py is no longer available in the original upstream, so
it is now being maintained in the neutron-ha-tool-maintenance branch of
this fork:

  https://github.com/SUSE-Cloud/cookbook-openstack-network/

Change-Id: If5145d76bd703c1e9f44b5ee6433216715755702
2016-03-17 19:42:23 +00:00
Adam Spiers 35282eb288 NovaEvacuate: fix comment in header
This was a copy'n'paste from NovaCompute which someone forgot to change.

Change-Id: I240e6e21d4a87924ab96be1dae672119b58bca56
2016-03-15 17:15:01 +00:00
Adam Spiers 34447f8fa8 Fix neutron-ha-tool for active/passive usage
The neutron-ha-tool Pacemaker resource primitive is only intended to be
run on a single node at a time, i.e. in active/passive mode, rather than
as a clone.  However, until now the RA didn't change behaviour depending
on whether it was supposed to be active on the current node.  So if
Pacemaker did a probe on a node where it was not expecting it to be
active, the monitor action would typically return OCF_SUCCESS, causing
messages from pengine like:

  error: Resource neutron-ha-tool (ocf::neutron-ha-tool) is active on 2 nodes attempting recovery
  warning: See http://clusterlabs.org/wiki/FAQ#Resource_is_Too_Active for more information.

and then Pacemaker could attempt unnecessary recovery according to the
value of the cluster-wide "multiple-active" option, which defaults to
"stop-start".  This would stop the resource everywhere (which is a
noop), and then start it on one node, resulting in unnecessary cluster
transitions and unnecessary runs of this RA's "start" action.

To avoid this, we introduce a state file to keep track of whether it's
active on the current node, and if not, skip the l3-agent check and
always return OCF_NOT_RUNNING.  This is the same technique already used
by NovaEvacuate.

Change-Id: I459e49d27802552ef5424d290ef3fca51640723b
Closes-Bug: #1555711
Signed-off-by: Adam Spiers <aspiers@suse.com>
2016-03-15 10:07:25 +00:00
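
A minimal sketch of the state-file technique from 34447f8fa8, reading the description as: when the state file is absent (the resource was not started on this node), the monitor skips the l3-agent check and reports OCF_NOT_RUNNING. The function names, the state file path handling, and the `check_l3_agents` callable are hypothetical.

```python
import os
from typing import Callable

# Standard OCF exit codes.
OCF_SUCCESS = 0
OCF_NOT_RUNNING = 7


def start(state_file: str) -> int:
    """Mark the resource as active on this node by creating the state file."""
    with open(state_file, "w"):
        pass
    return OCF_SUCCESS


def stop(state_file: str) -> int:
    """Mark the resource as inactive; stopping twice is harmless."""
    try:
        os.remove(state_file)
    except FileNotFoundError:
        pass
    return OCF_SUCCESS


def monitor(state_file: str, check_l3_agents: Callable[[], int]) -> int:
    """Only run the real health check where the resource was started."""
    if not os.path.exists(state_file):
        # A probe on a node where the resource is not meant to be active.
        return OCF_NOT_RUNNING
    return check_l3_agents()
```

With this in place, a probe on a passive node no longer looks like a second active instance.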
Jenkins 04051d7bb6 Merge "neutron-ha-tool: add os_region_name parameter" 2016-03-10 17:09:42 +00:00
Adam Spiers a78003ee1c neutron-ha-tool: add os_region_name parameter
This adds an os_region_name parameter which gets passed to
the neutron-ha-tool Python script, in order to support this
upstream change to the latter:

  58f12c5060

This was rescued from the unmerged pull request on the old repository:

  https://github.com/madkiss/openstack-resource-agents/pull/21

See also https://github.com/crowbar/barclamp-neutron/pull/217

(cherry picked from commit 7fa9b868e30143bc26e09b9db8ace41c5efeb49b)
Closes-Bug: #1508416
Change-Id: Iaee553c71ecce063e9272024e42590fc2e8aa515
2016-03-10 16:56:44 +00:00
Adam Spiers d9eeb2f133 neutron-ha-tool: fix 'defaut' typo
Self-explanatory.  This was rescued from the unmerged pull request on
the old repository:

  https://github.com/madkiss/openstack-resource-agents/pull/21

(cherry picked from commit a4ba41bd23f5386afb4c3c6608f0a31211b5c179)

Change-Id: Id487efdbf1ec93242c30d4ec157fd482abbfc8b5
2016-03-10 16:46:42 +00:00
Norbert Illes a0b55d3329 Add bashate version >=0.5.0 as test dependency
We have lots of long heredoc lines in the OCF scripts, and older bashate
versions consider these to be E006 violations.
From version 0.5.0, bashate doesn't check heredocs, so we specify this
version as the minimum dependency.

In addition, this commit turns on E006 violation checking again.

Change-Id: I1ff675dd587239f0b7fd65c15b8df57a39a2c72b
Signed-off-by: Norbert Illes <norbert.e.illes@ericsson.com>
2016-03-07 19:43:10 +01:00
Norbert Illes fa93525cea Temporarily ignore bashate E006 errors
The currently available bashate releases treat heredocs as normal code
lines, so lines longer than 79 columns in these sections are also
reported as E006 violations. As the OCF scripts contain lots of
heredocs, we are affected by this behaviour.
However, there is a commit in the bashate repository (649c7dc79948)
which modifies bashate to ignore long lines in heredocs.

No bashate release currently contains the above commit, so we ignore
E006 errors until a new bashate version is released.

Change-Id: I33a9737ce1ec7eddab0b24ddedefe5c17da03b7a
Partial-Bug: #1550203
Signed-off-by: Norbert Illes <norbert.e.illes@ericsson.com>
2016-03-05 13:49:58 +01:00
Norbert Illes 98a54ad759 Fix bashate E006 violations
This commit fixes bashate E006 (lines longer than 79 columns) violations
in the OCF scripts.

Partial-Bug: #1550203
Change-Id: Ic208477b2299697a03b641f8272a0946c897fb3e
Signed-off-by: Norbert Illes <norbert.e.illes@ericsson.com>
2016-03-02 19:45:07 +01:00
Jenkins b64bdae693 Merge "Add .tox/ directory to .gitignore" 2016-02-27 14:13:39 +00:00
Adam Spiers 2126e8bbc3 Add .tox/ directory to .gitignore
Change-Id: Ic0dc9fbd29621689368b97bbb1f4ddc26b0aa8c4
Signed-off-by: Adam Spiers <aspiers@suse.com>
2016-02-27 14:10:42 +00:00
Adam Spiers 1d019f5f73 Fix bashate E010 violation
This commit fixes a bashate E010 violation:

[E] E010: The "do" should be on same line as for: '        for i in `ps -o pid --no-headers --ppid $pid`'
 - /home/adam/SUSE/cloud/OpenStack/git/openstack-resource-agents/ocf/cinder-volume : L219

Change-Id: I25b6e05336b1679818ad6f876bf94679a6d5ac10
Partial-Bug: #1550203
Signed-off-by: Adam Spiers <aspiers@suse.com>
2016-02-27 14:06:26 +00:00
Norbert Illes 173a77cec8 Fix bashate E003 violations
This commit fixes bashate E003 (indents are a multiple of 4 spaces)
violations in the OCF scripts.

Partial-Bug: #1550203
Change-Id: I6fbc935bd5f9b383ca97c45f2dd89d7d33a5780f
Signed-off-by: Norbert Illes <norbert.e.illes@ericsson.com>
2016-02-27 12:10:58 +01:00
Jenkins 076ae60516 Merge "Fix bashate E002 violations" 2016-02-26 15:49:19 +00:00
Norbert Illes 4397355193 Fix bashate E002 violations
This commit fixes bashate E002 (indents are only spaces, and not hard
tabs) violations

Partial-Bug: #1550203
Change-Id: I7d156d47023781be74e6fa8daef6ffc311b55d9d
Signed-off-by: Norbert Illes <norbert.e.illes@ericsson.com>
2016-02-26 16:15:05 +01:00
Norbert Illes ad9cefd1d6 Fix bashate E001 violations
This commit fixes bashate E001 (lines ending with trailing whitespace)
violations in the OCF scripts.

Partial-Bug: #1550203
Change-Id: I9ed3a5012509d85463098b3489641f67cfa69eac
Signed-off-by: Norbert Illes <norbert.e.illes@ericsson.com>
2016-02-26 09:57:30 +01:00
Norbert Illes 6ad8eb01ae Move syntax-check test to tox.ini
This commit moves the syntax-check test from a make target to tox.ini

Change-Id: Id15320c589afea2b3a4a5cff5e7fa9c5c2b9d0b8
Signed-off-by: Norbert Illes <norbert.e.illes@ericsson.com>
2016-02-24 15:19:00 +01:00
Norbert Illes e7672a0aa9 Add tox.ini configuration to run bashate tests
This commit implements a simple tox.ini configuration to run bashate
style checker against all files in the ocf directory.

Partial-Bug: #1508559
Change-Id: I34b3fc108a86d902d0d856f632b5221e14f1f118
Signed-off-by: Norbert Illes <norbert.e.illes@ericsson.com>
2016-02-23 14:09:33 +01:00
Vincent Untz 25c306755f NovaCompute: Clarify comment when there's no evacuate attribute
Change-Id: I3bbe9114a2c1b0e7aec7c1118f24b711c44ad52e
2016-02-17 21:08:01 +01:00
Jenkins 807c45f376 Merge "NovaEvacuate: Do not use reboot action for fence_compute" 2016-02-15 16:01:58 +00:00
Vincent Untz 9df293dcb1 Add insecure and region_name parameters to NovaCompute and NovaEvacuate
These can be quite useful in some setups.

This depends on https://github.com/ClusterLabs/fence-agents/pull/37

Change-Id: I2cfef0a4bf7f94f74041c8fee236788c7a110cc5
Signed-off-by: Vincent Untz <vuntz@suse.com>
2016-02-15 15:50:46 +01:00
Jenkins f178723d40 Merge "NovaCompute: Call "fence_compute -o on" after evacuation" 2016-02-15 14:37:45 +00:00
Jenkins 29c9e6d0eb Merge "NovaEvacuate: Add domain parameter" 2016-02-15 14:30:07 +00:00
Jenkins 312fd78942 Merge "NovaCompute: Fix loop on start checking for evacuate attribute" 2016-02-15 14:02:40 +00:00
Norbert Illes b1266c77dd Remove Keystone dependency from neutron-server RA
To check the availability of the Neutron API service, we now simply
check the response code of the "List API version" call instead of
getting a token from Keystone and then checking a Neutron API endpoint
using that token. This way we no longer need a token, so the check does
not depend on the availability of Keystone.

Partial-Bug: #1511721
Change-Id: I5fee8d47bd8e9af9f415b9f74c4f9325ac99df2f
Signed-off-by: Norbert Illes <norbert.e.illes@ericsson.com>
2016-02-15 13:58:20 +01:00
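
The check described in b1266c77dd can be sketched as an unauthenticated HTTP request whose status code decides availability. This is an illustration only: the helper names are hypothetical, and treating exactly HTTP 200 as "available" is an assumption of this sketch rather than something the commit message states.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional


def http_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code for a GET on `url`, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return exc.code
    except OSError:
        # Connection refused, DNS failure, timeout, and similar.
        return None


def api_is_available(
    url: str,
    get_status: Callable[[str], Optional[int]] = http_status,
) -> bool:
    """Judge availability from the response code alone; no token is needed."""
    return get_status(url) == 200
```

Because no Keystone token is requested, a Keystone outage cannot make this check fail.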
Jenkins a756a909a1 Merge "NovaCompute, NovaEvacuate: Add missing content in username description" 2016-02-11 11:25:28 +00:00
Vincent Untz 143864c694 NovaEvacuate: Avoid initial useless message on stderr
When no evacuation has been done yet, we're spamming syslog with:

  Could not query value of evacuate: attribute does not exist

So let's just filter this out, since it's known to be expected on
initial setup.

As this requires a bashism, also move the script to use bash.

Change-Id: I3351919febc0ef0101e4a08ce6eb412e3c7cfc76
2016-02-11 11:35:09 +01:00
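
The filtering idea in 143864c694 amounts to dropping one known, expected message from the error output while letting everything else through. The sketch below shows only that logic; the function name `filter_expected_noise` is hypothetical, and the real agent does this in bash rather than Python.

```python
from typing import Iterable, List

# The message quoted in the commit, expected before any evacuation has happened.
EXPECTED_NOISE = "Could not query value of evacuate: attribute does not exist"


def filter_expected_noise(stderr_lines: Iterable[str]) -> List[str]:
    """Return the stderr lines with the expected message removed."""
    return [line for line in stderr_lines if EXPECTED_NOISE not in line]
```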