Commit Graph

40 Commits

Author SHA1 Message Date
Vasyl Saienko 25ec6e7e4f Set ip_nonlocal_bind to 1 for HA routers and DVR snat
Set nonlocal_bind to 1 to allow starting applications in both
routers (like ipsec from vpnaas). nonlocal_bind 0 prevents us from
starting ipsec in both routers simultaneously, as the process can't bind
to a non-existing address, which was worked around in [1]
by setting a dependency on the python process during failover.
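
As a rough illustration of the knob described above, the sketch below only builds the command that would set net.ipv4.ip_nonlocal_bind inside a router namespace; it does not run it. The helper name sysctl_cmd and the namespace name qrouter-example are assumptions made for this example, not Neutron code.

```python
def sysctl_cmd(namespace, key, value):
    """Build (but do not execute) a sysctl write scoped to a network namespace."""
    return ["ip", "netns", "exec", namespace, "sysctl", "-w", f"{key}={value}"]


# Hypothetical namespace name, used only for illustration.
cmd = sysctl_cmd("qrouter-example", "net.ipv4.ip_nonlocal_bind", 1)
print(" ".join(cmd))
```

Running the resulting command requires root privileges and an existing namespace, which is why the sketch stops at constructing it.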

This reverts [2] completely, which was partially reverted by [3].

[1] https://review.opendev.org/c/openstack/neutron-vpnaas/+/823904
[2] https://review.opendev.org/393886
[3] https://review.opendev.org/c/openstack/neutron/+/752360

Related-Bug: 1999761

Change-Id: I18a15aa3ca745b2b794610350f538d02575ccbe0
2023-01-05 14:45:20 +00:00
Damian Dabrowski 5288593faf [L3-HA] Disable automatic link-local address assignment for HA routers
In order to get both [1] and [2] fixed, we set
`net.ipv6.conf.all.addr_gen_mode=1` in HA router namespace to
prevent auto-assigning link-local address(lla) to the interfaces.
We don't need lla auto-assignment as keepalived manages them.
With this change, we will have link-local addresses only on active
router, which will prevent 'dadfailed' and MLD packets will not be
sent from standby router.

Previously we also reverted [3] to always keep qg-* interface up on both
active&standby router's instance, no matter if keepalived is started or
not.
Without link-local address assigned, backup router's instance won't
send any packets, so I see no reason to keep qg-* interface down.

[1] https://bugs.launchpad.net/neutron/+bug/1952907
[2] https://bugs.launchpad.net/neutron/+bug/1859832
[3] https://review.opendev.org/c/openstack/neutron/+/834162

Closes-Bug: #1952907
Related-Bug: #1859832
Depends-On: https://review.opendev.org/c/openstack/neutron/+/834162
Change-Id: I306f14aa6b7e8bb69a81f441be337bc1a584d3b2
2022-05-24 11:30:02 +00:00
Slawek Kaplonski f3e836217c [Functional] Add extra logs to the L3 HA router transitions
This patch adds extra logs to log current and expected ha state of the
routers on various fake agents during the functional tests, log on which
agent router is "primary" and where it is "backup" and when failover of
the router should happen.
Those logs should allow us to better understand what happens during those
functional tests and why some of them fail from time to time.

Related-Bug: #1956958
Change-Id: I567036470b7256275f67e8ef3546ed780c81b5ae
2022-01-11 09:59:37 +01:00
Slawek Kaplonski c6a6c5ae12 [Functional] Fix expected number of the enqueue_state_change calls
In the HA router's keepalived state change monitor tests, it was
expected that enqueue_state_change method will be called 3 or 4 times.
But after some changes in the keepalived_state_change monitor which were
done some time ago, it may be now that it will be called just 2 or 3
times:
- 2 when initial status will be "primary" and it will be just
  transition to "backup",
- 3 when initial status will be "backup", then it will transition to
  "primary" and finally to "backup" again.

To reflect those 2 possibilities, the test was changed to expect
2 or 3 calls and to check only that the last 2 are always a transition
to "primary" and then to "backup".
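
The relaxed assertion described above could look roughly like the following sketch. The helper last_two_states and the (router_id, state) call signature are assumptions made for illustration; the real test in the Neutron tree may be structured differently.

```python
from unittest import mock


def last_two_states(enqueue_mock):
    """Return the states of the last two calls, accepting 2 or 3 calls in total."""
    # Assumes each call looks like enqueue(router_id, state).
    states = [call.args[1] for call in enqueue_mock.call_args_list]
    if len(states) not in (2, 3):
        raise AssertionError(f"expected 2 or 3 calls, got {len(states)}")
    return states[-2:]


enqueue = mock.Mock()
for state in ("backup", "primary", "backup"):
    enqueue("router-1", state)
print(last_two_states(enqueue))
```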

Additionally this patch adds some extra logging in that test so it will
be easier to check what was going on in that test.

Closes-Bug: #1954751
Change-Id: Ib5de7e65839f52c35c43801969e3f0c16dead5bb
2021-12-16 22:25:27 +01:00
elajkat 3a9a17ad82 Add functional tests for ECMP routes
Adding ECMP routes, where the destination is the same for multiple
routes available since [0].
One DVR test was merged with it, and this patch aims to cover other
l3 setups: legacy and HA.

[0]: https://review.opendev.org/c/openstack/neutron/+/743661

Change-Id: I78b8ea85548e11074191f46ef560728c4dc89405
2021-11-08 13:47:51 +01:00
Slawek Kaplonski b8ef8e722a [Functional] Wait for the initial state of ha router before test
In functional tests of the HA and DVR HA routers, when e.g.
failover is tested, we should always wait for routers to be in the
expected initial state (primary or backup) before router failover
will actually be done.
Without that, we may hit a race condition where the router's initial state
is enqueued but not yet processed and the state is then changed, so
no action is performed by the L3 agent and the test may fail.
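
A minimal sketch of waiting for the initial state before triggering a failover is shown below. The polling helper wait_until and the FakeRouter class are stand-ins written for this example using only the standard library; they are not the helpers used by the Neutron functional tests.

```python
import time


def wait_until(predicate, timeout=5.0, sleep=0.01):
    """Poll predicate() until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout
    while not predicate():
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met before timeout")
        time.sleep(sleep)


class FakeRouter:
    """Hypothetical stand-in for an HA router object exposing its state."""

    def __init__(self):
        self.ha_state = None


router = FakeRouter()
router.ha_state = "backup"  # in a real test the agent would set this
wait_until(lambda: router.ha_state == "backup")
# Only now would the test go on to trigger the failover.
```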

Closes-Bug: #1939507
Change-Id: Ibd8f78fc822b04965c6a79b57b13be364934f64f
2021-09-09 15:10:40 +02:00
Slawek Kaplonski b5dd6efdca [DVR] Fix update of the MTU in the SNAT namespace
When network's MTU is changed, Neutron sends notification about it
to the L3 agents. In case of DVR (and DVR HA) MTU is then changed in
the qrouter- namespace but should be also changed on snat interfaces
in the snat namespace. And that part was missing.

This patch adds a special implementation of the internal_network_updated()
method in the DvrEdgeRouter class so that it can also configure the MTU
in the snat namespace.

This patch also removed passing attributes "interface_name",
"ip_cidrs" and "mtu" to the internal_network_updated() method and adds
"port" dict to be passed there. It is consistent with what is already
done in e.g. internal_network_added() method and "port" dict is actually
necessary to configure properly snat internal interface in the snat
namespace.

This patch adds also functional test of update network mtu for all types
of routers as there was no such test at all.

There is additional issue with DVR-HA which isn't fixed with that patch
and for which follow up will be proposed. Because of that this patch is
marked as partial fix for the related bug.

Related-Bug: #1933273
Change-Id: I200acfcaaae7f056ea9a563fead9ff2de8464971
2021-08-30 16:49:01 +02:00
Brian Haley 055036ba2b Improve terminology in the Neutron tree
There is no real reason we should be using some of the
terms we do, they're outdated, and we're behind other
open-source projects in this respect. Let's switch to
using more inclusive terms in all possible places.

Change-Id: I99913107e803384b34cbd5ca588451b1cf64d594
2020-08-19 16:47:53 -04:00
Brian Haley 8126f88894 Complete removal of dependency on the "mock" package
Now that we are python3 only, we should move to using the built
in version of mock that supports all of our testing needs and
remove the dependency on the "mock" package.

This completes removal of all references to "import mock",
changing to "from unittest import mock" in fullstack and
functional tests.

Added a hacking check to enforce it in future patches.

Change-Id: Ifcaf1c21bea0ec3c35278e49cecc90a101a82113
2020-05-01 12:05:34 -04:00
Rodolfo Alonso Hernandez 33fb446add Deprecate config option "ovs_integration_bridge"
Remove this duplicated option and rely only on OVS.integration_bridge.

NOTE: other projects are still using it; first we need to deprecate it
      in those projects.

Change-Id: I4e826c8b9fa764b1820adacc3427934dc393c0bc
Related-Bug: #1856152
2020-02-17 11:02:16 +00:00
Brian Haley 555238da69 Start using oslo_utils.netutils.is_ipv6_enabled()
Seems that is_enabled_and_bind_by_default() from
neutron.common.ipv6_utils was copied directly into
oslo_utils.netutils, so start using it instead.

Trivialfix

Change-Id: I00fa441e7a20fcd1115485bb8ab75750e6a8cf07
2019-10-16 21:44:56 -04:00
Rodolfo Alonso Hernandez 3f022a193f Delay HA router transition from "backup" to "master"
As described in the bug, when a HA router transitions from "master" to
"backup", "keepalived" processes will set the virtual IP in all other
HA routers. Each HA router will then advertise it and "keepalived" will
decide, according to a trivial algorithm (higher interface IP), which
one should be "master". At this point, the other "keepalived" processes
running in the other servers will remove the HA router virtual IP
assigned an instant before.

To avoid transitioning some routers from "backup" to "master" and then
to "backup" in a very short period, this patch delays the "backup" to
"master" transition, waiting for a possible new "backup" state. If the
L3 agent receives a new "backup" HA state during the waiting period
(set to the HA VRRP advert time, 2 seconds by default), it does
nothing.
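
The delay logic described above can be sketched with a timer, as below. This is an illustration under assumptions, not the L3 agent's implementation: the class DelayedTransition, its notify() method and the use of threading.Timer are all choices made for this example.

```python
import threading
import time


class DelayedTransition:
    """Apply "master" only after a quiet period; a "backup" in between cancels it."""

    def __init__(self, delay, apply_state):
        self._delay = delay
        self._apply = apply_state
        self._lock = threading.Lock()
        self._pending = False
        self._generation = 0

    def notify(self, state):
        with self._lock:
            if state == "master":
                # Schedule the transition instead of applying it right away.
                self._generation += 1
                self._pending = True
                timer = threading.Timer(
                    self._delay, self._fire, args=(self._generation,))
                timer.daemon = True
                timer.start()
                return
            if self._pending:
                # "backup" arrived during the waiting period: do nothing.
                self._pending = False
                self._generation += 1
                return
        self._apply(state)

    def _fire(self, generation):
        with self._lock:
            if not self._pending or generation != self._generation:
                return
            self._pending = False
        self._apply("master")


applied = []
transition = DelayedTransition(0.05, applied.append)
transition.notify("master")
transition.notify("backup")  # cancels the pending "master"
time.sleep(0.3)
print(applied)
```

The generation counter is there so that a timer which has already started firing cannot apply a transition that was cancelled in the meantime.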

Closes-Bug: #1837635

Change-Id: I70037da9cdd0f8448e0af8dd96b4e3f5de5728ad
2019-08-27 16:47:00 +00:00
Slawek Kaplonski 7bb1bbba36 Fix race in test_keepalived_state_change_notification
If the initial keepalived status found in
keepalived_state_change.MonitorDaemon is "master",
test_keepalived_state_change_notification was failing
because there were 4 calls to the mocked enqueue_state_change()
method instead of 3.

This patch changes test to wait until 3 or 4 calls to this
method will be counted and it also changes assertions of
what state should be set on which call.
Before the patch test was expecting that calls are always like:
backup, master, backup
but if there are 4 calls it is like: backup, master, master, backup.
As it doesn't matter if there was one or two calls with "master"
state, test will now assert that the last call is always with
"backup" state.

Change-Id: I78c30ab32ffda37176a9c71348d83e17ab2c972a
Closes-Bug: #1836565
2019-07-15 13:03:30 +02:00
Slawek Kaplonski 66eb1e29f3 Enable ipv6_forwarding in HA router's namespace
When HA router is created in "standby" mode, ipv6 forwarding is
disabled by default in its namespace.
But when router is transitioned to be "master" on node, ipv6
forwarding should be enabled. This was fine for routers with
configured gateway but we somehow missed the case when router doesn't
have gateway configured.
Because of that missing ipv6 forwarding setting in such case, IPv6
W-E traffic between 2 subnets was not working fine in L3 HA case.

This patch fixes it by always configuring ipv6_forwarding on the
"all" interface in router's namespace, even if it doesn't have
gateway configured.

Change-Id: I8b1b2b426f7a26a4b2407a83f9bf29dd6e9ba7b0
Closes-Bug: #1818224
2019-03-15 14:30:23 +00:00
Sławek Kapłoński b09b44608b Remove deprecated 'external_network_bridge' option
This option is deprecated and marked to be deleted in Ocata. So
as we are now in Stein development cycle I think that it's good time
to remove it.

Change-Id: I07474713206c218710544ad98c08caaa37dbf53a
2019-03-09 22:07:38 +00:00
Brian Haley b847cd02c5 Enable 'all' IPv6 forwarding knob correctly
When the external gateway is plugged and we enable IPv6
forwarding on it, make sure the 'all' sysctl knob is also
enabled, else IPv6 packets will not be forwarded.  This
seems to only affect HA routers that default to disabling
this 'all' knob on creation.

Also, when we are removing all the IPv6 addresses from a
HA router internal interface, set 'accept_ra' to zero so
it doesn't accidentally auto-configure an address.  Set
it back to one when adding them back.

Re-homed newly added _wait_until_ipv6_forwarding_has_state()
accordingly.

Closes-bug: #1787919

Change-Id: Ia1f311ee31d1479089685367a97bf13cf170b342
2018-11-15 14:59:49 -05:00
Slawek Kaplonski 916e774516 Wait to ipv6 forwarding be really changed by L3 agent
In the test_ha_router_namespace_has_ipv6_forwarding_disabled
functional test it may happen that L3 agent has not yet changed ipv6
forwarding and the test fails because it checks that only once, just
after router state is changed to master.

This patch fixes that race by adding wait for 60 seconds to
ipv6 forwarding change.

Change-Id: I85a602561ebe9b7ab135913af49a3f010b09f196
Closes-Bug: #1801930
2018-11-06 15:40:34 +01:00
Zuul 12a838d94e Merge "Use constant IP_VERSION_4/6 in functional tests" 2018-09-03 22:15:06 +00:00
Slawek Kaplonski deb84b6756 Skip L3 ha functional IPv6 test if IPv6 is disabled
Test test_ha_router_namespace_has_ipv6_forwarding_disabled requires
IPv6 to be enabled on host on which tests are run.
It is now skipped if IPv6 is disabled.

TrivialFix

Change-Id: Idaa93bf112b2d29a56e3e4a2535c7d3fa0c49db4
2018-08-28 20:15:12 +00:00
Hongbin Lu 46913a69fd Use constant IP_VERSION_4/6 in functional tests
Change-Id: I62b5a37508838a42b03a39de02660b8cafc08c41
2018-08-27 21:45:56 +00:00
Slawek Kaplonski 3e9e2a5b4b Disable IPv6 forwarding by default on HA routers
In case of HA routers IPv6 forwarding is now disabled by default and
then enabled only on master node.
Before this patch it was done in opposite way, so forwarding was
enabled by default and then disabled on backup nodes.
When forwarding was enabled/disabled for qg- port, MLDv2 packets were
sent and that might lead to temporary packet loss as packets to
FIP were sent to this backup node instead of master one.

Related-Bug: #1771841

Change-Id: Ia6b772e91c1f94612ca29d7082eca999372e60d6
2018-05-31 20:19:21 +00:00
Daniel Alvarez 507cdde679 Fix: set IPv6 forwarding when there's an IPv6 gw
Currently when there is an IPv6 gateway set, IPv6 forwarding
configuration is skipped. This patch fixes it and also adds test
coverage to check the values of `accept_ra` and `forwarding` with
and without an IPv6 gateway.

Related-bug: #1667756

Change-Id: I0297aa6d0afeb56a8c69be41424d4b70509377d2
2017-05-05 19:59:37 +00:00
fpxie d2976d46d0 Replace six.iteritems with dict.items(Part-1)
According to https://wiki.openstack.org/wiki/Python3,
we should now avoid using six.iteritems and replace
it with dict.items.

Change-Id: I8753e80b34c0f86cf70aebc3bcbd3392ee933f62
Partial-Bug: #1680761
2017-04-17 14:08:47 +08:00
Daniel Alvarez 676a3ebe2f Disable RA and IPv6 forwarding on backup HA routers
Neutron does not disable ipv6 forwarding for HA routers and it's
enabled by default in all router namespaces. For ipv6, this means
that it will automatically join the following groups:

* link-local all-routers multicast group (ff02::2)
* interface-local all routers multicast group (ff01::2)
* site-local all routers multicast group (ff05::2)

As a side effect it will answer to multicast listener queries, thus
causing external switch to learn its MAC address and disrupting traffic
to the master instance.

This patch will enable ipv6 forwarding on the gateway interface only
for master instances and disable it otherwise to fix the issue.

Also, the accept_ra procfs entry was enabled under certain
circumstances but it wasn't disabled otherwise. This patch will
disable RA on the gateway interface for non master instances.

Closes-Bug: #1667756

Change-Id: I9bc890b43f750cad68fc67f4c79f1426c3506863
2017-03-17 15:06:08 +00:00
Lubosz Kosnik 185d6cbc64 Add support for Keepalived VRRP health check
Adds functionality to generate bash script which verifies health of current
keepalived instance by pinging all available and configured GW addresses.
This functionality supports IPv4 and IPv6 by detecting needed ping version
using netaddr.
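
The script generation described above might be sketched as follows. This is a hedged illustration only: the function build_health_check_script and the exact ping options are assumptions, and the standard-library ipaddress module is used here in place of netaddr to detect the IP version.

```python
import ipaddress


def build_health_check_script(gateway_ips):
    """Return a bash script that exits non-zero if any gateway is unreachable."""
    lines = ["#!/bin/bash -eu"]
    for ip in gateway_ips:
        # Pick the ping binary from the address family of each gateway.
        version = ipaddress.ip_address(ip).version
        ping = "ping" if version == 4 else "ping6"
        lines.append(f"{ping} -c 1 -w 1 {ip} 1>/dev/null || exit 1")
    return "\n".join(lines) + "\n"


script = build_health_check_script(["192.0.2.1", "2001:db8::1"])
print(script)
```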

DocImpact:
Added a new parameter to 'l3_agent.ini' named
'ha_vrrp_health_check_interval' which is by default set to 0 (disabled).
Values > 0 designate health check functionality should be enabled.
Requires allowed ICMP ECHO_REQUEST because that is disabled by default.

Co-Authored-By: Artur Korzeniewski <artur.korzeniewski@intel.com>
Change-Id: Ib4d0691f432830357ea3f113036719645bc59a62
Closes-Bug: #1365461
2017-02-02 12:09:12 -05:00
Artur Korzeniewski 8d3f216e24 Addressing L3 HA keepalived failures in functional tests
Current testing of Keepalived was not configuring the connectivity
between 2 agent namespaces.
Added setting up the veth pair.

Also, bridges external qg-<id> and internal qr-<id> were removed
from agent1 namespace and moved to agent2 namespace, because they had
the same name.
Added patching the qg and qr bridges name creation to be different for
functional tests.


Change-Id: I82b3218091da4feb39a9e820d0e54639ae27c97d
Closes-Bug: #1580648
2017-01-27 11:19:16 +01:00
Robert Li e48caf6736 Add agent object in router info
agent object is a member of some sub classes of RouterInfo such as
HaRouter. This changeset makes it a member of the RouterInfo class
itself.

Prior to the change, the agent object has been passed in to some
methods of RouterInfo that require it to access the agent object's
member information. The bugs in question require calling the PD object
that is a member of the agent object to get IPs that need to be
preserved in the gateway port. Without this change, signatures of the
methods external_gateway_added() and external_gateway_updated() have
to be modified to pass in the agent object. And any subclass of
RouterInfo that overwrites or uses the methods must be changed as
well. It doesn't seem to make sense considering the subclass such as
HaRouter has the agent object as one of its members already.

The changeset fixes the bugs by preserving the LLAs for prefix
delegation when the gateway port is being updated.

Closes-Bug: #1639042
Closes-Bug: #1640271

Change-Id: I61c6128ed1973deb8440c54234e77a66987d7e28
2016-12-19 17:08:23 -05:00
Jenkins a39cbe3473 Merge "Add L3 HA test with linux bridge" 2016-11-15 09:38:47 +00:00
Kevin Benton 16ae4190a7 Add L3 HA test with linux bridge
Adds an HA test case using the linux bridge interface and
a test to recreate the router to ensure all cleanup was
done appropriately on teardown.

Related-Bug: #1629159
Change-Id: I80b70b848ea64d5f996055edc4bfb0ec1f4ae548
2016-11-15 00:06:17 +00:00
Jakub Libosvar 4fdd89e94f l3-ha: Send gratuitous ARP when new floating IP is added
We rely on keepalived to send gratuitous ARPs when floating IP is added.
Older versions of keepalived up to 1.2.20 (exclusive) contain bug [1] where
keepalived does not send GARP on receiving SIGHUP. Unfortunately, newer
versions containing the fix are not packaged yet for some distributions
like RHEL or CentOS or Ubuntu Xenial, so this patch adds a workaround for
such distributions until new packages are available.

The patch also sets net.ipv4.ip_nonlocal_bind kernel parameter to 0 for
Snat and HA router namespaces in order to avoid sending gratuitous ARPs
for IP addresses that are not bound to the interface anymore - possibly
because of failover or removal. Note that kernels < 3.19 contain a bug
where this knob is missing. In case it attempts to set the parameter and
it's missing on the system, it doesn't set the knob in root
namespace like it's done for fip namespaces, but only issues a warning
message.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1391553

Change-Id: Ieab53624dc34dc687a0e8eebd84778f7fc95dd77
Closes-bug: 1639315
2016-11-10 15:27:21 -05:00
Dustin Lundquist 4f0caa0ece Rename ipv6_utils.is_enabled()
IPv6 utils is_enabled() doesn't actually determine if IPv6 is enabled on
the host. It checks if /proc/sys/net/ipv6/conf/default/disable_ipv6 is
present and is set to 0. This kernel configuration option controls if
the kernel will automatically assign IPv6 link-local addresses to newly
created network interfaces when their link state changes to up. The
existence of this /proc file does indicate that the Linux kernel has
the ipv6 module loaded or ipv6 was compiled in. Having this /proc file
set to one does not indicate IPv6 is not available on the system, just
that newly created interfaces will inherit this configuration and will
not have IPv6 addresses bound to them unless the administrator changes
the interfaces specific /proc/sys/net/ipv6/conf/$IFACE/disable_ipv6
configuration.

This check was added to Neutron so it could operate with distributions
which didn't load the ipv6 kernel module, preventing errors when
attempting to make IPv6 specific configurations in the iptables firewall
driver and the L3 agent. Removing it would break existing deployments.

Renaming this function to provide clarity for complex conditions tested
by this function. In fact it is a good security practice to set this
default disable_ipv6 option to 1, and explicitly enable IPv6 by setting
disable_ipv6=0 on individual interfaces which the administrator intends
to bind IPv6 addresses on. This establishes parity with IPv4 behavior
where interfaces are not active in an address family until the
administrator explicitly configures them to be active in that address
family. This practice does not currently work as expected with
Neutron, since setting /proc/sys/net/ipv6/conf/default/disable_ipv6 to 1
unexpectedly disables creating IPv6 security group rules leaving
instances completely exposed via IPv6 regardless of security group
rules.

Change-Id: I844b992240a5db642766ec9c04e3b5fcab8e2e23
2016-10-26 02:11:57 -07:00
AKamyshnikova 25f5912cf8 Check for ha port to become ACTIVE
After reboot (restart of l3 and l2 agents) of the node, routers
can be processed by l3 agent before openvswitch agent sets up
appropriate ha ports. This change adds a notification for l3 agent
that ha port becomes ACTIVE and keepalived can be enabled.

Closes-bug: #1597461

Co-Authored-By: venkata anil <anilvenkata@redhat.com>

Change-Id: Iedad1ccae45005efaaa74d5571df04197757d07a
2016-08-29 19:31:22 +03:00
Gary Kotton 9f09f27c5d Fix deprecation warnings
Remove deprecation warnings for various constants
and exceptions that have moved to neutron_lib.

Fix miscellaneous other deprecations.

Uses constants instead of l3_constants when importing
neutron-lib constants.

Co-Authored By: Henry Gessau <gessau@gmail.com>
Co-Authored By: Gary Kotton <gkotton@vmware.com>

Change-Id: Ib0e8ff5c3e23677c1009241a1818cbc8a3430c38
2016-08-26 22:16:06 -04:00
Hong Hui Xiao 347778a9f9 Enable ra on gateway when add gateway to HA router
Currently 'accept_ra' is only configured when the HA router changes
state to 'master'. If the router gateway is added after the router state
change, the gateway port in the 'master' HA router will not be configured.

This patch will configure 'accept_ra' for the 'master' HA router.

Change-Id: Ice1f3e6e48597ea8c366e243c2ca1771ea9b7770
Closes-bug: #1585246
2016-08-17 05:50:53 -06:00
Jakub Libosvar a626172706 Move wait_until_true to neutron.common.utils
We need to be able to re-use wait_until_true in tempest scenario tests.
There is tempest bug https://bugs.launchpad.net/tempest/+bug/1592345
that prevents us from doing so.

Also wait_until_true is not linux specific so it makes more sense to
have it in common package.

Change-Id: Ib8b0e51dbd9edaa58391774d428a737836dfdf77
2016-06-27 11:40:11 +00:00
Henry Gessau 4148a347b3 Use constants from neutron-lib
With this we enable the deprecation warnings by default.

Related-Blueprint: neutron-lib

Change-Id: I5b9e53751dd164010e5bbeb15f534ac0fe2a5105
2016-04-23 21:23:56 -04:00
Jenkins 7300520266 Merge "Remove floatingip address only when the address has been configured" 2016-02-19 12:07:06 +00:00
Hong Hui Xiao 77f84fa935 Remove floatingip address only when the address has been configured
For HA router, adding a floatingip will add a vip to keepalived, then
keepalived will add the ip address to port. So there is a time window
that the qg device in qrouter namespace will not have the address of
floatingip.

If the user deletes the floatingip during that time window, the ip command will fail,
because it tries to remove an ip address that doesn't exist on the qg device.

The fix here is to check if the ip address is on the qg device, before
removing the ip address. A functional test is added to address the issue.
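
The check described above can be illustrated with the small sketch below. The function remove_floating_ip and its arguments are hypothetical names chosen for this example; the real code operates on the qg device through Neutron's own helpers.

```python
def remove_floating_ip(device_cidrs, fip_cidr, delete_addr):
    """Delete fip_cidr only if it is currently configured on the device."""
    if fip_cidr not in device_cidrs:
        # keepalived has not added the address yet: nothing to remove.
        return False
    delete_addr(fip_cidr)
    return True


deleted = []
configured = {"203.0.113.10/32"}
print(remove_floating_ip(configured, "203.0.113.10/32", deleted.append))
print(remove_floating_ip(configured, "203.0.113.99/32", deleted.append))
```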

Change-Id: I72996d9a77f5f17b4d7a19d5be20b3f97f90dcba
Closes-bug: #1533904
2016-01-19 22:17:20 -05:00
sridhargaddam 30e048d222 Fix floatingip status for an HA router
When we associate a floatingip in an HA router setup, it is properly
associated. However, when we check the status of the Floating ip, it
is shown as empty.  The L3 agent ha_router implementation never sets
the status of the floating IP when it's added.  This behavior blocks
tempest scenario tests that use FIPs when run against a deployment
with L3 agent HA routers.  This fix sets the FIP status to active
when the FIP is added by an HaRouter instance.

Co-Authored-By: Assaf Muller <amuller@redhat.com>
Closes-Bug: #1449049
Change-Id: I7a3de36b64132e483a927ce9fed30777e84df96a
2016-01-13 15:29:10 -05:00
Manjeet Singh Bhatia ee83e0be06 Reorganize and improve l3_agent functional tests
This will reorganize the l3_agent functional tests.

Change-Id: I10008fd7216c8de47162657e280b7245c38f5154
Closes-Bug: #1501150
2015-11-20 16:24:29 +00:00