When neutron-server is down, ovs-agent waits for it to become available
during agent startup. When neutron-server is up but cannot reach the
DB, it is just as unable to serve requests. However, ovs-agent
reacted differently to this failure. With this patch it reacts the same
way and delays its startup until neutron-server is up together with its
DB.
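The startup behavior described above can be sketched roughly as follows.
This is only an illustration: `check_server`, `ServerNotReady` and the
retry interval are hypothetical names, not the agent's actual code.

```python
import time


class ServerNotReady(Exception):
    """Hypothetical error: neutron-server is down or cannot reach its DB."""


def wait_for_server(check_server, retry_interval=0.0, sleep=time.sleep):
    """Block until check_server() succeeds; return the number of attempts."""
    attempts = 0
    while True:
        attempts += 1
        try:
            check_server()
            return attempts
        except ServerNotReady:
            # "Server down" and "DB unreachable" are handled the same way:
            # wait a little and try again before continuing startup.
            sleep(retry_interval)
```

The `sleep` parameter is injectable only so the sketch can be exercised
without real delays.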
Change-Id: Ia55e82540aedc236e9b016bb58047d0b437eeb99
Closes-Bug: #2025341
CacheBackedPluginApi.get_device_details tries to fetch net info
from remote_resource_cache to fill the qos_policy_id for the
device.
Fix the case when the cache has no entry for the network.
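A minimal sketch of the guarded lookup, assuming plain dicts stand in for
the cached network objects (the function and argument names are
hypothetical):

```python
def network_qos_policy_id(network_id, networks_by_id):
    """Return the network's QoS policy ID, or None on a cache miss."""
    network = networks_by_id.get(network_id)
    if network is None:
        # The cache has no entry for this network; do not fail.
        return None
    return network.get("qos_policy_id")
```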
Change-Id: I1612dc4515ec0f02fbaf05d60b753485885d8c84
Closes-bug: #1999391
Previously when a neutron-openvswitch-agent was stopped it left
behind the following fanout queues in rabbitmq:
neutron-vo-Network-1.0_fanout_someuuid
neutron-vo-Port-1.1_fanout_someuuid
neutron-vo-SecurityGroup-1.0_fanout_someuuid
neutron-vo-SecurityGroupRule-1.0_fanout_someuuid
neutron-vo-SubPort-1.0_fanout_someuuid
neutron-vo-Subnet-1.0_fanout_someuuid
neutron-vo-Trunk-1.1_fanout_someuuid
In this change we ensure that all but the SubPort and Trunk fanout
queues are correctly removed from rabbitmq by cleanly stopping the
RemoteResourceCache when the agent stops.
Partial-Bug: #1586731
Change-Id: I672f9414a1a8ed91e259e9379ca707a70f6b4467
Added parameter "status" to port dictionary information returned from
the agent RPC.
Change-Id: I34383aa02c7515b3bcd2faa8b4617e730ce3e6c9
Closes-Bug: #1942234
The chunk size neutron uses to divide
large RPC call data sets was fixed at 100 resources.
In "big" networks even this number can
still be too large and cause service timeouts.
The number can now be decreased in the config.
The default value is now 20.
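As an illustration of the chunking idea, here is a small sketch that
splits a data set into steps of at most 20 items. The helper name
`split_in_steps` is hypothetical and is not taken from the patch.

```python
def split_in_steps(items, step=20):
    """Yield consecutive slices of at most `step` items."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    items = list(items)
    for start in range(0, len(items), step):
        yield items[start:start + step]
```

Each slice would then be sent in its own RPC call, so a smaller step
trades more round trips for a lower chance of any single call timing out.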
Closes-Bug: 1938202
Change-Id: Idf545ad31398ded460b6c2ae1675dd5e9ae71440
SR-IOV agent can handle ports with same MAC address (located in
different networks). The agent can retrieve, from the system, the
MAC address and the PCI slot; because the PCI slot is unique per
port in the same host, this parameter is used to match with the
Neutron port ID stored in the database (published via RPC).
RPC API bumped to version 1.9.
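The matching logic can be sketched like this, assuming devices and ports
are represented as plain dicts with hypothetical keys `mac_address`,
`pci_slot` and `port_id`:

```python
def match_port_id(local_device, rpc_ports):
    """Return the Neutron port ID whose MAC and PCI slot both match."""
    for port in rpc_ports:
        if (port["mac_address"] == local_device["mac_address"]
                and port["pci_slot"] == local_device["pci_slot"]):
            return port["port_id"]
    # No port published via RPC matches this local device.
    return None
```

With two ports sharing a MAC address, the PCI slot is what tells them
apart on a given host.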
Closes-Bug: #1791159
Change-Id: Id8c3e0485bebc55c778ecaadaabca1c28ec56205
The goal of this patch is to avoid the connection disruption during
the live-migration using OVS. Since [1], when a port is migrated,
both the source and the destination hosts are added to the profile
binding information. Initially, the source host binding is activated
and the destination is deactivated.
When the port was created in the destination host (created by Nova),
the port was not configured because the binding was not activated.
The binding (that means, all the OpenFlow rules) was done when Nova
sent the port activation. That happened when the VM was already
running in the destination host. If the OVS agent was under load, the
port was bound seconds after the port activation.
Instead, this patch enables the OpenFlow rule creation in the
destination host when the port is created.
Another problem is the "neutron-vif-plugged" events sent by Neutron
to Nova to inform about the port binding. Nova is expecting one single
event informing about the destination port binding. At this moment,
Nova considers the port is bound and ready to transmit data.
Several triggers were firing this event unexpectedly:
- When the port binding was updated, the port is set to down and then
up again, forcing this event.
- When the port binding was updated, first the binding is deleted and
then updated with the new information. That triggers the source
host to set the port down and then up again, sending the event.
This patch removes those events, sending the "neutron-vif-plugged"
event only when the port is bound to the destination host (and as
commented before, this is happening now regardless of the binding
activation status).
This feature depends on [2]. If this Nova patch is not in place, Nova
will never plug the port in the destination host and Neutron won't be
able to send the vif-plugged event to Nova to finish the
live-migration process.
Because Neutron cannot query Nova to know if this patch is in
place, a new temporary configuration option has been created to enable
this feature. The default value will be "False"; that means Neutron
will behave as before.
[1]https://bugs.launchpad.net/neutron/+bug/1580880
[2]https://review.opendev.org/c/openstack/nova/+/767368
Closes-Bug: #1901707
Change-Id: Iee323943ac66e566e5a5e92de1861832e86fc7fc
Adds agent side code to enable the OVS agent to receive address groups
from the push notifications cache.
Change-Id: I1f27eccb2a69c553631fdc12d34e9025925844c5
Partial-Bug: #1592028
Currently the ovs agent calls update_device_list with the
agent_restarted flag set only on the first loop iteration. Then the
server knows to send the l2pop flooding entries for the network to
the agent. But when a compute node with many instances on many
networks reboots, it takes time to readd all the active devices and
some may be readded after the first loop iteration. Then the server
can fail to send the flooding entries which means there will be no
flood_to_tuns flow and broadcasts like dhcp will fail.
This patch fixes that by renaming the agent_restarted flag to
refresh_tunnels and setting it if the agent has not received the
flooding entries for the network.
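A rough sketch of how the flag could be derived, assuming the agent keeps
a set of networks for which it has already received flooding entries (the
function name and the set are hypothetical):

```python
def build_update_args(network_id, networks_with_flood_entries):
    """Return hypothetical keyword arguments for update_device_list."""
    # Keep asking the server for the flooding entries until they arrive,
    # not only on the first loop iteration.
    refresh_tunnels = network_id not in networks_with_flood_entries
    return {"refresh_tunnels": refresh_tunnels}
```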
Change-Id: I607aa8fa399e72b037fd068ad4f02b6210e57e91
Closes-Bug: #1853613
Currently, if one wants to add any other resources (including custom
objects), there is no simple way to do so, since the list of defined
resource types is hardcoded in the create_cache_for_l2_agent function,
which is called in __init__ of CacheBackedPluginApi. Even if we
derive from it, we must call super() in the descendant, otherwise we end
up with an uninitialized PluginApi part. But if we do call super(), we
end up with the hardcoded resources only, and creating a new remote
resource cache object makes a new set of listeners while the
listeners for the old object still exist, which may cause memory leaks.
The RemoteResourceWatcher class has only initializers for those
listeners, and there is no obvious way to stop/clean them.
In this patch we propose to move the create_cache_for_l2_agent function
to the CacheBackedPluginApi class, and to make the resource list a class
attribute, so that it can be easily modified.
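The proposed design can be illustrated with a stand-in class. Everything
here (class names, the `RESOURCE_TYPES` attribute, the dict used as a
cache) is a simplified assumption and not the real CacheBackedPluginApi.

```python
class CacheBackedApiSketch:
    """Illustrative stand-in, not the real CacheBackedPluginApi."""

    # Class attribute, so subclasses can extend it without touching __init__.
    RESOURCE_TYPES = ["Port", "Network", "Subnet"]

    def __init__(self):
        self.remote_resource_cache = self.create_cache_for_l2_agent()

    def create_cache_for_l2_agent(self):
        # A dict keyed by resource type stands in for the real cache object.
        return {rtype: {} for rtype in self.RESOURCE_TYPES}


class CustomApi(CacheBackedApiSketch):
    # Only the resource list changes; the cache is still created once.
    RESOURCE_TYPES = CacheBackedApiSketch.RESOURCE_TYPES + ["CustomThing"]
```

Because the cache is built once from the class attribute, a subclass does
not need to create a second cache with a second set of listeners.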
Change-Id: Ia65ecaf7b48926b74505226a5922b85e2cb593a6
Closes-Bug: 1837529
Ovs-agent scans and processes the ports during the
first rpc_loop, and a local port update notification
is sent out. This causes these ports to
be processed again in the ovs-agent's next (second)
rpc_loop.
This patch passes the restart flag (iteration num 0)
to the local port_update call trace. After this patch,
the local port_update notification will be ignored in
the first RPC loop.
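A minimal sketch of the filtering rule, assuming the iteration number and
an "is local" marker are available to the handler (both parameter names
are hypothetical):

```python
def should_process_port_update(iter_num, is_local_notification):
    """Skip locally generated port_update notifications on the first loop."""
    agent_restarted = iter_num == 0
    return not (agent_restarted and is_local_notification)
```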
Related-Bug: #1813703
Change-Id: Ic5bf718cfd056f805741892a91a8d45f7a6e0db3
For the Smart NIC vNIC type, neutron should mimic nova-compute,
which plugs the port into the ovs bridge.
Extend the Neutron OVS mechanism driver and Neutron OVS Agent to bind
the Neutron port for the baremetal host with Smart NIC. This will allow
the Neutron OVS Agent to configure the pipeline of the OVS running on
the Smart NIC and leverage the pipeline features such as: VXLAN,
Security Groups and ARP Responder.
Story: #2003346
Closes-Bug: #1785608
Change-Id: I6d520d3bac2e9ceb30b5b6197c6eb0f958cc3659
Added the ability to change the segmentation ID of a network
with ports bound to OVS agent. The rules, both in the integration
bridge and the physical bridge, to convert the internal VLAN tag
and the external segmentation ID (external VLAN tag) are deleted
and created again with the new value. The traffic from the tenant
networks will be tagged then with the new segmentation ID.
Added get network details agent RPC call to retrieve the information
of the updated network.
Partial-Bug: #1806052
Change-Id: I69f6f3ef717c3ed40218099b1f389afd3d39bd62
All of the externally consumed variables from neutron.common.constants
now live in neutron-lib. This patch removes neutron.common.constants
and switches all uses over to lib.
NeutronLibImpact
Depends-On: https://review.openstack.org/#/c/647836/
Change-Id: I3c2f28ecd18996a1cee1ae3af399166defe9da87
Ovs-agent can process ports in large sets, and all of
these ports then need their DB status or attributes updated.
But neutron-server is centralized: it may be busy with
other work, or the database processing itself may be
time-consuming. Because of this, it sometimes returns
an RPC timeout exception to ovs-agent, and a full sync
is triggered in the next rpc loop. The restart time
becomes longer and longer.
Add a default step for updating ports to reduce
the probability of RPC timeouts.
Related-Bug: #1813703
Related-Bug: #1813704
Related-Bug: #1813706
Related-Bug: #1813707
Change-Id: Ie37f4a4869969e235ce16b73cdfcbdc98626823e
Ovs-agent can take a very long time to handle a large number
of ports. In that case the ovs-agent status report may exceed
the configured timeout, and some flow update operations
will not be triggered. This results in flow loss during agent
restart, especially for host-to-host vxlan tunnel flows.
This fix lets the ovs-agent explicitly indicate, in the first rpc
loop, that its status is restarted. l2pop is then required
to update the fdb entries.
Closes-Bug: #1813703
Closes-Bug: #1813714
Closes-Bug: #1813715
Closes-Bug: #1794991
Closes-Bug: #1799178
Change-Id: I8edc2deb509216add1fb21e1893f1c17dda80961
The neutron.common.rpc module has been in neutron-lib for a while now and
neutron is shimmed to use neutron-lib already.
This patch removes neutron.common.rpc and switches the code over to use
neutron-lib's implementation where needed.
NeutronLibImpact
Change-Id: I733f07a8c4a2af071b3467bd710290eee11a4f4c
The ovs agent first installs some basic drop flows for the
physical bridge mappings during the init procedure. If the message
queue is not connected, or all neutron-servers are down, the real
traffic flows will never be refreshed. This brings the data plane
down if the tenant network and the provider network share the
physical NICs.
This patch adds an RPC check during L2 agent init. When the
ovs-agent restarts, if the MQ is OK and a neutron-server is
available, it proceeds to the next step. Otherwise an RPC timeout is
raised, the L2 agent fails to start, and the physical bridge mapping
drop flows are not installed. The original flows are not replaced, so
traffic keeps working properly.
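The ordering described above can be sketched as follows. `RpcTimeout`,
`rpc_check` and `install_drop_flows` are hypothetical stand-ins used only
to show that the check runs before any flows are touched.

```python
class RpcTimeout(Exception):
    """Hypothetical stand-in for an RPC timeout error."""


def init_l2_agent(rpc_check, install_drop_flows):
    """Install the drop flows only after the RPC check has succeeded."""
    # If rpc_check() raises, startup fails here and the existing flows
    # on the physical bridges are left untouched.
    rpc_check()
    install_drop_flows()
```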
Closes-Bug: #1803919
Change-Id: Ie15cf625b3710eaf290d6aafecb3f65df664b9df
The common rpc and exceptions were rehomed into
neutron-lib with [1]. This patch shims those rehomed
modules in neutron to switch over to neutron-lib's
versions under the covers.
To do so:
- The rpc and common exceptions are changed to
reference their counterpart in neutron-lib effectively
swapping the impl over to neutron-lib.
- The fake_notifier is removed from neutron and lib's
version is used instead.
- The rpc tests are removed; they live in lib now.
- A few unit test related changes are required
including changing mock.patch to mock.patch.object,
changing the mock checks for a few UTs as they don't
quite work the same with the shim in place.
- Using the RPC fixture from neutron-lib rather than
that setup in neutron's base test class.
With this shim in place, consumers are effectively using
neutron-lib's RPC plumbing and thus we can move consumers
over to neutron-lib's version at will. Once all
consumers are moved over we can come back and remove
the RPC logic from neutron and follow-up with a consumption
patch.
NeutronLibImpact
[1] https://review.openstack.org/#/c/319328/
Change-Id: I87685be8764a152ac24366f13e190de9d4f6f8d8
The get_port_binding_by_status_and_host function was rehomed into
neutron-lib with https://review.openstack.org/#/c/580786/ and released
in neutron-lib 1.18.0. This patch consumes the function by removing it
in neutron and replacing all uses with lib's version.
NeutronLibImpact
Change-Id: Iac3246d0eb59709749e0b7e857091447d11a0133
As part of the implementation of multiple port bindings [1], add binding
activation support to the OVS agent. This will enable the execution in
OVS agents of the complete sequence of steps outlined in [1] during an
instance migration:
1) Create inactive port bindings for destination host
2) Migrate the instance to the destination host and plug its VIFs
3) Activate the port bindings in the destination host
4) Delete the port bindings for the source host
[1] https://review.openstack.org/#/c/309416/
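The four steps above can be modelled as a sequence of binding states. This
is a simplified sketch: the status strings and the assumption that
activating the destination binding deactivates the source binding are
illustrative, not taken from the patch.

```python
ACTIVE = "ACTIVE"
INACTIVE = "INACTIVE"


def migration_steps(source, destination):
    """Yield the host -> binding status mapping after each of the steps."""
    bindings = {source: ACTIVE}
    # 1) Create an inactive binding for the destination host.
    bindings[destination] = INACTIVE
    yield dict(bindings)
    # 2) The instance is migrated and its VIFs plugged; bindings unchanged.
    yield dict(bindings)
    # 3) Activate the destination binding (assumed to deactivate the source).
    bindings[destination] = ACTIVE
    bindings[source] = INACTIVE
    yield dict(bindings)
    # 4) Delete the binding for the source host.
    del bindings[source]
    yield dict(bindings)
```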
Change-Id: Iabca39364ec95633b2a8891fc295b3ada5f4f5e0
Partial-Bug: #1580880
CacheBackedPluginApi enables the neutron server to push resource
updates to L2 agents. The agents retrieve the resource updates locally
from the cache implemented by it. CacheBackedPluginApi also emulates
server notifications to the agents, such as port_delete or port_update,
based on the updated data received by the cache. This commit adds code
to CacheBackedPluginApi to implement a binding_deactivate notification
from the server to the agents.
Change-Id: I023ccbd405bc41379007d87a9c1051970aa8d603
Partial-Bug: #1580880
As a consequence of implementing multiple bindings for ports, [1] turned
the following attributes into lists:
- 'port_binding' in the SQLAlchemy Port model
- 'binding' in the Port OVO
This patch pluralizes their names to 'port_bindings' and 'bindings'
respectively.
[1] Ie31d4e27e3f55edfe334c4029ca9ed685e684c39
Change-Id: I4ebe47cf9d51a700310aad8dcccc82fea3f00a16
Functionality is added to the ML2 plugin to handle multiple port
bindings.
Co-Authored-By: Anindita Das <anindita.das@intel.com>
Co-Authored-By: Miguel Lavalle <miguel.lavalle@huawei.com>
Partial-Bug: #1580880
Change-Id: Ie31d4e27e3f55edfe334c4029ca9ed685e684c39
The neutron.common.rpc.create_connection function is just a reference to
the Connection class constructor. This patch removes create_connection
and replaces all uses with Connection instead.
NeutronLibImpact
Change-Id: I2f4b24ba732be47fc9911be1e24406fb1ffe821e
The neutron.common.topics module was rehomed into neutron-lib with
commit Ie88b84949cbd55a4e7ad06341aab77b286cdc485
This patch consumes it by removing the rehomed module from neutron
and using the module from neutron-lib instead.
NeutronLibImpact
Change-Id: Ia4a4604c259ce862597de80c6deeb3d408bf0e95
Neutron lib contains the latest callbacks and thus this patch removes
the callbacks package from neutron entirely.
NeutronLibImpact
Change-Id: I14e45fd5d2d3c816bb39f8ace56f7be460bac0d6
Change the rpc_api.rst file path from doc/source/devref/rpc_api.rst
to doc/source/contributor/internals/rpc_api.rst, because that is
where the rpc_api.rst file is now located.
Closes-Bug: #1722072
Change-Id: Ic243aab9e3428bfec69db61a94b4129cd768e233
A push notifications change added segment information
to the get_device_details() RPC call, but sometimes the
segment information is not present, resulting in an
AttributeError. Just treat the lack of segment info
as if the port was unbound, since the port is probably
in the process of being removed.
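A sketch of the defensive handling, assuming dicts stand in for the port
and segment objects and that an unbound port is reported with only its
device name (the exact return shape is an assumption):

```python
def device_details(device, port, segment):
    """Build a device-details dict; a missing segment means 'unbound'."""
    if segment is None:
        # Assumption: an unbound port is reported with the device name only.
        return {"device": device}
    return {
        "device": device,
        "port_id": port["id"],
        "network_type": segment["network_type"],
        "segmentation_id": segment["segmentation_id"],
        "physical_network": segment["physical_network"],
    }
```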
Change-Id: I631c6e1f02fa07eed330c99a96aa66d747784f37
Closes-bug: #1714068
This gets rid of the bulk_flood call and adjusts
the cache to query the server on demand as it's asked
for things it hasn't been asked for before.
Change-Id: I58f3d4dd9bcf545fd9dca8cd42673d705db06c10
Partially-Implements: blueprint push-notifications
The push notification logic always assumed the port security object
would exist but it is not present on the port when the extension is
disabled. This defaults it to true like the server side code.[1]
1. c430e9b8d4/neutron/plugins/ml2/rpc.py (L142)
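The defaulting behavior amounts to something like the following, where a
plain dict stands in for the port and the key name is an assumption:

```python
def port_security_enabled(port):
    """Default to True when the port security extension is disabled."""
    # With the extension disabled the attribute is simply absent.
    return port.get("port_security_enabled", True)
```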
Change-Id: Ice89ad9dd486ad5fcac534ef5f7d8aae3b6b0f97
Closes-Bug: #1694420
Replace the calls to the OVSPluginAPI info retrieval functions
with reads directly from the push notification cache.
Since we now depend on the cache for the source of truth, the
'port_update'/'port_delete'/'network_update' handlers are configured
to be called whenever the cache receives a corresponding resource update.
The OVS agent will no longer subscribe to topic notifications for ports
or networks from the legacy notification API.
Partially-Implements: blueprint push-notifications
Change-Id: Ib2234ec1f5d328649c6bb1c3fe07799d3e351f48
If an agent tries to report_state to the neutron-server and it fails
because of a timeout (raising oslo_messaging.MessagingTimeout), then
there is an exponential back-off effect, which causes the
seemingly-simple report_state RPC call to take 60 seconds, then 120,
then 240 and so on. This can happen if all the controllers are
restarted simultaneously a number of times, as the bug report describes.
Since the feature was intended for heavy RPC calls (like get_routers())
and not for light calls such as report_state, it's safe to reduce the
timeout to a constant 60 seconds interval.
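The difference between the two behaviors can be shown with a little
arithmetic. The helper names are hypothetical; the doubling matches the
60, 120, 240 progression described above.

```python
def backoff_timeouts(base, attempts):
    """Timeouts that double after every failed attempt."""
    return [base * 2 ** i for i in range(attempts)]


def constant_timeouts(base, attempts):
    """Timeouts that stay at the base value for every attempt."""
    return [base] * attempts
```

After three consecutive failures the back-off variant has waited
60 + 120 + 240 = 420 seconds in total, against 180 seconds for the
constant one.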
Closes-Bug: #1606827
Change-Id: I15aeea9f8265b859bb1a8ee933b8b2ce1e64b695
Python 3 deprecated the logger.warn method, see:
https://docs.python.org/3/library/logging.html#logging.warning
so we prefer to use warning to avoid DeprecationWarning.
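For illustration, the preferred call looks like this. The logger name, the
message and the small list-backed handler are made up for the example.

```python
import logging

LOG = logging.getLogger("sketch")
records = []


class ListHandler(logging.Handler):
    """Collect emitted records in a list so they can be inspected."""

    def emit(self, record):
        records.append(record)


LOG.addHandler(ListHandler())
# Use warning(); warn() is the deprecated spelling.
LOG.warning("agent %s is slow to respond", "ovs-agent-1")
```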
Closes-Bugs: #1529913
Change-Id: Icc01ce5fbd10880440cf75a2e0833394783464a0
Co-Authored-By: Gary Kotton <gkotton@vmware.com>
- This does NOT break other projects that rely on neutron.i18n,
as this change includes a debtcollector shim to maintain those
older entry points, until they can migrate.
- Also updates _i18n.py to the latest pattern defined by oslo_i18n
- Guidance and template are from the reference:
http://docs.openstack.org/developer/oslo.i18n/usage.html
Partially-Closes-Bug: #1519493
Change-Id: I1aa3a5fd837d9156da4643a367013c869ed8bf9d
It has not been used since we switched to oslo.messaging (Juno), so
it's time to deprecate and eventually remove it.
Closes-Bug: #1506492
Change-Id: I57b0229c2b6028796cd10bbbfc9b166cf8a6dab0