When using routed provider networks, the condition bypasses
dhcp_agents_per_network. As a result, in an environment with 3 agents
and dhcp_agents_per_network=2, where a given network is already
handled by 2 agents, restarting the third agent makes it start
handling the network as well, so 3 agents end up handling the
network.
Closes-bug: #2058908
Signed-off-by: Sahid Orentino Ferdjaoui <sahid.ferdjaoui@industrialdiscipline.com>
Change-Id: Ia05a879b0ed88172694bd6bffc6f7eb0d36bb6b0
This patch covers an edge case that can happen when the number
of DHCP agents ("dhcp_agents_per_network") or L3 agents
("max_l3_agents_per_router") has been reduced and more agents are
assigned than the current limit. If the user then removes an agent
assignment from an L3 router or a DHCP agent, the bindings with the
lower binding indexes may be removed first.
Now the method ``get_vacant_binding_index`` calculates the number of
agents bound and the number required. If a new one is needed, the
method returns first the lower binding indexes not used.
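The index calculation described above can be sketched as follows. This is a minimal illustration and not the actual Neutron method: the function name vacant_binding_indexes, its parameters and the lowest index of 1 are assumptions made for the example.

```python
def vacant_binding_indexes(bound_indexes, num_required, lowest_index=1):
    """Return the lowest unused binding indexes for the agents still needed.

    bound_indexes: indexes already taken by bound agents.
    num_required: configured number of agents (for example the value of
        dhcp_agents_per_network). Hypothetical parameter names.
    """
    missing = num_required - len(bound_indexes)
    if missing <= 0:
        # Enough (or too many) agents are already bound: nothing to hand out.
        return []
    used = set(bound_indexes)
    vacant = []
    candidate = lowest_index
    while len(vacant) < missing:
        if candidate not in used:
            vacant.append(candidate)
        candidate += 1
    return vacant
```

With agents bound at indexes 2 and 3 and a limit of 2, no index is returned. After the limit was reduced and only index 3 remains bound, the freed lower index 1 is handed out first.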
Closes-Bug: #2006496
Change-Id: I25145c088ffdca47acfcb7add02b1a4a615e4612
Running with a stricter .pylintrc generates a lot of
C0330 warnings (hanging/continued indentation). Fix
the ones in neutron/scheduler.
Trivialfix
Change-Id: Ic131cd49307471982b8bd09f823809261d246c0d
When retrieving a vacant L3 agent binding index, if
"is_manual_scheduling" is set, the method "get_vacant_binding_index"
should always return a valid binding index. If the existing binding
indexes are sequentially aligned, the method will return a new one
on top; if there is a gap in the binding indexes list, the first
free index will be returned.
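A minimal sketch of the manual scheduling case described above, where a valid index must always be returned. The helper name forced_binding_index and the lowest index of 1 are assumptions for illustration, not the Neutron implementation.

```python
def forced_binding_index(bound_indexes, lowest_index=1):
    """Return the first free binding index, regardless of any agent limit.

    If the bound indexes are contiguous, the result is one past the highest;
    if there is a gap, the first free index in the gap is returned.
    """
    used = set(bound_indexes)
    candidate = lowest_index
    while candidate in used:
        candidate += 1
    return candidate
```

For example, bound indexes 1, 2, 3 would yield 4, while bound indexes 1 and 3 would yield 2.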
Closes-Bug: #1884906
Change-Id: I0a89bca0734d3e735fb357e488f85589e81d709f
When retrieving a vacant DHCP agent binding index, if
"force_scheduling" is set, the method should return a valid binding
index. If the existing binding indexes are sequentially aligned,
the method will return a new one on top; if there is a gap in the
binding indexes list, the first free index will be returned.
Change-Id: Ib4cbeb7c9f0c1e959ad53570320610925ff3d88f
Closes-Bug: #1883513
The patch proposes adding a new binding_index to the
NetworkDhcpAgentBinding table, with an additional Unique
Constraint that enforces a single <network_id, binding_index>
per network.
1. When a network is triggered to be auto-scheduled to DHCP
agents, the number of DHCP agents is constrained by
dhcp_agents_per_network in neutron.conf. This prevents
too many DHCP agents from being scheduled in the first place.
2. If users manually schedule a network to specific DHCP
agents, the binding_index increments to show the number of
DHCP agents hosting this network.
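The effect of the unique constraint on <network_id, binding_index> can be illustrated with a small, self-contained example. It uses the standard library sqlite3 module instead of the Neutron models, and the simplified schema and the bind() helper are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, illustrative schema: one row per (network, agent) binding,
# with at most one row per (network, binding_index).
conn.execute(
    """
    CREATE TABLE networkdhcpagentbindings (
        network_id TEXT NOT NULL,
        dhcp_agent_id TEXT NOT NULL,
        binding_index INTEGER NOT NULL,
        PRIMARY KEY (network_id, dhcp_agent_id),
        UNIQUE (network_id, binding_index)
    )
    """
)
conn.commit()


def bind(network_id, agent_id, binding_index):
    """Try to bind an agent; return False if the index is already taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO networkdhcpagentbindings VALUES (?, ?, ?)",
                (network_id, agent_id, binding_index),
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    bind("net-1", "agent-a", 1),
    bind("net-1", "agent-b", 2),
    bind("net-1", "agent-c", 2),  # index 2 is already used on net-1
    bind("net-2", "agent-c", 1),  # the same index on another network is fine
]
```

Two schedulers racing for the same slot on a network would collide on the constraint, so only one of them can take a given binding index.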
Co-Authored-By: Oleg Bondarev <obondarev@mirantis.com>
Change-Id: I1bc3f8b69c337f7c1cf7375509a0da61def9baf1
Closes-Bug: #1535554
Removed E125 (continuation line does not distinguish itself
from next logical line) from the ignore list and fixed all
the indentation issues. Didn't think it was going to be
close to 100 files when I started.
Change-Id: I0a6f5efec4b7d8d3632dd9dbb43e0ab58af9dff3
Added log information about the hostable DHCP agents per network.
Added log information when an agent is declared "down" because the
heartbeat is too old.
Those log messages are included according to [1].
[1] https://bugs.launchpad.net/neutron/+bug/1799555/comments/8
Change-Id: I7b30499a86e5ae5de49813dfca01788fd5fce139
Related-Bug: #1799555
These new debug lines can be helpful to resolve the mentioned bug.
Sometimes the DHCP agent does not reschedule and the log does not
contain enough information to debug the problem.
Spotted error during fullstack tests:
Traceback (most recent call last):
  File "/opt/stack/new/neutron/neutron/tests/base.py", line 151, in func
    return f(self, *args, **kwargs)
  File "/opt/stack/new/neutron/neutron/tests/fullstack/test_dhcp_agent.py", line 168, in test_reschedule_network_on_new_agent
    self._wait_until_network_rescheduled(network_dhcp_agents[0])
  File "/opt/stack/new/neutron/neutron/tests/fullstack/test_dhcp_agent.py", line 137, in _wait_until_network_rescheduled
    common_utils.wait_until_true(_agent_rescheduled)
  File "/opt/stack/new/neutron/neutron/common/utils.py", line 646, in wait_until_true
    raise WaitTimeout(_("Timed out after %d seconds") % timeout)
neutron.common.utils.WaitTimeout: Timed out after 60 seconds
Change-Id: I794e737c30f597519fba873e36f26b82b6f2c799
Related-Bug: #1799555
Reduces E128 warnings by ~260 to just ~900,
no way we're getting rid of all of them at once (or ever).
Files under neutron/tests still have a ton of E128 warnings.
Change-Id: I9137150ccf129bf443e33428267cd4bc9c323b54
Co-Authored-By: Akihiro Motoki <amotoki@gmail.com>
Fixed all pep8 E265 errors and changed tox.ini to no longer
ignore them. Also removed an N536 comment missed from a
previous change.
Change-Id: Ie6db8406c3b884c95b2a54a7598ea83476b8dba1
Agent object has been merged [1].
This patch uses Agent object in agents_db and test_agents_db.
We also introduce a new function (get_agents_object) and keep
the old function (get_agents_db) for backward compatibility.
[1] https://review.openstack.org/#/c/297887/
Co-Authored-By: Nguyen Phuong An <AnNP@vn.fujitsu.com>
Change-Id: I4c4283cb1aa05d52dca00cc249e094ea7d55b1d3
Partially-Implements: blueprint adopt-oslo-versioned-objects-for-db
Commit I1d4ded9959c05c65b04b118b1c31b8e6db652e67 rehomed the
availability zone extension's API definition into neutron-lib. This
patch consumes it, removing the rehomed logic in neutron and switching
over to lib's version of it.
NeutronLibImpact
Change-Id: I761381de0d6e26a0380386700e7921b824991669
Since Pike, log messages should not be translated.
This patch removes calls to i18n _LC, _LI, _LE, _LW from
logging logic throughout the code. Translators definition
from neutron._i18n is removed as well.
This patch also removes log translation verification from
ignore directive in tox.ini.
Change-Id: If9aa76fcf121c0e61a7c08088006c5873faee56e
In reviews we usually check import grouping but it is boring.
By using flake8-import-order plugin, we can avoid this.
It enforces loose checking so it sounds good to use it.
This flake8 plugin is already used in tempest.
Note that flake8-import-order version is pinned to avoid unexpected
breakage of pep8 job.
Setup for unit tests of hacking rules is tweaked to disable
flake8-import-order checks. This extension assumes an actual file exists
and causes hacking rule unit tests to fail.
Change-Id: Ib51bd97dc4394ef2b46d4dbb7fb36a9aa9f8fe3d
This patch introduces and integrates OVO for SegmentHostMapping.
Change-Id: I99598cf6fa4aefe7d3faee5cb0867d8ea1fff5c2
Partially-Implements: blueprint adopt-oslo-versioned-objects-for-db
Change I7be4ce2513e49e6da46a7bdffb8538613f0be7c7 relocated the Agent
model (and a few other functions), but some references to old
functions/model still remain. These cause a considerable amount of
warnings when running unit tests and the code itself.
Change-Id: Id026cae75bfa56b1023f8a1c4e5db750cf0bff5f
Partial-Bug: #1597913
This patch set is for breaking the circular dependency between
Agent/AgentVersionedObject.
See https://review.openstack.org/#/c/297887/ for details.
Change-Id: I7be4ce2513e49e6da46a7bdffb8538613f0be7c7
Partial-Bug: #1597913
Co-Authored-By: Victor Morales <victor.morales@intel.com>
Co-Authored-By: Sindhu Devale <sindhu.devale@intel.com>
In the l3_agent_scheduler.py file, some functions accept both the
'plugin' and 'context' argument. However, some functions expect
'context, plugin' (context first) and some functions expect
'plugin, context' (context last). I'm a real nit-picker and this
bothered me for a while, so here's a fix :)
Since the base scheduler class expects 'plugin, context', some functions
couldn't be changed to accept the other variation. Instead, context will
always be last. Also, modified unit tests to make sure they test.
This also fixes an odd-ordering in one of the dhcp scheduler's private
functions.
Change-Id: I825e108170a29d5ecaa0f0883bb0a171b5fdb895
In routed provider networks, a network can be divided by segments and
each segment will have its own DHCP agent. AutoScheduling for networks
exists. This patchset is for enabling auto scheduling for routed
networks.
Change-Id: I7d9f01dbd6d9c216474ee47c968919c1826787f7
Partially-implements: blueprint routed-networks
Since a network can now be broken up by segments, each segment will
need to have its own DHCP Agent. This maintains backwards
compatibility when a network does not have a segment. However,
once a segment is created on a network, a dhcp agent should be
scheduled per segment with a dhcp enabled subnet.
The scheduling happens by filtering the candidate dhcp agents by
the hosts that are bound to that segment.
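The filtering step described above can be sketched like this. The function name, the dictionary-based agent representation and the host names are hypothetical; the real scheduler works on Neutron agent objects and segment host mappings.

```python
def filter_agents_by_segment_hosts(candidate_agents, segment_hosts):
    """Keep only the DHCP agents running on hosts bound to the segment."""
    hosts = set(segment_hosts)
    return [agent for agent in candidate_agents if agent["host"] in hosts]


# Illustrative data only.
agents = [
    {"id": "agent-a", "host": "rack1-node1"},
    {"id": "agent-b", "host": "rack2-node1"},
    {"id": "agent-c", "host": "rack1-node2"},
]
rack1_candidates = filter_agents_by_segment_hosts(
    agents, ["rack1-node1", "rack1-node2"])
```

Only agent-a and agent-c remain as candidates for the rack1 segment, so a segment is never served by an agent on a host that is not bound to it.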
Partially-Implements: blueprint routed-networks
Change-Id: If73211978e14b7533a1213cfb8c2c155a408f19e
There is a bug in pep8 [1]: when 'select' is used, it omits all default
checks and runs only those specified by 'select'. We have been hit by
this issue since I2d26534230ffe5d01aa0aab6ec902f81cfba774d was merged,
which led to almost no static checks in the pep8 job.
Also note that the off_by_default decorator has no effect for now because
the factory in hacking is triggered after the ignored checks are collected.
There will be a follow-up patch for that in order to make pep8 do
its job quickly.
[1] https://github.com/PyCQA/pycodestyle/issues/390
Related-Bug: 1594756
Change-Id: I8e27f40908e1bb4307cc7c893169a9d99f3433c4
This patch set is for breaking the circular dependency between
Agent/NetworkDhcpAgentBinding. The goal of this is to implement
Agent OVO
Partially-Implements: blueprint adopt-oslo-versioned-objects-for-db
Change-Id: I3f2a8bcc6f8644e94e377dc916fba6743cb230bc
Problem details can be found in the bug description. AZ here stands
for availability zone.
heapq.heapify() orders tuples by their first element. If
the first elements are equal, the second element is used. When
creating a new network, no AZ holds the network yet. So the
AZ handling list is actually ordered by name. As a consequence,
when creating a new network, a certain AZ will always be used,
for example 'nova1' in ['nova1', 'nova2', 'nova3'].
This patch sorts resource_hostable_agents first, so that
the AZ that holds the dhcp-agent with the least load comes first. It
then uses min() to get the first AZ.
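The tie-breaking problem and the idea behind the fix can be reproduced in isolation. The tuple layout, the pick_az() helper and the agent dictionaries below are assumptions made for this sketch; they are not the scheduler's actual data structures.

```python
import heapq

# (number of agents in the AZ already hosting the network, AZ name).
# For a brand new network every count is 0.
az_heap = [(0, "nova3"), (0, "nova1"), (0, "nova2")]
heapq.heapify(az_heap)
# Equal first elements fall back to comparing the names.
first_az = heapq.heappop(az_heap)[1]


def pick_az(hostable_agents, hosted_count_by_az):
    """Pick the AZ with the fewest hosting agents, preferring low load."""
    ordered_azs = []
    # Sort agents by load so AZs appear in least-loaded-agent order.
    for agent in sorted(hostable_agents, key=lambda a: a["load"]):
        if agent["az"] not in ordered_azs:
            ordered_azs.append(agent["az"])
    # min() returns the first minimal item, so ties go to the AZ whose
    # agent has the least load instead of the alphabetically first AZ.
    return min(ordered_azs, key=lambda az: hosted_count_by_az.get(az, 0))


agents = [
    {"az": "nova1", "load": 5},
    {"az": "nova2", "load": 1},
    {"az": "nova3", "load": 3},
]
chosen_for_new_network = pick_az(agents, {})
```

Here first_az is 'nova1' purely because of name ordering, while pick_az() selects 'nova2', the AZ holding the least loaded agent.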
Change-Id: Id57b4656337ab8f1bd2dc3e8bd679a23778a2dea
Closes-Bug: #1522677
The bind() method of dhcp_agent_scheduler did not use a context
manager for its transaction. This is not correct: if an exception
other than the DBDuplicateEntry that is caught is raised, the current
transaction will hang.
The current change is a simple refactoring; the code works as it did
previously. Existing unit tests
* test_schedule_bind_network_multi_agent_fail_one,
* test_auto_schedule_network(Network already scheduled)
cover possible issues.
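Why a context manager matters for the transaction can be shown with a small stand-in example. It uses the standard library sqlite3 connection as the context manager rather than the Neutron DB session, and the table and the bind() helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bindings (network_id TEXT, agent_id TEXT)")
conn.commit()


def bind(network_id, agent_id, fail_after_insert=False):
    # The connection context manager commits when the block succeeds and
    # rolls back when any exception escapes it, expected or not.
    with conn:
        conn.execute(
            "INSERT INTO bindings VALUES (?, ?)", (network_id, agent_id))
        if fail_after_insert:
            raise RuntimeError("unexpected failure")


def count_bindings():
    return conn.execute("SELECT COUNT(*) FROM bindings").fetchone()[0]


try:
    bind("net-1", "agent-a", fail_after_insert=True)
except RuntimeError:
    pass
rows_after_failure = count_bindings()

bind("net-1", "agent-a")
rows_after_success = count_bindings()
```

The failed call leaves no row and no open transaction behind, which is the property the refactoring is after.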
Closes-bug: #1557546
Change-Id: Ieb77738e065b997e0ab65afdc1f3bdbfb8f13fef
Python 3 deprecated the logger.warn method, see:
https://docs.python.org/3/library/logging.html#logging.warning
so we prefer to use warning to avoid DeprecationWarning.
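A short illustration of the preferred spelling, using only the standard library. The logger name, the message and the list-based handler are made up for the example.

```python
import logging

records = []


class ListHandler(logging.Handler):
    """Collect emitted records so the example can inspect them."""

    def emit(self, record):
        records.append(record)


LOG = logging.getLogger("scheduler_example")
LOG.addHandler(ListHandler())
LOG.setLevel(logging.WARNING)

# Deprecated spelling: LOG.warn("No DHCP agents available for network %s", ...)
LOG.warning("No DHCP agents available for network %s", "net-1")
```

Both spellings log at the WARNING level; only warning() avoids the DeprecationWarning.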
Closes-Bugs: #1529913
Change-Id: Icc01ce5fbd10880440cf75a2e0833394783464a0
Co-Authored-By: Gary Kotton <gkotton@vmware.com>
Currently the neutron DHCP scheduler assumes that every server running
a dhcp-agent can reach every network. Typically the scheduler can
wrongly schedule a vlan network on a dhcp-agent that has no reachability
to the network it's supposed to serve (ex: network's physical_network
not supported).
Typically such a use case can happen if:
* physical_networks are dedicated to a specific service and we don't
want to mix dnsmasqs related to different services (for
isolation/configuration purpose),
* physical_networks are dedicated to a specific rack (see example
diagram http://i.imgur.com/NTBxRxk.png), the rack interconnection can
be handled outside of neutron or inside when routed-networks will be
supported.
This change makes the DHCP scheduler network reachability aware by
querying plugin's filter_hosts_with_network_access method.
This change provides an implementation for ML2 plugin delegating host
filtering to its mechanism drivers: it aggregates the filtering done by
each mechanism or disables filtering if any mechanism doesn't overload
default mechanism implementation[1] (for backward compatibility with
out-of-tree mechanisms). Every in-tree mechanism overloads the default
implementation: OVS/LB/SRIOV mechanisms use their agent mapping to filter
hosts, l2pop/test/logger ones return an empty set (they provide no "L2
capability").
This change provides a default implementation[2] for other plugins
filtering nothing (for backward compatibility), they can overload it to
provide their own implementation.
Such host filtering has some limitations if a dhcp-agent is on a host
handled by multiple l2 mechanisms with one mechanism claiming network
reachability but not the one handling dhcp-agent ports. Indeed the
host is able to reach the network but not dhcp-agent ports! Such
limitation will be handled in a follow-up change using host+vif_type
filtering.
[1] neutron.plugin.ml2.driver_api.MechanismDriver.\
filter_hosts_with_network_access
[2] neutron.db.agents_db.AgentDbMixin.filter_hosts_with_network_access
Closes-Bug: #1478100
Co-Authored-By: Cedric Brandily <zzelle@gmail.com>
Change-Id: I0501d47404c8adbec4bccb84ac5980e045da68b3
- This does NOT break other projects that rely on neutron.i18n,
as this change includes a debtcollector shim to maintain those
older entry points, until they can migrate.
- Also updates _i18n.py to the latest pattern defined by oslo_i18n
- Guidance and template are from the reference:
http://docs.openstack.org/developer/oslo.i18n/usage.html
Partially-Closes-Bug: #1519493
Change-Id: I1aa3a5fd837d9156da4643a367013c869ed8bf9d
This patch adds the availability_zone support for network.
APIImpact
DocImpact
Change-Id: I9259d9679c74d3b3658771290e920a7896631e62
Co-Authored-By: IWAMOTO Toshihiro <iwamoto@valinux.co.jp>
Partially-implements: blueprint add-availability-zone
The availability zone aware dhcp scheduler uses the hosted agents list
to schedule dhcp agents to the proper availability zone.
The scheduler can avoid scheduling to an availability zone where
agents already host the network, which should be distributed for HA.
Change-Id: Ib01f6d852956dc1e89b83321d657c0baf829e28a
Partially-implements: blueprint add-availability-zone
This fix makes the DHCP scheduler look at all available agents, not
only the active ones.
This helps the db module properly remove dead agents when rebinding.
Change-Id: I8534ddfae299724c05641183c2fe4c1c98c614e8
Closes-Bug: 1388698
In this blueprint, we also propose to write a generic scheduler
framework which can be used to schedule a new resource on
selected least loaded agents.
Currently dhcp_load_type is fetched from the neutron.conf file
and the corresponding load is obtained from the agent state report.
The obtained load is stored in the "load" column of the
agents table.
During scheduling, agents are selected by sorting all
the agents of a particular type on the load column.
Example: dhcp_load_type = networks
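The selection by load can be sketched as follows. The function name and the dictionary-based agents are assumptions for illustration; the "load" key stands in for the "load" column of the agents table.

```python
def select_least_loaded(agents, count):
    """Return the `count` agents with the smallest reported load."""
    return sorted(agents, key=lambda agent: agent["load"])[:count]


# Illustrative data: with dhcp_load_type = networks, the load would be the
# number of networks each agent reported in its state report.
dhcp_agents = [
    {"id": "agent-a", "load": 12},
    {"id": "agent-b", "load": 3},
    {"id": "agent-c", "load": 7},
]
chosen = select_least_loaded(dhcp_agents, 2)
```

With these values the new resource would go to agent-b and agent-c.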
DocImpact
Implements: blueprint dhcpservice-loadbalancing
Change-Id: I5ec8adf0c4336f885d603662223caa7694708876
Author: Shivakumar M <shiva.kum.m@hp.com>
Co-Authored-By: Praveen Kumar SM <praveen-sm.kumar@hp.com>
Co-Authored-By: Benjamin GRASSART <benjamin.grassart@thalesgroup.com>
Co-Authored-By: Sourabh Patwardhan <sopatwar@cisco.com>
It's mostly a matter of changing imports to a new location.
Non-obvious changes needed:
* pass overwrite= argument to oslo_context since oslo.log reads context
from its thread local store and not local.store from incubator
* don't store context at local.store now that there is no code that
would consume it
* LOG.deprecated() -> versionutils.report_deprecated_feature()
* dropped LOG.audit check from hacking rule since now the method does
not exist
* WritableLogger is now located in oslo_log.loggers
Dropped log module from the tree. Also dropped local module that is now
of no use (and obsolete, as per oslo team).
Added versionutils back to openstack-common.conf since now we use the
module directly from neutron code and not just as a dependency of some
other oslo-incubator module.
Note: tempest tests are expected to be broken now, so instead of fixing
all the oslo.log related issues for the subtree in this patch, I only
added TODOs with directions for later fix.
Closes-Bug: #1425013
Change-Id: I310e059a815377579de6bb2aa204de168e72571e
In some cases this exception is thrown while accessing Agent
object from logging statement after a transaction was rolled back.
There is a unit test that covers this code path, but the issue
is not reproducible with sqlite.
Just avoid accessing the db object after the session has been closed.
Change-Id: Iff6b72156b08f177bd0c71f6ba93d3bf46c82fa4
Closes-Bug: #1424578
For DHCP agents it is sometimes not enough to check the agent's last heartbeat
time, because during its startup period the agent may fail to send state reports
while it is busy processing networks.
In the rescheduling logic such a DHCP agent is given additional time after start.
The additional time is proportional to the number of networks the agent is hosting.
The same logic needs to be applied to the DHCP agent scheduler to avoid the case
where a starting agent is considered dead and a network gets more hosting
agents than configured.
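The startup grace period described above can be sketched as follows. The function name, the parameter names and both default values (150 seconds for the dead limit, 2 seconds per network) are assumptions chosen for the example, not values taken from this change.

```python
from datetime import datetime, timedelta


def agent_starting_up(started_at, network_count, now,
                      dead_limit_seconds=150, seconds_per_network=2):
    """Return True while a recently started agent should not count as dead.

    The grace period grows with the number of networks the agent hosts,
    since it may be too busy processing them to send state reports.
    """
    additional_time = timedelta(seconds=seconds_per_network * network_count)
    expected_up = started_at + timedelta(seconds=dead_limit_seconds)
    return now < expected_up + additional_time


started = datetime(2015, 1, 1, 12, 0, 0)
five_minutes_later = started + timedelta(seconds=300)
busy_agent_ok = agent_starting_up(started, 100, five_minutes_later)
idle_agent_ok = agent_starting_up(started, 10, five_minutes_later)
```

Under these assumed values, an agent hosting 100 networks is still treated as starting up five minutes after start, while one hosting 10 networks is not, so the scheduler would not add extra agents for the busy agent's networks.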
Change-Id: I0fe6244c7d2ed42e4744351be34f251318322c54
Closes-Bug: #1417708
Oslo project decided to move away from using oslo.* namespace for all their
libraries [1], so we should migrate to new import path.
This patch applies new paths for:
- oslo.config
- oslo.db
- oslo.i18n
- oslo.messaging
- oslo.middleware
- oslo.rootwrap
- oslo.serialization
- oslo.utils
Added hacking check to enforce new import paths for all oslo libraries.
Updated setup.cfg entry points.
We'll cleanup old imports from oslo-incubator modules on demand or
if/when oslo officially deprecates old namespace in one of the next
cycles.
[1]: https://blueprints.launchpad.net/oslo-incubator/+spec/drop-namespace-packages
Depends-On: https://review.openstack.org/#/c/147248/
Depends-On: https://review.openstack.org/#/c/152292/
Depends-On: https://review.openstack.org/#/c/147240/
Closes-Bug: #1409733
Change-Id: If0dce29a0980206ace9866112be529436194d47e
Mostly trivial import changes.
- oslo.i18n no longer provides an install() method to inject _() into
globals(), so removed all calls to it;
- removed Babel from dependencies (it will now be grabbed by oslo.i18n);
- updated tox.ini to ignore import violations for oslo.i18n.
Change-Id: I6623d551f512fb7fe9bf35ee734ed6d4c6cbc287
oslo.db first stable release has been cut and we can start using it
instead of openstack/common/db/* code which is now marked obsolete.
Change-Id: I1ccf896922a5a762d37a1a3b93c56c8b8ae8c085
This exception may happen if API and RPC workers are in different
processes.
Also make minor refactoring of auto_schedule_networks method
to avoid unnecessary db queries.
Add missing unit tests and adjust unit test naming style
Change-Id: I6460744e2cffec0b9f009da071597374d8c027f6
Closes-Bug: #1331456