1200 Commits

Author SHA1 Message Date
Zuul 9540c87b3f Merge "Drop dependency on pytz" 2024-02-28 20:41:41 +00:00
Takashi Kajinami fd1bacf738 Remove deprecated [pod_vif_nested] worker_nodes_subnet
This option was deprecated in 4.0.0[1] in favor of the new subnet"s"
option. The latest release is 9.0.0, so we can assume users have been
given enough time to switch to the new option.

[1] b3814a33d6

Change-Id: Ie86c019bbb560cca9b5a3a77319ed639a2245a2d
2024-02-28 18:25:54 +09:00
Takashi Kajinami 52ae921e28 Drop dependency on pytz
Current usage of pytz can be easily replaced by the built-in datetime
library and this allows reducing dependency on 3rd party libraries.

Change-Id: I74c5b8ebce7600cc5986f48a9874ab1882a49ed4
2024-02-28 11:35:24 +09:00
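The kind of replacement described in commit 52ae921e28 can be sketched with the standard library alone. This is only an illustration, not the code from the commit: the helper names `utc_now` and `parse_utc_timestamp` and the timestamp format are assumptions.

```python
from datetime import datetime, timezone


def utc_now():
    """Return the current time as a timezone-aware UTC datetime.

    Hypothetical helper: datetime.timezone.utc stands in for pytz.utc.
    """
    return datetime.now(timezone.utc)


def parse_utc_timestamp(value):
    """Parse a 'Z'-suffixed timestamp into an aware UTC datetime.

    The '%Y-%m-%dT%H:%M:%SZ' format is an assumption for this sketch.
    """
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)
```

Both helpers return aware datetimes, so they can be compared and subtracted without any third-party timezone library.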
Maysa Macedo 4030f2706a Skip retry of Network Policy event
When attempting to handle a Network Policy while a Namespace
affected by it is still being handled by Kuryr, the Network
Policy event would often be retried. This commit removes the
retry to make sure the Network Policy gets updated only once
the Namespace handling has finished.

Change-Id: I73c9488dca21f73070ca84352e3ba3780ea7298f
2023-04-28 16:25:32 +02:00
Michał Dulko 2141dba99c KuryrPort cleanup: Fix issue of subport not found
It can happen that during the cleanup of KuryrPort when Pod is already
gone we'll fail trying to find the parent port ID. We have a bug that in
this case finalizing of KuryrPort fails.

This commit changes the way we look for the hostIP of the pod to
actually look up a node using the info from KuryrPort CRD. If this fails
(node removed?) we try querying OpenStack API to get this information.
If this fails too, we just don't pass hostIP to mocked Pod.

Change-Id: I72aea5713f90c8df2f5d0269fa83b8fdd5220c59
2023-04-25 12:46:08 +02:00
Maysa Macedo 2b69e039a8 Fix ValueError when Pod has no IP address
In case the Pod has no IP address we shouldn't attempt to
convert it to a Python address. Instead, we should skip that
operation and expect it to be retried later.

Change-Id: I1eb9c2f51fd792405cbb87742645518a00fdc890
2023-04-14 13:53:26 +02:00
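The guard described in commit 2b69e039a8 might look roughly like the sketch below. The function name `pod_ip_or_none` and the dict layout of the Pod are assumptions made for illustration; the actual change may be structured differently.

```python
import ipaddress


def pod_ip_or_none(pod):
    """Return the Pod IP as an ipaddress object, or None if not set yet.

    Hypothetical helper: `pod` is assumed to be a dict shaped like a
    Kubernetes Pod, with the address under status.podIP.
    """
    pod_ip = pod.get('status', {}).get('podIP')
    if not pod_ip:
        # No address yet: skip the conversion so the caller can retry
        # later instead of hitting ValueError in ip_address().
        return None
    return ipaddress.ip_address(pod_ip)
```

A caller would treat `None` as "not ready" and leave the event to be retried.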
Michał Dulko 9cd15b6d37 Revert "Nit: Change from dict to object notation"
This reverts commit feec91cec1. Turns out
this wasn't okay, sub_ports property on Trunk objects is a list of
dictionaries.

This also fixes unit tests to account for that.

Change-Id: I17f217a6f2bfc833019ba407c248564e74b663d2
2023-03-28 15:54:11 +02:00
Michał Dulko feec91cec1 Nit: Change from dict to object notation
Looks like we have one last occurrence of usage of dictionary notation
to access properties of the openstacksdk object. This commit replaces it
with object notation.

Change-Id: I033e6166ecfbccd5e05dba4f7d66422212bc15c9
2023-03-08 09:59:29 +01:00
Stephen Finucane 6f2a8daf36 Remove munch
openstacksdk no longer uses this and we don't need to either. Instead,
create fake versions of the actual resources openstacksdk would return.
This is more realistic and lets us remove munch entirely.

Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Change-Id: I4549340611cf4da74d525e2adaf724c3cb749f57
2023-02-22 12:05:41 +00:00
Michał Dulko ed9f348e87 Fix VIF revert on KuryrPort status update error
There's an argument count mismatch in the release_vif() call made while
reverting port association. This commit fixes that.

Change-Id: I54816e86910d9328d703fd7e7010d95995085cbf
2023-01-17 15:17:55 +00:00
Roman Dobosz ba4cc2b8f0 Use either subnet name or id for Machines.
Currently, we support only subnet id for the primarySubnet field of
OpenShift Machines. Even though it's possible to create objects with
the same name in OpenStack, it is more natural to use names instead of
ids, especially in the OpenShift world. In this patch we introduce
support for using names as well.

Change-Id: Ib21646b4b7cf0e3c07ddef15b3569a0fb4539e8a
2022-12-19 07:57:32 +00:00
Zuul 624a106fe2 Merge "Cleanup KuryrPort when Pod is missing" 2022-09-22 10:46:45 +00:00
Zuul 608dde9d60 Merge "LoadBalancer Members Reconciliation" 2022-09-16 16:24:12 +00:00
Sunday Mgbogu 93ef2f8650 LoadBalancer Members Reconciliation
This patchset implements the members reconciliation to
ensure that members in pools in OpenStack are matching that of
the respective Kubernetes Endpoints.

Implements: blueprint reconcile-openstack-resources-with-k8s
Depends-On: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/854764
Change-Id: Id7df1d13ca80a08e7a4b33949a3584845403d9ba
2022-09-05 10:05:13 +00:00
Michał Dulko 03b98adde2 Cleanup KuryrPort when Pod is missing
We can easily imagine a user frustrated by their pod not getting deleted
and opting to remove the finalizer from the Pod. If the cause of the
deletion delay was the kuryr-controller being down, we end up with an
orphaned KuryrPort. At the moment this causes crashes, which obviously
it shouldn't. Moreover we should figure out how to clean up the Neutron
port if that happens. This commit does so as explained below.

1. KuryrPort on_present() will trigger its deletion when it detects that
   the Pod no longer exists.
2. Turns out security_groups parameter passed to release_vif() was never
   used. I removed it from drivers and got rid of get_security_groups()
   call from on_finalize() as it's no longer necessary.
3. When we cannot get the Pod in KuryrPort on_finalize() we attempt to
   gather info required to clean up the KuryrPort and "mock" a Pod
   object. A precaution is added that any error from release_vif() is
   ignored in that case to make sure failed cleanup is not causing the
   system to go down.

Change-Id: Iaf48296ff28394823f68d58362bcc87d38a2cd42
2022-08-24 17:48:02 +02:00
Zuul 553ca961a6 Merge "Replace abc.abstractproperty with property and abc.abstractmethod" 2022-08-04 12:34:34 +00:00
ljhuang 54178936c4 Replace abc.abstractproperty with property and abc.abstractmethod
Replace abc.abstractproperty with property and abc.abstractmethod,
as abc.abstractproperty has been deprecated since Python 3.3 [1].

[1] https://docs.python.org/3.8/whatsnew/3.3.html?highlight=deprecated#abc

Change-Id: I24f62b02f292ce7a0bc21d4a39fc7787c87098ca
2022-08-03 20:28:07 +08:00
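The replacement named in commit 54178936c4 follows a standard pattern: stack `property` on top of `abc.abstractmethod`. The class names below are hypothetical and only illustrate the pattern.

```python
import abc


class BaseDriver(abc.ABC):
    """Hypothetical base class used only to illustrate the pattern."""

    @property
    @abc.abstractmethod
    def name(self):
        """Subclasses must provide this as a read-only property."""


class NestedDriver(BaseDriver):
    """Concrete subclass that overrides the abstract property."""

    @property
    def name(self):
        return "nested"
```

`property` must be the outermost decorator; with that order, instantiating `BaseDriver` directly still raises `TypeError`, as it did with `abc.abstractproperty`.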
Maysa Macedo 9300fb495f Clean up Neutron Ports by ID
While parallelizing the Ports deletion, the cleanup should handle
a list of Port IDs instead of expecting Port objects. This commit fixes
that by directly calling Neutron port deletion. It also retries the
Ports removal from the Trunk in case an error occurred.

Closes-bug: #1983027
Change-Id: I50fa4e3cdeffba2413dd9c8c327673ba12706570
2022-08-01 12:18:26 +02:00
Zuul cf2cf599d6 Merge "Add more info to Async handler logs" 2022-07-22 17:06:16 +00:00
Zuul e9200beef0 Merge "Error handling improvements" 2022-07-22 16:53:36 +00:00
Zuul cebd7495b5 Merge "Lock pool port creation during prepopulation." 2022-07-22 12:11:43 +00:00
Zuul 3da4cce87e Merge "Remove SR-IOV support" 2022-07-21 13:11:07 +00:00
Roman Dobosz d4310f6a76 Lock pool port creation during prepopulation.
When several pods are created at once with the nested driver, port
population might be triggered multiple times for the same namespace.
Since bulk port creation is involved, this might be hard on Neutron.
The solution is to lock the port creation operation, so that only one
population happens at a time.

Closes-Bug: #1982478
Change-Id: I6e55963374b9c4105bf759c16add286def559491
2022-07-21 10:32:33 +00:00
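The locking idea from commit d4310f6a76 can be sketched with a single lock around the bulk creation. Everything here is an assumption for illustration: the names `populate_pool` and `bulk_create_ports`, the dict used as a pool store, and the use of `threading.Lock` (the real code may use a different concurrency primitive).

```python
import threading

# One global lock: only one pool population may run at a time.
_populate_lock = threading.Lock()


def populate_pool(pools, pool_key, bulk_create_ports):
    """Populate the pool for pool_key at most once.

    `pools` is a plain dict acting as a hypothetical pool store and
    `bulk_create_ports` is a hypothetical callable returning new ports.
    """
    with _populate_lock:
        # Re-check inside the lock: another thread may have populated
        # this pool while we were waiting.
        if pool_key not in pools:
            pools[pool_key] = bulk_create_ports()
        return pools[pool_key]
```

With this shape, concurrent callers for the same key wait on the lock and then reuse the already created ports instead of issuing another bulk request.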
Michał Dulko e478f1a5aa Update minimum openstacksdk version to 0.59.0
This commit bumps openstacksdk to at least 0.59.0 in requirements.txt
and removes the workarounds that we carried to actually support 0.36.0.

Change-Id: I5a5a37b621521cab9d7a8b7634b9392fc70b3c39
2022-07-12 14:42:28 +02:00
Michał Dulko 04d4439606 Remove SR-IOV support
This got decided at the PTG. The code is old, not maintained, not tested
and most likely doesn't work anymore. Moreover it gave us a hard
dependency on grpcio and protobuf, which is fairly problematic in Python
and gave us all sorts of headaches.

Change-Id: I0c8c91cdd3e1284e7a3c1e9fe04b4c0fbbde7e45
2022-06-29 12:49:37 +02:00
Maysa Macedo 20707b0330 Fix unbound router_id variable while creating event
If there was any SDK exception while adding Subnet to the Router
e.g. 504 Gateway Time-out, no router_id would be returned causing
the Kuryr controller to restart when trying to create an event with
the unbound variable router_id. This commit fixes the issue by
removing the variable.

Change-Id: Ib26fce6ccdc83c61821b54297a388d0acd2da2c7
2022-06-28 12:03:00 +02:00
Maysa Macedo da7a4b59a4 Fix Ports len calculation on Networks cleanup
A generator has no length, so we must convert it to a list
first.

Change-Id: Ia979894de405bba83952284a1df714753d62494b
2022-06-23 12:59:46 +02:00
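The problem fixed in commit da7a4b59a4 is easy to reproduce in isolation. In the sketch below, `iter_ports` is a hypothetical stand-in for an SDK call that yields ports lazily; it is not the actual Kuryr or openstacksdk code.

```python
def iter_ports():
    """Hypothetical stand-in for an SDK call that yields ports lazily."""
    yield {'id': 'port-a'}
    yield {'id': 'port-b'}


def count_ports(ports):
    """Materialize the iterable first; generators do not support len()."""
    return len(list(ports))
```

Calling `len(iter_ports())` directly raises `TypeError`, while `count_ports(iter_ports())` works because the generator is converted to a list first. Note that the generator is consumed by the conversion, so the list should be kept if the ports are needed afterwards.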
Zuul 5af8716960 Merge "Do not crash on Neutron quota exceptions" 2022-06-22 14:06:10 +00:00
Zuul 3a65308784 Merge "Remove dead networks in current cluster." 2022-06-22 10:52:40 +00:00
Zuul 579b327e85 Merge "Use description to store identifier for networks and subnets." 2022-06-21 21:12:07 +00:00
Michał Dulko 3fc324fb54 Do not crash on Neutron quota exceptions
We shouldn't be failing on Neutron quota exceptions as Kuryr is not in a
position to ever solve that by restarting. However, because the bulk
create method that we implemented raised a different exception, we
failed to ignore them for health checks.

This commit makes sure bulk create method raises ConflictException, so
that Retry handler will correctly handle it. Moreover the exception
handling there is improved to make sure we're reading error code instead
of error message.

Closes-Bug: 1978701
Change-Id: I520d415a0a8091737c9a3dd4d1268b3950421d79
2022-06-15 13:26:13 +02:00
Zuul 9a92961f7e Merge "Strip leading zeros from "funny" Service IPs" 2022-06-14 20:33:16 +00:00
Roman Dobosz a38d764ffc Get rid of obsolete KuryrNetPolicy CRD.
There are still some mentions of KuryrNetPolicy around our code. In
this patch we remove them (with one exception: the spec for the
originally designed CRD for network policy handling) to avoid
confusion with the currently used KuryrNetworkPolicy.

Change-Id: Ie9bb46467a249e1c0ada3a9810c4fff59fd57757
2022-06-10 15:46:31 +02:00
Roman Dobosz a63bf23976 Remove dead networks in current cluster.
A port cleanup function was already introduced, so this patch adds a
new function for cleaning up unused networks and subnets for the whole
deployment, using the identifier from the description.

Change-Id: Ia11e8953fab3f9cd8f6598f9da94daae324b1bf8
2022-06-10 15:46:04 +02:00
Roman Dobosz a47dcf2476 Use description to store identifier for networks and subnets.
For Neutron networks and subnets, add the identifier (which originates
from tags) to the description field as a workaround for the inability
to create tagged resources atomically. This change might be reverted
when Neutron gains such an ability.

Depends-On: https://review.opendev.org/c/openstack/kuryr-tempest-plugin/+/841107
Change-Id: I1750a0b6ae569752b44a4fe682288686868450fe
2022-06-10 15:46:04 +02:00
Michał Dulko ce6fb744b5 Strip leading zeros from "funny" Service IPs
According to [1] we're supposed to support "funny" Service IPs, which
means IPs with leading zeros. Unlike Unix in general, we should not parse
them as octal values, but rather treat them as if they had no leading
zeros.
This commit makes sure that we strip the zeros from both ClusterIP and
LoadBalancerIP before we put them into KuryrLoadBalancer struct. This
means that later on Octavia and Neutron will get values that they
support and will proceed with LB or FIP creation normally.

[1] https://github.com/kubernetes/kubernetes/blob/v1.24.1/test/e2e/network/funny_ips.go

Change-Id: I0aa8d7326dbf40459f73037ae54d2afc01ea9bb6
Closes-Bug: 1978112
2022-06-09 13:47:26 +02:00
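The normalization described in commit ce6fb744b5 can be sketched as below. The function name `strip_leading_zeros` is hypothetical, and passing anything containing a colon through unchanged (to leave IPv6 alone) is an assumption of this sketch, not something the commit message states.

```python
def strip_leading_zeros(ip):
    """Drop leading zeros from each octet of a dotted-quad IPv4 string.

    Octets are parsed as base-10 integers, never as octal, so
    '010' becomes '10' rather than '8'.
    """
    if ':' in ip:
        # Assumption: IPv6 addresses are passed through untouched.
        return ip
    return '.'.join(str(int(octet, 10)) for octet in ip.split('.'))
```

For example, `strip_leading_zeros('010.000.001.010')` gives `'10.0.1.10'`, a value that downstream services can accept as a regular address.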
Roman Dobosz 846f158724 Removed all occurrences of removed KuryrNet CRD.
The KuryrNet CRD was already replaced by KuryrNetwork, but there are
still some spots where it is mentioned, mostly docs and log messages.
In this commit we get rid of it once and for all.

Change-Id: I20345a1f4d4288534d620f0bd2196fc77ee795e9
2022-06-01 13:10:48 +02:00
Michał Dulko b69e991a27 Add more info to Async handler logs
This commit makes sure every log Async handler produces includes
information about what K8s resource is processed by the thread in
question. Without this patch only UID is printed which isn't something
that helps a lot when debugging K8s e2e tests results or when
correlating with CNI logs.

Change-Id: I4183b17efd8ad37731e2fc3c3db1ad7b76f18534
2022-05-24 16:43:46 +02:00
Michał Dulko 8f61307fa6 Error handling improvements
This combines a few error handling improvements fixing problems found by
e2e K8s tests.

1. Logs on K8sNamespaceTerminating are no longer on WARNING but DEBUG
   level. This is because they're harmless, yet they can spam the logs
   when multiple namespaces are deleted, such as in e2e K8s tests.
2. 400 Bad Request is ignored on LB member creation. This happens when
   the subnet got deleted in the meantime and the LB will be gone soon too.
3. 404 Not Found is ignored when subports are detached from a trunk.
   This can happen when some other thread already detached that port. If
   the request was just for a single port, then error can be safely
   ignored. In case of bulk request we don't really know which and how
   many subports are detached already, so on detach error we'll just
   proceed to delete ports and on errors attempt to detach them
   one-by-one.

Change-Id: Ic11f15e44086f8b25380e20457f28c351403b4d9
2022-05-24 16:43:32 +02:00
Zuul 76b7fd92be Merge "Support specify project id by annotation" 2022-05-12 17:40:52 +00:00
Roman Dobosz bc8ba2bc17 Add periodic task for cleaning up dead ports.
Sometimes quota violations happen during port creation, so a port
can be created without tags and just hang there. This periodic task
removes all such dead ports.

Change-Id: If646cb3bf00aca387c769fafe2f73a8194642f69
2022-05-05 07:06:02 +02:00
Roman Dobosz d49add94df Take care about OS resources which was not found on CRD.
Neutron networks and subnets might be skipped when resources are removed
on deletion of a KuryrNetwork CRD. This can happen when a resource
creation event is being handled and some other call makes the Kuryr
Controller crash, leaving resources untagged and/or the netId field
missing in the KN. Although such resources were created without the CRD
being updated, the naming convention ensures their uniqueness, so we
can now find the potential leftovers and clean them up properly.

Change-Id: I5e54d2ba08325935e4ee1da95372b34e0e40e0f0
2022-04-29 14:32:25 +02:00
yangjianfeng 90088f3b0d Support specify project id by annotation
The implementation differs somewhat from the blueprint description.
For stricter isolation, we only get the project id from the namespace
annotation or the configuration option. Other resources' project ids
are inherited from their namespace's project or taken from the
configuration option.

Implements: blueprint specify-project-by-annotation
Change-Id: Ia82cce6b211226599b4e1ca0d10416ed5e519ea2
2022-04-29 15:22:50 +08:00
Zuul 1bf9d6286c Merge "Include more events for Pods" 2022-04-27 13:42:37 +00:00
Zuul b7e87c94b1 Merge "Pools: Fix order of updated SGs" 2022-04-27 12:02:40 +00:00
Zuul 42ae6a37ab Merge "Create networks/subnets in bulletproof manner." 2022-04-19 19:47:36 +00:00
Roman Dobosz e9fd3bb134 Parallelize ports removal.
During removal of Neutron resources, orphaned ports are sometimes left
hanging. Until now, all removals were done one by one, which slows down
the removal process. This change introduces removing ports in parallel
using five concurrently running workers.

Change-Id: I74842989784601325b6d8977da4bc936ceedbc0e
2022-04-12 14:19:04 +02:00
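The parallel removal from commit e9fd3bb134 can be sketched with a five-worker pool. This is a sketch under stated assumptions: `delete_ports_in_parallel` and the `delete_port` callable are hypothetical names, and `concurrent.futures` is used here for illustration; the commit may rely on a different mechanism.

```python
from concurrent.futures import ThreadPoolExecutor


def delete_ports_in_parallel(port_ids, delete_port, workers=5):
    """Delete ports using a fixed number of concurrent workers.

    `delete_port` is a hypothetical callable taking one port ID.
    Returns the IDs whose deletion raised an exception.
    """
    def _safe_delete(port_id):
        try:
            delete_port(port_id)
            return None
        except Exception:
            # Report the failure instead of aborting the whole batch.
            return port_id

    with ThreadPoolExecutor(max_workers=workers) as executor:
        results = list(executor.map(_safe_delete, port_ids))
    return [port_id for port_id in results if port_id is not None]
```

Returning the failed IDs lets the caller decide whether to retry them, rather than letting one failed deletion stop the remaining workers.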
Roman Dobosz 7ef2d54150 Create networks/subnets in bulletproof manner.
Currently, creating a network and/or subnet depends heavily on tags.
There could be an issue with selecting the right network when a tag is
set in the configuration but the actual tagging of the network or
subnet fails. In that case multiple untagged subnets might be created
while only one is expected. In this patch we add the unique UID of the
Kubernetes namespace to the Neutron network description field, so that
it identifies exactly one specific namespace and there should be no
collision with other k8s deployments within the same OpenStack cloud,
regardless of the tag.

Depends-On: https://review.opendev.org/c/openstack/kuryr-tempest-plugin/+/835245
Change-Id: Id68da7932be66d339119ed92b870c8f7538afb15
2022-04-07 15:58:34 +02:00
Roman Dobosz 5afa4925fc Fix potential issue with network/subnet name length.
In OpenStack Neutron and Octavia, names and descriptions of objects
are limited to 255 characters, while in Kubernetes names are limited
to 253 characters. Kuryr often creates names for networks and subnets
using the Kubernetes object name with an additional prefix and suffix,
which may exceed the allowed character limit. This patch proposes a
solution: simply truncate the Kubernetes resource name, while keeping
the prefix and suffix intact, to fit the Neutron name limit.

Change-Id: I6e404104034f11593fc313797ad464458bbdf82d
2022-04-07 15:57:50 +02:00
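The truncation described in commit 5afa4925fc amounts to simple length arithmetic. In the sketch below only the 255-character limit comes from the commit message; the function name `build_resource_name` and the example prefix and suffix are hypothetical.

```python
NEUTRON_NAME_LIMIT = 255  # limit stated in the commit message


def build_resource_name(prefix, k8s_name, suffix, limit=NEUTRON_NAME_LIMIT):
    """Join prefix, Kubernetes name and suffix within the length limit.

    Only the Kubernetes name is truncated; prefix and suffix stay intact.
    """
    room = max(limit - len(prefix) - len(suffix), 0)
    return f"{prefix}{k8s_name[:room]}{suffix}"
```

With a 3-character prefix and a 7-character suffix, a 253-character Kubernetes name is cut to 245 characters, so the result is exactly 255 characters long. Shorter names pass through unchanged.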
Maysa Macedo d930d50392 Include more events for Pods
This commit includes more Kubernetes events to
notify the user about the progress when handling
a Pod event.

Change-Id: I87abc65aee41036d6c69f1b65f061885851f4a14
2022-04-06 12:08:09 +02:00