According to [1] we're supposed to support "funny" Service IPs, which
means IPs with leading zeros. Unlike unix in general we should not parse
them as octal values, but rather treat them like without leading zeros.
This commit makes sure that we strip the zeros from both ClusterIP and
LoadBalancerIP before we put them into KuryrLoadBalancer struct. This
means that later on Octavia and Neutron will get values that they
support and will proceed with LB or FIP creation normally.
[1] https://github.com/kubernetes/kubernetes/blob/v1.24.1/test/e2e/network/funny_ips.go
Change-Id: I0aa8d7326dbf40459f73037ae54d2afc01ea9bb6
Closes-Bug: 1978112
CRD KuryrNet was already replaced by KuryrNetwork, although there are
some spots, where it is mentioned - mostly docs and log messages. In
this commit we get rid of it once and for all.
Change-Id: I20345a1f4d4288534d620f0bd2196fc77ee795e9
This combines a few error handling improvements fixing problems found by
e2e K8s tests.
1. Logs on K8sNamespaceTerminating are no longer on WARNING but DEBUG
level. This is because they're harmless, yet they can spam the logs
when multiple namespaces are deleted, such as in e2e K8s tests.
2. 400 Bad Request is ignored on LB member creation. This happens when
subnet got deleted in the meanwhile and the LB will be gone soon too.
3. 404 Not Found is ignored when subports are detached from a trunk.
This can happen when some other thread already detached that port. If
the request was just for a single port, then error can be safely
ignored. In case of bulk request we don't really know which and how
many subports are detached already, so on detach error we'll just
proceed to delete ports and on errors attempt to detach them
one-by-one.
Change-Id: Ic11f15e44086f8b25380e20457f28c351403b4d9
During the refactoring to allow proper Events creation for Services,
we've lost ability to update KLBs .spec.provider when configuration is
changed to use another provider. This commit fixes it by making sure
.spec.provider mismatch with the configured provider triggers the .spec
update.
Change-Id: Ic93a105b4a67b8dff15bf0e80b13f4743ffce7ff
Closes-Bug: 1960311
In Endpoints' on_delete we're attempting to remove spec.endpointSlices
from the KuryrLoadBalancer object. In case of 422 Unprocessable Entity
error we're logging a warning. Such code is however mostly returned when
our patch cannot be applied, so in this particular case when spec is
already missing the endpointSlices field. This patch makes sure we don't
log anything in that case as it's definitely not a warning.
Change-Id: If4f3173bd3bdf72862c2f0f94ed016b68eaccb30
This commit implements adding Events for various things that may happen
when Kuryr handles a Service - either incidents or just informative
messages helpful to the user.
As to add an Event it is required to have a uid of the object and in
most of KuryrLoadBalancerHandler we don't have the Service
representation, this commit implements populating ownerReferences of the
KuryrLoadBalancer object with a reference to the Service. This has a
major side effect that KLB will be garbage collected if corresponding
service doesn't exist, which is probably a good thing (and we manage
that ourselves using finalizers anyway).
Another set of refactorings is to remove KLB creation from
EndpointsHandler in order to stop it fighting with ServiceHandler over
creation - EndpointsHandler cannot add ownerReference as it has no uid
of the Service.
Other refactorings related to the error messages are also included.
Change-Id: I36b4d62e6fc7ace00909e9b1aa7681f6d59ca455
As part of the change below, we are adding a utility code to get klb crd
path.
The klb crd shares the same namespace and the name as the service,
endpoints.
This code can be hence re-used for services too.
Depends-On: https://review.opendev.org/c/openstack/kuryr-kubernetes/+/804205
Change-Id: I90c8851e01de16b82617d60c987bad50a1522dd5
Neutron should make a subport that is already attached to a trunk ACTIVE
immediately. Unfortunately there seems to be an OVN bug causing an event
triggering this to be lost, leaving the port in DOWN state forever. This
is a disaster for Kuryr, because we can't proceed to wire the pods in
such case.
This commit attempts to workaround this by making Kuryr reattach the
ports that are in DOWN state for more than 90 seconds after they're
plugged.
Change-Id: If9a3968d68dced588614cd5521d4a111e78d435f
In case of hairpin LB traffic (member of the LB calls the LB and the
request is directed back to the same member) OVN replaces the source-ip
of the request with the LB IP. This means that pods with network
policies applied may have that traffic blocked when it should be
allowed.
To fix that this commit makes sure that SGs used for NPs include ingress
rules for each of the Service in it's namespace. It's not ideal but
seems to be a fair compromise between opening as little traffic as
possible and increasing number of security groups and rules.
As this commit makes sure all the NPs in the namespaces are reanalyzed
every time a Service is created or deleted, a little fixes in order to
support that are also made.
Change-Id: I7e0458c4071e4a43ab4d158429e05c67cd897a3c
Closes-Bug: 1923452
The timeout-client-data and timeout-member-data configurations
for Octavia listeners default to 50 seconds for load balancers
created by Kuryr. This patch allows the creation and modification
of load balancers handled by Kuryr with different timeouts values.
Implements: blueprint configure-lb-listeners-timeout
Change-Id: I99016001c2263023d1fa2637d7b5aeb23b3b2d9d
When moving from annotation to CRD a compatibility code was added
to allow user to upgrade from ussuri to victoria version. Now we are going
to upgrade to wallaby, so everyone should already use CRD and we
can remove the code.
Change-Id: I2acea7f6b3e1f02edd89f2f08127065a6556d367
The targetRef field is optional for the Endpoints object,
and when service without selectors are created that field
is more likely to not be specified. This commit ensures
Kuryr properly wires Endpoints without targetRef.
Change-Id: Ib43e88aafd9e0907556a0e740990a6acbd173fb0
This patch ensures that endpoints scale down to 0 event is handled
and therefore members are deleted from the loadbalancer pool
Change-Id: Idb879245607a6f0f914a44220312d54aa40e2e3d
Closes-Bug: 1904395
The _generate_lbaas_port_specs method on the lbaas handler
is not used anymore, it should be removed along with
the unit tests for it.
Change-Id: I3982fc6c292a2823c4581897569e885a27b9e406
Closes-Bug: #1901161
This commit removes some methods or variables definitions
that are not used. Also, reorder or remove interactions to
k8s API.
Change-Id: I424ce0b9a9a8c9fb0940bd3b60690f14140442ed
In theory with the usage of Finalizers having leaks of loadbalancers
is not possible anymore, and if the CRD is deleted it gets recreated
and also the loadbalancer is recreated.
This commit is deleting ensure_release_lbaas and _cleanup_leftover_lbaas
functions.
Change-Id: I0db62a845b23a32eef4358368332c4da2cad5460
Headless Service are not wired by Kuryr as it returns
directly the Pod's IPs behind the Service and do not
load-balance between them. This commit makes sure a klb
CR is not created for headless services as a LB won't
get created for it.
Change-Id: Ib389ddd5edca44c713149adc13486ab0b08007d2
This is another attempt at getting the useless tracebacks out of logs
for any level higher than debug. In particular:
* All pools logs regarding "Kuryr-controller not yet ready to *" are now
on debug level.
* The ugly ResourceNotReady raised from _populate_pool is now suppressed
if method is run from eventlet.spawn(), which prevents that exception
being logged by eventlet.
* ResourceNotReady will only print namespace/name or name, not the full
resource representation.
* ConnectionResetError is suppressed on Retry handler level just as any
other K8sClientError.
Change-Id: Ic6e6ee556f36ef4fe3429e8e1e4a2ddc7e8251dc
It might happen, that during adding and instantly removing pod, vif
handler will not be able to set the finalizer for such object, resulting
in error. In this patch we prevent for such case.
Closes-Bug: 1894212
Change-Id: I5b3b4b36ae0c2fca7c16153f75c62607efbc3ce4
If a loadBalancerIP is defined on the Service it should be used
on the created Load Balancer. However, that is not happening as
the LB CRD can be missing the lb_ip field definition, consequently
a new Floating IP is allocated to the Load Balancer. This commit
fixes the issue by ensuring the lb_ip is present on the CRD if
defined on the Service.
Closes-Bug: 1893927
Change-Id: Ie626328972587772dbeeaf52f96b312de41026f2
There is a delay between namespace termination and having all the
objects from that namespace marked as being terminated. This means that
if on CRD creation we'll get a response stating that the namespace is
being terminated - we should ignore it, not fail. This commit makes sure
we only log a warning for those errors.
Change-Id: I87006ffbabea90babd0245753f0d3a0b860b88b5
Closes-Bug: 1893772
As the Finalizer is first removed from the lb CRD and
then from the Service, we should add protection for when
the CRD is gone but the Service is still present with
the Finalizer set.
Change-Id: I9fe88bb78fea4930c241ecf1117eafecaf279c94
On upgrade from version using annotations on Endpoints and Services
objects to save information about created Octavia resources, to a
version where that information lives in KuryrLoadBalancer CRD we need to
make sure that data is converted. Otherwise we can end up with doubled
loadbalancers.
This commit makes sure data is converted before we try processing any
Service or Endpoints resource that has annotations.
Change-Id: I01ee5cedc7af8bd02283d065cd9b6f4a94f79888
This commit adds support for creation of loadbalancer, listeners,
members, pools with using the CRD, it is also filling the status
field in the CRD.
Depends-On: https://review.opendev.org/#/c/743214/
Change-Id: I42f90c836397b0d71969642d6ba31bfb49786a43
Upon kuryr-controller restart, if kuryr configuration regarding
octavia provider has changed, the cleanup process not only
removes the leftovers loadbalancers but also:
- remove the lbs with the previous octavia provider driver
- modifies endpoints annotations (provider information) to force
the recreation of the loadbalancer with the new provider
Change-Id: I78fd6bdd1f53ea8eb2ba85395fd2e3c651487c60
When deleting services and the respective load balancer
with using ovn-octavia provider, the lb sg is not deleted.
This commit fixes the issue by removing the LB sg creation
when the octavia provider is ovn-octavia, as that sg is not
really enforced.
Closes-bug: 1880207
Change-Id: I2c77b1d0ac682008ff6c31781d6075c208c689d0
Seems like flake8 3.8.1 got released that introduced new rules we
violate, rendering our pep8 job broken. This commit fixes them.
Change-Id: I7e4f2767856be8666c8f37f3b481f8bf43412ce8
As soon as the service is created it's possible that the backend pods
are not yet created resulting in an lbaas_spec annotation with no
security groups defined, and so security group rules can turn out
to be removed from the load balancer sg. This commit ensures the
lbaas_state annotation contains the updated sgs.
Closes-bug: 1872962
Change-Id: I296d16a627e39e6534ad9c1dff18b4564624d35d
This patch moves the namespace handling to be more aligned
with the k8s style.
Depends-on: If0aaf748d13027b3d660aa0f74c4f6653e911250
Change-Id: Ia2811d743f6c4791321b05977118d0b4276787b5
Due to a bug in Octavia this was previously avoided. This is now
supported for the amphora provider, so it won't be blocked on the
kuryr side
Change-Id: If7d32023e467d1a63812057e1d151901ce462c7f
Right now the lbaas name and the service name are compared when
identifying a load balancer that exist with no matching service.
This might cause a wrong lbaas deletion as the service and the lbaas
name might differ. This commit fixes the issue by ensuring the
cluster_ip field of the service is used to verify the existence
of a matching lbaas.
Closes-bug: 1858108
Change-Id: I4a32b8ce3bae68b406f23429f2ad895dd44c9842
When pods that run on host network and are pointed by a svc
are removed, the removal of the load balancer member is
ignored, as right now the name of the pod is not taken into
account on the decision of which members to delete, and the
current attributes checked are pod ip and port, which remains
the same. This commit fixes the issue by ensuring the
pod name is also checked.
Closes-Bug: 1857354
Change-Id: I44f7f76da692ac3002c91a1420969584b4d31cbd
When the deletion of a SVC is triggered while the load balancer
is still creating and the controller restarts, the deletion
event will be gone and the lbaas remains. This commit fixes
the issue, by removing the leftover lbaas upon controller restart.
Change-Id: I2d7dd14c3f05b0b1da6db7ac9b58731e34b593e6
When a member is being added to the loadbalancer (as loadbalancer
creation may take some time) it may happen that the namespace
deletion action has been already triggered. This impact the L2 mode
as the members need to get their subnet from the kuryrnet crd object
if namespace subnet driver is being used. This patch handles that
exception and skips the member addition since there is no need to
add it if it is being deleted.
Change-Id: I45f2cc3664dffa53a733218c38f22ee2ea2ff714
Closes-Bug: 1853604
This patch set increases the timeout to wait for resources to be
created/deleted. This is needed to better support spikes without
restarting the kuryr-controller. This patch also ensures that
future retry events are not afecting the kuryr controller if
they are retried once the related resources are already deleted,
i.e., the on_delete event was executed before one of the retries.
Closes-Bug: 1847753
Change-Id: I725ba22f0babf496af219a37e42e1c33b247308a
When a SVC is recreated all the LBaaS resources are first deleted,
which requires the LBaaS state obj to be updated accordingly, then
created. Right now, when a LBaaS deletion happens the values for
members, pools and listeners are not updated on the lbaas state obj.
This commit fixes the issue by ensuring the lbaas state object is
updated.
Closes-Bug: 1843142
Change-Id: Ibe72cb3ebd3d2a861684e7e8ac5a310fe8060aa9
LoadBalancerHandler._get_lbaas_spec is identical to
utils.get_lbaas_spec, which is used by LBaaSSpecHandler, so we can reuse the
function from utils instead of duplication of code.
Note that the passed k8s object is endpoint in case of
LoadBalancerHandler and service in case of LBaaSSpecHandler (but same
annotation used in both cases, so a common function should be enough)
Change-Id: I124109f79bcdefcc4948eb35b4bbb4a9ca87c43b
Signed-off-by: Yash Gupta <y.gupta@samsung.com>