When fetching the target value (T in HCTL) for the storage HBAs,
we use the /sys/class/fc_transport path to find available targets.
However, this path only contains targets that already have a LUN
attached to the host.
Scenario:
Assume we have 2 controllers on the backend side with 4 target HBAs each (8 in total).
For the first LUN mapping from controller1, we will do a wildcard
scan and find the 4 targets from controller1 which will get
populated in the /fc_transport path.
If we try mapping a LUN from controller2, we try to find targets in the
fc_transport path but the path only contains targets from controller1 so
we will not be able to discover the LUN from controller2 and fail with
NoFibreChannelVolumeDeviceFound exception.
Solution:
In each rescan attempt, we will first search for targets in the
fc_transport path: "/sys/class/fc_transport/target<host>*".
If the target is not found then we will search in the fc_remote_ports
path: "/sys/class/fc_remote_ports/rport-<host>*"
If a [c,t,l] combination is found from either path, we add it to
the list of ctls that we later use for scanning.
This way, we don't alter the current "working" mechanism of scanning
but also add an additional way of discovering targets and improving
the scan to avoid failure scenarios in each rescan attempt.
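
The lookup order described above can be sketched roughly as follows. This
is an illustration only, not the connector's actual code: the function
name `find_target_dirs` and the `sysfs_root` parameter are hypothetical,
and the ':' appended after the host number is an assumption added so that
host 1 does not also match host 10.

```python
import glob
import os


def find_target_dirs(host, sysfs_root="/sys/class"):
    """Return sysfs entries describing FC targets for one HBA host number.

    Hypothetical helper: fc_transport is searched first, and
    fc_remote_ports is only consulted when fc_transport has no entry
    for this host.
    """
    # Assumption: entries are named target<host>:<channel>:<target>.
    transport = os.path.join(sysfs_root, "fc_transport", "target%s:*" % host)
    found = sorted(glob.glob(transport))
    if found:
        return found

    # Fallback. Assumption: entries are named rport-<host>:<channel>-<id>.
    rports = os.path.join(sysfs_root, "fc_remote_ports", "rport-%s:*" % host)
    return sorted(glob.glob(rports))
```

Because the fc_transport result is returned as soon as it is non-empty,
hosts that were already discovered through it behave as before; the
fallback only matters when that path has nothing for the host.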
Closes-Bug: #2051237
Change-Id: Ia74b0fc24e0cf92453e65d15b4a76e565ed04d16
The nvme CLI has changed its behavior: it no longer differentiates
between errors by returning different exit codes.
Exit code 1 is for errors and 0 for success.
This patch fixes the detection of race conditions to also look for the
message in case it's a newer CLI version.
Together with change I318f167baa0ba7789f4ca2c7c12a8de5568195e0 we are
ready for nvme CLI v2.
Closes-Bug: #1961222
Change-Id: Idf4d79527e1f03cec754ad708d069b2905b90d3f
Attaching NVMe-oF no longer works in CentOS 9 Stream using nvme 2.4
and libnvme 1.4.
The reason is that the 'address' file in sysfs now has the 'src_addr'
information.
Before we had:
traddr=127.0.0.1,trsvcid=4420
Now we have:
traddr=127.0.0.1,trsvcid=4420,src_addr=127.0.0.1
This patch fixes this issue and future proofs against any additional
information that may be added, by parsing the contents and searching for
the parts we care about: destination address and port.
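
A minimal sketch of that kind of parsing is shown below. The function
name `parse_address` is hypothetical and this is not the code from the
patch; it only illustrates splitting the contents into key=value fields
and ignoring the ones we don't need.

```python
def parse_address(contents):
    """Extract (traddr, trsvcid) from an NVMe-oF sysfs 'address' string.

    Hypothetical helper: fields are split on ',' and '=', and unknown
    fields such as src_addr are simply ignored.
    """
    fields = {}
    for part in contents.strip().split(","):
        key, sep, value = part.partition("=")
        if sep:  # skip anything that is not a key=value pair
            fields[key.strip()] = value.strip()
    return fields.get("traddr"), fields.get("trsvcid")
```

Both formats quoted above give the same result with this approach:
('127.0.0.1', '4420').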
Closes-Bug: #2035811
Change-Id: I7a33f38fb1b215dd23e2cff3ffa79025cf19def7
When an nvme subsystem has all portals in connecting state and we try
to attach a new volume to that same subsystem it will fail.
We can reproduce it with LVM+nvmet if we configure it to share targets
and then:
- Create instance
- Attach 2 volumes
- Delete instance (this leaves the subsystem in connecting state [1])
- Create instance
- Attach volume <== FAILS
The problem comes from the '_connect_target' method that ignores
subsystems in 'connecting' state, so if they are all in that state it
considers it equivalent to all portals being inaccessible.
This patch changes this behavior and if we cannot connect to a target
but we have portals in 'connecting' state we wait for the next retry of
the nvme linux driver. Specifically we wait 10 more seconds than the
interval between retries.
[1]: https://bugs.launchpad.net/nova/+bug/2035375
Closes-Bug: #2035695
Change-Id: Ife710f52c339d67f2dcb160c20ad0d75480a1f48
Dell Powerflex 4.x changed the error code of VOLUME_NOT_MAPPED_ERROR
to 4039. This patch adds that error code.
Closes-Bug: #2046810
Change-Id: I76aa9e353747b1651480efb0f3de11c707fe5abe
The mypy job complains about 'exc' variable[1] since it was used
for ExceptionChainer as well as TargetPortalNotFound exception.
Changing the variable name for TargetPortalNotFound exception from
'exc' to 'target_exc' makes the 'type: ignore' comments unnecessary.
[1] Trying to read deleted variable 'exc'
Change-Id: I4b10db0754f0e00bb02d3a60f9aaf88b90466a8f
This patch improves the creation of the /etc/nvme/hostnqn file by using
the system UUID value we usually already know.
This saves us one or two calls to the nvme-cli command and it also
allows older nvme-cli versions that don't have the `show-hostnqn`
command or have it but can only read from file to generate the same
value every time, which may be useful when running inside a container
under some circumstances.
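
As a rough illustration, a host NQN can be derived from a UUID without
calling nvme-cli at all. The helper name `hostnqn_from_uuid` is
hypothetical, and lowercasing the UUID is an assumption of this sketch,
not something the patch is known to do; the prefix is the standard
UUID-based NQN format.

```python
import uuid

# UUID-based NQN prefix defined by the NVMe specification.
NQN_UUID_PREFIX = "nqn.2014-08.org.nvmexpress:uuid:"


def hostnqn_from_uuid(system_uuid):
    """Build a deterministic host NQN from a system UUID string.

    Hypothetical helper: uuid.UUID() validates the input and str()
    normalises it to the lowercase hyphenated form.
    """
    return NQN_UUID_PREFIX + str(uuid.UUID(system_uuid))
```

The same system UUID always produces the same NQN, which is what makes
the value stable across runs.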
Change-Id: Ib250d213295695390dbdbb3506cb297a86e95218
The Dell PowerFlex (scaleio) connector maintains a token cache
for PowerFlex OS.
The cache was overwritten with None by mistake
in Change-ID I6f01a178616b74ed9a86876ca46e7e46eb360518.
This patch fixes the broken cache to avoid unnecessary login.
Closes-Bug: #2004630
Change-Id: I2399b0b2af8254cd5697b44dcfcec553c2845bec
from an image
This patch fixes the issue of the password being written in plain text
in the logs while creating a new volume. It creates a new logger with
the default log level set to error.
Closes-Bug: #2003179
Change-Id: I0292a30f402e5acddd8bbc31dfaef12ce24bf0b9
Dell Powerflex 4.x changed the error code of VOLUME_ALREADY_MAPPED_ERROR
to 4037. This patch adds that error code.
Closes-Bug: #2013749
Change-Id: I928c97ea977f6d0a0b654f15c80c00523c141406
In some old nvme-cli versions the NVMe-oF create_hostnqn method fails.
This happens specifically on versions between not having the
show-hostnqn command and having it always return a value. On those
versions the command only returns the value present in the file and never
tries to return an idempotent or random value.
This patch adds handling for that specific case, which is identified by the
stderr message:
hostnqn is not available -- use nvme gen-hostnqn
Closes-Bug: #2035606
Change-Id: Ic57d0fd85daf358e2b23326022fc471f034b0a2f
After merging change I0b60f9078f23f8464d8234841645ed520e8ba655, we
noticed an issue with existing unit tests which started failing.
The reason is 'nvme_hostid' was an additional parameter returned
in the response while fetching connector properties from nvme
connector.
This is environment specific: it doesn't occur in environments where
the '/etc/nvme/hostid' file doesn't exist, which is why these tests
passed in the gate but failed in local runs where the hostid file
was present.
This patch mocks the get_nvme_host_id method for tests so the
hostid is never returned irrespective of the environment.
Closes-Bug: #2032941
Change-Id: I8b1aaedfdb9bef6e34813e39dede9afe98371d2b
In a multipath enabled deployment, when we try to extend a volume
and some paths are down, we fail to extend the multipath device and
leave the environment in an inconsistent state. See LP Bug #2032177
for more details.
To handle this, we check if all the paths are up before trying to
extend the device and fail fast if any path is down. This ensures
we don't partially extend some paths and leave the others at the
original size, leading to an inconsistent state in the environment.
Closes-Bug: 2032177
Change-Id: I5fc02efc5e9657821a1335f1c1ac5fe036e9329a
The NVMe-oF connector currently creates the `/etc/nvme/hostnqn` file if
it doesn't exist, but it may still be missing the `/etc/nvme/hostid`
value.
Some distribution packages create the file on installation but others
may not.
It is recommended for the file to be present so that nvme doesn't
randomly generate it.
Randomly generating it means that the value will be different for the
same storage array and the same volume if we connect, disconnect, and
connect the volume again.
This patch ensures that the file will exist and will try to use the
system's UUID as reported by DMI or a randomly generated one.
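
The behavior can be sketched as below. This is an illustration under
stated assumptions rather than the patch itself: `ensure_hostid` and its
parameters are hypothetical, and reading the DMI UUID from
`/sys/class/dmi/id/product_uuid` is an assumption about where the value
comes from.

```python
import os
import uuid


def ensure_hostid(hostid_path="/etc/nvme/hostid",
                  dmi_path="/sys/class/dmi/id/product_uuid"):
    """Return the NVMe host ID, creating the file when it is missing.

    Hypothetical helper: an existing file is never overwritten; otherwise
    the DMI system UUID is preferred and a random UUID is the last resort.
    """
    if os.path.exists(hostid_path):
        with open(hostid_path) as f:
            return f.read().strip()

    try:
        with open(dmi_path) as f:
            host_id = f.read().strip().lower()
    except OSError:
        # DMI information missing or unreadable (e.g. in a container).
        host_id = ""
    if not host_id:
        host_id = str(uuid.uuid4())

    parent = os.path.dirname(hostid_path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    with open(hostid_path, "w") as f:
        f.write(host_id + "\n")
    return host_id
```

Once the file has been written, later calls return the stored value, so
the ID stays the same across connect and disconnect cycles.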
Closes-Bug: #2016029
Change-Id: I0b60f9078f23f8464d8234841645ed520e8ba655
On Change-Id Ib3b066a7da071b1c2de78a1a4e569676539bd335 we improved the
RBDVolumeIOWrapper's flush and close methods, and this patch improves
them even further.
If the IOWrapper's close is not explicitly called and it's just
dereferenced (happens in unit tests) then during Garbage Collection the
wrapped image may be destroyed before the wrapper, which would trigger
the image being closed without the wrapper knowing, so when the wrapper
gets destroyed it will fail because it calls its close method, which
calls its flush, which calls the underlying image's flush, which will
fail because the underlying image was already closed.
We need to check if the underlying image has already been flushed
before calling the flush.
Calling the underlying close method for the Image or IOWrapper classes
is not a problem because they are idempotent.
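
The failure mode and the guard can be illustrated with stand-in classes.
This is only a sketch: `FakeImage` and `Wrapper` are hypothetical and do
not mirror the real RBD bindings or RBDVolumeIOWrapper. The point is
that the wrapper skips the flush once the image can no longer accept it.

```python
class FakeImage:
    """Hypothetical stand-in for an image handle."""

    def __init__(self):
        self.closed = False
        self.flush_calls = 0

    def flush(self):
        if self.closed:
            raise RuntimeError("image already closed")
        self.flush_calls += 1

    def close(self):
        # Idempotent: closing twice is harmless.
        self.closed = True


class Wrapper:
    """Hypothetical IO wrapper that guards flush against a dead image."""

    def __init__(self, image):
        self._image = image
        self._closed = False

    def flush(self):
        # Guard: if the image was torn down first (e.g. during garbage
        # collection), there is nothing left to flush.
        if self._image.closed:
            return
        self._image.flush()

    def close(self):
        if self._closed:
            return
        self.flush()
        self._image.close()
        self._closed = True
```

Without the guard in `flush`, closing the wrapper after the image has
already been closed would raise from the image's own `flush`.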
Change-Id: Ib5a517d58427df0d1d1b22ad3dc66f673da147fe
This patch adds non SAM addressing modes for LUNs, specifically for
SAM-2 and SAM-3 flat addressing.
Code has been manually verified on Pure storage systems that use SAM-2
addressing mode, because it's unusual for CI jobs to have more than
256 LUNs on a single target.
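
For illustration, the LUN translation could look roughly like the sketch
below. The mode names, the function name and the exact rules are
assumptions of this sketch: it assumes SAM-2 switches to flat addressing
from LUN 256 upwards, that SAM-3 flat always uses it, and that flat
addressing sets the address method bits (01b) at the top of the 16-bit
LUN field.

```python
# Hypothetical mode names, used only in this sketch.
SAM = "SAM"
SAM2 = "SAM2"
SAM3_FLAT = "SAM3-flat"

# Flat space addressing: address method 01b in the two top bits of the
# 16-bit LUN field, leaving 14 bits for the LUN itself.
FLAT_SPACE_FLAG = 0x4000


def lun_for_addressing(lun, mode=SAM):
    """Translate a LUN number into the value expected on the host side."""
    if mode not in (SAM, SAM2, SAM3_FLAT):
        raise ValueError("unknown addressing mode: %r" % (mode,))
    if mode == SAM3_FLAT or (mode == SAM2 and lun >= 256):
        return lun + FLAT_SPACE_FLAG
    return lun
```

Under these assumptions LUN 1 stays 1 in SAM-2 mode, while LUN 256
becomes 16640 (256 + 16384).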
Closes-Bug: #2006960
Change-Id: If32d054e8f944f162bdc8700d842134a80049877
Stop testing for presence of
- lv activation support (added in 2.02.91)
- thin provisioning (added in 2.02.95)
- --ignoreactivationskip (added in 2.02.99)
2.02.99 was released in 2013.
Change-Id: I98bbe898bb1e75aa519dddfd44546fe9d477392f
When multipath is enabled and friendly names are ON and we try
to extend a volume, we pass the SCSI WWID to the multipath_resize_map
method which is not correct.
There are 2 things we can pass to the multipathd resize map command:
1) Multipath device (eg: dm-0)
2) Alias (eg: mpath1) or UUID (eg: 36005076303ffc56200000000000010aa)
The value should be an alias (mpath1) when friendly names are ON
and UUID (36005076303ffc56200000000000010aa) when friendly names
are OFF. However, we only pass the UUID irrespective of the value
set for friendly names.
This patch passes the multipath device path (to multipathd resize
map command) which is the real path of the multipath device (/dev/dm-*).
This fixes the issue as it passes the same value regardless of whether
friendly names are ON or OFF.
-> self.multipath_resize_map(os.path.realpath(mpath_device))
(Pdb) mpath_device
'/dev/disk/by-id/dm-uuid-mpath-3600140522774ce73be84f9cb9537e0c9'
(Pdb) os.path.realpath(mpath_device)
'/dev/dm-5'
Closes-Bug: 1609753
Change-Id: I1c60af19c2ebaa9de878cd07cfc0077c5ea56fe3
The -v arg suppresses printing of [/dir] with the
device for bind mounts and btrfs volumes, which is
what we want for this usage.
This fixes _get_host_uuid() failing when using
a btrfs rootfs.
Closes-Bug: #2026257
Change-Id: I2d8f24193ecf821843bf8f4ea14b445561d6225c
We want to remove the deprecated 'CryptsetupEncryptor' encryptor, but
'LuksEncryptor' and the 'Luks2Encryptor' subclass depend on it. Fix this
by copying across the common code, allowing us to remove
'CryptsetupEncryptor' in a future change.
Change-Id: I7c523dc982803b337a51c111b01e170bba7c341f
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>