433 Commits

Author SHA1 Message Date
Scott Little 8979864699 Remove CentOS/OpenSUSE build support
StarlingX stopped supporting CentOS builds after release 7.0.
This update strips CentOS from our code base. It also removes
references to the failed OpenSUSE feature.

Story: 2011110
Task: 49955
Change-Id: I927a02d39114862c6a4ebd12c8c88640be18e370
Signed-off-by: Scott Little <scott.little@windriver.com>
2024-04-26 14:12:58 -04:00
Jiping Ma 1a089999c4 iavf: upgrade to iavf-4.5.3.4
This commit upgrades iavf to version 4.5.3.4 from 4.5.3.2 to fix the
issue "iavf 0000:17:01.6: Never saw reset".

The following root cause analysis comes from Intel.

  """
  The iavf_adminq_task() function processes the device Admin queue,
  which is used to handle receiving messages from the PF driver.

  It calls iavf_clean_arq_element() to extract the message at the head
  of the queue, and processes it by calling iavf_virtchnl_completion().

  There is a subtle race between iavf_adminq_task() and
  iavf_watchdog_task() involving the processing of
  VIRTCHNL_EVENT_RESET_IMPENDING. The race results in the iavf driver
  getting stuck waiting for a reset that has already completed, printing
  "Never saw reset" once every 5 seconds, and locking the driver in the
  __IAVF_RESET state, preventing normal operations from proceeding.

  The entire race can be avoided if the iavf_adminq_task() stops holding
  onto potentially stale data. To do this, acquire the
  __IAVF_IN_CRITICAL_TASK at the start of the function. With this, it is
  no longer possible for the function to be blocked holding the data in
  its event buffer while the iavf_watchdog_task() function processes the
  entire hardware reset.

  Instead of sleeping with a while loop, just re-queue the
  iavf_adminq_task() when we are unable to acquire the bit lock.
  Additionally, align with upstream and check the removal status to
  avoid re-queuing in the event that the driver has already started
  remove.

  This new flow also aligns with the way the upstream driver handles
  locking and completely avoids the race. If the iavf_adminq_task()
  happens to be delayed until the hardware reset completes, it will no
  longer see the VIRTCHNL_EVENT_RESET_IMPENDING data, as this will have
  been cleared by the hardware reset.
  """
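
The locking change described in the quoted analysis can be illustrated with a small user-space sketch. This is not the iavf driver code: the `Adapter` class, its fields, and the `adminq_task` function below are hypothetical stand-ins, with a `threading.Lock` playing the role of the `__IAVF_IN_CRITICAL_TASK` bit lock and a `deque` playing the role of the workqueue.

```python
import threading
from collections import deque


class Adapter:
    """Toy stand-in for per-device driver state (all names are hypothetical)."""

    def __init__(self):
        self.crit_lock = threading.Lock()  # models __IAVF_IN_CRITICAL_TASK
        self.removing = False              # models "driver has started remove"
        self.work_queue = deque()          # models the workqueue
        self.processed = []                # events handled so far


def adminq_task(adapter, events):
    """Process pending events, or re-queue this task if the lock is busy."""
    # Try to take the lock without blocking, instead of sleeping in a loop.
    if not adapter.crit_lock.acquire(blocking=False):
        # Do not re-queue once removal has started.
        if not adapter.removing:
            adapter.work_queue.append(adminq_task)
        return False
    try:
        # Events are only read while the lock is held, so no event data is
        # kept around while another task performs a reset.
        while events:
            adapter.processed.append(events.popleft())
    finally:
        adapter.crit_lock.release()
    return True
```

When the lock is held by another task, the function returns immediately and leaves a re-queue request behind, so it never holds event data across a reset.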

Verification:
- The following command with this commit results in a successful iavf
  kernel module build for standard and PREEMPT_RT kernels:
    build-pkgs -c -p iavf

- A StarlingX ISO image was installed onto an All-in-One Dell XR11 lab
  with one Intel E810 NIC server in low-latency mode.

- The user who reported this issue was provided with a StarlingX
  designer patch that incorporates this change. The user in question
  did not encounter any issues during their testing with the designer
  patch.

Closes-Bug: 2058858

Change-Id: I448ee1e302bdc7277a6c5db990d4d5cfc485a0f4
Signed-off-by: Jiping Ma <jiping.ma2@windriver.com>
2024-03-26 21:29:33 -04:00
Peng Zhang 59890746d7 QAT: Integrate user space package QATengine
Intel 4th generation Xeon Scalable Processor (Sapphire Rapids) support
has been introduced for the platform. In order to leverage the
integrated QAT device of the SP-MCC SKUs, the QAT user space package
QATengine needs to be integrated.

QATengine provides cryptographic acceleration for both hardware
and optimized software using Intel QuickAssist Technology enabled Intel
platforms. Intel QATengine project repository link is
    https://github.com/intel/QAT_Engine
The qat_hw target with OpenSSL 1.1.1 is built from source.

New package qat2.0.l-dev is added to contain the *.c and *.h files in
the qat2.0.l driver package. Some of these files are required by the
user-space packages' build procedures.

Test plan:
    - PASS: build-pkgs -a && build-image
    - PASS: /usr/bin/openssl engine -t -c qatengine
    - PASS: Test engine with openssl utility

Story: 2010796
Task: 49675

Change-Id: Id174fd06580e693a305b3e9ebaa09f550418b51c
Signed-off-by: Peng Zhang <Peng.Zhang2@windriver.com>
2024-03-15 23:55:48 +08:00
Zuul f3d0ff822b Merge "QAT:Integrate user space packages QATzip" 2024-03-06 03:49:54 +00:00
Zuul d01fa4bdc5 Merge "octeon-ep: Update to release v23.11" 2024-03-06 01:10:49 +00:00
M. Vefa Bicakci b015aa8940 octeon-ep: Update to release v23.11
This commit uprevisions the octeon_ep, octeon_ep_vf and oct_ep_phc
drivers from v23.04 to v23.11 to enable use cases that utilize the Dell
Open RAN Accelerator (DORA) card based on Marvell's Octeon
system-on-chip (SoC).

As the driver source code available on Sourceforge does not appear to be
kept up-to-date, the build system configuration files are updated to
acquire the driver source code from a Marvell-maintained git repository
on GitHub.

This commit also accommodates the minor differences between the
directory structures of the source code tar archive on Sourceforge and
the git repository on GitHub by modifying the debian/rules file.

We also block the automatic loading of the oct_ep_phc driver via a
modprobe.d configuration entry, for two reasons:

1) The oct_ep_phc driver does not appear to be needed by the major user
   whose use cases are enabled by this driver uprevision.
2) The oct_ep_phc driver triggers a kernel crash when being unloaded,
   due to an initialization error handling bug related to the DORA card,
   as reported at:
   https://github.com/MarvellEmbeddedProcessors/pcie_ep_octeon_host/issues/2

The two patches applied to the driver package as part of StarlingX are
refreshed and adapted to apply cleanly onto the newer driver package
version acquired from GitHub.

The modprobe configuration file is renamed to octeon-ep.conf to adhere
to inclusive language guidelines.

Finally, the "debian/copyright" file is updated to adhere to Debian's
formatting guidelines published at [1], to update the name of the source
package, to note that most files are licensed under the GPL-2 license
and that the "apps" directory is licensed under the Apache-2.0 license.
Also, please note that the Makefile in the source code package acquired
from GitHub does not have a specific/different license, unlike the
package acquired from Sourceforge, so the special case for that file is
removed.

[1] https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/

== Additional Information ==

We would like to note that the DORA card is a bit special with respect
to its configuration interface. In summary, the octeon_ep physical
function (PF) driver instantiates a network interface managed by the
kernel, which acts as the configuration interface to the accelerator
card. The card sends DHCP discovery requests via the network interface.
If a DHCP server is listening on the network interface, then the card
acquires an IP address and a firmware download can be carried out via
the interface to fully initialize the accelerator card.

Unfortunately, we do not have access to the firmware images and the
software packages necessary to test the accelerator card end-to-end, so
our verification has been limited to ensuring that the DHCP discovery
requests are observed on the network interface created by the PF driver
after the driver is loaded.

The accelerator also has a serial console that can be attached to the
host via a USB-to-serial adapter, but our understanding is that the labs
we have been using for verification did not have this serial connection
set up.

== Verification ==

* An ISO image can be built with this commit applied to a repo project
  of a StarlingX-based distribution tracking StarlingX's master branch.

* The ISO image can be installed to a Dell XR11 server with a DORA card,
  and the system is successfully Ansible-bootstrapped.

* The octeon_ep, octeon_ep_vf and oct_ep_phc drivers are observed to not
  be automatically loaded.

* The octeon_ep driver can be loaded manually with modprobe, and a PF
  interface is instantiated by the kernel. Once the PF interface is
  brought up with the "ip" command, DHCP discovery packets are observed
  on the interface by running (as root):

  tcpdump -i <pf_iface> -nn -e 'udp port 67 or udp port 68'

* Virtual function (VF) interfaces can be instantiated by loading the
  octeon_ep_vf driver with modprobe and then writing (for example) the
  string "2" to the magic sysfs file at:
  /sys/class/net/<pf_iface>/device/sriov_numvfs

* The VF interfaces can be brought up with the "ip" command.
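
The sysfs write used to instantiate VFs in the steps above can be sketched as follows. The helper name `set_num_vfs` and the `sysfs_root` parameter are hypothetical and exist only so that the function can be exercised against a scratch directory; on a real system the write requires root privileges and a loaded PF driver.

```python
from pathlib import Path


def set_num_vfs(pf_iface, num_vfs, sysfs_root="/sys"):
    """Write the requested VF count to the PF's sriov_numvfs file."""
    if num_vfs < 0:
        raise ValueError("num_vfs must be non-negative")
    path = (Path(sysfs_root) / "class" / "net" / pf_iface
            / "device" / "sriov_numvfs")
    # The kernel expects the count as a decimal string, e.g. "2".
    path.write_text(str(num_vfs))
    return path
```

For example, `set_num_vfs("<pf_iface>", 2)` corresponds to writing the string "2" as described in the verification step.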

Story: 2010047
Task: 49651

Change-Id: I11965bf1be278030934b4b517860bc28683a6673
Signed-off-by: M. Vefa Bicakci <vefa.bicakci@windriver.com>
2024-03-04 23:49:33 +00:00
Peng Zhang 19252c45ff QAT:Integrate user space packages QATzip
Intel 4th generation Xeon Scalable Processor (Sapphire Rapids) support
has been introduced for the platform. In order to leverage the
integrated QAT device of the SP-MCC SKUs, the QAT user space package
QATzip needs to be integrated.

QATzip provides extended accelerated compression and decompression
services by offloading the actual compression and decompression
request(s) to the Intel® Chipset Series. Intel QATzip project
repository link is
	https://github.com/intel/QATzip

Test plan:
	- PASS: build test
	- PASS: qzip -O 7z FILE1 FILE2 FILE3... -o result.7z
	- PASS: qzip -d result.7z
	- PASS: qzip -k $your_input_file  -O gzipext -A deflate

Story: 2010796
Task: 48568

Change-Id: I59e62d81e40b8d062bf780c681a38bed79fb520e
Signed-off-by: Peng Zhang <Peng.Zhang2@windriver.com>
2024-03-04 13:28:00 +08:00
Hugo Brito 1de147823c Fix constraints file in tox.ini
The constraints file used for tox.ini was removed. We need to
update the file to use the StarlingX Debian constraints file.

Test Plan:
PASS - Run tox command

Closes-bug: 2055734

Change-Id: I02d8d7e65cd889a24ffb6e9f9d3a5cc36a0f4248
Signed-off-by: Hugo Brito <hugo.brito@windriver.com>
2024-03-01 19:24:03 -03:00
Peng Zhang 706581da39 Update kernel to v5.10.205
This commit updates kernel to v5.10.205 to fix the following CVE issues:
1. CVE-2023-51782: https://nvd.nist.gov/vuln/detail/CVE-2023-51782
2. CVE-2023-51781: https://nvd.nist.gov/vuln/detail/CVE-2023-51781
3. CVE-2023-51780: https://nvd.nist.gov/vuln/detail/CVE-2023-51780
4. CVE-2023-6531: https://nvd.nist.gov/vuln/detail/CVE-2023-6531
5. CVE-2023-6121: https://nvd.nist.gov/vuln/detail/CVE-2023-6121
6. CVE-2023-6546: https://nvd.nist.gov/vuln/detail/CVE-2023-6546
7. CVE-2023-6931: https://nvd.nist.gov/vuln/detail/CVE-2023-6931
8. CVE-2023-6932: https://nvd.nist.gov/vuln/detail/CVE-2023-6932
9. CVE-2023-6817: https://nvd.nist.gov/vuln/detail/CVE-2023-6817
10. CVE-2023-46862: https://nvd.nist.gov/vuln/detail/CVE-2023-46862
11. CVE-2023-39197: https://nvd.nist.gov/vuln/detail/CVE-2023-39197
12. CVE-2023-6176: https://nvd.nist.gov/vuln/detail/CVE-2023-6176
13. CVE-2023-4881: https://nvd.nist.gov/vuln/detail/CVE-2023-4881
14. CVE-2023-34324: https://nvd.nist.gov/vuln/detail/CVE-2023-34324
15. CVE-2023-5717: https://nvd.nist.gov/vuln/detail/CVE-2023-5717
16. CVE-2023-5178: https://nvd.nist.gov/vuln/detail/CVE-2023-5178
17. CVE-2023-46813: https://nvd.nist.gov/vuln/detail/CVE-2023-46813
18. CVE-2023-35827: https://nvd.nist.gov/vuln/detail/CVE-2023-35827

A local StarlingX kernel patch had already been integrated into the
linux-yocto repository's v5.10/standard/preempt-rt/base branch after
v5.10.198 as commit 2dccf008aa65 ("net: replace
raw_write_seqcount_t_begin by do_raw_write_seqcount_begin").
Hence, we drop the following now-redundant local patch:
  0083-net-replace-raw_write_seqcount_t_begin-by-do_raw_wri.patch.

Verification:
- The kernel and out-of-tree modules build successfully for rt and std.
- The ISO image builds successfully for rt and std.
- Installation succeeds on an AIO-DX lab with the rt kernel.
- The lab boots up successfully.
- The sanity testing was done by our test team and no regression
  defect was found.
- The cyclictest benchmark was also run on the StarlingX lab; the
  result is "samples: 259200000 avg: 1602 max: 4460 99.9999th
  percentile: 2737 overflows: 0".
  Given that the maximum and 99.9999 percentile latency values are
  well below 5 microseconds, the results are acceptable, and they are
  not significantly different than the ones acquired with kernel
  v5.10.198.
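
The acceptance check on the latency summary can be sketched as below. The summary string is taken from this commit message; the helper names and the assumption that the values are in nanoseconds (so that 5 microseconds corresponds to a limit of 5000) are ours, and the format is that of the quoted summary rather than raw cyclictest output.

```python
import re

SUMMARY = ("samples: 259200000 avg: 1602 max: 4460 "
           "99.9999th percentile: 2737 overflows: 0")


def parse_summary(line):
    """Extract the numeric fields from a one-line latency summary."""
    pattern = (r"samples:\s*(\d+)\s+avg:\s*(\d+)\s+max:\s*(\d+)\s+"
               r"([\d.]+)th percentile:\s*(\d+)\s+overflows:\s*(\d+)")
    m = re.search(pattern, line)
    if m is None:
        raise ValueError("unrecognized summary line")
    return {
        "samples": int(m.group(1)),
        "avg": int(m.group(2)),
        "max": int(m.group(3)),
        "percentile": int(m.group(5)),
        "overflows": int(m.group(6)),
    }


def within_budget(stats, limit_ns=5000):
    """True if max and percentile are below the limit (assumed nanoseconds)
    and no overflows were recorded."""
    return (stats["max"] < limit_ns
            and stats["percentile"] < limit_ns
            and stats["overflows"] == 0)
```

Calling `within_budget(parse_summary(SUMMARY))` applies the 5-microsecond criterion to the figures reported above.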

Closes-Bug: 2043947

Change-Id: I558e40c4398428d73444bd4f50928c5248da0899
Signed-off-by: Peng Zhang <Peng.Zhang2@windriver.com>
2024-01-19 10:59:54 +08:00
M. Vefa Bicakci 8e6c846e06 qat2.0.l: Add version to shared library soname
This commit resolves the following error message printed out during
qat2.0.l and qatzip package builds:

  dpkg-shlibdeps: warning: can't extract name and version from \
    library name 'libqat_s.so'

This is caused by the lack of a version number in the "libqat_s.so" and
"libusdm_drv_s.so" shared libraries' file names as well as their soname
fields, and it is resolved by adding a placeholder version number to the
soname field of the shared libraries built by the qat2.0.l package. For
further information, please see the description of the included patch.

This commit also adds symbolic links from the non-versioned library file
names to the versioned library file names, to adhere to the shared
library conventions.
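
To illustrate why an unversioned soname cannot yield a name and version, here is a rough sketch of the kind of pattern matching a packaging tool might apply. The two regular expressions are an assumption made for illustration and are not taken from the dpkg sources; `split_soname` is a hypothetical helper.

```python
import re


def split_soname(soname):
    """Return (name, version) if the soname carries a version, else None."""
    # Form "libfoo.so.1": the version follows the ".so." separator.
    m = re.match(r"^(.*)\.so\.(.*)$", soname)
    if m:
        return m.group(1), m.group(2)
    # Form "libfoo-1.2.so": the version precedes the ".so" suffix.
    m = re.match(r"^(.*)-(\d.*)\.so$", soname)
    if m:
        return m.group(1), m.group(2)
    # No version component could be found, e.g. "libqat_s.so".
    return None
```

Under these assumed patterns, "libqat_s.so" yields no result, whereas "libqat_s.so.0" splits into a name and a placeholder version.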

Verification:
* An ISO image was successfully built with this commit and a
  cherry-picked version of the commit at the following link, and the
  build logs for both the qat2.0.l-common and the qatzip packages did
  not exhibit the aforementioned warning message:

  https://review.opendev.org/c/starlingx/kernel/+/890744

* The "qatzip" Debian package resulting from the build automatically
  included the qat2.0.l-common package in its dependencies list:

  ```
  $ dpkg-deb -f \
    /localdisk/.../std/qatzip/qatzip_1.1.2-1.stx.1_amd64.deb \
    Depends

  libc6 (>= 2.17), liblz4-1 (>= 0.0~r127), \
    qat2.0.l-common (>= 1.0.20), zlib1g (>= 1:1.2.2)
  ```

* The ISO image built with this commit was installed into a
  qemu/KVM-based virtual machine in All-in-One simplex low-latency mode,
  and running the "cpa_sample_code" and "qzip" executables did not
  result in shared library resolution-related error messages.
  Furthermore, "ldd" indicated that the libraries were successfully
  located by the dynamic linker. An example:

  $ ldd /usr/bin/qzip | grep -e libusdm_drv_s -e libqat_s
  libusdm_drv_s.so.0 => /lib/x86_64-linux-gnu/libusdm_drv_s.so.0 \
    (0x00007fd6150c6000)
  libqat_s.so.0 => /lib/x86_64-linux-gnu/libqat_s.so.0 \
    (0x00007fd614fcf000)

Closes-Bug: 2046175
Change-Id: I2039b09be89bc75540550d94acb779a489326dce
Signed-off-by: M. Vefa Bicakci <vefa.bicakci@windriver.com>
2023-12-11 17:33:37 +00:00
Jiping Ma 1f0f8289b9 bnxt_re: Fix the version compatibility issue
We encountered a version compatibility issue after upgrading mlnx-ofa_kernel
to 5.9. mlnx-ofa_kernel-5.5 is based on linux kernel 5.13-rc4.
mlnx-ofa_kernel-5.9 is based on linux kernel v6.0-rc5. We adapt bnxt_re
to mlnx-ofa_kernel-5.9 by referring to the following two upstream
commits and the bnxt_re-227.0.130.0 source code.

The definition of create_qp() was changed with the following commit
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.15-rc6&id=514aee660df493cd673154a6ba6bab745ec47b8c

IB_DEVICE_LOCAL_DMA_LKEY was removed with the following commit
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-6.0.y&id=e945c653c8e972d1b81a88e474d79f801b60213a

bnxt_re-220.0.12.0/main.c:1340:16: error: initialization of int \
  (*)(struct ib_qp *, struct ib_qp_init_attr *, struct ib_udata *) \
  from incompatible pointer type struct ib_qp * (*)(struct ib_pd *,\
  struct ib_qp_init_attr *, struct ib_udata *) \
  [-Werror=incompatible-pointer-types]
    1340 |  .create_qp  = bnxt_re_create_qp,
         |                ^~~~~~~~~~~~~~~~~
bnxt_re-220.0.12.0/ib_verbs.c:160:11: error: IB_DEVICE_LOCAL_DMA_LKEY \
  undeclared (first use in this function); did you mean \
  IBK_LOCAL_DMA_LKEY?
    160 |         | IB_DEVICE_LOCAL_DMA_LKEY
        |           ^~~~~~~~~~~~~~~~~~~~~~~~

Note: The dependency mlnx-ofed-kernel-dev package's name in debian/rules
and debian/control files was updated to have a @KERNEL_TYPE@ suffix, to
accommodate a similar change in the packaging of the Mellanox drivers.

Verification:
- The module builds successfully for kernel-std and kernel-rt.
- Installation of the ISO image is successful with standard and
  low-latency profiles.
- Physical function interfaces are up and pass packets for rt and std.
- Create VFs and ensure that the interfaces can come up and pass packets.
- RDMA/Infiniband over Ethernet functionalities of the Broadcom adapters
  were successfully tested using the Linux RDMA community's perftest
  package.

Story: 2010958
Task: 49057

Depends-On: https://review.opendev.org/c/starlingx/kernel/+/900742

Change-Id: Ib2e597811f9289c7840fcef662d44ca6dbf26270
Signed-off-by: Jiping Ma <jiping.ma2@windriver.com>
2023-11-16 13:53:54 +00:00
Jiping Ma 5f3e3f5a52 mlnx-ofa_kernel: upgraded the related packages
This upgrades the OFED driver related packages to the ones that are
located in https://linux.mellanox.com/public/repo/mlnx_ofed/5.9-0.5.6.0/SRPMS/
That includes rdma-core and the mlnx-tools package that
mlnx-ofa_kernel depends on, and the firmware tool mstflint.

The new versions are:
    mlnx-ofa_kernel-5.9.tgz
    rdma-core-59mlnx44.tgz
    mstflint-4.16.1-2.tar.gz
    mlnx-tools-5.2.0.tar.gz

Verification:
- Install onto a StarlingX system with two controller and two compute
  nodes whose network adapters use Mellanox's OFED drivers. The network
  adapters of the controllers are Mellanox Technologies MT27710 Family
  [ConnectX-4 Lx]; the network adapters of the computes are Mellanox
  Technologies MT27800 Family [ConnectX-5].
- Use mstflint to query the firmware on the device.
- Use mstflint to verify firmware.
- Use mstconfig to query configurations.
- Use mstvpd to dump the on-card VPD.
- Use mstregdump to dump hardware registers from Mellanox hardware.
- RDMA/Infiniband over Ethernet functionalities of the Mellanox adapters
  were successfully tested using the Linux RDMA community's perftest
  package.

Story: 2010958
Task: 49056

Depends-On: https://review.opendev.org/c/starlingx/kernel/+/900742

Change-Id: I7811eb10682e204225933316cd45a0ea8e84fb96
Signed-off-by: Jiping Ma <jiping.ma2@windriver.com>
2023-11-16 13:53:22 +00:00
Jiping Ma 9f51fddb27 mlnx-ofa_kernel: Upgrade OFED driver version to 5.9
This upgrades the OFED driver package to the mlnx-ofa_kernel-5.9.tgz
located in https://linux.mellanox.com/public/repo/mlnx_ofed/5.9-0.5.6.0/SRPMS/

In addition, this commit removes the irq_update_affinity_hint-related
patch because the fix is already included in the source code.

Reason:
The required Mellanox drivers must be upgraded to the latest version
(5.8+) to support Dell 15G and 16G platforms.

Verification:
- The module builds successfully for kernel-std and kernel-rt.
- The rdma-core, mstflint and mlnx-tools packages build successfully.
- Install onto a StarlingX All-in-One lab whose network adapters use
  Mellanox's OFED drivers. The network adapters of the controllers
  are Mellanox Technologies MT27800 Family [ConnectX-5].
- Install onto StarlingX systems in labs whose network adapters use
  Mellanox's OFED drivers. The network adapters of the controllers
  are [ConnectX-6 DX] and [ConnectX-6 LX].
- The physical function interfaces are up and pass packets for rt
  and std.
- Create VFs and ensure that the interfaces can come up and pass packets.
- RDMA/Infiniband over Ethernet functionalities of the Mellanox adapters
  were successfully tested using the Linux RDMA community's perftest
  package.

Story: 2010958
Task: 49055

Change-Id: I824e5e07b597e8b7cc518388a2ab93264b3f0947
Signed-off-by: Jiping Ma <jiping.ma2@windriver.com>
2023-11-14 21:08:57 -05:00
M. Vefa Bicakci a7c05b0c05 kernel-rt: Fix incorrect RT task migration
This commit resolves the following issue in StarlingX: When running
sysbench's disk I/O workload in a pod for an extended period, the
following warning shows up in the kernel logs, the fall-out from which
can eventually trigger a kernel panic, such as the following:

  ------------[ cut here ]------------
  WARNING: CPU: 49 PID: 0 at kernel/sched/core.c:2503 \
    set_task_cpu+0x1cd/0x1e0
  Modules linked in: ...
   ...
  CPU: 49 PID: 0 Comm: swapper/49 Kdump: loaded \
    Tainted: G S         O \
    5.10.0-6-rt-amd64 #1 Debian 5.10.177-1.stx.60
  Hardware name: HPE Edgeline e920t/Edgeline e920t, BIOS H10 04/20/2023
  RIP: 0010:set_task_cpu+0x1cd/0x1e0
  Code: ...
  <snip register dump>
  Call Trace:
   <IRQ>
   push_rt_task.part.0+0x1bf/0x410
   task_woken_rt+0x5d/0x70
   ttwu_do_wakeup+0x45/0x190
   try_to_wake_up+0x194/0x690
   __handle_irq_event_percpu+0x86/0x1f0
   ? mwait_idle+0x76/0x90
   handle_irq_event+0xa5/0x110
   handle_edge_irq+0x93/0x290
   asm_call_irq_on_stack+0xf/0x20
   </IRQ>
   common_interrupt+0xb3/0x130
   asm_common_interrupt+0x1e/0x40
  RIP: 0010:mwait_idle+0x76/0x90
  Code: ...
   <snip register dump>
   default_idle_call+0x3b/0x150
   do_idle+0x251/0x2f0
   cpu_startup_entry+0x19/0x20
   secondary_startup_64_no_verify+0xc2/0xcb
  ---[ end trace 0000000000000002 ]---
  ------------[ cut here ]------------

  DEBUG_LOCKS_WARN_ON(l->owner != current)
  WARNING: CPU: 0 PID: 1069 at include/linux/local_lock_internal.h:68 \
    __local_bh_enable+0x119/0x160
  Modules linked in: ...
   ...
  CPU: 0 PID: 1069 Comm: irq/1862-nvme0q Kdump: loaded \
    Tainted: G S      W  O \
    5.10.0-6-rt-amd64 #1 Debian 5.10.177-1.stx.60
  Hardware name: HPE Edgeline e920t/Edgeline e920t, BIOS H10 04/20/2023
  RIP: 0010:__local_bh_enable+0x119/0x160
  Code: ...
    <snip register dump>
  Call Trace:
   __local_bh_enable_ip+0x5e/0xd0
   irq_forced_thread_fn+0x73/0x80
   irq_thread+0x102/0x1d0
   ? irq_finalize_oneshot.part.0+0xe0/0xe0
   ? irq_thread_check_affinity+0xa0/0xa0
   kthread+0x176/0x190
   ? __kthread_parkme+0xa0/0xa0
   ret_from_fork+0x1f/0x30
  ---[ end trace 0000000000000003 ]---

  BUG: kernel NULL pointer dereference, address: 0000000000000000
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0002) - not-present page
  PGD 4a4b49067 P4D 0
  Oops: 0002 [#1] PREEMPT_RT SMP NOPTI
  CPU: 0 PID: 1000 Comm: irq/1795-nvme0q Kdump: loaded \
    Tainted: G S      W  O \
    5.10.0-6-rt-amd64 #1 Debian 5.10.177-1.stx.60
  Hardware name: HPE Edgeline e920t/Edgeline e920t, BIOS H10 04/20/2023
  RIP: 0010:rb_erase+0x1b4/0x350
  Code: ...
    <snip register dump>
  Call Trace:
   mark_wakeup_next_waiter+0x73/0x140
   rt_mutex_futex_unlock+0x60/0xb0
   dma_pool_free+0xa7/0xc0
   nvme_unmap_data.part.0+0x7b/0xc0 [nvme]
   nvme_pci_complete_rq+0x45/0xc0 [nvme]
   nvme_process_cq+0x173/0x290 [nvme]
   ? irq_thread_fn+0x60/0x60
   nvme_irq+0x10/0x20 [nvme]
   irq_forced_thread_fn+0x2e/0x80
   irq_thread+0x102/0x1d0
   ? irq_finalize_oneshot.part.0+0xe0/0xe0
   ? irq_thread_check_affinity+0xa0/0xa0
   kthread+0x176/0x190
   ? __kthread_parkme+0xa0/0xa0
   ret_from_fork+0x1f/0x30
  Modules linked in: ....
   ...
  CR2: 0000000000000000

In addition to the kernel panic mentioned above, we have also observed
three cases (one of which was locally reproduced) where the system under
test becomes unresponsive after a number of kernel warnings, starting
with the first warning quoted above. Reviewing the vmcore file generated
via the use of the magic system request key (while the system was
unresponsive) indicated that an NVMe-related IRQ thread may have been
migrated incorrectly while the thread had disabled migration.

We cherry-pick commit feffe5bb274d ("sched/rt: Fix bad task migration
for rt tasks") to resolve the aforementioned issues. These issues are
caused by a commit that was cherry-picked to the PREEMPT_RT kernel by
upstream:

The "fixed" commit was inherited by StarlingX from linux-yocto's
v5.10/standard/preempt-rt/base branch:

* commit ad592ffad5a7 ("sched,rt: Use the full cpumask for balancing")
  https://git.yoctoproject.org/linux-yocto/commit/?h=ad592ffad5a7

That commit in turn was likely inherited from the linux-stable-rt
project's rt-stable/v5.10-rt branch:

* commit 0523fce6f661 ("sched,rt: Use the full cpumask for balancing")
  https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git/commit/?id=0523fce6f661

And that commit was cherry-picked from the following mainline commit
(a.k.a., v5.11-rc1~7^2~30^2~5):

* commit 95158a89dd50 ("sched,rt: Use the full cpumask for balancing")
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=95158a89dd50

Unfortunately, the aforementioned commit introduced the bug we describe
above, which necessitated the following mainline bug-fix commit (a.k.a.
v6.4-rc1~94^2~1), which we are applying to the StarlingX kernel with
this commit:

* commit feffe5bb274d ("sched/rt: Fix bad task migration for rt tasks")
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=feffe5bb274d

Verification confirming that the issue is fixed, carried out with a
StarlingX-based distribution using the v5.10.177 PREEMPT_RT kernel:

* Different manifestations of the same issue were reproduced on two
  All-in-One Simplex systems installed with the low-latency profile
  (i.e., the PREEMPT_RT kernel) which were running the sysbench workload
  in a loop.

* One server reproduced the "set_task_cpu" warning followed by the
  kernel panic quoted above, after about 9 hours running the workload.
  At the time, we were not aware of the relationship of the
  "set_task_cpu" warning and the unresponsive system scenario. We also
  noticed that the panic occurred while the sysbench workload was disk
  I/O-intensive.

* We prepared a designer patch, incorporating an earlier version of this
  commit with the same cherry-picked mainline commit, and we installed
  the designer patch onto the same server. The sysbench workload was
  modified so that only the disk I/O-intensive parts of the workload
  were repeatedly executed. The workload was kept running for more than
  seven days, and the "set_task_cpu" warning was not encountered during
  this time frame.

* Concurrently, near the end of the sixth day of the tests running with
  the designer patch applied, another server (that was *not* patched)
  reproduced the "set_task_cpu" warning followed by the "unresponsive
  system" scenario after about 40 hours of uptime, during most of
  which the workload was running. Based on a review of the vmcore file
  (as discussed above) we connected the two issues (unresponsive system
  and kernel panic).

Verification carried out for due diligence, with a distribution based on
the StarlingX master branch using the v5.10.198 PREEMPT_RT kernel:

* An ISO image was successfully built with this commit (via "build-pkgs
  --reuse"), and the resulting ISO image was installed into a
  qemu/KVM-based virtual machine in All-in-One Simplex and low-latency
  configuration. The installation was Ansible bootstrapped successfully.

* We should note that we did encounter one issue after the Ansible
  bootstrap autonomously/expectedly rebooted the system. About 20
  minutes of uptime after the reboot, two "dockerd" tasks and one "sync"
  task were reported by the kernel to be blocked for more than 122
  seconds, with backtraces implicating file system sync operations. We
  currently believe that this issue is related to disk I/O contention on
  the host running the virtual machine, and that this issue is not
  related to this commit.

Closes-Bug: 2043023
Change-Id: Ifd186473c33d221a1e3e51c44edd7325f59a7c7f
Signed-off-by: M. Vefa Bicakci <vefa.bicakci@windriver.com>
2023-11-08 15:07:45 +00:00
Peng Zhang 88eaffd00c Update kernel to v5.10.198
This commit updates kernel to v5.10.198 to fix the following CVE issues:
1. CVE-2023-4244: https://nvd.nist.gov/vuln/detail/CVE-2023-4244
2. CVE-2023-31085: https://nvd.nist.gov/vuln/detail/CVE-2023-31085
3. CVE-2023-45871: https://nvd.nist.gov/vuln/detail/CVE-2023-45871
4. CVE-2023-5197: https://nvd.nist.gov/vuln/detail/CVE-2023-5197
5. CVE-2023-39194: https://nvd.nist.gov/vuln/detail/CVE-2023-39194
6. CVE-2023-39192: https://nvd.nist.gov/vuln/detail/CVE-2023-39192
7. CVE-2023-39193: https://nvd.nist.gov/vuln/detail/CVE-2023-39193
8. CVE-2023-42756: https://nvd.nist.gov/vuln/detail/CVE-2023-42756
9. CVE-2023-42754: https://nvd.nist.gov/vuln/detail/CVE-2023-42754
10. CVE-2023-39189: https://nvd.nist.gov/vuln/detail/CVE-2023-39189
11. CVE-2023-31084: https://nvd.nist.gov/vuln/detail/CVE-2023-31084
12. CVE-2023-3389: https://nvd.nist.gov/vuln/detail/CVE-2023-3389
13. CVE-2022-45884: https://nvd.nist.gov/vuln/detail/CVE-2022-45884
14. CVE-2023-42755: https://nvd.nist.gov/vuln/detail/CVE-2023-42755
15. CVE-2023-42752: https://nvd.nist.gov/vuln/detail/CVE-2023-42752
16. CVE-2023-4622: https://nvd.nist.gov/vuln/detail/CVE-2023-4622
17. CVE-2023-37453: https://nvd.nist.gov/vuln/detail/CVE-2023-37453
18. CVE-2023-42753: https://nvd.nist.gov/vuln/detail/CVE-2023-42753
19. CVE-2023-4623: https://nvd.nist.gov/vuln/detail/CVE-2023-4623
20. CVE-2023-4921: https://nvd.nist.gov/vuln/detail/CVE-2023-4921
One of our source patches would require a refresh against the new
kernel source. It was deleted instead because its content is already
contained in the new kernel:
  0072-kernel-fork-beware-of-__put_task_struct-calling-cont.patch.

Under PREEMPT_RT, after the kernel is upgraded to v5.10.198, the
raw_write_seqcount_t_begin function is still used by the
qdisc_run_begin function in include/net/sch_generic.h, whereas
raw_write_seqcount_t_begin was replaced by do_raw_write_seqcount_begin
in include/linux/seqlock.h by the following commit:

  Commit ID     Title
  a8dd21118b0f  seqlock: Prefix internal seqcount_t-only macros with
                a "do_"

To fix the implicit declaration of the function
raw_write_seqcount_t_begin, we replace it with
do_raw_write_seqcount_begin in the following patch:
  0083-net-replace-raw_write_seqcount_t_begin-by-do_raw_wri.patch

Verification:
- The kernel and out-of-tree modules build successfully for rt and std.
- The ISO image builds successfully for rt and std.
- Installation succeeds on an AIO-DX lab with the rt kernel.
- The lab boots up successfully.
- The sanity testing was done by our test team and no regression
  defect was found.
- The cyclictest benchmark was also run on the StarlingX lab; the
  result is "samples: 259200000 avg: 1610 max: 4658 99.9999th
  percentile: 2403 overflows: 0". The avg and percentile values do
  not differ significantly from those of v5.10.192.

Closes-Bug: 2038710

Change-Id: I7ed77309e83d4edd39623452c9348488f8db1523
Signed-off-by: Peng Zhang <Peng.Zhang2@windriver.com>
2023-11-01 19:09:37 +08:00
Zhixiong Chi 134d5d2fbd Add the pci reboot quirk in DMI table for Dell PowerEdge R750
Problem:
The Dell R750 hangs after the following command is executed:
$sudo -i /bin/bash -c 'echo b > /proc/sysrq-trigger'
This issue can be reproduced within about 5 test cycles.

The active controller sends a reboot command to mtcClient on the
standby controller due to the SM failure (heartbeat missed), and then
mtcClient tries to reboot the system gracefully. But if the standby
controller isn't rebooted within 120s, mtcClient tries to force a reboot
using the following command: "echo b > /proc/sysrq-trigger".
Unfortunately, the Dell PowerEdge R750 machine gets stuck and the BMC
console doesn't show anything.
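
The escalation from a graceful reboot to a forced one can be sketched as follows. This is only an illustration of the timeout logic described above, not mtcClient's implementation: the function name, its callable parameters, and the polling interval are all hypothetical, and the clock and sleep functions are injectable so that the logic can be exercised without waiting.

```python
import time


def reboot_with_fallback(graceful_reboot, is_down, force_reboot,
                         timeout_s=120.0, poll_s=1.0,
                         clock=time.monotonic, sleep=time.sleep):
    """Request a graceful reboot; force one if it has not happened in time."""
    graceful_reboot()
    deadline = clock() + timeout_s
    while clock() < deadline:
        if is_down():
            return "graceful"
        sleep(poll_s)
    # Timed out: fall back to the forced path, which in the scenario above
    # corresponds to writing "b" to /proc/sysrq-trigger.
    force_reboot()
    return "forced"
```

The forced path is the one that hangs on the R750 in the scenario described in this commit.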

Solution:
A search for relevant clues about this machine turned up nothing
except the kernel parameter 'reboot=p', which changes the reboot type
to pci_reboot for the sysrq magic key. After running the test cycle
multiple times with this kernel option, the issue no longer occurs
and the system reboots properly, as expected. So this approach helps
with resetting the Dell R750.
Considering that this kernel option should not be applied to all target
machines, we instead change the reboot type only for the R750 machine
by means of a DMI table quirk. Other machines still use the default
reboot type, and this commit affects only the R750 machine.

Based on the above, we add a pci reboot quirk to the DMI table that
changes the reboot_type to pci_reboot, to make sure that the kernel on
the Dell PowerEdge R750 reboots properly.

On the R750 target we can see the following dmidecode information:
$sudo dmidecode |grep 'Product Name'
	Product Name: PowerEdge R750
$sudo dmidecode |grep 'Vendor'
	Vendor: Dell Inc.
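
The matching idea behind the quirk can be sketched in user space. This
is only an illustration, not the kernel patch: the names R750_QUIRK,
parse_dmidecode and quirk_applies are hypothetical, and exact string
comparison is a simplifying assumption.

```python
import re

# Hypothetical quirk table for illustration only; the real quirk lives in
# the kernel's DMI reboot table, not in user space.
R750_QUIRK = {"Vendor": "Dell Inc.", "Product Name": "PowerEdge R750"}


def parse_dmidecode(text):
    """Collect the first value seen for each 'Key: value' line."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"\s*([^:]+):\s*(.*)$", line)
        if match:
            fields.setdefault(match.group(1).strip(), match.group(2).strip())
    return fields


def quirk_applies(fields, quirk=R750_QUIRK):
    """True only when every quirk field matches exactly (an assumption)."""
    return all(fields.get(key) == value for key, value in quirk.items())


# The two dmidecode lines quoted in this commit message.
sample = "\tProduct Name: PowerEdge R750\n\tVendor: Dell Inc.\n"
print(quirk_applies(parse_dmidecode(sample)))
```

A machine reporting a different product name does not match, which
mirrors the intent that only the R750 gets the changed reboot type.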

TestPlan:
PASS: downloader && build-pkgs && build-image
PASS: Jenkins Installation on R750 machine and the other labs.
PASS: Execute the following testing cycle more than 20 times:
       $sudo -i /bin/bash -c 'echo b > /proc/sysrq-trigger'
       The system rebooted properly in every test cycle.
       The hang after reset was no longer seen.

Closes-Bug: 2041606

Signed-off-by: Zhixiong Chi <zhixiong.chi@windriver.com>
Change-Id: I05467cc6d5105aa813852dca0c935278741b043f
2023-10-30 22:30:42 -04:00
Zuul a4bdae97dd Merge "cengn reference removal" 2023-10-11 18:26:05 +00:00
Peng Zhang b5cfde2411 Update kernel to v5.10.192
This commit updates the kernel to v5.10.192 to fix the following CVE issues:
CVE-2023-21400: https://nvd.nist.gov/vuln/detail/CVE-2023-21400
CVE-2023-3773: https://nvd.nist.gov/vuln/detail/CVE-2023-3773
CVE-2023-3777: https://nvd.nist.gov/vuln/detail/CVE-2023-3777
CVE-2023-4015: https://nvd.nist.gov/vuln/detail/CVE-2023-4015
CVE-2023-4208: https://nvd.nist.gov/vuln/detail/CVE-2023-4208
CVE-2023-4206: https://nvd.nist.gov/vuln/detail/CVE-2023-4206
CVE-2023-4207: https://nvd.nist.gov/vuln/detail/CVE-2023-4207
CVE-2023-3772: https://nvd.nist.gov/vuln/detail/CVE-2023-3772
CVE-2022-45887: https://nvd.nist.gov/vuln/detail/CVE-2022-45887
CVE-2022-45886: https://nvd.nist.gov/vuln/detail/CVE-2022-45886
CVE-2022-45919: https://nvd.nist.gov/vuln/detail/CVE-2022-45919
This commit also fixes the following CVE issues, which are fixed
in v5.10.190:
CVE-2022-45919: https://nvd.nist.gov/vuln/detail/CVE-2022-45919
CVE-2023-20588: https://nvd.nist.gov/vuln/detail/CVE-2023-20588
CVE-2023-35829: https://nvd.nist.gov/vuln/detail/CVE-2023-35829
CVE-2023-35828: https://nvd.nist.gov/vuln/detail/CVE-2023-35828
CVE-2023-35824: https://nvd.nist.gov/vuln/detail/CVE-2023-35824
CVE-2023-35823: https://nvd.nist.gov/vuln/detail/CVE-2023-35823
CVE-2023-2163: https://nvd.nist.gov/vuln/detail/CVE-2023-2163
CVE-2023-34256: https://nvd.nist.gov/vuln/detail/CVE-2023-34256
CVE-2022-39189: https://nvd.nist.gov/vuln/detail/CVE-2022-39189
CVE-2022-4269: https://nvd.nist.gov/vuln/detail/CVE-2022-4269
CVE-2023-1380: https://nvd.nist.gov/vuln/detail/CVE-2023-1380
CVE-2023-2002: https://nvd.nist.gov/vuln/detail/CVE-2023-2002
CVE-2023-21255: https://nvd.nist.gov/vuln/detail/CVE-2023-21255
CVE-2023-2269: https://nvd.nist.gov/vuln/detail/CVE-2023-2269
CVE-2023-31084: https://nvd.nist.gov/vuln/detail/CVE-2023-31084
CVE-2023-3268: https://nvd.nist.gov/vuln/detail/CVE-2023-3268
CVE-2023-3389: https://nvd.nist.gov/vuln/detail/CVE-2023-3389
CVE-2023-34319: https://nvd.nist.gov/vuln/detail/CVE-2023-34319
CVE-2023-4194: https://nvd.nist.gov/vuln/detail/CVE-2023-4194
CVE-2023-4147: https://nvd.nist.gov/vuln/detail/CVE-2023-4147
CVE-2023-4273: https://nvd.nist.gov/vuln/detail/CVE-2023-4273
CVE-2022-40982: https://nvd.nist.gov/vuln/detail/CVE-2022-40982
CVE-2023-4128: https://nvd.nist.gov/vuln/detail/CVE-2023-4128
CVE-2023-40283: https://nvd.nist.gov/vuln/detail/CVE-2023-40283
CVE-2023-1206: https://nvd.nist.gov/vuln/detail/CVE-2023-1206
CVE-2023-0160: https://nvd.nist.gov/vuln/detail/CVE-2023-0160

None of our source patches requires refresh against the new kernel
source.

Verification:
- The kernel and out-of-tree modules build successfully for rt and std.
- The ISO builds successfully for rt and std.
- Installation succeeds on an AIO-DX lab with the rt kernel.
- The system boots up successfully in the lab.
- Sanity testing was done by our test team and no regression
  defect was found.
- The cyclictest benchmark was also run in the StarlingX lab; the
  result is "samples: 259200000 avg: 1631 max: 9232 99.9999th
  percentile: 8542 overflows: 0". This is not a big difference from
  5.10.189 for avg.
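
A comparison like the one above can be made repeatable with a small
helper. The sketch below is an assumption about how one might do it,
not part of the test procedure used here; parse_cyclictest and
relative_change are hypothetical names, and the summary format is
taken from the strings quoted in these commit messages.

```python
import re


def parse_cyclictest(summary):
    """Extract the numeric fields from a cyclictest summary string."""
    pattern = (r"samples:\s*(\d+)\s+avg:\s*(\d+)\s+max:\s*(\d+)\s+"
               r"99\.9999th\s+percentile:\s*(\d+)\s+overflows:\s*(\d+)")
    match = re.search(pattern, summary)
    if match is None:
        raise ValueError("unrecognized cyclictest summary")
    keys = ("samples", "avg", "max", "p99_9999", "overflows")
    return dict(zip(keys, map(int, match.groups())))


def relative_change(new, old):
    """Percentage change of each latency metric from old to new."""
    return {key: 100.0 * (new[key] - old[key]) / old[key]
            for key in ("avg", "max", "p99_9999")}


# Summaries as quoted in the v5.10.192 and v5.10.189 commit messages.
v192 = parse_cyclictest("samples: 259200000 avg: 1631 max: 9232 "
                        "99.9999th percentile: 8542 overflows: 0")
v189 = parse_cyclictest("samples: 259199999 avg: 1633 max: 8817 "
                        "99.9999th percentile: 7612 overflows: 0")
for metric, change in relative_change(v192, v189).items():
    print(f"{metric}: {change:+.2f}%")
```

Reporting the change per metric makes it explicit which figures the
"no big difference" statement covers.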

Closes-Bug: 2036491
Closes-Bug: 2036311

Change-Id: I1a4d8c640c0a0bd9fc656b0d5bc46ee4e6937d86
Signed-off-by: Peng Zhang <Peng.Zhang2@windriver.com>
2023-10-10 10:50:22 +08:00
Zuul 83813f2c46 Merge "kernel-modules: Build-Depends on linux-headers-stx-amd64" 2023-10-07 01:27:22 +00:00
Zuul 078578fa44 Merge "Add pkgs without abiname for image/headers" 2023-10-07 01:27:17 +00:00
M. Vefa Bicakci 06a162c47d intel-iavf: Update from v4.5.3 to v4.5.3.2
This commit updates the default Intel NIC driver bundle version of the
iavf driver from v4.5.3 to v4.5.3.2 to resolve an issue involving system
hangs after the following messages are printed out by the iavf driver:

```
iavf 0000:51:11.0: Failed to init adminq: -53
iavf 0000:51:11.0: failed to allocate resources during reinit
```

This is reproduced with the following commands on iavf-4.5.3, which
carry out rapid virtual function (VF) interface resets:

```
while true; do
  # enp81s17 is the first VF interface
  ip l set dev enp81s17 up;

  # enp81s0f2 is the corresponding PF interface
  ip l set dev enp81s0f2 vf 0 trust on;
  ip l set dev enp81s0f2 vf 0 vlan 333;

  ip l set dev enp81s0f2 vf 0 trust off;
  ip l set dev enp81s0f2 vf 0 vlan 310;

  ip l set dev enp81s17 down;
  sleep 0.1 ;
done
```

Eventually, iavf reports the aforementioned error messages, and the VF
bring down operation hangs. This is followed by the hang of many
unrelated processes, likely due to the "rtnl" mutex.

This commit updates iavf from v4.5.3 to v4.5.3.2 to resolve this issue
and other issues that Intel has recommended to fix. Please note that
this version of the iavf driver is found in the "unsupported" directory
on Intel's Sourceforge project for NIC drivers, despite Intel having
recommended this version of the iavf driver to fix the reported issue.
This is how Intel provides fixed intermediate versions of their older
NIC drivers on Sourceforge. Furthermore, this version of iavf has gone
through testing by Intel as well as by the StarlingX community, despite
the driver having been declared as an "unsupported" version by Intel.

The corresponding mainline commits are as follows, but note that the
changes in iavf 4.5.3.2 are only loosely based on these commits, due to
the divergence between the out-of-tree and mainline versions of the iavf
source code:

* Commit 31071173771e ("iavf: Fix reset error handling")
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=31071173771e
  (This is the commit that resolves the issue the user in question has
  encountered.)

* Commit c2ed2403f12c ("iavf: Wait for reset in callbacks which trigger
  it")
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c2ed2403f12c

* Commit 7598f4b40bd6 ("iavf: Move netdev_update_features() into
  watchdog task")
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7598f4b40bd6

The iavf driver versions belonging to other Intel NIC driver bundle
versions are not updated due to the following reasons:

- intel-iavf-cvl-2.54: We do not yet know if this version of iavf
  (v4.0.1) is affected by this issue. The user reporting the issue fixed
  by this commit is currently using iavf v4.5.3, and we have not
  received field reports regarding a similar issue encountered with iavf
  v4.0.1.

- intel-iavf-cvl-4.10: This version of iavf (v4.6.1) is not affected by
  this issue, as the changes included in iavf v4.5.3.2 were backported
  by Intel from iavf v4.6.1.

Verification

- The following command with this commit results in a successful iavf
  kernel module build for standard and PREEMPT_RT kernels:
    build-pkgs -c -p iavf

- A StarlingX ISO image from 2023-09-28 was installed onto an All-in-One
  Duplex Dell XR11 lab with one quad-port Intel E810 NIC per server in
  low-latency mode (i.e., with the PREEMPT_RT kernel).

- The issue was reproduced using a script similar to the one depicted at
  the beginning of this commit message. We should note that the issue
  manifests itself usually within ~200 iterations of the loop.

- Afterwards, in a StarlingX build environment, the kernel and all of
  the kernel modules were built with this commit from scratch. The
  resulting *.deb files were copied to controller-1 of the StarlingX
  installation and converted into a "sneaky" designer patch with a
  customized version of the "sneaky_patch.py" script, the original
  version of which is available in StarlingX.

- The resulting designer patch was successfully applied onto
  controller-0 of the aforementioned StarlingX ISO image installation.
  Afterwards, it was confirmed that the iavf driver version changed from
  4.5.3 (prior to the designer patch) to 4.5.3.2 (after the application
  of the designer patch).

- Afterwards, a shell script based on the snippet quoted above was
  executed for 4000 iterations of the loop, without the reproduction of
  the original issue.

- Furthermore, basic tests with iavf-managed VF interfaces were carried
  out, involving creating two network namespaces on controller-0,
  assigning one iavf-managed VF interface to each network namespace, and
  finally, running iperf3 across the VF interfaces, from within the
  network namespaces.

Closes-Bug: 2037692
Change-Id: I75415e5668b002b91c2208bff081775c9eced083
Signed-off-by: M. Vefa Bicakci <vefa.bicakci@windriver.com>
2023-10-03 15:11:35 +00:00
Li Zhou 8fed8973bc kernel-modules: Build-Depends on linux-headers-stx-amd64
Remove the kernel abiname/version from Build-Depends in the OOT kernel
modules. After commit "Add pkgs without abiname for image/headers",
the new dependency chain is as follows:
linux-headers-stx-amd64 depends on linux-headers-5.10.0-6-amd64;
linux-headers-5.10.0-6-amd64 depends on linux-kbuild-5.10.

Package linux-keys-5.10 is renamed to linux-keys.
With that, the version numbers and abiname can be completely removed
from the Build-Depends of the OOT kernel modules.
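
The dependency chain described above can be pictured with a short
sketch. The DEPENDS table only restates the edges named in this commit
message; the resolve helper is a hypothetical illustration and is not
how the Debian tooling resolves Build-Depends.

```python
# Dependency edges as described in this commit message.
DEPENDS = {
    "linux-headers-stx-amd64": ["linux-headers-5.10.0-6-amd64"],
    "linux-headers-5.10.0-6-amd64": ["linux-kbuild-5.10"],
}


def resolve(package, depends=DEPENDS):
    """Return the package and its transitive dependencies, in visit order."""
    seen = []
    stack = [package]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.append(name)
        # Push in reverse so that dependencies are visited in listed order.
        stack.extend(reversed(depends.get(name, [])))
    return seen


# An OOT module only needs to name the unversioned package.
print(resolve("linux-headers-stx-amd64"))
```

Only the table changes when the abiname changes; the name an OOT
module lists in Build-Depends stays the same.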

The same is done for the RT kernel modules.

This prepares for upgrading the kernel to a new major version.

Test plan:
 PASS: Build all the packages and iso successfully.
 PASS: The rt/std installations are fine for both qemu and lab.
 PASS: No warning appears for insmod/modprobe.

Depends-On: https://review.opendev.org/c/starlingx/kernel/+/896187

Story: 2010643
Task: 48815

Signed-off-by: Li Zhou <li.zhou@windriver.com>
Change-Id: I860a751cf4c11f64c81877714ecddb10b488fa96
2023-09-26 22:01:29 -04:00
Li Zhou 5cc80444d7 Add pkgs without abiname for image/headers
Add 2 packages, linux-image-stx-amd64 and linux-headers-stx-amd64,
which don't have the abiname in their names. They depend on the
packages that do have the abiname in their names. We can then use
these 2 packages anywhere that involves the image/headers packages
(e.g. Build-Depends, yaml config and so on). When the abiname changes
in a later kernel upgrade, we no longer need to change those places.

We don't use linux-image-amd64/linux-headers-amd64 as Debian does
because they are built by linux-signed-amd64 and are coupled with the
signed kernel. We don't follow Debian's signing process, so we create
2 new packages that are coupled with the unsigned image.

Also, rename package "linux-keys-@version@" to "linux-keys" because the
"@version@" isn't necessary for this package. With that, the version
numbers can be completely removed from the Build-Depends of the OOT
kernel modules.

All of the above is done for the rt kernel too.

This prepares for upgrading the kernel to a new major version.

Test plan:
 PASS: 2 new pkgs linux-image-stx-amd64/linux-headers-stx-amd64 can be
       built successfully for linux.
 PASS: 2 new pkgs linux-rt-image-stx-amd64/linux-rt-headers-stx-amd64
       can be built successfully for linux-rt.

Story: 2010643
Task: 48815

Signed-off-by: Li Zhou <li.zhou@windriver.com>
Change-Id: I63f968d3b24728b2b5b08e889c26c3c4f6a0e1df
2023-09-26 22:00:59 -04:00
Peng Zhang 825266d5ac Update kernel to v5.10.189
This commit updates the kernel to 5.10.189 to fix the following CVE issues:
CVE-2023-4132: https://nvd.nist.gov/vuln/detail/CVE-2023-4132
CVE-2023-4004: https://nvd.nist.gov/vuln/detail/CVE-2023-4004
CVE-2023-20593: https://nvd.nist.gov/vuln/detail/CVE-2023-20593
CVE-2023-3863: https://nvd.nist.gov/vuln/detail/CVE-2023-3863
CVE-2023-31248: https://nvd.nist.gov/vuln/detail/CVE-2023-31248
CVE-2023-35001: https://nvd.nist.gov/vuln/detail/CVE-2023-35001
CVE-2023-3117: https://nvd.nist.gov/vuln/detail/CVE-2023-3117
CVE-2023-3611: https://nvd.nist.gov/vuln/detail/CVE-2023-3611
CVE-2023-3610: https://nvd.nist.gov/vuln/detail/CVE-2023-3610
CVE-2023-3776: https://nvd.nist.gov/vuln/detail/CVE-2023-3776
CVE-2023-3390: https://nvd.nist.gov/vuln/detail/CVE-2023-3390
CVE-2023-2898: https://nvd.nist.gov/vuln/detail/CVE-2023-2898

One of our source patches requires a refresh against the new kernel
source. It was modified to add a parameter that is now needed in the
new kernel:
       Port-negative-dentries-limit-feature-from-3.10.patch.

After the kernel upgrade, the new function eth_hw_addr_set is present
in linux-headers-5.10.0-6-common, but it is already defined in the
following driver modules:
        i40e,i40e-cvl-4.10,iavf,iavf-cvl-4.10,ice,ice-cvl-4.10.
To avoid the redefinition conflict, we allow the out-of-tree drivers
to use the newly added in-tree version of the eth_hw_addr_set
function. This is achieved by undefining the NEED_ETH_HW_ADDR_SET
macro.

Verification:
- The kernel and out-of-tree modules build successfully for rt and std.
- The ISO builds successfully for rt and std.
- Installation succeeds on an AIO-DX lab with the rt kernel.
- The system boots up successfully in the lab.
- Sanity testing was done by our test team and no regression
  defect was found.
- The cyclictest benchmark was also run in the StarlingX lab; the
  result is "samples: 259199999 avg: 1633 max: 8817 99.9999th
  percentile: 7612 overflows: 0". This is not a big difference from
  5.10.185 for avg and max.

Closes-Bug: 2029211

Change-Id: I107a0c0285ad2de39d56863cc5fed6273ad7fbd4
Signed-off-by: Peng Zhang <Peng.Zhang2@windriver.com>
2023-09-23 01:17:16 +08:00
Scott Little 68cfe67368 cengn reference removal
mirror.starlingx.cengn.ca no longer exists. CENGN is kindly forwarding
requests to the new location mirror.starlingx.windriver.com for now, but
that will only last a few months. We need to replace all the references
with the new URL.

I will also remove as many 'cengn' references as possible, replacing
them with 'stx_mirror'

Partial-Bug: 2033555
Change-Id: I250f7aff90f71ea67b1502c21b4e914ba682946c
Signed-off-by: Scott Little <scott.little@windriver.com>
2023-09-14 09:59:51 -04:00
Zuul 1d2b5fc39a Merge "perf/core: Fix perf_cgroup_switch()" 2023-09-13 17:03:21 +00:00
Alyson Deives Pereira 2f15f5cb6c perf/core: Fix perf_cgroup_switch()
When the system is stressed running pods on isolated cores (using
stress-ng for instance [1]) and the Power Metrics App [2] is also
being executed, the system hangs.

[1] https://github.com/ColinIanKing/stress-ng
[2] https://opendev.org/starlingx/app-power-metrics

Dmesg shows the following output:
WARNING: CPU: 16 PID: 207561 at
  kernel/events/core.c:868 perf_cgroup_switch+0x222/0x230
RIP: 0010:perf_cgroup_switch+0x222/0x230
Call Trace:
 ? __warn+0x79/0xc0
 ? perf_cgroup_switch+0x222/0x230
 ? report_bug+0x9e/0xc0
 ? handle_bug+0x41/0x90
 ? exc_invalid_op+0x14/0x70
 ? asm_exc_invalid_op+0x12/0x20
 ? perf_cgroup_switch+0x222/0x230
 ? perf_cgroup_switch+0xff/0x230
 __perf_event_task_sched_in+0x169/0x330
 ? __perf_event_task_sched_out+0x27c/0x6d0
 ? newidle_balance+0x3fd/0x480
 finish_task_switch.isra.0+0x118/0x4b0
 __schedule+0x2ae/0x930
 ? hrtimer_start_range_ns+0x2fc/0x420
 schedule+0xa7/0x110
 do_nanosleep+0x7c/0x1a0
 hrtimer_nanosleep+0x9b/0x140
 ? __hrtimer_init+0xe0/0xe0
 __x64_sys_nanosleep+0xad/0xe0
 do_syscall_64+0x30/0x40
 entry_SYSCALL_64_after_hwframe+0x61/0xc6

An upstream patch set fixes a race condition in perf_cgroup_switch.
Applying these patches to the stx kernel resolved the issue.

* commit a0827713e298
  ("perf/core: Don't pass task around when ctx sched in")
  (v5.18-rc2~8^2~3)

* commit 6875186aea5c
  ("perf/core: Use perf_cgroup_info->active to check if cgroup is
  active") (v5.18-rc2~8^2~2)

* commit 96492a6c558a
  ("perf/core: Fix perf_cgroup_switch()") (v5.18-rc2~8^2~1)

* commit e19cd0b6fa59
  ("perf/core: Always set cpuctx cgrp when enable cgroup event")
  (v5.18-rc2~8^2)

Note: It was verified that there are no "Fixes" commits in the mainline
kernel for the commits mentioned above.

Test plan:
PASS: The ISO builds successfully for rt and std.
PASS: Installation succeeds on an AIO-SX lab with both rt and std kernels.
PASS: Apply power-metrics app, launch stress pods and confirm the
      system is stable.

Closes-Bug: 2035124
Change-Id: I30fcb63e4564a23cdb26794f4dfefa748eaa0cee
Signed-off-by: Alyson Deives Pereira <alyson.deivespereira@windriver.com>
2023-09-13 10:16:15 -03:00
Zuul 4ab014cf35 Merge "Upversion Octeon-ep to 11.23.04 for Marvell Octeon based PCIe cards" 2023-09-06 19:19:03 +00:00
Zuul 16c3057ed5 Merge "kernel: Fix drivers panic during shutdown" 2023-09-06 19:09:34 +00:00
Jiping Ma 2a5f9f3490 kernel: Fix drivers panic during shutdown
This commit prevents the ice and iavf drivers from causing kernel
panics when forced reboot is initiated with "reboot -f".

Issue #1: iavf driver

If the netdev pointer is NULL, then iavf_remove() returns early to
ensure that it does not proceed with an already-freed netdev instance.
However, drvdata field of the iavf driver's pci_dev structure continues
to keep the former value of the netdev pointer, and this value can be
acquired from the pci_dev structure via pci_get_drvdata(). This causes
a kernel panic when a forced reboot/shutdown is in progress due to the
following sequence of events:

- The iavf_shutdown() callback is called by the kernel. This function
  detaches the device, brings it down if it was running and frees
  resources.
- Later, the associated PF driver's shutdown callback is called:
  ice_shutdown(). That callback calls, among others, sriov_disable(),
  which then indirectly calls iavf_remove() again.
- A kernel WARNING is reported because adminq_task->func is NULL in
  cancel_work_sync(&adapter->adminq_task) during iavf_remove(). The
  reason is that the resources had already been freed during the first
  iavf_remove() run.
  "WARNING: CPU: 63 PID: 93678 at kernel/workqueue.c:3047
    __flush_work.isra.0+0x6b/0x80"

The patch for iavf resolves this issue by checking the pci_dev
structure's is_busmaster field at the beginning of iavf_remove(). If the
PCI device had already been disabled by an earlier call to
iavf_shutdown() or iavf_remove(), via a call to pci_disable_device(),
then the is_busmaster field would be set to zero. Based on this logic,
if the is_busmaster field is set to zero, then the iavf_remove function
returns early. This in turn avoids the aforementioned kernel panic
caused by multiple calls to iavf_remove().
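
The early-return guard can be pictured with a toy model. This is not
the driver patch, which is C code inside iavf_remove(); FakePciDev and
the remove() function below are hypothetical stand-ins that only mimic
the is_busmaster check described above.

```python
class FakePciDev:
    """Toy stand-in for struct pci_dev; not real kernel code."""

    def __init__(self):
        self.is_busmaster = True
        self.resources_freed = 0


def pci_disable_device(pdev):
    """Model only the side effect the guard relies on."""
    pdev.is_busmaster = False


def remove(pdev):
    """Model of the guarded remove path: skip work if already disabled."""
    if not pdev.is_busmaster:
        return False  # an earlier shutdown/remove already ran
    pdev.resources_freed += 1
    pci_disable_device(pdev)
    return True


pdev = FakePciDev()
print(remove(pdev), remove(pdev), pdev.resources_freed)
```

The second call returns early, so the teardown work runs once even
when the remove path is entered twice during a forced reboot.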

Note that the description above is applicable to iavf-4.6.1 (in NIC
driver bundle cvl-4.10); however, a similar issue occurs in earlier
versions of the iavf driver as well, which necessitates the same fix.

Issue #2: ice driver

When the system is rebooted, then the PTP-related resources are released
by the ice driver's ice_remove() function before the irq_msix_misc
interrupt is disabled. However, the interrupt handler continues to use
these resources, and when the interrupt in question occurs, then a
kernel panic occurs.

This issue is fixed by disabling the irq_msix_misc interrupt before the
call to ice_ptp_release() in ice_remove().

Please note that colleagues at Intel have reviewed the fixes included in
this commit, and they have confirmed that these changes could be used as
a temporary workaround for now. The changes introduced by this commit
can be reverted once Intel resolves the aforementioned issues in the
official ice and iavf driver releases.

This issue can be reproduced with the steps below.
1. Install the sts-silicom app.
2. Make sure sts-silicom is in the running state.
3. reboot -f

Verification:
- build-pkgs; build-iso; install and boot up on aio-sx lab.
- With the fix, the issue cannot be reproduced using the steps
  above.

Closes-Bug: 2030725

Change-Id: Ib296dc3180023230c46aa028a7d7c4283b17cff0
Signed-off-by: Jiping Ma <jiping.ma2@windriver.com>
2023-09-05 23:24:19 -04:00
Zuul 07e9487413 Merge "QAT: Update QAT driver to version 2.0" 2023-08-31 15:50:11 +00:00
Peng Zhang 137e5b38b1 QAT: Update QAT driver to version 2.0
Intel 4th generation Xeon Scalable Processor (Sapphire Rapids) support
has been introduced for the platform. In order to leverage the
integrated QAT device of the SP-MCC SKUs, the QAT driver needs to be
upgraded to version 2.0.
To upgrade to version 2.0, the following items have been done:
	1. Update the QAT-related patches for code context changes;
	2. Update control, rules and similar files for QAT version 2.0.

Test plan:
	- PASS: build-pkgs -a && build-image
	- PASS: lsmod |grep qat
	- PASS: /etc/init.d/qat_service status/start/stop
	- PASS: ./cpa_sample_code

Story: 2010796
Task: 48248

Change-Id: I1cba1660a13d1f28eee2b35713a54d31c048c609
Signed-off-by: Peng Zhang <Peng.Zhang2@windriver.com>
2023-08-31 23:00:16 +08:00
M. Vefa Bicakci 5a1dfa8274 doc: Fix Zuul/tox failures
This commit resolves the Zuul/tox failures encountered when running
sphinx to generate documentation, which in turn prevents merging changes
that are otherwise fine:

```
docs: 350 W commands[1]> sphinx-build -a -E -W -d doc/build/doctrees \
  -b html doc/source doc/build/html [tox/tox_env/api.py:427]
Running Sphinx v6.2.1

Warning, treated as error:
Invalid configuration value found: 'language = None'. Update your \
  configuration to a valid language code. Falling back to 'en' \
  (English).
docs: 723 C exit 2 (0.37 seconds) \
    /home/zuul/src/opendev.org/starlingx/kernel> \
    sphinx-build -a -E -W -d doc/build/doctrees -b html doc/source \
    doc/build/html pid=1720 [tox/execute/api.py:279]
  docs: FAIL code 2 (0.44=setup[0.07]+cmd[0.00,0.37] seconds)
  evaluation failed :( (0.54 seconds)
```

This issue was fixed for another StarlingX repository (tools) with
  https://review.opendev.org/c/starlingx/tools/+/893165
from which this commit is inspired.

The issue is related to a Sphinx update that requires the language
parameter to be specified:
  https://github.com/sphinx-doc/sphinx/issues/10062
  https://github.com/sphinx-doc/sphinx/issues/10474
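
For reference, a fix of this kind usually amounts to one line in the
Sphinx configuration. The fragment below is a sketch: the file path is
assumed from the sphinx-build source directory shown in the log above,
and the exact change is in the commit itself.

```python
# doc/source/conf.py (fragment; path assumed from "doc/source" above)
# With -W, the warning about 'language = None' is treated as an error.
# Setting an explicit language code avoids the warning.
language = 'en'
```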

Partial-Bug: 1976377
Partial-Bug: 2033431

Change-Id: Ic20fec5145b1a4ddb12051f018614562a4773b95
Signed-off-by: M. Vefa Bicakci <vefa.bicakci@windriver.com>
2023-08-31 01:36:50 +00:00
Jiping Ma b541465cc3 kernel-rt: beware of __put_task_struct() calling context
Under PREEMPT_RT, __put_task_struct() indirectly acquires sleeping
locks. Therefore, it can't be called from a non-preemptible context.

Instead of calling __put_task_struct() directly, we defer it using
call_rcu(). A more natural approach would use a workqueue, but since
in PREEMPT_RT, we can't allocate dynamic memory from atomic context,
the code would become more complex because we would need to put the
work_struct instance in the task_struct and initialize it when we
allocate a new task_struct.

We hit the same panic 5 times: __put_task_struct() is called while the
process holds a lock, which triggers the kernel BUG_ON. The call trace
is shown further below.

We also need to cherry-pick the following commits because the
necessary context is not in 5.10.18x; for example,
DEFINE_WAIT_OVERRIDE_MAP is not defined there.

* commit 5f2962401c6e
  ("locking/lockdep: Exclude local_lock_t from IRQ inversions")
* commit 175b1a60e880
  ("locking/lockdep: Clean up check_redundant() a bit")
* commit bc2dd71b2836
  ("locking/lockdep: Add a skip() function to __bfs()")
* commit 0cce06ba859a
  ("debugobjects,locking: Annotate debug_object_fill_pool() wait type
   violation")

kernel BUG at kernel/locking/rtmutex.c:1331!
invalid opcode: 0000 [#1] PREEMPT_RT SMP NOPTI
......
Call Trace:
 rt_spin_lock_slowlock_locked+0xb2/0x2a0
 ? update_load_avg+0x80/0x690
 rt_spin_lock_slowlock+0x50/0x80
 ? update_load_avg+0x80/0x690
 rt_spin_lock+0x2a/0x30
 free_unref_page+0xc5/0x280
 __vunmap+0x17f/0x240
 put_task_stack+0xc6/0x130
 __put_task_struct+0x3d/0x180
 rt_mutex_adjust_prio_chain+0x365/0x7b0
 task_blocks_on_rt_mutex+0x1eb/0x370
 rt_spin_lock_slowlock_locked+0xb2/0x2a0
 rt_spin_lock_slowlock+0x50/0x80
 rt_spin_lock+0x2a/0x30
 free_unref_page_list+0x128/0x5e0
 release_pages+0x2b4/0x320
 tlb_flush_mmu+0x44/0x150
 tlb_finish_mmu+0x3c/0x70
 zap_page_range+0x12a/0x170
 ? find_vma+0x16/0x70
 do_madvise+0x99d/0xba0
 ? do_epoll_wait+0xa2/0xe0
 ? __x64_sys_madvise+0x26/0x30
 __x64_sys_madvise+0x26/0x30
 do_syscall_64+0x33/0x40
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Verification:
- build-pkgs; build-iso; install and boot up on aio-sx lab.
- The issue cannot be reproduced during almost 24 hours of stress-ng testing.
  while true; do sudo stress-ng --sched rr --mmapfork 23 -t 20; done
  while true; do sudo stress-ng --sched fifo --mmapfork 23 -t 20; done

Closes-Bug: 2031597
Signed-off-by: Jiping Ma <jiping.ma2@windriver.com>
Change-Id: If022441d61492eaec88eede8603a6cb052af99d1
2023-08-17 05:47:43 -04:00
Peng Zhang 481ad14aa4 Update kernel to v5.10.185
This commit updates the kernel to 5.10.185 to fix the following CVE issues:
CVE-2023-3609: https://nvd.nist.gov/vuln/detail/CVE-2023-3609
CVE-2023-3090: https://nvd.nist.gov/vuln/detail/CVE-2023-3090
CVE-2023-3212: https://nvd.nist.gov/vuln/detail/CVE-2023-3212
CVE-2023-35788: https://nvd.nist.gov/vuln/detail/CVE-2023-35788
CVE-2023-3141: https://nvd.nist.gov/vuln/detail/CVE-2023-3141
CVE-2023-3111: https://nvd.nist.gov/vuln/detail/CVE-2023-3111
CVE-2023-2124: https://nvd.nist.gov/vuln/detail/CVE-2023-2124
CVE-2023-3338: https://nvd.nist.gov/vuln/detail/CVE-2023-3338

None of our source patches requires refresh against the new kernel
source.

Verification:
- The kernel and out-of-tree modules build successfully for rt and std.
- The ISO builds successfully for rt and std.
- Installation succeeds on an AIO-DX lab with the rt kernel.
- The system boots up successfully in the lab.
- Sanity testing, covering the kernel and applications, was run
  by our test team.
- The cyclictest benchmark was also run in the StarlingX lab; the
  result is "samples: 259199999 avg: 1649 max: 9363 99.9999th
  percentile: 8579 overflows: 0". This is not a big difference from
  5.10.180 for avg and max.

Closes-Bug: 2025123
Change-Id: Ia4d825573e03a8c6f03a4c5f53104db5903f41ae
Signed-off-by: Peng Zhang <Peng.Zhang2@windriver.com>
2023-08-02 14:49:49 +08:00
M. Vefa Bicakci 0ee7813d66 kernel: Improve Sapphire Rapids CPU support
This commit cherry-picks commits from the mainline kernel to improve
Sapphire Rapids CPU support in the following components of the StarlingX
kernel: intel_idle, perf/x86/RAPL and powercap, and perf/x86/cstate.
(RAPL stands for "Running Average Power Limit", which is a CPU feature
for measuring and limiting power consumption.)

These improvements are required to support a new power metrics
application in StarlingX, which is intended to work with Sapphire Rapids
CPUs: https://opendev.org/starlingx/app-power-metrics

The following commits are cherry-picked as part of this effort, in
chronological order, organized by component:

=> intel_idle
* commit 9edf3c0ffef0
  ("intel_idle: add SPR support")
  (v5.18-rc1~203^2~3^3~5)
* commit da0e58c038e6
  ("intel_idle: add 'preferred_cstates' module argument")
  (v5.18-rc1~203^2~3^3~4)
* commit 3a9cf77b60dc
  ("intel_idle: add core C6 optimization for SPR")
  (v5.18-rc1~203^2~3^3~3)
* commit 39c184a6a9a7
  ("intel_idle: Fix the 'preferred_cstates' module parameter")
  (v5.18-rc5~22^2^2~1)
* commit 7eac3bd38d18
  ("intel_idle: Fix SPR C6 optimization")
  (v5.18-rc5~22^2^2)
* commit 1548fac47a11
  ("intel_idle: make SPR C1 and C1E be independent")
  (v6.0-rc1~184^2~2^2^2)

=> perf/x86/rapl + powercap
* commit ffb20c2e52e8
  ("perf/x86/rapl: Add msr mask support")
  (v5.12-rc1~146^2~3)
* commit b6f78d3fba7f
  ("perf/x86/rapl: Only check lower 32bits for RAPL energy counters")
  (v5.12-rc1~146^2~2)
* commit 838342a6d6b7
  ("perf/x86/rapl: Fix psys-energy event on Intel SPR platform")
  (v5.12-rc1~146^2~1)
* commit 931da6a0de5d
  ("powercap: intel_rapl: support new layout of Psys PowerLimit Register
    on SPR")
  (v5.17-rc1~167^2^4^2~1)
* commit 80275ca9e525
  ("perf/x86/rapl: Use standard Energy Unit for SPR Dram RAPL domain")
  (v6.1-rc4~3^2~3)

=> perf/x86/cstate
* commit 87bf399f86ec
  ("perf/x86/cstate: Add ICELAKE_X and ICELAKE_D support")
  (v5.14-rc1~7^2~1)
* commit 528c9f1daf20
  ("perf/x86/cstate: Add SAPPHIRERAPIDS_X CPU support")
  (v5.18-rc4~3^2)

The set of commits listed above is a reduced version of a slightly
larger superset of commits we had originally considered for
cherry-picking. We opted for the commits listed above to limit potential
impact on the StarlingX kernel by focusing on Sapphire Rapids support
and direct dependencies only.

We should note that we encountered a number of merge conflicts while
cherry-picking these commits; however, none of the merge conflict
resolutions required significantly altering the modifications made by
the original commits. The individual patch files denote the nature of
the merge conflicts.

Verification:

* The kernel recipes and all kernel modules were built from scratch with
  this commit, using the following command in a StarlingX build
  environment:

  $ build-pkgs -c -p linux,linux-rt,bnxt-en,i40e,i40e-cvl-2.54,\
  i40e-cvl-4.10,iavf,iavf-cvl-2.54,iavf-cvl-4.10,ice,ice-cvl-2.54,\
  ice-cvl-4.10,igb-uio,iqvlinux,kmod-opae-fpga-driver,mlnx-ofed-kernel,\
  octeon-ep,qat1.7.l

  These packages were further packaged into a StarlingX (ostree) patch
  for easier deployment.

* An Ansible-bootstrapped low-latency All-in-One simplex StarlingX
  set-up was prepared on a server with a Sapphire Rapids CPU.

* The ostree patch was installed onto the server to start testing our
  changes. The kernel was confirmed to boot up as expected.

* We enabled RAPL Psys domain reporting in the server's BIOS (originally
  disabled), and we also disabled the BIOS-enforced limit on the CPU
  *package* C-states (originally set to C0/C1).

* We forcibly removed the "intel_idle.max_cstate=0" kernel command line
  argument by modifying the sysinv daemon's Python source code on the
  server (with a systemd service that bind-mounts a replacement *.py
  file, to avoid another ostree patch). This was required to prevent the
  intel_idle driver from disabling itself, so that we could confirm the
  sanity of the cherry-picked commits.

* The following tests were carried out, first with the patched
  preempt-rt kernel, and next with the original unpatched preempt-rt
  kernel:

  * Confirm that the intel_idle CPU idling driver is active:
    $ cat /sys/devices/system/cpu/cpuidle/current_driver
  * Confirm the CPU idling state names and parameters:
    $ grep -s '^' \
    /sys/devices/system/cpu/cpu0/cpuidle/state[0-9]*/\
    {name,desc,time,latency,residency}
  * Confirm that the RAPL/powercap and C-state related performance
    monitor unit (PMU) counters are usable by the kernel and with perf:
    $ sudo perf list
  * Confirm that the CPU and package C-state residency counters are
    working:
    $ perf stat -a \
      -e cstate_core/c1-residency/ -e cstate_core/c6-residency/ \
      -e cstate_pkg/c2-residency/ -e cstate_pkg/c6-residency/ \
      -- sleep 5
  * Confirm that RAPL/powercap-related performance counters are working:
    $ perf stat -a \
      -e power/energy-pkg/ -e power/energy-ram/ -e power/energy-psys/ \
      -- sleep 5

  With the unpatched kernel, we observed that the intel_idle driver used
  CPU idling information exposed by the ACPI tables, with the following
  idle state names: POLL, C1_ACPI, C2_ACPI. With the patched kernel the
  C-state tables embedded in the intel_idle driver were used as
  expected, with the following idle state names: POLL, C1, C1E, C6.

  With the unpatched kernel, we observed that the CPU/package C-state
  residency counters were not detected, whereas they were detected with
  the patched kernel, as expected.

  With both the unpatched and the patched kernels, the RAPL/powercap
  related performance counters were detected. We observed that the units
  for the DRAM domain were incorrect for the unpatched kernel, which was
  expected due to the lack of commit 80275ca9e525 ("perf/x86/rapl: Use
  standard Energy Unit for SPR Dram RAPL domain").

* To confirm the sanity of our results acquired with the patched kernel
  in the previous step, we also carried out the following experiment
  with the v6.4.3-rt6 kernel available in the linux-yocto repository as
  commit 917d160a84f6 ("Merge branch 'v6.4/standard/base' into
  v6.4/standard/preempt-rt/base") in the "v6.4/standard/preempt-rt/base"
  branch.

  The "notification of death" StarlingX kernel patch was forward-ported
  to the v6.4.3-rt6 kernel and the "kernel.sched_nr_migrate" sysctl was
  reintroduced to make this kernel work with the aforementioned
  Ansible-bootstrapped StarlingX system.

  Furthermore, to ensure that the RAPL/powercap features are aligned to
  the most recent mainline kernel version, we cherry-picked the
  following commits from v6.5-rc1 onto the v6.4.3-rt6 kernel:
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/?qt=range&q=44c026a73be8..49776c712eb6

  Afterwards, this v6.4.3-rt6-based test kernel was built and installed
  onto the test server, and test procedures discussed in the previous
  step were repeated.

  Compared to the patched StarlingX v5.10 kernel, we observed that the
  RAPL/powercap measurements were similar, and that the CPU and package
  C-state residency counters did not differ substantially, with the
  v6.4.3-rt6-based test kernel.

* We should note that we have repeated tests with the patched StarlingX
  v5.10 kernel as well, but we did not reinstall the system to acquire a
  standard/non-low-latency set-up. Instead, we opted for running the
  following command, rebooting the system into the standard kernel,
  followed by repeating the test procedures, which had similar results.

  sudo grub-editenv /boot/1/kernel.env set kernel=vmlinuz-5.10.0-6-amd64
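
The idle state names compared in the tests above (for example POLL, C1,
C1E, C6 with the patched kernel) can also be gathered with a short
script. The following is a minimal sketch and was not part of the
original test procedure. The helper name cpuidle_state_names and its
base parameter are hypothetical, and the sysfs layout is assumed to
match the paths used in the grep command above.

```python
from pathlib import Path


def cpuidle_state_names(base="/sys/devices/system/cpu/cpu0/cpuidle"):
    """Return the idle state names for one CPU, ordered by state index."""
    # Sort numerically so that state10 comes after state9, not after state1.
    states = sorted(
        Path(base).glob("state[0-9]*"),
        key=lambda p: int(p.name[len("state"):]),
    )
    return [(s / "name").read_text().strip() for s in states]
```

Calling cpuidle_state_names() on the test server and comparing the
returned list between kernels would mirror the manual comparison
described above.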

Acknowledgements:

* Thanks to Alyson Deives Pereira for his extensive help in pruning the
  commits that we had originally thought of cherry-picking with this
  commit.

* Thanks to Mark Asselstine for his advice on the second phase of the
  commit pruning activity.

Story: 2010773
Task: 48449

Change-Id: Ibe6bff65e8a415ac027a5d493a0e65fe58c9e344
Signed-off-by: M. Vefa Bicakci <vefa.bicakci@windriver.com>
2023-07-24 13:44:38 +00:00
Shinas Rasheed 3d0ea9f6be Upversion Octeon-ep to 11.23.04 for Marvell Octeon based PCIe cards
Octeon-ep debian package includes PF, VF and virtual PHC device drivers.
These drivers are loaded as part of octeon operator deployment.
This commit includes release 11.23.04.

Test Plan:
Pass: build-pkgs -p octeon-ep
Pass: build-image

Tested and verified on hardware.

Story: 2010047
Task: 48033
Change-Id: Id505d96e190dd95e9200d2134142a17174712b64
Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
2023-06-30 10:39:41 -07:00
Alyson Deives Pereira f475c5db86 Enable Intel RAPL and uncore frequency control
This change enables support for the Intel RAPL (Running Average Power
Limit) technology via MSR interface, which allows power limits to be
enforced and monitored on modern Intel processors, and the
intel-uncore-frequency driver, which allows control of Uncore
frequency limits on supported server platforms.

This is achieved by enabling the following kernel configuration items:
- CONFIG_POWERCAP
- CONFIG_INTEL_RAPL
- CONFIG_INTEL_RAPL_CORE
- CONFIG_INTEL_UNCORE_FREQ_CONTROL

This change also adds intel-uncore-frequency support for Sapphire
Rapids processor by including the following upstream kernel commit:

* commit 60accc011af0 (v5.12-rc1~123^2~41)
("platform/x86/intel-uncore-freq: Add Sapphire Rapids server support")

TEST PLAN:
PASS: ISO builds succeed for rt and std.
PASS: Installation succeeds on an AIO-DX lab with both rt and std kernels.
PASS: The following kernel modules are successfully enabled on a
Sapphire Rapids server, on both std and rt kernels:
- sudo modprobe rapl
- sudo modprobe msr
- sudo modprobe intel_rapl_common
- sudo modprobe intel_rapl_msr
- sudo modprobe intel-uncore-frequency
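
As a complement to the modprobe checks, the four kernel configuration
items listed earlier in this message can be verified against a kernel
configuration file. This is a minimal sketch under stated assumptions:
the helper name missing_options is hypothetical, and the configuration
text is assumed to come from a file such as /boot/config-<release>,
which may not be present on every system.

```python
import re

REQUIRED = (
    "CONFIG_POWERCAP",
    "CONFIG_INTEL_RAPL",
    "CONFIG_INTEL_RAPL_CORE",
    "CONFIG_INTEL_UNCORE_FREQ_CONTROL",
)


def missing_options(config_text, required=REQUIRED):
    """Return the required options not set to 'y' or 'm' in config_text."""
    enabled = set(
        re.findall(r"^(CONFIG_\w+)=[ym]$", config_text, re.MULTILINE)
    )
    return [opt for opt in required if opt not in enabled]
```

An empty list means that every required option is either built in or
built as a module.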

Story: 2010773
Task: 48304

Change-Id: I1b618f65483c657d8c936f6f8494a8611ab09e70
Signed-off-by: Alyson Deives Pereira <alyson.deivespereira@windriver.com>
2023-06-30 14:36:34 -03:00
Peng Zhang 734233561c Update kernel to v5.10.180
This commit updates the kernel to 5.10.180 to fix the following CVE issues:
CVE-2023-32233: https://nvd.nist.gov/vuln/detail/CVE-2023-32233
CVE-2023-31436: https://nvd.nist.gov/vuln/detail/CVE-2023-31436
CVE-2023-2513: https://nvd.nist.gov/vuln/detail/CVE-2023-2513
CVE-2023-1859: https://nvd.nist.gov/vuln/detail/CVE-2023-1859
CVE-2023-34256: https://nvd.nist.gov/vuln/detail/CVE-2023-34256

One of our source patches was deleted, because its content is already
included in the new kernel:
       xfs-drop-submit-side-trans-alloc-for-append-ioends.patch

Verification:
- Kernel and out-of-tree modules build successfully for rt and std.
- ISO builds succeed for rt and std.
- Installation succeeds on an AIO-DX lab with the rt kernel.
- The lab boots up successfully.
- Sanity testing was done by our test team and no regression
  defect was found.
- The cyclictest benchmark was also run on the StarlingX lab. The
  result is "samples: 259200000 avg: 1660 max: 10167 99.9999th
  percentile: 2527 overflows: 0", which is not a big difference from
  5.10.177 for avg and max.
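
To compare benchmark summaries such as the one quoted above across
kernel versions, the "key: value" pairs can be parsed into a dictionary.
This is only a sketch: the helper name parse_summary is hypothetical,
and the summary is assumed to be joined onto a single line in exactly
the format quoted in this message.

```python
import re


def parse_summary(line):
    """Parse 'key: value' pairs from a latency summary line into a dict."""
    # The optional group keeps "99.9999th percentile" together as one key.
    pairs = re.findall(r"([\w.]+(?: percentile)?):\s*(\d+)", line)
    return {key: int(value) for key, value in pairs}
```

For example, parse_summary(...)["max"] yields the maximum value as an
integer, which can then be compared with the same field from another
run.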

Closes-Bug: 2021927
Change-Id: Ia676889d752715dc404132ed66e2f2ddb7d17d62
Signed-off-by: Peng Zhang <Peng.Zhang2@windriver.com>
2023-06-09 22:31:33 +08:00
Zhixiong Chi 221fe56504 Drop the components of livepatch feature
As there is no plan to support the livepatch feature in the StarlingX
community, we drop the packages and components of this feature.
Since this component depends on the userspace tool kpatch, this
patch should be merged first (another commit in the integ repo will
depend on this).

TestPlan:
PASS: build-pkgs -a
PASS: build-image
PASS: Jenkins installation.

Signed-off-by: Zhixiong Chi <zhixiong.chi@windriver.com>
Change-Id: Iaf96daddab40f87d6155333eae7d1780a3696764
2023-06-06 02:41:36 +00:00
Zuul eaff902150 Merge "Downgrade legacy ice driver from 1.5.8.2 to 1.5.8" 2023-05-23 15:01:16 +00:00
Haiqing Bai 20e578cdd8 kernel: Disable unprivileged eBPF by default
The following warning message is printed out starting with kernel
version 5.10.105 in response to a newer Spectre-type security issue:

  Spectre V2: WARNING: Unprivileged eBPF is enabled with eIBRS on, data
  leaks possible via Spectre v2 BHB attacks!

This message is printed out when Spectre v2 mitigations are enabled and
unprivileged eBPF is enabled.

This warning message was introduced with commit afc2d635b5e1
("x86/speculation: Include unprivileged eBPF status in spectre v2
mitigation reporting") in the Linux stable team's linux-5.10.y branch.
The first tag that includes this change in that branch is "v5.10.105".

This commit sets the "CONFIG_BPF_UNPRIV_DEFAULT_OFF" Kconfig option to
suppress the aforementioned warning message. Note that unprivileged eBPF
is disabled by default in most distributions. Disabling unprivileged
eBPF is recommended as a (partial) mitigation against attack primitives
known as Spectre-v2-BHB ("Spectre v2 aided by the Branch History
Buffer"), as documented at the following links:
- https://www.vusec.net/projects/bhi-spectre-bhb/
- https://www.intel.com/content/www/us/en/developer/articles/\
  technical/software-security-guidance/technical-documentation/\
  branch-history-injection.html

Also note that if unprivileged eBPF is re-enabled at runtime via
"sysctl" or by writing to "/proc/sys/kernel/unprivileged_bpf_disabled",
then the warning message in question will appear in the kernel logs.

Verification:
- On a test system, remove the kernel command line argument
  'nospectre_v2' from "/boot/efi/EFI/BOOT/boot.env", and save the file.
- Reboot.
- With this commit, the following warning message does not appear in the
  kernel's logs, as confirmed with "dmesg | grep eIBRS": "Spectre V2:
  WARNING: Unprivileged eBPF is enabled with eIBRS on, data leaks
  possible via Spectre v2 BHB attacks!". (Without this commit, the
  aforementioned warning message would appear in the kernel logs.)
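
The runtime state mentioned above can be inspected by reading
"/proc/sys/kernel/unprivileged_bpf_disabled". The sketch below maps the
value to its meaning as described in the kernel's sysctl documentation
(0, 1 and 2). The helper name describe_unprivileged_bpf is hypothetical,
and the sketch was not part of the verification carried out for this
commit.

```python
MEANINGS = {
    0: "unprivileged eBPF enabled",
    1: "unprivileged eBPF disabled until reboot",
    2: "unprivileged eBPF disabled, can be changed by an administrator",
}


def describe_unprivileged_bpf(raw):
    """Map the sysctl's raw text to a description of its meaning."""
    value = int(raw.strip())
    return MEANINGS.get(value, f"unrecognized value {value}")
```

With CONFIG_BPF_UNPRIV_DEFAULT_OFF set, the value is expected to be 2
after boot unless it has been changed at runtime.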

Closes-Bug: 2019268

Signed-off-by: Haiqing Bai <haiqing.bai@windriver.com>
Change-Id: I03d9ef494384c52cd4d81d02d8c76cd0fef6edb5
2023-05-16 12:46:33 +00:00
Jiping Ma 94d92e2047 Downgrade legacy ice driver from 1.5.8.2 to 1.5.8
This reverts commit ce36f16cd721b2b267d8658492f6eid3a6f64c81b.

The legacy ice driver was upgraded from v1.5.8 to v1.5.8.2 to fix
Bug-2016445. The fixed issue is a memory corruption bug that affects
use cases involving the ice driver and virtual function (VF)
interface reset operations, to the best of our current knowledge.

Despite efforts to validate ice driver v1.5.8.2, concerns remained that
the upgrade might negatively affect deployed systems. Hence, this commit
downgrades the legacy ice driver from v1.5.8.2 back to v1.5.8.

Verification:
- Run ethtool and confirm that the driver version is 1.5.8
- Confirm that links come up and pass traffic

Closes-Bug: 2019769

Signed-off-by: Jiping Ma <Jiping.ma2@windriver.com>
Change-Id: Ie25ceab4b8c677f15ce589dab207eb0b696b912f
2023-05-15 20:57:51 -04:00
Jim Somerville bc9fce3401 Port negative dentries limit feature from 3.10
This ports the Red Hat feature forward from the 3.10 kernel version.

This feature allows one to specify a loose maximum of total memory
which is allowed to be used for negative dentries.  This is done
via setting a sysctl variable which is used to calculate a
negative dentry limit for the system.  Every 15 seconds a kworker
task will prune back the negative dentries that exceed the limit,
plus an extra 1% for hysteresis purposes.

Intent is that the feature code is kept as close to the 3.10 version
as possible.

Main differences from the 3.10 version of the code:
- count of dentries associated with a superblock is kept in a
  different location, requiring a procedure call to obtain
- superblocks are now kept by node id and memcg, requiring
  more calls into iterate_super

Verification
- ensure the sysctl variable is set to 20:
    sysctl fs.negative-dentry-limit
- run a test program that continuously tries to access a lot of
  non-existent files, causing a lot of negative dentries to build up
- monitor the number of negative dentries building up in the system:
    cat /proc/sys/fs/dentry-state
  the fifth number is the negative dentries
- get the calculated negative dentry limit:
    dmesg | grep Negative
- watch the negative dentries you are monitoring periodically drop
  to below the limit before they start building back up again.  The
  drop should happen about 4 times per minute
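
The monitoring step above can be scripted by extracting the fifth field
of /proc/sys/fs/dentry-state. This is a minimal sketch; the helper name
negative_dentries is hypothetical, and it assumes that the kernel
reports the negative dentry count in the fifth field, as stated above.

```python
def negative_dentries(dentry_state_text):
    """Return the fifth field of /proc/sys/fs/dentry-state as an integer."""
    fields = dentry_state_text.split()
    if len(fields) < 5:
        raise ValueError("expected at least five fields in dentry-state")
    return int(fields[4])
```

Sampling this value every few seconds and logging it with a timestamp
should make the periodic drops visible.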

Closes-Bug: 2017703

Signed-off-by: Jim Somerville <jim.somerville@windriver.com>
Change-Id: I3f55249aab45471802d123ed2253b6f36cc4af50
2023-05-10 17:43:03 -04:00
Zuul 455abfa46c Merge "Fix github mirroring for this repo" 2023-05-01 20:55:07 +00:00
Davlet Panech aa4e9eee05 Fix github mirroring for this repo
Updating the rsa ssh host key based on:
https://github.blog/2023-03-23-we-updated-our-rsa-ssh-host-key/

Note: In the future, StarlingX should have a zuul job and
secret setup for all repos so we do not need to do this
for every repo.

Needed to rename the secret, because zuul fails if like-named
secrets have different values in different branches of the same
repo.

Partial-Bug: #2015246
Change-Id: Ia60a3b7e0725182edd64078aa7f8c6bb4b35a373
Signed-off-by: Davlet Panech <davlet.panech@windriver.com>
2023-04-28 12:38:51 -04:00
M. Vefa Bicakci 8829ae9ffa kernel: intel_pstate: Support newer CPUs
This commit cherry-picks commits from the mainline kernel tree to let
the intel_pstate driver support the following newer CPUs in certain
cases:

* Ice Lake
* Sapphire Rapids

Support for the following cases is added:

* When hardware P-states (HWP) is disabled in the BIOS/firmware, then
  the intel_pstate driver will still get enabled (albeit in 'passive'
  mode) for these CPUs, as opposed to the intel_pstate driver reporting
  that the CPU is not supported.

* When out-of-band (OOB) P-state management is enabled with Ice Lake and
  Sapphire Rapids CPUs, then the intel_pstate driver gracefully disables
  itself, as the BIOS/platform firmware is responsible for managing the
  HWP in such cases.

The following bullet point list depicts the commits that have been
cherry-picked, along with the output of "git describe --contains" for
each commit, to provide a sense of how recent each commit is:

* commit fbdc21e9b038 ("cpufreq: intel_pstate: Add Icelake servers
  support in no-HWP mode") (v5.14-rc1~144^2~1^2~1^2~6)

* commit cd23f02f1668 ("cpufreq: intel_pstate: Add Ice Lake server to
  out-of-band IDs") (v5.16-rc3~24^2~3)

* commit bbd67f1b5a94 ("cpufreq: intel_pstate: Support Sapphire Rapids
  OOB mode") (v5.19-rc1~182^2~2^2~11)

* commit df51f287b5de ("cpufreq: intel_pstate: Add Sapphire Rapids
  support in no-HWP mode") (v6.2-rc1~189^2~3^2~4)

We should note that we considered cherry-picking the following commits
too, but we opted to not do that due to the reasons discussed below:

* commit 706c5328851d ("cpufreq: intel_pstate: Add Cometlake support in
  no-HWP mode") (v5.14-rc1~144^2~1^2~1^2~5)

* commit b6e6f8beec98 ("cpufreq: intel_pstate: Update EPP for AlderLake
  mobile") (v5.17-rc1~167^2~2^2~21)

* commit 71bb5c82aaae ("cpufreq: intel_pstate: Add Tigerlake support in
  no-HWP mode") (v6.1-rc1~205^2~1^2~1)

* commit 60675225ebee ("cpufreq: intel_pstate: Adjust
  balance_performance EPP for Sapphire Rapids") (v6.3-rc1~21^2~6)

The commits related to Comet Lake and Tiger Lake were skipped, because
they refer to non-server class CPUs to the best of our knowledge, and
StarlingX is usually run on server-class hardware.

The commit for Alder Lake mobile CPUs is a dependency for the commit
that adjusts the Energy/Performance Preference (EPP) setting for
Sapphire Rapids CPUs. The latter commit improves performance when the
'powersave' governor is used, or when balance_performance setting is
used in the BIOS. While it is possible to cherry-pick both of these
commits (which was done in an earlier iteration of this commit), we
opted to not do that, mainly to keep the scope of this commit smaller,
and also because the two commits appeared optional (as opposed to
necessary) to us.

Verification

- The standard and real-time kernel packages were successfully built
  with this commit.

- An ISO image, built with a slightly different version of this commit
  for an older CentOS-based StarlingX version, containing Ice Lake
  CPU-related changes only, was installed onto a server that has an Ice
  Lake Xeon CPU, with HWP disabled and with out-of-band hardware P-state
  management enabled in the BIOS. This resulted in the intel_pstate
  driver being disabled, as expected.

- A StarlingX ISO image based on the StarlingX master branch with this
  commit included was successfully installed onto a server with a
  Sapphire Rapids Xeon CPU, in low-latency All-in-One simplex mode.
  (Please note that the server was not Ansible-bootstrapped due to
  unrelated difficulties.)

  The intel_pstate driver was successfully initialized for the case
  where HWP was enabled ("Native Mode") in the BIOS settings, and for
  the case where HWP was turned off ("Disabled") in the BIOS settings.
  For the former case, intel_pstate's status was reported as "active",
  and for the latter case, intel_pstate's status was reported as
  "passive", as indicated by
  "/sys/devices/system/cpu/intel_pstate/status".

  Prior to this commit, with HWP disabled in the BIOS, intel_pstate
  would not load due to the CPU not being recognized.

- On the same server with a Sapphire Rapids CPU, enabling out-of-band
  HWP management caused the intel_pstate driver to disable itself by
  printing out "P-states controlled by the platform" and returning
  -ENODEV, as expected.
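
The status checks above can be automated by reading
"/sys/devices/system/cpu/intel_pstate/status". The following is a
minimal sketch; the helper name intel_pstate_status is hypothetical, and
a missing status file is assumed (not confirmed here) to mean that the
driver did not register, as in the out-of-band case.

```python
from pathlib import Path


def intel_pstate_status(path="/sys/devices/system/cpu/intel_pstate/status"):
    """Return the driver status string, or None if the file is absent."""
    try:
        return Path(path).read_text().strip()
    except FileNotFoundError:
        return None
```

On the test server described above, the expected return values would be
"active" with HWP enabled and "passive" with HWP disabled in the BIOS.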

Partial-Bug: 2016028
Change-Id: I55384c2239d6543662eeef62e86a4b8951887fd7
Signed-off-by: M. Vefa Bicakci <vefa.bicakci@windriver.com>
2023-04-27 21:34:29 +00:00
Peng Zhang d5db2760ab Update kernel to v5.10.177
This commit updates the kernel to 5.10.177 to fix the following CVE issue:
CVE-2022-4379: https://nvd.nist.gov/vuln/detail/CVE-2022-4379

One of our source patches required a refresh against the new kernel
source. It was modified to accommodate the context changes in the new
kernel:
	0001-Notification-of-death-of-arbitrary-processes.patch

Verification:
- Kernel and out-of-tree modules build successfully for rt and std.
- ISO builds succeed for rt and std.
- Installation succeeds on an All-in-One lab with the rt kernel.
- The lab boots up successfully.
- Sanity testing, covering the kernel and applications, was run
  by our test team.
- The cyclictest benchmark was also run on the StarlingX lab. The result
  is "samples: 259199999 avg: 1614 max: 4759 99.9999th percentile: 2572
  overflows: 0". This is not a big difference from 5.10.162 for avg and
  max, but the percentile seems a little lower than with 5.10.162.

Closes-Bug: 2015711
Change-Id: I98a92534154989446ba6eda9529cd799498ee800
Signed-off-by: Peng Zhang <Peng.Zhang2@windriver.com>
2023-04-28 03:51:42 +08:00
Jim Somerville 63492b8ddd Reduce kernfs_mutex contention
This is a backport of a collection of 12 upstream patches.
The main one switches kernfs_mutex to a rwsem.
The next most important one makes the rwsem a per-filesystem lock
instead of a global one.

See the individual patches for details.  They did not require
much work or wiggling to get them applied.

They all come from Linus' tree and are easily located.  As such
I have not modified their individual headers with upstream
commit ids.

Verification:
- Two scripts, the concept behind them supplied by Vefa Bicakci.
The first one causes a lot of concurrent contention in sysfs.
The second one shows how well systemd copes while contending as well.
Run Script1 followed by Script2.

Without this change, Script2 has timeouts and fails.

Script1:
for i in `seq 20`; do
  (while :; do find /sys/fs/cgroup/ -type f -readable -print0 \
    2>/dev/null | xargs -0 -n 20 -r cat >&/dev/null ; done) &
done

for i in `seq 10`; do
  (while :; do systemd-run --scope -q sleep 0.5 >/dev/null; done) &
done

Script2:
while true; do
  date -Is
  /usr/bin/time -f %e systemctl enable  -q lighttpd.service || break
  /usr/bin/time -f %e systemctl disable -q lighttpd.service || break
  /usr/bin/time -f %e systemctl restart -q lighttpd.service || break
  sleep 0.5 || break
done

- also soak testing to ensure that these patches don't introduce issues

Partial-Bug: 2016028

Signed-off-by: Jim Somerville <jim.somerville@windriver.com>
Change-Id: I6ad64cd7c90f756c6eb904065febfeb516e73009
2023-04-25 17:51:09 +00:00
Jiping Ma 9001d5abaa Fix kernel build warning
Correct the date format in the changelog file.

Error logs:
stderr: dch: warning:     debian/changelog(l6): badly formatted
        trailer line
stderr: LINE:  -- Jiping Ma <jiping.ma2@windriver.com>  Wed Jan 11
        9:33:12 CST 2023
stderr: dch: warning:     debian/changelog(l8): found start of
        entry where expected more change data or trailer
stderr: LINE: linux (5.10.152-1) unstable; urgency=medium
stderr: dch: warning:     debian/changelog(l8): found end of file
        where expected more change data or trailer
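
For reference, the trailer line of a debian/changelog entry expects a
date of the form "Wed, 11 Jan 2023 09:33:12 +0800" rather than the
format shown in the error log above. The sketch below builds such a
trailer. The helper name changelog_trailer is hypothetical, and treating
"CST" as UTC+8 in the usage note is an assumption, since the
abbreviation is ambiguous.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def changelog_trailer(name, email, when):
    """Format a debian/changelog trailer line from an aware datetime."""
    # Two spaces separate the maintainer from the date in a trailer line.
    return f" -- {name} <{email}>  {format_datetime(when)}"
```

For instance, passing datetime(2023, 1, 11, 9, 33, 12,
tzinfo=timezone(timedelta(hours=8))) produces a date field in the
expected form.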

Closes-Bug: 2017385

Signed-off-by: Jiping Ma <jiping.ma2@windriver.com>
Change-Id: Ia9f0abd3d30b1e755e56609b673014ad6d5002a4
2023-04-23 23:10:39 -04:00