How can I inject an AER Error?
Sophia Terry
I need to inject AER errors on a SUSE machine. I've modprobed the aer_inject module just fine, and I compiled the aer-inject tool from kernel.org.
Whenever I run it, I get the following error:
Error: Failed to write, No such device
This happens even though my device exists according to lspci -vvv and I'm running with root permissions.
Here's the file I'm passing to aer-inject:
AER
PCI_ID 18:00.0
COR_STATUS BAD_TLP
HEADER_LOG 0 1 2 3
On my machine, 18:00.0 corresponds to:
18:00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
It has Advanced Error Reporting according to lspci -vvv.
Why am I getting this error? Am I using the tool correctly? What should I put for the PCI_ID field if not what I see in lspci?
1 Answer
I just hit this same "Failed to write, No such device" issue on openSUSE Leap 15.2 running on a Dell T30 server. It turns out AER handling has an owner, and the aer_inject module will fail to find devices if AER handling appears to be owned by something other than the kernel (possibly tied by the BIOS to ACPI?). Regardless, I got aer-inject to work by appending pcie_ports=native to the kernel command line and rebooting.
FWIW I used yast2 to append the pcie_ports=native option: yast2 -> System -> Boot Loader -> Kernel Parameters -> Optional Kernel Command Line Parameter
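After rebooting, it may be worth confirming that the option actually reached the running kernel before retrying aer-inject. Below is a minimal sketch, assuming a Linux system where /proc/cmdline holds the boot command line. The helper name has_kernel_param is made up for illustration and is not part of aer-inject or the kernel tooling.

```python
from pathlib import Path


def has_kernel_param(cmdline: str, param: str = "pcie_ports=native") -> bool:
    """Return True if `param` appears as a whole token in `cmdline`.

    Hypothetical helper: splits on whitespace so that a longer option
    that merely contains `param` as a substring does not match.
    """
    return param in cmdline.split()


if __name__ == "__main__":
    # /proc/cmdline exists on Linux; skip quietly on other systems.
    proc_cmdline = Path("/proc/cmdline")
    if proc_cmdline.exists():
        text = proc_cmdline.read_text()
        if has_kernel_param(text):
            print("pcie_ports=native is set on the running kernel")
        else:
            print("pcie_ports=native is NOT set; check the boot loader config")
```

If this reports that the option is missing, the boot loader change probably did not take effect, and aer-inject would be expected to fail the same way as before.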