path: root/drivers/edac
Commit message [Author; Age; Files; Lines]
* Merge tag 'irq-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip [Linus Torvalds; 2021-08-30; 1 file; -5/+2]
|\
    Pull irq updates from Thomas Gleixner:
     "Updates to the interrupt core and driver subsystems:

      Core changes:
       - The usual set of small fixes and improvements all over the place, but nothing stands out

      MSI changes:
       - Further consolidation of the PCI/MSI interrupt chip code
       - Make MSI sysfs code independent of PCI/MSI and expose the MSI interrupts of platform devices in the same way as PCI exposes them.

      Driver changes:
       - Support for ARM GICv3 EPPI partitions
       - Treewide conversion to generic_handle_domain_irq() for all chained interrupt controllers
       - Conversion to bitmap_zalloc() throughout the irq chip drivers
       - The usual set of small fixes and improvements"

    * tag 'irq-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits)
      platform-msi: Add ABI to show msi_irqs of platform devices
      genirq/msi: Move MSI sysfs handling from PCI to MSI core
      genirq/cpuhotplug: Demote debug printk to KERN_DEBUG
      irqchip/qcom-pdc: Trim unused levels of the interrupt hierarchy
      irqdomain: Export irq_domain_disconnect_hierarchy()
      irqchip/gic-v3: Fix priority comparison when non-secure priorities are used
      irqchip/apple-aic: Fix irq_disable from within irq handlers
      pinctrl/rockchip: drop the gpio related codes
      gpio/rockchip: drop irq_gc_lock/irq_gc_unlock for irq set type
      gpio/rockchip: support next version gpio controller
      gpio/rockchip: use struct rockchip_gpio_regs for gpio controller
      gpio/rockchip: add driver for rockchip gpio
      dt-bindings: gpio: change items restriction of clock for rockchip,gpio-bank
      pinctrl/rockchip: add pinctrl device to gpio bank struct
      pinctrl/rockchip: separate struct rockchip_pin_bank to a head file
      pinctrl/rockchip: always enable clock for gpio controller
      genirq: Fix kernel doc indentation
      EDAC/altera: Convert to generic_handle_domain_irq()
      powerpc: Bulk conversion to generic_handle_domain_irq()
      nios2: Bulk conversion to generic_handle_domain_irq()
      ...
| * EDAC/altera: Convert to generic_handle_domain_irq() [Marc Zyngier; 2021-08-12; 1 file; -5/+2]
    Replace generic_handle_irq(irq_linear_revmap()) with a single call to generic_handle_domain_irq().

    Signed-off-by: Marc Zyngier <maz@kernel.org>
* | Merge tag 'edac_updates_for_v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras [Linus Torvalds; 2021-08-30; 8 files; -38/+202]
|\ \
| |/
|/|
    Pull EDAC updates from Borislav Petkov:
     "The usual EDAC stuff which managed to trickle in for 5.15:
       - Add new HBM2 (High Bandwidth Memory Gen 2) type and add support for it to the Intel SKx drivers
       - Print additional useful per-channel error information on i10nm, like on SKL
       - Don't load the AMD EDAC decoder in virtual images
       - The usual round of fixes and cleanups"

    * tag 'edac_updates_for_v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
      EDAC/i10nm: Retrieve and print retry_rd_err_log registers
      EDAC/i10nm: Fix NVDIMM detection
      EDAC/skx_common: Set the memory type correctly for HBM memory
      EDAC/altera: Skip defining unused structures for specific configs
      EDAC/mce_amd: Do not load edac_mce_amd module on guests
      EDAC/mc: Add new HBM2 memory type
      EDAC/amd64: Use DEVICE_ATTR helper macros
| * EDAC/i10nm: Retrieve and print retry_rd_err_log registers [Youquan Song; 2021-08-23; 4 files; -3/+157]
    Retrieve and print retry_rd_err_log registers like the earlier change:

      commit e80634a75aba ("EDAC, skx: Retrieve and print retry_rd_err_log registers")

    This is a little trickier than on Skylake because of potential interference with BIOS use of the same registers. The default behavior is to ignore these registers.

    A module parameter retry_rd_err_log (default=0) controls the mode of operation:
    - 0=off  : Default.
    - 1=bios : Linux doesn't reset any control bits, but just reports values. This is "no harm" mode, but it may miss reporting some data.
    - 2=linux: Linux tries to take control and resets mode bits, clears valid/UC bits after reading. This should be more reliable (especially if BIOS interference is reduced by disabling eMCA reporting mode in BIOS setup).

    Co-developed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
    Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
    Signed-off-by: Youquan Song <youquan.song@intel.com>
    Signed-off-by: Tony Luck <tony.luck@intel.com>
    Link: https://lore.kernel.org/r/20210818175701.1611513-3-tony.luck@intel.com
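The three retry_rd_err_log modes described above can be modeled in a few lines. This is a minimal userland sketch, not the driver's code: the enum and function names are hypothetical, and only the mode numbers and their meanings come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Mode numbers as listed in the commit message; the names are hypothetical. */
enum retry_mode {
    RETRY_OFF   = 0, /* default: ignore the registers */
    RETRY_BIOS  = 1, /* report values, never reset control bits */
    RETRY_LINUX = 2  /* take control, clear valid/UC bits after reading */
};

/* True if the value is one of the three documented modes. */
bool retry_mode_valid(int mode)
{
    return mode >= RETRY_OFF && mode <= RETRY_LINUX;
}

/* Modes 1 and 2 both read and report the registers. */
bool retry_reports_values(int mode)
{
    assert(retry_mode_valid(mode));
    return mode != RETRY_OFF;
}

/* Only mode 2 modifies register state after a read. */
bool retry_clears_after_read(int mode)
{
    assert(retry_mode_valid(mode));
    return mode == RETRY_LINUX;
}
```

In this model, mode 1 is the only one that reports values without modifying register state, which matches the "no harm" description.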
| * EDAC/i10nm: Fix NVDIMM detection [Qiuxu Zhuo; 2021-08-23; 1 file; -3/+3]
    MCDDRCFG is a per-channel register and uses bit{0,1} to indicate the NVDIMM presence on DIMM slot{0,1}. Current i10nm_edac driver wrongly uses MCDDRCFG as per-DIMM register and fails to detect the NVDIMM.

    Fix it by reading MCDDRCFG as per-channel register and using its bit{0,1} to check whether the NVDIMM is populated on DIMM slot{0,1}.

    Fixes: d4dc89d069aa ("EDAC, i10nm: Add a driver for Intel 10nm server processors")
    Reported-by: Fan Du <fan.du@intel.com>
    Tested-by: Wen Jin <wen.jin@intel.com>
    Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
    Signed-off-by: Tony Luck <tony.luck@intel.com>
    Link: https://lore.kernel.org/r/20210818175701.1611513-2-tony.luck@intel.com
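The per-channel bit layout in this fix can be illustrated as follows. This is a userland sketch under stated assumptions: the register value is passed in as a plain integer, and the function and macro names are hypothetical rather than taken from i10nm_edac. Only the "bit N of the per-channel MCDDRCFG means NVDIMM in slot N" rule comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SLOTS_PER_CHANNEL 2 /* slot 0 and slot 1, per the commit message */

/*
 * mcddrcfg is the value read once per channel; bit 0 covers DIMM slot 0
 * and bit 1 covers DIMM slot 1.
 */
bool nvdimm_present(uint32_t mcddrcfg, unsigned int slot)
{
    assert(slot < SLOTS_PER_CHANNEL);
    return (mcddrcfg >> slot) & 1u;
}
```

For a channel whose register reads 0x2, this reports an NVDIMM in slot 1 and none in slot 0. Reading the register per DIMM, which the commit describes as the bug, would not apply this per-slot bit test to a single channel-wide value.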
| * EDAC/skx_common: Set the memory type correctly for HBM memory [Qiuxu Zhuo; 2021-08-23; 1 file; -1/+4]
    Set the memory type to MEM_HBM2 if it's managed by the HBM2 memory controller.

    Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
    Signed-off-by: Tony Luck <tony.luck@intel.com>
    Link: https://lore.kernel.org/r/20210720163009.GA1417532@agluck-desk2.amr.corp.intel.com
| * EDAC/altera: Skip defining unused structures for specific configs [Krzysztof Kozlowski; 2021-08-16; 1 file; -18/+26]
    The Altera EDAC driver has several features conditionally built depending on Kconfig options. The edac_device_prv_data structures are conditionally used in of_device_id tables. They reference other functions and structures which can be defined as __maybe_unused. Silence build warnings like:

      drivers/edac/altera_edac.c:643:37: warning: ‘altr_edac_device_inject_fops’ defined but not used [-Wunused-const-variable=]

    Reported-by: kernel test robot <lkp@intel.com>
    Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
    Signed-off-by: Borislav Petkov <bp@suse.de>
    Acked-by: Dinh Nguyen <dinguyen@kernel.org>
    Link: https://lkml.kernel.org/r/20210601092704.203555-1-krzysztof.kozlowski@canonical.com
| * EDAC/mce_amd: Do not load edac_mce_amd module on guests [Smita Koralahalli; 2021-08-09; 1 file; -0/+3]
    Hypervisors likely do not expose the SMCA feature to the guest and loading this module leads to false warnings. This module should not be loaded in guests to begin with, but people tend to do so, especially when testing kernels in VMs. And then they complain about those false warnings.

    Do the practical thing and do not load this module when running as a guest to avoid all that complaining.

    [ bp: Rewrite commit message. ]

    Suggested-by: Borislav Petkov <bp@suse.de>
    Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
    Signed-off-by: Borislav Petkov <bp@suse.de>
    Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
    Tested-by: Kim Phillips <kim.phillips@amd.com>
    Link: https://lkml.kernel.org/r/20210628172740.245689-1-Smita.KoralahalliChannabasappa@amd.com
| * EDAC/mc: Add new HBM2 memory type [Naveen Krishna Chatradhi; 2021-07-20; 1 file; -0/+1]
    Add a new entry to 'enum mem_type' and a new string to 'edac_mem_types[]' for the new HBM2 (High Bandwidth Memory Gen 2) memory type.

    Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
    Signed-off-by: Muralidhara M K <muralimk@amd.com>
    Signed-off-by: Naveen Krishna Chatradhi <nchatrad@amd.com>
    Signed-off-by: Tony Luck <tony.luck@intel.com>
    Link: https://lore.kernel.org/r/20210630152828.162659-4-nchatrad@amd.com
| * EDAC/amd64: Use DEVICE_ATTR helper macros [Dwaipayan Ray; 2021-07-13; 1 file; -13/+8]
    Instead of "open coding" DEVICE_ATTR, use the corresponding helper macros DEVICE_ATTR_{RW,RO,WO} in amd64_edac.c.

    Some function names needed to be changed to match the device conventions <foo>_show and <foo>_store, but the functionality itself is unchanged.

    The devices using EDAC_DCT_ATTR_SHOW() are left unchanged.

    Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
    Signed-off-by: Dwaipayan Ray <dwaipayanray1@gmail.com>
    Signed-off-by: Tony Luck <tony.luck@intel.com>
    Link: https://lore.kernel.org/r/20210713065130.2151-1-dwaipayanray1@gmail.com
* | EDAC/igen6: fix core dependency AGAIN [Randy Dunlap; 2021-07-15; 1 file; -1/+1]
|/
    My previous patch had a typo/thinko which prevents this driver from being enabled: change X64_64 to X86_64.

    Fixes: 0a9ece9ba154 ("EDAC/igen6: fix core dependency")
    Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
    Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
    Cc: linux-edac@vger.kernel.org
    Cc: bowsingbetee <bowsingbetee@protonmail.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Tony Luck <tony.luck@intel.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'akpm' (patches from Andrew) [Linus Torvalds; 2021-07-02; 1 file; -0/+1]
|\
    Merge more updates from Andrew Morton:
     "190 patches.

      Subsystems affected by this patch series: mm (hugetlb, userfaultfd, vmscan, kconfig, proc, z3fold, zbud, ras, mempolicy, memblock, migration, thp, nommu, kconfig, madvise, memory-hotplug, zswap, zsmalloc, zram, cleanups, kfence, and hmm), procfs, sysctl, misc, core-kernel, lib, lz4, checkpatch, init, kprobes, nilfs2, hfs, signals, exec, kcov, selftests, compress/decompress, and ipc"

    * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (190 commits)
      ipc/util.c: use binary search for max_idx
      ipc/sem.c: use READ_ONCE()/WRITE_ONCE() for use_global_lock
      ipc: use kmalloc for msg_queue and shmid_kernel
      ipc sem: use kvmalloc for sem_undo allocation
      lib/decompressors: remove set but not used variabled 'level'
      selftests/vm/pkeys: exercise x86 XSAVE init state
      selftests/vm/pkeys: refill shadow register after implicit kernel write
      selftests/vm/pkeys: handle negative sys_pkey_alloc() return code
      selftests/vm/pkeys: fix alloc_random_pkey() to make it really, really random
      kcov: add __no_sanitize_coverage to fix noinstr for all architectures
      exec: remove checks in __register_bimfmt()
      x86: signal: don't do sas_ss_reset() until we are certain that sigframe won't be abandoned
      hfsplus: report create_date to kstat.btime
      hfsplus: remove unnecessary oom message
      nilfs2: remove redundant continue statement in a while-loop
      kprobes: remove duplicated strong free_insn_page in x86 and s390
      init: print out unknown kernel parameters
      checkpatch: do not complain about positive return values starting with EPOLL
      checkpatch: improve the indented label test
      checkpatch: scripts/spdxcheck.py now requires python3
      ...
| * kernel.h: split out panic and oops helpers [Andy Shevchenko; 2021-07-01; 1 file; -0/+1]
    kernel.h is being used as a dump for all kinds of stuff for a long time. Here is the attempt to start cleaning it up by splitting out panic and oops helpers.

    There are several purposes of doing this:
    - dropping dependency in bug.h
    - dropping a loop by moving out panic_notifier.h
    - unload kernel.h from something which has its own domain

    At the same time convert users tree-wide to use new headers, although for the time being include new header back to kernel.h to avoid twisted indirected includes for existing users.

    [akpm@linux-foundation.org: thread_info.h needs limits.h]
    [andriy.shevchenko@linux.intel.com: ia64 fix]
    Link: https://lkml.kernel.org/r/20210520130557.55277-1-andriy.shevchenko@linux.intel.com
    Link: https://lkml.kernel.org/r/20210511074137.33666-1-andriy.shevchenko@linux.intel.com
    Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
    Co-developed-by: Andrew Morton <akpm@linux-foundation.org>
    Acked-by: Mike Rapoport <rppt@linux.ibm.com>
    Acked-by: Corey Minyard <cminyard@mvista.com>
    Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
    Acked-by: Arnd Bergmann <arnd@arndb.de>
    Acked-by: Kees Cook <keescook@chromium.org>
    Acked-by: Wei Liu <wei.liu@kernel.org>
    Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Acked-by: Sebastian Reichel <sre@kernel.org>
    Acked-by: Luis Chamberlain <mcgrof@kernel.org>
    Acked-by: Stephen Boyd <sboyd@kernel.org>
    Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
    Acked-by: Helge Deller <deller@gmx.de> # parisc
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge tag 'edac_updates_for_v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras [Linus Torvalds; 2021-06-30; 11 files; -60/+625]
|\ \
    Pull EDAC updates from Tony Luck:
     "Various fixes and support for new CPUs:
       - Clean up error messages from thunderx_edac
       - Add MODULE_DEVICE_TABLE to ti_edac so it will autoload
       - Use %pR to print resources in aspeed_edac
       - Add Yazen Ghannam as MAINTAINER for AMD edac drivers
       - Fix Ice Lake and Sapphire Rapids drivers to report correct "near" or "far" device for errors in 2LM configurations
       - Add support of on package high bandwidth memory in Sapphire Rapids
       - New CPU support for three CPUs supporting in-band ECC (IOT SKUs for ICL-NNPI, Tiger Lake and Alder Lake)
       - Don't even try to load Intel EDAC drivers when running as a guest
       - Fix Kconfig dependency on X86_MCE_INTEL for EDAC_IGEN6"

    * tag 'edac_updates_for_v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
      EDAC/igen6: fix core dependency
      EDAC/Intel: Do not load EDAC driver when running as a guest
      EDAC/igen6: Add Intel Alder Lake SoC support
      EDAC/igen6: Add Intel Tiger Lake SoC support
      EDAC/igen6: Add Intel ICL-NNPI SoC support
      EDAC/i10nm: Add support for high bandwidth memory
      EDAC/i10nm: Add detection of memory levels for ICX/SPR servers
      EDAC/skx_common: Add new ADXL components for 2-level memory
      MAINTAINERS: Make Yazen Ghannam maintainer for EDAC-AMD64
      EDAC/aspeed: Use proper format string for printing resource
      EDAC/ti: Add missing MODULE_DEVICE_TABLE
      EDAC/thunderx: Remove irrelevant variable from error messages
| * | EDAC/igen6: fix core dependency [Randy Dunlap; 2021-06-20; 1 file; -1/+2]
    igen6_edac needs mce_register()/unregister() functions, so it should depend on X86_MCE (or X86_MCE_INTEL).

    That change prevents these build errors:

      ld: drivers/edac/igen6_edac.o: in function `igen6_remove':
      igen6_edac.c:(.text+0x494): undefined reference to `mce_unregister_decode_chain'
      ld: drivers/edac/igen6_edac.o: in function `igen6_probe':
      igen6_edac.c:(.text+0xf5b): undefined reference to `mce_register_decode_chain'

    Fixes: 10590a9d4f23e ("EDAC/igen6: Add EDAC driver for Intel client SoCs using IBECC")
    Reported-by: kernel test robot <lkp@intel.com>
    Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
    Signed-off-by: Tony Luck <tony.luck@intel.com>
    Link: https://lore.kernel.org/r/20210619160203.2026-1-rdunlap@infradead.org
| * | EDAC/Intel: Do not load EDAC driver when running as a guest [Luck, Tony; 2021-06-17; 4 files; -0/+12]
    There's little to no point in loading an EDAC driver running in a guest:
    1) The CPU model reported by CPUID may not represent actual h/w
    2) The hypervisor likely does not pass in access to memory controller devices
    3) Hypervisors generally do not pass corrected error details to guests

    Add a check in each of the Intel EDAC drivers for X86_FEATURE_HYPERVISOR and simply return -ENODEV in the init routine.

    Acked-by: Borislav Petkov <bp@suse.de>
    Signed-off-by: Tony Luck <tony.luck@intel.com>
    Link: https://lore.kernel.org/r/20210615174419.GA1087688@agluck-desk2.amr.corp.intel.com
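The shape of that init-time check can be sketched in userland. The struct and function below are hypothetical stand-ins: in the kernel the flag is X86_FEATURE_HYPERVISOR, and this sketch deliberately does not reproduce the exact helper the drivers call to test it. Only the "bail out with -ENODEV when running as a guest" behavior is taken from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the CPU feature flags the kernel tracks. */
struct cpu_features {
    bool hypervisor; /* models X86_FEATURE_HYPERVISOR */
};

/* Models a driver init routine: refuse to load when running as a guest. */
int edac_driver_init(const struct cpu_features *cpu)
{
    assert(cpu != NULL);

    if (cpu->hypervisor)
        return -ENODEV;

    /* A real driver would go on to probe the memory controllers here. */
    return 0;
}
```

Returning -ENODEV from init means the module simply fails to load on a guest, so none of the three problems listed in the commit message can surface later.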
| * | EDAC/igen6: Add Intel Alder Lake SoC support [Qiuxu Zhuo; 2021-06-17; 1 file; -11/+73]
    Alder Lake SoC shares the same memory controller and In-Band ECC (IBECC) IP with Tiger Lake SoC. Like Tiger Lake, it also has two memory controllers, each associated with one IBECC instance. The minor differences include the MMIO offset of each memory controller and the type of memory error address logged in the IBECC.

    So add Alder Lake compute die IDs, adjust the MMIO offset for each memory controller and handle the type of memory error address logged in the IBECC for Alder Lake EDAC support.

    Tested-by: Vrukesh V Panse <vrukesh.v.panse@intel.com>
    Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
    Signed-off-by: Tony Luck <tony.luck@intel.com>
    Link: https://lore.kernel.org/r/20210611170123.1057025-7-tony.luck@intel.com
| * | EDAC/igen6: Add Intel Tiger Lake SoC support [Qiuxu Zhuo; 2021-06-17; 1 file; -20/+253]
    Tiger Lake SoC shares the same memory controller and In-Band ECC (IBECC) IP with Elkhart Lake SoC. The main differences are that Tiger Lake has two memory controllers each associated with one IBECC and uses Machine Check for the memory error notification.

    So add Tiger Lake compute die IDs, MCE decoding chain registration, and memory slice decoding for Tiger Lake EDAC support.

    Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
    Signed-off-by: Tony Luck <tony.luck@intel.com>
    Link: https://lore.kernel.org/r/20210611170123.1057025-6-tony.luck@intel.com
| * | EDAC/igen6: Add Intel ICL-NNPI SoC support [Qiuxu Zhuo; 2021-06-17; 1 file; -0/+29]
    The Ice Lake Neural Network Processor for Deep Learning Inference (ICL-NNPI) SoC shares the same memory controller and In-Band ECC with Elkhart Lake SoC. Add the ICL-NNPI compute die IDs for EDAC support.

    Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
    Signed-off-by: Tony Luck <tony.luck@intel.com>
    Link: https://lore.kernel.org/r/20210611170123.1057025-5-tony.luck@intel.com
| * | EDAC/i10nm: Add support for high bandwidth memory [Qiuxu Zhuo; 2021-06-17; 3 files; -19/+148]
    A future Xeon processor will include in-package HBM (high bandwidth memory). The in-package HBM memory controller shares the same architecture with the regular DDR memory controller. Add the HBM memory controller devices for EDAC support.

    Tested-by: Hongyu Ning <hongyu.ning@linux.intel.com>
    Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
    Signed-off-by: Tony Luck <tony.luck@intel.com>
    Link: https://lore.kernel.org/r/20210611170123.1057025-4-tony.luck@intel.com
| * | EDAC/i10nm: Add detection of memory levels for ICX/SPR servers [Qiuxu Zhuo; 2021-06-17; 2 files; -0/+42]
    Current i10nm_edac driver is only for system configured in 1-level memory. If the system is configured in 2-level memory, the driver doesn't report the 1st level memory DIMM for the error address, even if the error occurs in the 1st level memory.

    Both Ice Lake servers and Sapphire Rapids servers can be configured in 2-level memory. Add detection of memory levels to i10nm_edac for the two kinds of servers so that the driver can report the 2nd level memory DIMM or the 1st level memory DIMM according to error source.

    Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
    Signed-off-by: Tony Luck <tony.luck@intel.com>
    Link: https://lore.kernel.org/r/20210611170123.1057025-3-tony.luck@intel.com
| * | EDAC/skx_common: Add new ADXL components for 2-level memory [Qiuxu Zhuo; 2021-06-17; 2 files; -11/+67]
    Some Intel servers may configure memory in 2 levels, using fast "near" memory (e.g. DDR) as a cache for larger, slower, "far" memory (e.g. 3D X-point).

    In these configurations the BIOS ADXL address translation for an address in a 2-level memory range will provide details of both the "near" and "far" components.

    Current exported ADXL components are only for 1-level memory system or for 2nd level memory of 2-level memory system. So add new ADXL components for 1st level memory of 2-level memory system to fully support 2-level memory system and the detection of memory error source (1st level memory or 2nd level memory).

    Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
    Signed-off-by: Tony Luck <tony.luck@intel.com>
    Link: https://lore.kernel.org/r/20210611170123.1057025-2-tony.luck@intel.com
| * | EDAC/aspeed: Use proper format string for printing resource [Arnd Bergmann; 2021-05-18; 1 file; -2/+2]
    On ARMv7, resource_size_t can be 64-bit, which breaks printing it as %x:

      drivers/edac/aspeed_edac.c: In function 'init_csrows':
      drivers/edac/aspeed_edac.c:257:28: error: format '%x' expects argument of \
        type 'unsigned int', but argument 4 has type 'resource_size_t' {aka 'long \
        long unsigned int'} [-Werror=format=]
        257 | dev_dbg(mci->pdev, "dt: /memory node resources: first page \
          r.start=0x%x, resource_size=0x%x, PAGE_SHIFT macro=0x%x\n",

    Use the special %pR format string to pretty-print the entire resource instead.

    Fixes: edfc2d73ca45 ("EDAC/aspeed: Add support for AST2400 and AST2600")
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Signed-off-by: Borislav Petkov <bp@suse.de>
    Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
    Link: https://lkml.kernel.org/r/20210421135500.3518661-1-arnd@kernel.org
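The underlying problem, a 64-bit value passed to a 32-bit conversion specifier, has a userland analogue. The sketch below is an illustration only: %pR is a kernel-only printk extension, so this example uses the standard PRIx64 macro instead, and the struct, function name, and output layout are hypothetical (the layout only loosely resembles what the kernel prints for a resource).

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a memory resource with 64-bit bounds. */
struct mem_range {
    uint64_t start;
    uint64_t end;
};

/*
 * Format the whole range with width-correct specifiers. Using "%x" here
 * would be undefined behavior for a 64-bit argument, which is the class
 * of bug the compiler flagged in the commit message.
 */
int format_range(char *buf, size_t len, const struct mem_range *r)
{
    assert(buf != NULL && r != NULL);
    return snprintf(buf, len, "[mem 0x%08" PRIx64 "-0x%08" PRIx64 "]",
                    r->start, r->end);
}
```

Formatting the whole range in one helper, rather than printing each field with a hand-picked specifier, mirrors the design choice in the commit: the caller no longer needs to know how wide the underlying type is.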
| * | EDAC/ti: Add missing MODULE_DEVICE_TABLE [Bixuan Cui; 2021-05-14; 1 file; -0/+1]
    The module misses MODULE_DEVICE_TABLE() for of_device_id tables and thus never autoloads on ID matches. Add the missing declaration.

    Reported-by: Hulk Robot <hulkci@huawei.com>
    Signed-off-by: Bixuan Cui <cuibixuan@huawei.com>
    Signed-off-by: Borislav Petkov <bp@suse.de>
    Cc: Tero Kristo <kristo@kernel.org>
    Link: https://lkml.kernel.org/r/20210512033727.26701-1-cuibixuan@huawei.com
| * | EDAC/thunderx: Remove irrelevant variable from error messages [Christophe JAILLET; 2021-05-10; 1 file; -2/+2]
    'ret' is irrelevant (it is 0) for both dev_err() calls, so just remove it from the error message.

    [ bp: Massage commit message. ]

    Fixes: 41003396f932 ("EDAC, thunderx: Add Cavium ThunderX EDAC driver")
    Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    Signed-off-by: Borislav Petkov <bp@suse.de>
    Link: https://lkml.kernel.org/r/0c046ef5cfb367a3f707ef4270e21a2bcbf44952.1620280098.git.christophe.jaillet@wanadoo.fr
* | | EDAC/mce_amd: Fix typo "FIfo" -> "Fifo" [Colin Ian King; 2021-06-04; 1 file; -1/+1]
    There is an uppercase letter I in one of the MCE error descriptions instead of a lowercase one. Fix it.

    Signed-off-by: Colin Ian King <colin.king@canonical.com>
    Signed-off-by: Borislav Petkov <bp@suse.de>
    Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
    Link: https://lkml.kernel.org/r/20210603103349.79117-1-colin.king@canonical.com
* | | x86/MCE/AMD, EDAC/mce_amd: Add new SMCA bank types [Muralidhara M K; 2021-05-27; 1 file; -0/+70]
| |/
|/|
    Add the (HWID, MCATYPE) tuples and names for new SMCA bank types.

    Also, add their respective error descriptions to the MCE decoding module edac_mce_amd. Also while at it, optimize the string names for some SMCA banks.

    [ bp: Drop repeated comments, explain why UMC_V2 is a separate entry. ]

    Signed-off-by: Muralidhara M K <muralimk@amd.com>
    Signed-off-by: Naveen Krishna Chatradhi <nchatrad@amd.com>
    Signed-off-by: Borislav Petkov <bp@suse.de>
    Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
    Link: https://lkml.kernel.org/r/20210526164601.66228-1-nchatrad@amd.com
* | x86/msr: Rename MSR_K8_SYSCFG to MSR_AMD64_SYSCFG [Brijesh Singh; 2021-05-10; 1 file; -1/+1]
|/
    The SYSCFG MSR continued being updated beyond the K8 family; drop the K8 name from it.

    Suggested-by: Borislav Petkov <bp@alien8.de>
    Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
    Signed-off-by: Borislav Petkov <bp@suse.de>
    Acked-by: Joerg Roedel <jroedel@suse.de>
    Link: https://lkml.kernel.org/r/20210427111636.1207-4-brijesh.singh@amd.com
* EDAC: altera: merge ARCH_SOCFPGA and ARCH_STRATIX10 [Krzysztof Kozlowski; 2021-03-23; 2 files; -7/+12]
    Simplify 32-bit and 64-bit Intel SoCFPGA Kconfig options by having only one for both of them. This is the common practice for other platforms. Additionally, the ARCH_SOCFPGA is too generic as SoCFPGA designs come from multiple vendors.

    Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
    Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
* Merge branch 'edac-misc' into edac-updates-for-v5.12 [Borislav Petkov; 2021-02-15; 2 files; -2/+2]
|\
| * EDAC/xgene: Do not print a failure message to get an IRQ twice [Menglong Dong; 2021-01-19; 1 file; -1/+1]
    Coccinelle reports a redundant error print in xgene_edac_probe() because platform_get_irq() will already print an error message when it is unable to get an IRQ.

    Use platform_get_irq_optional() instead which avoids the error message and keep the driver-specific one.

    [ bp: Sanitize commit message. ]

    Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn>
    Signed-off-by: Borislav Petkov <bp@suse.de>
    Reviewed-by: Robert Richter <rric@kernel.org>
    Link: https://lkml.kernel.org/r/20210112103540.7818-1-dong.menglong@zte.com.cn
| * EDAC/ppc4xx: Convert comma to semicolon [Zheng Yongjun; 2020-12-30; 1 file; -1/+1]
    Replace a comma between expression statements with a semicolon.

    Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
    Signed-off-by: Borislav Petkov <bp@suse.de>
    Link: https://lkml.kernel.org/r/20201216131846.14937-1-zhengyongjun3@huawei.com
* | EDAC/amd64: Issue probing messages only on properly detected hardware [Borislav Petkov; 2021-01-22; 1 file; -7/+7]
    amd64_edac was converted to CPU family autoprobing (from PCI device IDs) to not have to add a new PCI device ID each time a new platform is shipped but to support the whole family out-of-the-box.

    However, this caused a lot of noise in dmesg even when the machine doesn't have ECC DIMMs or ECC has been disabled in the BIOS:

      EDAC MC: Ver: 3.0.0
      EDAC amd64: F17h detected (node 0).
      EDAC amd64: Node 0: DRAM ECC disabled.
      EDAC amd64: F17h detected (node 1).
      EDAC amd64: Node 1: DRAM ECC disabled.
      EDAC amd64: F17h detected (node 2).
      EDAC amd64: Node 2: DRAM ECC disabled.
      EDAC amd64: F17h detected (node 3).
      EDAC amd64: Node 3: DRAM ECC disabled.
      EDAC amd64: F17h detected (node 4).
      EDAC amd64: Node 4: DRAM ECC disabled.
      EDAC amd64: F17h detected (node 5).
      EDAC amd64: Node 5: DRAM ECC disabled.
      EDAC amd64: F17h detected (node 6).
      EDAC amd64: Node 6: DRAM ECC disabled.
      EDAC amd64: F17h detected (node 7).
      EDAC amd64: Node 7: DRAM ECC disabled.

    or even

      $ grep EDAC dmesg.log | sed 's/\[.*\] //' | sort | uniq -c
          128 EDAC amd64: F17h detected (node 0).
          128 EDAC amd64: Node 0: DRAM ECC disabled.
            1 EDAC MC: Ver: 3.0.0

    on a big machine. Yap, that's once per CPU for 128 of them.

    So move the init messages after all probing has succeeded to avoid unnecessary spew in dmesg.

    Signed-off-by: Borislav Petkov <bp@suse.de>
    Link: https://lkml.kernel.org/r/20210119164141.17417-1-bp@alien8.de
* | EDAC/amd64: Limit error injection functionality to supported hw [Borislav Petkov; 2020-12-28; 2 files; -7/+9]
    Families up to and including 0x16 allow access to the injection hardware. Starting with family 0x17, access to those registers is blocked by security policy. Limit that only on the families which support it.

    Suggested-by: Yazen Ghannam <yazen.ghannam@amd.com>
    Signed-off-by: Borislav Petkov <bp@suse.de>
    Link: https://lkml.kernel.org/r/20201222180013.GD13463@zn.tnic
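One way to picture the family cutoff is a visibility check that hides the injection attributes on unsupported hardware. The sketch below is a userland model with hypothetical names (`struct inject_attr`, `inject_attr_mode`); it is not claimed to match how amd64_edac implements the restriction. Only the "family 0x16 and below is allowed, 0x17 and above is blocked" rule comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define LAST_INJECT_FAMILY 0x16u /* last family with accessible injection hw */

/* Hypothetical stand-in for a sysfs attribute with a permission mode. */
struct inject_attr {
    const char *name;
    unsigned int mode; /* e.g. 0644 when exposed */
};

/* Return the attribute's mode if the family supports injection, else 0. */
unsigned int inject_attr_mode(const struct inject_attr *attr,
                              unsigned int family)
{
    assert(attr != NULL);
    return family <= LAST_INJECT_FAMILY ? attr->mode : 0;
}
```

A mode of 0 here stands for "do not expose the file at all", so users on family 0x17 and later never see controls that the hardware would refuse.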
* | EDAC/amd64: Merge error injection sysfs facilities [Borislav Petkov; 2020-12-28; 5 files; -252/+235]
    Merge them into the main driver and put them inside an EDAC_DEBUG ifdeffery to simplify the driver and have all debugging/injection stuff behind a debug build-time switch.

    No functional changes.

    Signed-off-by: Borislav Petkov <bp@suse.de>
    Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
    Link: https://lkml.kernel.org/r/20201215110517.5215-2-bp@alien8.de
* | EDAC/amd64: Merge sysfs debugging attributes setup code [Borislav Petkov; 2020-12-28; 4 files; -69/+59]
    There's no need for them to be in a separate file so merge them into the main driver compilation unit like the other EDAC drivers do.

    Drop now-unneeded function export, make the function static and shorten static function names.

    No functional changes.

    Signed-off-by: Borislav Petkov <bp@suse.de>
    Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
    Link: https://lkml.kernel.org/r/20201215110517.5215-1-bp@alien8.de
* | EDAC/amd64: Tone down messages about missing PCI IDs [Yazen Ghannam; 2020-12-28; 1 file; -4/+4]
    Give these messages a debug severity as they are really only useful to the module developers.

    Also, drop the "(broken BIOS?)" phrase, since this can cause churn for BIOS folks. The PCI IDs needed by the module, at least on modern systems, are fixed in hardware.

    Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
    Signed-off-by: Borislav Petkov <bp@suse.de>
    Link: https://lkml.kernel.org/r/20201215170131.8496-1-Yazen.Ghannam@amd.com
* | EDAC/amd64: Do not load on family 0x15, model 0x13 [Borislav Petkov; 2020-12-28; 1 file; -3/+7]
|/
    Those were only laptops and are very very unlikely to have ECC memory. Currently, when the driver attempts to load, it issues:

      EDAC amd64: Error: F1 not found: device 0x1601 (broken BIOS?)

    because the PCI device is the wrong one (it uses the F15h default one).

    So do not load the driver on them as that is pointless.

    Reported-by: Don Curtis <bugrprt21882@online.de>
    Signed-off-by: Borislav Petkov <bp@suse.de>
    Tested-by: Don Curtis <bugrprt21882@online.de>
    Link: http://bugzilla.opensuse.org/show_bug.cgi?id=1179763
    Link: https://lkml.kernel.org/r/20201218160622.20146-1-bp@alien8.de
* Merge branch 'akpm' (patches from Andrew) [Linus Torvalds; 2020-12-15; 1 file; -2/+2]
|\
    Merge misc updates from Andrew Morton:
     - a few random little subsystems
     - almost all of the MM patches which are staged ahead of linux-next material. I'll trickle to post-linux-next work in as the dependents get merged up.

    Subsystems affected by this patch series: kthread, kbuild, ide, ntfs, ocfs2, arch, and mm (slab-generic, slab, slub, dax, debug, pagecache, gup, swap, shmem, memcg, pagemap, mremap, hmm, vmalloc, documentation, kasan, pagealloc, memory-failure, hugetlb, vmscan, z3fold, compaction, oom-kill, migration, cma, page-poison, userfaultfd, zswap, zsmalloc, uaccess, zram, and cleanups).

    * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (200 commits)
      mm: cleanup kstrto*() usage
      mm: fix fall-through warnings for Clang
      mm: slub: convert sysfs sprintf family to sysfs_emit/sysfs_emit_at
      mm: shmem: convert shmem_enabled_show to use sysfs_emit_at
      mm:backing-dev: use sysfs_emit in macro defining functions
      mm: huge_memory: convert remaining use of sprintf to sysfs_emit and neatening
      mm: use sysfs_emit for struct kobject * uses
      mm: fix kernel-doc markups
      zram: break the strict dependency from lzo
      zram: add stat to gather incompressible pages since zram set up
      zram: support page writeback
      mm/process_vm_access: remove redundant initialization of iov_r
      mm/zsmalloc.c: rework the list_add code in insert_zspage()
      mm/zswap: move to use crypto_acomp API for hardware acceleration
      mm/zswap: fix passing zero to 'PTR_ERR' warning
      mm/zswap: make struct kernel_param_ops definitions const
      userfaultfd/selftests: hint the test runner on required privilege
      userfaultfd/selftests: fix retval check for userfaultfd_open()
      userfaultfd/selftests: always dump something in modes
      userfaultfd: selftests: make __{s,u}64 format specifiers portable
      ...
| * edac: ghes: use krealloc_array()  Bartosz Golaszewski  2020-12-15  1  -2/+2
| | Use the helper that checks for overflows internally instead of manually
| | calculating the size of the new array.
| |
| | Link: https://lkml.kernel.org/r/20201109110654.12547-7-brgl@bgdev.pl
| | Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
| | Acked-by: Borislav Petkov <bp@suse.de>
| | Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
| | Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
| | Cc: Borislav Petkov <bp@alien8.de>
| | Cc: Christian König <christian.koenig@amd.com>
| | Cc: Christoph Lameter <cl@linux.com>
| | Cc: Daniel Vetter <daniel@ffwll.ch>
| | Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
| | Cc: David Airlie <airlied@linux.ie>
| | Cc: David Rientjes <rientjes@google.com>
| | Cc: Gustavo Padovan <gustavo@padovan.org>
| | Cc: James Morse <james.morse@arm.com>
| | Cc: Jaroslav Kysela <perex@perex.cz>
| | Cc: Jason Wang <jasowang@redhat.com>
| | Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
| | Cc: Linus Walleij <linus.walleij@linaro.org>
| | Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
| | Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
| | Cc: Maxime Ripard <mripard@kernel.org>
| | Cc: "Michael S . Tsirkin" <mst@redhat.com>
| | Cc: Pekka Enberg <penberg@kernel.org>
| | Cc: Robert Richter <rric@kernel.org>
| | Cc: Sumit Semwal <sumit.semwal@linaro.org>
| | Cc: Takashi Iwai <tiwai@suse.com>
| | Cc: Takashi Iwai <tiwai@suse.de>
| | Cc: Thomas Zimmermann <tzimmermann@suse.de>
| | Cc: Tony Luck <tony.luck@intel.com>
| | Cc: Vlastimil Babka <vbabka@suse.cz>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
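The overflow check that the commit message refers to can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel implementation: `realloc_array_checked()` and `grow_demo()` are hypothetical names, and plain `realloc()` stands in for the kernel allocator. It only shows why a helper that takes the element count and element size separately can refuse a request whose byte size would wrap around.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical userspace stand-in for an overflow-checking array
 * reallocation helper. Returns NULL, leaving the old block untouched,
 * when n * size cannot be represented in a size_t.
 */
void *realloc_array_checked(void *p, size_t n, size_t size)
{
    if (size != 0 && n > SIZE_MAX / size)
        return NULL;            /* n * size would overflow */
    return realloc(p, n * size);
}

/* Grow an int array from 4 to 8 elements and confirm the data survives. */
int grow_demo(void)
{
    int *a = realloc_array_checked(NULL, 4, sizeof(*a));
    int *b;

    if (!a)
        return -1;
    a[3] = 7;

    b = realloc_array_checked(a, 8, sizeof(*b));
    if (!b) {
        free(a);
        return -1;
    }
    assert(b[3] == 7);          /* realloc preserves existing contents */
    free(b);
    return 0;
}
```

With a manual `n * size` computation the multiplication can wrap silently and yield a buffer that is too small; keeping the two factors separate lets the helper reject the request instead.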
* | Merge tag 'x86_cpu_for_v5.11' of ↵  Linus Torvalds  2020-12-14  2  -4/+4
|\ \
| | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
| | |
| | | Pull x86 cpuid updates from Borislav Petkov:
| | |  "Only AMD-specific changes this time:
| | |
| | |    - Save the AMD physical die ID into cpuinfo_x86.cpu_die_id and
| | |      convert all code to use it (Yazen Ghannam)
| | |
| | |    - Remove a dead and unused TSEG region remapping workaround on AMD
| | |      (Arvind Sankar)"
| | |
| | | * tag 'x86_cpu_for_v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
| | |   x86/cpu/amd: Remove dead code for TSEG region remapping
| | |   x86/topology: Set cpu_die_id only if DIE_TYPE found
| | |   EDAC/mce_amd: Use struct cpuinfo_x86.cpu_die_id for AMD NodeId
| | |   x86/CPU/AMD: Remove amd_get_nb_id()
| | |   x86/CPU/AMD: Save AMD NodeId as cpu_die_id
| * | EDAC/mce_amd: Use struct cpuinfo_x86.cpu_die_id for AMD NodeId  Yazen Ghannam  2020-11-19  1  -1/+1
| | | The edac_mce_amd module calls decode_dram_ecc() on AMD Family17h and
| | | later systems. This function is used in amd64_edac_mod to do
| | | system-specific decoding for DRAM ECC errors. The function takes a
| | | "NodeId" as a parameter.
| | |
| | | In AMD documentation, NodeId is used to identify a physical die in a
| | | system. This can be used to identify a node in the AMD_NB code and also
| | | it is used with umc_normaddr_to_sysaddr().
| | |
| | | However, the input used for decode_dram_ecc() is currently the NUMA node
| | | of a logical CPU. In the default configuration, the NUMA node and
| | | physical die will be equivalent, so this doesn't have an impact.
| | |
| | | But the NUMA node configuration can be adjusted with optional memory
| | | interleaving modes. This will cause the NUMA node enumeration to not
| | | match the physical die enumeration. The mismatch will cause the address
| | | translation function to fail or report incorrect results.
| | |
| | | Use struct cpuinfo_x86.cpu_die_id for the node_id parameter to ensure
| | | the physical ID is used.
| | |
| | | Fixes: fbe63acf62f5 ("EDAC, mce_amd: Use cpu_to_node() to find the node ID")
| | | Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
| | | Signed-off-by: Borislav Petkov <bp@suse.de>
| | | Link: https://lkml.kernel.org/r/20201109210659.754018-4-Yazen.Ghannam@amd.com
| * | x86/CPU/AMD: Remove amd_get_nb_id()  Yazen Ghannam  2020-11-19  2  -3/+3
| |/
| | The Last Level Cache ID is returned by amd_get_nb_id(). In practice,
| | this value is the same as the AMD NodeId for callers of this function.
| | The NodeId is saved in struct cpuinfo_x86.cpu_die_id.
| |
| | Replace calls to amd_get_nb_id() with the logical CPU's cpu_die_id and
| | remove the function.
| |
| | Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
| | Signed-off-by: Borislav Petkov <bp@suse.de>
| | Link: https://lkml.kernel.org/r/20201109210659.754018-3-Yazen.Ghannam@amd.com
| |
| \
| \
| \
*---. \ Merge branches 'edac-spr', 'edac-igen6' and 'edac-misc' into ↵  Borislav Petkov  2020-12-14  25  -63/+1076
|\ \ \ \
| | | | | edac-updates-for-v5.11
| | | | |
| | | | | Signed-off-by: Borislav Petkov <bp@suse.de>
| | | * | EDAC/amd64: Fix PCI component registration  Borislav Petkov  2020-11-27  1  -12/+14
| | | | | In order to setup its PCI component, the driver needs any node private
| | | | | instance in order to get a reference to the PCI device and hand that
| | | | | into edac_pci_create_generic_ctl(). For convenience, it uses the 0th
| | | | | memory controller descriptor under the assumption that if any, the 0th
| | | | | will be always present.
| | | | |
| | | | | However, this assumption goes wrong when the 0th node doesn't have
| | | | | memory and the driver doesn't initialize an instance for it:
| | | | |
| | | | |   EDAC amd64: F17h detected (node 0).
| | | | |   ...
| | | | |   EDAC amd64: Node 0: No DIMMs detected.
| | | | |
| | | | | But looking up node instances is not really needed - all one needs is
| | | | | the pointer to the proper device which gets discovered during instance
| | | | | init.
| | | | |
| | | | | So stash that pointer into a variable and use it when setting up the
| | | | | EDAC PCI component.
| | | | |
| | | | | Clear that variable when the driver needs to unwind due to some
| | | | | instances failing init to avoid any registration imbalance.
| | | | |
| | | | | Cc: <stable@vger.kernel.org>
| | | | | Signed-off-by: Borislav Petkov <bp@suse.de>
| | | | | Link: https://lkml.kernel.org/r/20201122150815.13808-1-bp@alien8.de
| | | * | EDAC/synopsys: Return the correct value in mc_probe()  Zhang Xiaoxu  2020-11-18  1  -1/+2
| | | | | Return the error value if the inject sysfs file creation fails, rather
| | | | | than returning 0, to signal to the upper layer that the ->probe function
| | | | | failed.
| | | | |
| | | | |  [ bp: Massage. ]
| | | | |
| | | | | Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
| | | | | Signed-off-by: Borislav Petkov <bp@suse.de>
| | | | | Reviewed-by: Michal Simek <michal.simek@xilinx.com>
| | | | | Link: https://lkml.kernel.org/r/20201116135810.3130845-1-zhangxiaoxu5@huawei.com
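The class of bug described in this commit message can be reduced to a few lines. The sketch below is a generic illustration and not the driver's code: `create_inject_files()`, `probe_buggy()` and `probe_fixed()` are hypothetical names, and the failure is simulated with a flag. It shows how an error path can run while the function still reports success, and how keeping the helper's return value avoids that.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a setup step that can fail. */
int create_inject_files(int should_fail)
{
    return should_fail ? -ENOMEM : 0;
}

/* Buggy shape: the failure is detected, but its error code is dropped. */
int probe_buggy(int should_fail)
{
    int rc = 0;

    if (create_inject_files(should_fail))
        goto cleanup;           /* rc is still 0 at this point */
    return 0;

cleanup:
    /* ... resources would be released here ... */
    return rc;                  /* caller sees success despite the failure */
}

/* Fixed shape: keep the helper's return value and propagate it. */
int probe_fixed(int should_fail)
{
    int rc = create_inject_files(should_fail);

    if (rc)
        goto cleanup;
    return 0;

cleanup:
    /* ... resources would be released here ... */
    return rc;
}

/* Compare both shapes on the failure path. */
void probe_demo(void)
{
    assert(probe_buggy(1) == 0);        /* failure is masked */
    assert(probe_fixed(1) == -ENOMEM);  /* failure is reported */
}
```

A caller that receives 0 from a probe routine treats the device as bound, so a masked failure can leave later teardown operating on resources that were already released.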
| | | * | EDAC: Fix some kernel-doc markups  Mauro Carvalho Chehab  2020-11-02  1  -6/+5
| | | | | Kernel-doc markup should use this format:
| | | | |
| | | | |   identifier - description
| | | | |
| | | | | Correct that and also fix some enums' names in the kernel-doc markup.
| | | | |
| | | | | Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
| | | | | Signed-off-by: Borislav Petkov <bp@suse.de>
| | | | | Link: https://lkml.kernel.org/r/1d291393ba58c7b80908a3fedf02d2f53921ffe9.1603469755.git.mchehab+huawei@kernel.org
| | | * | EDAC: Do not issue useless debug statements in the polling routine  Borislav Petkov  2020-10-26  14  -14/+2
| | | | | They have been spreading around the subsystem by example so remove them
| | | | | all.
| | | | |
| | | | | Reported-by: Raymond Bennett <raymond.bennett@gmail.com>
| | | | | Suggested-by: Jason Baron <jbaron@akamai.com>
| | | | | Signed-off-by: Borislav Petkov <bp@suse.de>
| | | * | EDAC/amd64: Remove unneeded breaks  Tom Rix  2020-10-26  1  -8/+0
| | | |/
| | | | A break is not needed if it is preceded by a return.
| | | |
| | | | Signed-off-by: Tom Rix <trix@redhat.com>
| | | | Signed-off-by: Borislav Petkov <bp@suse.de>
| | | | Reviewed-by: Robert Richter <rric@kernel.org>
| | | | Link: https://lkml.kernel.org/r/20201019193524.13391-1-trix@redhat.com
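As a small illustration of the rule stated in this commit message, the sketch below uses a hypothetical `severity_name()` function that is not taken from the driver. Each case leaves the function through `return`, so a `break` placed after it could never execute and is omitted.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical example: every case returns, so no break is needed. */
const char *severity_name(int sev)
{
    switch (sev) {
    case 0:
        return "corrected";     /* return already leaves the switch */
    case 1:
        return "uncorrected";
    default:
        return "unknown";
    }
}

/* Check that each case still maps to the expected string. */
void severity_name_demo(void)
{
    assert(strcmp(severity_name(0), "corrected") == 0);
    assert(strcmp(severity_name(1), "uncorrected") == 0);
    assert(strcmp(severity_name(42), "unknown") == 0);
}
```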
| | * | EDAC/igen6: ecclog_llist can be static  kernel test robot  2020-11-23  1  -1/+1
| | | | Fixes: 10590a9d4f23 ("EDAC/igen6: Add EDAC driver for Intel client SoCs using IBECC")
| | | | Reported-by: kernel test robot <lkp@intel.com>
| | | | Signed-off-by: kernel test robot <lkp@intel.com>
| | | | Link: https://lore.kernel.org/r/20201123031850.GA20416@aef56166e5fc
| | | | Signed-off-by: Tony Luck <tony.luck@intel.com>