summaryrefslogtreecommitdiffstats
path: root/drivers/edac
AgeCommit message (Collapse)AuthorFilesLines
2010-06-04Merge branch 'linux_next' of ↵Linus Torvalds6-27/+2325
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/i7core * 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/i7core: (83 commits) i7core_edac: Better describe the supported devices Add support for Westmere to i7core_edac driver i7core_edac: don't free on success i7core_edac: Add support for X5670 Always call i7core_[ur]dimm_check_mc_ecc_err i7core_edac: fix memory leak of i7core_dev EDAC: add __init to i7core_xeon_pci_fixup i7core_edac: Fix wrong device id for channel 1 devices i7core: add support for Lynnfield alternate address i7core_edac: Add initial support for Lynnfield i7core_edac: do not export static functions edac: fix i7core build edac: i7core_edac produces undefined behaviour on 32bit i7core_edac: Use a more generic approach for probing PCI devices i7core_edac: PCI device is called NONCORE, instead of NOCORE i7core_edac: Fix ringbuffer maxsize i7core_edac: First store, then increment i7core_edac: Better parse "any" addrmask i7core_edac: Use a lockless ringbuffer edac: Create an unique instance for each kobj ...
2010-06-02of/edac: fix build breakage in driversAnatolij Gustschin2-9/+9
Fixes build errors in EDAC drivers caused by the OF device_node pointer being moved into struct device Signed-off-by: Anatolij Gustschin <agust@denx.de> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-05-27drivers/edac: convert logging messages direct uses of __FILE__ to %s, __FILEJoe Perches3-31/+31
Reduces text by eliminating multiple __FILE__ uses. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Joe Perches <joe@perches.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Cc: Tim Small <tim@buttersideup.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-22Merge remote branch 'origin' into secretlab/next-devicetreeGrant Likely4-6/+5
Merging in current state of Linus' tree to deal with merge conflicts and build failures in vio.c after merge. Conflicts: drivers/i2c/busses/i2c-cpm.c drivers/i2c/busses/i2c-mpc.c drivers/net/gianfar.c Also fixed up one line in arch/powerpc/kernel/vio.c to use the correct node pointer. Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-05-22of: Remove duplicate fields from of_platform_driverGrant Likely2-23/+17
.name, .match_table and .owner are duplicated in both of_platform_driver and device_driver. This patch is a removes the extra copies from struct of_platform_driver and converts all users to the device_driver members. This patch is a pretty mechanical change. The usage model doesn't change and if any drivers have been missed, or if anything has been fixed up incorrectly, then it will fail with a compile time error, and the fixup will be trivial. This patch looks big and scary because it touches so many files, but it should be pretty safe. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Sean MacLennan <smaclennan@pikatech.com>
2010-05-18i7core_edac: Better describe the supported devicesMauro Carvalho Chehab1-2/+7
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-18Add support for Westmere to i7core_edac driverVernon Mauery1-37/+80
This adds new PCI IDs for the Westmere's memory controller devices and modifies the i7core_edac driver to be able to probe both Nehalem and Westmere processors. Signed-off-by: Vernon Mauery <vernux@us.ibm.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-18PCI: fix typos pci_device_dis/enable to pci_dis/enable_device in commentsRoman Fietze3-3/+3
This fixes all occurrences of pci_enable_device and pci_disable_device in all comments. There are no code changes involved. Signed-off-by: Roman Fietze <roman.fietze@telemotive.de> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-05-18i7core_edac: don't free on successTony Luck1-1/+2
Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-18i7core_edac: Add support for X5670Mauro Carvalho Chehab1-1/+6
As reported by Vernon Mauery <vernux@us.ibm.com>, X5670 (Westmere-EP) uses a different register for one of the uncore PCI devices. Add support for it. Those are the PCI ID's on this new chipset: fe:00.0 0600: 8086:2c70 (rev 02) fe:00.1 0600: 8086:2d81 (rev 02) fe:02.0 0600: 8086:2d90 (rev 02) fe:02.1 0600: 8086:2d91 (rev 02) fe:02.2 0600: 8086:2d92 (rev 02) fe:02.3 0600: 8086:2d93 (rev 02) fe:02.4 0600: 8086:2d94 (rev 02) fe:02.5 0600: 8086:2d95 (rev 02) fe:03.0 0600: 8086:2d98 (rev 02) fe:03.1 0600: 8086:2d99 (rev 02) fe:03.2 0600: 8086:2d9a (rev 02) fe:03.4 0600: 8086:2d9c (rev 02) fe:04.0 0600: 8086:2da0 (rev 02) fe:04.1 0600: 8086:2da1 (rev 02) fe:04.2 0600: 8086:2da2 (rev 02) fe:04.3 0600: 8086:2da3 (rev 02) fe:05.0 0600: 8086:2da8 (rev 02) fe:05.1 0600: 8086:2da9 (rev 02) fe:05.2 0600: 8086:2daa (rev 02) fe:05.3 0600: 8086:2dab (rev 02) fe:06.0 0600: 8086:2db0 (rev 02) fe:06.1 0600: 8086:2db1 (rev 02) fe:06.2 0600: 8086:2db2 (rev 02) fe:06.3 0600: 8086:2db3 (rev 02) (as usual, the same PCI devices repeat at ff: bus) The PCI device 8086:2c70 is shown as: fe:00.0 Host bridge: Intel Corporation QuickPath Architecture Generic Non-core Registers (rev 02) So, for this device to be recognized, it is only a matter of adding this new PCI ID to the driver. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-18Always call i7core_[ur]dimm_check_mc_ecc_errVernon Mauery1-1/+2
This fixes an error in function i7core_check_error In commit ca9c90ba09ca3c9799319f46a56f397afbf617c2 which converts the driver to use double buffering, there is a change in the logic. Before, if mce_count was zero, it skipped over a couple of statements and finished out with a call to the *check_mc_ecc_err function. The current code checks to see if mce_count is 0 and then exits. This change reverts the behavior back to the original where if there are no errors to report, we skip to the end and call the *check_mc_ecc_err function. This fix allows the driver to work again on my Nehalem based blades again. Signed-off-by: Vernon Mauery <vernux@us.ibm.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-18i7core_edac: fix memory leak of i7core_devAlexander Beregalov1-1/+3
Free already allocated i7core_dev. Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-18EDAC: add __init to i7core_xeon_pci_fixupJiri Slaby1-1/+1
It's called only from an __init function and is the only user of pcibios_scan_specific_bus which will be marked as __devinit in the next patch. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: Fix wrong device id for channel 1 devicesMauro Carvalho Chehab1-4/+4
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core: add support for Lynnfield alternate addressMauro Carvalho Chehab1-2/+11
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: Add initial support for LynnfieldMauro Carvalho Chehab1-2/+37
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10edac: fix i7core buildRandy Dunlap1-0/+3
Fix build warning (missing header file) and build error when CONFIG_SMP=n. drivers/edac/i7core_edac.c:860: error: implicit declaration of function 'msleep' drivers/edac/i7core_edac.c:1700: error: 'struct cpuinfo_x86' has no member named 'phys_proc_id' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10edac: i7core_edac produces undefined behaviour on 32bitAlan Cox1-12/+12
Fix the shifts up Signed-off-by: Alan Cox <alan@linux.intel.com> Acked-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: Use a more generic approach for probing PCI devicesMauro Carvalho Chehab1-39/+40
Currently, only one PCI set of tables is allowed. This prevents using the driver for other devices like Lynnfield, with have a different set of PCI ID's. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: PCI device is called NONCORE, instead of NOCOREMauro Carvalho Chehab1-3/+3
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: Fix ringbuffer maxsizeMauro Carvalho Chehab1-6/+6
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: First store, then incrementMauro Carvalho Chehab1-7/+6
Fix ringbuffer store logic. While here, add a few comments to the code and remove the undesired printk that could otherwise be called during NMI time. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: Better parse "any" addrmaskMauro Carvalho Chehab1-1/+1
Instead of accepting just "any", accept also "any\n" Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: Use a lockless ringbufferMauro Carvalho Chehab1-28/+55
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10edac: Create an unique instance for each kobjMauro Carvalho Chehab2-32/+64
Current code only works when there's just one memory controller, since we need one kobj for each instance. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: Convert UDIMM error counters into a proper sysfs groupMauro Carvalho Chehab1-37/+44
Instead of displaying 3 values at the same var, break it into 3 different sysfs nodes: /sys/devices/system/edac/mc/mc0/all_channel_counts/udimm0 /sys/devices/system/edac/mc/mc0/all_channel_counts/udimm1 /sys/devices/system/edac/mc/mc0/all_channel_counts/udimm2 For registered dimms, however, the error counters are already being displayed at: /sys/devices/system/edac/mc/mc0/csrow*/ce_count So, there's no need to add any extra sysfs nodes. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10edac: Don't create csrow entries on instance groupsMauro Carvalho Chehab1-2/+2
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10edac: store/show methods for device groups weren't workingMauro Carvalho Chehab3-9/+88
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: Add support for sysfs addrmatch groupMauro Carvalho Chehab1-103/+70
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10edac_core: Allow the creation of sysfs groupsMauro Carvalho Chehab2-28/+67
Currently, all sysfs nodes are stored at /sys/.*/mc. (regex) However, sometimes it is needed to create attribute groups. This patch extends edac_core to allow groups creation. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: Avoid printing a warning when debug is disabledMauro Carvalho Chehab1-2/+1
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: We need to use list_for_each_entry_safe to avoid errorsMauro Carvalho Chehab1-2/+3
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: change remove module strategyMauro Carvalho Chehab1-20/+35
The old remove module stragegy didn't work on devices with multiple cores, since only one PCI device is used to open all mc's, due to Nehalem nature. Also, it were based at pdev value. However, this doesn't point to the pci device used at mci->dev. So, instead, it unregisters all devices at once, deleting them from the device list. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: remove static counter for max socketsMauro Carvalho Chehab1-1/+0
The number of sockets is now fully dynamic. Get rid of this obsolete var. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: at remove, don't remove all pci devices at onceMauro Carvalho Chehab1-17/+19
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: Fix a bug when printing error counts with RDIMMsMauro Carvalho Chehab1-5/+8
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: a few fixes for multiple mc'sMauro Carvalho Chehab1-9/+12
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: sanity check: print a warning if a mcelog is ignoredMauro Carvalho Chehab1-1/+6
In thesis, the other mc controller should handle it. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: create one mc per socket/QPIMauro Carvalho Chehab1-279/+228
Instead of creating just one memory controller, create one per socket (e. g. per Quick Link Path Interconnect). This better reflects the Nehalem architecture. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10Dynamically allocate memory for PCI devicesMauro Carvalho Chehab1-61/+114
Instead of using a static table assuming always 2 CPU sockets, allocate space dynamically for Nehalem PCI devs. This patch is part of a series of patches that changes i7core_edac to allow more than 2 sockets and to properly report one memory controller per socket.
2010-05-10i7core: temporary workaround to allow it to compile against 2.6.30Mauro Carvalho Chehab1-2/+4
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: Improve corrected_error_counts output for RDIMMMauro Carvalho Chehab1-3/+3
Just cosmetics. instead of showing something like: socket 0, channel 2dimm0: 1 dimm1: 0 dimm2: 0 socket 1, channel 2dimm0: 0 dimm1: 0 dimm2: 0 Show: socket 0, channel 2 RDIMM0: 1 RDIMM1: 0 RDIMM2: 0 socket 0, channel 2 RDIMM0: 0 RDIMM1: 0 RDIMM2: 0 This is more synthetic and easier to parse. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: Probe on Xeons earilerKeith Mannthey1-13/+19
On the Xeon 55XX series cpus the pci deives are not exposed via acpi so we much explicitly probe them to make the usable as a Linux PCI device. This moves the detection of this state to before pci_register_driver is called. Its present position was not working on my systems, the driver would complain about not finding a specific device. This patch allows the driver to load on my systems. Signed-off-by: Keith Mannthey <kmannth@us.ibm.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core: Use registered memories per processorMauro Carvalho Chehab1-16/+23
Instead of assuming that the entire machine has either registered or unregistered memories, do it at CPU socket based. While here, fix a bug at i7core_mce_output_error(), where the we're using m->cpu directly as if it would represent a socket. Instead, the proper socket_id is given by cpu_data[m->cpu].phys_proc_id. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> ---
2010-05-10i7core_edac: Use Device 3 function 2 to report errors with RDIMM'sMauro Carvalho Chehab1-30/+178
Nehalem and upper chipsets provide an special device that has corrected memory error counters detected with registered dimms. This device is only seen if there are registered memories plugged. After this patch, on a machine fully equiped with RDIMM's, it will use the Device 3 function 2 to count corrected errors instead on relying at mcelog. For unregistered DIMMs, it will keep the old behavior, counting errors via mcelog. This patch were developed together with Keith Mannthey <kmannth@us.ibm.com> Signed-off-by: Keith Mannthey <kmannth@us.ibm.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: Fix ecc enable shiftKeith Mannthey1-1/+1
From: Keith Mannthey <kmannth@us.ibm.com> Simple correction to a shift value. ECC_ENABLED is bit 4 of MC_STATUS, Dev 3 Fun 0 Offset 0x4c This correctly identifies the state of the ECC at the machine. Signed-off-by: Keith Mannthey <kmannth@us.ibm.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: Print an error message if pci register failsMauro Carvalho Chehab1-1/+7
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: CodingSyle fixes/cleanupsMauro Carvalho Chehab1-27/+23
No functional changes. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: fix error injectionMauro Carvalho Chehab1-15/+12
There were two stupid error injection bugs introduced by wrong cut-and-paste: one at socket store, and another at the error inject register. The last one were causing the code to not work at all. While here, adds debug messages to allow seeing what registers are being set while sending error injection. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: fix error codes for sysfs error injection interfaceMauro Carvalho Chehab1-4/+4
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>