summaryrefslogtreecommitdiffstats
path: root/drivers/edac
AgeCommit message (Collapse)AuthorFilesLines
2009-04-21edac: ppc mpc85xx fix mc err detectDave Jiang1-1/+1
Error found by Jeff Haran. The error detect register is 0s when no errors are detected. The check code is incorrect, so reverse check sense. Reported-by: Jeff Haran <jharan@Brocade.COM> Signed-off-by: Dave Jiang <djiang@mvista.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-13edac: use to_delayed_work()Jean Delvare3-3/+3
The edac-core driver includes code which assumes that the work_struct which is included in every delayed_work is the first member of that structure. This is currently the case but might change in the future, so use to_delayed_work() instead, which doesn't make such an assumption. linux-2.6.30-rc1 has the to_delayed_work() function that will allow this patch to work Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-13edac: fix local pci_write_bits32Jeff Haran1-2/+10
Fix the edac local pci_write_bits32 to properly note the 'escape' mask if all ones in a 32-bit word. Currently no consumer of this function uses that mask, so there is no danger to existing code. Signed-off-by: Jeff Haran <jharan@Brocade.COM> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02edac: AMD8111 driver Kconfig & MakefileHarry Ciao1-0/+7
Introduce Kconfig and Makefile options for AMD8111 EDAC driver. Signed-off-by: Harry Ciao <qingtao.cao@windriver.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02edac: AMD8131 driver Kconfig & MakefileHarry Ciao1-0/+7
Introduce Kconfig and Makefile options for AMD8131 EDAC driver. Signed-off-by: Harry Ciao <qingtao.cao@windriver.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02edac: AMD8131 driver source fileHarry Ciao1-0/+379
Introduce AMD8131 EDAC driver source file, which makes use of error detections on the PCI-X Bridge Controllers on the AMD8131 HyperTransport PCI-X Tunnel. Signed-off-by: Harry Ciao <qingtao.cao@windriver.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02edac: AMD8131 driver header fileHarry Ciao1-0/+119
Introduce AMD8131 EDAC driver header file, which adds register and bits definitions for the PCI-X Bridge Controller on the AMD8131 HyperTransport I/O Hub. Signed-off-by: Harry Ciao <qingtao.cao@windriver.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02edac: Add edac_pci_alloc_index()Harry Ciao2-0/+15
Add edac_pci_alloc_index(), because for MAPLE platform there may exist several EDAC driver modules that could make use of edac_pci_ctl_info structure at the same time. The index allocation for these structures should be taken care of by EDAC core. Signed-off-by: Harry Ciao <qingtao.cao@windriver.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02edac: AMD8111 driver source fileHarry Ciao1-0/+595
Introduce AMD8111 EDAC driver source file, which makes use of error detections on the LPC Bridge Controller and PCI Bridge Controller on the AMD8111 HyperTransport I/O Hub. Signed-off-by: Harry Ciao <qingtao.cao@windriver.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02edac: AMD8111 driver header fileHarry Ciao1-0/+130
Introduce AMD8111 EDAC driver header file, which adds register and bits definitions for the LPC Bridge Controller and PCI Bridge Controller on the AMD8111 HyperTransport I/O Hub. Signed-off-by: Harry Ciao <qingtao.cao@windriver.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02edac: new ppc4xx driver moduleGrant Erickson4-1/+1630
This adds support for an EDAC memory controller adaptation driver for the "ibm,sdram-4xx-ddr2" ECC controller realized in the AMCC PowerPC 405EX[r]. At present, this driver has been developed and tested against the controller realization in the AMCC PPC405EX[r] on the AMCC Kilauea and Haleakala boards (256 MiB w/o ECC memory soldered onto the board) and a proprietary board based on those designs (128 MiB ECC memory, also soldered onto the board). In the future, dynamic feature detection and handling needs to be added for the other realizations of this controller found in the 440SP, 440SPe, 460EX, 460GT and 460SX. Eventually, this driver will likely be evolved and adapted to the above variant realizations of this controller as well as broken apart to handle the other known ECC-capable controllers prevalent in other PPC4xx processors: - IBM SDRAM (405GP, 405CR and 405EP) "ibm,sdram-4xx" - IBM DDR1 (440GP, 440GX, 440EP and 440GR) "ibm,sdram-4xx-ddr" - Denali DDR1/DDR2 (440EPX and 440GRX) "denali,sdram-4xx-ddr2" [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Grant Erickson <gerickson@nuovations.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02edac: remove EDAC's experimental statusDoug Thompson1-3/+2
After 3 years, this is a patch to remove the EXPERIMENTAL tag on EDAC. We now have many module drivers submitters in EDAC and believe EDAC is no longer EXPERIMENTAL Signed-off-by: Doug Thompson <dougthompson@xmission.com Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02edac: add more verbose debug infoHitoshi Mitake2-1/+22
A patch for making a debugging information more verbose for use in development debugging. By enabling the new option "More verbose debugging", information about source file and line number will be added to debugging message. This is sample output, EDAC MC0: Giving out device to 'e7xxx_edac' 'E7205': DEV 0000:00:00.0 EDAC DEBUG: in drivers/edac/edac_pci.c, line at 48: edac_pci_alloc_ctl_info() EDAC DEBUG: in drivers/edac/edac_pci.c, line at 334: edac_pci_add_device() ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-24edac: struct device - replace bus_id with dev_name(), dev_set_name()Kay Sievers3-6/+6
Cc: dougthompson@xmission.com Cc: bluesmoke-devel@lists.sourceforge.net Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
2009-01-28powerpc: More printing warning fixes for the l64 to ll64 conversionStephen Rothwell1-4/+4
These are all powerpc specific drivers. res.start in fsl_elbc_nand.c needs to be cast since it may be either 32 or 64 bit. Thanks to Scott Wood for noticing. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Arnd Bergmann <arnd@arndb.de> call_edac bits in particular Acked-by: Olof Johansson <olof@lixom.net> pasemi_nand peices Acked-by: Scott Wood <scottwood@freescale.com> fsl_elbc fixes Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-01-06edac: driver for i5400 MCH (update)Mauro Carvalho Chehab1-57/+62
Signed-off-by: Ben Woodard <woodard@redhat.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-06edac: driver for i5400 MCH (Seaburg)Mauro Carvalho Chehab3-0/+1479
EDAC driver for i5400 MCH (Seaburg) This driver adds support for i5400 MCH chipset. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Ben Woodard <woodard@redhat.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-06edac: fix mpc85xx and add mpc8536 mpc8560Kumar Gala1-42/+32
All other compatibles that are uniquely identifying the processor use a prefix of the form fsl,mpc85...'. We add support for it so we can deprecate the older 'fsl,85...' that was improperly used here. Additionally added mpc8536 & mpc8560 to the compatible lists. This patch is based on Nate's 8572 patch. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Acked-by: Dave Jiang <djiang@mvista.com> Cc: Nate Case <ncase@xes-inc.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-06edac: struct device: replace bus_id with dev_name(), dev_set_name()Kay Sievers4-6/+6
This patch is part of a larger patch series which will remove the "char bus_id[20]" name string from struct device. The device name is managed in the kobject anyway, and without any size limitation, and just needlessly copied into "struct device". [akpm@linux-foundation.org: coding-style fixes] Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-06pci: use pci_ioremap_bar() in drivers/edacArjan van de Ven1-3/+1
Use the newly introduced pci_ioremap_bar() function in drivers/edac. pci_ioremap_bar() just takes a pci device and a bar number, with the goal of making it really hard to get wrong, while also having a central place to stick sanity checks. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-28Merge branch 'next' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (144 commits) powerpc/44x: Support 16K/64K base page sizes on 44x powerpc: Force memory size to be a multiple of PAGE_SIZE powerpc/32: Wire up the trampoline code for kdump powerpc/32: Add the ability for a classic ppc kernel to be loaded at 32M powerpc/32: Allow __ioremap on RAM addresses for kdump kernel powerpc/32: Setup OF properties for kdump powerpc/32/kdump: Implement crash_setup_regs() using ppc_save_regs() powerpc: Prepare xmon_save_regs for use with kdump powerpc: Remove default kexec/crash_kernel ops assignments powerpc: Make default kexec/crash_kernel ops implicit powerpc: Setup OF properties for ppc32 kexec powerpc/pseries: Fix cpu hotplug powerpc: Fix KVM build on ppc440 powerpc/cell: add QPACE as a separate Cell platform powerpc/cell: fix build breakage with CONFIG_SPUFS disabled powerpc/mpc5200: fix error paths in PSC UART probe function powerpc/mpc5200: add rts/cts handling in PSC UART driver powerpc/mpc5200: Make PSC UART driver update serial errors counters powerpc/mpc5200: Remove obsolete code from mpc5200 MDIO driver powerpc/mpc5200: Add MDMA/UDMA support to MPC5200 ATA driver ... Fix trivial conflict in drivers/char/Makefile as per Paul's directions
2008-12-23edac: fix edac core deadlock when removing a deviceHarry Ciao1-3/+9
When deleting an edac device, we have to wait for its edac_dev.work to be completed before deleting the whole edac_dev structure. Since we have no idea which work in current edac_poller's workqueue is the work we are conerned about, we wait for all work in the edac_poller's workqueue to be proceseed. This is done via flush_cpu_workqueue() which inserts a wq_barrier into the tail of the workqueue and then sleeping on the completion of this wq_barrier. The edac_poller will wake up sleepers when it is found. EDAC core creates only one kernel worker thread, edac_poller, to run the works of all current edac devices. They share the same callback function of edac_device_workq_function(), which would grab the mutex of device_ctls_mutex first before it checks the device. This is exactly where edac_poller and rmmod would have a great chance to deadlock. In below call trace of rmmod > ... > edac_device_del_device > edac_device_workq_teardown > flush_workqueue > flush_cpu_workqueue, device_ctls_mutex would have already been grabbed by edac_device_del_device(). So, on one hand rmmod would sleep on the completion of a wq_barrier, holding device_ctls_mutex; on the other hand edac_poller would be blocked on the same mutex when it's running any one of works of existing edac evices(Note, this edac_dev.work is likely to be totally irrelevant to the one that is being removed right now)and never would have a chance to run the work of above wq_barrier to wake rmmod up. edac_device_workq_teardown() should not be called within the critical region of device_ctls_mutex. Just like is done in edac_pci_del_device() and edac_mc_del_mc(), where edac_pci_workq_teardown() and edac_mc_workq_teardown() are called after related mutex are released. Moreover, an edac_dev.work should check first if it is being removed. If this is the case, then it should bail out immediately. Since not all of existing edac devices are to be removed, this "shutting flag" should be contained to edac device being removed. The current edac_dev.op_state can be used to serve this purpose. The original deadlock problem and the solution have been witnessed and tested on actual hardware. Without the solution, rmmod an edac driver would result in below deadlock: root@localhost:/root> rmmod mv64x60_edac EDAC DEBUG: mv64x60_dma_err_remove() EDAC DEBUG: edac_device_del_device() EDAC DEBUG: find_edac_device_by_dev() (hang for a moment) INFO: task edac-poller:2030 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. edac-poller D 00000000 0 2030 2 Call Trace: [df159dc0] [c0071e3c] free_hot_cold_page+0x17c/0x304 (unreliable) [df159e80] [c000a024] __switch_to+0x6c/0xa0 [df159ea0] [c03587d8] schedule+0x2f4/0x4d8 [df159f00] [c03598a8] __mutex_lock_slowpath+0xa0/0x174 [df159f40] [e1030434] edac_device_workq_function+0x28/0xd8 [edac_core] [df159f60] [c003beb4] run_workqueue+0x114/0x218 [df159f90] [c003c674] worker_thread+0x5c/0xc8 [df159fd0] [c004106c] kthread+0x5c/0xa0 [df159ff0] [c0013538] original_kernel_thread+0x44/0x60 INFO: task rmmod:2062 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. rmmod D 0ff2c9fc 0 2062 1839 Call Trace: [df119c00] [c0437a74] 0xc0437a74 (unreliable) [df119cc0] [c000a024] __switch_to+0x6c/0xa0 [df119ce0] [c03587d8] schedule+0x2f4/0x4d8 [df119d40] [c03591dc] schedule_timeout+0xb0/0xf4 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-22powerpc/cell: add QPACE as a separate Cell platformBenjamin Krill1-1/+1
Since the QPACE (Chromodynamics Parallel Computing on the Cell Broadband Engine) platform doesn't use a iommu, doesn't have PCI devices and a MPIC much lesser setup and configurations are needed. So far all devices are detected as OF device. A notifier function is used to set the dma_ops for the of_platform bus. Further this patch splits the PPC_CELL_NATIVE into PPC_CELL_COMMON which are parts that are shared with the QPACE platform and the rest. Signed-off-by: Benjamin Krill <ben@codiert.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2008-12-01i82875p_edac: fix module removeJarkko Lavinen1-6/+7
Fix module removal bugs of i82875p_edac. Also i82975x_edac code seems to have the same module removal bugs as in i82875p_edac. The problems were: 1. In module removal i82875p_remove_one() is never called. Variable i82875p_registered is newer changed from 1, which guarantees i82875p_remove_one() is not called (and even if it were called, it would be called in wrong order). As a result, the edac_mc workque is not stopped and keeps probing. If kernel debugging options are not enabled, user may not notice anything going wrong. if debugging options are enabled and I do "rmmod i82875p_edac", I get: edac debug: edac_pci_workq_function() checking BUG: unable to handle kernel paging request at f882d16f ... call trace: [<f8834df3>] ? edac_mc_workq_function+0x55/0x7e [edac_core] [<c0233974>] ? run_workqueue+0xd7/0x1a5 [<c023392f>] ? run_workqueue+0x92/0x1a5 [<f8834d9e>] ? edac_mc_workq_function+0x0/0x7e [edac_core] [<c0233af9>] ? worker_thread+0xb7/0xc3 [<c0236a7b>] ? autoremove_wake_function+0x0/0x33 [<c0233a42>] ? worker_thread+0x0/0xc3 [<c0236809>] ? kthread+0x3b/0x61 [<c02367ce>] ? kthread+0x0/0x61 [<c0204587>] ? kernel_thread_helper+0x7/0x10 Fix for this is to get rid of needles variable i82875p_registered altogether and run i82875p_remove_one() *before* pci_unregister_driver(). 2. edac_mc_del_mc() uses mci after freeing mci edac_mc_del_mc() calls calls edac_remove_sysfs_mci_device(). The kobject refcount of mci drops to 0 and mci is freed. After this mci is accessed via debug print and i82875p_remove_one() still uses mci->pvt and tries to free mci again with edac_mc_free(). The fix for this is add kobject_get(&mci->edac_mci_kobj) after edac_mc_alloc(). Then the mci is still available after returning from edac_mc_del_mc() with refcount 1, and mci->pvt is still available. When i82875p_remove_one() finally calls edac_mc_free(), this will cause kobject_put() and mci is released properly. Signed-off-by: Jarkko Lavinen <jlavi@iki.fi> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-01i82875p_edac: fix overflow device resource setupJarkko Lavinen1-0/+1
When I do "modprobe i82875p_edac" on my Asus P4C800 MB on kernels 2.6.26 or later, the module load fails due to BAR 0 collision. On 2.6.25 the module loads just fine. The overflow device on the MB seems to be hidden and its resources are not allocated at normal PCI bus init. Log shows the missing resource problem: EDAC DEBUG: i82875p_probe1() PCI: 0000:00:06.0 reg 10 32bit mmio: [fecf0000, fecf0fff] pci 0000:00:06.0: device not available because of BAR 0 [0xfecf0000-0xfecf0fff] collisions EDAC i82875p: i82875p_setup_overfl_dev(): Failed to enable overflow device The patch below fixes this by calling pci_bus_assign_resources() after the overflow device is revealed and added to the bus. With this patch I am again able to load and use the module. Signed-off-by: Jarkko Lavinen <jlavi@iki.fi> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-12i5000-edac: hold reference to mci kobjectDarrick J. Wong1-1/+3
It turns out that edac_mc_del_mc will kobject_put the last kref on the mci object. If the timing is just right, that means that the mci object is freed before before i5000_remove_one has a chance to free the resources associated with it, causing a null pointer exceptions when unloading the driver. Insert a kobject_{get,put} pair so that this doesn't happen. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-30edac: fix enabling of polling cell moduleBenjamin Herrenschmidt1-0/+3
The edac driver on cell turned out to be not enabled because of a missing op_state. This patch introduces it. Verified to work on top of Ben's next branch. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jens Osterkamp <jens@linux.vnet.ibm.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-30edac x38: new MC driver moduleHitoshi Mitake3-0/+532
I wrote a new module for Intel X38 chipset. This chipset is very similar to Intel 3200 chipset, but there are some different points, so I copyed i3200_edac.c and modified. This is Intel's web page describing this chipset. http://www.intel.com/Products/Desktop/Chipsets/X38/X38-overview.htm I've tested this new module with broken memory, and it seems to be working well. Signed-off-by: Hitoshi Mitake <mitake@clustcom.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20edac cell: fix incorrect edac_modeBenjamin Herrenschmidt1-1/+1
The cell_edac driver is setting the edac_mode field of the csrow's to an incorrect value, causing the sysfs show routine for that field to go out of an array bound and Oopsing the kernel when used. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Cc: <stable@kernel.org> [2.6.27.x, 2.6.26.x. 2.6.25.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16edac i5000: fix thermal issuesAristeu Rozanski1-1/+16
Make the Thermal messages (temperature got past Tmid) be displayed only once because: 1) it's the BIOS job to configure and handle the memory throttling 2) if the BIOS is broken or is aware about the condition, flooding the system logs won't help anything. 3) According to the specification update for Intel 5000 MCHs, all the revisions of this MCH have problems on the thermal sensors, making not automatic (a.k.a. intelligent thermal throttling) impossible. Signed-off-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16edac i5000: fix error messagesAristeu Rozanski1-62/+119
Update the i5000_edac messages, making everything pass through the EDAC (so the log controls will work) and being more specific about the errors. Also, it makes the miscellaneous errors optional and disabled by default. As I didn't found anywhere information about M23ERR-M26ERR (FERR_NF_THERMAL) on FERR_NF_FBD, I'm removing them. Signed-off-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16edac mpc85xx: add support for mpc8572Andrew Kilkenny1-7/+26
This adds support for the dual-core MPC8572 processor. We have to support making SPR changes on each core. Also, since we can have multiple memory controllers sharing an interrupt, flag the interrupts with IRQF_SHARED. Signed-off-by: Andrew Kilkenny <akilkenny@xes-inc.com> Signed-off-by: Nate Case <ncase@xes-inc.com> Acked-by: Dave Jiang <djiang@mvista.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16edac: make i82443bxgx_edac coexist with intel_agpVladislav Bogdanov1-2/+61
Fix 443BX/GX MCH suppport in a EDAC. It makes i82443bxgx_edac coexist with intel_agp using the same approach as several other EDAC drivers. Tested on Intel's L443GX with redhat's 2.6.18 with whole EDAC subsystem backported a while ago. [root@host ~]# dmesg|grep -iE '(AGP|EDAC)' Linux agpgart interface v0.101 (c) Dave Jones agpgart: Detected an Intel 440GX Chipset. agpgart: AGP aperture is 64M @ 0xf8000000 EDAC MC: Ver: 2.1.0 Jun 27 2008 EDAC MC0: Giving out device to 'i82443bxgx_edac' 'I82443BXGX': DEV 0000:00:00.0 EDAC PCI0: Giving out device to module 'i82443bxgx_edac' controller 'EDAC PCI controller': DEV '0000:00:00.0' (POLLED) Signed-off-by: Vladislav Bogdanov <slava@nsys.by> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Dave Airlie <airlied@linux.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-23removed unused #include <linux/version.h>'sAdrian Bunk1-1/+0
This patch lets the files using linux/version.h match the files that #include it. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25edac: mpc85xx fix pci ofdev 2nd passDave Jiang1-24/+43
Convert PCI err device from platform to open firmware of_dev to comply with powerpc schemes. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Dave Jiang <djiang@mvista.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25edac: mv64x60 add pci fixupDave Jiang1-0/+35
Fixup of missing bit 0 on 64360 PCIx_ERR_MASK and errata FEr-#11 and FEr-#16 for the 64460. Bit 0 must remain 0. Signed-off-by: Dave Jiang <djiang@mvista.com> Signed-off-by: Doug Thompson <dougthompson.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25edac: mv64x60 fix get_propertyDave Jiang1-1/+1
Update get_property() call to use of_get_property() in order to fix compile Signed-off-by: Dave Jiang <djiang@mvista.com> Signed-off-by: Doug Thompson <dougthompson.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25edac: e752x fix too loud on nonmemory errorsDoug Thompson1-17/+42
This module harvests more than just memory errors, it also harvests various bus and dma errors that the Chipset detects. Previously, it would report all such errors, which would cause output to be TOO loud. This patches therefore adds a parameter which is used to turn off NON-MEMORY error reports by default. Or the reporting can be enabled via the parameter Also did code style cleanup: less than 80 characters per line rule Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25edac: core fix added newline to sysfs dimm labelsArthur Jones1-1/+5
The channel DIMM label does not seem to be used much in the edac code. However, where it is used (in the core code), it is assumed to not have a newline embedded. This leaves the sysfs file newline free which looks funny when cat'ing it. Here we just add the trailing newline to the sysfs chX_dimm_label output... [Doug Thompson note: the DIMM label is one of the primary uses of EDAC. User space daemon scripts, edac-utils@sourceforge, populate the DIMM label fields, via /sys/devices/system/edac attributes, with the silk screen labels of the motherboard in use. dmidecode access BIOS tables, but BIOS tables are well known to be incorrect and useless in these respects. edac-utils will strip off any newlines before its use of the output, when displaying DIMM slot silk screen labels. Signed-off-by: Arthur Jones <ajones@riverbed.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25edac: core fix static to dynamic ksetArthur Jones1-9/+6
Static kobjects and ksets are not supported in Linux kernel. Convert the mc_kset from static to dynamic. This patch depends on my previous patch to remove the module parameter attributes from mc... Signed-off-by: Arthur Jones <ajones@riverbed.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25edac: core fix redundant sysfs controls to parametersArthur Jones1-116/+1
/sys/devices/system/edac/mc has a few files which are duplicated in /sys/module/edac_core/parameters. Now that all the functionality is duplicated between these two locations, we remove the former kobject attributes and update the documentation. Signed-off-by: Arthur Jones <ajones@riverbed.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25edac: core fix workq timerArthur Jones1-1/+21
When updating the edac_mc_poll_msec module parameter from the sysfs /sys/module/edac_core/parameters/edac_mc_poll_msec file, we don't update the workq timers. So that, if we move from a big poll time to a small one, the small one won't take effect until the big one has timed out. Here we provide a new module parameter set method to call out to the update routine. This brings the /sys/module/edac_core/parameters functionality up to that provided by the /sys/drivers/system/edac/mc sysfs module parameter files so that we can remove them or at least link to the /sys/module files... Signed-off-by: Arthur Jones <ajones@riverbed.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25edac: core fix to use dynamic kobjectArthur Jones1-9/+21
Static kobjects are not supported in linux kernel. Convert the edac_pci_top_main_kobj from static to dynamic. This avoids the double free of the edac_pci_top_main_kobj.name that we see on module reload of the e752x edac driver (and probably others as well). In addition Greg KH <greg@kroah.com> has pointed out that this code may be cleaned up significantly. I will look at that as a follow-on patch, for now, I just want the minimum fix to get this double-free oops bug squashed... Many thanks to Greg KH for his patience in showing me what the Documentation/kobject.txt already said (oops)... Signed-off-by: Arthur Jones <ajones@riverbed.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25edac: i5100: cleanupArthur Jones1-135/+261
Some code cleanliness issues found by Andrew Morton (thanks!) which should not affect functionality, but which should help make the code more maintainable. In particular, we now: * convert all #define's w/ a parameter to static inlines * use 1UL rather than 1ULL when calculating an unsigned long * use pci_disable_device The resulting code is tested and seems to work fine... Signed-off-by: Arthur Jones <ajones@riverbed.com> Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25edac: i5100 fix unmask ecc bitsArthur Jones1-0/+6
Explicitly unmask ECC errors we are interested in reporting. Signed-off-by: Arthur Jones <ajones@riverbed.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25edac: i5100 fix enable ecc hardwareArthur Jones1-0/+10
It is possible that the BIOS did not enable ECC at boot time. We check for that case and fail to load if it is true. Signed-off-by: Arthur Jones <ajones@riverbed.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25edac: i5100 fix missing bitsArthur Jones1-3/+15
The error mask we use to trigger ECC notifications is missing many bits of interest. We add these bits here so that all possible ECC errors can be reported. Signed-off-by: Arthur Jones <ajones@riverbed.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25edac: i5100 new intel chipset driverArthur Jones3-0/+835
Preliminary support for the Intel 5100 MCH. CE and UE errors are reported along with the current DIMM label information and other memory parameters. Reasons why this is preliminary: 1) This chip has 2 independent memory controllers which, for best perforance, use interleaved accesses to the DDR2 memory. This architecture does not map very well to the current edac data structures which depend on symmetric channel access to the interleaved data. Without core changes, the best I could do for now is to map both memory controllers to different csrows (first all ranks of controller 0, then all ranks of controller 1). Someone much more familiar with the edac core than I will probably need to come up with a more general data structure to handle the interleaving and de-interleaving of the two memory controllers. 2) I have not yet tackled the de-interleaving of the rank/controller address space into the physical address space of the CPU. There is nothing fundamentally missing, it is just ending up to be a lot of code, and I'd rather keep it separate for now, esp since it doesn't work yet... 3) The code depends on a particular i5100 chip select to DIMM mainboard chip select mapping. This mapping seems obvious to me in order to support dual and single ranked memory, but it is not unique and DIMM labels could be wrong on other mainboards. There is no way to query this mapping that I know of. 4) The code requires that the i5100 is in 32GB mode. Only 4 ranks per controller, 2 ranks per DIMM are supported. I do not have hardware (nor do I expect to have hardware anytime soon) for the 48GB (6 ranks per controller) mode. 5) The serial presence detect code should be broken out into a "real" i2c driver so that decode-dimms.pl can work. Signed-off-by: Arthur Jones <ajones@riverbed.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-22powerpc/cell/edac: Log a syndrome code in case of correctable errorMaxim Shchetynin1-2/+3
If correctable error occurs the syndrome code was logged as 0. This patch lets EDAC to log a correct syndrome code to make problem investigation easier. Signed-off-by: Maxim Shchetynin <maxim@de.ibm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-05-24edac: mpc85xx: fix building as a moduleKumar Gala1-3/+0
including of <asm/mpc85xx.h> causes build problems since it doesn't exist. Also removed warning: drivers/edac/mpc85xx_edac.c:45: warning: 'mpc85xx_ctl_name' defined but not used Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Acked-by: Doug Thompson <dougthompson@xmission.com> Acked-by: Dave Jiang <djiang@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>