DebuggingVoodoo

Differences between revisions 1 and 2
Revision 1 as of 2010-07-21 13:29:59
Size: 1762
Editor: 193
Comment:
Revision 2 as of 2010-12-10 17:39:33
Size: 2802
Editor: p5B2E5002
Comment:
This page is intended as a pool of symptoms, tricks, and links to places that help get to the bottom of a problem.

Symptoms

Debugging Hints

Incorrect IRQ0 override

The BIOS claims that IRQ0 is routed to another IRQ on the IO-APIC, but this is not true. In some cases[1][2] it was possible to check the chipset configuration and ignore the wrong override, but the documentation for most chipsets is under NDA. On a tickless system all CPUs can end up in a deeper C-state and then need a timer interrupt to wake up. If the timer interrupt is not routed correctly, that wake-up never happens.
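To see whether the CPUs really do enter deeper C-states, the cpuidle counters in sysfs can be read. The following is only a rough Python sketch and not part of the original notes: it assumes the kernel exposes /sys/devices/system/cpu/cpuN/cpuidle/stateM/ with "name" and "usage" files, and the helper name read_cstates is made up for illustration.

```python
import os


def read_cstates(cpu_dir):
    """Return [(state_name, usage_count)] read from <cpu_dir>/cpuidle/state*/."""
    base = os.path.join(cpu_dir, "cpuidle")
    if not os.path.isdir(base):
        return []  # cpuidle not available (or the path is wrong)
    # Keep only entries named state<number> and order them by that number.
    entries = [e for e in os.listdir(base)
               if e.startswith("state") and e[5:].isdigit()]
    entries.sort(key=lambda e: int(e[5:]))
    states = []
    for entry in entries:
        with open(os.path.join(base, entry, "name")) as f:
            name = f.read().strip()
        with open(os.path.join(base, entry, "usage")) as f:
            usage = int(f.read().strip())
        states.append((name, usage))
    return states
```

Calling read_cstates("/sys/devices/system/cpu/cpu0") on the affected machine should list how often each idle state was entered. A high count for the deepest state on every CPU would fit the situation described above.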

Possible boot options to try (together with "debug apic=debug"):

  • acpi_skip_timer_override
  • hpet=disable / nohpet
  • idle=poll (idle=halt might only work on systems without C1E)

With "debug apic=debug" Linux logs its attempts to get the timer interrupt working and also prints the interrupt routing in the IO-APIC.
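When trying several of these options in turn, it is easy to lose track of which ones the running kernel was actually booted with. The following Python sketch (not from the original page) splits a kernel command line, as found in /proc/cmdline, and reports the workaround options that are present. The function names and the WORKAROUNDS tuple are made up for illustration, and quoted values containing spaces are not handled.

```python
# Option names taken from this page; anything else on the command line is ignored.
WORKAROUNDS = ("acpi_skip_timer_override", "hpet", "nohpet", "idle", "pci")


def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def active_workarounds(cmdline):
    """Return only the workaround options that appear on the command line."""
    params = parse_cmdline(cmdline)
    return {name: params[name] for name in WORKAROUNDS if name in params}


def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as f:
        return f.read()
```

On a live system, active_workarounds(read_cmdline()) gives the options in effect for the current boot.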

[1] x86: SB450: skip IRQ0 override if it is not routed to INT2 of IOAPIC

[2] x86: SB600: skip IRQ0 override if it is not routed to INT2 of IOAPIC

IRQ0 not causing wake-up

This has been observed on some systems, probably more often on those that use the HPET to emulate the timer (which is probably all the newer ones). In that case the IRQ0 count usually keeps incrementing, but the system still goes to sleep and does not wake up until a key is pressed or some other interrupt arrives.
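Whether IRQ0 is still incrementing can be checked by sampling /proc/interrupts twice and comparing the counts. The following is a minimal Python sketch, not part of the original notes. The helper names are made up for illustration, and the parser assumes the usual layout of a CPU header row followed by "label: per-CPU counts ... description" lines.

```python
import time


def parse_irq_totals(text):
    """Map each IRQ label to its count summed over all CPUs."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        total = 0
        # Only the first ncpus fields can be counters; stop at the first non-number.
        for field in rest.split()[:ncpus]:
            if not field.isdigit():
                break
            total += int(field)
        totals[label.strip()] = total
    return totals


def irq_delta(before, after, irq="0"):
    """Number of times the given IRQ fired between two snapshots."""
    return parse_irq_totals(after).get(irq, 0) - parse_irq_totals(before).get(irq, 0)


def sample_irq0(interval=1.0, path="/proc/interrupts"):
    """Take two snapshots 'interval' seconds apart and return the IRQ0 delta."""
    with open(path) as f:
        first = f.read()
    time.sleep(interval)
    with open(path) as f:
        second = f.read()
    return irq_delta(first, second)
```

A positive result from sample_irq0() while the machine is otherwise stuck would match the symptom described here: the timer interrupt is counted, yet it does not wake the system.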

Usually the graphics card, the disk, or the network interface generates a lot of interrupts, which wake the system often enough to hide any problems. But it seems that using MSI/MSI-X may change that. On one of my personal systems both the disk controller and the Ethernet controller were using MSI, and the graphics card was known not to generate an interrupt on sync. As a result, the system suffered badly from hangs.
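To find out which devices are using MSI on a given system, the type column of /proc/interrupts can be inspected. The following Python sketch is an illustration only and not from the original page. It assumes that MSI interrupts carry "MSI" somewhere in their description (for example "PCI-MSI-edge"), and the function name msi_interrupts is made up.

```python
def msi_interrupts(text):
    """Return (irq_label, last_description_word) for every MSI interrupt line."""
    found = []
    for line in text.splitlines()[1:]:  # skip the CPU header row
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        # Drop the purely numeric counters; the rest is type plus device name(s).
        desc = [field for field in rest.split() if not field.isdigit()]
        if any("MSI" in field for field in desc):
            found.append((label.strip(), desc[-1]))
    return found
```

Feeding it the contents of /proc/interrupts lists the interrupts delivered via MSI. An empty list after booting with MSI disabled would confirm that the option took effect.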

Surprisingly, there is a simple solution: booting with "pci=nomsi" forces the PCI subsystem not to use MSI, and with that the hangs were gone. This is a bit of a magic workaround, because one would expect the timer interrupt, which does seem to arrive, to wake things up.

Kernel/DebuggingVoodoo (last edited 2010-12-10 17:39:33 by p5B2E5002)