KernelMaverickBIOSTestAutomation

Summary

It is desirable to be able to automatically test specific BIOS functionality, such as suspend/resume, hibernate, wakeup, fan control, battery and C states, in order to locate, and hence fix or work around, BIOS/ACPI errors. It is also desirable to add more kernel debug to the hibernate/suspend code paths to help pinpoint BIOS errors automatically. We propose a tool plus kernel debug to perform this automatic testing and diagnosis.

Release Note

This section should include a paragraph describing the end-user impact of this change. It is meant to be included in the release notes of the first release in which it is implemented. (Not all of these will actually be included in the release notes, at the release manager's discretion; but writing them is a useful exercise.)

It is mandatory.

Rationale

It is useful to be able to automatically detect and diagnose ACPI/BIOS issues that cause problems with a range of features: suspend/hibernate/resume, wakealarm wakeups, fan control, battery sensing, lid open/close sensing, ACPI hotkey events, processor C states, etc.

While the Intel Linux Firmware Testkit tried to address firmware (BIOS) issues, the project has been unmaintained since October 2007 and has stagnated. The intention is to leverage the best parts of the Intel Firmware Testkit and complement them with BIOS testing know-how gained from hardware enablement to create an up-to-date suite of BIOS tests.

User stories

Bill's laptop hangs when coming out of hibernate. He is unsure if the problem is a kernel driver hang or a fundamentally broken BIOS. He runs the BIOS test tool and gets a diagnosis that the BIOS does not return control back to the kernel on resume from hibernate and is directed to check for a BIOS upgrade.

Doug's laptop battery discharges without any feedback on the desktop. The BIOS test tool checks battery charging/discharging, sanity checks the ACPI DSDT and informs him that there is a bug in the BAT ACPI methods.

Eve's laptop hotkeys and lid open/close events don't seem to work. The BIOS test tool detects that these events are not being correctly generated by ACPI and hence that the bug is not in the kernel or userspace.

Fred's suspend/resume fails. The BIOS test tool automatically runs through a set of test scenarios and detects that the BIOS works OK and that the failure is in a binary blob driver on resume.

A new machine comes onto the market - running the full set of tests validates the machine and picks up some subtle ACPI semantic bugs (such as not detecting DSDT mutex wait timeouts) that may need fixing.

Assumptions

Design

Several points of attack for testing BIOS:

1. Parse the BIOS ACPI tables (e.g. the DSDT) to allow top-level syntax and semantic checking using the iasl compiler/disassembler.

2. Automated test suite to test individual BIOS related issues.

3. Add extra debug hooks into the kernel suspend/resume and hibernate paths to aid debugging. Leverage tools like ftrace to detect hibernate failure code paths.

Create a specific ACPI test kit kernel (daily/weekly build?) containing features such as:

  • CONFIG_ACPI_DEBUG enabled
  • Extra debug S3/S4 code paths
  • An s3_led flag for the acpi_sleep kernel parameter to enable keyboard LED debug on S3 resume

4. Automated scanning of the kernel message log to identify kernel generated ACPI warnings and errors and attempt to diagnose and suggest workarounds/fixes for known ACPI issues.
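
As an illustration of point 4, the following is a minimal sketch of a kernel log scanner in Python. It assumes that ACPI problems appear in the log with the "ACPI Error:", "ACPI Warning:" and "ACPI Exception:" prefixes. The KNOWN_ISSUES table, its advice strings and the function name scan_klog are hypothetical and only show the shape of a diagnosis table; they are not part of any existing tool.

```python
import re

# Prefixes assumed to mark ACPI problems in the kernel message log.
ACPI_PREFIX = re.compile(r"ACPI (Error|Warning|Exception):\s*(.*)")

DEFAULT_ADVICE = "No known diagnosis; report the message along with the ACPI tables."

# Hypothetical diagnosis table: patterns and advice are illustrative only.
KNOWN_ISSUES = [
    (re.compile(r"\bAE_NOT_FOUND\b"),
     "An ACPI method references a missing object; check for a BIOS upgrade."),
    (re.compile(r"\bAE_TIME\b"),
     "An ACPI operation timed out (possibly a mutex wait); check for a BIOS upgrade."),
]


def scan_klog(lines):
    """Return a list of (severity, message, advice) for ACPI log lines."""
    findings = []
    for line in lines:
        match = ACPI_PREFIX.search(line)
        if match is None:
            continue  # not an ACPI error/warning line
        severity = match.group(1)
        message = match.group(2).strip()
        advice = DEFAULT_ADVICE
        for pattern, hint in KNOWN_ISSUES:
            if pattern.search(message):
                advice = hint
                break
        findings.append((severity, message, advice))
    return findings
```

On a live system the input could be the output of dmesg split into lines; a real tool would need a much larger, curated table of known issues.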

Implementation

Where possible, tests in the test suite will be written in a high-level scripting language such as Python, and/or in C (to re-use the Intel Firmware Testkit code).
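
To give a feel for what a Python test might look like, here is a minimal sketch of the DSDT check from Design point 1. It assumes that the DSDT has already been disassembled to a .dsl file (for example with iasl -d), and that iasl ends its output with a summary of the form "N Errors, N Warnings, N Remarks"; that format may differ between iasl versions. The function names are hypothetical.

```python
import re
import subprocess

# Assumed format of the iasl summary line; may vary between iasl versions.
SUMMARY = re.compile(r"(\d+) Errors, (\d+) Warnings, (\d+) Remarks")


def parse_iasl_summary(output):
    """Extract error/warning/remark counts from iasl output, or None."""
    match = SUMMARY.search(output)
    if match is None:
        return None
    errors, warnings, remarks = (int(n) for n in match.groups())
    return {"errors": errors, "warnings": warnings, "remarks": remarks}


def check_dsdt(dsl_path):
    """Recompile a disassembled DSDT with iasl and summarise the result.

    Requires iasl to be installed; dsl_path is a file produced by 'iasl -d'.
    """
    proc = subprocess.run(["iasl", dsl_path], capture_output=True, text=True)
    # The summary may be written to either stream, so search both.
    return parse_iasl_summary(proc.stdout + proc.stderr)
```

A test could then fail when the error count is non-zero and report warnings for manual review. Keeping the parsing separate from the iasl invocation lets the parser be exercised without the tool installed.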

Additional suspend/resume/hibernate debug probably needs to be added to the kernel to aid debugging. This should be enabled via kernel boot options and upstreamed where appropriate.
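
Since the extra debug is enabled by boot options, a test may want to confirm that the running kernel was booted with them before interpreting results. The sketch below parses a kernel command line for acpi_sleep flags, assuming the flags are given as a comma-separated list. The function names are hypothetical, and s3_led is the flag proposed in this spec rather than an existing kernel option.

```python
def acpi_sleep_flags(cmdline):
    """Return the list of flags passed via acpi_sleep= on a kernel command line."""
    flags = []
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "acpi_sleep" and sep:
            # Flags are assumed to be comma separated, e.g. acpi_sleep=a,b
            flags.extend(flag for flag in value.split(",") if flag)
    return flags


def debug_enabled(cmdline, flag="s3_led"):
    """True if the given (proposed) debug flag was passed to acpi_sleep."""
    return flag in acpi_sleep_flags(cmdline)
```

On a running system the command line can be read from /proc/cmdline and passed to debug_enabled().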

Test/Demo Plan

To be completed

It's important that we are able to test new features, and demonstrate them to users. Use this section to describe a short plan that anybody can follow that demonstrates the feature is working. This can then be used during testing, and to show off after release. Please add an entry to http://testcases.qa.ubuntu.com/Coverage/NewFeatures for tracking test coverage.

This need not be added or completed until the specification is nearing beta.

BoF agenda and discussion

Use this section to take notes during the BoF; if you keep it in the approved spec, use it for summarising what was discussed and note any options that were rejected.

Comments


CategorySpec

KernelTeam/Specs/KernelMaverickBIOSTestAutomation (last edited 2010-05-27 09:06:01 by nfilus)