HighTemperatures

Differences between revisions 7 and 8
Revision 7 as of 2010-07-04 13:59:23
Size: 6001
Editor: apw
Comment:
Revision 8 as of 2012-10-18 19:46:22
Size: 8211
Editor: penalvch
Comment: Added section = working around overheat = as I and others use many of these tools actively to address overheat.

There are a number of causes of high temperatures and excessive fan use being reported by a system's sensors. This page provides background information on how you might better isolate the real cause of an issue, to help prevent unrelated issues being conflated onto a single bug: a bug which says "my machine is too hot" will simply collect duplicates and me-toos and become useless. This page also records known issues in this area, arranged by release, so that the most appropriate bug can be found.

Filing a Bug

We request that anyone who suffers from a temperature or fan related issue file a new bug and attach their machine information to it. File this bug by running ubuntu-bug linux from a terminal window (menu item Applications/Accessories/Terminal). We can then look at these bugs and ascertain whether they are duplicates of existing issues. Having the full hardware information for each instance greatly improves our chances of finding and fixing these issues. Once the bug is filed, please ensure it is tagged kernel-therm.

Required Information

Where you believe you have a difference in thermal behaviour between two kernels or between two releases, please ensure you have your own bug and use the scripts in Monitoring System Sensors to produce logs of the temperature over time for both the before and after scenarios. Include this data with a clear description of the two test cases.

Where the issue is between releases, you can use the live CD for the previous release to attempt to recreate the before scenario.

Please ensure your bug is tagged kernel-therm.

Diagnostic Techniques

Monitoring System Sensors

Often bugs are characterised by a feeling that the machine is worse now than sometime in the past. To confirm this it is sensible to get concrete information using the system sensors.

A simple way to get a visual feel for the current temperatures is to run the following command in a terminal window (menu item Applications/Accessories/Terminal):

  • cd /proc/acpi/thermal_zone && watch grep temperature */*

This will display a constantly updating listing of your current temperatures:

  • Every 2.0s: grep temperature TZ00/cooling_mode TZ00...  Thu May 20 11:06:27 2010
    
    TZ00/temperature:temperature:             52 C
    TZ01/temperature:temperature:             47 C
    TZ02/temperature:temperature:             0 C
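
On kernels that do not provide /proc/acpi/thermal_zone, similar readings are usually available under /sys/class/thermal, where each thermal_zone*/temp file holds the temperature in thousandths of a degree Celsius. The following is a minimal sketch under that assumption; the function name print_zone_temps and its optional base-directory argument are illustrative only and are not part of any package mentioned on this page:

```shell
# print_zone_temps [BASE]
# List each thermal zone found under BASE (default: /sys/class/thermal)
# together with its current reading in whole degrees Celsius.
print_zone_temps() {
    base="${1:-/sys/class/thermal}"
    for f in "$base"/thermal_zone*/temp; do
        # Skip the unexpanded pattern or unreadable zones.
        [ -r "$f" ] || continue
        # Some zones report a read error; skip those as well.
        milli=$(cat "$f" 2>/dev/null) || continue
        [ -n "$milli" ] || continue
        zone=$(basename "$(dirname "$f")")
        # The kernel reports millidegrees; convert to degrees.
        echo "$zone: $((milli / 1000)) C"
    done
}
```

Running watch with a small script that calls print_zone_temps should give a constantly updating view comparable to the one above.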

To provide a permanent record of this information you can paste the command below into a terminal:

  • ( cd /proc/acpi/thermal_zone && \
    while :; do \
      line="`date`:`grep temperature */* | awk '{ printf(\" %03d\", $2) }'`"; \
      echo "$line"; \
      sleep 10; \
    done ) | tee LOG

This will provide a log of the temperatures over time in a file called LOG, which can be attached to a Launchpad bug report:

  • Thu May 20 11:13:40 BST 2010: 051 047 000
    Thu May 20 11:13:50 BST 2010: 051 047 000
    Thu May 20 11:14:00 BST 2010: 051 047 000
    Thu May 20 11:14:10 BST 2010: 051 047 000
    Thu May 20 11:14:20 BST 2010: 051 048 000
    Thu May 20 11:14:30 BST 2010: 051 048 000
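
When comparing a before and an after scenario, it can help to reduce each LOG file to a single peak value per zone. The sketch below assumes the log format shown above, where the date is followed by a colon and a space and then one zero-padded reading per zone; the function name summarize_log is hypothetical, and negative readings are not handled:

```shell
# summarize_log FILE
# Print the highest reading recorded for each zone column in FILE.
summarize_log() {
    awk -F': ' '
        {
            # The readings are the text after the final ": " separator.
            n = split($NF, t, " ")
            for (i = 1; i <= n; i++)
                if (t[i] + 0 > max[i]) max[i] = t[i] + 0
            if (n > zones) zones = n
        }
        END {
            for (i = 1; i <= zones; i++)
                printf("zone %d: max %d C\n", i, max[i])
        }
    ' "$1"
}
```

For example, summarize_log LOG could be run once on the log from each kernel or release, and the two summaries included in the bug description alongside the full logs.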

Known Issues

Below is a list of known temperature and fan related bugs, with information on how to tell which bug you have, whether it is fixed, and if so in which releases and kernel versions.

Lucid

Numerous Dell systems suffering total fan failure after suspend/resume (FIXED)

526354 -- A number of Dell models suffered from total fan failure following suspend/resume. This tended to exhibit itself in one of two ways. Firstly, sensor readings (see above) tended to float up from the normal level of around 40 C to more like 70 C. Secondly, under heavy load the machine would drift up to 90 C or so and then power off without warning, exhibiting very high fan speeds for the first few minutes after reboot.

This issue was triggered by an embedded controller (EC) interface issue, wherein the EC would become confused following a suspend/resume cycle and no longer control the fans on our behalf. This issue was fixed shortly after the release of Lucid, and the fix is contained in kernels 2.6.32-22.33 and later.

This issue is only known to affect Dell systems. The key indicator that you are seeing this bug is that fans and temperature are controlled normally until after a suspend/resume. Upgrading to the latest kernel should fix the issue.

ATI Radeon based systems running hot since upgrades to Lucid (Open)

563156 -- There are a number of reports of systems with ATI graphics running hot, often with fans running constantly. There are reports that switching to the fglrx binary graphics drivers returns fan control to normal.

To confirm this is your issue, it would be good to get temperature readings from a previous release (you can use a live CD for this) and from Lucid. Installing the fglrx binary drivers from Jockey (menu item System/Administration/Hardware Drivers) and comparing temperatures before and after would also be useful. Please report back on the bug should you have this issue.

Upgrade to Lucid causes overheating, scratch re-install fixes it

583099 -- There are sporadic reports that an upgrade to Lucid (all from Karmic so far) may leave you with poor fan control, but that a scratch install then resolves things. The reporter has confirmed that a clean Karmic install upgraded to Lucid shows different idle temperatures compared to a scratch install of Lucid on the same hardware. Investigation continues.

It should be noted that between the Karmic scratch install and the Lucid scratch install, temperature improves by 5 C, or about 20%. It is the upgrade which seems at times to be out of kilter.

Working around overheat

While your bug report is being addressed, you may proactively work around potential overheating. One helpful step is to monitor the hardware temperatures via lm-sensors. For more on this, please see https://help.ubuntu.com/community/SensorInstallHowto.

ASPM power management for Precise and onwards

As of Precise, a patch has been issued to help address overheating. For more on this, please see https://wiki.ubuntu.com/Kernel/PowerManagementASPM.

CPU governor

One may set the CPU frequency governor via the cpufreq-selector command, provided by the package gnome-applets. For example, with a dual core CPU, one could execute at a terminal:

sudo cpufreq-selector --cpu=0 --governor=powersave && sudo cpufreq-selector --cpu=1 --governor=powersave 

This will change the CPU governor from ondemand to powersave. You can then verify the CPU governor status by executing at a terminal:

cpufreq-info 

This is provided by the package cpufrequtils.
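
If cpufrequtils is not installed, the active governor can usually also be read directly from sysfs, where each CPU exposes a cpufreq/scaling_governor file under /sys/devices/system/cpu. The following is a minimal sketch under that assumption; the function name show_governors and its optional base-directory argument are illustrative only:

```shell
# show_governors [BASE]
# Print the active frequency governor for every CPU found under BASE
# (default: /sys/devices/system/cpu).
show_governors() {
    base="${1:-/sys/devices/system/cpu}"
    for f in "$base"/cpu[0-9]*/cpufreq/scaling_governor; do
        # Skip the unexpanded pattern or CPUs without cpufreq support.
        [ -r "$f" ] || continue
        # The CPU name is the directory two levels above the file.
        cpu=$(basename "$(dirname "$(dirname "$f")")")
        echo "$cpu: $(cat "$f")"
    done
}
```

After running the cpufreq-selector commands above, each listed CPU would be expected to report powersave.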

AMD proprietary driver fglrx

Certain AMD graphics cards may have better heat profiles using the proprietary AMD driver fglrx, versus the default open source ones. The Ubuntu repositories offer both fglrx-installer and fglrx-installer-updates. Install instructions may be found at https://help.ubuntu.com/community/BinaryDriverHowto/ATI.

Increase fan speed

Dell Inspiron laptops

For Dell Inspiron laptops, one may use i8kmon to increase the fan speed. This utility is provided by the package i8kutils. For instructions on using this, please see the man page at http://manpages.ubuntu.com/manpages/precise/man1/i8kmon.1.html.

General maintenance

Regularly use a can of compressed air to blow out the dust that may accumulate in the computer's exhaust area. Also, do not obstruct the exhaust areas.

Kernel/Debugging/HighTemperatures (last edited 2014-01-03 16:44:50 by penalvch)