
In a recent thread on the Ubuntu Forums, a user asked how they might add the CPU temperature for an AMD processor to their conky.

A "simple" answer such as adding

${execi 1 sensors k10temp-pci-00c3 | grep 'temp1' | awk -F'.' '{print $1}'| awk -F'+' '{print $2}'}°C

to your conky might generally suffice to provide a nice integer reading.
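The grep and two awk stages above can also be collapsed into a single awk call. What follows is only a sketch: the function name parse_temp1 is made up for illustration, and it assumes the reading appears on a line of the form "temp1:  +32.5°C", which is what sensors prints on my system.

```bash
#!/bin/bash
# Sketch: extract the integer part of the temp1 reading from `sensors` output.
# parse_temp1 is a hypothetical helper name; it reads sensors output on stdin.
parse_temp1() {
    awk '/^temp1:/ {
        v = $2                 # e.g. "+32.5°C"
        sub(/^\+/, "", v)      # drop the leading plus sign
        sub(/\..*/, "", v)     # drop the decimal point and everything after it
        print v
        exit
    }'
}
```

It would be used as `sensors k10temp-pci-00c3 | parse_temp1`. Unlike splitting on '+', this keeps the sign on a negative reading.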

The problem is that AMD processors don't return a physical temperature, so lm-sensors is not providing an actual temperature. If you have used AMD processors for any length of time, this is familiar. Also familiar is the lament that AMD just can't seem to give accurate temperature readings. "They are always way too low!" is the refrain.

But "accurate" depends on the frame of reference in this case. AMD does not intend to give an accurate reading of physical temperature. What the value represents is a "thermal margin from a critical limit" that "specifies the processor temperature relative to the point at which the system must supply the maximum cooling for the processor's specified maximum case temperature and maximum thermal power dissipation." The value is accurate: it accurately tells your machine how fast to spin the fan(s) on your HSF.

Here's what the k10temp module documentation has to say about it:

"There is one temperature measurement value, available as temp1_input in
sysfs. It is measured in degrees Celsius with a resolution of 1/8th degree.
Please note that it is defined as a relative value; to quote the AMD manual:

Tctl is the processor temperature control value, used by the platform to
control cooling systems. Tctl is a non-physical temperature on an
arbitrary scale measured in degrees. It does _not_ represent an actual
physical temperature like die or case temperature. Instead, it specifies
the processor temperature relative to the point at which the system must
supply the maximum cooling for the processor's specified maximum case
temperature and maximum thermal power dissipation.

The maximum value for Tctl is available in the file temp1_max.

If the BIOS has enabled hardware temperature control, the threshold at
which the processor will throttle itself to avoid damage is available in
temp1_crit and temp1_crit_hyst.
"


Yeah. OK. Thanks.
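For anyone who would rather read the value straight from sysfs than parse sensors output, here is a rough sketch. It assumes the standard hwmon layout, where each chip gets a directory under /sys/class/hwmon containing a name file, and where temp1_input is reported in millidegrees. The function name k10temp_degrees and its optional base-directory argument are my own inventions for illustration.

```bash
#!/bin/bash
# Sketch: print the k10temp temp1_input reading in whole (pseudo) degrees.
# $1: hwmon base directory; defaults to /sys/class/hwmon (assumed layout).
k10temp_degrees() {
    local base=${1:-/sys/class/hwmon} dir
    for dir in "$base"/hwmon*; do
        # Skip entries without a readable name file.
        [ -r "$dir/name" ] || continue
        if [ "$(cat "$dir/name")" = "k10temp" ]; then
            # hwmon reports millidegrees; integer division drops the fraction.
            echo $(( $(cat "$dir/temp1_input") / 1000 ))
            return 0
        fi
    done
    # No k10temp chip found.
    return 1
}
```

Calling `k10temp_degrees` with no argument reads the live system. The same pattern should work for temp1_max and temp1_crit where the driver exposes them.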

And this, from a fairly old lm-sensors wiki entry about the K8 module, still seems to be the case:

"coretemp returns unrealistic values

So, if the temperature value reported by coretemp is unrealistically low, all it means is that you are far away from the critical limit so your systems are running totally fine and cool and you don't have to worry at all. Unfortunately, there is no way to improve the readings, this is a hardware limitation.

Additionally, the critical limit value may be wrong on some CPU models. We may be able to address this problem over time, but again it's not really a problem in the first place. All that really matters is how far the measurement is from that limit. If the difference is above 40 pseudo degrees Celsius (again these are not real degrees Celsius!) then you're safe."


OK, then.

That's all well and good, I suppose, but I'm a human like everyone else and I like to see things at a glance, not figure in my head how the temperature of my CPU is asymptotically approaching some critical limit at which point my fans have to be spinning fast enough to avoid a meltdown.

So, you are going to get a low temperature reading at idle for your AMD processor in most cases, sometimes even sub-ambient, which we know can't be physically true. That doesn't make a lot of sense unless you are a motherboard determining how fast to spin an HSF.

Why not just add a few degrees to the value returned by sensors and call it good? Well, then at the top end you have an unrealistically high -- possibly alarming -- value.

Add a bit at the bottom and know you are high on the top? Leave it low on the bottom and know it is getting more accurate at the top? I really don't like either approach. So what can we do?

I decided to dust off a short script I had written some time ago to use with my conky. Short, and probably wrong. But maybe more reasonable.

To make it work I made a few stupid, maybe even foolish, assumptions:

1. Since this reading is supposed to drive cooling, assume that means fan speed.
2. Assume the relationship to the fan speed is linear. It most certainly isn't, but I don't know what the curve looks like.
3. Assume that at max fan speed, the k10temp is roughly the same as the physical temp or very close to it.



So I came up with the following, which may or may not be worth a hill of beans.

#!/bin/bash

# AMD processors do not report a physical temperature. Can we generate
# something reasonable for our conky?

#******************************************************************************
# ***** From lm-sensors wiki at their website: ********************************
#******************************************************************************

# coretemp returns unrealistic values
#
# The temperature value returned by the coretemp driver isn't absolute. It's a
# thermal margin from the critical limit, and the greater the margin, the worse
# the accuracy. It isn't really returning degrees Celsius. At high temperatures,
# the (small) thermal margin is almost expressed in degrees Celsius, but at low
# temperature, the (high) thermal margin is no longer expressed in actual
# degrees Celsius.
#
# So, if the temperature value reported by coretemp is unrealistically low, all
# it means is that you are far away from the critical limit so your systems are
# running totally fine and cool and you don't have to worry at all.
# Unfortunately, there is no way to improve the readings, this is a hardware
# limitation.
#
# Additionally, the critical limit value may be wrong on some CPU models. We
# may be able to address this problem over time, but again it's not really a
# problem in the first place. All that really matters is how far the
# measurement is from that limit. If the difference is above 40 pseudo degrees
# Celsius (again these are not real degrees Celsius!) then you're safe.
#******************************************************************************


#******************************************************************************
# ***** From the k10temp module documentation: ********************************
#******************************************************************************

# There is one temperature measurement value, available as temp1_input in
# sysfs. It is measured in degrees Celsius with a resolution of 1/8th degree.
# Please note that it is defined as a relative value; to quote the AMD manual:
#
# Tctl is the processor temperature control value, used by the platform to
# control cooling systems. Tctl is a non-physical temperature on an
# arbitrary scale measured in degrees. It does _not_ represent an actual
# physical temperature like die or case temperature. Instead, it specifies
# the processor temperature relative to the point at which the system must
# supply the maximum cooling for the processor's specified maximum case
# temperature and maximum thermal power dissipation.
#
# The maximum value for Tctl is available in the file temp1_max.
#
# If the BIOS has enabled hardware temperature control, the threshold at
# which the processor will throttle itself to avoid damage is available in
# temp1_crit and temp1_crit_hyst.
#******************************************************************************

# For those of you following along, I'm going to make a few very stupid
# and very wrong assumptions to see if I can get a more reasonable and workable
# CPU temp reading at a glance:
#
# 1. Since this reading is supposed to drive cooling, assume that means
# how fast the fan runs
# 2. Assume the relationship to the fan speed is linear.
# 3. Assume that at max fan speed, the k10temp is roughly the same as the
# physical temp

# IF YOU USE THIS AND BREAK YOUR SYSTEM, IT'S ON YOU!

# Here are some basic values I'm assuming from observing my own system --
# an FX-8350 cooled by a monster Noctua HSF.
# These will take a bit of observation for another system.
idle_correction=20
no_correction=0
min_fanspeed=500
max_fanspeed=1200
speed_span=$(( ($max_fanspeed - $min_fanspeed) ))

# Get the CPU temp and HSF fan speed from sensors. This may take some fiddling
# depending on your motherboard.
cpu_temp=$(sensors k10temp-pci-00c3 | grep 'temp1' | awk -F'.' '{print $1}'| awk -F'+' '{print $2}')
fanspeed=$(sensors it8728-isa-0228 | grep 'fan1' | awk '{print $2}' | cut -c1-4)

# If fanspeed<=500, factor = 1, so correction = idle_correction
# If fanspeed>=1200, factor = 0, so correction = 0
# Otherwise, calculate correction

if [ "$fanspeed" -le "$min_fanspeed" ]
then
     # At idle, add the idle correction
     final_temp=$(( ($cpu_temp + $idle_correction) ))
     echo "$final_temp"
elif [ "$fanspeed" -ge "$max_fanspeed" ]
then
     # At max fan speed, assume k10temp approaches physical temp
     final_temp=$cpu_temp
     echo "$final_temp"
else
     # Interpolate a correction based on fan speed
     fan_diff=$(( ($max_fanspeed - $fanspeed) ))
     # A little trick here: integer division of a ratio less than 1 gives 0,
     # so scale by 100 before dividing.
     correction_factor=$(( (($fan_diff) * 100 )/ ($speed_span) ))
     correction=$(( ($idle_correction * $correction_factor) ))

     # And now we need a useful integer, so undo the scaling.
     correction=$(( ($correction / 100) ))

     # Calculate the corrected temperature
     final_temp=$(( ($cpu_temp + $correction) ))
     echo "$final_temp"
fi
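
To sanity-check the interpolation without a live sensor, the same arithmetic can be wrapped in a function that takes the raw reading and the fan speed as arguments. This is only a sketch. The name corrected_temp and its argument order are made up, the constants are the ones observed on my FX-8350, and the scale-by-100 step is collapsed into a single multiply-then-divide. With other constants the rounding might differ by a degree.

```bash
#!/bin/bash
# Sketch: the script's fan-speed interpolation as a pure function.
# Usage: corrected_temp CPU_TEMP FANSPEED   (hypothetical helper)
corrected_temp() {
    local cpu_temp=$1 fanspeed=$2
    # Values observed on one system; adjust for yours.
    local idle_correction=20 min_fanspeed=500 max_fanspeed=1200
    local correction

    if [ "$fanspeed" -le "$min_fanspeed" ]; then
        # At or below idle fan speed: apply the full correction.
        correction=$idle_correction
    elif [ "$fanspeed" -ge "$max_fanspeed" ]; then
        # At or above max fan speed: trust the raw reading.
        correction=0
    else
        # Multiply before dividing so integer division does not truncate to 0.
        correction=$(( idle_correction * (max_fanspeed - fanspeed) / (max_fanspeed - min_fanspeed) ))
    fi
    echo $(( cpu_temp + correction ))
}
```

For example, `corrected_temp 30 850` sits halfway between the two fan speeds, so half of the 20-degree correction is applied and it prints 40.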




There is probably nothing new here that hasn't been thought of by a million conky users before me. I've just had this thing lying around for several years and dusted it off in response to the Ubuntu Forums question. So, there it is for what it's worth.

The images below show what happens at idle and after running all 8 cores of my FX-8350 for 30 minutes at 100% calculating pi. The first CPU reading is simply the k10temp + 20. The next is corrected using the script above. The last is the raw k10temp.

 

 


Legal Disclaimer:  I am not an Ubuntu apologist, but I do use Ubuntu and Kubuntu.  I am not a Fedora apologist, but I do use Fedora.  I am not a Windows apologist, but I do use Windows.  I'm not a FOSS apologist, but I use FOSS tools when I can and when they fit the job at hand.  I don't find any sense in a religious affiliation with tools and operating systems.  That's just asinine.