Difference between revisions of "Suspend debugging"

From Toshiba AC100 wiki

Revision as of 09:56, 26 December 2016

Useful links

[PATCH v3 13/28] ARM: tegra: Add suspend and hotplug support

Linux Kernel and Android Suspend/Resume

How to get s2ram working

suspend/resume debugging: device filter

Debugging hibernation and suspend

DebuggingKernelSuspend

Linux Kernel and Android Suspend/Resume -blog archive

Android Power Management


Doesn't want to wake up?

NOTE: pm_trace appears to be available on x86/x86-64 only.
To simulate the suspend/resume process, enter the following command:

sudo sh -c "sync; echo 1 > /sys/power/pm_trace; pm-suspend"

At this point your computer should enter the suspend state within a few seconds. Usually the power LED will slowly flash when in the suspended state. When that has happened, initiate the resume process by pressing the power button. Observe closely if the disk light comes on briefly. This indicates that resume has begun. If resume fails to complete, then press the power button until the computer turns off. Power on your computer making sure that it loads the same kernel that exhibited the resume problem. You have about 3 minutes to start this boot process before the information saved in the RTC gets corrupted.

Start a console and enter:

dmesg > dmesg.txt

Open this file and look for lines similar to these:

[   11.323206]   Magic number: 0:798:264
[   11.323257]   hash matches drivers/base/power/resume.c:46

There may well be another 'hash matches' line beyond that. If so, then you are in luck because the last one is the likely culprit. For example:

hash matches device i2c-9191

The only way to prove this is to remove the module prior to initiating suspend. Repeat as needed...

If you get a device number rather than name, lspci and /sys/devices/pci* are your friends.
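
The search for 'hash matches' lines in the saved log can be scripted. This is a minimal sketch, assuming the log was saved as a plain-text file such as dmesg.txt; the function names hash_matches and last_hash_match are hypothetical helpers, not part of any tool.

```shell
# hash_matches FILE: print every 'hash matches' line found in a saved dmesg log.
hash_matches() {
    grep 'hash matches' "$1"
}

# last_hash_match FILE: print only the final 'hash matches' line,
# which according to the procedure above is the likely culprit.
last_hash_match() {
    hash_matches "$1" | tail -n 1
}
```

Typical use after the reboot would be: last_hash_match dmesg.txt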

Logs

lp1 early suspend crash 2

rel-15r7 suspend->wakeup

suspend cm10 beta3 wifi on, no ac

u-boot lp1 no nvec

Patches

https://gitorious.org/~marvin24/ac100/marvin24s-kernel/commit/312cef53a6ebfbe1f09b8a053fdc726515861d22

https://gitorious.org/ac100/kernel/commit/b3380bae699f1cf315836f77b2abf05e3549e0fa

https://gitorious.org/ac100/kernel/commit/1d0d7ff80526d0f6451d230a81349dac01b466c1

Code analysis

Call stack (based on sources):

pm_suspend (kernel/power/main.c)
{
	enter_state (kernel/power/suspend.c)
	{
		suspend_prepare (kernel/power/suspend.c)
		suspend_test(TEST_FREEZER) (kernel/power/suspend.c)
		
		suspend_devices_and_enter (kernel/power/suspend.c)
		{
			platform_suspend_begin (kernel/power/suspend.c)
			suspend_console (kernel/printk/printk.c)
			suspend_test_start (kernel/power/suspend_test.c)
			
			dpm_suspend_start (drivers/base/power/main.c)
			{
				dpm_prepare
				dpm_suspend
					cpufreq_suspend
						for_all( device_suspend )
			}
			
			suspend_test_finish (kernel/power/suspend_test.c)
			suspend_test(TEST_DEVICES) (kernel/power/suspend.c)
			
			suspend_enter (kernel/power/suspend.c)
			{
				platform_suspend_prepare
				
				dpm_suspend_late
				platform_suspend_prepare_late
				
				dpm_suspend_noirq
				platform_suspend_prepare_noirq
				
				suspend_test(TEST_PLATFORM)

				disable_nonboot_cpus
				suspend_test(TEST_CPUS)

				arch_suspend_disable_irqs
				
				syscore_suspend
				suspend_test(TEST_CORE)
				
				suspend_ops->enter
				
				syscore_resume
			}
		}
	}
}

Important: the function suspend_test(level) (kernel/power/suspend.c) compares level with pm_test_level. If they match, it calls mdelay(5000) and aborts the suspend process. This lets us abort the suspend/freeze process at the following stages:

suspend_test(TEST_FREEZER)
suspend_test(TEST_DEVICES)
suspend_test(TEST_PLATFORM)
suspend_test(TEST_CPUS)
suspend_test(TEST_CORE)

This logic works only if CONFIG_PM_DEBUG is enabled.
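
Reading /sys/power/pm_test is a quick way to check that this interface is present at all. The sketch below assumes the usual file format, where the active level is shown in square brackets (for example "[none] core processors platform devices freezer"). The function name current_pm_test and its optional file argument are hypothetical helpers for illustration.

```shell
# current_pm_test [FILE]: print the active pm_test level (the bracketed word).
# FILE defaults to /sys/power/pm_test.
current_pm_test() {
    file="${1:-/sys/power/pm_test}"
    if [ ! -r "$file" ]; then
        echo "pm_test not available (is CONFIG_PM_DEBUG enabled?)" >&2
        return 1
    fi
    # The active level is the word wrapped in [ ].
    sed -n 's/.*\[\([a-z]*\)].*/\1/p' "$file"
}
```

On a kernel built without CONFIG_PM_DEBUG the file should be missing, so the function prints the hint and returns a non-zero status.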

Magic

PM_TRACE is supported on the x86 architecture. The main idea is to preserve a few values across a reboot; on x86 the data is saved in the RTC. PM_TRACE is not implemented for ARM.

Tests

  • keep the serial console enabled during suspend: no_console_suspend=1 (on the kernel cmdline)
  • PM_DEBUG and PM_TRACE must be enabled

It looks like the AC100 fails to resume from the devices test level.

cat /sys/power/pm_test
# lists the available test levels
echo freezer > /sys/power/pm_test
echo mem > /sys/power/state
# OK
echo devices > /sys/power/pm_test
echo mem > /sys/power/state
# Fail (usually; it wakes up about 1 time in 7)
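
The manual steps above can be wrapped in a loop that walks the test levels from the shallowest to the deepest. This is only a sketch: the function name run_pm_tests and its directory argument are hypothetical, and it needs to be run as root on a kernel with CONFIG_PM_DEBUG. If the machine hangs on resume, the last "testing level" line on the serial console shows which level failed.

```shell
# run_pm_tests [DIR]: step through the pm_test levels, shallowest first.
# DIR defaults to /sys/power; the argument exists only so the loop can be
# dry-run against an ordinary directory.
run_pm_tests() {
    dir="${1:-/sys/power}"
    for level in freezer devices platform processors core; do
        echo "testing level: $level"
        sync    # flush disks in case the machine hangs on resume
        echo "$level" > "$dir/pm_test" || return 1
        if echo mem > "$dir/state"; then
            echo "$level: OK"
        else
            echo "$level: suspend returned an error"
        fi
    done
    echo none > "$dir/pm_test"    # restore normal suspend behaviour
}
```

Since the AC100 only fails some of the time, it may be worth calling run_pm_tests several times in a row before trusting an "OK".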

Plan

  • set up tftpboot (for easier boot process between dev kernels)
  • check devices suspend/resume on latest NVEC patches for mainline
  • check suspend/resume with disabled PM for nvec
  • check suspend/resume with disabled PM for i2c-2 (where nvec is placed)
  • try to implement trace logic for suspend