Operating Performance Points (OPP) Library
==========================================

(C) 2009-2010 Nishanth Menon <nm@ti.com>, Texas Instruments Incorporated

Contents
--------
1. Introduction
2. Initial OPP List Registration
3. OPP Search Functions
4. OPP Availability Control Functions
5. OPP Data Retrieval Functions
6. Cpufreq Table Generation
7. Data Structures

1. Introduction
===============
1.1 What is an Operating Performance Point (OPP)?

Today's complex SoCs consist of multiple sub-modules working in conjunction.
In an operational system executing varied use cases, not all modules in the SoC
need to function at their highest performing frequency all the time. To
facilitate this, sub-modules in a SoC are grouped into domains, allowing some
domains to run at lower voltage and frequency while other domains run at
voltage/frequency pairs that are higher.

The discrete tuples of frequency and voltage that the device supports per
domain are called Operating Performance Points, or OPPs.

As an example:
Let us consider an MPU device which supports the following:
{300MHz at minimum voltage of 1V}, {800MHz at minimum voltage of 1.2V},
{1GHz at minimum voltage of 1.3V}

We can represent these as three OPPs as the following {Hz, uV} tuples:
{300000000, 1000000}
{800000000, 1200000}
{1000000000, 1300000}
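
The tuples above map naturally onto a small table. The following is a minimal
user-space sketch of that idea, not the library's internal representation: the
example_opp type and the mpu_opps table are hypothetical names used only for
illustration.

```c
#include <stddef.h>

/* Hypothetical illustration type; the real OPP structure is internal to the library. */
struct example_opp {
	unsigned long freq_hz;	/* frequency in Hz */
	unsigned long volt_uv;	/* minimum voltage in uV */
};

/* The three MPU OPPs from the text, in increasing frequency order. */
static const struct example_opp mpu_opps[] = {
	{  300000000UL, 1000000UL },	/* 300MHz at 1V   */
	{  800000000UL, 1200000UL },	/* 800MHz at 1.2V */
	{ 1000000000UL, 1300000UL },	/* 1GHz at 1.3V   */
};

/* Number of entries in the table. */
static size_t mpu_opp_count(void)
{
	return sizeof(mpu_opps) / sizeof(mpu_opps[0]);
}
```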

1.2 Operating Performance Points Library

The OPP library provides a set of helper functions to organize and query the OPP
information. The library is located in drivers/base/power/opp.c and the header
is located in include/linux/pm_opp.h. The OPP library can be enabled by enabling
CONFIG_PM_OPP from the power management menuconfig menu. The OPP library depends
on CONFIG_PM, as certain SoC frameworks, such as Texas Instruments' OMAP
framework, can optionally boot at a certain OPP without needing cpufreq.

Typical usage of the OPP library is as follows:
(users)		-> registers a set of default OPPs		-> (library)
SoC framework	-> modifies on required cases certain OPPs	-> OPP layer
		-> queries to search/retrieve information	->

Architectures that provide a SoC framework for OPP should select ARCH_HAS_OPP
to make the OPP layer available.

The OPP layer expects each domain to be represented by a unique device pointer.
The SoC framework registers a set of initial OPPs per device with the OPP layer.
This list is expected to be small, typically around 5 OPPs per device. It
contains the OPPs that the framework expects to be safely enabled by default in
the system.

Note on OPP Availability:
------------------------
As the system proceeds to operate, the SoC framework may choose to make certain
OPPs available or unavailable on each device based on various external factors.
Example usage: in thermal management or other exceptional situations, the SoC
framework might disable a higher frequency OPP to continue operating safely
until that OPP can be re-enabled.

The OPP library facilitates this concept in its implementation. The following
operational functions operate only on available opps:
dev_pm_opp_find_freq_{ceil, floor}, dev_pm_opp_get_voltage, dev_pm_opp_get_freq,
dev_pm_opp_get_opp_count and dev_pm_opp_init_cpufreq_table

dev_pm_opp_find_freq_exact is meant to be used to find the opp pointer which can then
be used for dev_pm_opp_enable/disable functions to make an opp available as required.

WARNING: Users of the OPP library should refresh their availability count using
dev_pm_opp_get_opp_count if the dev_pm_opp_enable/disable functions are invoked
for a device. The exact mechanism to trigger these, and the mechanism to notify
other dependent subsystems such as cpufreq, are left to the discretion of the
SoC specific framework which uses the OPP library. Similar care needs to be
taken to refresh the cpufreq table after these operations.

WARNING on OPP List locking mechanism:
--------------------------------------
The OPP library uses RCU for exclusivity. RCU allows the query functions to
operate in multiple contexts, and this synchronization mechanism is optimal for
the read intensive operations on data structures that the OPP library caters to.

To ensure that the data retrieved are sane, users such as the SoC framework
should ensure that the section of code operating on OPP queries is locked
using RCU read locks. The dev_pm_opp_find_freq_{exact,ceil,floor} and
dev_pm_opp_get_{voltage, freq, opp_count} functions fall into this category.

dev_pm_opp_{add,enable,disable} are updaters which use a mutex and implement
their own RCU locking mechanisms. dev_pm_opp_init_cpufreq_table acts as an
updater and uses a mutex to implement the RCU updater strategy. These functions
should *NOT* be called under RCU locks or in other contexts that prevent
blocking functions in RCU or mutex operations from working.

2. Initial OPP List Registration
================================
The SoC implementation calls the dev_pm_opp_add function iteratively to add OPPs
per device. It is expected that the SoC framework will register a small, optimal
set of OPP entries, typically fewer than 5. The list generated by registering
the OPPs is maintained by the OPP library throughout the device operation. The
SoC framework can subsequently control the availability of the OPPs dynamically
using the dev_pm_opp_enable / disable functions.

dev_pm_opp_add - Add a new OPP for a specific domain represented by the device pointer.
	The OPP is defined using the frequency and voltage. Once added, the OPP
	is assumed to be available and control of its availability can be done
	with the dev_pm_opp_enable/disable functions. The OPP library internally
	stores and manages this information in the opp struct. This function may
	be used by the SoC framework to define an optimal list as per the demands of
	SoC usage environment.

	WARNING: Do not use this function in interrupt context.

	Example:
	 soc_pm_init()
	 {
		/* Do things */
		r = dev_pm_opp_add(mpu_dev, 1000000, 900000);
		if (r) {
			pr_err("%s: unable to register mpu opp(%d)\n", __func__, r);
			goto no_cpufreq;
		}
		/* Do cpufreq things */
	 no_cpufreq:
		/* Do remaining things */
	 }

3. OPP Search Functions
=======================
High-level frameworks such as cpufreq operate on frequencies. To map the
frequency back to the corresponding OPP, the OPP library provides handy
functions to search the OPP list that it internally manages. These search
functions return the matching pointer representing the opp if a match is
found, else they return an error. These errors are expected to be handled by
standard error checks such as IS_ERR(), with appropriate actions taken by the
caller.

dev_pm_opp_find_freq_exact - Search for an OPP based on an *exact* frequency and
	availability. This function is especially useful to enable an OPP which
	is not available by default.
	Example: When the SoC framework detects a situation where a higher
	frequency could be made available, it can use this function to find
	the OPP before calling dev_pm_opp_enable to actually make it available.
	 rcu_read_lock();
	 opp = dev_pm_opp_find_freq_exact(dev, 1000000000, false);
	 rcu_read_unlock();
	 /* don't operate on the pointer.. just do a sanity check.. */
	 if (IS_ERR(opp)) {
		pr_err("frequency not disabled!\n");
		/* trigger appropriate actions.. */
	 } else {
		dev_pm_opp_enable(dev, 1000000000);
	 }

	NOTE: This is the only search function that operates on OPPs which are
	not available.

dev_pm_opp_find_freq_floor - Search for an available OPP which is *at most* the
	provided frequency. This function is useful while searching for a lesser
	match OR operating on OPP information in the order of decreasing
	frequency.
	Example: To find the highest opp for a device:
	 freq = ULONG_MAX;
	 rcu_read_lock();
	 dev_pm_opp_find_freq_floor(dev, &freq);
	 rcu_read_unlock();

dev_pm_opp_find_freq_ceil - Search for an available OPP which is *at least* the
	provided frequency. This function is useful while searching for a
	higher match OR operating on OPP information in the order of increasing
	frequency.
	Example 1: To find the lowest opp for a device:
	 freq = 0;
	 rcu_read_lock();
	 dev_pm_opp_find_freq_ceil(dev, &freq);
	 rcu_read_unlock();
	Example 2: A simplified implementation of a SoC cpufreq_driver->target:
	 soc_cpufreq_target(..)
	 {
		/* Do stuff like policy checks etc. */
		/* Find the best frequency match for the req */
		rcu_read_lock();
		opp = dev_pm_opp_find_freq_ceil(dev, &freq);
		rcu_read_unlock();
		if (!IS_ERR(opp)) {
			soc_switch_to_freq_voltage(freq);
		} else {
			/* do something when we can't satisfy the req */
		}
		/* do other stuff */
	 }
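
The difference between the three search functions can be pictured with a small
user-space sketch. It is not the library's implementation: the example_opp type,
the table and the example_find_* helpers are hypothetical, they return NULL where
the kernel functions return an error pointer, and they take the frequency by
value whereas the kernel floor and ceil functions also update the frequency
passed in by pointer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an OPP entry; not the kernel's struct dev_pm_opp. */
struct example_opp {
	unsigned long freq_hz;
	bool available;
};

/* Table kept sorted by increasing frequency; 1GHz starts out disabled. */
static struct example_opp table[] = {
	{  300000000UL, true  },
	{  800000000UL, true  },
	{ 1000000000UL, false },
};
#define TABLE_LEN (sizeof(table) / sizeof(table[0]))

/* Lowest available OPP whose frequency is at least freq, or NULL. */
static struct example_opp *example_find_ceil(unsigned long freq)
{
	size_t i;

	for (i = 0; i < TABLE_LEN; i++)
		if (table[i].available && table[i].freq_hz >= freq)
			return &table[i];
	return NULL;
}

/* Highest available OPP whose frequency is at most freq, or NULL. */
static struct example_opp *example_find_floor(unsigned long freq)
{
	size_t i;

	for (i = TABLE_LEN; i-- > 0; )
		if (table[i].available && table[i].freq_hz <= freq)
			return &table[i];
	return NULL;
}

/* OPP whose frequency and availability both match exactly, or NULL. */
static struct example_opp *example_find_exact(unsigned long freq, bool available)
{
	size_t i;

	for (i = 0; i < TABLE_LEN; i++)
		if (table[i].freq_hz == freq && table[i].available == available)
			return &table[i];
	return NULL;
}
```

In this sketch only example_find_exact can return the disabled 1GHz entry, which
mirrors the note above that the exact search is the only one that operates on
OPPs which are not available.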

4. OPP Availability Control Functions
=====================================
A default OPP list registered with the OPP library may not cater to all possible
situations. The OPP library provides a set of functions to modify the
availability of an OPP within the OPP list. This allows SoC frameworks to have
fine-grained dynamic control of which sets of OPPs are operationally available.
These functions are intended to *temporarily* remove an OPP in conditions such
as thermal considerations (e.g. don't use OPPx until the temperature drops).

WARNING: Do not use these functions in interrupt context.

dev_pm_opp_enable - Make an OPP available for operation.
	Example: Let's say that the 1GHz OPP is to be made available only if the
	SoC temperature is lower than a certain threshold. The SoC framework
	implementation might choose to do something as follows:
	 if (cur_temp < temp_low_thresh) {
		/* Enable 1GHz if it was disabled */
		rcu_read_lock();
		opp = dev_pm_opp_find_freq_exact(dev, 1000000000, false);
		rcu_read_unlock();
		/* just error check */
		if (!IS_ERR(opp))
			ret = dev_pm_opp_enable(dev, 1000000000);
		else
			goto try_something_else;
	 }

dev_pm_opp_disable - Make an OPP unavailable for operation.
	Example: Let's say that the 1GHz OPP is to be disabled if the temperature
	exceeds a threshold value. The SoC framework implementation might
	choose to do something as follows:
	 if (cur_temp > temp_high_thresh) {
		/* Disable 1GHz if it was enabled */
		rcu_read_lock();
		opp = dev_pm_opp_find_freq_exact(dev, 1000000000, true);
		rcu_read_unlock();
		/* just error check */
		if (!IS_ERR(opp))
			ret = dev_pm_opp_disable(dev, 1000000000);
		else
			goto try_something_else;
	 }
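
The two examples above use separate low and high thresholds, which gives the
policy some hysteresis. A minimal user-space sketch of the combined policy is
shown below. The threshold values, the opp_1ghz_available flag and the
example_thermal_update helper are all hypothetical; a real SoC framework would
call dev_pm_opp_disable and dev_pm_opp_enable where the flag is changed.

```c
#include <stdbool.h>

/* Hypothetical thresholds in millidegrees Celsius; real values are SoC specific. */
#define TEMP_HIGH_THRESH 95000
#define TEMP_LOW_THRESH  85000

/* Stands in for the availability flag of the 1GHz OPP. */
static bool opp_1ghz_available = true;

/*
 * Disable the 1GHz OPP above the high threshold and re-enable it only once
 * the temperature has dropped below the low threshold. Returns true if the
 * availability changed, in which case the OPP count and any cpufreq table
 * derived from it would need to be refreshed.
 */
static bool example_thermal_update(int cur_temp)
{
	if (cur_temp > TEMP_HIGH_THRESH && opp_1ghz_available) {
		opp_1ghz_available = false;	/* dev_pm_opp_disable() in a real driver */
		return true;
	}
	if (cur_temp < TEMP_LOW_THRESH && !opp_1ghz_available) {
		opp_1ghz_available = true;	/* dev_pm_opp_enable() in a real driver */
		return true;
	}
	return false;
}
```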

5. OPP Data Retrieval Functions
===============================
Since OPP library abstracts away the OPP information, a set of functions to pull
information from the OPP structure is necessary. Once an OPP pointer is
retrieved using the search functions, the following functions can be used by SoC
framework to retrieve the information represented inside the OPP layer.

dev_pm_opp_get_voltage - Retrieve the voltage represented by the opp pointer.
	Example: At a cpufreq transition to a different frequency, the SoC
	framework needs to set the voltage represented by the OPP, using
	the regulator framework, on the Power Management chip providing the
	voltage.
	 soc_switch_to_freq_voltage(freq)
	 {
		/* do things */
		rcu_read_lock();
		opp = dev_pm_opp_find_freq_ceil(dev, &freq);
		v = dev_pm_opp_get_voltage(opp);
		rcu_read_unlock();
		if (v)
			regulator_set_voltage(.., v);
		/* do other things */
	 }

dev_pm_opp_get_freq - Retrieve the freq represented by the opp pointer.
	Example: Let's say the SoC framework uses a couple of helper functions.
	We could pass opp pointers to them instead of passing several
	additional data parameters.
	 soc_cpufreq_target(..)
	 {
		/* do things.. */
		max_freq = ULONG_MAX;
		rcu_read_lock();
		max_opp = dev_pm_opp_find_freq_floor(dev, &max_freq);
		requested_opp = dev_pm_opp_find_freq_ceil(dev, &freq);
		if (!IS_ERR(max_opp) && !IS_ERR(requested_opp))
			r = soc_test_validity(max_opp, requested_opp);
		rcu_read_unlock();
		/* do other things */
	 }
	 soc_test_validity(..)
	 {
		if (dev_pm_opp_get_voltage(max_opp) < dev_pm_opp_get_voltage(requested_opp))
			return -EINVAL;
		if (dev_pm_opp_get_freq(max_opp) < dev_pm_opp_get_freq(requested_opp))
			return -EINVAL;
		/* do things.. */
	 }

dev_pm_opp_get_opp_count - Retrieve the number of available opps for a device
	Example: Let's say a co-processor in the SoC needs to know the available
	frequencies in a table. The main processor can notify it as follows:
	 soc_notify_coproc_available_frequencies()
	 {
		/* Do things */
		rcu_read_lock();
		num_available = dev_pm_opp_get_opp_count(dev);
		/* must not sleep inside the RCU read-side section */
		speeds = kzalloc(sizeof(u32) * num_available, GFP_ATOMIC);
		/* populate the table in increasing order */
		freq = 0;
		i = 0;
		while (speeds && i < num_available &&
		       !IS_ERR(opp = dev_pm_opp_find_freq_ceil(dev, &freq))) {
			speeds[i] = freq;
			freq++;
			i++;
		}
		rcu_read_unlock();

		soc_notify_coproc(AVAILABLE_FREQs, speeds, num_available);
		/* Do other things */
	 }
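
The loop in the example above relies on a ceil search rounding the frequency up
to the matching OPP, so incrementing the frequency by one afterwards makes the
next search land on the next higher OPP. A minimal user-space sketch of that
idiom follows. The table, example_ceil and example_list_available are
hypothetical names and not part of the OPP library.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an OPP entry. */
struct example_opp {
	unsigned long freq_hz;
	bool available;
};

/* Sorted by increasing frequency; the 600MHz entry is disabled. */
static const struct example_opp table[] = {
	{  300000000UL, true  },
	{  600000000UL, false },
	{  800000000UL, true  },
	{ 1000000000UL, true  },
};
#define TABLE_LEN (sizeof(table) / sizeof(table[0]))

/* Rounds *freq up to the next available frequency; returns false if none. */
static bool example_ceil(unsigned long *freq)
{
	size_t i;

	for (i = 0; i < TABLE_LEN; i++) {
		if (table[i].available && table[i].freq_hz >= *freq) {
			*freq = table[i].freq_hz;
			return true;
		}
	}
	return false;
}

/* Fills speeds[] (capacity max) in increasing order; returns entries written. */
static size_t example_list_available(unsigned long *speeds, size_t max)
{
	unsigned long freq = 0;
	size_t n = 0;

	while (n < max && example_ceil(&freq)) {
		speeds[n++] = freq;
		freq++;	/* step past the match so the next search moves on */
	}
	return n;
}
```

Bounding the loop by the capacity of the destination array guards against the
number of available OPPs changing between the count and the walk.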

6. Cpufreq Table Generation
===========================
dev_pm_opp_init_cpufreq_table - The cpufreq framework is typically initialized
	with cpufreq_frequency_table_cpuinfo, which is provided with the list of
	frequencies that are available for operation. This function provides
	a ready-to-use conversion routine to translate the OPP layer's internal
	information about the available frequencies into a format that can be
	passed directly to cpufreq.

	WARNING: Do not use this function in interrupt context.

	Example:
	 soc_pm_init()
	 {
		/* Do things */
		r = dev_pm_opp_init_cpufreq_table(dev, &freq_table);
		if (!r)
			cpufreq_frequency_table_cpuinfo(policy, freq_table);
		/* Do other things */
	 }

	NOTE: This function is available only if CONFIG_CPU_FREQ is enabled in
	addition to CONFIG_PM as power management feature is required to
	dynamically scale voltage and frequency in a system.

dev_pm_opp_free_cpufreq_table - Free up the table allocated by dev_pm_opp_init_cpufreq_table
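
The kind of conversion involved can be pictured with a user-space sketch. It
assumes that the generated table lists only the available OPPs, expresses
frequencies in kHz as cpufreq does, and ends with a terminator entry. The
example_* names and the EXAMPLE_TABLE_END marker are hypothetical and are not
the kernel's definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define EXAMPLE_TABLE_END (~0u)	/* hypothetical end-of-table marker */

/* Hypothetical stand-in for an OPP entry. */
struct example_opp {
	unsigned long freq_hz;
	bool available;
};

/* Hypothetical table entry: a position and a frequency in kHz. */
struct example_freq_entry {
	unsigned int index;
	unsigned int freq_khz;
};

/*
 * Build a terminated table holding only the available OPPs, in kHz.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static struct example_freq_entry *
example_build_table(const struct example_opp *opps, size_t n)
{
	struct example_freq_entry *t;
	size_t i, out = 0;

	t = calloc(n + 1, sizeof(*t));	/* +1 for the terminator */
	if (!t)
		return NULL;
	for (i = 0; i < n; i++) {
		if (!opps[i].available)
			continue;
		t[out].index = (unsigned int)out;
		t[out].freq_khz = (unsigned int)(opps[i].freq_hz / 1000);
		out++;
	}
	t[out].index = (unsigned int)out;
	t[out].freq_khz = EXAMPLE_TABLE_END;
	return t;
}
```

Because the table is a snapshot of the OPPs available when it was built, it has
to be regenerated after any enable or disable operation, as the availability
warning in the introduction points out.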

7. Data Structures
==================
Typically an SoC contains multiple voltage domains which are variable. Each
domain is represented by a device pointer. The relationship to OPP can be
represented as follows:
SoC
 |- device 1
 |	|- opp 1 (availability, freq, voltage)
 |	|- opp 2 ..
 ...	...
 |	`- opp n ..
 |- device 2
 ...
 `- device m

The OPP library maintains an internal list that the SoC framework populates and
that is accessed by the various functions described above. However, the
structures representing the actual OPPs and domains are internal to the OPP
library itself to allow for suitable abstraction reusable across systems.

struct dev_pm_opp - The internal data structure of OPP library which is used to
	represent an OPP. In addition to the freq, voltage, availability
	information, it also contains internal bookkeeping information required
	for the OPP library to operate. A pointer to this structure is
	provided back to users such as the SoC framework to be used as an
	identifier for the OPP in interactions with the OPP layer.

	WARNING: The struct dev_pm_opp pointer should not be parsed or modified by the
	users. The defaults for an instance are populated by dev_pm_opp_add, but the
	availability of the OPP can be modified by dev_pm_opp_enable/disable functions.

struct device - This is used to identify a domain to the OPP layer. The
	nature of the device and its implementation is left to the user of
	OPP library such as the SoC framework.

Overall, in a simplistic view, the data structure operations are represented as
follows:

Initialization / modification:
                   +-----+        /- dev_pm_opp_enable
dev_pm_opp_add --> | opp | <-------
  |                +-----+        \- dev_pm_opp_disable
  \-------> domain_info(device)

Search functions:
             /-- dev_pm_opp_find_freq_ceil  ---\   +-----+
domain_info<---- dev_pm_opp_find_freq_exact -----> | opp |
             \-- dev_pm_opp_find_freq_floor ---/   +-----+

Retrieval functions:
+-----+     /- dev_pm_opp_get_voltage
| opp | <---
+-----+     \- dev_pm_opp_get_freq

domain_info <- dev_pm_opp_get_opp_count