summaryrefslogtreecommitdiffstats
path: root/Documentation/driver-api
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/driver-api')
-rw-r--r--Documentation/driver-api/80211/cfg80211.rst3
-rw-r--r--Documentation/driver-api/device-io.rst201
-rw-r--r--Documentation/driver-api/device_link.rst18
-rw-r--r--Documentation/driver-api/dma-buf.rst92
-rw-r--r--Documentation/driver-api/firmware/built-in-fw.rst38
-rw-r--r--Documentation/driver-api/firmware/core.rst16
-rw-r--r--Documentation/driver-api/firmware/direct-fs-lookup.rst30
-rw-r--r--Documentation/driver-api/firmware/fallback-mechanisms.rst195
-rw-r--r--Documentation/driver-api/firmware/firmware_cache.rst51
-rw-r--r--Documentation/driver-api/firmware/fw_search_path.rst26
-rw-r--r--Documentation/driver-api/firmware/index.rst16
-rw-r--r--Documentation/driver-api/firmware/introduction.rst27
-rw-r--r--Documentation/driver-api/firmware/lookup-order.rst18
-rw-r--r--Documentation/driver-api/firmware/request_firmware.rst56
-rw-r--r--Documentation/driver-api/iio/buffers.rst125
-rw-r--r--Documentation/driver-api/iio/core.rst182
-rw-r--r--Documentation/driver-api/iio/index.rst17
-rw-r--r--Documentation/driver-api/iio/intro.rst33
-rw-r--r--Documentation/driver-api/iio/triggered-buffers.rst69
-rw-r--r--Documentation/driver-api/iio/triggers.rst80
-rw-r--r--Documentation/driver-api/index.rst6
-rw-r--r--Documentation/driver-api/infrastructure.rst15
-rw-r--r--Documentation/driver-api/pm/conf.py10
-rw-r--r--Documentation/driver-api/pm/devices.rst736
-rw-r--r--Documentation/driver-api/pm/index.rst16
-rw-r--r--Documentation/driver-api/pm/notifiers.rst70
-rw-r--r--Documentation/driver-api/pm/types.rst5
-rw-r--r--Documentation/driver-api/regulator.rst170
-rw-r--r--Documentation/driver-api/uio-howto.rst705
29 files changed, 3003 insertions, 23 deletions
diff --git a/Documentation/driver-api/80211/cfg80211.rst b/Documentation/driver-api/80211/cfg80211.rst
index b1e149ea6fee..eca534ab6172 100644
--- a/Documentation/driver-api/80211/cfg80211.rst
+++ b/Documentation/driver-api/80211/cfg80211.rst
@@ -45,6 +45,9 @@ Device registration
:functions: wiphy_new
.. kernel-doc:: include/net/cfg80211.h
+ :functions: wiphy_read_of_freq_limits
+
+.. kernel-doc:: include/net/cfg80211.h
:functions: wiphy_register
.. kernel-doc:: include/net/cfg80211.h
diff --git a/Documentation/driver-api/device-io.rst b/Documentation/driver-api/device-io.rst
new file mode 100644
index 000000000000..b00b23903078
--- /dev/null
+++ b/Documentation/driver-api/device-io.rst
@@ -0,0 +1,201 @@
+.. Copyright 2001 Matthew Wilcox
+..
+.. This documentation is free software; you can redistribute
+.. it and/or modify it under the terms of the GNU General Public
+.. License as published by the Free Software Foundation; either
+.. version 2 of the License, or (at your option) any later
+.. version.
+
+===============================
+Bus-Independent Device Accesses
+===============================
+
+:Author: Matthew Wilcox
+:Author: Alan Cox
+
+Introduction
+============
+
+Linux provides an API which abstracts performing IO across all busses
+and devices, allowing device drivers to be written independently of bus
+type.
+
+Memory Mapped IO
+================
+
+Getting Access to the Device
+----------------------------
+
+The most widely supported form of IO is memory mapped IO. That is, a
+part of the CPU's address space is interpreted not as accesses to
+memory, but as accesses to a device. Some architectures define devices
+to be at a fixed address, but most have some method of discovering
+devices. The PCI bus walk is a good example of such a scheme. This
+document does not cover how to receive such an address, but assumes you
+are starting with one. Physical addresses are of type unsigned long.
+
+This address should not be used directly. Instead, to get an address
+suitable for passing to the accessor functions described below, you
+should call :c:func:`ioremap()`. An address suitable for accessing
+the device will be returned to you.
+
+After you've finished using the device (say, in your module's exit
+routine), call :c:func:`iounmap()` in order to return the address
+space to the kernel. Most architectures allocate new address space each
+time you call :c:func:`ioremap()`, and they can run out unless you
+call :c:func:`iounmap()`.
+
+Accessing the device
+--------------------
+
+The part of the interface most used by drivers is reading and writing
+memory-mapped registers on the device. Linux provides interfaces to read
+and write 8-bit, 16-bit, 32-bit and 64-bit quantities. Due to a
+historical accident, these are named byte, word, long and quad accesses.
+Both read and write accesses are supported; there is no prefetch support
+at this time.
+
+The functions are named readb(), readw(), readl(), readq(),
+readb_relaxed(), readw_relaxed(), readl_relaxed(), readq_relaxed(),
+writeb(), writew(), writel() and writeq().
+
+Some devices (such as framebuffers) would like to use larger transfers than
+8 bytes at a time. For these devices, the :c:func:`memcpy_toio()`,
+:c:func:`memcpy_fromio()` and :c:func:`memset_io()` functions are
+provided. Do not use memset or memcpy on IO addresses; they are not
+guaranteed to copy data in order.
+
+The read and write functions are defined to be ordered. That is the
+compiler is not permitted to reorder the I/O sequence. When the ordering
+can be compiler optimised, you can use __readb() and friends to
+indicate the relaxed ordering. Use this with care.
+
+While the basic functions are defined to be synchronous with respect to
+each other and ordered with respect to each other the busses the devices
+sit on may themselves have asynchronicity. In particular many authors
+are burned by the fact that PCI bus writes are posted asynchronously. A
+driver author must issue a read from the same device to ensure that
+writes have occurred in the specific cases the author cares. This kind
+of property cannot be hidden from driver writers in the API. In some
+cases, the read used to flush the device may be expected to fail (if the
+card is resetting, for example). In that case, the read should be done
+from config space, which is guaranteed to soft-fail if the card doesn't
+respond.
+
+The following is an example of flushing a write to a device when the
+driver would like to ensure the write's effects are visible prior to
+continuing execution::
+
+ static inline void
+ qla1280_disable_intrs(struct scsi_qla_host *ha)
+ {
+ struct device_reg *reg;
+
+ reg = ha->iobase;
+ /* disable risc and host interrupts */
+ WRT_REG_WORD(&reg->ictrl, 0);
+ /*
+ * The following read will ensure that the above write
+ * has been received by the device before we return from this
+ * function.
+ */
+ RD_REG_WORD(&reg->ictrl);
+ ha->flags.ints_enabled = 0;
+ }
+
+In addition to write posting, on some large multiprocessing systems
+(e.g. SGI Challenge, Origin and Altix machines) posted writes won't be
+strongly ordered coming from different CPUs. Thus it's important to
+properly protect parts of your driver that do memory-mapped writes with
+locks and use the :c:func:`mmiowb()` to make sure they arrive in the
+order intended. Issuing a regular readX() will also ensure write ordering,
+but should only be used when the
+driver has to be sure that the write has actually arrived at the device
+(not that it's simply ordered with respect to other writes), since a
+full readX() is a relatively expensive operation.
+
+Generally, one should use :c:func:`mmiowb()` prior to releasing a spinlock
+that protects regions using :c:func:`writeb()` or similar functions that
+aren't surrounded by readb() calls, which will ensure ordering
+and flushing. The following pseudocode illustrates what might occur if
+write ordering isn't guaranteed via :c:func:`mmiowb()` or one of the
+readX() functions::
+
+ CPU A: spin_lock_irqsave(&dev_lock, flags)
+ CPU A: ...
+ CPU A: writel(newval, ring_ptr);
+ CPU A: spin_unlock_irqrestore(&dev_lock, flags)
+ ...
+ CPU B: spin_lock_irqsave(&dev_lock, flags)
+ CPU B: writel(newval2, ring_ptr);
+ CPU B: ...
+ CPU B: spin_unlock_irqrestore(&dev_lock, flags)
+
+In the case above, newval2 could be written to ring_ptr before newval.
+Fixing it is easy though::
+
+ CPU A: spin_lock_irqsave(&dev_lock, flags)
+ CPU A: ...
+ CPU A: writel(newval, ring_ptr);
+ CPU A: mmiowb(); /* ensure no other writes beat us to the device */
+ CPU A: spin_unlock_irqrestore(&dev_lock, flags)
+ ...
+ CPU B: spin_lock_irqsave(&dev_lock, flags)
+ CPU B: writel(newval2, ring_ptr);
+ CPU B: ...
+ CPU B: mmiowb();
+ CPU B: spin_unlock_irqrestore(&dev_lock, flags)
+
+See tg3.c for a real world example of how to use :c:func:`mmiowb()`
+
+PCI ordering rules also guarantee that PIO read responses arrive after any
+outstanding DMA writes from that bus, since for some devices the result of
+a readb() call may signal to the driver that a DMA transaction is
+complete. In many cases, however, the driver may want to indicate that the
+next readb() call has no relation to any previous DMA writes
+performed by the device. The driver can use readb_relaxed() for
+these cases, although only some platforms will honor the relaxed
+semantics. Using the relaxed read functions will provide significant
+performance benefits on platforms that support it. The qla2xxx driver
+provides examples of how to use readX_relaxed(). In many cases, a majority
+of the driver's readX() calls can safely be converted to readX_relaxed()
+calls, since only a few will indicate or depend on DMA completion.
+
+Port Space Accesses
+===================
+
+Port Space Explained
+--------------------
+
+Another form of IO commonly supported is Port Space. This is a range of
+addresses separate to the normal memory address space. Access to these
+addresses is generally not as fast as accesses to the memory mapped
+addresses, and it also has a potentially smaller address space.
+
+Unlike memory mapped IO, no preparation is required to access port
+space.
+
+Accessing Port Space
+--------------------
+
+Accesses to this space are provided through a set of functions which
+allow 8-bit, 16-bit and 32-bit accesses; also known as byte, word and
+long. These functions are :c:func:`inb()`, :c:func:`inw()`,
+:c:func:`inl()`, :c:func:`outb()`, :c:func:`outw()` and
+:c:func:`outl()`.
+
+Some variants are provided for these functions. Some devices require
+that accesses to their ports are slowed down. This functionality is
+provided by appending a ``_p`` to the end of the function.
+There are also equivalents to memcpy. The :c:func:`ins()` and
+:c:func:`outs()` functions copy bytes, words or longs to the given
+port.
+
+Public Functions Provided
+=========================
+
+.. kernel-doc:: arch/x86/include/asm/io.h
+ :internal:
+
+.. kernel-doc:: lib/pci_iomap.c
+ :export:
diff --git a/Documentation/driver-api/device_link.rst b/Documentation/driver-api/device_link.rst
index 5f5713448703..70e328e16aad 100644
--- a/Documentation/driver-api/device_link.rst
+++ b/Documentation/driver-api/device_link.rst
@@ -1,3 +1,6 @@
+.. |struct dev_pm_domain| replace:: :c:type:`struct dev_pm_domain <dev_pm_domain>`
+.. |struct generic_pm_domain| replace:: :c:type:`struct generic_pm_domain <generic_pm_domain>`
+
============
Device links
============
@@ -120,12 +123,11 @@ Examples
is the same as if the MMU was the parent of the master device.
The fact that both devices share the same power domain would normally
- suggest usage of a :c:type:`struct dev_pm_domain` or :c:type:`struct
- generic_pm_domain`, however these are not independent devices that
- happen to share a power switch, but rather the MMU device serves the
- busmaster device and is useless without it. A device link creates a
- synthetic hierarchical relationship between the devices and is thus
- more apt.
+ suggest usage of a |struct dev_pm_domain| or |struct generic_pm_domain|,
+ however these are not independent devices that happen to share a power
+ switch, but rather the MMU device serves the busmaster device and is
+ useless without it. A device link creates a synthetic hierarchical
+ relationship between the devices and is thus more apt.
* A Thunderbolt host controller comprises a number of PCIe hotplug ports
and an NHI device to manage the PCIe switch. On resume from system sleep,
@@ -157,7 +159,7 @@ Examples
Alternatives
============
-* A :c:type:`struct dev_pm_domain` can be used to override the bus,
+* A |struct dev_pm_domain| can be used to override the bus,
class or device type callbacks. It is intended for devices sharing
a single on/off switch, however it does not guarantee a specific
suspend/resume ordering, this needs to be implemented separately.
@@ -166,7 +168,7 @@ Alternatives
suspended. Furthermore it cannot be used to enforce a specific shutdown
ordering or a driver presence dependency.
-* A :c:type:`struct generic_pm_domain` is a lot more heavyweight than a
+* A |struct generic_pm_domain| is a lot more heavyweight than a
device link and does not allow for shutdown ordering or driver presence
dependencies. It also cannot be used on ACPI systems.
diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst
index a9b457a4b949..31671b469627 100644
--- a/Documentation/driver-api/dma-buf.rst
+++ b/Documentation/driver-api/dma-buf.rst
@@ -17,6 +17,98 @@ shared or exclusive fence(s) associated with the buffer.
Shared DMA Buffers
------------------
+This document serves as a guide to device-driver writers on what is the dma-buf
+buffer sharing API, how to use it for exporting and using shared buffers.
+
+Any device driver which wishes to be a part of DMA buffer sharing, can do so as
+either the 'exporter' of buffers, or the 'user' or 'importer' of buffers.
+
+Say a driver A wants to use buffers created by driver B, then we call B as the
+exporter, and A as buffer-user/importer.
+
+The exporter
+
+ - implements and manages operations in :c:type:`struct dma_buf_ops
+ <dma_buf_ops>` for the buffer,
+ - allows other users to share the buffer by using dma_buf sharing APIs,
+ - manages the details of buffer allocation, wrapped int a :c:type:`struct
+ dma_buf <dma_buf>`,
+ - decides about the actual backing storage where this allocation happens,
+ - and takes care of any migration of scatterlist - for all (shared) users of
+ this buffer.
+
+The buffer-user
+
+ - is one of (many) sharing users of the buffer.
+ - doesn't need to worry about how the buffer is allocated, or where.
+ - and needs a mechanism to get access to the scatterlist that makes up this
+ buffer in memory, mapped into its own address space, so it can access the
+ same area of memory. This interface is provided by :c:type:`struct
+ dma_buf_attachment <dma_buf_attachment>`.
+
+Any exporters or users of the dma-buf buffer sharing framework must have a
+'select DMA_SHARED_BUFFER' in their respective Kconfigs.
+
+Userspace Interface Notes
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Mostly a DMA buffer file descriptor is simply an opaque object for userspace,
+and hence the generic interface exposed is very minimal. There's a few things to
+consider though:
+
+- Since kernel 3.12 the dma-buf FD supports the llseek system call, but only
+ with offset=0 and whence=SEEK_END|SEEK_SET. SEEK_SET is supported to allow
+ the usual size discover pattern size = SEEK_END(0); SEEK_SET(0). Every other
+ llseek operation will report -EINVAL.
+
+ If llseek on dma-buf FDs isn't support the kernel will report -ESPIPE for all
+ cases. Userspace can use this to detect support for discovering the dma-buf
+ size using llseek.
+
+- In order to avoid fd leaks on exec, the FD_CLOEXEC flag must be set
+ on the file descriptor. This is not just a resource leak, but a
+ potential security hole. It could give the newly exec'd application
+ access to buffers, via the leaked fd, to which it should otherwise
+ not be permitted access.
+
+ The problem with doing this via a separate fcntl() call, versus doing it
+ atomically when the fd is created, is that this is inherently racy in a
+ multi-threaded app[3]. The issue is made worse when it is library code
+ opening/creating the file descriptor, as the application may not even be
+ aware of the fd's.
+
+ To avoid this problem, userspace must have a way to request O_CLOEXEC
+ flag be set when the dma-buf fd is created. So any API provided by
+ the exporting driver to create a dmabuf fd must provide a way to let
+ userspace control setting of O_CLOEXEC flag passed in to dma_buf_fd().
+
+- Memory mapping the contents of the DMA buffer is also supported. See the
+ discussion below on `CPU Access to DMA Buffer Objects`_ for the full details.
+
+- The DMA buffer FD is also pollable, see `Fence Poll Support`_ below for
+ details.
+
+Basic Operation and Device DMA Access
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. kernel-doc:: drivers/dma-buf/dma-buf.c
+ :doc: dma buf device access
+
+CPU Access to DMA Buffer Objects
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. kernel-doc:: drivers/dma-buf/dma-buf.c
+ :doc: cpu access
+
+Fence Poll Support
+~~~~~~~~~~~~~~~~~~
+
+.. kernel-doc:: drivers/dma-buf/dma-buf.c
+ :doc: fence polling
+
+Kernel Functions and Structures Reference
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
.. kernel-doc:: drivers/dma-buf/dma-buf.c
:export:
diff --git a/Documentation/driver-api/firmware/built-in-fw.rst b/Documentation/driver-api/firmware/built-in-fw.rst
new file mode 100644
index 000000000000..7300e66857f8
--- /dev/null
+++ b/Documentation/driver-api/firmware/built-in-fw.rst
@@ -0,0 +1,38 @@
+=================
+Built-in firmware
+=================
+
+Firmware can be built-in to the kernel, this means building the firmware
+into vmlinux directly, to enable avoiding having to look for firmware from
+the filesystem. Instead, firmware can be looked for inside the kernel
+directly. You can enable built-in firmware using the kernel configuration
+options:
+
+ * CONFIG_EXTRA_FIRMWARE
+ * CONFIG_EXTRA_FIRMWARE_DIR
+
+This should not be confused with CONFIG_FIRMWARE_IN_KERNEL, this is for drivers
+which enables firmware to be built as part of the kernel build process. This
+option, CONFIG_FIRMWARE_IN_KERNEL, will build all firmware for all drivers
+enabled which ship its firmware inside the Linux kernel source tree.
+
+There are a few reasons why you might want to consider building your firmware
+into the kernel with CONFIG_EXTRA_FIRMWARE though:
+
+* Speed
+* Firmware is needed for accessing the boot device, and the user doesn't
+ want to stuff the firmware into the boot initramfs.
+
+Even if you have these needs there are a few reasons why you may not be
+able to make use of built-in firmware:
+
+* Legalese - firmware is non-GPL compatible
+* Some firmware may be optional
+* Firmware upgrades are possible, therefore a new firmware would implicate
+ a complete kernel rebuild.
+* Some firmware files may be really large in size. The remote-proc subsystem
+ is an example subsystem which deals with these sorts of firmware
+* The firmware may need to be scraped out from some device specific location
+ dynamically, an example is calibration data for for some WiFi chipsets. This
+ calibration data can be unique per sold device.
+
diff --git a/Documentation/driver-api/firmware/core.rst b/Documentation/driver-api/firmware/core.rst
new file mode 100644
index 000000000000..1d1688cbc078
--- /dev/null
+++ b/Documentation/driver-api/firmware/core.rst
@@ -0,0 +1,16 @@
+==========================
+Firmware API core features
+==========================
+
+The firmware API has a rich set of core features available. This section
+documents these features.
+
+.. toctree::
+
+ fw_search_path
+ built-in-fw
+ firmware_cache
+ direct-fs-lookup
+ fallback-mechanisms
+ lookup-order
+
diff --git a/Documentation/driver-api/firmware/direct-fs-lookup.rst b/Documentation/driver-api/firmware/direct-fs-lookup.rst
new file mode 100644
index 000000000000..82b4d585a213
--- /dev/null
+++ b/Documentation/driver-api/firmware/direct-fs-lookup.rst
@@ -0,0 +1,30 @@
+========================
+Direct filesystem lookup
+========================
+
+Direct filesystem lookup is the most common form of firmware lookup performed
+by the kernel. The kernel looks for the firmware directly on the root
+filesystem in the paths documented in the section 'Firmware search paths'.
+The filesystem lookup is implemented in fw_get_filesystem_firmware(), it
+uses common core kernel file loader facility kernel_read_file_from_path().
+The max path allowed is PATH_MAX -- currently this is 4096 characters.
+
+It is recommended you keep /lib/firmware paths on your root filesystem,
+avoid having a separate partition for them in order to avoid possible
+races with lookups and avoid uses of the custom fallback mechanisms
+documented below.
+
+Firmware and initramfs
+----------------------
+
+Drivers which are built-in to the kernel should have the firmware integrated
+also as part of the initramfs used to boot the kernel given that otherwise
+a race is possible with loading the driver and the real rootfs not yet being
+available. Stuffing the firmware into initramfs resolves this race issue,
+however note that using initrd does not suffice to address the same race.
+
+There are circumstances that justify not wanting to include firmware into
+initramfs, such as dealing with large firmware firmware files for the
+remote-proc subsystem. For such cases using a userspace fallback mechanism
+is currently the only viable solution as only userspace can know for sure
+when the real rootfs is ready and mounted.
diff --git a/Documentation/driver-api/firmware/fallback-mechanisms.rst b/Documentation/driver-api/firmware/fallback-mechanisms.rst
new file mode 100644
index 000000000000..d19354794e67
--- /dev/null
+++ b/Documentation/driver-api/firmware/fallback-mechanisms.rst
@@ -0,0 +1,195 @@
+===================
+Fallback mechanisms
+===================
+
+A fallback mechanism is supported to allow to overcome failures to do a direct
+filesystem lookup on the root filesystem or when the firmware simply cannot be
+installed for practical reasons on the root filesystem. The kernel
+configuration options related to supporting the firmware fallback mechanism are:
+
+ * CONFIG_FW_LOADER_USER_HELPER: enables building the firmware fallback
+ mechanism. Most distributions enable this option today. If enabled but
+ CONFIG_FW_LOADER_USER_HELPER_FALLBACK is disabled, only the custom fallback
+ mechanism is available and for the request_firmware_nowait() call.
+ * CONFIG_FW_LOADER_USER_HELPER_FALLBACK: force enables each request to
+ enable the kobject uevent fallback mechanism on all firmware API calls
+ except request_firmware_direct(). Most distributions disable this option
+ today. The call request_firmware_nowait() allows for one alternative
+ fallback mechanism: if this kconfig option is enabled and your second
+ argument to request_firmware_nowait(), uevent, is set to false you are
+ informing the kernel that you have a custom fallback mechanism and it will
+ manually load the firmware. Read below for more details.
+
+Note that this means when having this configuration:
+
+CONFIG_FW_LOADER_USER_HELPER=y
+CONFIG_FW_LOADER_USER_HELPER_FALLBACK=n
+
+the kobject uevent fallback mechanism will never take effect even
+for request_firmware_nowait() when uevent is set to true.
+
+Justifying the firmware fallback mechanism
+==========================================
+
+Direct filesystem lookups may fail for a variety of reasons. Known reasons for
+this are worth itemizing and documenting as it justifies the need for the
+fallback mechanism:
+
+* Race against access with the root filesystem upon bootup.
+
+* Races upon resume from suspend. This is resolved by the firmware cache, but
+ the firmware cache is only supported if you use uevents, and its not
+ supported for request_firmware_into_buf().
+
+* Firmware is not accessible through typical means:
+ * It cannot be installed into the root filesystem
+ * The firmware provides very unique device specific data tailored for
+ the unit gathered with local information. An example is calibration
+ data for WiFi chipsets for mobile devices. This calibration data is
+ not common to all units, but tailored per unit. Such information may
+ be installed on a separate flash partition other than where the root
+ filesystem is provided.
+
+Types of fallback mechanisms
+============================
+
+There are really two fallback mechanisms available using one shared sysfs
+interface as a loading facility:
+
+* Kobject uevent fallback mechanism
+* Custom fallback mechanism
+
+First lets document the shared sysfs loading facility.
+
+Firmware sysfs loading facility
+===============================
+
+In order to help device drivers upload firmware using a fallback mechanism
+the firmware infrastructure creates a sysfs interface to enable userspace
+to load and indicate when firmware is ready. The sysfs directory is created
+via fw_create_instance(). This call creates a new struct device named after
+the firmware requested, and establishes it in the device hierarchy by
+associating the device used to make the request as the device's parent.
+The sysfs directory's file attributes are defined and controlled through
+the new device's class (firmare_class) and group (fw_dev_attr_groups).
+This is actually where the original firmware_class.c file name comes from,
+as originally the only firmware loading mechanism available was the
+mechanism we now use as a fallback mechanism.
+
+To load firmware using the sysfs interface we expose a loading indicator,
+and a file upload firmware into:
+
+ * /sys/$DEVPATH/loading
+ * /sys/$DEVPATH/data
+
+To upload firmware you will echo 1 onto the loading file to indicate
+you are loading firmware. You then cat the firmware into the data file,
+and you notify the kernel the firmware is ready by echo'ing 0 onto
+the loading file.
+
+The firmware device used to help load firmware using sysfs is only created if
+direct firmware loading fails and if the fallback mechanism is enabled for your
+firmware request, this is set up with fw_load_from_user_helper(). It is
+important to re-iterate that no device is created if a direct filesystem lookup
+succeeded.
+
+Using::
+
+ echo 1 > /sys/$DEVPATH/loading
+
+Will clean any previous partial load at once and make the firmware API
+return an error. When loading firmware the firmware_class grows a buffer
+for the firmware in PAGE_SIZE increments to hold the image as it comes in.
+
+firmware_data_read() and firmware_loading_show() are just provided for the
+test_firmware driver for testing, they are not called in normal use or
+expected to be used regularly by userspace.
+
+Firmware kobject uevent fallback mechanism
+==========================================
+
+Since a device is created for the sysfs interface to help load firmware as a
+fallback mechanism userspace can be informed of the addition of the device by
+relying on kobject uevents. The addition of the device into the device
+hierarchy means the fallback mechanism for firmware loading has been initiated.
+For details of implementation refer to _request_firmware_load(), in particular
+on the use of dev_set_uevent_suppress() and kobject_uevent().
+
+The kernel's kobject uevent mechanism is implemented in lib/kobject_uevent.c,
+it issues uevents to userspace. As a supplement to kobject uevents Linux
+distributions could also enable CONFIG_UEVENT_HELPER_PATH, which makes use of
+core kernel's usermode helper (UMH) functionality to call out to a userspace
+helper for kobject uevents. In practice though no standard distribution has
+ever used the CONFIG_UEVENT_HELPER_PATH. If CONFIG_UEVENT_HELPER_PATH is
+enabled this binary would be called each time kobject_uevent_env() gets called
+in the kernel for each kobject uevent triggered.
+
+Different implementations have been supported in userspace to take advantage of
+this fallback mechanism. When firmware loading was only possible using the
+sysfs mechanism the userspace component "hotplug" provided the functionality of
+monitoring for kobject events. Historically this was superseded be systemd's
+udev, however firmware loading support was removed from udev as of systemd
+commit be2ea723b1d0 ("udev: remove userspace firmware loading support")
+as of v217 on August, 2014. This means most Linux distributions today are
+not using or taking advantage of the firmware fallback mechanism provided
+by kobject uevents. This is specially exacerbated due to the fact that most
+distributions today disable CONFIG_FW_LOADER_USER_HELPER_FALLBACK.
+
+Refer to do_firmware_uevent() for details of the kobject event variables
+setup. Variables passwdd with a kobject add event:
+
+* FIRMWARE=firmware name
+* TIMEOUT=timeout value
+* ASYNC=whether or not the API request was asynchronous
+
+By default DEVPATH is set by the internal kernel kobject infrastructure.
+Below is an example simple kobject uevent script::
+
+ # Both $DEVPATH and $FIRMWARE are already provided in the environment.
+ MY_FW_DIR=/lib/firmware/
+ echo 1 > /sys/$DEVPATH/loading
+ cat $MY_FW_DIR/$FIRMWARE > /sys/$DEVPATH/data
+ echo 0 > /sys/$DEVPATH/loading
+
+Firmware custom fallback mechanism
+==================================
+
+Users of the request_firmware_nowait() call have yet another option available
+at their disposal: rely on the sysfs fallback mechanism but request that no
+kobject uevents be issued to userspace. The original logic behind this
+was that utilities other than udev might be required to lookup firmware
+in non-traditional paths -- paths outside of the listing documented in the
+section 'Direct filesystem lookup'. This option is not available to any of
+the other API calls as uevents are always forced for them.
+
+Since uevents are only meaningful if the fallback mechanism is enabled
+in your kernel it would seem odd to enable uevents with kernels that do not
+have the fallback mechanism enabled in their kernels. Unfortunately we also
+rely on the uevent flag which can be disabled by request_firmware_nowait() to
+also setup the firmware cache for firmware requests. As documented above,
+the firmware cache is only set up if uevent is enabled for an API call.
+Although this can disable the firmware cache for request_firmware_nowait()
+calls, users of this API should not use it for the purposes of disabling
+the cache as that was not the original purpose of the flag. Not setting
+the uevent flag means you want to opt-in for the firmware fallback mechanism
+but you want to suppress kobject uevents, as you have a custom solution which
+will monitor for your device addition into the device hierarchy somehow and
+load firmware for you through a custom path.
+
+Firmware fallback timeout
+=========================
+
+The firmware fallback mechanism has a timeout. If firmware is not loaded
+onto the sysfs interface by the timeout value an error is sent to the
+driver. By default the timeout is set to 60 seconds if uevents are
+desirable, otherwise MAX_JIFFY_OFFSET is used (max timeout possible).
+The logic behind using MAX_JIFFY_OFFSET for non-uevents is that a custom
+solution will have as much time as it needs to load firmware.
+
+You can customize the firmware timeout by echo'ing your desired timeout into
+the following file:
+
+* /sys/class/firmware/timeout
+
+If you echo 0 into it means MAX_JIFFY_OFFSET will be used. The data type
+for the timeout is an int.
diff --git a/Documentation/driver-api/firmware/firmware_cache.rst b/Documentation/driver-api/firmware/firmware_cache.rst
new file mode 100644
index 000000000000..2210e5bfb332
--- /dev/null
+++ b/Documentation/driver-api/firmware/firmware_cache.rst
@@ -0,0 +1,51 @@
+==============
+Firmware cache
+==============
+
+When Linux resumes from suspend some device drivers require firmware lookups to
+re-initialize devices. During resume there may be a period of time during which
+firmware lookups are not possible, during this short period of time firmware
+requests will fail. Time is of essence though, and delaying drivers to wait for
+the root filesystem for firmware delays user experience with device
+functionality. In order to support these requirements the firmware
+infrastructure implements a firmware cache for device drivers for most API
+calls, automatically behind the scenes.
+
+The firmware cache makes using certain firmware API calls safe during a device
+driver's suspend and resume callback. Users of these API calls needn't cache
+the firmware by themselves for dealing with firmware loss during system resume.
+
+The firmware cache works by requesting for firmware prior to suspend and
+caching it in memory. Upon resume device drivers using the firmware API will
+have access to the firmware immediately, without having to wait for the root
+filesystem to mount or dealing with possible race issues with lookups as the
+root filesystem mounts.
+
+Some implementation details about the firmware cache setup:
+
+* The firmware cache is setup by adding a devres entry for each device that
+ uses all synchronous call except :c:func:`request_firmware_into_buf`.
+
+* If an asynchronous call is used the firmware cache is only set up for a
+ device if if the second argument (uevent) to request_firmware_nowait() is
+ true. When uevent is true it requests that a kobject uevent be sent to
+ userspace for the firmware request. For details refer to the Fackback
+ mechanism documented below.
+
+* If the firmware cache is determined to be needed as per the above two
+ criteria the firmware cache is setup by adding a devres entry for the
+ device making the firmware request.
+
+* The firmware devres entry is maintained throughout the lifetime of the
+ device. This means that even if you release_firmware() the firmware cache
+ will still be used on resume from suspend.
+
+* The timeout for the fallback mechanism is temporarily reduced to 10 seconds
+ as the firmware cache is set up during suspend, the timeout is set back to
+ the old value you had configured after the cache is set up.
+
+* Upon suspend any pending non-uevent firmware requests are killed to avoid
+ stalling the kernel, this is done with kill_requests_without_uevent(). Kernel
+ calls requiring the non-uevent therefore need to implement their own firmware
+ cache mechanism but must not use the firmware API on suspend.
+
diff --git a/Documentation/driver-api/firmware/fw_search_path.rst b/Documentation/driver-api/firmware/fw_search_path.rst
new file mode 100644
index 000000000000..a360f1009fa3
--- /dev/null
+++ b/Documentation/driver-api/firmware/fw_search_path.rst
@@ -0,0 +1,26 @@
+=====================
+Firmware search paths
+=====================
+
+The following search paths are used to look for firmware on your
+root filesystem.
+
+* fw_path_para - module parameter - default is empty so this is ignored
+* /lib/firmware/updates/UTS_RELEASE/
+* /lib/firmware/updates/
+* /lib/firmware/UTS_RELEASE/
+* /lib/firmware/
+
+The module parameter ''path'' can be passed to the firmware_class module
+to activate the first optional custom fw_path_para. The custom path can
+only be up to 256 characters long. The kernel parameter passed would be:
+
+* 'firmware_class.path=$CUSTOMIZED_PATH'
+
+There is an alternative to customize the path at run time after bootup, you
+can use the file:
+
+* /sys/module/firmware_class/parameters/path
+
+You would echo into it your custom path and firmware requested will be
+searched for there first.
diff --git a/Documentation/driver-api/firmware/index.rst b/Documentation/driver-api/firmware/index.rst
new file mode 100644
index 000000000000..1abe01793031
--- /dev/null
+++ b/Documentation/driver-api/firmware/index.rst
@@ -0,0 +1,16 @@
+==================
+Linux Firmware API
+==================
+
+.. toctree::
+
+ introduction
+ core
+ request_firmware
+
+.. only:: subproject and html
+
+ Indices
+ =======
+
+ * :ref:`genindex`
diff --git a/Documentation/driver-api/firmware/introduction.rst b/Documentation/driver-api/firmware/introduction.rst
new file mode 100644
index 000000000000..211cb44eb972
--- /dev/null
+++ b/Documentation/driver-api/firmware/introduction.rst
@@ -0,0 +1,27 @@
+============
+Introduction
+============
+
+The firmware API enables kernel code to request files required
+for functionality from userspace, the uses vary:
+
+* Microcode for CPU errata
+* Device driver firmware, required to be loaded onto device
+ microcontrollers
+* Device driver information data (calibration data, EEPROM overrides),
+ some of which can be completely optional.
+
+Types of firmware requests
+==========================
+
+There are two types of calls:
+
+* Synchronous
+* Asynchronous
+
+Which one you use vary depending on your requirements, the rule of thumb
+however is you should strive to use the asynchronous APIs unless you also
+are already using asynchronous initialization mechanisms which will not
+stall or delay boot. Even if loading firmware does not take a lot of time
+processing firmware might, and this can still delay boot or initialization,
+as such mechanisms such as asynchronous probe can help supplement drivers.
diff --git a/Documentation/driver-api/firmware/lookup-order.rst b/Documentation/driver-api/firmware/lookup-order.rst
new file mode 100644
index 000000000000..88c81739683c
--- /dev/null
+++ b/Documentation/driver-api/firmware/lookup-order.rst
@@ -0,0 +1,18 @@
+=====================
+Firmware lookup order
+=====================
+
+Different functionality is available to enable firmware to be found.
+Below is chronological order of how firmware will be looked for once
+a driver issues a firmware API call.
+
+* The ''Built-in firmware'' is checked first, if the firmware is present we
+ return it immediately
+* The ''Firmware cache'' is looked at next. If the firmware is found we
+ return it immediately
+* The ''Direct filesystem lookup'' is performed next, if found we
+ return it immediately
+* If no firmware has been found and the fallback mechanism was enabled
+ the sysfs interface is created. After this either a kobject uevent
+ is issued or the custom firmware loading is relied upon for firmware
+ loading up to the timeout value.
diff --git a/Documentation/driver-api/firmware/request_firmware.rst b/Documentation/driver-api/firmware/request_firmware.rst
new file mode 100644
index 000000000000..cc0aea880824
--- /dev/null
+++ b/Documentation/driver-api/firmware/request_firmware.rst
@@ -0,0 +1,56 @@
+====================
+request_firmware API
+====================
+
+You would typically load firmware and then load it into your device somehow.
+The typical firmware work flow is reflected below::
+
+ if(request_firmware(&fw_entry, $FIRMWARE, device) == 0)
+ copy_fw_to_device(fw_entry->data, fw_entry->size);
+ release_firmware(fw_entry);
+
+Synchronous firmware requests
+=============================
+
+Synchronous firmware requests will wait until the firmware is found or until
+an error is returned.
+
+request_firmware
+----------------
+.. kernel-doc:: drivers/base/firmware_class.c
+ :functions: request_firmware
+
+request_firmware_direct
+-----------------------
+.. kernel-doc:: drivers/base/firmware_class.c
+ :functions: request_firmware_direct
+
+request_firmware_into_buf
+-------------------------
+.. kernel-doc:: drivers/base/firmware_class.c
+ :functions: request_firmware_into_buf
+
+Asynchronous firmware requests
+==============================
+
+Asynchronous firmware requests allow driver code to not have to wait
+until the firmware or an error is returned. Function callbacks are
+provided so that when the firmware or an error is found the driver is
+informed through the callback. request_firmware_nowait() cannot be called
+in atomic contexts.
+
+request_firmware_nowait
+-----------------------
+.. kernel-doc:: drivers/base/firmware_class.c
+ :functions: request_firmware_nowait
+
+request firmware API expected driver use
+========================================
+
+Once an API call returns you process the firmware and then release the
+firmware. For example if you used request_firmware() and it returns,
+the driver has the firmware image accessible in fw_entry->{data,size}.
+If something went wrong request_firmware() returns non-zero and fw_entry
+is set to NULL. Once your driver is done with processing the firmware it
+can call call release_firmware(fw_entry) to release the firmware image
+and any related resource.
diff --git a/Documentation/driver-api/iio/buffers.rst b/Documentation/driver-api/iio/buffers.rst
new file mode 100644
index 000000000000..02c99a6bee18
--- /dev/null
+++ b/Documentation/driver-api/iio/buffers.rst
@@ -0,0 +1,125 @@
+=======
+Buffers
+=======
+
+* struct :c:type:`iio_buffer` — general buffer structure
+* :c:func:`iio_validate_scan_mask_onehot` — Validates that exactly one channel
+ is selected
+* :c:func:`iio_buffer_get` — Grab a reference to the buffer
+* :c:func:`iio_buffer_put` — Release the reference to the buffer
+
+The Industrial I/O core offers a way for continuous data capture based on a
+trigger source. Multiple data channels can be read at once from
+:file:`/dev/iio:device{X}` character device node, thus reducing the CPU load.
+
+IIO buffer sysfs interface
+==========================
+An IIO buffer has an associated attributes directory under
+:file:`/sys/bus/iio/iio:device{X}/buffer/*`. Here are some of the existing
+attributes:
+
+* :file:`length`, the total number of data samples (capacity) that can be
+ stored by the buffer.
+* :file:`enable`, activate buffer capture.
+
+IIO buffer setup
+================
+
+The meta information associated with a channel reading placed in a buffer is
+called a scan element . The important bits configuring scan elements are
+exposed to userspace applications via the
+:file:`/sys/bus/iio/iio:device{X}/scan_elements/*` directory. This file contains
+attributes of the following form:
+
+* :file:`enable`, used for enabling a channel. If and only if its attribute
+ is non *zero*, then a triggered capture will contain data samples for this
+ channel.
+* :file:`type`, description of the scan element data storage within the buffer
+ and hence the form in which it is read from user space.
+ Format is [be|le]:[s|u]bits/storagebitsXrepeat[>>shift] .
+ * *be* or *le*, specifies big or little endian.
+ * *s* or *u*, specifies if signed (2's complement) or unsigned.
+ * *bits*, is the number of valid data bits.
+ * *storagebits*, is the number of bits (after padding) that it occupies in the
+ buffer.
+ * *shift*, if specified, is the shift that needs to be applied prior to
+ masking out unused bits.
+ * *repeat*, specifies the number of bits/storagebits repetitions. When the
+ repeat element is 0 or 1, then the repeat value is omitted.
+
+For example, a driver for a 3-axis accelerometer with 12 bit resolution where
+data is stored in two 8-bits registers as follows::
+
+ 7 6 5 4 3 2 1 0
+ +---+---+---+---+---+---+---+---+
+ |D3 |D2 |D1 |D0 | X | X | X | X | (LOW byte, address 0x06)
+ +---+---+---+---+---+---+---+---+
+
+ 7 6 5 4 3 2 1 0
+ +---+---+---+---+---+---+---+---+
+ |D11|D10|D9 |D8 |D7 |D6 |D5 |D4 | (HIGH byte, address 0x07)
+ +---+---+---+---+---+---+---+---+
+
+will have the following scan element type for each axis::
+
+ $ cat /sys/bus/iio/devices/iio:device0/scan_elements/in_accel_y_type
+ le:s12/16>>4
+
+A user space application will interpret data samples read from the buffer as
+two byte little endian signed data, that needs a 4 bits right shift before
+masking out the 12 valid bits of data.
+
+For implementing buffer support a driver should initialize the following
+fields in iio_chan_spec definition::
+
+ struct iio_chan_spec {
+ /* other members */
+ int scan_index
+ struct {
+ char sign;
+ u8 realbits;
+ u8 storagebits;
+ u8 shift;
+ u8 repeat;
+ enum iio_endian endianness;
+ } scan_type;
+ };
+
+The driver implementing the accelerometer described above will have the
+following channel definition::
+
+ struct struct iio_chan_spec accel_channels[] = {
+ {
+ .type = IIO_ACCEL,
+ .modified = 1,
+ .channel2 = IIO_MOD_X,
+ /* other stuff here */
+ .scan_index = 0,
+ .scan_type = {
+ .sign = 's',
+ .realbits = 12,
+ .storagebits = 16,
+ .shift = 4,
+ .endianness = IIO_LE,
+ },
+ }
+ /* similar for Y (with channel2 = IIO_MOD_Y, scan_index = 1)
+ * and Z (with channel2 = IIO_MOD_Z, scan_index = 2) axis
+ */
+ }
+
+Here **scan_index** defines the order in which the enabled channels are placed
+inside the buffer. Channels with a lower **scan_index** will be placed before
+channels with a higher index. Each channel needs to have a unique
+**scan_index**.
+
+Setting **scan_index** to -1 can be used to indicate that the specific channel
+does not support buffered capture. In this case no entries will be created for
+the channel in the scan_elements directory.
+
+More details
+============
+.. kernel-doc:: include/linux/iio/buffer.h
+.. kernel-doc:: drivers/iio/industrialio-buffer.c
+ :export:
+
diff --git a/Documentation/driver-api/iio/core.rst b/Documentation/driver-api/iio/core.rst
new file mode 100644
index 000000000000..9a34ae03b679
--- /dev/null
+++ b/Documentation/driver-api/iio/core.rst
@@ -0,0 +1,182 @@
+=============
+Core elements
+=============
+
+The Industrial I/O core offers a unified framework for writing drivers for
+many different types of embedded sensors. a standard interface to user space
+applications manipulating sensors. The implementation can be found under
+:file:`drivers/iio/industrialio-*`
+
+Industrial I/O Devices
+----------------------
+
+* struct :c:type:`iio_dev` - industrial I/O device
+* :c:func:`iio_device_alloc()` - alocate an :c:type:`iio_dev` from a driver
+* :c:func:`iio_device_free()` - free an :c:type:`iio_dev` from a driver
+* :c:func:`iio_device_register()` - register a device with the IIO subsystem
+* :c:func:`iio_device_unregister()` - unregister a device from the IIO
+ subsystem
+
+An IIO device usually corresponds to a single hardware sensor and it
+provides all the information needed by a driver handling a device.
+Let's first have a look at the functionality embedded in an IIO device
+then we will show how a device driver makes use of an IIO device.
+
+There are two ways for a user space application to interact with an IIO driver.
+
+1. :file:`/sys/bus/iio/iio:device{X}/`, this represents a hardware sensor
+ and groups together the data channels of the same chip.
+2. :file:`/dev/iio:device{X}`, character device node interface used for
+ buffered data transfer and for events information retrieval.
+
+A typical IIO driver will register itself as an :doc:`I2C <../i2c>` or
+:doc:`SPI <../spi>` driver and will create two routines, probe and remove.
+
+At probe:
+
+1. Call :c:func:`iio_device_alloc()`, which allocates memory for an IIO device.
+2. Initialize IIO device fields with driver specific information (e.g.
+ device name, device channels).
+3. Call :c:func:`iio_device_register()`, this registers the device with the
+ IIO core. After this call the device is ready to accept requests from user
+ space applications.
+
+At remove, we free the resources allocated in probe in reverse order:
+
+1. :c:func:`iio_device_unregister()`, unregister the device from the IIO core.
+2. :c:func:`iio_device_free()`, free the memory allocated for the IIO device.
+
+IIO device sysfs interface
+==========================
+
+Attributes are sysfs files used to expose chip info and also allowing
+applications to set various configuration parameters. For device with
+index X, attributes can be found under /sys/bus/iio/iio:deviceX/ directory.
+Common attributes are:
+
+* :file:`name`, description of the physical chip.
+* :file:`dev`, shows the major:minor pair associated with
+ :file:`/dev/iio:deviceX` node.
+* :file:`sampling_frequency_available`, available discrete set of sampling
+ frequency values for device.
+* Available standard attributes for IIO devices are described in the
+ :file:`Documentation/ABI/testing/sysfs-bus-iio` file in the Linux kernel
+ sources.
+
+IIO device channels
+===================
+
+struct :c:type:`iio_chan_spec` - specification of a single channel
+
+An IIO device channel is a representation of a data channel. An IIO device can
+have one or multiple channels. For example:
+
+* a thermometer sensor has one channel representing the temperature measurement.
+* a light sensor with two channels indicating the measurements in the visible
+ and infrared spectrum.
+* an accelerometer can have up to 3 channels representing acceleration on X, Y
+ and Z axes.
+
+An IIO channel is described by the struct :c:type:`iio_chan_spec`.
+A thermometer driver for the temperature sensor in the example above would
+have to describe its channel as follows::
+
+ static const struct iio_chan_spec temp_channel[] = {
+ {
+ .type = IIO_TEMP,
+ .info_mask_separate = BIT(IIO_CHAN_INFO_PROCESSED),
+ },
+ };
+
+Channel sysfs attributes exposed to userspace are specified in the form of
+bitmasks. Depending on their shared info, attributes can be set in one of the
+following masks:
+
+* **info_mask_separate**, attributes will be specific to
+ this channel
+* **info_mask_shared_by_type**, attributes are shared by all channels of the
+ same type
+* **info_mask_shared_by_dir**, attributes are shared by all channels of the same
+ direction
+* **info_mask_shared_by_all**, attributes are shared by all channels
+
+When there are multiple data channels per channel type we have two ways to
+distinguish between them:
+
+* set **.modified** field of :c:type:`iio_chan_spec` to 1. Modifiers are
+ specified using **.channel2** field of the same :c:type:`iio_chan_spec`
+ structure and are used to indicate a physically unique characteristic of the
+ channel such as its direction or spectral response. For example, a light
+ sensor can have two channels, one for infrared light and one for both
+ infrared and visible light.
+* set **.indexed** field of :c:type:`iio_chan_spec` to 1. In this case the
+ channel is simply another instance with an index specified by the **.channel**
+ field.
+
+Here is how we can make use of the channel's modifiers::
+
+ static const struct iio_chan_spec light_channels[] = {
+ {
+ .type = IIO_INTENSITY,
+ .modified = 1,
+ .channel2 = IIO_MOD_LIGHT_IR,
+ .info_mask_separate = BIT(IIO_CHAN_INFO_RAW),
+ .info_mask_shared = BIT(IIO_CHAN_INFO_SAMP_FREQ),
+ },
+ {
+ .type = IIO_INTENSITY,
+ .modified = 1,
+ .channel2 = IIO_MOD_LIGHT_BOTH,
+ .info_mask_separate = BIT(IIO_CHAN_INFO_RAW),
+ .info_mask_shared = BIT(IIO_CHAN_INFO_SAMP_FREQ),
+ },
+ {
+ .type = IIO_LIGHT,
+ .info_mask_separate = BIT(IIO_CHAN_INFO_PROCESSED),
+ .info_mask_shared = BIT(IIO_CHAN_INFO_SAMP_FREQ),
+ },
+ }
+
+This channel's definition will generate two separate sysfs files for raw data
+retrieval:
+
+* :file:`/sys/bus/iio/iio:device{X}/in_intensity_ir_raw`
+* :file:`/sys/bus/iio/iio:device{X}/in_intensity_both_raw`
+
+one file for processed data:
+
+* :file:`/sys/bus/iio/iio:device{X}/in_illuminance_input`
+
+and one shared sysfs file for sampling frequency:
+
+* :file:`/sys/bus/iio/iio:device{X}/sampling_frequency`.
+
+Here is how we can make use of the channel's indexing::
+
+ static const struct iio_chan_spec light_channels[] = {
+ {
+ .type = IIO_VOLTAGE,
+ .indexed = 1,
+ .channel = 0,
+ .info_mask_separate = BIT(IIO_CHAN_INFO_RAW),
+ },
+ {
+ .type = IIO_VOLTAGE,
+ .indexed = 1,
+ .channel = 1,
+ .info_mask_separate = BIT(IIO_CHAN_INFO_RAW),
+ },
+ }
+
+This will generate two separate attributes files for raw data retrieval:
+
+* :file:`/sys/bus/iio/devices/iio:device{X}/in_voltage0_raw`, representing
+ voltage measurement for channel 0.
+* :file:`/sys/bus/iio/devices/iio:device{X}/in_voltage1_raw`, representing
+ voltage measurement for channel 1.
+
+More details
+============
+.. kernel-doc:: include/linux/iio/iio.h
+.. kernel-doc:: drivers/iio/industrialio-core.c
+ :export:
diff --git a/Documentation/driver-api/iio/index.rst b/Documentation/driver-api/iio/index.rst
new file mode 100644
index 000000000000..e5c3922d1b6f
--- /dev/null
+++ b/Documentation/driver-api/iio/index.rst
@@ -0,0 +1,17 @@
+.. include:: <isonum.txt>
+
+Industrial I/O
+==============
+
+**Copyright** |copy| 2015 Intel Corporation
+
+Contents:
+
+.. toctree::
+ :maxdepth: 2
+
+ intro
+ core
+ buffers
+ triggers
+ triggered-buffers
diff --git a/Documentation/driver-api/iio/intro.rst b/Documentation/driver-api/iio/intro.rst
new file mode 100644
index 000000000000..3653fbd57069
--- /dev/null
+++ b/Documentation/driver-api/iio/intro.rst
@@ -0,0 +1,33 @@
+.. include:: <isonum.txt>
+
+============
+Introduction
+============
+
+The main purpose of the Industrial I/O subsystem (IIO) is to provide support
+for devices that in some sense perform either
+analog-to-digital conversion (ADC) or digital-to-analog conversion (DAC)
+or both. The aim is to fill the gap between the somewhat similar hwmon and
+:doc:`input <../input>` subsystems. Hwmon is directed at low sample rate
+sensors used to monitor and control the system itself, like fan speed control
+or temperature measurement. :doc:`Input <../input>` is, as its name suggests,
+focused on human interaction input devices (keyboard, mouse, touchscreen).
+In some cases there is considerable overlap between these and IIO.
+
+Devices that fall into this category include:
+
+* analog to digital converters (ADCs)
+* accelerometers
+* capacitance to digital converters (CDCs)
+* digital to analog converters (DACs)
+* gyroscopes
+* inertial measurement units (IMUs)
+* color and light sensors
+* magnetometers
+* pressure sensors
+* proximity sensors
+* temperature sensors
+
+Usually these sensors are connected via :doc:`SPI <../spi>` or
+:doc:`I2C <../i2c>`. A common use case of the sensors devices is to have
+combined functionality (e.g. light plus proximity sensor).
diff --git a/Documentation/driver-api/iio/triggered-buffers.rst b/Documentation/driver-api/iio/triggered-buffers.rst
new file mode 100644
index 000000000000..0db12660cc90
--- /dev/null
+++ b/Documentation/driver-api/iio/triggered-buffers.rst
@@ -0,0 +1,69 @@
+=================
+Triggered Buffers
+=================
+
+Now that we know what buffers and triggers are let's see how they work together.
+
+IIO triggered buffer setup
+==========================
+
+* :c:func:`iio_triggered_buffer_setup` — Setup triggered buffer and pollfunc
+* :c:func:`iio_triggered_buffer_cleanup` — Free resources allocated by
+ :c:func:`iio_triggered_buffer_setup`
+* struct :c:type:`iio_buffer_setup_ops` — buffer setup related callbacks
+
+A typical triggered buffer setup looks like this::
+
+ const struct iio_buffer_setup_ops sensor_buffer_setup_ops = {
+ .preenable = sensor_buffer_preenable,
+ .postenable = sensor_buffer_postenable,
+ .postdisable = sensor_buffer_postdisable,
+ .predisable = sensor_buffer_predisable,
+ };
+
+ irqreturn_t sensor_iio_pollfunc(int irq, void *p)
+ {
+ pf->timestamp = iio_get_time_ns((struct indio_dev *)p);
+ return IRQ_WAKE_THREAD;
+ }
+
+ irqreturn_t sensor_trigger_handler(int irq, void *p)
+ {
+ u16 buf[8];
+ int i = 0;
+
+ /* read data for each active channel */
+ for_each_set_bit(bit, active_scan_mask, masklength)
+ buf[i++] = sensor_get_data(bit)
+
+ iio_push_to_buffers_with_timestamp(indio_dev, buf, timestamp);
+
+ iio_trigger_notify_done(trigger);
+ return IRQ_HANDLED;
+ }
+
+ /* setup triggered buffer, usually in probe function */
+ iio_triggered_buffer_setup(indio_dev, sensor_iio_polfunc,
+ sensor_trigger_handler,
+ sensor_buffer_setup_ops);
+
+The important things to notice here are:
+
+* :c:type:`iio_buffer_setup_ops`, the buffer setup functions to be called at
+ predefined points in the buffer configuration sequence (e.g. before enable,
+ after disable). If not specified, the IIO core uses the default
+ iio_triggered_buffer_setup_ops.
+* **sensor_iio_pollfunc**, the function that will be used as top half of poll
+ function. It should do as little processing as possible, because it runs in
+ interrupt context. The most common operation is recording of the current
+ timestamp and for this reason one can use the IIO core defined
+ :c:func:`iio_pollfunc_store_time` function.
+* **sensor_trigger_handler**, the function that will be used as bottom half of
+ the poll function. This runs in the context of a kernel thread and all the
+ processing takes place here. It usually reads data from the device and
+ stores it in the internal buffer together with the timestamp recorded in the
+ top half.
+
+More details
+============
+.. kernel-doc:: drivers/iio/buffer/industrialio-triggered-buffer.c
diff --git a/Documentation/driver-api/iio/triggers.rst b/Documentation/driver-api/iio/triggers.rst
new file mode 100644
index 000000000000..f89d37e7dd82
--- /dev/null
+++ b/Documentation/driver-api/iio/triggers.rst
@@ -0,0 +1,80 @@
+========
+Triggers
+========
+
+* struct :c:type:`iio_trigger` — industrial I/O trigger device
+* :c:func:`devm_iio_trigger_alloc` — Resource-managed iio_trigger_alloc
+* :c:func:`devm_iio_trigger_free` — Resource-managed iio_trigger_free
+* :c:func:`devm_iio_trigger_register` — Resource-managed iio_trigger_register
+* :c:func:`devm_iio_trigger_unregister` — Resource-managed
+ iio_trigger_unregister
+* :c:func:`iio_trigger_validate_own_device` — Check if a trigger and IIO
+ device belong to the same device
+
+In many situations it is useful for a driver to be able to capture data based
+on some external event (trigger) as opposed to periodically polling for data.
+An IIO trigger can be provided by a device driver that also has an IIO device
+based on hardware generated events (e.g. data ready or threshold exceeded) or
+provided by a separate driver from an independent interrupt source (e.g. GPIO
+line connected to some external system, timer interrupt or user space writing
+a specific file in sysfs). A trigger may initiate data capture for a number of
+sensors and also it may be completely unrelated to the sensor itself.
+
+IIO trigger sysfs interface
+===========================
+
+There are two locations in sysfs related to triggers:
+
+* :file:`/sys/bus/iio/devices/trigger{Y}/*`, this file is created once an
+ IIO trigger is registered with the IIO core and corresponds to trigger
+ with index Y.
+ Because triggers can be very different depending on type there are few
+ standard attributes that we can describe here:
+
+ * :file:`name`, trigger name that can be later used for association with a
+ device.
+ * :file:`sampling_frequency`, some timer based triggers use this attribute to
+ specify the frequency for trigger calls.
+
+* :file:`/sys/bus/iio/devices/iio:device{X}/trigger/*`, this directory is
+ created once the device supports a triggered buffer. We can associate a
+ trigger with our device by writing the trigger's name in the
+ :file:`current_trigger` file.
+
+IIO trigger setup
+=================
+
+Let's see a simple example of how to setup a trigger to be used by a driver::
+
+ struct iio_trigger_ops trigger_ops = {
+ .set_trigger_state = sample_trigger_state,
+ .validate_device = sample_validate_device,
+ }
+
+ struct iio_trigger *trig;
+
+ /* first, allocate memory for our trigger */
+ trig = iio_trigger_alloc(dev, "trig-%s-%d", name, idx);
+
+ /* setup trigger operations field */
+ trig->ops = &trigger_ops;
+
+ /* now register the trigger with the IIO core */
+ iio_trigger_register(trig);
+
+IIO trigger ops
+===============
+
+* struct :c:type:`iio_trigger_ops` — operations structure for an iio_trigger.
+
+Notice that a trigger has a set of operations attached:
+
+* :file:`set_trigger_state`, switch the trigger on/off on demand.
+* :file:`validate_device`, function to validate the device when the current
+ trigger gets changed.
+
+More details
+============
+.. kernel-doc:: include/linux/iio/trigger.h
+.. kernel-doc:: drivers/iio/industrialio-trigger.c
+ :export:
diff --git a/Documentation/driver-api/index.rst b/Documentation/driver-api/index.rst
index 5475a2807e7a..60db00d1532b 100644
--- a/Documentation/driver-api/index.rst
+++ b/Documentation/driver-api/index.rst
@@ -16,11 +16,15 @@ available subsections can be seen below.
basics
infrastructure
+ pm/index
+ device-io
dma-buf
device_link
message-based
sound
frame-buffer
+ regulator
+ iio/index
input
usb
spi
@@ -30,6 +34,8 @@ available subsections can be seen below.
miscellaneous
vme
80211/index
+ uio-howto
+ firmware/index
.. only:: subproject and html
diff --git a/Documentation/driver-api/infrastructure.rst b/Documentation/driver-api/infrastructure.rst
index 0bb0b5fc9512..6d9ff316b608 100644
--- a/Documentation/driver-api/infrastructure.rst
+++ b/Documentation/driver-api/infrastructure.rst
@@ -55,21 +55,6 @@ Device Drivers DMA Management
.. kernel-doc:: drivers/base/dma-mapping.c
:export:
-Device Drivers Power Management
--------------------------------
-
-.. kernel-doc:: drivers/base/power/main.c
- :export:
-
-Device Drivers ACPI Support
----------------------------
-
-.. kernel-doc:: drivers/acpi/scan.c
- :export:
-
-.. kernel-doc:: drivers/acpi/scan.c
- :internal:
-
Device drivers PnP support
--------------------------
diff --git a/Documentation/driver-api/pm/conf.py b/Documentation/driver-api/pm/conf.py
new file mode 100644
index 000000000000..a89fac11272f
--- /dev/null
+++ b/Documentation/driver-api/pm/conf.py
@@ -0,0 +1,10 @@
+# -*- coding: utf-8; mode: python -*-
+
+project = "Device Power Management"
+
+tags.add("subproject")
+
+latex_documents = [
+ ('index', 'pm.tex', project,
+ 'The kernel development community', 'manual'),
+]
diff --git a/Documentation/driver-api/pm/devices.rst b/Documentation/driver-api/pm/devices.rst
new file mode 100644
index 000000000000..bedd32388dac
--- /dev/null
+++ b/Documentation/driver-api/pm/devices.rst
@@ -0,0 +1,736 @@
+.. |struct dev_pm_ops| replace:: :c:type:`struct dev_pm_ops <dev_pm_ops>`
+.. |struct dev_pm_domain| replace:: :c:type:`struct dev_pm_domain <dev_pm_domain>`
+.. |struct bus_type| replace:: :c:type:`struct bus_type <bus_type>`
+.. |struct device_type| replace:: :c:type:`struct device_type <device_type>`
+.. |struct class| replace:: :c:type:`struct class <class>`
+.. |struct wakeup_source| replace:: :c:type:`struct wakeup_source <wakeup_source>`
+.. |struct device| replace:: :c:type:`struct device <device>`
+
+==============================
+Device Power Management Basics
+==============================
+
+::
+
+ Copyright (c) 2010-2011 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
+ Copyright (c) 2010 Alan Stern <stern@rowland.harvard.edu>
+ Copyright (c) 2016 Intel Corp., Rafael J. Wysocki <rafael.j.wysocki@intel.com>
+
+Most of the code in Linux is device drivers, so most of the Linux power
+management (PM) code is also driver-specific. Most drivers will do very
+little; others, especially for platforms with small batteries (like cell
+phones), will do a lot.
+
+This writeup gives an overview of how drivers interact with system-wide
+power management goals, emphasizing the models and interfaces that are
+shared by everything that hooks up to the driver model core. Read it as
+background for the domain-specific work you'd do with any specific driver.
+
+
+Two Models for Device Power Management
+======================================
+
+Drivers will use one or both of these models to put devices into low-power
+states:
+
+ System Sleep model:
+
+ Drivers can enter low-power states as part of entering system-wide
+ low-power states like "suspend" (also known as "suspend-to-RAM"), or
+ (mostly for systems with disks) "hibernation" (also known as
+ "suspend-to-disk").
+
+ This is something that device, bus, and class drivers collaborate on
+ by implementing various role-specific suspend and resume methods to
+ cleanly power down hardware and software subsystems, then reactivate
+ them without loss of data.
+
+ Some drivers can manage hardware wakeup events, which make the system
+ leave the low-power state. This feature may be enabled or disabled
+ using the relevant :file:`/sys/devices/.../power/wakeup` file (for
+ Ethernet drivers the ioctl interface used by ethtool may also be used
+ for this purpose); enabling it may cost some power usage, but let the
+ whole system enter low-power states more often.
+
+ Runtime Power Management model:
+
+ Devices may also be put into low-power states while the system is
+ running, independently of other power management activity in principle.
+ However, devices are not generally independent of each other (for
+ example, a parent device cannot be suspended unless all of its child
+ devices have been suspended). Moreover, depending on the bus type the
+ device is on, it may be necessary to carry out some bus-specific
+ operations on the device for this purpose. Devices put into low power
+ states at run time may require special handling during system-wide power
+ transitions (suspend or hibernation).
+
+ For these reasons not only the device driver itself, but also the
+ appropriate subsystem (bus type, device type or device class) driver and
+ the PM core are involved in runtime power management. As in the system
+ sleep power management case, they need to collaborate by implementing
+ various role-specific suspend and resume methods, so that the hardware
+ is cleanly powered down and reactivated without data or service loss.
+
+There's not a lot to be said about those low-power states except that they are
+very system-specific, and often device-specific. Also, that if enough devices
+have been put into low-power states (at runtime), the effect may be very similar
+to entering some system-wide low-power state (system sleep) ... and that
+synergies exist, so that several drivers using runtime PM might put the system
+into a state where even deeper power saving options are available.
+
+Most suspended devices will have quiesced all I/O: no more DMA or IRQs (except
+for wakeup events), no more data read or written, and requests from upstream
+drivers are no longer accepted. A given bus or platform may have different
+requirements though.
+
+Examples of hardware wakeup events include an alarm from a real time clock,
+network wake-on-LAN packets, keyboard or mouse activity, and media insertion
+or removal (for PCMCIA, MMC/SD, USB, and so on).
+
+Interfaces for Entering System Sleep States
+===========================================
+
+There are programming interfaces provided for subsystems (bus type, device type,
+device class) and device drivers to allow them to participate in the power
+management of devices they are concerned with. These interfaces cover both
+system sleep and runtime power management.
+
+
+Device Power Management Operations
+----------------------------------
+
+Device power management operations, at the subsystem level as well as at the
+device driver level, are implemented by defining and populating objects of type
+|struct dev_pm_ops| defined in :file:`include/linux/pm.h`. The roles of the
+methods included in it will be explained in what follows. For now, it should be
+sufficient to remember that the last three methods are specific to runtime power
+management while the remaining ones are used during system-wide power
+transitions.
+
+There also is a deprecated "old" or "legacy" interface for power management
+operations available at least for some subsystems. This approach does not use
+|struct dev_pm_ops| objects and it is suitable only for implementing system
+sleep power management methods in a limited way. Therefore it is not described
+in this document, so please refer directly to the source code for more
+information about it.
+
+
+Subsystem-Level Methods
+-----------------------
+
+The core methods to suspend and resume devices reside in
+|struct dev_pm_ops| pointed to by the :c:member:`ops` member of
+|struct dev_pm_domain|, or by the :c:member:`pm` member of |struct bus_type|,
+|struct device_type| and |struct class|. They are mostly of interest to the
+people writing infrastructure for platforms and buses, like PCI or USB, or
+device type and device class drivers. They also are relevant to the writers of
+device drivers whose subsystems (PM domains, device types, device classes and
+bus types) don't provide all power management methods.
+
+Bus drivers implement these methods as appropriate for the hardware and the
+drivers using it; PCI works differently from USB, and so on. Not many people
+write subsystem-level drivers; most driver code is a "device driver" that builds
+on top of bus-specific framework code.
+
+For more information on these driver calls, see the description later;
+they are called in phases for every device, respecting the parent-child
+sequencing in the driver model tree.
+
+
+:file:`/sys/devices/.../power/wakeup` files
+-------------------------------------------
+
+All device objects in the driver model contain fields that control the handling
+of system wakeup events (hardware signals that can force the system out of a
+sleep state). These fields are initialized by bus or device driver code using
+:c:func:`device_set_wakeup_capable()` and :c:func:`device_set_wakeup_enable()`,
+defined in :file:`include/linux/pm_wakeup.h`.
+
+The :c:member:`power.can_wakeup` flag just records whether the device (and its
+driver) can physically support wakeup events. The
+:c:func:`device_set_wakeup_capable()` routine affects this flag. The
+:c:member:`power.wakeup` field is a pointer to an object of type
+|struct wakeup_source| used for controlling whether or not the device should use
+its system wakeup mechanism and for notifying the PM core of system wakeup
+events signaled by the device. This object is only present for wakeup-capable
+devices (i.e. devices whose :c:member:`can_wakeup` flags are set) and is created
+(or removed) by :c:func:`device_set_wakeup_capable()`.
+
+Whether or not a device is capable of issuing wakeup events is a hardware
+matter, and the kernel is responsible for keeping track of it. By contrast,
+whether or not a wakeup-capable device should issue wakeup events is a policy
+decision, and it is managed by user space through a sysfs attribute: the
+:file:`power/wakeup` file. User space can write the "enabled" or "disabled"
+strings to it to indicate whether or not, respectively, the device is supposed
+to signal system wakeup. This file is only present if the
+:c:member:`power.wakeup` object exists for the given device and is created (or
+removed) along with that object, by :c:func:`device_set_wakeup_capable()`.
+Reads from the file will return the corresponding string.
+
+The initial value in the :file:`power/wakeup` file is "disabled" for the
+majority of devices; the major exceptions are power buttons, keyboards, and
+Ethernet adapters whose WoL (wake-on-LAN) feature has been set up with ethtool.
+It should also default to "enabled" for devices that don't generate wakeup
+requests on their own but merely forward wakeup requests from one bus to another
+(like PCI Express ports).
+
+The :c:func:`device_may_wakeup()` routine returns true only if the
+:c:member:`power.wakeup` object exists and the corresponding :file:`power/wakeup`
+file contains the "enabled" string. This information is used by subsystems,
+like the PCI bus type code, to see whether or not to enable the devices' wakeup
+mechanisms. If device wakeup mechanisms are enabled or disabled directly by
+drivers, they also should use :c:func:`device_may_wakeup()` to decide what to do
+during a system sleep transition. Device drivers, however, are not expected to
+call :c:func:`device_set_wakeup_enable()` directly in any case.
+
+It ought to be noted that system wakeup is conceptually different from "remote
+wakeup" used by runtime power management, although it may be supported by the
+same physical mechanism. Remote wakeup is a feature allowing devices in
+low-power states to trigger specific interrupts to signal conditions in which
+they should be put into the full-power state. Those interrupts may or may not
+be used to signal system wakeup events, depending on the hardware design. On
+some systems it is impossible to trigger them from system sleep states. In any
+case, remote wakeup should always be enabled for runtime power management for
+all devices and drivers that support it.
+
+
+:file:`/sys/devices/.../power/control` files
+--------------------------------------------
+
+Each device in the driver model has a flag to control whether it is subject to
+runtime power management. This flag, :c:member:`runtime_auto`, is initialized
+by the bus type (or generally subsystem) code using :c:func:`pm_runtime_allow()`
+or :c:func:`pm_runtime_forbid()`; the default is to allow runtime power
+management.
+
+The setting can be adjusted by user space by writing either "on" or "auto" to
+the device's :file:`power/control` sysfs file. Writing "auto" calls
+:c:func:`pm_runtime_allow()`, setting the flag and allowing the device to be
+runtime power-managed by its driver. Writing "on" calls
+:c:func:`pm_runtime_forbid()`, clearing the flag, returning the device to full
+power if it was in a low-power state, and preventing the
+device from being runtime power-managed. User space can check the current value
+of the :c:member:`runtime_auto` flag by reading that file.
+
+The device's :c:member:`runtime_auto` flag has no effect on the handling of
+system-wide power transitions. In particular, the device can (and in the
+majority of cases should and will) be put into a low-power state during a
+system-wide transition to a sleep state even though its :c:member:`runtime_auto`
+flag is clear.
+
+For more information about the runtime power management framework, refer to
+:file:`Documentation/power/runtime_pm.txt`.
+
+
+Calling Drivers to Enter and Leave System Sleep States
+======================================================
+
+When the system goes into a sleep state, each device's driver is asked to
+suspend the device by putting it into a state compatible with the target
+system state. That's usually some version of "off", but the details are
+system-specific. Also, wakeup-enabled devices will usually stay partly
+functional in order to wake the system.
+
+When the system leaves that low-power state, the device's driver is asked to
+resume it by returning it to full power. The suspend and resume operations
+always go together, and both are multi-phase operations.
+
+For simple drivers, suspend might quiesce the device using class code
+and then turn its hardware as "off" as possible during suspend_noirq. The
+matching resume calls would then completely reinitialize the hardware
+before reactivating its class I/O queues.
+
+More power-aware drivers might prepare the devices for triggering system wakeup
+events.
+
+
+Call Sequence Guarantees
+------------------------
+
+To ensure that bridges and similar links needing to talk to a device are
+available when the device is suspended or resumed, the device hierarchy is
+walked in a bottom-up order to suspend devices. A top-down order is
+used to resume those devices.
+
+The ordering of the device hierarchy is defined by the order in which devices
+get registered: a child can never be registered, probed or resumed before
+its parent; and can't be removed or suspended after that parent.
+
+The policy is that the device hierarchy should match hardware bus topology.
+[Or at least the control bus, for devices which use multiple busses.]
+In particular, this means that a device registration may fail if the parent of
+the device is suspending (i.e. has been chosen by the PM core as the next
+device to suspend) or has already suspended, as well as after all of the other
+devices have been suspended. Device drivers must be prepared to cope with such
+situations.
+
+
+System Power Management Phases
+------------------------------
+
+Suspending or resuming the system is done in several phases. Different phases
+are used for suspend-to-idle, shallow (standby), and deep ("suspend-to-RAM")
+sleep states and the hibernation state ("suspend-to-disk"). Each phase involves
+executing callbacks for every device before the next phase begins. Not all
+buses or classes support all these callbacks and not all drivers use all the
+callbacks. The various phases always run after tasks have been frozen and
+before they are unfrozen. Furthermore, the ``*_noirq phases`` run at a time
+when IRQ handlers have been disabled (except for those marked with the
+IRQF_NO_SUSPEND flag).
+
+All phases use PM domain, bus, type, class or driver callbacks (that is, methods
+defined in ``dev->pm_domain->ops``, ``dev->bus->pm``, ``dev->type->pm``,
+``dev->class->pm`` or ``dev->driver->pm``). These callbacks are regarded by the
+PM core as mutually exclusive. Moreover, PM domain callbacks always take
+precedence over all of the other callbacks and, for example, type callbacks take
+precedence over bus, class and driver callbacks. To be precise, the following
+rules are used to determine which callback to execute in the given phase:
+
+ 1. If ``dev->pm_domain`` is present, the PM core will choose the callback
+ provided by ``dev->pm_domain->ops`` for execution.
+
+ 2. Otherwise, if both ``dev->type`` and ``dev->type->pm`` are present, the
+ callback provided by ``dev->type->pm`` will be chosen for execution.
+
+ 3. Otherwise, if both ``dev->class`` and ``dev->class->pm`` are present,
+ the callback provided by ``dev->class->pm`` will be chosen for
+ execution.
+
+ 4. Otherwise, if both ``dev->bus`` and ``dev->bus->pm`` are present, the
+ callback provided by ``dev->bus->pm`` will be chosen for execution.
+
+This allows PM domains and device types to override callbacks provided by bus
+types or device classes if necessary.
+
+The PM domain, type, class and bus callbacks may in turn invoke device- or
+driver-specific methods stored in ``dev->driver->pm``, but they don't have to do
+that.
+
+If the subsystem callback chosen for execution is not present, the PM core will
+execute the corresponding method from the ``dev->driver->pm`` set instead if
+there is one.
+
+
+Entering System Suspend
+-----------------------
+
+When the system goes into the freeze, standby or memory sleep state,
+the phases are: ``prepare``, ``suspend``, ``suspend_late``, ``suspend_noirq``.
+
+ 1. The ``prepare`` phase is meant to prevent races by preventing new
+ devices from being registered; the PM core would never know that all the
+ children of a device had been suspended if new children could be
+ registered at will. [By contrast, from the PM core's perspective,
+ devices may be unregistered at any time.] Unlike the other
+ suspend-related phases, during the ``prepare`` phase the device
+ hierarchy is traversed top-down.
+
+ After the ``->prepare`` callback method returns, no new children may be
+ registered below the device. The method may also prepare the device or
+ driver in some way for the upcoming system power transition, but it
+ should not put the device into a low-power state.
+
+ For devices supporting runtime power management, the return value of the
+ prepare callback can be used to indicate to the PM core that it may
+ safely leave the device in runtime suspend (if runtime-suspended
+ already), provided that all of the device's descendants are also left in
+ runtime suspend. Namely, if the prepare callback returns a positive
+ number and that happens for all of the descendants of the device too,
+ and all of them (including the device itself) are runtime-suspended, the
+ PM core will skip the ``suspend``, ``suspend_late`` and
+ ``suspend_noirq`` phases as well as all of the corresponding phases of
+ the subsequent device resume for all of these devices. In that case,
+ the ``->complete`` callback will be invoked directly after the
+ ``->prepare`` callback and is entirely responsible for putting the
+ device into a consistent state as appropriate.
+
+ Note that this direct-complete procedure applies even if the device is
+ disabled for runtime PM; only the runtime-PM status matters. It follows
+ that if a device has system-sleep callbacks but does not support runtime
+ PM, then its prepare callback must never return a positive value. This
+ is because all such devices are initially set to runtime-suspended with
+ runtime PM disabled.
+
+ 2. The ``->suspend`` methods should quiesce the device to stop it from
+ performing I/O. They also may save the device registers and put it into
+ the appropriate low-power state, depending on the bus type the device is
+ on, and they may enable wakeup events.
+
+ 3. For a number of devices it is convenient to split suspend into the
+ "quiesce device" and "save device state" phases, in which cases
+ ``suspend_late`` is meant to do the latter. It is always executed after
+ runtime power management has been disabled for the device in question.
+
+ 4. The ``suspend_noirq`` phase occurs after IRQ handlers have been disabled,
+ which means that the driver's interrupt handler will not be called while
+ the callback method is running. The ``->suspend_noirq`` methods should
+ save the values of the device's registers that weren't saved previously
+ and finally put the device into the appropriate low-power state.
+
+ The majority of subsystems and device drivers need not implement this
+ callback. However, bus types allowing devices to share interrupt
+ vectors, like PCI, generally need it; otherwise a driver might encounter
+ an error during the suspend phase by fielding a shared interrupt
+ generated by some other device after its own device had been set to low
+ power.
+
+At the end of these phases, drivers should have stopped all I/O transactions
+(DMA, IRQs), saved enough state that they can re-initialize or restore previous
+state (as needed by the hardware), and placed the device into a low-power state.
+On many platforms they will gate off one or more clock sources; sometimes they
+will also switch off power supplies or reduce voltages. [Drivers supporting
+runtime PM may already have performed some or all of these steps.]
+
+If :c:func:`device_may_wakeup(dev)` returns ``true``, the device should be
+prepared for generating hardware wakeup signals to trigger a system wakeup event
+when the system is in the sleep state. For example, :c:func:`enable_irq_wake()`
+might identify GPIO signals hooked up to a switch or other external hardware,
+and :c:func:`pci_enable_wake()` does something similar for the PCI PME signal.
+
+If any of these callbacks returns an error, the system won't enter the desired
+low-power state. Instead, the PM core will unwind its actions by resuming all
+the devices that were suspended.
+
+
+Leaving System Suspend
+----------------------
+
+When resuming from freeze, standby or memory sleep, the phases are:
+``resume_noirq``, ``resume_early``, ``resume``, ``complete``.
+
+ 1. The ``->resume_noirq`` callback methods should perform any actions
+ needed before the driver's interrupt handlers are invoked. This
+ generally means undoing the actions of the ``suspend_noirq`` phase. If
+ the bus type permits devices to share interrupt vectors, like PCI, the
+ method should bring the device and its driver into a state in which the
+ driver can recognize if the device is the source of incoming interrupts,
+ if any, and handle them correctly.
+
+ For example, the PCI bus type's ``->pm.resume_noirq()`` puts the device
+ into the full-power state (D0 in the PCI terminology) and restores the
+ standard configuration registers of the device. Then it calls the
+ device driver's ``->pm.resume_noirq()`` method to perform device-specific
+ actions.
+
+ 2. The ``->resume_early`` methods should prepare devices for the execution
+ of the resume methods. This generally involves undoing the actions of
+ the preceding ``suspend_late`` phase.
+
+ 3. The ``->resume`` methods should bring the device back to its operating
+ state, so that it can perform normal I/O. This generally involves
+ undoing the actions of the ``suspend`` phase.
+
+ 4. The ``complete`` phase should undo the actions of the ``prepare`` phase.
+ For this reason, unlike the other resume-related phases, during the
+ ``complete`` phase the device hierarchy is traversed bottom-up.
+
+ Note, however, that new children may be registered below the device as
+ soon as the ``->resume`` callbacks occur; it's not necessary to wait
+ until the ``complete`` phase with that.
+
+ Moreover, if the preceding ``->prepare`` callback returned a positive
+ number, the device may have been left in runtime suspend throughout the
+ whole system suspend and resume (the ``suspend``, ``suspend_late``,
+ ``suspend_noirq`` phases of system suspend and the ``resume_noirq``,
+ ``resume_early``, ``resume`` phases of system resume may have been
+ skipped for it). In that case, the ``->complete`` callback is entirely
+ responsible for putting the device into a consistent state after system
+ suspend if necessary. [For example, it may need to queue up a runtime
+ resume request for the device for this purpose.] To check if that is
+ the case, the ``->complete`` callback can consult the device's
+ ``power.direct_complete`` flag. Namely, if that flag is set when the
+ ``->complete`` callback is being run, it has been called directly after
+ the preceding ``->prepare`` and special actions may be required
+ to make the device work correctly afterward.
+
+At the end of these phases, drivers should be as functional as they were before
+suspending: I/O can be performed using DMA and IRQs, and the relevant clocks are
+gated on.
+
+However, the details here may again be platform-specific. For example,
+some systems support multiple "run" states, and the mode in effect at
+the end of resume might not be the one which preceded suspension.
+That means availability of certain clocks or power supplies changed,
+which could easily affect how a driver works.
+
+Drivers need to be able to handle hardware which has been reset since all of the
+suspend methods were called, for example by complete reinitialization.
+This may be the hardest part, and the one most protected by NDA'd documents
+and chip errata. It's simplest if the hardware state hasn't changed since
+the suspend was carried out, but that can only be guaranteed if the target
+system sleep entered was suspend-to-idle. For the other system sleep states
+that may not be the case (and usually isn't for ACPI-defined system sleep
+states, like S3).
+
+Drivers must also be prepared to notice that the device has been removed
+while the system was powered down, whenever that's physically possible.
+PCMCIA, MMC, USB, Firewire, SCSI, and even IDE are common examples of busses
+where common Linux platforms will see such removal. Details of how drivers
+will notice and handle such removals are currently bus-specific, and often
+involve a separate thread.
+
+These callbacks may return an error value, but the PM core will ignore such
+errors since there's nothing it can do about them other than printing them in
+the system log.
+
+
+Entering Hibernation
+--------------------
+
+Hibernating the system is more complicated than putting it into sleep states,
+because it involves creating and saving a system image. Therefore there are
+more phases for hibernation, with a different set of callbacks. These phases
+always run after tasks have been frozen and enough memory has been freed.
+
+The general procedure for hibernation is to quiesce all devices ("freeze"),
+create an image of the system memory while everything is stable, reactivate all
+devices ("thaw"), write the image to permanent storage, and finally shut down
+the system ("power off"). The phases used to accomplish this are: ``prepare``,
+``freeze``, ``freeze_late``, ``freeze_noirq``, ``thaw_noirq``, ``thaw_early``,
+``thaw``, ``complete``, ``prepare``, ``poweroff``, ``poweroff_late``,
+``poweroff_noirq``.
+
+ 1. The ``prepare`` phase is discussed in the "Entering System Suspend"
+ section above.
+
+ 2. The ``->freeze`` methods should quiesce the device so that it doesn't
+ generate IRQs or DMA, and they may need to save the values of device
+ registers. However the device does not have to be put in a low-power
+ state, and to save time it's best not to do so. Also, the device should
+ not be prepared to generate wakeup events.
+
+ 3. The ``freeze_late`` phase is analogous to the ``suspend_late`` phase
+ described earlier, except that the device should not be put into a
+ low-power state and should not be allowed to generate wakeup events.
+
+ 4. The ``freeze_noirq`` phase is analogous to the ``suspend_noirq`` phase
+ discussed earlier, except again that the device should not be put into
+ a low-power state and should not be allowed to generate wakeup events.
+
+At this point the system image is created. All devices should be inactive and
+the contents of memory should remain undisturbed while this happens, so that the
+image forms an atomic snapshot of the system state.
+
+ 5. The ``thaw_noirq`` phase is analogous to the ``resume_noirq`` phase
+ discussed earlier. The main difference is that its methods can assume
+ the device is in the same state as at the end of the ``freeze_noirq``
+ phase.
+
+ 6. The ``thaw_early`` phase is analogous to the ``resume_early`` phase
+ described above. Its methods should undo the actions of the preceding
+ ``freeze_late``, if necessary.
+
+ 7. The ``thaw`` phase is analogous to the ``resume`` phase discussed
+ earlier. Its methods should bring the device back to an operating
+ state, so that it can be used for saving the image if necessary.
+
+ 8. The ``complete`` phase is discussed in the "Leaving System Suspend"
+ section above.
+
+At this point the system image is saved, and the devices then need to be
+prepared for the upcoming system shutdown. This is much like suspending them
+before putting the system into the suspend-to-idle, shallow or deep sleep state,
+and the phases are similar.
+
+ 9. The ``prepare`` phase is discussed above.
+
+ 10. The ``poweroff`` phase is analogous to the ``suspend`` phase.
+
+ 11. The ``poweroff_late`` phase is analogous to the ``suspend_late`` phase.
+
+ 12. The ``poweroff_noirq`` phase is analogous to the ``suspend_noirq`` phase.
+
+The ``->poweroff``, ``->poweroff_late`` and ``->poweroff_noirq`` callbacks
+should do essentially the same things as the ``->suspend``, ``->suspend_late``
+and ``->suspend_noirq`` callbacks, respectively. The only notable difference is
+that they need not store the device register values, because the registers
+should already have been stored during the ``freeze``, ``freeze_late`` or
+``freeze_noirq`` phases.
+
+
+Leaving Hibernation
+-------------------
+
+Resuming from hibernation is, again, more complicated than resuming from a sleep
+state in which the contents of main memory are preserved, because it requires
+a system image to be loaded into memory and the pre-hibernation memory contents
+to be restored before control can be passed back to the image kernel.
+
+Although in principle the image might be loaded into memory and the
+pre-hibernation memory contents restored by the boot loader, in practice this
+can't be done because boot loaders aren't smart enough and there is no
+established protocol for passing the necessary information. So instead, the
+boot loader loads a fresh instance of the kernel, called "the restore kernel",
+into memory and passes control to it in the usual way. Then the restore kernel
+reads the system image, restores the pre-hibernation memory contents, and passes
+control to the image kernel. Thus two different kernel instances are involved
+in resuming from hibernation. In fact, the restore kernel may be completely
+different from the image kernel: a different configuration and even a different
+version. This has important consequences for device drivers and their
+subsystems.
+
+To be able to load the system image into memory, the restore kernel needs to
+include at least a subset of device drivers allowing it to access the storage
+medium containing the image, although it doesn't need to include all of the
+drivers present in the image kernel. After the image has been loaded, the
+devices managed by the boot kernel need to be prepared for passing control back
+to the image kernel. This is very similar to the initial steps involved in
+creating a system image, and it is accomplished in the same way, using
+``prepare``, ``freeze``, and ``freeze_noirq`` phases. However, the devices
+affected by these phases are only those having drivers in the restore kernel;
+other devices will still be in whatever state the boot loader left them.
+
+Should the restoration of the pre-hibernation memory contents fail, the restore
+kernel would go through the "thawing" procedure described above, using the
+``thaw_noirq``, ``thaw_early``, ``thaw``, and ``complete`` phases, and then
+continue running normally. This happens only rarely. Most often the
+pre-hibernation memory contents are restored successfully and control is passed
+to the image kernel, which then becomes responsible for bringing the system back
+to the working state.
+
+To achieve this, the image kernel must restore the devices' pre-hibernation
+functionality. The operation is much like waking up from a sleep state (with
+the memory contents preserved), although it involves different phases:
+``restore_noirq``, ``restore_early``, ``restore``, ``complete``.
+
+ 1. The ``restore_noirq`` phase is analogous to the ``resume_noirq`` phase.
+
+ 2. The ``restore_early`` phase is analogous to the ``resume_early`` phase.
+
+ 3. The ``restore`` phase is analogous to the ``resume`` phase.
+
+ 4. The ``complete`` phase is discussed above.
+
+The main difference from ``resume[_early|_noirq]`` is that
+``restore[_early|_noirq]`` must assume the device has been accessed and
+reconfigured by the boot loader or the restore kernel. Consequently, the state
+of the device may be different from the state remembered from the ``freeze``,
+``freeze_late`` and ``freeze_noirq`` phases. The device may even need to be
+reset and completely re-initialized. In many cases this difference doesn't
+matter, so the ``->resume[_early|_noirq]`` and ``->restore[_early|_norq]``
+method pointers can be set to the same routines. Nevertheless, different
+callback pointers are used in case there is a situation where it actually does
+matter.
+
+
+Power Management Notifiers
+==========================
+
+There are some operations that cannot be carried out by the power management
+callbacks discussed above, because the callbacks occur too late or too early.
+To handle these cases, subsystems and device drivers may register power
+management notifiers that are called before tasks are frozen and after they have
+been thawed. Generally speaking, the PM notifiers are suitable for performing
+actions that either require user space to be available, or at least won't
+interfere with user space.
+
+For details refer to :doc:`notifiers`.
+
+
+Device Low-Power (suspend) States
+=================================
+
+Device low-power states aren't standard. One device might only handle
+"on" and "off", while another might support a dozen different versions of
+"on" (how many engines are active?), plus a state that gets back to "on"
+faster than from a full "off".
+
+Some buses define rules about what different suspend states mean. PCI
+gives one example: after the suspend sequence completes, a non-legacy
+PCI device may not perform DMA or issue IRQs, and any wakeup events it
+issues would be issued through the PME# bus signal. Plus, there are
+several PCI-standard device states, some of which are optional.
+
+In contrast, integrated system-on-chip processors often use IRQs as the
+wakeup event sources (so drivers would call :c:func:`enable_irq_wake`) and
+might be able to treat DMA completion as a wakeup event (sometimes DMA can stay
+active too, it'd only be the CPU and some peripherals that sleep).
+
+Some details here may be platform-specific. Systems may have devices that
+can be fully active in certain sleep states, such as an LCD display that's
+refreshed using DMA while most of the system is sleeping lightly ... and
+its frame buffer might even be updated by a DSP or other non-Linux CPU while
+the Linux control processor stays idle.
+
+Moreover, the specific actions taken may depend on the target system state.
+One target system state might allow a given device to be very operational;
+another might require a hard shut down with re-initialization on resume.
+And two different target systems might use the same device in different
+ways; the aforementioned LCD might be active in one product's "standby",
+but a different product using the same SOC might work differently.
+
+
+Device Power Management Domains
+===============================
+
+Sometimes devices share reference clocks or other power resources. In those
+cases it generally is not possible to put devices into low-power states
+individually. Instead, a set of devices sharing a power resource can be put
+into a low-power state together at the same time by turning off the shared
+power resource. Of course, they also need to be put into the full-power state
+together, by turning the shared power resource on. A set of devices with this
+property is often referred to as a power domain. A power domain may also be
+nested inside another power domain. The nested domain is referred to as the
+sub-domain of the parent domain.
+
+Support for power domains is provided through the :c:member:`pm_domain` field of
+|struct device|. This field is a pointer to an object of type
+|struct dev_pm_domain|, defined in :file:`include/linux/pm.h``, providing a set
+of power management callbacks analogous to the subsystem-level and device driver
+callbacks that are executed for the given device during all power transitions,
+instead of the respective subsystem-level callbacks. Specifically, if a
+device's :c:member:`pm_domain` pointer is not NULL, the ``->suspend()`` callback
+from the object pointed to by it will be executed instead of its subsystem's
+(e.g. bus type's) ``->suspend()`` callback and analogously for all of the
+remaining callbacks. In other words, power management domain callbacks, if
+defined for the given device, always take precedence over the callbacks provided
+by the device's subsystem (e.g. bus type).
+
+The support for device power management domains is only relevant to platforms
+needing to use the same device driver power management callbacks in many
+different power domain configurations and wanting to avoid incorporating the
+support for power domains into subsystem-level callbacks, for example by
+modifying the platform bus type. Other platforms need not implement it or take
+it into account in any way.
+
+Devices may be defined as IRQ-safe which indicates to the PM core that their
+runtime PM callbacks may be invoked with disabled interrupts (see
+:file:`Documentation/power/runtime_pm.txt` for more information). If an
+IRQ-safe device belongs to a PM domain, the runtime PM of the domain will be
+disallowed, unless the domain itself is defined as IRQ-safe. However, it
+makes sense to define a PM domain as IRQ-safe only if all the devices in it
+are IRQ-safe. Moreover, if an IRQ-safe domain has a parent domain, the runtime
+PM of the parent is only allowed if the parent itself is IRQ-safe too with the
+additional restriction that all child domains of an IRQ-safe parent must also
+be IRQ-safe.
+
+
+Runtime Power Management
+========================
+
+Many devices are able to dynamically power down while the system is still
+running. This feature is useful for devices that are not being used, and
+can offer significant power savings on a running system. These devices
+often support a range of runtime power states, which might use names such
+as "off", "sleep", "idle", "active", and so on. Those states will in some
+cases (like PCI) be partially constrained by the bus the device uses, and will
+usually include hardware states that are also used in system sleep states.
+
+A system-wide power transition can be started while some devices are in low
+power states due to runtime power management. The system sleep PM callbacks
+should recognize such situations and react to them appropriately, but the
+necessary actions are subsystem-specific.
+
+In some cases the decision may be made at the subsystem level while in other
+cases the device driver may be left to decide. In some cases it may be
+desirable to leave a suspended device in that state during a system-wide power
+transition, but in other cases the device must be put back into the full-power
+state temporarily, for example so that its system wakeup capability can be
+disabled. This all depends on the hardware and the design of the subsystem and
+device driver in question.
+
+During system-wide resume from a sleep state it's easiest to put devices into
+the full-power state, as explained in :file:`Documentation/power/runtime_pm.txt`.
+Refer to that document for more information regarding this particular issue as
+well as for information on the device runtime power management framework in
+general.
diff --git a/Documentation/driver-api/pm/index.rst b/Documentation/driver-api/pm/index.rst
new file mode 100644
index 000000000000..2f6d0e9cf6b7
--- /dev/null
+++ b/Documentation/driver-api/pm/index.rst
@@ -0,0 +1,16 @@
+=======================
+Device Power Management
+=======================
+
+.. toctree::
+
+ devices
+ notifiers
+ types
+
+.. only:: subproject and html
+
+ Indices
+ =======
+
+ * :ref:`genindex`
diff --git a/Documentation/driver-api/pm/notifiers.rst b/Documentation/driver-api/pm/notifiers.rst
new file mode 100644
index 000000000000..62f860026992
--- /dev/null
+++ b/Documentation/driver-api/pm/notifiers.rst
@@ -0,0 +1,70 @@
+=============================
+Suspend/Hibernation Notifiers
+=============================
+
+::
+
+ Copyright (c) 2016 Intel Corp., Rafael J. Wysocki <rafael.j.wysocki@intel.com>
+
+There are some operations that subsystems or drivers may want to carry out
+before hibernation/suspend or after restore/resume, but they require the system
+to be fully functional, so the drivers' and subsystems' ``->suspend()`` and
+``->resume()`` or even ``->prepare()`` and ``->complete()`` callbacks are not
+suitable for this purpose.
+
+For example, device drivers may want to upload firmware to their devices after
+resume/restore, but they cannot do it by calling :c:func:`request_firmware()`
+from their ``->resume()`` or ``->complete()`` callback routines (user land
+processes are frozen at these points). The solution may be to load the firmware
+into memory before processes are frozen and upload it from there in the
+``->resume()`` routine. A suspend/hibernation notifier may be used for that.
+
+Subsystems or drivers having such needs can register suspend notifiers that
+will be called upon the following events by the PM core:
+
+``PM_HIBERNATION_PREPARE``
+ The system is going to hibernate, tasks will be frozen immediately. This
+ is different from ``PM_SUSPEND_PREPARE`` below, because in this case
+ additional work is done between the notifiers and the invocation of PM
+ callbacks for the "freeze" transition.
+
+``PM_POST_HIBERNATION``
+ The system memory state has been restored from a hibernation image or an
+ error occurred during hibernation. Device restore callbacks have been
+ executed and tasks have been thawed.
+
+``PM_RESTORE_PREPARE``
+ The system is going to restore a hibernation image. If all goes well,
+ the restored image kernel will issue a ``PM_POST_HIBERNATION``
+ notification.
+
+``PM_POST_RESTORE``
+ An error occurred during restore from hibernation. Device restore
+ callbacks have been executed and tasks have been thawed.
+
+``PM_SUSPEND_PREPARE``
+ The system is preparing for suspend.
+
+``PM_POST_SUSPEND``
+ The system has just resumed or an error occurred during suspend. Device
+ resume callbacks have been executed and tasks have been thawed.
+
+It is generally assumed that whatever the notifiers do for
+``PM_HIBERNATION_PREPARE``, should be undone for ``PM_POST_HIBERNATION``.
+Analogously, operations carried out for ``PM_SUSPEND_PREPARE`` should be
+reversed for ``PM_POST_SUSPEND``.
+
+Moreover, if one of the notifiers fails for the ``PM_HIBERNATION_PREPARE`` or
+``PM_SUSPEND_PREPARE`` event, the notifiers that have already succeeded for that
+event will be called for ``PM_POST_HIBERNATION`` or ``PM_POST_SUSPEND``,
+respectively.
+
+The hibernation and suspend notifiers are called with :c:data:`pm_mutex` held.
+They are defined in the usual way, but their last argument is meaningless (it is
+always NULL).
+
+To register and/or unregister a suspend notifier use
+:c:func:`register_pm_notifier()` and :c:func:`unregister_pm_notifier()`,
+respectively (both defined in :file:`include/linux/suspend.h`). If you don't
+need to unregister the notifier, you can also use the :c:func:`pm_notifier()`
+macro defined in :file:`include/linux/suspend.h`.
diff --git a/Documentation/driver-api/pm/types.rst b/Documentation/driver-api/pm/types.rst
new file mode 100644
index 000000000000..3ebdecc54104
--- /dev/null
+++ b/Documentation/driver-api/pm/types.rst
@@ -0,0 +1,5 @@
+==================================
+Device Power Management Data Types
+==================================
+
+.. kernel-doc:: include/linux/pm.h
diff --git a/Documentation/driver-api/regulator.rst b/Documentation/driver-api/regulator.rst
new file mode 100644
index 000000000000..520da0a5251d
--- /dev/null
+++ b/Documentation/driver-api/regulator.rst
@@ -0,0 +1,170 @@
+.. Copyright 2007-2008 Wolfson Microelectronics
+
+.. This documentation is free software; you can redistribute
+.. it and/or modify it under the terms of the GNU General Public
+.. License version 2 as published by the Free Software Foundation.
+
+=================================
+Voltage and current regulator API
+=================================
+
+:Author: Liam Girdwood
+:Author: Mark Brown
+
+Introduction
+============
+
+This framework is designed to provide a standard kernel interface to
+control voltage and current regulators.
+
+The intention is to allow systems to dynamically control regulator power
+output in order to save power and prolong battery life. This applies to
+both voltage regulators (where voltage output is controllable) and
+current sinks (where current limit is controllable).
+
+Note that additional (and currently more complete) documentation is
+available in the Linux kernel source under
+``Documentation/power/regulator``.
+
+Glossary
+--------
+
+The regulator API uses a number of terms which may not be familiar:
+
+Regulator
+
+ Electronic device that supplies power to other devices. Most regulators
+ can enable and disable their output and some can also control their
+ output voltage or current.
+
+Consumer
+
+ Electronic device which consumes power provided by a regulator. These
+ may either be static, requiring only a fixed supply, or dynamic,
+ requiring active management of the regulator at runtime.
+
+Power Domain
+
+ The electronic circuit supplied by a given regulator, including the
+ regulator and all consumer devices. The configuration of the regulator
+ is shared between all the components in the circuit.
+
+Power Management Integrated Circuit (PMIC)
+
+ An IC which contains numerous regulators and often also other
+ subsystems. In an embedded system the primary PMIC is often equivalent
+ to a combination of the PSU and southbridge in a desktop system.
+
+Consumer driver interface
+=========================
+
+This offers a similar API to the kernel clock framework. Consumer
+drivers use `get <#API-regulator-get>`__ and
+`put <#API-regulator-put>`__ operations to acquire and release
+regulators. Functions are provided to `enable <#API-regulator-enable>`__
+and `disable <#API-regulator-disable>`__ the regulator and to get and
+set the runtime parameters of the regulator.
+
+When requesting regulators consumers use symbolic names for their
+supplies, such as "Vcc", which are mapped into actual regulator devices
+by the machine interface.
+
+A stub version of this API is provided when the regulator framework is
+not in use in order to minimise the need to use ifdefs.
+
+Enabling and disabling
+----------------------
+
+The regulator API provides reference counted enabling and disabling of
+regulators. Consumer devices use the :c:func:`regulator_enable()` and
+:c:func:`regulator_disable()` functions to enable and disable
+regulators. Calls to the two functions must be balanced.
+
+Note that since multiple consumers may be using a regulator and machine
+constraints may not allow the regulator to be disabled there is no
+guarantee that calling :c:func:`regulator_disable()` will actually
+cause the supply provided by the regulator to be disabled. Consumer
+drivers should assume that the regulator may be enabled at all times.
+
+Configuration
+-------------
+
+Some consumer devices may need to be able to dynamically configure their
+supplies. For example, MMC drivers may need to select the correct
+operating voltage for their cards. This may be done while the regulator
+is enabled or disabled.
+
+The :c:func:`regulator_set_voltage()` and
+:c:func:`regulator_set_current_limit()` functions provide the primary
+interface for this. Both take ranges of voltages and currents, supporting
+drivers that do not require a specific value (eg, CPU frequency scaling
+normally permits the CPU to use a wider range of supply voltages at lower
+frequencies but does not require that the supply voltage be lowered). Where
+an exact value is required both minimum and maximum values should be
+identical.
+
+Callbacks
+---------
+
+Callbacks may also be registered for events such as regulation failures.
+
+Regulator driver interface
+==========================
+
+Drivers for regulator chips register the regulators with the regulator
+core, providing operations structures to the core. A notifier interface
+allows error conditions to be reported to the core.
+
+Registration should be triggered by explicit setup done by the platform,
+supplying a struct :c:type:`regulator_init_data` for the regulator
+containing constraint and supply information.
+
+Machine interface
+=================
+
+This interface provides a way to define how regulators are connected to
+consumers on a given system and what the valid operating parameters are
+for the system.
+
+Supplies
+--------
+
+Regulator supplies are specified using struct
+:c:type:`regulator_consumer_supply`. This is done at driver registration
+time as part of the machine constraints.
+
+Constraints
+-----------
+
+As well as defining the connections the machine interface also provides
+constraints defining the operations that clients are allowed to perform
+and the parameters that may be set. This is required since generally
+regulator devices will offer more flexibility than it is safe to use on
+a given system, for example supporting higher supply voltages than the
+consumers are rated for.
+
+This is done at driver registration time` by providing a
+struct :c:type:`regulation_constraints`.
+
+The constraints may also specify an initial configuration for the
+regulator in the constraints, which is particularly useful for use with
+static consumers.
+
+API reference
+=============
+
+Due to limitations of the kernel documentation framework and the
+existing layout of the source code the entire regulator API is
+documented here.
+
+.. kernel-doc:: include/linux/regulator/consumer.h
+ :internal:
+
+.. kernel-doc:: include/linux/regulator/machine.h
+ :internal:
+
+.. kernel-doc:: include/linux/regulator/driver.h
+ :internal:
+
+.. kernel-doc:: drivers/regulator/core.c
+ :export:
diff --git a/Documentation/driver-api/uio-howto.rst b/Documentation/driver-api/uio-howto.rst
new file mode 100644
index 000000000000..f73d660b2956
--- /dev/null
+++ b/Documentation/driver-api/uio-howto.rst
@@ -0,0 +1,705 @@
+=======================
+The Userspace I/O HOWTO
+=======================
+
+:Author: Hans-Jürgen Koch Linux developer, Linutronix
+:Date: 2006-12-11
+
+About this document
+===================
+
+Translations
+------------
+
+If you know of any translations for this document, or you are interested
+in translating it, please email me hjk@hansjkoch.de.
+
+Preface
+-------
+
+For many types of devices, creating a Linux kernel driver is overkill.
+All that is really needed is some way to handle an interrupt and provide
+access to the memory space of the device. The logic of controlling the
+device does not necessarily have to be within the kernel, as the device
+does not need to take advantage of any of other resources that the
+kernel provides. One such common class of devices that are like this are
+for industrial I/O cards.
+
+To address this situation, the userspace I/O system (UIO) was designed.
+For typical industrial I/O cards, only a very small kernel module is
+needed. The main part of the driver will run in user space. This
+simplifies development and reduces the risk of serious bugs within a
+kernel module.
+
+Please note that UIO is not an universal driver interface. Devices that
+are already handled well by other kernel subsystems (like networking or
+serial or USB) are no candidates for an UIO driver. Hardware that is
+ideally suited for an UIO driver fulfills all of the following:
+
+- The device has memory that can be mapped. The device can be
+ controlled completely by writing to this memory.
+
+- The device usually generates interrupts.
+
+- The device does not fit into one of the standard kernel subsystems.
+
+Acknowledgments
+---------------
+
+I'd like to thank Thomas Gleixner and Benedikt Spranger of Linutronix,
+who have not only written most of the UIO code, but also helped greatly
+writing this HOWTO by giving me all kinds of background information.
+
+Feedback
+--------
+
+Find something wrong with this document? (Or perhaps something right?) I
+would love to hear from you. Please email me at hjk@hansjkoch.de.
+
+About UIO
+=========
+
+If you use UIO for your card's driver, here's what you get:
+
+- only one small kernel module to write and maintain.
+
+- develop the main part of your driver in user space, with all the
+ tools and libraries you're used to.
+
+- bugs in your driver won't crash the kernel.
+
+- updates of your driver can take place without recompiling the kernel.
+
+How UIO works
+-------------
+
+Each UIO device is accessed through a device file and several sysfs
+attribute files. The device file will be called ``/dev/uio0`` for the
+first device, and ``/dev/uio1``, ``/dev/uio2`` and so on for subsequent
+devices.
+
+``/dev/uioX`` is used to access the address space of the card. Just use
+:c:func:`mmap()` to access registers or RAM locations of your card.
+
+Interrupts are handled by reading from ``/dev/uioX``. A blocking
+:c:func:`read()` from ``/dev/uioX`` will return as soon as an
+interrupt occurs. You can also use :c:func:`select()` on
+``/dev/uioX`` to wait for an interrupt. The integer value read from
+``/dev/uioX`` represents the total interrupt count. You can use this
+number to figure out if you missed some interrupts.
+
+For some hardware that has more than one interrupt source internally,
+but not separate IRQ mask and status registers, there might be
+situations where userspace cannot determine what the interrupt source
+was if the kernel handler disables them by writing to the chip's IRQ
+register. In such a case, the kernel has to disable the IRQ completely
+to leave the chip's register untouched. Now the userspace part can
+determine the cause of the interrupt, but it cannot re-enable
+interrupts. Another cornercase is chips where re-enabling interrupts is
+a read-modify-write operation to a combined IRQ status/acknowledge
+register. This would be racy if a new interrupt occurred simultaneously.
+
+To address these problems, UIO also implements a write() function. It is
+normally not used and can be ignored for hardware that has only a single
+interrupt source or has separate IRQ mask and status registers. If you
+need it, however, a write to ``/dev/uioX`` will call the
+:c:func:`irqcontrol()` function implemented by the driver. You have
+to write a 32-bit value that is usually either 0 or 1 to disable or
+enable interrupts. If a driver does not implement
+:c:func:`irqcontrol()`, :c:func:`write()` will return with
+``-ENOSYS``.
+
+To handle interrupts properly, your custom kernel module can provide its
+own interrupt handler. It will automatically be called by the built-in
+handler.
+
+For cards that don't generate interrupts but need to be polled, there is
+the possibility to set up a timer that triggers the interrupt handler at
+configurable time intervals. This interrupt simulation is done by
+calling :c:func:`uio_event_notify()` from the timer's event
+handler.
+
+Each driver provides attributes that are used to read or write
+variables. These attributes are accessible through sysfs files. A custom
+kernel driver module can add its own attributes to the device owned by
+the uio driver, but not added to the UIO device itself at this time.
+This might change in the future if it would be found to be useful.
+
+The following standard attributes are provided by the UIO framework:
+
+- ``name``: The name of your device. It is recommended to use the name
+ of your kernel module for this.
+
+- ``version``: A version string defined by your driver. This allows the
+ user space part of your driver to deal with different versions of the
+ kernel module.
+
+- ``event``: The total number of interrupts handled by the driver since
+ the last time the device node was read.
+
+These attributes appear under the ``/sys/class/uio/uioX`` directory.
+Please note that this directory might be a symlink, and not a real
+directory. Any userspace code that accesses it must be able to handle
+this.
+
+Each UIO device can make one or more memory regions available for memory
+mapping. This is necessary because some industrial I/O cards require
+access to more than one PCI memory region in a driver.
+
+Each mapping has its own directory in sysfs, the first mapping appears
+as ``/sys/class/uio/uioX/maps/map0/``. Subsequent mappings create
+directories ``map1/``, ``map2/``, and so on. These directories will only
+appear if the size of the mapping is not 0.
+
+Each ``mapX/`` directory contains four read-only files that show
+attributes of the memory:
+
+- ``name``: A string identifier for this mapping. This is optional, the
+ string can be empty. Drivers can set this to make it easier for
+ userspace to find the correct mapping.
+
+- ``addr``: The address of memory that can be mapped.
+
+- ``size``: The size, in bytes, of the memory pointed to by addr.
+
+- ``offset``: The offset, in bytes, that has to be added to the pointer
+ returned by :c:func:`mmap()` to get to the actual device memory.
+ This is important if the device's memory is not page aligned.
+ Remember that pointers returned by :c:func:`mmap()` are always
+ page aligned, so it is good style to always add this offset.
+
+From userspace, the different mappings are distinguished by adjusting
+the ``offset`` parameter of the :c:func:`mmap()` call. To map the
+memory of mapping N, you have to use N times the page size as your
+offset::
+
+ offset = N * getpagesize();
+
+Sometimes there is hardware with memory-like regions that can not be
+mapped with the technique described here, but there are still ways to
+access them from userspace. The most common example are x86 ioports. On
+x86 systems, userspace can access these ioports using
+:c:func:`ioperm()`, :c:func:`iopl()`, :c:func:`inb()`,
+:c:func:`outb()`, and similar functions.
+
+Since these ioport regions can not be mapped, they will not appear under
+``/sys/class/uio/uioX/maps/`` like the normal memory described above.
+Without information about the port regions a hardware has to offer, it
+becomes difficult for the userspace part of the driver to find out which
+ports belong to which UIO device.
+
+To address this situation, the new directory
+``/sys/class/uio/uioX/portio/`` was added. It only exists if the driver
+wants to pass information about one or more port regions to userspace.
+If that is the case, subdirectories named ``port0``, ``port1``, and so
+on, will appear underneath ``/sys/class/uio/uioX/portio/``.
+
+Each ``portX/`` directory contains four read-only files that show name,
+start, size, and type of the port region:
+
+- ``name``: A string identifier for this port region. The string is
+ optional and can be empty. Drivers can set it to make it easier for
+ userspace to find a certain port region.
+
+- ``start``: The first port of this region.
+
+- ``size``: The number of ports in this region.
+
+- ``porttype``: A string describing the type of port.
+
+Writing your own kernel module
+==============================
+
+Please have a look at ``uio_cif.c`` as an example. The following
+paragraphs explain the different sections of this file.
+
+struct uio_info
+---------------
+
+This structure tells the framework the details of your driver, Some of
+the members are required, others are optional.
+
+- ``const char *name``: Required. The name of your driver as it will
+ appear in sysfs. I recommend using the name of your module for this.
+
+- ``const char *version``: Required. This string appears in
+ ``/sys/class/uio/uioX/version``.
+
+- ``struct uio_mem mem[ MAX_UIO_MAPS ]``: Required if you have memory
+ that can be mapped with :c:func:`mmap()`. For each mapping you
+ need to fill one of the ``uio_mem`` structures. See the description
+ below for details.
+
+- ``struct uio_port port[ MAX_UIO_PORTS_REGIONS ]``: Required if you
+ want to pass information about ioports to userspace. For each port
+ region you need to fill one of the ``uio_port`` structures. See the
+ description below for details.
+
+- ``long irq``: Required. If your hardware generates an interrupt, it's
+ your modules task to determine the irq number during initialization.
+ If you don't have a hardware generated interrupt but want to trigger
+ the interrupt handler in some other way, set ``irq`` to
+ ``UIO_IRQ_CUSTOM``. If you had no interrupt at all, you could set
+ ``irq`` to ``UIO_IRQ_NONE``, though this rarely makes sense.
+
+- ``unsigned long irq_flags``: Required if you've set ``irq`` to a
+ hardware interrupt number. The flags given here will be used in the
+ call to :c:func:`request_irq()`.
+
+- ``int (*mmap)(struct uio_info *info, struct vm_area_struct *vma)``:
+ Optional. If you need a special :c:func:`mmap()`
+ function, you can set it here. If this pointer is not NULL, your
+ :c:func:`mmap()` will be called instead of the built-in one.
+
+- ``int (*open)(struct uio_info *info, struct inode *inode)``:
+ Optional. You might want to have your own :c:func:`open()`,
+ e.g. to enable interrupts only when your device is actually used.
+
+- ``int (*release)(struct uio_info *info, struct inode *inode)``:
+ Optional. If you define your own :c:func:`open()`, you will
+ probably also want a custom :c:func:`release()` function.
+
+- ``int (*irqcontrol)(struct uio_info *info, s32 irq_on)``:
+ Optional. If you need to be able to enable or disable interrupts
+ from userspace by writing to ``/dev/uioX``, you can implement this
+ function. The parameter ``irq_on`` will be 0 to disable interrupts
+ and 1 to enable them.
+
+Usually, your device will have one or more memory regions that can be
+mapped to user space. For each region, you have to set up a
+``struct uio_mem`` in the ``mem[]`` array. Here's a description of the
+fields of ``struct uio_mem``:
+
+- ``const char *name``: Optional. Set this to help identify the memory
+ region, it will show up in the corresponding sysfs node.
+
+- ``int memtype``: Required if the mapping is used. Set this to
+ ``UIO_MEM_PHYS`` if you you have physical memory on your card to be
+ mapped. Use ``UIO_MEM_LOGICAL`` for logical memory (e.g. allocated
+ with :c:func:`kmalloc()`). There's also ``UIO_MEM_VIRTUAL`` for
+ virtual memory.
+
+- ``phys_addr_t addr``: Required if the mapping is used. Fill in the
+ address of your memory block. This address is the one that appears in
+ sysfs.
+
+- ``resource_size_t size``: Fill in the size of the memory block that
+ ``addr`` points to. If ``size`` is zero, the mapping is considered
+ unused. Note that you *must* initialize ``size`` with zero for all
+ unused mappings.
+
+- ``void *internal_addr``: If you have to access this memory region
+ from within your kernel module, you will want to map it internally by
+ using something like :c:func:`ioremap()`. Addresses returned by
+ this function cannot be mapped to user space, so you must not store
+ it in ``addr``. Use ``internal_addr`` instead to remember such an
+ address.
+
+Please do not touch the ``map`` element of ``struct uio_mem``! It is
+used by the UIO framework to set up sysfs files for this mapping. Simply
+leave it alone.
+
+Sometimes, your device can have one or more port regions which can not
+be mapped to userspace. But if there are other possibilities for
+userspace to access these ports, it makes sense to make information
+about the ports available in sysfs. For each region, you have to set up
+a ``struct uio_port`` in the ``port[]`` array. Here's a description of
+the fields of ``struct uio_port``:
+
+- ``char *porttype``: Required. Set this to one of the predefined
+ constants. Use ``UIO_PORT_X86`` for the ioports found in x86
+ architectures.
+
+- ``unsigned long start``: Required if the port region is used. Fill in
+ the number of the first port of this region.
+
+- ``unsigned long size``: Fill in the number of ports in this region.
+ If ``size`` is zero, the region is considered unused. Note that you
+ *must* initialize ``size`` with zero for all unused regions.
+
+Please do not touch the ``portio`` element of ``struct uio_port``! It is
+used internally by the UIO framework to set up sysfs files for this
+region. Simply leave it alone.
+
+Adding an interrupt handler
+---------------------------
+
+What you need to do in your interrupt handler depends on your hardware
+and on how you want to handle it. You should try to keep the amount of
+code in your kernel interrupt handler low. If your hardware requires no
+action that you *have* to perform after each interrupt, then your
+handler can be empty.
+
+If, on the other hand, your hardware *needs* some action to be performed
+after each interrupt, then you *must* do it in your kernel module. Note
+that you cannot rely on the userspace part of your driver. Your
+userspace program can terminate at any time, possibly leaving your
+hardware in a state where proper interrupt handling is still required.
+
+There might also be applications where you want to read data from your
+hardware at each interrupt and buffer it in a piece of kernel memory
+you've allocated for that purpose. With this technique you could avoid
+loss of data if your userspace program misses an interrupt.
+
+A note on shared interrupts: Your driver should support interrupt
+sharing whenever this is possible. It is possible if and only if your
+driver can detect whether your hardware has triggered the interrupt or
+not. This is usually done by looking at an interrupt status register. If
+your driver sees that the IRQ bit is actually set, it will perform its
+actions, and the handler returns IRQ_HANDLED. If the driver detects
+that it was not your hardware that caused the interrupt, it will do
+nothing and return IRQ_NONE, allowing the kernel to call the next
+possible interrupt handler.
+
+If you decide not to support shared interrupts, your card won't work in
+computers with no free interrupts. As this frequently happens on the PC
+platform, you can save yourself a lot of trouble by supporting interrupt
+sharing.
+
+Using uio_pdrv for platform devices
+-----------------------------------
+
+In many cases, UIO drivers for platform devices can be handled in a
+generic way. In the same place where you define your
+``struct platform_device``, you simply also implement your interrupt
+handler and fill your ``struct uio_info``. A pointer to this
+``struct uio_info`` is then used as ``platform_data`` for your platform
+device.
+
+You also need to set up an array of ``struct resource`` containing
+addresses and sizes of your memory mappings. This information is passed
+to the driver using the ``.resource`` and ``.num_resources`` elements of
+``struct platform_device``.
+
+You now have to set the ``.name`` element of ``struct platform_device``
+to ``"uio_pdrv"`` to use the generic UIO platform device driver. This
+driver will fill the ``mem[]`` array according to the resources given,
+and register the device.
+
+The advantage of this approach is that you only have to edit a file you
+need to edit anyway. You do not have to create an extra driver.
+
+Using uio_pdrv_genirq for platform devices
+------------------------------------------
+
+Especially in embedded devices, you frequently find chips where the irq
+pin is tied to its own dedicated interrupt line. In such cases, where
+you can be really sure the interrupt is not shared, we can take the
+concept of ``uio_pdrv`` one step further and use a generic interrupt
+handler. That's what ``uio_pdrv_genirq`` does.
+
+The setup for this driver is the same as described above for
+``uio_pdrv``, except that you do not implement an interrupt handler. The
+``.handler`` element of ``struct uio_info`` must remain ``NULL``. The
+``.irq_flags`` element must not contain ``IRQF_SHARED``.
+
+You will set the ``.name`` element of ``struct platform_device`` to
+``"uio_pdrv_genirq"`` to use this driver.
+
+The generic interrupt handler of ``uio_pdrv_genirq`` will simply disable
+the interrupt line using :c:func:`disable_irq_nosync()`. After
+doing its work, userspace can reenable the interrupt by writing
+0x00000001 to the UIO device file. The driver already implements an
+:c:func:`irq_control()` to make this possible, you must not
+implement your own.
+
+Using ``uio_pdrv_genirq`` not only saves a few lines of interrupt
+handler code. You also do not need to know anything about the chip's
+internal registers to create the kernel part of the driver. All you need
+to know is the irq number of the pin the chip is connected to.
+
+Using uio_dmem_genirq for platform devices
+------------------------------------------
+
+In addition to statically allocated memory ranges, they may also be a
+desire to use dynamically allocated regions in a user space driver. In
+particular, being able to access memory made available through the
+dma-mapping API, may be particularly useful. The ``uio_dmem_genirq``
+driver provides a way to accomplish this.
+
+This driver is used in a similar manner to the ``"uio_pdrv_genirq"``
+driver with respect to interrupt configuration and handling.
+
+Set the ``.name`` element of ``struct platform_device`` to
+``"uio_dmem_genirq"`` to use this driver.
+
+When using this driver, fill in the ``.platform_data`` element of
+``struct platform_device``, which is of type
+``struct uio_dmem_genirq_pdata`` and which contains the following
+elements:
+
+- ``struct uio_info uioinfo``: The same structure used as the
+ ``uio_pdrv_genirq`` platform data
+
+- ``unsigned int *dynamic_region_sizes``: Pointer to list of sizes of
+ dynamic memory regions to be mapped into user space.
+
+- ``unsigned int num_dynamic_regions``: Number of elements in
+ ``dynamic_region_sizes`` array.
+
+The dynamic regions defined in the platform data will be appended to the
+`` mem[] `` array after the platform device resources, which implies
+that the total number of static and dynamic memory regions cannot exceed
+``MAX_UIO_MAPS``.
+
+The dynamic memory regions will be allocated when the UIO device file,
+``/dev/uioX`` is opened. Similar to static memory resources, the memory
+region information for dynamic regions is then visible via sysfs at
+``/sys/class/uio/uioX/maps/mapY/*``. The dynamic memory regions will be
+freed when the UIO device file is closed. When no processes are holding
+the device file open, the address returned to userspace is ~0.
+
+Writing a driver in userspace
+=============================
+
+Once you have a working kernel module for your hardware, you can write
+the userspace part of your driver. You don't need any special libraries,
+your driver can be written in any reasonable language, you can use
+floating point numbers and so on. In short, you can use all the tools
+and libraries you'd normally use for writing a userspace application.
+
+Getting information about your UIO device
+-----------------------------------------
+
+Information about all UIO devices is available in sysfs. The first thing
+you should do in your driver is check ``name`` and ``version`` to make
+sure your talking to the right device and that its kernel driver has the
+version you expect.
+
+You should also make sure that the memory mapping you need exists and
+has the size you expect.
+
+There is a tool called ``lsuio`` that lists UIO devices and their
+attributes. It is available here:
+
+http://www.osadl.org/projects/downloads/UIO/user/
+
+With ``lsuio`` you can quickly check if your kernel module is loaded and
+which attributes it exports. Have a look at the manpage for details.
+
+The source code of ``lsuio`` can serve as an example for getting
+information about an UIO device. The file ``uio_helper.c`` contains a
+lot of functions you could use in your userspace driver code.
+
+mmap() device memory
+--------------------
+
+After you made sure you've got the right device with the memory mappings
+you need, all you have to do is to call :c:func:`mmap()` to map the
+device's memory to userspace.
+
+The parameter ``offset`` of the :c:func:`mmap()` call has a special
+meaning for UIO devices: It is used to select which mapping of your
+device you want to map. To map the memory of mapping N, you have to use
+N times the page size as your offset::
+
+ offset = N * getpagesize();
+
+N starts from zero, so if you've got only one memory range to map, set
+``offset = 0``. A drawback of this technique is that memory is always
+mapped beginning with its start address.
+
+Waiting for interrupts
+----------------------
+
+After you successfully mapped your devices memory, you can access it
+like an ordinary array. Usually, you will perform some initialization.
+After that, your hardware starts working and will generate an interrupt
+as soon as it's finished, has some data available, or needs your
+attention because an error occurred.
+
+``/dev/uioX`` is a read-only file. A :c:func:`read()` will always
+block until an interrupt occurs. There is only one legal value for the
+``count`` parameter of :c:func:`read()`, and that is the size of a
+signed 32 bit integer (4). Any other value for ``count`` causes
+:c:func:`read()` to fail. The signed 32 bit integer read is the
+interrupt count of your device. If the value is one more than the value
+you read the last time, everything is OK. If the difference is greater
+than one, you missed interrupts.
+
+You can also use :c:func:`select()` on ``/dev/uioX``.
+
+Generic PCI UIO driver
+======================
+
+The generic driver is a kernel module named uio_pci_generic. It can
+work with any device compliant to PCI 2.3 (circa 2002) and any compliant
+PCI Express device. Using this, you only need to write the userspace
+driver, removing the need to write a hardware-specific kernel module.
+
+Making the driver recognize the device
+--------------------------------------
+
+Since the driver does not declare any device ids, it will not get loaded
+automatically and will not automatically bind to any devices, you must
+load it and allocate id to the driver yourself. For example::
+
+ modprobe uio_pci_generic
+ echo "8086 10f5" > /sys/bus/pci/drivers/uio_pci_generic/new_id
+
+If there already is a hardware specific kernel driver for your device,
+the generic driver still won't bind to it, in this case if you want to
+use the generic driver (why would you?) you'll have to manually unbind
+the hardware specific driver and bind the generic driver, like this::
+
+ echo -n 0000:00:19.0 > /sys/bus/pci/drivers/e1000e/unbind
+ echo -n 0000:00:19.0 > /sys/bus/pci/drivers/uio_pci_generic/bind
+
+You can verify that the device has been bound to the driver by looking
+for it in sysfs, for example like the following::
+
+ ls -l /sys/bus/pci/devices/0000:00:19.0/driver
+
+Which if successful should print::
+
+ .../0000:00:19.0/driver -> ../../../bus/pci/drivers/uio_pci_generic
+
+Note that the generic driver will not bind to old PCI 2.2 devices. If
+binding the device failed, run the following command::
+
+ dmesg
+
+and look in the output for failure reasons.
+
+Things to know about uio_pci_generic
+------------------------------------
+
+Interrupts are handled using the Interrupt Disable bit in the PCI
+command register and Interrupt Status bit in the PCI status register.
+All devices compliant to PCI 2.3 (circa 2002) and all compliant PCI
+Express devices should support these bits. uio_pci_generic detects
+this support, and won't bind to devices which do not support the
+Interrupt Disable Bit in the command register.
+
+On each interrupt, uio_pci_generic sets the Interrupt Disable bit.
+This prevents the device from generating further interrupts until the
+bit is cleared. The userspace driver should clear this bit before
+blocking and waiting for more interrupts.
+
+Writing userspace driver using uio_pci_generic
+------------------------------------------------
+
+Userspace driver can use pci sysfs interface, or the libpci library that
+wraps it, to talk to the device and to re-enable interrupts by writing
+to the command register.
+
+Example code using uio_pci_generic
+----------------------------------
+
+Here is some sample userspace driver code using uio_pci_generic::
+
+ #include <stdlib.h>
+ #include <stdio.h>
+ #include <unistd.h>
+ #include <sys/types.h>
+ #include <sys/stat.h>
+ #include <fcntl.h>
+ #include <errno.h>
+
+ int main()
+ {
+ int uiofd;
+ int configfd;
+ int err;
+ int i;
+ unsigned icount;
+ unsigned char command_high;
+
+ uiofd = open("/dev/uio0", O_RDONLY);
+ if (uiofd < 0) {
+ perror("uio open:");
+ return errno;
+ }
+ configfd = open("/sys/class/uio/uio0/device/config", O_RDWR);
+ if (configfd < 0) {
+ perror("config open:");
+ return errno;
+ }
+
+ /* Read and cache command value */
+ err = pread(configfd, &command_high, 1, 5);
+ if (err != 1) {
+ perror("command config read:");
+ return errno;
+ }
+ command_high &= ~0x4;
+
+ for(i = 0;; ++i) {
+ /* Print out a message, for debugging. */
+ if (i == 0)
+ fprintf(stderr, "Started uio test driver.\n");
+ else
+ fprintf(stderr, "Interrupts: %d\n", icount);
+
+ /****************************************/
+ /* Here we got an interrupt from the
+ device. Do something to it. */
+ /****************************************/
+
+ /* Re-enable interrupts. */
+ err = pwrite(configfd, &command_high, 1, 5);
+ if (err != 1) {
+ perror("config write:");
+ break;
+ }
+
+ /* Wait for next interrupt. */
+ err = read(uiofd, &icount, 4);
+ if (err != 4) {
+ perror("uio read:");
+ break;
+ }
+
+ }
+ return errno;
+ }
+
+Generic Hyper-V UIO driver
+==========================
+
+The generic driver is a kernel module named uio_hv_generic. It
+supports devices on the Hyper-V VMBus similar to uio_pci_generic on
+PCI bus.
+
+Making the driver recognize the device
+--------------------------------------
+
+Since the driver does not declare any device GUID's, it will not get
+loaded automatically and will not automatically bind to any devices, you
+must load it and allocate id to the driver yourself. For example, to use
+the network device GUID::
+
+ modprobe uio_hv_generic
+ echo "f8615163-df3e-46c5-913f-f2d2f965ed0e" > /sys/bus/vmbus/drivers/uio_hv_generic/new_id
+
+If there already is a hardware specific kernel driver for the device,
+the generic driver still won't bind to it, in this case if you want to
+use the generic driver (why would you?) you'll have to manually unbind
+the hardware specific driver and bind the generic driver, like this::
+
+ echo -n vmbus-ed963694-e847-4b2a-85af-bc9cfc11d6f3 > /sys/bus/vmbus/drivers/hv_netvsc/unbind
+ echo -n vmbus-ed963694-e847-4b2a-85af-bc9cfc11d6f3 > /sys/bus/vmbus/drivers/uio_hv_generic/bind
+
+You can verify that the device has been bound to the driver by looking
+for it in sysfs, for example like the following::
+
+ ls -l /sys/bus/vmbus/devices/vmbus-ed963694-e847-4b2a-85af-bc9cfc11d6f3/driver
+
+Which if successful should print::
+
+ .../vmbus-ed963694-e847-4b2a-85af-bc9cfc11d6f3/driver -> ../../../bus/vmbus/drivers/uio_hv_generic
+
+Things to know about uio_hv_generic
+-----------------------------------
+
+On each interrupt, uio_hv_generic sets the Interrupt Disable bit. This
+prevents the device from generating further interrupts until the bit is
+cleared. The userspace driver should clear this bit before blocking and
+waiting for more interrupts.
+
+Further information
+===================
+
+- `OSADL homepage. <http://www.osadl.org>`_
+
+- `Linutronix homepage. <http://www.linutronix.de>`_