summaryrefslogtreecommitdiffstats
path: root/Documentation/admin-guide/media/ipu3.rst
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/admin-guide/media/ipu3.rst')
-rw-r--r--Documentation/admin-guide/media/ipu3.rst591
1 files changed, 591 insertions, 0 deletions
diff --git a/Documentation/admin-guide/media/ipu3.rst b/Documentation/admin-guide/media/ipu3.rst
new file mode 100644
index 000000000000..9361c34f123e
--- /dev/null
+++ b/Documentation/admin-guide/media/ipu3.rst
@@ -0,0 +1,591 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. include:: <isonum.txt>
+
+===============================================================
+Intel Image Processing Unit 3 (IPU3) Imaging Unit (ImgU) driver
+===============================================================
+
+Copyright |copy| 2018 Intel Corporation
+
+Introduction
+============
+
+This file documents the Intel IPU3 (3rd generation Image Processing Unit)
+Imaging Unit drivers located under drivers/media/pci/intel/ipu3 (CIO2) as well
+as under drivers/staging/media/ipu3 (ImgU).
+
+The Intel IPU3 found in certain Kaby Lake (as well as certain Sky Lake)
+platforms (U/Y processor lines) is made up of two parts namely the Imaging Unit
+(ImgU) and the CIO2 device (MIPI CSI2 receiver).
+
+The CIO2 device receives the raw Bayer data from the sensors and outputs the
+frames in a format that is specific to the IPU3 (for consumption by the IPU3
+ImgU). The CIO2 driver is available as drivers/media/pci/intel/ipu3/ipu3-cio2*
+and is enabled through the CONFIG_VIDEO_IPU3_CIO2 config option.
+
+The Imaging Unit (ImgU) is responsible for processing images captured
+by the IPU3 CIO2 device. The ImgU driver sources can be found under
+drivers/staging/media/ipu3 directory. The driver is enabled through the
+CONFIG_VIDEO_IPU3_IMGU config option.
+
+The two driver modules are named ipu3_csi2 and ipu3_imgu, respectively.
+
+The drivers has been tested on Kaby Lake platforms (U/Y processor lines).
+
+Both of the drivers implement V4L2, Media Controller and V4L2 sub-device
+interfaces. The IPU3 CIO2 driver supports camera sensors connected to the CIO2
+MIPI CSI-2 interfaces through V4L2 sub-device sensor drivers.
+
+CIO2
+====
+
+The CIO2 is represented as a single V4L2 subdev, which provides a V4L2 subdev
+interface to the user space. There is a video node for each CSI-2 receiver,
+with a single media controller interface for the entire device.
+
+The CIO2 contains four independent capture channel, each with its own MIPI CSI-2
+receiver and DMA engine. Each channel is modelled as a V4L2 sub-device exposed
+to userspace as a V4L2 sub-device node and has two pads:
+
+.. tabularcolumns:: |p{0.8cm}|p{4.0cm}|p{4.0cm}|
+
+.. flat-table::
+
+ * - pad
+ - direction
+ - purpose
+
+ * - 0
+ - sink
+ - MIPI CSI-2 input, connected to the sensor subdev
+
+ * - 1
+ - source
+ - Raw video capture, connected to the V4L2 video interface
+
+The V4L2 video interfaces model the DMA engines. They are exposed to userspace
+as V4L2 video device nodes.
+
+Capturing frames in raw Bayer format
+------------------------------------
+
+CIO2 MIPI CSI2 receiver is used to capture frames (in packed raw Bayer format)
+from the raw sensors connected to the CSI2 ports. The captured frames are used
+as input to the ImgU driver.
+
+Image processing using IPU3 ImgU requires tools such as raw2pnm [#f1]_, and
+yavta [#f2]_ due to the following unique requirements and / or features specific
+to IPU3.
+
+-- The IPU3 CSI2 receiver outputs the captured frames from the sensor in packed
+raw Bayer format that is specific to IPU3.
+
+-- Multiple video nodes have to be operated simultaneously.
+
+Let us take the example of ov5670 sensor connected to CSI2 port 0, for a
+2592x1944 image capture.
+
+Using the media contorller APIs, the ov5670 sensor is configured to send
+frames in packed raw Bayer format to IPU3 CSI2 receiver.
+
+# This example assumes /dev/media0 as the CIO2 media device
+
+export MDEV=/dev/media0
+
+# and that ov5670 sensor is connected to i2c bus 10 with address 0x36
+
+export SDEV=$(media-ctl -d $MDEV -e "ov5670 10-0036")
+
+# Establish the link for the media devices using media-ctl [#f3]_
+media-ctl -d $MDEV -l "ov5670:0 -> ipu3-csi2 0:0[1]"
+
+# Set the format for the media devices
+media-ctl -d $MDEV -V "ov5670:0 [fmt:SGRBG10/2592x1944]"
+
+media-ctl -d $MDEV -V "ipu3-csi2 0:0 [fmt:SGRBG10/2592x1944]"
+
+media-ctl -d $MDEV -V "ipu3-csi2 0:1 [fmt:SGRBG10/2592x1944]"
+
+Once the media pipeline is configured, desired sensor specific settings
+(such as exposure and gain settings) can be set, using the yavta tool.
+
+e.g
+
+yavta -w 0x009e0903 444 $SDEV
+
+yavta -w 0x009e0913 1024 $SDEV
+
+yavta -w 0x009e0911 2046 $SDEV
+
+Once the desired sensor settings are set, frame captures can be done as below.
+
+e.g
+
+yavta --data-prefix -u -c10 -n5 -I -s2592x1944 --file=/tmp/frame-#.bin \
+ -f IPU3_SGRBG10 $(media-ctl -d $MDEV -e "ipu3-cio2 0")
+
+With the above command, 10 frames are captured at 2592x1944 resolution, with
+sGRBG10 format and output as IPU3_SGRBG10 format.
+
+The captured frames are available as /tmp/frame-#.bin files.
+
+ImgU
+====
+
+The ImgU is represented as two V4L2 subdevs, each of which provides a V4L2
+subdev interface to the user space.
+
+Each V4L2 subdev represents a pipe, which can support a maximum of 2 streams.
+This helps to support advanced camera features like Continuous View Finder (CVF)
+and Snapshot During Video(SDV).
+
+The ImgU contains two independent pipes, each modelled as a V4L2 sub-device
+exposed to userspace as a V4L2 sub-device node.
+
+Each pipe has two sink pads and three source pads for the following purpose:
+
+.. tabularcolumns:: |p{0.8cm}|p{4.0cm}|p{4.0cm}|
+
+.. flat-table::
+
+ * - pad
+ - direction
+ - purpose
+
+ * - 0
+ - sink
+ - Input raw video stream
+
+ * - 1
+ - sink
+ - Processing parameters
+
+ * - 2
+ - source
+ - Output processed video stream
+
+ * - 3
+ - source
+ - Output viewfinder video stream
+
+ * - 4
+ - source
+ - 3A statistics
+
+Each pad is connected to a corresponding V4L2 video interface, exposed to
+userspace as a V4L2 video device node.
+
+Device operation
+----------------
+
+With ImgU, once the input video node ("ipu3-imgu 0/1":0, in
+<entity>:<pad-number> format) is queued with buffer (in packed raw Bayer
+format), ImgU starts processing the buffer and produces the video output in YUV
+format and statistics output on respective output nodes. The driver is expected
+to have buffers ready for all of parameter, output and statistics nodes, when
+input video node is queued with buffer.
+
+At a minimum, all of input, main output, 3A statistics and viewfinder
+video nodes should be enabled for IPU3 to start image processing.
+
+Each ImgU V4L2 subdev has the following set of video nodes.
+
+input, output and viewfinder video nodes
+----------------------------------------
+
+The frames (in packed raw Bayer format specific to the IPU3) received by the
+input video node is processed by the IPU3 Imaging Unit and are output to 2 video
+nodes, with each targeting a different purpose (main output and viewfinder
+output).
+
+Details onand the Bayer format specific to the IPU3 can be found in
+:ref:`v4l2-pix-fmt-ipu3-sbggr10`.
+
+The driver supports V4L2 Video Capture Interface as defined at :ref:`devices`.
+
+Only the multi-planar API is supported. More details can be found at
+:ref:`planar-apis`.
+
+Parameters video node
+---------------------
+
+The parameters video node receives the ImgU algorithm parameters that are used
+to configure how the ImgU algorithms process the image.
+
+Details on processing parameters specific to the IPU3 can be found in
+:ref:`v4l2-meta-fmt-params`.
+
+3A statistics video node
+------------------------
+
+3A statistics video node is used by the ImgU driver to output the 3A (auto
+focus, auto exposure and auto white balance) statistics for the frames that are
+being processed by the ImgU to user space applications. User space applications
+can use this statistics data to compute the desired algorithm parameters for
+the ImgU.
+
+Configuring the Intel IPU3
+==========================
+
+The IPU3 ImgU pipelines can be configured using the Media Controller, defined at
+:ref:`media_controller`.
+
+Running mode and firmware binary selection
+------------------------------------------
+
+ImgU works based on firmware, currently the ImgU firmware support run 2 pipes in
+time-sharing with single input frame data. Each pipe can run at certain mode -
+"VIDEO" or "STILL", "VIDEO" mode is commonly used for video frames capture, and
+"STILL" is used for still frame capture. However, you can also select "VIDEO" to
+capture still frames if you want to capture images with less system load and
+power. For "STILL" mode, ImgU will try to use smaller BDS factor and output
+larger bayer frame for further YUV processing than "VIDEO" mode to get high
+quality images. Besides, "STILL" mode need XNR3 to do noise reduction, hence
+"STILL" mode will need more power and memory bandwidth than "VIDEO" mode. TNR
+will be enabled in "VIDEO" mode and bypassed by "STILL" mode. ImgU is running at
+“VIDEO” mode by default, the user can use v4l2 control V4L2_CID_INTEL_IPU3_MODE
+(currently defined in drivers/staging/media/ipu3/include/intel-ipu3.h) to query
+and set the running mode. For user, there is no difference for buffer queueing
+between the "VIDEO" and "STILL" mode, mandatory input and main output node
+should be enabled and buffers need be queued, the statistics and the view-finder
+queues are optional.
+
+The firmware binary will be selected according to current running mode, such log
+"using binary if_to_osys_striped " or "using binary if_to_osys_primary_striped"
+could be observed if you enable the ImgU dynamic debug, the binary
+if_to_osys_striped is selected for "VIDEO" and the binary
+"if_to_osys_primary_striped" is selected for "STILL".
+
+
+Processing the image in raw Bayer format
+----------------------------------------
+
+Configuring ImgU V4L2 subdev for image processing
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ImgU V4L2 subdevs have to be configured with media controller APIs to have
+all the video nodes setup correctly.
+
+Let us take "ipu3-imgu 0" subdev as an example.
+
+media-ctl -d $MDEV -r
+
+media-ctl -d $MDEV -l "ipu3-imgu 0 input":0 -> "ipu3-imgu 0":0[1]
+
+media-ctl -d $MDEV -l "ipu3-imgu 0":2 -> "ipu3-imgu 0 output":0[1]
+
+media-ctl -d $MDEV -l "ipu3-imgu 0":3 -> "ipu3-imgu 0 viewfinder":0[1]
+
+media-ctl -d $MDEV -l "ipu3-imgu 0":4 -> "ipu3-imgu 0 3a stat":0[1]
+
+Also the pipe mode of the corresponding V4L2 subdev should be set as desired
+(e.g 0 for video mode or 1 for still mode) through the control id 0x009819a1 as
+below.
+
+yavta -w "0x009819A1 1" /dev/v4l-subdev7
+
+Certain hardware blocks in ImgU pipeline can change the frame resolution by
+cropping or scaling, these hardware blocks include Input Feeder(IF), Bayer Down
+Scaler (BDS) and Geometric Distortion Correction (GDC).
+There is also a block which can change the frame resolution - YUV Scaler, it is
+only applicable to the secondary output.
+
+RAW Bayer frames go through these ImgU pipeline hardware blocks and the final
+processed image output to the DDR memory.
+
+.. kernel-figure:: ipu3_rcb.svg
+ :alt: ipu3 resolution blocks image
+
+ IPU3 resolution change hardware blocks
+
+**Input Feeder**
+
+Input Feeder gets the Bayer frame data from the sensor, it can enable cropping
+of lines and columns from the frame and then store pixels into device's internal
+pixel buffer which are ready to readout by following blocks.
+
+**Bayer Down Scaler**
+
+Bayer Down Scaler is capable of performing image scaling in Bayer domain, the
+downscale factor can be configured from 1X to 1/4X in each axis with
+configuration steps of 0.03125 (1/32).
+
+**Geometric Distortion Correction**
+
+Geometric Distortion Correction is used to performe correction of distortions
+and image filtering. It needs some extra filter and envelop padding pixels to
+work, so the input resolution of GDC should be larger than the output
+resolution.
+
+**YUV Scaler**
+
+YUV Scaler which similar with BDS, but it is mainly do image down scaling in
+YUV domain, it can support up to 1/12X down scaling, but it can not be applied
+to the main output.
+
+The ImgU V4L2 subdev has to be configured with the supported resolutions in all
+the above hardware blocks, for a given input resolution.
+For a given supported resolution for an input frame, the Input Feeder, Bayer
+Down Scaler and GDC blocks should be configured with the supported resolutions
+as each hardware block has its own alignment requirement.
+
+You must configure the output resolution of the hardware blocks smartly to meet
+the hardware requirement along with keeping the maximum field of view. The
+intermediate resolutions can be generated by specific tool -
+
+https://github.com/intel/intel-ipu3-pipecfg
+
+This tool can be used to generate intermediate resolutions. More information can
+be obtained by looking at the following IPU3 ImgU configuration table.
+
+https://chromium.googlesource.com/chromiumos/overlays/board-overlays/+/master
+
+Under baseboard-poppy/media-libs/cros-camera-hal-configs-poppy/files/gcss
+directory, graph_settings_ov5670.xml can be used as an example.
+
+The following steps prepare the ImgU pipeline for the image processing.
+
+1. The ImgU V4L2 subdev data format should be set by using the
+VIDIOC_SUBDEV_S_FMT on pad 0, using the GDC width and height obtained above.
+
+2. The ImgU V4L2 subdev cropping should be set by using the
+VIDIOC_SUBDEV_S_SELECTION on pad 0, with V4L2_SEL_TGT_CROP as the target,
+using the input feeder height and width.
+
+3. The ImgU V4L2 subdev composing should be set by using the
+VIDIOC_SUBDEV_S_SELECTION on pad 0, with V4L2_SEL_TGT_COMPOSE as the target,
+using the BDS height and width.
+
+For the ov5670 example, for an input frame with a resolution of 2592x1944
+(which is input to the ImgU subdev pad 0), the corresponding resolutions
+for input feeder, BDS and GDC are 2592x1944, 2592x1944 and 2560x1920
+respectively.
+
+Once this is done, the received raw Bayer frames can be input to the ImgU
+V4L2 subdev as below, using the open source application v4l2n [#f1]_.
+
+For an image captured with 2592x1944 [#f4]_ resolution, with desired output
+resolution as 2560x1920 and viewfinder resolution as 2560x1920, the following
+v4l2n command can be used. This helps process the raw Bayer frames and produces
+the desired results for the main output image and the viewfinder output, in NV12
+format.
+
+v4l2n --pipe=4 --load=/tmp/frame-#.bin --open=/dev/video4
+--fmt=type:VIDEO_OUTPUT_MPLANE,width=2592,height=1944,pixelformat=0X47337069
+--reqbufs=type:VIDEO_OUTPUT_MPLANE,count:1 --pipe=1 --output=/tmp/frames.out
+--open=/dev/video5
+--fmt=type:VIDEO_CAPTURE_MPLANE,width=2560,height=1920,pixelformat=NV12
+--reqbufs=type:VIDEO_CAPTURE_MPLANE,count:1 --pipe=2 --output=/tmp/frames.vf
+--open=/dev/video6
+--fmt=type:VIDEO_CAPTURE_MPLANE,width=2560,height=1920,pixelformat=NV12
+--reqbufs=type:VIDEO_CAPTURE_MPLANE,count:1 --pipe=3 --open=/dev/video7
+--output=/tmp/frames.3A --fmt=type:META_CAPTURE,?
+--reqbufs=count:1,type:META_CAPTURE --pipe=1,2,3,4 --stream=5
+
+You can also use yavta [#f2]_ command to do same thing as above:
+
+.. code-block:: none
+
+ yavta --data-prefix -Bcapture-mplane -c10 -n5 -I -s2592x1944 \
+ --file=frame-#.out-f NV12 /dev/video5 & \
+ yavta --data-prefix -Bcapture-mplane -c10 -n5 -I -s2592x1944 \
+ --file=frame-#.vf -f NV12 /dev/video6 & \
+ yavta --data-prefix -Bmeta-capture -c10 -n5 -I \
+ --file=frame-#.3a /dev/video7 & \
+ yavta --data-prefix -Boutput-mplane -c10 -n5 -I -s2592x1944 \
+ --file=/tmp/frame-in.cio2 -f IPU3_SGRBG10 /dev/video4
+
+where /dev/video4, /dev/video5, /dev/video6 and /dev/video7 devices point to
+input, output, viewfinder and 3A statistics video nodes respectively.
+
+Converting the raw Bayer image into YUV domain
+----------------------------------------------
+
+The processed images after the above step, can be converted to YUV domain
+as below.
+
+Main output frames
+~~~~~~~~~~~~~~~~~~
+
+raw2pnm -x2560 -y1920 -fNV12 /tmp/frames.out /tmp/frames.out.ppm
+
+where 2560x1920 is output resolution, NV12 is the video format, followed
+by input frame and output PNM file.
+
+Viewfinder output frames
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+raw2pnm -x2560 -y1920 -fNV12 /tmp/frames.vf /tmp/frames.vf.ppm
+
+where 2560x1920 is output resolution, NV12 is the video format, followed
+by input frame and output PNM file.
+
+Example user space code for IPU3
+================================
+
+User space code that configures and uses IPU3 is available here.
+
+https://chromium.googlesource.com/chromiumos/platform/arc-camera/+/master/
+
+The source can be located under hal/intel directory.
+
+Overview of IPU3 pipeline
+=========================
+
+IPU3 pipeline has a number of image processing stages, each of which takes a
+set of parameters as input. The major stages of pipelines are shown here:
+
+.. kernel-render:: DOT
+ :alt: IPU3 ImgU Pipeline
+ :caption: IPU3 ImgU Pipeline Diagram
+
+ digraph "IPU3 ImgU" {
+ node [shape=box]
+ splines="ortho"
+ rankdir="LR"
+
+ a [label="Raw pixels"]
+ b [label="Bayer Downscaling"]
+ c [label="Optical Black Correction"]
+ d [label="Linearization"]
+ e [label="Lens Shading Correction"]
+ f [label="White Balance / Exposure / Focus Apply"]
+ g [label="Bayer Noise Reduction"]
+ h [label="ANR"]
+ i [label="Demosaicing"]
+ j [label="Color Correction Matrix"]
+ k [label="Gamma correction"]
+ l [label="Color Space Conversion"]
+ m [label="Chroma Down Scaling"]
+ n [label="Chromatic Noise Reduction"]
+ o [label="Total Color Correction"]
+ p [label="XNR3"]
+ q [label="TNR"]
+ r [label="DDR", style=filled, fillcolor=yellow, shape=cylinder]
+ s [label="YUV Downscaling"]
+ t [label="DDR", style=filled, fillcolor=yellow, shape=cylinder]
+
+ { rank=same; a -> b -> c -> d -> e -> f -> g -> h -> i }
+ { rank=same; j -> k -> l -> m -> n -> o -> p -> q -> s -> t}
+
+ a -> j [style=invis, weight=10]
+ i -> j
+ q -> r
+ }
+
+The table below presents a description of the above algorithms.
+
+======================== =======================================================
+Name Description
+======================== =======================================================
+Optical Black Correction Optical Black Correction block subtracts a pre-defined
+ value from the respective pixel values to obtain better
+ image quality.
+ Defined in :c:type:`ipu3_uapi_obgrid_param`.
+Linearization This algo block uses linearization parameters to
+ address non-linearity sensor effects. The Lookup table
+ table is defined in
+ :c:type:`ipu3_uapi_isp_lin_vmem_params`.
+SHD Lens shading correction is used to correct spatial
+ non-uniformity of the pixel response due to optical
+ lens shading. This is done by applying a different gain
+ for each pixel. The gain, black level etc are
+ configured in :c:type:`ipu3_uapi_shd_config_static`.
+BNR Bayer noise reduction block removes image noise by
+ applying a bilateral filter.
+ See :c:type:`ipu3_uapi_bnr_static_config` for details.
+ANR Advanced Noise Reduction is a block based algorithm
+ that performs noise reduction in the Bayer domain. The
+ convolution matrix etc can be found in
+ :c:type:`ipu3_uapi_anr_config`.
+DM Demosaicing converts raw sensor data in Bayer format
+ into RGB (Red, Green, Blue) presentation. Then add
+ outputs of estimation of Y channel for following stream
+ processing by Firmware. The struct is defined as
+ :c:type:`ipu3_uapi_dm_config`.
+Color Correction Color Correction algo transforms sensor specific color
+ space to the standard "sRGB" color space. This is done
+ by applying 3x3 matrix defined in
+ :c:type:`ipu3_uapi_ccm_mat_config`.
+Gamma correction Gamma correction :c:type:`ipu3_uapi_gamma_config` is a
+ basic non-linear tone mapping correction that is
+ applied per pixel for each pixel component.
+CSC Color space conversion transforms each pixel from the
+ RGB primary presentation to YUV (Y: brightness,
+ UV: Luminance) presentation. This is done by applying
+ a 3x3 matrix defined in
+ :c:type:`ipu3_uapi_csc_mat_config`
+CDS Chroma down sampling
+ After the CSC is performed, the Chroma Down Sampling
+ is applied for a UV plane down sampling by a factor
+ of 2 in each direction for YUV 4:2:0 using a 4x2
+ configurable filter :c:type:`ipu3_uapi_cds_params`.
+CHNR Chroma noise reduction
+ This block processes only the chrominance pixels and
+ performs noise reduction by cleaning the high
+ frequency noise.
+ See struct :c:type:`ipu3_uapi_yuvp1_chnr_config`.
+TCC Total color correction as defined in struct
+ :c:type:`ipu3_uapi_yuvp2_tcc_static_config`.
+XNR3 eXtreme Noise Reduction V3 is the third revision of
+ noise reduction algorithm used to improve image
+ quality. This removes the low frequency noise in the
+ captured image. Two related structs are being defined,
+ :c:type:`ipu3_uapi_isp_xnr3_params` for ISP data memory
+ and :c:type:`ipu3_uapi_isp_xnr3_vmem_params` for vector
+ memory.
+TNR Temporal Noise Reduction block compares successive
+ frames in time to remove anomalies / noise in pixel
+ values. :c:type:`ipu3_uapi_isp_tnr3_vmem_params` and
+ :c:type:`ipu3_uapi_isp_tnr3_params` are defined for ISP
+ vector and data memory respectively.
+======================== =======================================================
+
+Other often encountered acronyms not listed in above table:
+
+ ACC
+ Accelerator cluster
+ AWB_FR
+ Auto white balance filter response statistics
+ BDS
+ Bayer downscaler parameters
+ CCM
+ Color correction matrix coefficients
+ IEFd
+ Image enhancement filter directed
+ Obgrid
+ Optical black level compensation
+ OSYS
+ Output system configuration
+ ROI
+ Region of interest
+ YDS
+ Y down sampling
+ YTM
+ Y-tone mapping
+
+A few stages of the pipeline will be executed by firmware running on the ISP
+processor, while many others will use a set of fixed hardware blocks also
+called accelerator cluster (ACC) to crunch pixel data and produce statistics.
+
+ACC parameters of individual algorithms, as defined by
+:c:type:`ipu3_uapi_acc_param`, can be chosen to be applied by the user
+space through struct :c:type:`ipu3_uapi_flags` embedded in
+:c:type:`ipu3_uapi_params` structure. For parameters that are configured as
+not enabled by the user space, the corresponding structs are ignored by the
+driver, in which case the existing configuration of the algorithm will be
+preserved.
+
+References
+==========
+
+.. [#f5] drivers/staging/media/ipu3/include/intel-ipu3.h
+
+.. [#f1] https://github.com/intel/nvt
+
+.. [#f2] http://git.ideasonboard.org/yavta.git
+
+.. [#f3] http://git.ideasonboard.org/?p=media-ctl.git;a=summary
+
+.. [#f4] ImgU limitation requires an additional 16x16 for all input resolutions