linux - Linux Kernel (branches are rebased on master from time to time)

Age	Commit message (Collapse)	Author	Files	Lines
2015-04-24	crypto: x86/sha512_ssse3 - fixup for asm function prototype change	Ard Biesheuvel	1	-1/+1
	Patch e68410ebf626 ("crypto: x86/sha512_ssse3 - move SHA-384/512 SSSE3 implementation to base layer") changed the prototypes of the core asm SHA-512 implementations so that they are compatible with the prototype used by the base layer. However, in one instance, the register that was used for passing the input buffer was reused as a scratch register later on in the code, and since the input buffer param changed places with the digest param -which needs to be written back before the function returns- this resulted in the scratch register to be dereferenced in a memory write operation, causing a GPF. Fix this by changing the scratch register to use the same register as the input buffer param again. Fixes: e68410ebf626 ("crypto: x86/sha512_ssse3 - move SHA-384/512 SSSE3 implementation to base layer") Reported-By: Bobby Powers <bobbypowers@gmail.com> Tested-By: Bobby Powers <bobbypowers@gmail.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-04-15	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6	Linus Torvalds	22	-567/+335
	Pull crypto update from Herbert Xu: "Here is the crypto update for 4.1: New interfaces: - user-space interface for AEAD - user-space interface for RNG (i.e., pseudo RNG) New hashes: - ARMv8 SHA1/256 - ARMv8 AES - ARMv8 GHASH - ARM assembler and NEON SHA256 - MIPS OCTEON SHA1/256/512 - MIPS img-hash SHA1/256 and MD5 - Power 8 VMX AES/CBC/CTR/GHASH - PPC assembler AES, SHA1/256 and MD5 - Broadcom IPROC RNG driver Cleanups/fixes: - prevent internal helper algos from being exposed to user-space - merge common code from assembly/C SHA implementations - misc fixes" * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (169 commits) crypto: arm - workaround for building with old binutils crypto: arm/sha256 - avoid sha256 code on ARMv7-M crypto: x86/sha512_ssse3 - move SHA-384/512 SSSE3 implementation to base layer crypto: x86/sha256_ssse3 - move SHA-224/256 SSSE3 implementation to base layer crypto: x86/sha1_ssse3 - move SHA-1 SSSE3 implementation to base layer crypto: arm64/sha2-ce - move SHA-224/256 ARMv8 implementation to base layer crypto: arm64/sha1-ce - move SHA-1 ARMv8 implementation to base layer crypto: arm/sha2-ce - move SHA-224/256 ARMv8 implementation to base layer crypto: arm/sha256 - move SHA-224/256 ASM/NEON implementation to base layer crypto: arm/sha1-ce - move SHA-1 ARMv8 implementation to base layer crypto: arm/sha1_neon - move SHA-1 NEON implementation to base layer crypto: arm/sha1 - move SHA-1 ARM asm implementation to base layer crypto: sha512-generic - move to generic glue implementation crypto: sha256-generic - move to generic glue implementation crypto: sha1-generic - move to generic glue implementation crypto: sha512 - implement base layer for SHA-512 crypto: sha256 - implement base layer for SHA-256 crypto: sha1 - implement base layer for SHA-1 crypto: api - remove instance when test failed crypto: api - Move alg ref count init to crypto_check_alg ...
2015-04-10	crypto: x86/sha512_ssse3 - move SHA-384/512 SSSE3 implementation to base layer	Ard Biesheuvel	4	-176/+44
	This removes all the boilerplate from the existing implementation, and replaces it with calls into the base layer. It also changes the prototypes of the core asm functions to be compatible with the base prototype void (sha512_block_fn)(struct sha256_state sst, u8 const src, int blocks) so that they can be passed to the base layer directly. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-04-10	crypto: x86/sha256_ssse3 - move SHA-224/256 SSSE3 implementation to base layer	Ard Biesheuvel	4	-173/+50
	This removes all the boilerplate from the existing implementation, and replaces it with calls into the base layer. It also changes the prototypes of the core asm functions to be compatible with the base prototype void (sha256_block_fn)(struct sha256_state sst, u8 const src, int blocks) so that they can be passed to the base layer directly. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-04-10	crypto: x86/sha1_ssse3 - move SHA-1 SSSE3 implementation to base layer	Ard Biesheuvel	1	-111/+28
	This removes all the boilerplate from the existing implementation, and replaces it with calls into the base layer. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-04-01	x86/asm: Replace "MOVQ $imm, %reg" with MOVL	Denys Vlasenko	2	-3/+3
	There is no reason to use MOVQ to load a non-negative immediate constant value into a 64-bit register. MOVL does the same, since the upper 32 bits are zero-extended by the CPU. This makes the code a bit smaller, while leaving functionality unchanged. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1427821211-25099-8-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-31	crypto: sha-mb - mark Multi buffer SHA1 helper cipher	Stephan Mueller	1	-2/+5
	Flag all Multi buffer SHA1 helper ciphers as internal ciphers to prevent them from being called by normal users. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-03-31	crypto: twofish_avx - mark Twofish AVX helper ciphers	Stephan Mueller	1	-5/+10
	Flag all Twofish AVX helper ciphers as internal ciphers to prevent them from being called by normal users. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-03-31	crypto: serpent_sse2 - mark Serpent SSE2 helper ciphers	Stephan Mueller	1	-5/+10
	Flag all Serpent SSE2 helper ciphers as internal ciphers to prevent them from being called by normal users. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-03-31	crypto: serpent_avx - mark Serpent AVX helper ciphers	Stephan Mueller	1	-5/+10
	Flag all Serpent AVX helper ciphers as internal ciphers to prevent them from being called by normal users. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-03-31	crypto: serpent_avx2 - mark Serpent AVX2 helper ciphers	Stephan Mueller	1	-5/+10
	Flag all Serpent AVX2 helper ciphers as internal ciphers to prevent them from being called by normal users. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-03-31	crypto: cast6_avx - mark CAST6 helper ciphers	Stephan Mueller	1	-5/+10
	Flag all CAST6 helper ciphers as internal ciphers to prevent them from being called by normal users. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-03-31	crypto: camellia_aesni_avx - mark AVX Camellia helper ciphers	Stephan Mueller	1	-5/+10
	Flag all AVX Camellia helper ciphers as internal ciphers to prevent them from being called by normal users. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-03-31	crypto: cast5_avx - mark CAST5 helper ciphers	Stephan Mueller	1	-3/+6
	Flag all CAST5 helper ciphers as internal ciphers to prevent them from being called by normal users. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-03-31	crypto: camellia_aesni_avx2 - mark AES-NI Camellia helper ciphers	Stephan Mueller	1	-5/+10
	Flag all AES-NI Camellia helper ciphers as internal ciphers to prevent them from being called by normal users. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-03-31	crypto: clmulni - mark ghash clmulni helper ciphers	Stephan Mueller	1	-2/+5
	Flag all ash clmulni helper ciphers as internal ciphers to prevent them from being called by normal users. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-03-31	crypto: aesni - mark AES-NI helper ciphers	Stephan Mueller	1	-8/+15
	Flag all AES-NI helper ciphers as internal ciphers to prevent them from being called by normal users. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-03-16	crypto: sha1-mb - Syntax error	Ameen Ali	1	-1/+1
	fixing a syntax-error . Signed-off-by: Ameen Ali <AmeenAli023@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-03-13	crypto: don't export static symbol	Julia Lawall	1	-1/+0
	The semantic patch that fixes this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r@ type T; identifier f; @@ static T f (...) { ... } @@ identifier r.f; declarer name EXPORT_SYMBOL_GPL; @@ -EXPORT_SYMBOL_GPL(f); // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-03-13	crypto: aesni - fix memory usage in GCM decryption	Stephan Mueller	1	-2/+2
	The kernel crypto API logic requires the caller to provide the length of (ciphertext \|\| authentication tag) as cryptlen for the AEAD decryption operation. Thus, the cipher implementation must calculate the size of the plaintext output itself and cannot simply use cryptlen. The RFC4106 GCM decryption operation tries to overwrite cryptlen memory in req->dst. As the destination buffer for decryption only needs to hold the plaintext memory but cryptlen references the input buffer holding (ciphertext \|\| authentication tag), the assumption of the destination buffer length in RFC4106 GCM operation leads to a too large size. This patch simply uses the already calculated plaintext size. In addition, this patch fixes the offset calculation of the AAD buffer pointer: as mentioned before, cryptlen already includes the size of the tag. Thus, the tag does not need to be added. With the addition, the AAD will be written beyond the already allocated buffer. Note, this fixes a kernel crash that can be triggered from user space via AF_ALG(aead) -- simply use the libkcapi test application from [1] and update it to use rfc4106-gcm-aes. Using [1], the changes were tested using CAVS vectors to demonstrate that the crypto operation still delivers the right results. [1] http://www.chronox.de/libkcapi.html CC: Tadeusz Struk <tadeusz.struk@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-02-28	crypto: aesni - make driver-gcm-aes-aesni helper a proper aead alg	Tadeusz Struk	1	-54/+110
	Changed the __driver-gcm-aes-aesni to be a proper aead algorithm. This required a valid setkey and setauthsize functions to be added and also some changes to make sure that math context is not corrupted when the alg is used directly. Note that the __driver-gcm-aes-aesni should not be used directly by modules that can use it in interrupt context as we don't have a good fallback mechanism in this case. Signed-off-by: Adrian Hoban <adrian.hoban@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-02-27	crypto: sha-mb - Fix big integer constant sparse warning	Lad, Prabhakar	1	-1/+1
	this patch fixes following sparse warning: sha1_mb_mgr_init_avx2.c:59:31: warning: constant 0xF76543210 is so big it is long Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-02-14	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6	Linus Torvalds	3	-174/+205
	Pull crypto update from Herbert Xu: "Here is the crypto update for 3.20: - Added 192/256-bit key support to aesni GCM. - Added MIPS OCTEON MD5 support. - Fixed hwrng starvation and race conditions. - Added note that memzero_explicit is not a subsitute for memset. - Added user-space interface for crypto_rng. - Misc fixes" * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (71 commits) crypto: tcrypt - do not allocate iv on stack for aead speed tests crypto: testmgr - limit IV copy length in aead tests crypto: tcrypt - fix buflen reminder calculation crypto: testmgr - mark rfc4106(gcm(aes)) as fips_allowed crypto: caam - fix resource clean-up on error path for caam_jr_init crypto: caam - pair irq map and dispose in the same function crypto: ccp - terminate ccp_support array with empty element crypto: caam - remove unused local variable crypto: caam - remove dead code crypto: caam - don't emit ICV check failures to dmesg hwrng: virtio - drop extra empty line crypto: replace scatterwalk_sg_next with sg_next crypto: atmel - Free memory in error path crypto: doc - remove colons in comments crypto: seqiv - Ensure that IV size is at least 8 bytes crypto: cts - Weed out non-CBC algorithms MAINTAINERS: add linux-crypto to hw random crypto: cts - Remove bogus use of seqiv crypto: qat - don't need qat_auth_state struct crypto: algif_rng - fix sparse non static symbol warning ...
2015-01-14	crypto: aesni - Add support for 192 & 256 bit keys to AESNI RFC4106	Timothy McCaffrey	2	-172/+205
	These patches fix the RFC4106 implementation in the aesni-intel module so it supports 192 & 256 bit keys. Since the AVX support that was added to this module also only supports 128 bit keys, and this patch only affects the SSE implementation, changes were also made to use the SSE version if key sizes other than 128 are specified. RFC4106 specifies that 192 & 256 bit keys must be supported (section 8.4). Also, this should fix Strongswan issue 341 where the aesni module needs to be unloaded if 256 bit keys are used: http://wiki.strongswan.org/issues/341 This patch has been tested with Sandy Bridge and Haswell processors. With 128 bit keys and input buffers > 512 bytes a slight performance degradation was noticed (~1%). For input buffers of less than 512 bytes there was no performance impact. Compared to 128 bit keys, 256 bit key size performance is approx. .5 cycles per byte slower on Sandy Bridge, and .37 cycles per byte slower on Haswell (vs. SSE code). This patch has also been tested with StrongSwan IPSec connections where it worked correctly. I created this diff from a git clone of crypto-2.6.git. Any questions, please feel free to contact me. Signed-off-by: Timothy McCaffrey <timothy.mccaffrey@unisys.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-01-13	crypto: x86/des3_ede - drop bogus module aliases	Mathias Krause	1	-2/+0
	This module implements variations of "des3_ede" only. Drop the bogus module aliases for "des". Cc: Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-01-13	crypto: add missing crypto module aliases	Mathias Krause	1	-1/+1
	Commit 5d26a105b5a7 ("crypto: prefix module autoloading with "crypto-"") changed the automatic module loading when requesting crypto algorithms to prefix all module requests with "crypto-". This requires all crypto modules to have a crypto specific module alias even if their file name would otherwise match the requested crypto algorithm. Even though commit 5d26a105b5a7 added those aliases for a vast amount of modules, it was missing a few. Add the required MODULE_ALIAS_CRYPTO annotations to those files to make them get loaded automatically, again. This fixes, e.g., requesting 'ecb(blowfish-generic)', which used to work with kernels v3.18 and below. Also change MODULE_ALIAS() lines to MODULE_ALIAS_CRYPTO(). The former won't work for crypto modules any more. Fixes: 5d26a105b5a7 ("crypto: prefix module autoloading with "crypto-"") Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-01-05	crypto: sha-mb - Add avx2_supported check.	Vinson Lee	1	-1/+1
	This patch fixes this allyesconfig target build error with older binutils. LD arch/x86/crypto/built-in.o ld: arch/x86/crypto/sha-mb/built-in.o: No such file: No such file or directory Cc: stable@vger.kernel.org # 3.18+ Signed-off-by: Vinson Lee <vlee@twitter.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-01-05	crypto: aesni - fix "by8" variant for 128 bit keys	Mathias Krause	1	-11/+35
	The "by8" counter mode optimization is broken for 128 bit keys with input data longer than 128 bytes. It uses the wrong key material for en- and decryption. The key registers xkey0, xkey4, xkey8 and xkey12 need to be preserved in case we're handling more than 128 bytes of input data -- they won't get reloaded after the initial load. They must therefore be (a) loaded on the first iteration and (b) be preserved for the latter ones. The implementation for 128 bit keys does not comply with (a) nor (b). Fix this by bringing the implementation back to its original source and correctly load the key registers and preserve their values by not re-using the registers for other purposes. Kudos to James for reporting the issue and providing a test case showing the discrepancies. Reported-by: James Yonan <james@openvpn.net> Cc: Chandramouli Narayanan <mouli@linux.intel.com> Cc: <stable@vger.kernel.org> # v3.18 Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-12-02	crypto: sha - replace memset by memzero_explicit	Julia Lawall	2	-2/+2
	Memset on a local variable may be removed when it is called just before the variable goes out of scope. Using memzero_explicit defeats this optimization. A simplified version of the semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ identifier x; type T; @@ { ... when any T x[...]; ... when any when exists - memset + memzero_explicit (x, -0, ...) ... when != x when strict } // </smpl> This change was suggested by Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-11-26	crypto: include crypto- module prefix in template	Kees Cook	1	-0/+3
	This adds the module loading prefix "crypto-" to the template lookup as well. For example, attempting to load 'vfat(blowfish)' via AF_ALG now correctly includes the "crypto-" prefix at every level, correctly rejecting "vfat": net-pf-38 algif-hash crypto-vfat(blowfish) crypto-vfat(blowfish)-all crypto-vfat Reported-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-11-25	crypto: sha-mb - remove a bogus NULL check	Dan Carpenter	1	-2/+1
	This can't be NULL and we dereferenced it earlier. Smatch used to ignore these things where the pointer was obviously non-NULL but I've found that sometimes the intention was to check something else so we were maybe missing bugs. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-11-24	crypto: prefix module autoloading with "crypto-"	Kees Cook	23	-40/+40
	This prefixes all crypto module loading with "crypto-" so we never run the risk of exposing module auto-loading to userspace via a crypto API, as demonstrated by Mathias Krause: https://lkml.org/lkml/2013/3/4/70 Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-11-06	crypto: aesni - remove unnecessary #define	Valentin Rothberg	1	-6/+2
	The CPP identifier 'HAS_PCBC' is defined when the Kconfig option CRYPTO_PCBC is set as 'y' or 'm', and is further used in two ifdef blocks to conditionally compile source code. This indirection hides the actual Kconfig dependency and complicates readability. Moreover, it's inconsistent with the rest of the ifdef blocks in the file, which directly reference Kconfig options. This patch removes 'HAS_PCBC' and replaces its occurrences with the actual dependency on 'CRYPTO_PCBC' being set as 'y' or 'm'. Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-10-02	Revert "crypto: aesni - disable "by8" AVX CTR optimization"	Mathias Krause	1	-2/+2
	This reverts commit 7da4b29d496b1389d3a29b55d3668efecaa08ebd. Now, that the issue is fixed, we can re-enable the code. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Chandramouli Narayanan <mouli@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-10-02	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6	Herbert Xu	1	-2/+2
	Merging the crypto tree for 3.17 to pull in the "by8" AVX CTR revert.
2014-10-02	crypto: aesni - remove unused defines in "by8" variant	Mathias Krause	1	-3/+0
	The defines for xkey3, xkey6 and xkey9 are not used in the code. They're probably left overs from merging the three source files for 128, 192 and 256 bit AES. They can safely be removed. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Chandramouli Narayanan <mouli@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-10-02	crypto: aesni - fix counter overflow handling in "by8" variant	Mathias Krause	1	-2/+15
	The "by8" CTR AVX implementation fails to propperly handle counter overflows. That was the reason it got disabled in commit 7da4b29d496b ("crypto: aesni - disable "by8" AVX CTR optimization"). Fix the overflow handling by incrementing the counter block as a double quad word, i.e. a 128 bit, and testing for overflows afterwards. We need to use VPTEST to do so as VPADD* does not set the flags itself and silently drops the carry bit. As this change adds branches to the hot path, minor performance regressions might be a side effect. But, OTOH, we now have a conforming implementation -- the preferable goal. A tcrypt test on a SandyBridge system (i7-2620M) showed almost identical numbers for the old and this version with differences within the noise range. A dm-crypt test with the fixed version gave even slightly better results for this version. So the performance impact might not be as big as expected. Tested-by: Romain Francoise <romain@orebokech.com> Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Chandramouli Narayanan <mouli@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-09-24	crypto: aesni - disable "by8" AVX CTR optimization	Mathias Krause	1	-2/+2
	The "by8" implementation introduced in commit 22cddcc7df8f ("crypto: aes - AES CTR x86_64 "by8" AVX optimization") is failing crypto tests as it handles counter block overflows differently. It only accounts the right most 32 bit as a counter -- not the whole block as all other implementations do. This makes it fail the cryptomgr test #4 that specifically tests this corner case. As we're quite late in the release cycle, just disable the "by8" variant for now. Reported-by: Romain Francoise <romain@orebokech.com> Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Chandramouli Narayanan <mouli@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-08-26	crypto: sha-mb - sha1_mb_alg_state can be static	Fengguang Wu	1	-11/+11
	CC: Tim Chen <tim.c.chen@linux.intel.com> CC: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-08-25	crypto: sha-mb - SHA1 multibuffer job manager and glue code	Tim Chen	3	-0/+947
	This patch introduces the multi-buffer job manager which is responsible for submitting scatter-gather buffers from several SHA1 jobs to the multi-buffer algorithm. It also contains the flush routine to that's called by the crypto daemon to complete the job when no new jobs arrive before the deadline of maximum latency of a SHA1 crypto job. The SHA1 multi-buffer crypto algorithm is defined and initialized in this patch. Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-08-25	crypto: sha-mb - SHA1 multibuffer crypto computation (x8 AVX2)	Tim Chen	1	-0/+472
	This patch introduces the assembly routines to do SHA1 computation on buffers belonging to serveral jobs at once. The assembly routines are optimized with AVX2 instructions that have 8 data lanes and using AVX2 registers. Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-08-25	crypto: sha-mb - SHA1 multibuffer submit and flush routines for AVX2	Tim Chen	3	-0/+619
	This patch introduces the routines used to submit and flush buffers belonging to SHA1 crypto jobs to the SHA1 multibuffer algorithm. It is implemented mostly in assembly optimized with AVX2 instructions. Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-08-25	crypto: sha-mb - SHA1 multibuffer algorithm data structures	Tim Chen	3	-0/+533
	This patch introduces the data structures and prototypes of functions needed for computing SHA1 hash using multi-buffer. Included are the structures of the multi-buffer SHA1 job, job scheduler in C and x86 assembly. Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-08-04	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6	Linus Torvalds	6	-145/+2040
	Pull crypto update from Herbert Xu: - CTR(AES) optimisation on x86_64 using "by8" AVX. - arm64 support to ccp - Intel QAT crypto driver - Qualcomm crypto engine driver - x86-64 assembly optimisation for 3DES - CTR(3DES) speed test - move FIPS panic from module.c so that it only triggers on crypto modules - SP800-90A Deterministic Random Bit Generator (drbg). - more test vectors for ghash. - tweak self tests to catch partial block bugs. - misc fixes. * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (94 commits) crypto: drbg - fix failure of generating multiple of 2**16 bytes crypto: ccp - Do not sign extend input data to CCP crypto: testmgr - add missing spaces to drbg error strings crypto: atmel-tdes - Switch to managed version of kzalloc crypto: atmel-sha - Switch to managed version of kzalloc crypto: testmgr - use chunks smaller than algo block size in chunk tests crypto: qat - Fixed SKU1 dev issue crypto: qat - Use hweight for bit counting crypto: qat - Updated print outputs crypto: qat - change ae_num to ae_id crypto: qat - change slice->regions to slice->region crypto: qat - use min_t macro crypto: qat - remove unnecessary parentheses crypto: qat - remove unneeded header crypto: qat - checkpatch blank lines crypto: qat - remove unnecessary return codes crypto: Resolve shadow warnings crypto: ccp - Remove "select OF" from Kconfig crypto: caam - fix DECO RSR polling crypto: qce - Let 'DEV_QCE' depend on both HAS_DMA and HAS_IOMEM ...
2014-06-25	crypto: sha512_ssse3 - fix byte count to bit count conversion	Jussi Kivilinna	1	-1/+1
	Byte-to-bit-count computation is only partly converted to big-endian and is mixing in CPU-endian values. Problem was noticed by sparce with warning: CHECK arch/x86/crypto/sha512_ssse3_glue.c arch/x86/crypto/sha512_ssse3_glue.c:144:19: warning: restricted __be64 degrades to integer arch/x86/crypto/sha512_ssse3_glue.c:144:17: warning: incorrect type in assignment (different base types) arch/x86/crypto/sha512_ssse3_glue.c:144:17: expected restricted __be64 <noident> arch/x86/crypto/sha512_ssse3_glue.c:144:17: got unsigned long long Cc: <stable@vger.kernel.org> Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi> Acked-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-25	crypto: des3_ede-x86_64 - fix parse warning	Jussi Kivilinna	1	-2/+2
	Patch fixes following sparse warning: CHECK arch/x86/crypto/des3_ede_glue.c arch/x86/crypto/des3_ede_glue.c:308:52: warning: restricted __be64 degrades to integer arch/x86/crypto/des3_ede_glue.c:309:52: warning: restricted __be64 degrades to integer arch/x86/crypto/des3_ede_glue.c:310:52: warning: restricted __be64 degrades to integer arch/x86/crypto/des3_ede_glue.c:326:44: warning: restricted __be64 degrades to integer Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20	crypto: aes - AES CTR x86_64 "by8" AVX optimization	chandramouli narayanan	3	-3/+585
	This patch introduces "by8" AES CTR mode AVX optimization inspired by Intel Optimized IPSEC Cryptograhpic library. For additional information, please see: http://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&DwnldID=22972 The functions aes_ctr_enc_128_avx_by8(), aes_ctr_enc_192_avx_by8() and aes_ctr_enc_256_avx_by8() are adapted from Intel Optimized IPSEC Cryptographic library. When both AES and AVX features are enabled in a platform, the glue code in AESNI module overrieds the existing "by4" CTR mode en/decryption with the "by8" AES CTR mode en/decryption. On a Haswell desktop, with turbo disabled and all cpus running at maximum frequency, the "by8" CTR mode optimization shows better performance results across data & key sizes as measured by tcrypt. The average performance improvement of the "by8" version over the "by4" version is as follows: For 128 bit key and data sizes >= 256 bytes, there is a 10-16% improvement. For 192 bit key and data sizes >= 256 bytes, there is a 20-22% improvement. For 256 bit key and data sizes >= 256 bytes, there is a 20-25% improvement. A typical run of tcrypt with AES CTR mode encryption of the "by4" and "by8" optimization shows the following results: tcrypt with "by4" AES CTR mode encryption optimization on a Haswell Desktop: --------------------------------------------------------------------------- testing speed of __ctr-aes-aesni encryption test 0 (128 bit key, 16 byte blocks): 1 operation in 343 cycles (16 bytes) test 1 (128 bit key, 64 byte blocks): 1 operation in 336 cycles (64 bytes) test 2 (128 bit key, 256 byte blocks): 1 operation in 491 cycles (256 bytes) test 3 (128 bit key, 1024 byte blocks): 1 operation in 1130 cycles (1024 bytes) test 4 (128 bit key, 8192 byte blocks): 1 operation in 7309 cycles (8192 bytes) test 5 (192 bit key, 16 byte blocks): 1 operation in 346 cycles (16 bytes) test 6 (192 bit key, 64 byte blocks): 1 operation in 361 cycles (64 bytes) test 7 (192 bit key, 256 byte blocks): 1 operation in 543 cycles (256 bytes) test 8 (192 bit key, 1024 byte blocks): 1 operation in 1321 cycles (1024 bytes) test 9 (192 bit key, 8192 byte blocks): 1 operation in 9649 cycles (8192 bytes) test 10 (256 bit key, 16 byte blocks): 1 operation in 369 cycles (16 bytes) test 11 (256 bit key, 64 byte blocks): 1 operation in 366 cycles (64 bytes) test 12 (256 bit key, 256 byte blocks): 1 operation in 595 cycles (256 bytes) test 13 (256 bit key, 1024 byte blocks): 1 operation in 1531 cycles (1024 bytes) test 14 (256 bit key, 8192 byte blocks): 1 operation in 10522 cycles (8192 bytes) testing speed of __ctr-aes-aesni decryption test 0 (128 bit key, 16 byte blocks): 1 operation in 336 cycles (16 bytes) test 1 (128 bit key, 64 byte blocks): 1 operation in 350 cycles (64 bytes) test 2 (128 bit key, 256 byte blocks): 1 operation in 487 cycles (256 bytes) test 3 (128 bit key, 1024 byte blocks): 1 operation in 1129 cycles (1024 bytes) test 4 (128 bit key, 8192 byte blocks): 1 operation in 7287 cycles (8192 bytes) test 5 (192 bit key, 16 byte blocks): 1 operation in 350 cycles (16 bytes) test 6 (192 bit key, 64 byte blocks): 1 operation in 359 cycles (64 bytes) test 7 (192 bit key, 256 byte blocks): 1 operation in 635 cycles (256 bytes) test 8 (192 bit key, 1024 byte blocks): 1 operation in 1324 cycles (1024 bytes) test 9 (192 bit key, 8192 byte blocks): 1 operation in 9595 cycles (8192 bytes) test 10 (256 bit key, 16 byte blocks): 1 operation in 364 cycles (16 bytes) test 11 (256 bit key, 64 byte blocks): 1 operation in 377 cycles (64 bytes) test 12 (256 bit key, 256 byte blocks): 1 operation in 604 cycles (256 bytes) test 13 (256 bit key, 1024 byte blocks): 1 operation in 1527 cycles (1024 bytes) test 14 (256 bit key, 8192 byte blocks): 1 operation in 10549 cycles (8192 bytes) tcrypt with "by8" AES CTR mode encryption optimization on a Haswell Desktop: --------------------------------------------------------------------------- testing speed of __ctr-aes-aesni encryption test 0 (128 bit key, 16 byte blocks): 1 operation in 340 cycles (16 bytes) test 1 (128 bit key, 64 byte blocks): 1 operation in 330 cycles (64 bytes) test 2 (128 bit key, 256 byte blocks): 1 operation in 450 cycles (256 bytes) test 3 (128 bit key, 1024 byte blocks): 1 operation in 1043 cycles (1024 bytes) test 4 (128 bit key, 8192 byte blocks): 1 operation in 6597 cycles (8192 bytes) test 5 (192 bit key, 16 byte blocks): 1 operation in 339 cycles (16 bytes) test 6 (192 bit key, 64 byte blocks): 1 operation in 352 cycles (64 bytes) test 7 (192 bit key, 256 byte blocks): 1 operation in 539 cycles (256 bytes) test 8 (192 bit key, 1024 byte blocks): 1 operation in 1153 cycles (1024 bytes) test 9 (192 bit key, 8192 byte blocks): 1 operation in 8458 cycles (8192 bytes) test 10 (256 bit key, 16 byte blocks): 1 operation in 353 cycles (16 bytes) test 11 (256 bit key, 64 byte blocks): 1 operation in 360 cycles (64 bytes) test 12 (256 bit key, 256 byte blocks): 1 operation in 512 cycles (256 bytes) test 13 (256 bit key, 1024 byte blocks): 1 operation in 1277 cycles (1024 bytes) test 14 (256 bit key, 8192 byte blocks): 1 operation in 8745 cycles (8192 bytes) testing speed of __ctr-aes-aesni decryption test 0 (128 bit key, 16 byte blocks): 1 operation in 348 cycles (16 bytes) test 1 (128 bit key, 64 byte blocks): 1 operation in 335 cycles (64 bytes) test 2 (128 bit key, 256 byte blocks): 1 operation in 451 cycles (256 bytes) test 3 (128 bit key, 1024 byte blocks): 1 operation in 1030 cycles (1024 bytes) test 4 (128 bit key, 8192 byte blocks): 1 operation in 6611 cycles (8192 bytes) test 5 (192 bit key, 16 byte blocks): 1 operation in 354 cycles (16 bytes) test 6 (192 bit key, 64 byte blocks): 1 operation in 346 cycles (64 bytes) test 7 (192 bit key, 256 byte blocks): 1 operation in 488 cycles (256 bytes) test 8 (192 bit key, 1024 byte blocks): 1 operation in 1154 cycles (1024 bytes) test 9 (192 bit key, 8192 byte blocks): 1 operation in 8390 cycles (8192 bytes) test 10 (256 bit key, 16 byte blocks): 1 operation in 357 cycles (16 bytes) test 11 (256 bit key, 64 byte blocks): 1 operation in 362 cycles (64 bytes) test 12 (256 bit key, 256 byte blocks): 1 operation in 515 cycles (256 bytes) test 13 (256 bit key, 1024 byte blocks): 1 operation in 1284 cycles (1024 bytes) test 14 (256 bit key, 8192 byte blocks): 1 operation in 8681 cycles (8192 bytes) crypto: Incorporate feed back to AES CTR mode optimization patch Specifically, the following: a) alignment around main loop in aes_ctrby8_avx_x86_64.S b) .rodata around data constants used in the assembely code. c) the use of CONFIG_AVX in the glue code. d) fix up white space. e) informational message for "by8" AES CTR mode optimization f) "by8" AES CTR mode optimization can be simply enabled if the platform supports both AES and AVX features. The optimization works superbly on Sandybridge as well. Testing on Haswell shows no performance change since the last. Testing on Sandybridge shows that the "by8" AES CTR mode optimization greatly improves performance. tcrypt log with "by4" AES CTR mode optimization on Sandybridge -------------------------------------------------------------- testing speed of __ctr-aes-aesni encryption test 0 (128 bit key, 16 byte blocks): 1 operation in 383 cycles (16 bytes) test 1 (128 bit key, 64 byte blocks): 1 operation in 408 cycles (64 bytes) test 2 (128 bit key, 256 byte blocks): 1 operation in 707 cycles (256 bytes) test 3 (128 bit key, 1024 byte blocks): 1 operation in 1864 cycles (1024 bytes) test 4 (128 bit key, 8192 byte blocks): 1 operation in 12813 cycles (8192 bytes) test 5 (192 bit key, 16 byte blocks): 1 operation in 395 cycles (16 bytes) test 6 (192 bit key, 64 byte blocks): 1 operation in 432 cycles (64 bytes) test 7 (192 bit key, 256 byte blocks): 1 operation in 780 cycles (256 bytes) test 8 (192 bit key, 1024 byte blocks): 1 operation in 2132 cycles (1024 bytes) test 9 (192 bit key, 8192 byte blocks): 1 operation in 15765 cycles (8192 bytes) test 10 (256 bit key, 16 byte blocks): 1 operation in 416 cycles (16 bytes) test 11 (256 bit key, 64 byte blocks): 1 operation in 438 cycles (64 bytes) test 12 (256 bit key, 256 byte blocks): 1 operation in 842 cycles (256 bytes) test 13 (256 bit key, 1024 byte blocks): 1 operation in 2383 cycles (1024 bytes) test 14 (256 bit key, 8192 byte blocks): 1 operation in 16945 cycles (8192 bytes) testing speed of __ctr-aes-aesni decryption test 0 (128 bit key, 16 byte blocks): 1 operation in 389 cycles (16 bytes) test 1 (128 bit key, 64 byte blocks): 1 operation in 409 cycles (64 bytes) test 2 (128 bit key, 256 byte blocks): 1 operation in 704 cycles (256 bytes) test 3 (128 bit key, 1024 byte blocks): 1 operation in 1865 cycles (1024 bytes) test 4 (128 bit key, 8192 byte blocks): 1 operation in 12783 cycles (8192 bytes) test 5 (192 bit key, 16 byte blocks): 1 operation in 409 cycles (16 bytes) test 6 (192 bit key, 64 byte blocks): 1 operation in 434 cycles (64 bytes) test 7 (192 bit key, 256 byte blocks): 1 operation in 792 cycles (256 bytes) test 8 (192 bit key, 1024 byte blocks): 1 operation in 2151 cycles (1024 bytes) test 9 (192 bit key, 8192 byte blocks): 1 operation in 15804 cycles (8192 bytes) test 10 (256 bit key, 16 byte blocks): 1 operation in 421 cycles (16 bytes) test 11 (256 bit key, 64 byte blocks): 1 operation in 444 cycles (64 bytes) test 12 (256 bit key, 256 byte blocks): 1 operation in 840 cycles (256 bytes) test 13 (256 bit key, 1024 byte blocks): 1 operation in 2394 cycles (1024 bytes) test 14 (256 bit key, 8192 byte blocks): 1 operation in 16928 cycles (8192 bytes) tcrypt log with "by8" AES CTR mode optimization on Sandybridge -------------------------------------------------------------- testing speed of __ctr-aes-aesni encryption test 0 (128 bit key, 16 byte blocks): 1 operation in 383 cycles (16 bytes) test 1 (128 bit key, 64 byte blocks): 1 operation in 401 cycles (64 bytes) test 2 (128 bit key, 256 byte blocks): 1 operation in 522 cycles (256 bytes) test 3 (128 bit key, 1024 byte blocks): 1 operation in 1136 cycles (1024 bytes) test 4 (128 bit key, 8192 byte blocks): 1 operation in 7046 cycles (8192 bytes) test 5 (192 bit key, 16 byte blocks): 1 operation in 394 cycles (16 bytes) test 6 (192 bit key, 64 byte blocks): 1 operation in 418 cycles (64 bytes) test 7 (192 bit key, 256 byte blocks): 1 operation in 559 cycles (256 bytes) test 8 (192 bit key, 1024 byte blocks): 1 operation in 1263 cycles (1024 bytes) test 9 (192 bit key, 8192 byte blocks): 1 operation in 9072 cycles (8192 bytes) test 10 (256 bit key, 16 byte blocks): 1 operation in 408 cycles (16 bytes) test 11 (256 bit key, 64 byte blocks): 1 operation in 428 cycles (64 bytes) test 12 (256 bit key, 256 byte blocks): 1 operation in 595 cycles (256 bytes) test 13 (256 bit key, 1024 byte blocks): 1 operation in 1385 cycles (1024 bytes) test 14 (256 bit key, 8192 byte blocks): 1 operation in 9224 cycles (8192 bytes) testing speed of __ctr-aes-aesni decryption test 0 (128 bit key, 16 byte blocks): 1 operation in 390 cycles (16 bytes) test 1 (128 bit key, 64 byte blocks): 1 operation in 402 cycles (64 bytes) test 2 (128 bit key, 256 byte blocks): 1 operation in 530 cycles (256 bytes) test 3 (128 bit key, 1024 byte blocks): 1 operation in 1135 cycles (1024 bytes) test 4 (128 bit key, 8192 byte blocks): 1 operation in 7079 cycles (8192 bytes) test 5 (192 bit key, 16 byte blocks): 1 operation in 414 cycles (16 bytes) test 6 (192 bit key, 64 byte blocks): 1 operation in 417 cycles (64 bytes) test 7 (192 bit key, 256 byte blocks): 1 operation in 572 cycles (256 bytes) test 8 (192 bit key, 1024 byte blocks): 1 operation in 1312 cycles (1024 bytes) test 9 (192 bit key, 8192 byte blocks): 1 operation in 9073 cycles (8192 bytes) test 10 (256 bit key, 16 byte blocks): 1 operation in 415 cycles (16 bytes) test 11 (256 bit key, 64 byte blocks): 1 operation in 454 cycles (64 bytes) test 12 (256 bit key, 256 byte blocks): 1 operation in 598 cycles (256 bytes) test 13 (256 bit key, 1024 byte blocks): 1 operation in 1407 cycles (1024 bytes) test 14 (256 bit key, 8192 byte blocks): 1 operation in 9288 cycles (8192 bytes) crypto: Fix redundant checks a) Fix the redundant check for cpu_has_aes b) Fix the key length check when invoking the CTR mode "by8" encryptor/decryptor. crypto: fix typo in AES ctr mode transform Signed-off-by: Chandramouli Narayanan <mouli@linux.intel.com> Reviewed-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20	crypto: des_3des - add x86-64 assembly implementation	Jussi Kivilinna	3	-0/+1316
	Patch adds x86_64 assembly implementation of Triple DES EDE cipher algorithm. Two assembly implementations are provided. First is regular 'one-block at time' encrypt/decrypt function. Second is 'three-blocks at time' function that gains performance increase on out-of-order CPUs. tcrypt test results: Intel Core i5-4570: des3_ede-asm vs des3_ede-generic: size ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec 16B 1.21x 1.22x 1.27x 1.36x 1.25x 1.25x 64B 1.98x 1.96x 1.23x 2.04x 2.01x 2.00x 256B 2.34x 2.37x 1.21x 2.40x 2.38x 2.39x 1024B 2.50x 2.47x 1.22x 2.51x 2.52x 2.51x 8192B 2.51x 2.53x 1.21x 2.56x 2.54x 2.55x Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20	crypto: crc32c-pclmul - Shrink K_table to 32-bit words	George Spelvin	1	-142/+139
	There's no need for the K_table to be made of 64-bit words. For some reason, the original authors didn't fully reduce the values modulo the CRC32C polynomial, and so had some 33-bit values in there. They can all be reduced to 32 bits. Doing that cuts the table size in half. Since the code depends on both pclmulq and crc32, SSE 4.1 is obviously present, so we can use pmovzxdq to fetch it in the correct format. This adds (measured on Ivy Bridge) 1 cycle per main loop iteration (CRC of up to 3K bytes), less than 0.2%. The hope is that the reduced D-cache footprint will make up the loss in other code. Two other related fixes: * K_table is read-only, so belongs in .rodata, and * There's no need for more than 8-byte alignment Acked-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: George Spelvin <linux@horizon.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-04-04	crypto: ghash-clmulni-intel - Use u128 instead of be128 for internal key	Herbert Xu	2	-8/+8
	The internal key isn't actually in big-endian format so let's switch to u128 which also happens to allow us to remove a sparse warning. Based on suggestion by Ard Biesheuvel. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>