From wk at gnupg.org  Tue Oct  1 13:31:16 2013
From: wk at gnupg.org (Werner Koch)
Date: Tue, 01 Oct 2013 13:31:16 +0200
Subject: possible mpi-pow improvement
In-Reply-To: <1378456897.3188.14.camel@cfw2.gniibe.org> (NIIBE Yutaka's message of "Fri, 06 Sep 2013 17:41:37 +0900")
References: <1378456897.3188.14.camel@cfw2.gniibe.org>
Message-ID: <87a9itb52z.fsf@vigenere.g10code.de>

On Fri,  6 Sep 2013 10:41, gniibe at fsij.org said:

> ====================== original =====================
> $ ./tests/benchmark rsa
> Algorithm       generate  100*sign  100*verify
> ------------------------------------------------
> RSA 1024 bit       340ms     860ms        30ms
> RSA 2048 bit       870ms    5510ms       110ms
> RSA 3072 bit      6440ms   16930ms       210ms
> RSA 4096 bit     17470ms   37270ms       360ms
>
> My possible change:
>
> ====================== k-ary, MUL instead of SQR =====
> Algorithm       generate  100*sign  100*verify
> ------------------------------------------------
> RSA 1024 bit       280ms     710ms        30ms
> RSA 2048 bit       960ms    4410ms       110ms
> RSA 3072 bit     17680ms   12990ms       220ms
> RSA 4096 bit     12280ms   29550ms       360ms
>
> Any comments are appreciated.

Thus your change is even an improvement for the general case.  Can you
please change your patch to conditionally include the k-ary multiply,
but enable it right away.

Shalom-Salam,

   Werner

--
Die Gedanken sind frei.  Ausnahmen regelt ein Bundesgesetz.


From wk at gnupg.org  Tue Oct  1 13:35:07 2013
From: wk at gnupg.org (Werner Koch)
Date: Tue, 01 Oct 2013 13:35:07 +0200
Subject: GOST ECC pubkey
In-Reply-To: <1379655225.3179.3.camel@cfw2.gniibe.org> (NIIBE Yutaka's message of "Fri, 20 Sep 2013 14:33:45 +0900")
References: <1379653630.3179.2.camel@cfw2.gniibe.org> <1379655225.3179.3.camel@cfw2.gniibe.org>
Message-ID: <8761thb4wk.fsf@vigenere.g10code.de>

On Fri, 20 Sep 2013 07:33, gniibe at fsij.org said:

> I think that it is possible to represent GOST3410 without extending
> the structure ecc_domain_parms_t.  Just redefine "n" as order of
> cyclic subgroup of elliptic curve points group for GOST3410.
Anyone else with comments?  I haven't read the specs; if it is
different from ECDSA we might want to add another signature scheme,
similar to the (flag eddsa).

Salam-Shalom,

   Werner

--
Die Gedanken sind frei.  Ausnahmen regelt ein Bundesgesetz.


From cvs at cvs.gnupg.org  Tue Oct  1 14:34:33 2013
From: cvs at cvs.gnupg.org (by Peter Wu)
Date: Tue, 01 Oct 2013 14:34:33 +0200
Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-283-g738177e
Message-ID:

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "The GNU crypto library".

The branch, master has been updated
       via  738177ec0eae05069ec61bc4f724a69d4e052e42 (commit)
      from  1d85452412b65e7976bc94969fc513ff6b880ed8 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
commit 738177ec0eae05069ec61bc4f724a69d4e052e42
Author: Peter Wu
Date:   Thu Sep 26 23:20:32 2013 +0200

    cipher: Add support for 128-bit keys in RC2

    * cipher/rfc2268.c (oids_rfc2268_128): New.
    (_gcry_cipher_spec_rfc2268_128): New.
    * cipher/cipher.c (cipher_table_entry): Add GCRY_CIPHER_RFC2268_128.
    --
    This patch adds support for decrypting (and encrypting) using 128-bit
    keys using the RC2 algorithm.

    Signed-off-by: Peter Wu

    Actually this is merely enabling that extra ID for 128 bit RFC2268.
    We should have used one id for that algorithm only, because a second
    identifier merely for having the OID in the code is a bad idea.  My
    initial fault, and thus I'd better apply this patch to make the id
    not entirely useless.
-wk diff --git a/cipher/cipher.c b/cipher/cipher.c index a17ca9b..23cb99c 100644 --- a/cipher/cipher.c +++ b/cipher/cipher.c @@ -87,6 +87,8 @@ static struct cipher_table_entry #if USE_RFC2268 { &_gcry_cipher_spec_rfc2268_40, &dummy_extra_spec, GCRY_CIPHER_RFC2268_40 }, + { &_gcry_cipher_spec_rfc2268_128, + &dummy_extra_spec, GCRY_CIPHER_RFC2268_128 }, #endif #if USE_SEED { &_gcry_cipher_spec_seed, diff --git a/cipher/rfc2268.c b/cipher/rfc2268.c index 130be9b..da0b9f4 100644 --- a/cipher/rfc2268.c +++ b/cipher/rfc2268.c @@ -351,8 +351,21 @@ static gcry_cipher_oid_spec_t oids_rfc2268_40[] = { NULL } }; +static gcry_cipher_oid_spec_t oids_rfc2268_128[] = + { + /* pbeWithSHAAnd128BitRC2_CBC */ + { "1.2.840.113549.1.12.1.5", GCRY_CIPHER_MODE_CBC }, + { NULL } + }; + gcry_cipher_spec_t _gcry_cipher_spec_rfc2268_40 = { "RFC2268_40", NULL, oids_rfc2268_40, RFC2268_BLOCKSIZE, 40, sizeof(RFC2268_context), do_setkey, encrypt_block, decrypt_block }; + +gcry_cipher_spec_t _gcry_cipher_spec_rfc2268_128 = { + "RFC2268_128", NULL, oids_rfc2268_128, + RFC2268_BLOCKSIZE, 128, sizeof(RFC2268_context), + do_setkey, encrypt_block, decrypt_block +}; diff --git a/src/cipher.h b/src/cipher.h index ea7a141..70b46fe 100644 --- a/src/cipher.h +++ b/src/cipher.h @@ -194,6 +194,7 @@ extern gcry_cipher_spec_t _gcry_cipher_spec_serpent128; extern gcry_cipher_spec_t _gcry_cipher_spec_serpent192; extern gcry_cipher_spec_t _gcry_cipher_spec_serpent256; extern gcry_cipher_spec_t _gcry_cipher_spec_rfc2268_40; +extern gcry_cipher_spec_t _gcry_cipher_spec_rfc2268_128; extern gcry_cipher_spec_t _gcry_cipher_spec_seed; extern gcry_cipher_spec_t _gcry_cipher_spec_camellia128; extern gcry_cipher_spec_t _gcry_cipher_spec_camellia192; ----------------------------------------------------------------------- Summary of changes: cipher/cipher.c | 2 ++ cipher/rfc2268.c | 13 +++++++++++++ src/cipher.h | 1 + 3 files changed, 16 insertions(+), 0 deletions(-) hooks/post-receive -- The GNU crypto library 
http://git.gnupg.org

_______________________________________________
Gnupg-commits mailing list
Gnupg-commits at gnupg.org
http://lists.gnupg.org/mailman/listinfo/gnupg-commits


From wk at gnupg.org  Tue Oct  1 13:59:08 2013
From: wk at gnupg.org (Werner Koch)
Date: Tue, 01 Oct 2013 13:59:08 +0200
Subject: [PATCH] Add support for 128-bit keys in RC2
In-Reply-To: <1380230432-13431-1-git-send-email-lekensteyn@gmail.com> (Peter Wu's message of "Thu, 26 Sep 2013 23:20:32 +0200")
References: <1380230432-13431-1-git-send-email-lekensteyn@gmail.com>
Message-ID: <87y56d9p83.fsf@vigenere.g10code.de>

On Thu, 26 Sep 2013 23:20, lekensteyn at gmail.com said:

> This patch adds support for decrypting (and encrypting) using 128-bit
> keys using the RC2 algorithm.

Actually our RC2 implementation supports any key size >= 40 bit.  I
can't remember why I came up with the two identifiers

    GCRY_CIPHER_RFC2268_40  = 307,  /* Ron's Cipher 2 (40 bit). */
    GCRY_CIPHER_RFC2268_128 = 308,  /* Ron's Cipher 2 (128 bit). */

and didn't implement the second one.  Actually GCRY_CIPHER_RFC2268
would have been sufficient because the caller may use any keylength
anyway.

I added a Changelog entry and pushed it to master.

Shalom-Salam,

   Werner

--
Die Gedanken sind frei.  Ausnahmen regelt ein Bundesgesetz.
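[Editorial sketch] The 40-bit and 128-bit identifiers discussed above differ only in the effective-key-length parameter T1 of RFC 2268's key expansion. A minimal standalone illustration of how T1 maps to the reduction parameters T8 and TM, as defined in RFC 2268 section 2 (the function name is made up for illustration; this is not libgcrypt's code):

```c
#include <assert.h>

/* RFC 2268, section 2: from the effective key length T1 (in bits),
   derive T8, the number of key bytes the reduction step keeps, and
   TM, the mask applied to the highest of those bytes.  For T1 = 40
   only 5 bytes carry entropy after the reduction; for T1 = 128 the
   reduction keeps all 16 bytes and the mask is 0xff, which is why a
   single RC2 implementation parameterized by T1 can serve both
   GCRY_CIPHER_RFC2268_40 and GCRY_CIPHER_RFC2268_128.  */
static void
rc2_effective_params (int t1, int *t8, int *tm)
{
  *t8 = (t1 + 7) / 8;                        /* T8 = (T1+7)/8          */
  *tm = 255 % (1 << (8 + t1 - 8 * (*t8)));   /* TM = 255 mod 2^(8+T1-8*T8) */
}
```

For example, t1 = 40 gives t8 = 5 and tm = 255, while a non-byte-aligned value such as t1 = 44 gives t8 = 6 and tm = 15.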
From wk at gnupg.org  Tue Oct  1 14:01:11 2013
From: wk at gnupg.org (Werner Koch)
Date: Tue, 01 Oct 2013 14:01:11 +0200
Subject: Comments on the change: Mitigate a flush+reload cache attack on RSA secret exponents
In-Reply-To: <1376008230.3177.2.camel@cfw2.gniibe.org> (NIIBE Yutaka's message of "Fri, 09 Aug 2013 09:30:30 +0900")
References: <1375939200.3172.6.camel@cfw2.gniibe.org> <871u64iqg7.fsf@vigenere.g10code.de> <1375949109.3172.12.camel@cfw2.gniibe.org> <1376008230.3177.2.camel@cfw2.gniibe.org>
Message-ID: <87r4c59p4o.fsf@vigenere.g10code.de>

On Fri,  9 Aug 2013 02:30, gniibe at fsij.org said:

> Given the situation, my opinion is that, it's not good idea, for now,
> to share some useful information with git notes (for libgcrypt, gnupg,
> etc.).

Okay, let's forget about this.  Thanks for looking into this.

Salam-Shalom,

   Werner

--
Die Gedanken sind frei.  Ausnahmen regelt ein Bundesgesetz.


From nmav at gnutls.org  Tue Oct  1 15:24:22 2013
From: nmav at gnutls.org (Nikos Mavrogiannopoulos)
Date: Tue, 01 Oct 2013 15:24:22 +0200
Subject: [PATCH] Add support for 128-bit keys in RC2
In-Reply-To: <87y56d9p83.fsf@vigenere.g10code.de>
References: <1380230432-13431-1-git-send-email-lekensteyn@gmail.com> <87y56d9p83.fsf@vigenere.g10code.de>
Message-ID: <524ACD06.5020000@gnutls.org>

On 10/01/2013 01:59 PM, Werner Koch wrote:
> On Thu, 26 Sep 2013 23:20, lekensteyn at gmail.com said:
>> This patch adds support for decrypting (and encrypting) using 128-bit
>> keys using the RC2 algorithm.
>
> Actually our RC2 implementation supports any key size >= 40 bit.  I
> can't remember why I came up with the two identifiers

It must have been because of the effective key size reduction code used
in the 40 bit version.  Since RC2-40 is the only version used in
PKCS #12, there may have been no incentive to have the "real" RC2.
regards,
Nikos


From lekensteyn at gmail.com  Tue Oct  1 15:21:09 2013
From: lekensteyn at gmail.com (Peter Wu)
Date: Tue, 01 Oct 2013 15:21:09 +0200
Subject: ../../src/visibility.c:498:3: warning: implicit declaration of function `_gcry_mpi_ec_new' [-Wimplicit-function-declaration]
Message-ID: <22656519.CJENjD6aAd@al>

Hi Werner,

While building the latest git master (libgcrypt-1.5.0-283-g738177e), I
got a warning about an implicit declaration of a function.  As far as I
can see, it was caused by:

    commit 64a7d347847d606eb5f4c156e24ba060271b8f6b
    Author: Werner Koch
    Date:   Sat Sep 7 10:06:46 2013 +0200

        ecc: Refactor low-level access functions.

The declaration was previously done in src/mpi.h which is included by
src/visibility.c, but now ec-context.h is not included by any
visibility-related file.

Regards,
Peter


From lekensteyn at gmail.com  Tue Oct  1 14:59:21 2013
From: lekensteyn at gmail.com (Peter Wu)
Date: Tue, 01 Oct 2013 14:59:21 +0200
Subject: [PATCH] Add support for 128-bit keys in RC2
In-Reply-To: <87y56d9p83.fsf@vigenere.g10code.de>
References: <1380230432-13431-1-git-send-email-lekensteyn@gmail.com> <87y56d9p83.fsf@vigenere.g10code.de>
Message-ID: <1452407.XMrRVjiz6F@al>

Hi Werner,

Thanks for taking this patch.

On Tuesday 01 October 2013 13:59:08 Werner Koch wrote:
> On Thu, 26 Sep 2013 23:20, lekensteyn at gmail.com said:
> > This patch adds support for decrypting (and encrypting) using 128-bit
> > keys using the RC2 algorithm.
>
> Actually our RC2 implementation supports any key size >= 40 bit.  I
> can't remember why I came up with the two identifiers
>
>     GCRY_CIPHER_RFC2268_40  = 307,  /* Ron's Cipher 2 (40 bit). */
>     GCRY_CIPHER_RFC2268_128 = 308,  /* Ron's Cipher 2 (128 bit). */
>
> and didn't implement the second one.  Actually GCRY_CIPHER_RFC2268 would
> have been sufficient because the caller may use any keylength anyway.
>
> I added a Changelog entry and pushed it to master.

Can you also update the documentation?
In doc/gcrypt.texi from line 1563 it says:

    Ron's Cipher 2 in the 40 and 128 bit variants.  Note, that we
    currently only support the 40 bit variant.  The identifier for 128
    is reserved for future use.

If keys of any length (ranging from 0 to 1024?) are allowed, it could
be specified in there.

Regards,
Peter


From cvs at cvs.gnupg.org  Tue Oct  1 22:35:20 2013
From: cvs at cvs.gnupg.org (by Werner Koch)
Date: Tue, 01 Oct 2013 22:35:20 +0200
Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-285-g3ca180b
Message-ID:

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "The GNU crypto library".

The branch, master has been updated
       via  3ca180b25e8df252fc16f802cfdc27496e307830 (commit)
       via  4153fa859816e799e506055321a22e6450aacdcc (commit)
      from  738177ec0eae05069ec61bc4f724a69d4e052e42 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
commit 3ca180b25e8df252fc16f802cfdc27496e307830
Author: Werner Koch
Date:   Tue Oct 1 22:00:50 2013 +0200

    cipher: Simplify the cipher dispatcher cipher.c.

    * src/gcrypt-module.h (gcry_cipher_spec_t): Move to ...
    * src/cipher-proto.h (gcry_cipher_spec_t): here.  Merge with
    cipher_extra_spec_t.  Add fields ALGO and FLAGS.  Set these fields in
    all cipher modules.
    * cipher/cipher.c: Change most code to replace the former module
    system by a simpler system to gain information about the algorithms.
    (disable_pubkey_algo): Simplified.  Not anymore thread-safe, though.
    * cipher/md.c (_gcry_md_selftest): Use correct structure.  Not a real
    problem because both define the same function as their first field.
    * cipher/pubkey.c (_gcry_pk_selftest): Take care of the disabled flag.
Signed-off-by: Werner Koch diff --git a/cipher/arcfour.c b/cipher/arcfour.c index 6ef07fb..dc32b07 100644 --- a/cipher/arcfour.c +++ b/cipher/arcfour.c @@ -150,6 +150,7 @@ selftest(void) gcry_cipher_spec_t _gcry_cipher_spec_arcfour = { + GCRY_CIPHER_ARCFOUR, {0, 0}, "ARCFOUR", NULL, NULL, 1, 128, sizeof (ARCFOUR_context), arcfour_setkey, NULL, NULL, encrypt_stream, encrypt_stream, }; diff --git a/cipher/blowfish.c b/cipher/blowfish.c index 61042ed..2f739c8 100644 --- a/cipher/blowfish.c +++ b/cipher/blowfish.c @@ -960,6 +960,7 @@ bf_setkey (void *context, const byte *key, unsigned keylen) gcry_cipher_spec_t _gcry_cipher_spec_blowfish = { + GCRY_CIPHER_BLOWFISH, {0, 0}, "BLOWFISH", NULL, NULL, BLOWFISH_BLOCKSIZE, 128, sizeof (BLOWFISH_context), bf_setkey, encrypt_block, decrypt_block diff --git a/cipher/camellia-glue.c b/cipher/camellia-glue.c index 2842c3b..29cb7a5 100644 --- a/cipher/camellia-glue.c +++ b/cipher/camellia-glue.c @@ -691,18 +691,21 @@ static gcry_cipher_oid_spec_t camellia256_oids[] = gcry_cipher_spec_t _gcry_cipher_spec_camellia128 = { + GCRY_CIPHER_CAMELLIA128, {0, 0}, "CAMELLIA128",NULL,camellia128_oids,CAMELLIA_BLOCK_SIZE,128, sizeof(CAMELLIA_context),camellia_setkey,camellia_encrypt,camellia_decrypt }; gcry_cipher_spec_t _gcry_cipher_spec_camellia192 = { + GCRY_CIPHER_CAMELLIA192, {0, 0}, "CAMELLIA192",NULL,camellia192_oids,CAMELLIA_BLOCK_SIZE,192, sizeof(CAMELLIA_context),camellia_setkey,camellia_encrypt,camellia_decrypt }; gcry_cipher_spec_t _gcry_cipher_spec_camellia256 = { + GCRY_CIPHER_CAMELLIA256, {0, 0}, "CAMELLIA256",NULL,camellia256_oids,CAMELLIA_BLOCK_SIZE,256, sizeof(CAMELLIA_context),camellia_setkey,camellia_encrypt,camellia_decrypt }; diff --git a/cipher/cast5.c b/cipher/cast5.c index ae6b509..92d9af8 100644 --- a/cipher/cast5.c +++ b/cipher/cast5.c @@ -983,6 +983,7 @@ cast_setkey (void *context, const byte *key, unsigned keylen ) gcry_cipher_spec_t _gcry_cipher_spec_cast5 = { + GCRY_CIPHER_CAST5, {0, 0}, "CAST5", NULL, NULL, 
CAST5_BLOCKSIZE, 128, sizeof (CAST5_context), cast_setkey, encrypt_block, decrypt_block }; diff --git a/cipher/cipher-aeswrap.c b/cipher/cipher-aeswrap.c index 931dec1..03b0ea7 100644 --- a/cipher/cipher-aeswrap.c +++ b/cipher/cipher-aeswrap.c @@ -48,7 +48,7 @@ _gcry_cipher_aeswrap_encrypt (gcry_cipher_hd_t c, #error Invalid block size #endif /* We require a cipher with a 128 bit block length. */ - if (c->cipher->blocksize != 16) + if (c->spec->blocksize != 16) return GPG_ERR_INV_LENGTH; /* The output buffer must be able to hold the input data plus one @@ -90,7 +90,7 @@ _gcry_cipher_aeswrap_encrypt (gcry_cipher_hd_t c, /* B := AES_k( A | R[i] ) */ memcpy (b, a, 8); memcpy (b+8, r+i*8, 8); - nburn = c->cipher->encrypt (&c->context.c, b, b); + nburn = c->spec->encrypt (&c->context.c, b, b); burn = nburn > burn ? nburn : burn; /* t := t + 1 */ for (x = 7; x >= 0; x--) @@ -130,7 +130,7 @@ _gcry_cipher_aeswrap_decrypt (gcry_cipher_hd_t c, #error Invalid block size #endif /* We require a cipher with a 128 bit block length. */ - if (c->cipher->blocksize != 16) + if (c->spec->blocksize != 16) return GPG_ERR_INV_LENGTH; /* The output buffer must be able to hold the input data minus one @@ -173,7 +173,7 @@ _gcry_cipher_aeswrap_decrypt (gcry_cipher_hd_t c, /* B := AES_k^1( (A ^ t)| R[i] ) */ buf_xor(b, a, t, 8); memcpy (b+8, r+(i-1)*8, 8); - nburn = c->cipher->decrypt (&c->context.c, b, b); + nburn = c->spec->decrypt (&c->context.c, b, b); burn = nburn > burn ? nburn : burn; /* t := t - 1 */ for (x = 7; x >= 0; x--) diff --git a/cipher/cipher-cbc.c b/cipher/cipher-cbc.c index 55a1c74..523f5a6 100644 --- a/cipher/cipher-cbc.c +++ b/cipher/cipher-cbc.c @@ -40,15 +40,15 @@ _gcry_cipher_cbc_encrypt (gcry_cipher_hd_t c, unsigned int n; unsigned char *ivp; int i; - size_t blocksize = c->cipher->blocksize; + size_t blocksize = c->spec->blocksize; unsigned nblocks = inbuflen / blocksize; unsigned int burn, nburn; if (outbuflen < ((c->flags & GCRY_CIPHER_CBC_MAC)? 
blocksize : inbuflen)) return GPG_ERR_BUFFER_TOO_SHORT; - if ((inbuflen % c->cipher->blocksize) - && !(inbuflen > c->cipher->blocksize + if ((inbuflen % c->spec->blocksize) + && !(inbuflen > c->spec->blocksize && (c->flags & GCRY_CIPHER_CBC_CTS))) return GPG_ERR_INV_LENGTH; @@ -73,7 +73,7 @@ _gcry_cipher_cbc_encrypt (gcry_cipher_hd_t c, for (n=0; n < nblocks; n++ ) { buf_xor(outbuf, inbuf, c->u_iv.iv, blocksize); - nburn = c->cipher->encrypt ( &c->context.c, outbuf, outbuf ); + nburn = c->spec->encrypt ( &c->context.c, outbuf, outbuf ); burn = nburn > burn ? nburn : burn; memcpy (c->u_iv.iv, outbuf, blocksize ); inbuf += blocksize; @@ -104,7 +104,7 @@ _gcry_cipher_cbc_encrypt (gcry_cipher_hd_t c, for (; i < blocksize; i++) outbuf[i] = 0 ^ *ivp++; - nburn = c->cipher->encrypt (&c->context.c, outbuf, outbuf); + nburn = c->spec->encrypt (&c->context.c, outbuf, outbuf); burn = nburn > burn ? nburn : burn; memcpy (c->u_iv.iv, outbuf, blocksize); } @@ -123,15 +123,15 @@ _gcry_cipher_cbc_decrypt (gcry_cipher_hd_t c, { unsigned int n; int i; - size_t blocksize = c->cipher->blocksize; + size_t blocksize = c->spec->blocksize; unsigned int nblocks = inbuflen / blocksize; unsigned int burn, nburn; if (outbuflen < inbuflen) return GPG_ERR_BUFFER_TOO_SHORT; - if ((inbuflen % c->cipher->blocksize) - && !(inbuflen > c->cipher->blocksize + if ((inbuflen % c->spec->blocksize) + && !(inbuflen > c->spec->blocksize && (c->flags & GCRY_CIPHER_CBC_CTS))) return GPG_ERR_INV_LENGTH; @@ -159,12 +159,12 @@ _gcry_cipher_cbc_decrypt (gcry_cipher_hd_t c, * save the original ciphertext block. We use LASTIV for * this here because it is not used otherwise. */ memcpy (c->lastiv, inbuf, blocksize); - nburn = c->cipher->decrypt ( &c->context.c, outbuf, inbuf ); + nburn = c->spec->decrypt ( &c->context.c, outbuf, inbuf ); burn = nburn > burn ? 
nburn : burn; buf_xor(outbuf, outbuf, c->u_iv.iv, blocksize); memcpy(c->u_iv.iv, c->lastiv, blocksize ); - inbuf += c->cipher->blocksize; - outbuf += c->cipher->blocksize; + inbuf += c->spec->blocksize; + outbuf += c->spec->blocksize; } } @@ -180,14 +180,14 @@ _gcry_cipher_cbc_decrypt (gcry_cipher_hd_t c, memcpy (c->lastiv, c->u_iv.iv, blocksize ); /* Save Cn-2. */ memcpy (c->u_iv.iv, inbuf + blocksize, restbytes ); /* Save Cn. */ - nburn = c->cipher->decrypt ( &c->context.c, outbuf, inbuf ); + nburn = c->spec->decrypt ( &c->context.c, outbuf, inbuf ); burn = nburn > burn ? nburn : burn; buf_xor(outbuf, outbuf, c->u_iv.iv, restbytes); memcpy(outbuf + blocksize, outbuf, restbytes); for(i=restbytes; i < blocksize; i++) c->u_iv.iv[i] = outbuf[i]; - nburn = c->cipher->decrypt (&c->context.c, outbuf, c->u_iv.iv); + nburn = c->spec->decrypt (&c->context.c, outbuf, c->u_iv.iv); burn = nburn > burn ? nburn : burn; buf_xor(outbuf, outbuf, c->lastiv, blocksize); /* c->lastiv is now really lastlastiv, does this matter? */ diff --git a/cipher/cipher-cfb.c b/cipher/cipher-cfb.c index f772280..244f5fd 100644 --- a/cipher/cipher-cfb.c +++ b/cipher/cipher-cfb.c @@ -37,7 +37,7 @@ _gcry_cipher_cfb_encrypt (gcry_cipher_hd_t c, const unsigned char *inbuf, unsigned int inbuflen) { unsigned char *ivp; - size_t blocksize = c->cipher->blocksize; + size_t blocksize = c->spec->blocksize; size_t blocksize_x_2 = blocksize + blocksize; unsigned int burn, nburn; @@ -48,7 +48,7 @@ _gcry_cipher_cfb_encrypt (gcry_cipher_hd_t c, { /* Short enough to be encoded by the remaining XOR mask. */ /* XOR the input with the IV and store input into IV. */ - ivp = c->u_iv.iv + c->cipher->blocksize - c->unused; + ivp = c->u_iv.iv + c->spec->blocksize - c->unused; buf_xor_2dst(outbuf, ivp, inbuf, inbuflen); c->unused -= inbuflen; return 0; @@ -83,7 +83,7 @@ _gcry_cipher_cfb_encrypt (gcry_cipher_hd_t c, while ( inbuflen >= blocksize_x_2 ) { /* Encrypt the IV. 
*/ - nburn = c->cipher->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; /* XOR the input with the IV and store input into IV. */ buf_xor_2dst(outbuf, c->u_iv.iv, inbuf, blocksize); @@ -97,7 +97,7 @@ _gcry_cipher_cfb_encrypt (gcry_cipher_hd_t c, { /* Save the current IV and then encrypt the IV. */ memcpy( c->lastiv, c->u_iv.iv, blocksize ); - nburn = c->cipher->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; /* XOR the input with the IV and store input into IV */ buf_xor_2dst(outbuf, c->u_iv.iv, inbuf, blocksize); @@ -109,7 +109,7 @@ _gcry_cipher_cfb_encrypt (gcry_cipher_hd_t c, { /* Save the current IV and then encrypt the IV. */ memcpy( c->lastiv, c->u_iv.iv, blocksize ); - nburn = c->cipher->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; c->unused = blocksize; /* Apply the XOR. */ @@ -133,7 +133,7 @@ _gcry_cipher_cfb_decrypt (gcry_cipher_hd_t c, const unsigned char *inbuf, unsigned int inbuflen) { unsigned char *ivp; - size_t blocksize = c->cipher->blocksize; + size_t blocksize = c->spec->blocksize; size_t blocksize_x_2 = blocksize + blocksize; unsigned int burn, nburn; @@ -179,7 +179,7 @@ _gcry_cipher_cfb_decrypt (gcry_cipher_hd_t c, while (inbuflen >= blocksize_x_2 ) { /* Encrypt the IV. */ - nburn = c->cipher->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; /* XOR the input with the IV and store input into IV. */ buf_xor_n_copy(outbuf, c->u_iv.iv, inbuf, blocksize); @@ -193,7 +193,7 @@ _gcry_cipher_cfb_decrypt (gcry_cipher_hd_t c, { /* Save the current IV and then encrypt the IV. 
*/ memcpy ( c->lastiv, c->u_iv.iv, blocksize); - nburn = c->cipher->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; /* XOR the input with the IV and store input into IV */ buf_xor_n_copy(outbuf, c->u_iv.iv, inbuf, blocksize); @@ -206,7 +206,7 @@ _gcry_cipher_cfb_decrypt (gcry_cipher_hd_t c, { /* Save the current IV and then encrypt the IV. */ memcpy ( c->lastiv, c->u_iv.iv, blocksize ); - nburn = c->cipher->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; c->unused = blocksize; /* Apply the XOR. */ diff --git a/cipher/cipher-ctr.c b/cipher/cipher-ctr.c index ff1742c..fbc898f 100644 --- a/cipher/cipher-ctr.c +++ b/cipher/cipher-ctr.c @@ -38,7 +38,7 @@ _gcry_cipher_ctr_encrypt (gcry_cipher_hd_t c, { unsigned int n; int i; - unsigned int blocksize = c->cipher->blocksize; + unsigned int blocksize = c->spec->blocksize; unsigned int nblocks; unsigned int burn, nburn; @@ -77,7 +77,7 @@ _gcry_cipher_ctr_encrypt (gcry_cipher_hd_t c, unsigned char tmp[MAX_BLOCKSIZE]; do { - nburn = c->cipher->encrypt (&c->context.c, tmp, c->u_ctr.ctr); + nburn = c->spec->encrypt (&c->context.c, tmp, c->u_ctr.ctr); burn = nburn > burn ? nburn : burn; for (i = blocksize; i > 0; i--) diff --git a/cipher/cipher-internal.h b/cipher/cipher-internal.h index 025bf2e..cabcd1f 100644 --- a/cipher/cipher-internal.h +++ b/cipher/cipher-internal.h @@ -60,8 +60,7 @@ struct gcry_cipher_handle int magic; size_t actual_handle_size; /* Allocated size of this handle. */ size_t handle_offset; /* Offset to the malloced block. */ - gcry_cipher_spec_t *cipher; - cipher_extra_spec_t *extraspec; + gcry_cipher_spec_t *spec; gcry_module_t module; /* The algorithm id. 
This is a hack required because the module diff --git a/cipher/cipher-ofb.c b/cipher/cipher-ofb.c index 3fb9b0d..3d9d54c 100644 --- a/cipher/cipher-ofb.c +++ b/cipher/cipher-ofb.c @@ -37,7 +37,7 @@ _gcry_cipher_ofb_encrypt (gcry_cipher_hd_t c, const unsigned char *inbuf, unsigned int inbuflen) { unsigned char *ivp; - size_t blocksize = c->cipher->blocksize; + size_t blocksize = c->spec->blocksize; unsigned int burn, nburn; if (outbuflen < inbuflen) @@ -47,7 +47,7 @@ _gcry_cipher_ofb_encrypt (gcry_cipher_hd_t c, { /* Short enough to be encoded by the remaining XOR mask. */ /* XOR the input with the IV */ - ivp = c->u_iv.iv + c->cipher->blocksize - c->unused; + ivp = c->u_iv.iv + c->spec->blocksize - c->unused; buf_xor(outbuf, ivp, inbuf, inbuflen); c->unused -= inbuflen; return 0; @@ -70,7 +70,7 @@ _gcry_cipher_ofb_encrypt (gcry_cipher_hd_t c, { /* Encrypt the IV (and save the current one). */ memcpy( c->lastiv, c->u_iv.iv, blocksize ); - nburn = c->cipher->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; buf_xor(outbuf, c->u_iv.iv, inbuf, blocksize); outbuf += blocksize; @@ -80,7 +80,7 @@ _gcry_cipher_ofb_encrypt (gcry_cipher_hd_t c, if ( inbuflen ) { /* process the remaining bytes */ memcpy( c->lastiv, c->u_iv.iv, blocksize ); - nburn = c->cipher->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; c->unused = blocksize; c->unused -= inbuflen; @@ -103,7 +103,7 @@ _gcry_cipher_ofb_decrypt (gcry_cipher_hd_t c, const unsigned char *inbuf, unsigned int inbuflen) { unsigned char *ivp; - size_t blocksize = c->cipher->blocksize; + size_t blocksize = c->spec->blocksize; unsigned int burn, nburn; if (outbuflen < inbuflen) @@ -135,7 +135,7 @@ _gcry_cipher_ofb_decrypt (gcry_cipher_hd_t c, { /* Encrypt the IV (and save the current one). 
*/ memcpy( c->lastiv, c->u_iv.iv, blocksize ); - nburn = c->cipher->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; buf_xor(outbuf, c->u_iv.iv, inbuf, blocksize); outbuf += blocksize; @@ -146,7 +146,7 @@ _gcry_cipher_ofb_decrypt (gcry_cipher_hd_t c, { /* Process the remaining bytes. */ /* Encrypt the IV (and save the current one). */ memcpy( c->lastiv, c->u_iv.iv, blocksize ); - nburn = c->cipher->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; c->unused = blocksize; c->unused -= inbuflen; diff --git a/cipher/cipher.c b/cipher/cipher.c index 23cb99c..ca61375 100644 --- a/cipher/cipher.c +++ b/cipher/cipher.c @@ -1,6 +1,7 @@ /* cipher.c - cipher dispatcher * Copyright (C) 1998, 1999, 2000, 2001, 2002, 2003 * 2005, 2007, 2008, 2009, 2011 Free Software Foundation, Inc. + * Copyright (C) 2013 g10 Code GmbH * * This file is part of Libgcrypt. * @@ -29,355 +30,168 @@ #include "ath.h" #include "./cipher-internal.h" -/* A dummy extraspec so that we do not need to tests the extraspec - field from the module specification against NULL and instead - directly test the respective fields of extraspecs. */ -static cipher_extra_spec_t dummy_extra_spec; /* This is the list of the default ciphers, which are included in libgcrypt. 
*/ -static struct cipher_table_entry -{ - gcry_cipher_spec_t *cipher; - cipher_extra_spec_t *extraspec; - unsigned int algorithm; - int fips_allowed; -} cipher_table[] = +static gcry_cipher_spec_t *cipher_list[] = { #if USE_BLOWFISH - { &_gcry_cipher_spec_blowfish, - &dummy_extra_spec, GCRY_CIPHER_BLOWFISH }, + &_gcry_cipher_spec_blowfish, #endif #if USE_DES - { &_gcry_cipher_spec_des, - &dummy_extra_spec, GCRY_CIPHER_DES }, - { &_gcry_cipher_spec_tripledes, - &_gcry_cipher_extraspec_tripledes, GCRY_CIPHER_3DES, 1 }, + &_gcry_cipher_spec_des, + &_gcry_cipher_spec_tripledes, #endif #if USE_ARCFOUR - { &_gcry_cipher_spec_arcfour, - &dummy_extra_spec, GCRY_CIPHER_ARCFOUR }, + &_gcry_cipher_spec_arcfour, #endif #if USE_CAST5 - { &_gcry_cipher_spec_cast5, - &dummy_extra_spec, GCRY_CIPHER_CAST5 }, + &_gcry_cipher_spec_cast5, #endif #if USE_AES - { &_gcry_cipher_spec_aes, - &_gcry_cipher_extraspec_aes, GCRY_CIPHER_AES, 1 }, - { &_gcry_cipher_spec_aes192, - &_gcry_cipher_extraspec_aes192, GCRY_CIPHER_AES192, 1 }, - { &_gcry_cipher_spec_aes256, - &_gcry_cipher_extraspec_aes256, GCRY_CIPHER_AES256, 1 }, + &_gcry_cipher_spec_aes, + &_gcry_cipher_spec_aes192, + &_gcry_cipher_spec_aes256, #endif #if USE_TWOFISH - { &_gcry_cipher_spec_twofish, - &dummy_extra_spec, GCRY_CIPHER_TWOFISH }, - { &_gcry_cipher_spec_twofish128, - &dummy_extra_spec, GCRY_CIPHER_TWOFISH128 }, + &_gcry_cipher_spec_twofish, + &_gcry_cipher_spec_twofish128, #endif #if USE_SERPENT - { &_gcry_cipher_spec_serpent128, - &dummy_extra_spec, GCRY_CIPHER_SERPENT128 }, - { &_gcry_cipher_spec_serpent192, - &dummy_extra_spec, GCRY_CIPHER_SERPENT192 }, - { &_gcry_cipher_spec_serpent256, - &dummy_extra_spec, GCRY_CIPHER_SERPENT256 }, + &_gcry_cipher_spec_serpent128, + &_gcry_cipher_spec_serpent192, + &_gcry_cipher_spec_serpent256, #endif #if USE_RFC2268 - { &_gcry_cipher_spec_rfc2268_40, - &dummy_extra_spec, GCRY_CIPHER_RFC2268_40 }, - { &_gcry_cipher_spec_rfc2268_128, - &dummy_extra_spec, GCRY_CIPHER_RFC2268_128 }, + 
   &_gcry_cipher_spec_rfc2268_40,
+  &_gcry_cipher_spec_rfc2268_128,
 #endif
 #if USE_SEED
-  { &_gcry_cipher_spec_seed,
-    &dummy_extra_spec, GCRY_CIPHER_SEED },
+  &_gcry_cipher_spec_seed,
 #endif
 #if USE_CAMELLIA
-  { &_gcry_cipher_spec_camellia128,
-    &dummy_extra_spec, GCRY_CIPHER_CAMELLIA128 },
-  { &_gcry_cipher_spec_camellia192,
-    &dummy_extra_spec, GCRY_CIPHER_CAMELLIA192 },
-  { &_gcry_cipher_spec_camellia256,
-    &dummy_extra_spec, GCRY_CIPHER_CAMELLIA256 },
+  &_gcry_cipher_spec_camellia128,
+  &_gcry_cipher_spec_camellia192,
+  &_gcry_cipher_spec_camellia256,
 #endif
 #ifdef USE_IDEA
-  { &_gcry_cipher_spec_idea,
-    &dummy_extra_spec, GCRY_CIPHER_IDEA },
+  &_gcry_cipher_spec_idea,
 #endif
 #if USE_SALSA20
-  { &_gcry_cipher_spec_salsa20,
-    &_gcry_cipher_extraspec_salsa20, GCRY_CIPHER_SALSA20 },
-  { &_gcry_cipher_spec_salsa20r12,
-    &_gcry_cipher_extraspec_salsa20, GCRY_CIPHER_SALSA20R12 },
+  &_gcry_cipher_spec_salsa20,
+  &_gcry_cipher_spec_salsa20r12,
 #endif
 #if USE_GOST28147
-  { &_gcry_cipher_spec_gost28147,
-    &dummy_extra_spec, GCRY_CIPHER_GOST28147 },
+  &_gcry_cipher_spec_gost28147,
 #endif
-  { NULL }
+  NULL
 };

-/* List of registered ciphers.  */
-static gcry_module_t ciphers_registered;
-
-/* This is the lock protecting CIPHERS_REGISTERED.  It is initialized
-   by _gcry_cipher_init.  */
-static ath_mutex_t ciphers_registered_lock;
-
-/* Flag to check whether the default ciphers have already been
-   registered.  */
-static int default_ciphers_registered;
-
-/* Convenient macro for registering the default ciphers.  */
-#define REGISTER_DEFAULT_CIPHERS \
-  do \
-    { \
-      ath_mutex_lock (&ciphers_registered_lock); \
-      if (! default_ciphers_registered) \
-        { \
-          cipher_register_default (); \
-          default_ciphers_registered = 1; \
-        } \
-      ath_mutex_unlock (&ciphers_registered_lock); \
-    } \
-  while (0)
-
-/* These dummy functions are used in case a cipher implementation
-   refuses to provide it's own functions.  */
-
-static gcry_err_code_t
-dummy_setkey (void *c, const unsigned char *key, unsigned int keylen)
+static int
+map_algo (int algo)
 {
-  (void)c;
-  (void)key;
-  (void)keylen;
-  return GPG_ERR_NO_ERROR;
+  return algo;
 }

-static unsigned int
-dummy_encrypt_block (void *c,
-                     unsigned char *outbuf, const unsigned char *inbuf)
-{
-  (void)c;
-  (void)outbuf;
-  (void)inbuf;
-  BUG();
-  return 0;
-}

-static unsigned int
-dummy_decrypt_block (void *c,
-                     unsigned char *outbuf, const unsigned char *inbuf)
+/* Return the spec structure for the cipher algorithm ALGO.  For
+   an unknown algorithm NULL is returned.  */
+static gcry_cipher_spec_t *
+spec_from_algo (int algo)
 {
-  (void)c;
-  (void)outbuf;
-  (void)inbuf;
-  BUG();
-  return 0;
-}
+  int idx;
+  gcry_cipher_spec_t *spec;

-static void
-dummy_encrypt_stream (void *c,
-                      unsigned char *outbuf, const unsigned char *inbuf,
-                      unsigned int n)
-{
-  (void)c;
-  (void)outbuf;
-  (void)inbuf;
-  (void)n;
-  BUG();
-}
+  algo = map_algo (algo);

-static void
-dummy_decrypt_stream (void *c,
-                      unsigned char *outbuf, const unsigned char *inbuf,
-                      unsigned int n)
-{
-  (void)c;
-  (void)outbuf;
-  (void)inbuf;
-  (void)n;
-  BUG();
+  for (idx = 0; (spec = cipher_list[idx]); idx++)
+    if (algo == spec->algo)
+      return spec;
+  return NULL;
 }
-
-/* Internal function.  Register all the ciphers included in
-   CIPHER_TABLE.  Note, that this function gets only used by the macro
-   REGISTER_DEFAULT_CIPHERS which protects it using a mutex.  */
-static void
-cipher_register_default (void)
+
+/* Lookup a cipher's spec by its name.  */
+static gcry_cipher_spec_t *
+spec_from_name (const char *name)
 {
-  gcry_err_code_t err = GPG_ERR_NO_ERROR;
-  int i;
+  gcry_cipher_spec_t *spec;
+  int idx;
+  const char **aliases;

-  for (i = 0; !err && cipher_table[i].cipher; i++)
+  for (idx=0; (spec = cipher_list[idx]); idx++)
     {
-      if (! cipher_table[i].cipher->setkey)
-        cipher_table[i].cipher->setkey = dummy_setkey;
-      if (! cipher_table[i].cipher->encrypt)
-        cipher_table[i].cipher->encrypt = dummy_encrypt_block;
-      if (! cipher_table[i].cipher->decrypt)
-        cipher_table[i].cipher->decrypt = dummy_decrypt_block;
-      if (! cipher_table[i].cipher->stencrypt)
-        cipher_table[i].cipher->stencrypt = dummy_encrypt_stream;
-      if (! cipher_table[i].cipher->stdecrypt)
-        cipher_table[i].cipher->stdecrypt = dummy_decrypt_stream;
-
-      if ( fips_mode () && !cipher_table[i].fips_allowed )
-        continue;
-
-      err = _gcry_module_add (&ciphers_registered,
-                              cipher_table[i].algorithm,
-                              (void *) cipher_table[i].cipher,
-                              (void *) cipher_table[i].extraspec,
-                              NULL);
+      if (!stricmp (name, spec->name))
+        return spec;
+      if (spec->aliases)
+        {
+          for (aliases = spec->aliases; *aliases; aliases++)
+            if (!stricmp (name, *aliases))
+              return spec;
+        }
     }
-  if (err)
-    BUG ();
+  return NULL;
 }

-/* Internal callback function.  Used via _gcry_module_lookup.  */
-static int
-gcry_cipher_lookup_func_name (void *spec, void *data)
-{
-  gcry_cipher_spec_t *cipher = (gcry_cipher_spec_t *) spec;
-  char *name = (char *) data;
-  const char **aliases = cipher->aliases;
-  int i, ret = ! stricmp (name, cipher->name);
-
-  if (aliases)
-    for (i = 0; aliases[i] && (! ret); i++)
-      ret = ! stricmp (name, aliases[i]);
-
-  return ret;
-}
-
-/* Internal callback function.  Used via _gcry_module_lookup.  */
-static int
-gcry_cipher_lookup_func_oid (void *spec, void *data)
-{
-  gcry_cipher_spec_t *cipher = (gcry_cipher_spec_t *) spec;
-  char *oid = (char *) data;
-  gcry_cipher_oid_spec_t *oid_specs = cipher->oids;
-  int ret = 0, i;
-
-  if (oid_specs)
-    for (i = 0; oid_specs[i].oid && (! ret); i++)
-      if (! stricmp (oid, oid_specs[i].oid))
-        ret = 1;
-
-  return ret;
-}
-
-/* Internal function.  Lookup a cipher entry by it's name.  */
-static gcry_module_t
-gcry_cipher_lookup_name (const char *name)
-{
-  gcry_module_t cipher;
-
-  cipher = _gcry_module_lookup (ciphers_registered, (void *) name,
-                                gcry_cipher_lookup_func_name);
-
-  return cipher;
-}
-
-/* Internal function.  Lookup a cipher entry by it's oid.  */
-static gcry_module_t
-gcry_cipher_lookup_oid (const char *oid)
-{
-  gcry_module_t cipher;
-
-  cipher = _gcry_module_lookup (ciphers_registered, (void *) oid,
-                                gcry_cipher_lookup_func_oid);
-
-  return cipher;
-}
-
-/* Register a new cipher module whose specification can be found in
-   CIPHER.  On success, a new algorithm ID is stored in ALGORITHM_ID
-   and a pointer representhing this module is stored in MODULE.  */
-gcry_error_t
-_gcry_cipher_register (gcry_cipher_spec_t *cipher,
-                       cipher_extra_spec_t *extraspec,
-                       int *algorithm_id,
-                       gcry_module_t *module)
+/* Lookup a cipher's spec by its OID.  */
+static gcry_cipher_spec_t *
+spec_from_oid (const char *oid)
 {
-  gcry_err_code_t err = 0;
-  gcry_module_t mod;
-
-  /* We do not support module loading in fips mode.  */
-  if (fips_mode ())
-    return gpg_error (GPG_ERR_NOT_SUPPORTED);
-
-  ath_mutex_lock (&ciphers_registered_lock);
-  err = _gcry_module_add (&ciphers_registered, 0,
-                          (void *)cipher,
-                          (void *)(extraspec? extraspec : &dummy_extra_spec),
-                          &mod);
-  ath_mutex_unlock (&ciphers_registered_lock);
+  gcry_cipher_spec_t *spec;
+  gcry_cipher_oid_spec_t *oid_specs;
+  int idx, j;

-  if (! err)
+  for (idx=0; (spec = cipher_list[idx]); idx++)
     {
-      *module = mod;
-      *algorithm_id = mod->mod_id;
+      oid_specs = spec->oids;
+      if (oid_specs)
+        {
+          for (j = 0; oid_specs[j].oid; j++)
+            if (!stricmp (oid, oid_specs[j].oid))
+              return spec;
+        }
     }
-  return gcry_error (err);
+  return NULL;
 }

-/* Unregister the cipher identified by MODULE, which must have been
-   registered with gcry_cipher_register.  */
-void
-_gcry_cipher_unregister (gcry_module_t module)
-{
-  ath_mutex_lock (&ciphers_registered_lock);
-  _gcry_module_release (module);
-  ath_mutex_unlock (&ciphers_registered_lock);
-}
-
-/* Locate the OID in the oid table and return the index or -1 when not
-   found.  An opitonal "oid." or "OID." prefix in OID is ignored, the
-   OID is expected to be in standard IETF dotted notation.  The
-   internal algorithm number is returned in ALGORITHM unless it
-   ispassed as NULL.  A pointer to the specification of the module
-   implementing this algorithm is return in OID_SPEC unless passed as
-   NULL.*/
-static int
-search_oid (const char *oid, int *algorithm, gcry_cipher_oid_spec_t *oid_spec)
+/* Locate the OID in the oid table and return the spec or NULL if not
+   found.  An optional "oid." or "OID." prefix in OID is ignored, the
+   OID is expected to be in standard IETF dotted notation.  A pointer
+   to the OID specification of the module implementing this algorithm
+   is return in OID_SPEC unless passed as NULL.*/
+static gcry_cipher_spec_t *
+search_oid (const char *oid, gcry_cipher_oid_spec_t *oid_spec)
 {
-  gcry_module_t module;
-  int ret = 0;
+  gcry_cipher_spec_t *spec;
+  int i;

   if (oid && ((! strncmp (oid, "oid.", 4))
               || (! strncmp (oid, "OID.", 4))))
     oid += 4;

-  module = gcry_cipher_lookup_oid (oid);
-  if (module)
+  spec = spec_from_oid (oid);
+  if (spec && spec->oids)
     {
-      gcry_cipher_spec_t *cipher = module->spec;
-      int i;
-
-      for (i = 0; cipher->oids[i].oid && !ret; i++)
-        if (! stricmp (oid, cipher->oids[i].oid))
+      for (i = 0; spec->oids[i].oid; i++)
+        if (!stricmp (oid, spec->oids[i].oid))
          {
-            if (algorithm)
-              *algorithm = module->mod_id;
            if (oid_spec)
-              *oid_spec = cipher->oids[i];
-            ret = 1;
+              *oid_spec = spec->oids[i];
+            return spec;
          }
-      _gcry_module_release (module);
     }

-  return ret;
+  return NULL;
 }
+

 /* Map STRING to the cipher algorithm identifier.  Returns the
    algorithm ID of the cipher for the given name or 0 if the name is
    not known.  It is valid to pass NULL for STRING which results in a
@@ -385,34 +199,24 @@ search_oid (const char *oid, int *algorithm, gcry_cipher_oid_spec_t *oid_spec)
 int
 gcry_cipher_map_name (const char *string)
 {
-  gcry_module_t cipher;
-  int ret, algorithm = 0;
+  gcry_cipher_spec_t *spec;

-  if (! string)
+  if (!string)
     return 0;

-  REGISTER_DEFAULT_CIPHERS;
-
   /* If the string starts with a digit (optionally prefixed with
      either "OID." or "oid."), we first look into our table of ASN.1
      object identifiers to figure out the algorithm */
-  ath_mutex_lock (&ciphers_registered_lock);
-
-  ret = search_oid (string, &algorithm, NULL);
-  if (! ret)
-    {
-      cipher = gcry_cipher_lookup_name (string);
-      if (cipher)
-        {
-          algorithm = cipher->mod_id;
-          _gcry_module_release (cipher);
-        }
-    }
+  spec = search_oid (string, NULL);
+  if (spec)
+    return spec->algo;

-  ath_mutex_unlock (&ciphers_registered_lock);
+  spec = spec_from_name (string);
+  if (spec)
+    return spec->algo;

-  return algorithm;
+  return 0;
 }

@@ -423,78 +227,46 @@ gcry_cipher_map_name (const char *string)
 int
 gcry_cipher_mode_from_oid (const char *string)
 {
+  gcry_cipher_spec_t *spec;
   gcry_cipher_oid_spec_t oid_spec;
-  int ret = 0, mode = 0;

   if (!string)
     return 0;

-  ath_mutex_lock (&ciphers_registered_lock);
-  ret = search_oid (string, NULL, &oid_spec);
-  if (ret)
-    mode = oid_spec.mode;
-  ath_mutex_unlock (&ciphers_registered_lock);
+  spec = search_oid (string, &oid_spec);
+  if (spec)
+    return oid_spec.mode;

-  return mode;
+  return 0;
 }

-/* Map the cipher algorithm whose ID is contained in ALGORITHM to a
-   string representation of the algorithm name.  For unknown algorithm
-   IDs this function returns "?".  */
-static const char *
-cipher_algo_to_string (int algorithm)
-{
-  gcry_module_t cipher;
-  const char *name;
-
-  REGISTER_DEFAULT_CIPHERS;
-
-  ath_mutex_lock (&ciphers_registered_lock);
-  cipher = _gcry_module_lookup_id (ciphers_registered, algorithm);
-  if (cipher)
-    {
-      name = ((gcry_cipher_spec_t *) cipher->spec)->name;
-      _gcry_module_release (cipher);
-    }
-  else
-    name = "?";
-  ath_mutex_unlock (&ciphers_registered_lock);
-
-  return name;
-}
-
 /* Map the cipher algorithm identifier ALGORITHM to a string
    representing this algorithm.  This string is the default name as
-   used by Libgcrypt.  An pointer to an empty string is returned for
-   an unknown algorithm.  NULL is never returned. */
+   used by Libgcrypt.  A "?" is returned for an unknown algorithm.
+   NULL is never returned. */
 const char *
 gcry_cipher_algo_name (int algorithm)
 {
-  return cipher_algo_to_string (algorithm);
+  gcry_cipher_spec_t *spec;
+
+  spec = spec_from_algo (algorithm);
+  return spec? spec->name : "?";
 }

 /* Flag the cipher algorithm with the identifier ALGORITHM as
    disabled.  There is no error return, the function does nothing for
-   unknown algorithms.  Disabled algorithms are vitually not available
-   in Libgcrypt. */
+   unknown algorithms.  Disabled algorithms are virtually not
+   available in Libgcrypt.  This is not thread safe and should thus be
+   called early.  */
 static void
-disable_cipher_algo (int algorithm)
+disable_cipher_algo (int algo)
 {
-  gcry_module_t cipher;
-
-  REGISTER_DEFAULT_CIPHERS;
+  gcry_cipher_spec_t *spec = spec_from_algo (algo);

-  ath_mutex_lock (&ciphers_registered_lock);
-  cipher = _gcry_module_lookup_id (ciphers_registered, algorithm);
-  if (cipher)
-    {
-      if (! (cipher->flags & FLAG_MODULE_DISABLED))
-        cipher->flags |= FLAG_MODULE_DISABLED;
-      _gcry_module_release (cipher);
-    }
-  ath_mutex_unlock (&ciphers_registered_lock);
+  if (spec)
+    spec->flags.disabled = 1;
 }

@@ -504,24 +276,13 @@ disable_cipher_algo (int algorithm)
 static gcry_err_code_t
 check_cipher_algo (int algorithm)
 {
-  gcry_err_code_t err = GPG_ERR_NO_ERROR;
-  gcry_module_t cipher;
-
-  REGISTER_DEFAULT_CIPHERS;
+  gcry_cipher_spec_t *spec;

-  ath_mutex_lock (&ciphers_registered_lock);
-  cipher = _gcry_module_lookup_id (ciphers_registered, algorithm);
-  if (cipher)
-    {
-      if (cipher->flags & FLAG_MODULE_DISABLED)
-        err = GPG_ERR_CIPHER_ALGO;
-      _gcry_module_release (cipher);
-    }
-  else
-    err = GPG_ERR_CIPHER_ALGO;
-  ath_mutex_unlock (&ciphers_registered_lock);
+  spec = spec_from_algo (algorithm);
+  if (spec && !spec->flags.disabled)
+    return 0;

-  return err;
+  return GPG_ERR_CIPHER_ALGO;
 }

@@ -530,45 +291,36 @@ check_cipher_algo (int algorithm)
 static unsigned int
 cipher_get_keylen (int algorithm)
 {
-  gcry_module_t cipher;
+  gcry_cipher_spec_t *spec;
   unsigned len = 0;

-  REGISTER_DEFAULT_CIPHERS;
-
-  ath_mutex_lock (&ciphers_registered_lock);
-  cipher = _gcry_module_lookup_id (ciphers_registered, algorithm);
-  if (cipher)
+  spec = spec_from_algo (algorithm);
+  if (spec)
     {
-      len = ((gcry_cipher_spec_t *) cipher->spec)->keylen;
+      len = spec->keylen;
       if (!len)
        log_bug ("cipher %d w/o key length\n", algorithm);
-      _gcry_module_release (cipher);
     }
-  ath_mutex_unlock (&ciphers_registered_lock);

   return len;
 }
+
 /* Return the block length of the cipher algorithm with the identifier
    ALGORITHM.  This function return 0 for an invalid algorithm.  */
 static unsigned int
 cipher_get_blocksize (int algorithm)
 {
-  gcry_module_t cipher;
+  gcry_cipher_spec_t *spec;
   unsigned len = 0;

-  REGISTER_DEFAULT_CIPHERS;
-
-  ath_mutex_lock (&ciphers_registered_lock);
-  cipher = _gcry_module_lookup_id (ciphers_registered, algorithm);
-  if (cipher)
+  spec = spec_from_algo (algorithm);
+  if (spec)
     {
-      len = ((gcry_cipher_spec_t *) cipher->spec)->blocksize;
-      if (! len)
-        log_bug ("cipher %d w/o blocksize\n", algorithm);
-      _gcry_module_release (cipher);
+      len = spec->blocksize;
+      if (!len)
+        log_bug ("cipher %d w/o blocksize\n", algorithm);
     }
-  ath_mutex_unlock (&ciphers_registered_lock);

   return len;
 }

@@ -593,40 +345,21 @@ gcry_cipher_open (gcry_cipher_hd_t *handle,
                   int algo, int mode, unsigned int flags)
 {
   int secure = (flags & GCRY_CIPHER_SECURE);
-  gcry_cipher_spec_t *cipher = NULL;
-  cipher_extra_spec_t *extraspec = NULL;
-  gcry_module_t module = NULL;
+  gcry_cipher_spec_t *spec;
   gcry_cipher_hd_t h = NULL;
-  gcry_err_code_t err = 0;
+  gcry_err_code_t err;

   /* If the application missed to call the random poll function, we do
      it here to ensure that it is used once in a while. */
   _gcry_fast_random_poll ();

-  REGISTER_DEFAULT_CIPHERS;
-
-  /* Fetch the according module and check whether the cipher is marked
-     available for use.  */
-  ath_mutex_lock (&ciphers_registered_lock);
-  module = _gcry_module_lookup_id (ciphers_registered, algo);
-  if (module)
-    {
-      /* Found module.  */
-
-      if (module->flags & FLAG_MODULE_DISABLED)
-        {
-          /* Not available for use.  */
-          err = GPG_ERR_CIPHER_ALGO;
-        }
-      else
-        {
-          cipher = (gcry_cipher_spec_t *) module->spec;
-          extraspec = module->extraspec;
-        }
-    }
-  else
+  spec = spec_from_algo (algo);
+  if (!spec)
+    err = GPG_ERR_CIPHER_ALGO;
+  else if (spec->flags.disabled)
     err = GPG_ERR_CIPHER_ALGO;
-  ath_mutex_unlock (&ciphers_registered_lock);
+  else
+    err = 0;

   /* check flags */
   if ((! err)
@@ -648,14 +381,12 @@ gcry_cipher_open (gcry_cipher_hd_t *handle,
       case GCRY_CIPHER_MODE_OFB:
       case GCRY_CIPHER_MODE_CTR:
       case GCRY_CIPHER_MODE_AESWRAP:
-        if ((cipher->encrypt == dummy_encrypt_block)
-            || (cipher->decrypt == dummy_decrypt_block))
+        if (!spec->encrypt || !spec->decrypt)
          err = GPG_ERR_INV_CIPHER_MODE;
        break;

       case GCRY_CIPHER_MODE_STREAM:
-        if ((cipher->stencrypt == dummy_encrypt_stream)
-            || (cipher->stdecrypt == dummy_decrypt_stream))
+        if (!spec->stencrypt || !spec->stdecrypt)
          err = GPG_ERR_INV_CIPHER_MODE;
        break;

@@ -674,13 +405,12 @@ gcry_cipher_open (gcry_cipher_hd_t *handle,
   /* Perform selftest here and mark this with a flag in cipher_table?
      No, we should not do this as it takes too long.  Further it does
      not make sense to exclude algorithms with failing selftests at
-     runtime: If a selftest fails there is something seriously wrong
-     with the system and thus we better die immediately. */
+     runtime: If a selftest fails there is something seriously wrong with the system and thus we better die immediately. */
   if (! err)
     {
       size_t size = (sizeof (*h)
-                     + 2 * cipher->contextsize
+                     + 2 * spec->contextsize
                      - sizeof (cipher_context_alignment_t)
 #ifdef NEED_16BYTE_ALIGNED_CONTEXT
                      + 15  /* Space for leading alignment gap. */
 #endif
@@ -711,9 +441,7 @@ gcry_cipher_open (gcry_cipher_hd_t *handle,
          h->magic = secure ? CTX_MAGIC_SECURE : CTX_MAGIC_NORMAL;
          h->actual_handle_size = size - off;
          h->handle_offset = off;
-          h->cipher = cipher;
-          h->extraspec = extraspec;
-          h->module = module;
+          h->spec = spec;
          h->algo = algo;
          h->mode = mode;
          h->flags = flags;
@@ -781,17 +509,6 @@ gcry_cipher_open (gcry_cipher_hd_t *handle,

   /* Done.  */

-  if (err)
-    {
-      if (module)
-        {
-          /* Release module.  */
-          ath_mutex_lock (&ciphers_registered_lock);
-          _gcry_module_release (module);
-          ath_mutex_unlock (&ciphers_registered_lock);
-        }
-    }
-
   *handle = err ? NULL : h;

   return gcry_error (err);
@@ -815,11 +532,6 @@ gcry_cipher_close (gcry_cipher_hd_t h)
   else
     h->magic = 0;

-  /* Release module.  */
-  ath_mutex_lock (&ciphers_registered_lock);
-  _gcry_module_release (h->module);
-  ath_mutex_unlock (&ciphers_registered_lock);
-
   /* We always want to wipe out the memory even when the context has
      been allocated in secure memory.  The user might have disabled
      secure memory or is using his own implementation which does not
@@ -840,13 +552,13 @@ cipher_setkey (gcry_cipher_hd_t c, byte *key, unsigned int keylen)
 {
   gcry_err_code_t ret;

-  ret = (*c->cipher->setkey) (&c->context.c, key, keylen);
+  ret = c->spec->setkey (&c->context.c, key, keylen);
   if (!ret)
     {
       /* Duplicate initial context.  */
-      memcpy ((void *) ((char *) &c->context.c + c->cipher->contextsize),
+      memcpy ((void *) ((char *) &c->context.c + c->spec->contextsize),
              (void *) &c->context.c,
-              c->cipher->contextsize);
+              c->spec->contextsize);
       c->marks.key = 1;
     }
   else
@@ -863,23 +575,23 @@ cipher_setiv (gcry_cipher_hd_t c, const byte *iv, unsigned ivlen)
 {
   /* If the cipher has its own IV handler, we use only this one.  This
      is currently used for stream ciphers requiring a nonce.  */
-  if (c->extraspec && c->extraspec->setiv)
+  if (c->spec->setiv)
     {
-      c->extraspec->setiv (&c->context.c, iv, ivlen);
+      c->spec->setiv (&c->context.c, iv, ivlen);
       return;
     }

-  memset (c->u_iv.iv, 0, c->cipher->blocksize);
+  memset (c->u_iv.iv, 0, c->spec->blocksize);
   if (iv)
     {
-      if (ivlen != c->cipher->blocksize)
+      if (ivlen != c->spec->blocksize)
        {
          log_info ("WARNING: cipher_setiv: ivlen=%u blklen=%u\n",
-                    ivlen, (unsigned int)c->cipher->blocksize);
+                    ivlen, (unsigned int)c->spec->blocksize);
          fips_signal_error ("IV length does not match blocklength");
        }
-      if (ivlen > c->cipher->blocksize)
-        ivlen = c->cipher->blocksize;
+      if (ivlen > c->spec->blocksize)
+        ivlen = c->spec->blocksize;
       memcpy (c->u_iv.iv, iv, ivlen);
       c->marks.iv = 1;
     }
@@ -895,12 +607,12 @@ static void
 cipher_reset (gcry_cipher_hd_t c)
 {
   memcpy (&c->context.c,
-          (char *) &c->context.c + c->cipher->contextsize,
-          c->cipher->contextsize);
+          (char *) &c->context.c + c->spec->contextsize,
+          c->spec->contextsize);
   memset (&c->marks, 0, sizeof c->marks);
-  memset (c->u_iv.iv, 0, c->cipher->blocksize);
-  memset (c->lastiv, 0, c->cipher->blocksize);
-  memset (c->u_ctr.ctr, 0, c->cipher->blocksize);
+  memset (c->u_iv.iv, 0, c->spec->blocksize);
+  memset (c->lastiv, 0, c->spec->blocksize);
+  memset (c->u_ctr.ctr, 0, c->spec->blocksize);
 }

@@ -910,7 +622,7 @@ do_ecb_encrypt (gcry_cipher_hd_t c,
                 unsigned char *outbuf, unsigned int outbuflen,
                 const unsigned char *inbuf, unsigned int inbuflen)
 {
-  unsigned int blocksize = c->cipher->blocksize;
+  unsigned int blocksize = c->spec->blocksize;
   unsigned int n, nblocks;
   unsigned int burn, nburn;

@@ -919,12 +631,12 @@ do_ecb_encrypt (gcry_cipher_hd_t c,
   if ((inbuflen % blocksize))
     return GPG_ERR_INV_LENGTH;

-  nblocks = inbuflen / c->cipher->blocksize;
+  nblocks = inbuflen / c->spec->blocksize;
   burn = 0;

   for (n=0; n < nblocks; n++ )
     {
-      nburn = c->cipher->encrypt (&c->context.c, outbuf, (byte*)/*arggg*/inbuf);
+      nburn = c->spec->encrypt (&c->context.c, outbuf, (byte*)/*arggg*/inbuf);
       burn = nburn > burn ? nburn : burn;
       inbuf  += blocksize;
       outbuf += blocksize;
@@ -941,7 +653,7 @@ do_ecb_decrypt (gcry_cipher_hd_t c,
                 unsigned char *outbuf, unsigned int outbuflen,
                 const unsigned char *inbuf, unsigned int inbuflen)
 {
-  unsigned int blocksize = c->cipher->blocksize;
+  unsigned int blocksize = c->spec->blocksize;
   unsigned int n, nblocks;
   unsigned int burn, nburn;

@@ -950,12 +662,12 @@ do_ecb_decrypt (gcry_cipher_hd_t c,
   if ((inbuflen % blocksize))
     return GPG_ERR_INV_LENGTH;

-  nblocks = inbuflen / c->cipher->blocksize;
+  nblocks = inbuflen / c->spec->blocksize;
   burn = 0;

   for (n=0; n < nblocks; n++ )
     {
-      nburn = c->cipher->decrypt (&c->context.c, outbuf, (byte*)/*arggg*/inbuf);
+      nburn = c->spec->decrypt (&c->context.c, outbuf, (byte*)/*arggg*/inbuf);
       burn = nburn > burn ? nburn : burn;
       inbuf  += blocksize;
       outbuf += blocksize;
@@ -1007,8 +719,8 @@ cipher_encrypt (gcry_cipher_hd_t c, byte *outbuf, unsigned int outbuflen,
       break;

     case GCRY_CIPHER_MODE_STREAM:
-      c->cipher->stencrypt (&c->context.c,
-                            outbuf, (byte*)/*arggg*/inbuf, inbuflen);
+      c->spec->stencrypt (&c->context.c,
+                          outbuf, (byte*)/*arggg*/inbuf, inbuflen);
       rc = 0;
       break;

@@ -1100,8 +812,8 @@ cipher_decrypt (gcry_cipher_hd_t c, byte *outbuf, unsigned int outbuflen,
       break;

     case GCRY_CIPHER_MODE_STREAM:
-      c->cipher->stdecrypt (&c->context.c,
-                            outbuf, (byte*)/*arggg*/inbuf, inbuflen);
+      c->spec->stdecrypt (&c->context.c,
+                          outbuf, (byte*)/*arggg*/inbuf, inbuflen);
       rc = 0;
       break;

@@ -1155,9 +867,9 @@ cipher_sync (gcry_cipher_hd_t c)
   if ((c->flags & GCRY_CIPHER_ENABLE_SYNC) && c->unused)
     {
       memmove (c->u_iv.iv + c->unused,
-               c->u_iv.iv, c->cipher->blocksize - c->unused);
+               c->u_iv.iv, c->spec->blocksize - c->unused);
       memcpy (c->u_iv.iv,
-              c->lastiv + c->cipher->blocksize - c->unused, c->unused);
+              c->lastiv + c->spec->blocksize - c->unused, c->unused);
       c->unused = 0;
     }
 }
@@ -1183,14 +895,14 @@ _gcry_cipher_setiv (gcry_cipher_hd_t hd, const void *iv, size_t ivlen)
 gpg_error_t
 _gcry_cipher_setctr (gcry_cipher_hd_t hd, const void *ctr, size_t ctrlen)
 {
-  if (ctr && ctrlen == hd->cipher->blocksize)
+  if (ctr && ctrlen == hd->spec->blocksize)
     {
-      memcpy (hd->u_ctr.ctr, ctr, hd->cipher->blocksize);
+      memcpy (hd->u_ctr.ctr, ctr, hd->spec->blocksize);
       hd->unused = 0;
     }
   else if (!ctr || !ctrlen)
     {
-      memset (hd->u_ctr.ctr, 0, hd->cipher->blocksize);
+      memset (hd->u_ctr.ctr, 0, hd->spec->blocksize);
       hd->unused = 0;
     }
   else
@@ -1255,8 +967,8 @@ gcry_cipher_ctl( gcry_cipher_hd_t h, int cmd, void *buffer, size_t buflen)
       break;

     case 61:  /* Disable weak key detection (private).  */
-      if (h->extraspec->set_extra_info)
-        rc = h->extraspec->set_extra_info
+      if (h->spec->set_extra_info)
+        rc = h->spec->set_extra_info
          (&h->context.c, CIPHER_INFO_NO_WEAK_KEY, NULL, 0);
       else
        rc = GPG_ERR_NOT_SUPPORTED;
@@ -1268,7 +980,7 @@ gcry_cipher_ctl( gcry_cipher_hd_t h, int cmd, void *buffer, size_t buflen)
              1 byte  Actual length of the block in bytes.
              n byte  The block.
         If the provided buffer is too short, an error is returned. */
-      if (buflen < (1 + h->cipher->blocksize))
+      if (buflen < (1 + h->spec->blocksize))
        rc = GPG_ERR_TOO_SHORT;
       else
        {
@@ -1277,10 +989,10 @@ gcry_cipher_ctl( gcry_cipher_hd_t h, int cmd, void *buffer, size_t buflen)
          int n = h->unused;

          if (!n)
-            n = h->cipher->blocksize;
-          gcry_assert (n <= h->cipher->blocksize);
+            n = h->spec->blocksize;
+          gcry_assert (n <= h->spec->blocksize);
          *dst++ = n;
-          ivp = h->u_iv.iv + h->cipher->blocksize - n;
+          ivp = h->u_iv.iv + h->spec->blocksize - n;
          while (n--)
            *dst++ = *ivp++;
        }
@@ -1434,15 +1146,7 @@ gcry_cipher_get_algo_blklen (int algo)
 gcry_err_code_t
 _gcry_cipher_init (void)
 {
-  gcry_err_code_t err;
-
-  err = ath_mutex_init (&ciphers_registered_lock);
-  if (err)
-    return gpg_err_code_from_errno (err);
-
-  REGISTER_DEFAULT_CIPHERS;
-
-  return err;
+  return 0;
 }

@@ -1451,34 +1155,21 @@ _gcry_cipher_init (void)
 gpg_error_t
 _gcry_cipher_selftest (int algo, int extended, selftest_report_func_t report)
 {
-  gcry_module_t module = NULL;
-  cipher_extra_spec_t *extraspec = NULL;
   gcry_err_code_t ec = 0;
+  gcry_cipher_spec_t *spec;

-  REGISTER_DEFAULT_CIPHERS;
-
-  ath_mutex_lock (&ciphers_registered_lock);
-  module = _gcry_module_lookup_id (ciphers_registered, algo);
-  if (module && !(module->flags & FLAG_MODULE_DISABLED))
-    extraspec = module->extraspec;
-  ath_mutex_unlock (&ciphers_registered_lock);
-  if (extraspec && extraspec->selftest)
-    ec = extraspec->selftest (algo, extended, report);
+  spec = spec_from_algo (algo);
+  if (spec && !spec->flags.disabled && spec->selftest)
+    ec = spec->selftest (algo, extended, report);
   else
     {
       ec = GPG_ERR_CIPHER_ALGO;
       if (report)
        report ("cipher", algo, "module",
-                module && !(module->flags & FLAG_MODULE_DISABLED)?
+                (spec && !spec->flags.disabled)?
                "no selftest available" :
-                module? "algorithm disabled" : "algorithm not found");
+                spec? "algorithm disabled" : "algorithm not found");
     }

-  if (module)
-    {
-      ath_mutex_lock (&ciphers_registered_lock);
-      _gcry_module_release (module);
-      ath_mutex_unlock (&ciphers_registered_lock);
-    }
   return gpg_error (ec);
 }

diff --git a/cipher/des.c b/cipher/des.c
index f1550d1..3464d53 100644
--- a/cipher/des.c
+++ b/cipher/des.c
@@ -1168,6 +1168,7 @@ run_selftests (int algo, int extended, selftest_report_func_t report)

 gcry_cipher_spec_t _gcry_cipher_spec_des =
   {
+    GCRY_CIPHER_DES, {0, 0},
     "DES", NULL, NULL, 8, 64, sizeof (struct _des_ctx),
     do_des_setkey, do_des_encrypt, do_des_decrypt
   };

@@ -1184,12 +1185,10 @@ static gcry_cipher_oid_spec_t oids_tripledes[] =

 gcry_cipher_spec_t _gcry_cipher_spec_tripledes =
   {
+    GCRY_CIPHER_3DES, {0, 1},
     "3DES", NULL, oids_tripledes, 8, 192, sizeof (struct _tripledes_ctx),
-    do_tripledes_setkey, do_tripledes_encrypt, do_tripledes_decrypt
-  };
-
-cipher_extra_spec_t _gcry_cipher_extraspec_tripledes =
-  {
+    do_tripledes_setkey, do_tripledes_encrypt, do_tripledes_decrypt,
+    NULL, NULL,
     run_selftests,
     do_tripledes_set_extra_info
   };

diff --git a/cipher/gost28147.c b/cipher/gost28147.c
index c669148..2bda868 100644
--- a/cipher/gost28147.c
+++ b/cipher/gost28147.c
@@ -227,6 +227,7 @@ gost_decrypt_block (void *c, byte *outbuf, const byte *inbuf)

 gcry_cipher_spec_t _gcry_cipher_spec_gost28147 =
   {
+    GCRY_CIPHER_GOST28147, {0, 0},
     "GOST28147", NULL, NULL, 8, 256,
     sizeof (GOST28147_context),
     gost_setkey,

diff --git a/cipher/idea.c b/cipher/idea.c
index 6e81e84..7d91a9a 100644
--- a/cipher/idea.c
+++ b/cipher/idea.c
@@ -371,8 +371,9 @@ static struct {

 gcry_cipher_spec_t _gcry_cipher_spec_idea =
-{
+  {
+    GCRY_CIPHER_IDEA, {0, 0},
     "IDEA", NULL, NULL, IDEA_BLOCKSIZE, 128,
     sizeof (IDEA_context),
     idea_setkey, idea_encrypt, idea_decrypt
-};
+  };

diff --git a/cipher/md.c b/cipher/md.c
index 280c5d5..c65eb70 100644
--- a/cipher/md.c
+++ b/cipher/md.c
@@ -1414,7 +1414,7 @@ gpg_error_t
 _gcry_md_selftest (int algo, int extended, selftest_report_func_t report)
 {
   gcry_module_t module = NULL;
-  cipher_extra_spec_t *extraspec = NULL;
+  md_extra_spec_t *extraspec = NULL;
   gcry_err_code_t ec = 0;

   REGISTER_DEFAULT_DIGESTS;

diff --git a/cipher/pubkey.c b/cipher/pubkey.c
index 4738c29..1628467 100644
--- a/cipher/pubkey.c
+++ b/cipher/pubkey.c
@@ -2356,7 +2356,7 @@ _gcry_pk_selftest (int algo, int extended, selftest_report_func_t report)

   algo = map_algo (algo);
   spec = spec_from_algo (algo);
-  if (spec && spec->selftest)
+  if (spec && !spec->flags.disabled && spec->selftest)
     ec = spec->selftest (algo, extended, report);
   else
     {

diff --git a/cipher/rfc2268.c b/cipher/rfc2268.c
index da0b9f4..aed8cad 100644
--- a/cipher/rfc2268.c
+++ b/cipher/rfc2268.c
@@ -358,14 +358,18 @@ static gcry_cipher_oid_spec_t oids_rfc2268_128[] =
     { NULL }
   };

-gcry_cipher_spec_t _gcry_cipher_spec_rfc2268_40 = {
-  "RFC2268_40", NULL, oids_rfc2268_40,
-  RFC2268_BLOCKSIZE, 40, sizeof(RFC2268_context),
-  do_setkey, encrypt_block, decrypt_block
-};
+gcry_cipher_spec_t _gcry_cipher_spec_rfc2268_40 =
+  {
+    GCRY_CIPHER_RFC2268_40, {0, 0},
+    "RFC2268_40", NULL, oids_rfc2268_40,
+    RFC2268_BLOCKSIZE, 40, sizeof(RFC2268_context),
+    do_setkey, encrypt_block, decrypt_block
+  };

-gcry_cipher_spec_t _gcry_cipher_spec_rfc2268_128 = {
-  "RFC2268_128", NULL, oids_rfc2268_128,
-  RFC2268_BLOCKSIZE, 128, sizeof(RFC2268_context),
-  do_setkey, encrypt_block, decrypt_block
-};
+gcry_cipher_spec_t _gcry_cipher_spec_rfc2268_128 =
+  {
+    GCRY_CIPHER_RFC2268_128, {0, 0},
+    "RFC2268_128", NULL, oids_rfc2268_128,
+    RFC2268_BLOCKSIZE, 128, sizeof(RFC2268_context),
+    do_setkey, encrypt_block, decrypt_block
+  };

diff --git a/cipher/rijndael.c b/cipher/rijndael.c
index 190d0f9..85c1a41 100644
--- a/cipher/rijndael.c
+++ b/cipher/rijndael.c
@@ -2557,14 +2557,15 @@ static gcry_cipher_oid_spec_t rijndael_oids[] =

 gcry_cipher_spec_t _gcry_cipher_spec_aes =
   {
-    "AES", rijndael_names, rijndael_oids, 16, 128, sizeof (RIJNDAEL_context),
-    rijndael_setkey, rijndael_encrypt, rijndael_decrypt
-  };
-
-cipher_extra_spec_t _gcry_cipher_extraspec_aes =
-  {
+    GCRY_CIPHER_AES, {0, 1},
+    "AES", rijndael_names, rijndael_oids, 16, 128,
+    sizeof (RIJNDAEL_context),
+    rijndael_setkey, rijndael_encrypt, rijndael_decrypt,
+    NULL, NULL,
     run_selftests
   };

+
 static const char *rijndael192_names[] =
   {
     "RIJNDAEL192",
@@ -2583,14 +2584,15 @@ static gcry_cipher_oid_spec_t rijndael192_oids[] =

 gcry_cipher_spec_t _gcry_cipher_spec_aes192 =
   {
-    "AES192", rijndael192_names, rijndael192_oids, 16, 192, sizeof (RIJNDAEL_context),
-    rijndael_setkey, rijndael_encrypt, rijndael_decrypt
-  };
-
-cipher_extra_spec_t _gcry_cipher_extraspec_aes192 =
-  {
+    GCRY_CIPHER_AES192, {0, 1},
+    "AES192", rijndael192_names, rijndael192_oids, 16, 192,
+    sizeof (RIJNDAEL_context),
+    rijndael_setkey, rijndael_encrypt, rijndael_decrypt,
+    NULL, NULL,
     run_selftests
   };

+
 static const char *rijndael256_names[] =
   {
     "RIJNDAEL256",
@@ -2609,12 +2611,10 @@ static gcry_cipher_oid_spec_t rijndael256_oids[] =

 gcry_cipher_spec_t _gcry_cipher_spec_aes256 =
   {
+    GCRY_CIPHER_AES256, {0, 1},
     "AES256", rijndael256_names, rijndael256_oids, 16, 256,
     sizeof (RIJNDAEL_context),
-    rijndael_setkey, rijndael_encrypt, rijndael_decrypt
-  };
-
-cipher_extra_spec_t _gcry_cipher_extraspec_aes256 =
-  {
+    rijndael_setkey, rijndael_encrypt, rijndael_decrypt,
+    NULL, NULL,
     run_selftests
   };

diff --git a/cipher/salsa20.c b/cipher/salsa20.c
index 88f5372..6189bca 100644
--- a/cipher/salsa20.c
+++ b/cipher/salsa20.c
@@ -373,6 +373,8 @@ selftest (void)

 gcry_cipher_spec_t _gcry_cipher_spec_salsa20 =
   {
+    GCRY_CIPHER_SALSA20,
+    {0, 0},       /* flags */
     "SALSA20",    /* name */
     NULL,         /* aliases */
     NULL,         /* oids */
@@ -383,11 +385,16 @@ gcry_cipher_spec_t _gcry_cipher_spec_salsa20 =
     NULL,
     NULL,
     salsa20_encrypt_stream,
-    salsa20_encrypt_stream
+    salsa20_encrypt_stream,
+    NULL,
+    NULL,
+    salsa20_setiv
   };

 gcry_cipher_spec_t _gcry_cipher_spec_salsa20r12 =
   {
+    GCRY_CIPHER_SALSA20R12,
+    {0, 0},       /* flags */
     "SALSA20R12", /* name */
     NULL,         /* aliases */
     NULL,         /* oids */
@@ -398,11 +405,7 @@ gcry_cipher_spec_t _gcry_cipher_spec_salsa20r12 =
     NULL,
     NULL,
     salsa20r12_encrypt_stream,
-    salsa20r12_encrypt_stream
-  };
-
-cipher_extra_spec_t _gcry_cipher_extraspec_salsa20 =
-  {
+    salsa20r12_encrypt_stream,
     NULL,
     NULL,
     salsa20_setiv

diff --git a/cipher/seed.c b/cipher/seed.c
index 474ccba..9f87c05 100644
--- a/cipher/seed.c
+++ b/cipher/seed.c
@@ -470,6 +470,7 @@ static gcry_cipher_oid_spec_t seed_oids[] =

 gcry_cipher_spec_t _gcry_cipher_spec_seed =
   {
+    GCRY_CIPHER_SEED, {0, 0},
     "SEED", NULL, seed_oids, 16, 128, sizeof (SEED_context),
     seed_setkey, seed_encrypt, seed_decrypt,
   };

diff --git a/cipher/serpent.c b/cipher/serpent.c
index 4720b9c..c0898dc 100644
--- a/cipher/serpent.c
+++ b/cipher/serpent.c
@@ -1192,6 +1192,7 @@ static const char *cipher_spec_serpent128_aliases[] =

 gcry_cipher_spec_t _gcry_cipher_spec_serpent128 =
   {
+    GCRY_CIPHER_SERPENT128, {0, 0},
     "SERPENT128", cipher_spec_serpent128_aliases, NULL, 16, 128,
     sizeof (serpent_context_t),
     serpent_setkey, serpent_encrypt, serpent_decrypt
@@ -1199,6 +1200,7 @@ gcry_cipher_spec_t _gcry_cipher_spec_serpent128 =

 gcry_cipher_spec_t _gcry_cipher_spec_serpent192 =
   {
+    GCRY_CIPHER_SERPENT192, {0, 0},
     "SERPENT192", NULL, NULL, 16, 192,
     sizeof (serpent_context_t),
     serpent_setkey, serpent_encrypt, serpent_decrypt
@@ -1206,6 +1208,7 @@ gcry_cipher_spec_t _gcry_cipher_spec_serpent192 =

 gcry_cipher_spec_t _gcry_cipher_spec_serpent256 =
   {
+    GCRY_CIPHER_SERPENT256, {0, 0},
     "SERPENT256", NULL, NULL, 16, 256,
     sizeof (serpent_context_t),
     serpent_setkey, serpent_encrypt, serpent_decrypt

diff --git a/cipher/twofish.c b/cipher/twofish.c
index 17b3aa3..993ad0f 100644
--- a/cipher/twofish.c
+++ b/cipher/twofish.c
@@ -1306,12 +1306,14 @@ main()

 gcry_cipher_spec_t _gcry_cipher_spec_twofish =
   {
+    GCRY_CIPHER_TWOFISH, {0, 0},
     "TWOFISH", NULL, NULL, 16, 256, sizeof (TWOFISH_context),
     twofish_setkey, twofish_encrypt, twofish_decrypt
   };

 gcry_cipher_spec_t _gcry_cipher_spec_twofish128 =
   {
+    GCRY_CIPHER_TWOFISH128, {0, 0},
     "TWOFISH128", NULL, NULL, 16, 128, sizeof (TWOFISH_context),
     twofish_setkey, twofish_encrypt, twofish_decrypt
   };

diff --git a/src/cipher-proto.h b/src/cipher-proto.h
index 5b152b5..62bc8b9 100644
--- a/src/cipher-proto.h
+++ b/src/cipher-proto.h
@@ -149,6 +149,39 @@ typedef struct gcry_pk_spec

+/*
+ *
+ * Symmetric cipher related definitions.
+ *
+ */
+
+/* Type for the cipher_setkey function.  */
+typedef gcry_err_code_t (*gcry_cipher_setkey_t) (void *c,
+                                                 const unsigned char *key,
+                                                 unsigned keylen);
+
+/* Type for the cipher_encrypt function.  */
+typedef unsigned int (*gcry_cipher_encrypt_t) (void *c,
+                                               unsigned char *outbuf,
+                                               const unsigned char *inbuf);
+
+/* Type for the cipher_decrypt function.  */
+typedef unsigned int (*gcry_cipher_decrypt_t) (void *c,
+                                               unsigned char *outbuf,
+                                               const unsigned char *inbuf);
+
+/* Type for the cipher_stencrypt function.  */
+typedef void (*gcry_cipher_stencrypt_t) (void *c,
+                                         unsigned char *outbuf,
+                                         const unsigned char *inbuf,
+                                         unsigned int n);
+
+/* Type for the cipher_stdecrypt function.  */
+typedef void (*gcry_cipher_stdecrypt_t) (void *c,
+                                         unsigned char *outbuf,
+                                         const unsigned char *inbuf,
+                                         unsigned int n);
+
 /* The type used to convey additional information to a cipher.  */
 typedef gpg_err_code_t (*cipher_set_extra_info_t)
      (void *c, int what, const void *buffer, size_t buflen);
@@ -157,15 +190,45 @@ typedef gpg_err_code_t (*cipher_set_extra_info_t)
 typedef void (*cipher_setiv_func_t)(void *c, const byte *iv, unsigned int ivlen);

-/* Extra module specification structures.  These are used for internal
-   modules which provide more functions than available through the
-   public algorithm register APIs.  */
-typedef struct cipher_extra_spec
+/* A structure to map OIDs to encryption modes.  */
+typedef struct gcry_cipher_oid_spec
 {
+  const char *oid;
+  int mode;
+} gcry_cipher_oid_spec_t;
+
+
+/* Module specification structure for ciphers.  */
+typedef struct gcry_cipher_spec
+{
+  int algo;
+  struct {
+    unsigned int disabled:1;
+    unsigned int fips:1;
+  } flags;
+  const char *name;
+  const char **aliases;
+  gcry_cipher_oid_spec_t *oids;
+  size_t blocksize;
+  size_t keylen;
+  size_t contextsize;
+  gcry_cipher_setkey_t setkey;
+  gcry_cipher_encrypt_t encrypt;
+  gcry_cipher_decrypt_t decrypt;
+  gcry_cipher_stencrypt_t stencrypt;
+  gcry_cipher_stdecrypt_t stdecrypt;
   selftest_func_t selftest;
   cipher_set_extra_info_t set_extra_info;
   cipher_setiv_func_t setiv;
-} cipher_extra_spec_t;
+} gcry_cipher_spec_t;
+
+
+
+/*
+ *
+ * Message digest related definitions.
+ *
+ */

 typedef struct md_extra_spec
 {
@@ -174,11 +237,8 @@ typedef struct md_extra_spec
+

 /* The private register functions. */
-gcry_error_t _gcry_cipher_register (gcry_cipher_spec_t *cipher,
-                                    cipher_extra_spec_t *extraspec,
-                                    int *algorithm_id,
-                                    gcry_module_t *module);
 gcry_error_t _gcry_md_register (gcry_md_spec_t *cipher,
                                 md_extra_spec_t *extraspec,
                                 unsigned int *algorithm_id,

diff --git a/src/cipher.h b/src/cipher.h
index 70b46fe..d080e72 100644
--- a/src/cipher.h
+++ b/src/cipher.h
@@ -204,12 +204,6 @@ extern gcry_cipher_spec_t _gcry_cipher_spec_salsa20;
 extern gcry_cipher_spec_t _gcry_cipher_spec_salsa20r12;
 extern gcry_cipher_spec_t _gcry_cipher_spec_gost28147;

-extern cipher_extra_spec_t _gcry_cipher_extraspec_tripledes;
-extern cipher_extra_spec_t _gcry_cipher_extraspec_aes;
-extern cipher_extra_spec_t _gcry_cipher_extraspec_aes192;
-extern cipher_extra_spec_t _gcry_cipher_extraspec_aes256;
-extern cipher_extra_spec_t _gcry_cipher_extraspec_salsa20;
-
 /* Declarations for the digest specifications.  */
 extern gcry_md_spec_t _gcry_digest_spec_crc32;
 extern gcry_md_spec_t _gcry_digest_spec_crc32_rfc1510;

diff --git a/src/gcrypt-module.h b/src/gcrypt-module.h
index 9fcb8ab..621a3a4 100644
--- a/src/gcrypt-module.h
+++ b/src/gcrypt-module.h
@@ -44,59 +44,6 @@ extern "C" {
 /* This type represents a `module'.  */
 typedef struct gcry_module *gcry_module_t;

-/* Check that the library fulfills the version requirement.  */
-
-/* Type for the cipher_setkey function.  */
-typedef gcry_err_code_t (*gcry_cipher_setkey_t) (void *c,
-                                                 const unsigned char *key,
-                                                 unsigned keylen);
-
-/* Type for the cipher_encrypt function.  */
-typedef unsigned int (*gcry_cipher_encrypt_t) (void *c,
-                                               unsigned char *outbuf,
-                                               const unsigned char *inbuf);
-
-/* Type for the cipher_decrypt function.  */
-typedef unsigned int (*gcry_cipher_decrypt_t) (void *c,
-                                               unsigned char *outbuf,
-                                               const unsigned char *inbuf);
-
-/* Type for the cipher_stencrypt function.  */
-typedef void (*gcry_cipher_stencrypt_t) (void *c,
-                                         unsigned char *outbuf,
-                                         const unsigned char *inbuf,
-                                         unsigned int n);
-
-/* Type for the cipher_stdecrypt function.  */
-typedef void (*gcry_cipher_stdecrypt_t) (void *c,
-                                         unsigned char *outbuf,
-                                         const unsigned char *inbuf,
-                                         unsigned int n);
-
-typedef struct gcry_cipher_oid_spec
-{
-  const char *oid;
-  int mode;
-} gcry_cipher_oid_spec_t;
-
-/* Module specification structure for ciphers.  */
-typedef struct gcry_cipher_spec
-{
-  const char *name;
-  const char **aliases;
-  gcry_cipher_oid_spec_t *oids;
-  size_t blocksize;
-  size_t keylen;
-  size_t contextsize;
-  gcry_cipher_setkey_t setkey;
-  gcry_cipher_encrypt_t encrypt;
-  gcry_cipher_decrypt_t decrypt;
-  gcry_cipher_stencrypt_t stencrypt;
-  gcry_cipher_stdecrypt_t stdecrypt;
-} gcry_cipher_spec_t;
-
-/* ********************** */

 /* ********************** */

commit 4153fa859816e799e506055321a22e6450aacdcc
Author: Werner Koch
Date:   Tue Oct 1 17:47:27 2013 +0200

    mpi: Fix gcry_mpi_neg.

    * mpi/mpiutil.c (_gcry_mpi_neg): Copy U to W.
Signed-off-by: Werner Koch diff --git a/mpi/mpiutil.c b/mpi/mpiutil.c index c9e6b31..a82a8e7 100644 --- a/mpi/mpiutil.c +++ b/mpi/mpiutil.c @@ -373,7 +373,9 @@ _gcry_mpi_is_neg (gcry_mpi_t a) void _gcry_mpi_neg (gcry_mpi_t w, gcry_mpi_t u) { - if (mpi_is_immutable (w)) + if (w != u) + mpi_set (w, u); + else if (mpi_is_immutable (w)) { mpi_immutable_failed (); return; ----------------------------------------------------------------------- Summary of changes: cipher/arcfour.c | 1 + cipher/blowfish.c | 1 + cipher/camellia-glue.c | 3 + cipher/cast5.c | 1 + cipher/cipher-aeswrap.c | 8 +- cipher/cipher-cbc.c | 26 +- cipher/cipher-cfb.c | 18 +- cipher/cipher-ctr.c | 4 +- cipher/cipher-internal.h | 3 +- cipher/cipher-ofb.c | 14 +- cipher/cipher.c | 685 +++++++++++++--------------------------------- cipher/des.c | 9 +- cipher/gost28147.c | 1 + cipher/idea.c | 5 +- cipher/md.c | 2 +- cipher/pubkey.c | 2 +- cipher/rfc2268.c | 24 +- cipher/rijndael.c | 30 +- cipher/salsa20.c | 15 +- cipher/seed.c | 1 + cipher/serpent.c | 3 + cipher/twofish.c | 2 + mpi/mpiutil.c | 4 +- src/cipher-proto.h | 78 +++++- src/cipher.h | 6 - src/gcrypt-module.h | 53 ---- 26 files changed, 356 insertions(+), 643 deletions(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From gniibe at fsij.org Wed Oct 2 02:41:52 2013 From: gniibe at fsij.org (NIIBE Yutaka) Date: Wed, 02 Oct 2013 09:41:52 +0900 Subject: possible mpi-pow improvement In-Reply-To: <87a9itb52z.fsf@vigenere.g10code.de> References: <1378456897.3188.14.camel@cfw2.gniibe.org> <87a9itb52z.fsf@vigenere.g10code.de> Message-ID: <1380674512.3342.2.camel@cfw2.gniibe.org> On 2013-10-01 at 13:31 +0200, Werner Koch wrote: > Thus you change is even an improvement for the general case. Yes, at least on my machine.
> Can you please change your patch to conditionally include the k-ary > multiply but enable it right away. Sure. Here is the current version, which is ready to push. Since then, I modified the selection of the value W (window size) for 64-bit machines, and added some comments. With the compilation option -DUSE_ALGORITHM_SIMPLE_EXPONENTIATION, we can use the old implementation. Please test it out. * mpi/mpi-pow.c (gcry_mpi_powm): New implementation of left-to-right k-ary exponentiation. -- Signed-off-by: NIIBE Yutaka For the Yarom/Falkner flush+reload cache side-channel attack, we changed the code so that it always calls the multiplication routine (even if we could skip it to get the result). This results in some performance regression. This change recovers performance with an efficient algorithm. --- mpi/mpi-pow.c | 454 +++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 454 insertions(+) diff --git a/mpi/mpi-pow.c b/mpi/mpi-pow.c index 85d6fd8..469c382 100644 --- a/mpi/mpi-pow.c +++ b/mpi/mpi-pow.c @@ -34,6 +34,14 @@ #include "longlong.h" +/* + * When you need the old implementation, please add the compilation option + * -DUSE_ALGORITHM_SIMPLE_EXPONENTIATION + * or expose this line: +#define USE_ALGORITHM_SIMPLE_EXPONENTIATION 1 + */ + +#if defined(USE_ALGORITHM_SIMPLE_EXPONENTIATION) /**************** * RES = BASE ^ EXPO mod MOD */ @@ -336,3 +344,449 @@ gcry_mpi_powm (gcry_mpi_t res, if (tspace) _gcry_mpi_free_limb_space( tspace, 0 ); } +#else +/** + * Internal function to compute + * + * X = R * S mod M + * + * and set the size of X at the pointer XSIZE_P. + * Use the karatsuba structure at KARACTX_P. + * + * Condition: + * RSIZE >= SSIZE + * Enough space for X is allocated beforehand. + * + * For generic cases, we can/should use gcry_mpi_mulm. + * This function is used for a specific internal case.
+ */ +static void +mul_mod (mpi_ptr_t xp, mpi_size_t *xsize_p, + mpi_ptr_t rp, mpi_size_t rsize, + mpi_ptr_t sp, mpi_size_t ssize, + mpi_ptr_t mp, mpi_size_t msize, + struct karatsuba_ctx *karactx_p) +{ + if( ssize < KARATSUBA_THRESHOLD ) + _gcry_mpih_mul ( xp, rp, rsize, sp, ssize ); + else + _gcry_mpih_mul_karatsuba_case (xp, rp, rsize, sp, ssize, karactx_p); + + if (rsize + ssize > msize) + { + _gcry_mpih_divrem (xp + msize, 0, xp, rsize + ssize, mp, msize); + *xsize_p = msize; + } + else + *xsize_p = rsize + ssize; +} + +#define SIZE_B_2I3 ((1 << (5 - 1)) - 1) + +/**************** + * RES = BASE ^ EXPO mod MOD + * + * To mitigate the Yarom/Falkner flush+reload cache side-channel + * attack on the RSA secret exponent, we don't use the square + * routine but multiplication. + * + * Reference: + * Handbook of Applied Cryptography + * Algorithm 14.83: Modified left-to-right k-ary exponentiation + */ +void +gcry_mpi_powm (gcry_mpi_t res, + gcry_mpi_t base, gcry_mpi_t expo, gcry_mpi_t mod) +{ + /* Pointer to the limbs of the arguments, their size and signs. */ + mpi_ptr_t rp, ep, mp, bp; + mpi_size_t esize, msize, bsize, rsize; + int msign, bsign, rsign; + /* Flags telling the secure allocation status of the arguments. */ + int esec, msec, bsec; + /* Size of the result including space for temporary values. */ + mpi_size_t size; + /* Helper. */ + int mod_shift_cnt; + int negative_result; + mpi_ptr_t mp_marker = NULL; + mpi_ptr_t bp_marker = NULL; + mpi_ptr_t ep_marker = NULL; + mpi_ptr_t xp_marker = NULL; + unsigned int mp_nlimbs = 0; + unsigned int bp_nlimbs = 0; + unsigned int ep_nlimbs = 0; + unsigned int xp_nlimbs = 0; + mpi_ptr_t b_2i3[SIZE_B_2I3]; /* Pre-computed array: BASE^3, ^5, ^7, ... 
*/ + mpi_size_t b_2i3size[SIZE_B_2I3]; + mpi_size_t W; + mpi_ptr_t base_u; + mpi_size_t base_u_size; + + esize = expo->nlimbs; + msize = mod->nlimbs; + size = 2 * msize; + msign = mod->sign; + + if (esize * BITS_PER_MPI_LIMB > 512) + W = 5; + else if (esize * BITS_PER_MPI_LIMB > 256) + W = 4; + else if (esize * BITS_PER_MPI_LIMB > 128) + W = 3; + else if (esize * BITS_PER_MPI_LIMB > 64) + W = 2; + else + W = 1; + + esec = mpi_is_secure(expo); + msec = mpi_is_secure(mod); + bsec = mpi_is_secure(base); + + rp = res->d; + ep = expo->d; + + if (!msize) + _gcry_divide_by_zero(); + + if (!esize) + { + /* Exponent is zero, result is 1 mod MOD, i.e., 1 or 0 depending + on if MOD equals 1. */ + res->nlimbs = (msize == 1 && mod->d[0] == 1) ? 0 : 1; + if (res->nlimbs) + { + RESIZE_IF_NEEDED (res, 1); + rp = res->d; + rp[0] = 1; + } + res->sign = 0; + goto leave; + } + + /* Normalize MOD (i.e. make its most significant bit set) as + required by mpn_divrem. This will make the intermediate values + in the calculation slightly larger, but the correct result is + obtained after a final reduction using the original MOD value. */ + mp_nlimbs = msec? msize:0; + mp = mp_marker = mpi_alloc_limb_space(msize, msec); + count_leading_zeros (mod_shift_cnt, mod->d[msize-1]); + if (mod_shift_cnt) + _gcry_mpih_lshift (mp, mod->d, msize, mod_shift_cnt); + else + MPN_COPY( mp, mod->d, msize ); + + bsize = base->nlimbs; + bsign = base->sign; + if (bsize > msize) + { + /* The base is larger than the module. Reduce it. + + Allocate (BSIZE + 1) with space for remainder and quotient. + (The quotient is (bsize - msize + 1) limbs.) */ + bp_nlimbs = bsec ? (bsize + 1):0; + bp = bp_marker = mpi_alloc_limb_space( bsize + 1, bsec ); + MPN_COPY ( bp, base->d, bsize ); + /* We don't care about the quotient, store it above the + * remainder, at BP + MSIZE. 
*/ + _gcry_mpih_divrem( bp + msize, 0, bp, bsize, mp, msize ); + bsize = msize; + /* Canonicalize the base, since we are going to multiply with it + quite a few times. */ + MPN_NORMALIZE( bp, bsize ); + } + else + bp = base->d; + + if (!bsize) + { + res->nlimbs = 0; + res->sign = 0; + goto leave; + } + + + /* Make BASE, EXPO and MOD not overlap with RES. */ + if ( rp == bp ) + { + /* RES and BASE are identical. Allocate temp. space for BASE. */ + gcry_assert (!bp_marker); + bp_nlimbs = bsec? bsize:0; + bp = bp_marker = mpi_alloc_limb_space( bsize, bsec ); + MPN_COPY(bp, rp, bsize); + } + if ( rp == ep ) + { + /* RES and EXPO are identical. Allocate temp. space for EXPO. */ + ep_nlimbs = esec? esize:0; + ep = ep_marker = mpi_alloc_limb_space( esize, esec ); + MPN_COPY(ep, rp, esize); + } + if ( rp == mp ) + { + /* RES and MOD are identical. Allocate temporary space for MOD.*/ + gcry_assert (!mp_marker); + mp_nlimbs = msec?msize:0; + mp = mp_marker = mpi_alloc_limb_space( msize, msec ); + MPN_COPY(mp, rp, msize); + } + + /* Copy base to the result. */ + if (res->alloced < size) + { + mpi_resize (res, size); + rp = res->d; + } + + /* Main processing. */ + { + mpi_size_t i, j; + mpi_ptr_t xp; + mpi_size_t xsize; + int c; + mpi_limb_t e; + mpi_limb_t carry_limb; + struct karatsuba_ctx karactx; + mpi_ptr_t tp; + + xp_nlimbs = msec? (2 * (msize + 1)):0; + xp = xp_marker = mpi_alloc_limb_space( 2 * (msize + 1), msec ); + + memset( &karactx, 0, sizeof karactx ); + negative_result = (ep[0] & 1) && bsign; + + /* Precompute B_2I3[], BASE^(2 * i + 3), BASE^3, ^5, ^7, ... 
*/ + if (W > 1) /* X := BASE^2 */ + mul_mod (xp, &xsize, bp, bsize, bp, bsize, mp, msize, &karactx); + for (i = 0; i < (1 << (W - 1)) - 1; i++) + { /* B_2I3[i] = BASE^(2 * i + 3) */ + if (i == 0) + { + base_u = bp; + base_u_size = bsize; + } + else + { + base_u = b_2i3[i-1]; + base_u_size = b_2i3size[i-1]; + } + + if (xsize >= base_u_size) + mul_mod (rp, &rsize, xp, xsize, base_u, base_u_size, + mp, msize, &karactx); + else + mul_mod (rp, &rsize, base_u, base_u_size, xp, xsize, + mp, msize, &karactx); + b_2i3[i] = mpi_alloc_limb_space (rsize, esec); + b_2i3size[i] = rsize; + MPN_COPY (b_2i3[i], rp, rsize); + } + + i = esize - 1; + + /* Main loop. + + Make the result be pointed to alternately by XP and RP. This + helps us avoid block copying, which would otherwise be + necessary with the overlap restrictions of + _gcry_mpih_divmod. With 50% probability the result after this + loop will be in the area originally pointed by RP (==RES->d), + and with 50% probability in the area originally pointed to by XP. 
*/ + rsign = 0; + if (W == 1) + { + rsize = bsize; + } + else + { + rsize = msize; + MPN_ZERO (rp, rsize); + } + MPN_COPY ( rp, bp, bsize ); + + e = ep[i]; + count_leading_zeros (c, e); + e = (e << c) << 1; + c = BITS_PER_MPI_LIMB - 1 - c; + + j = 0; + + for (;;) + if (e == 0) + { + j += c; + i--; + if ( i < 0 ) + { + c = 0; + break; + } + + e = ep[i]; + c = BITS_PER_MPI_LIMB; + } + else + { + int c0; + mpi_limb_t e0; + + count_leading_zeros (c0, e); + e = (e << c0); + c -= c0; + j += c0; + + if (c >= W) + { + e0 = (e >> (BITS_PER_MPI_LIMB - W)); + e = (e << W); + c -= W; + } + else + { + i--; + if ( i < 0 ) + { + e = (e >> (BITS_PER_MPI_LIMB - c)); + break; + } + + c0 = c; + e0 = (e >> (BITS_PER_MPI_LIMB - W)) + | (ep[i] >> (BITS_PER_MPI_LIMB - W + c0)); + e = (ep[i] << (W - c0)); + c = BITS_PER_MPI_LIMB - W + c0; + } + + count_trailing_zeros (c0, e0); + e0 = (e0 >> c0) >> 1; + + for (j += W - c0; j; j--) + { + mul_mod (xp, &xsize, rp, rsize, rp, rsize, mp, msize, &karactx); + tp = rp; rp = xp; xp = tp; + rsize = xsize; + } + + if (e0 == 0) + { + base_u = bp; + base_u_size = bsize; + } + else + { + base_u = b_2i3[e0 - 1]; + base_u_size = b_2i3size[e0 -1]; + } + + mul_mod (xp, &xsize, rp, rsize, base_u, base_u_size, + mp, msize, &karactx); + tp = rp; rp = xp; xp = tp; + rsize = xsize; + + j = c0; + } + + if (c != 0) + { + j += c; + count_trailing_zeros (c, e); + e = (e >> c); + j -= c; + } + + while (j--) + { + mul_mod (xp, &xsize, rp, rsize, rp, rsize, mp, msize, &karactx); + tp = rp; rp = xp; xp = tp; + rsize = xsize; + } + + if (e != 0) + { + if ((e>>1) == 0) + { + base_u = bp; + base_u_size = bsize; + } + else + { + base_u = b_2i3[(e>>1) - 1]; + base_u_size = b_2i3size[(e>>1) -1]; + } + + mul_mod (xp, &xsize, rp, rsize, base_u, base_u_size, + mp, msize, &karactx); + tp = rp; rp = xp; xp = tp; + rsize = xsize; + + for (; c; c--) + { + mul_mod (xp, &xsize, rp, rsize, rp, rsize, mp, msize, &karactx); + tp = rp; rp = xp; xp = tp; + rsize = xsize; + } + } + + /* We 
shifted MOD, the modulo reduction argument, left + MOD_SHIFT_CNT steps. Adjust the result by reducing it with the + original MOD. + + Also make sure the result is put in RES->d (where it already + might be, see above). */ + if ( mod_shift_cnt ) + { + carry_limb = _gcry_mpih_lshift( res->d, rp, rsize, mod_shift_cnt); + rp = res->d; + if ( carry_limb ) + { + rp[rsize] = carry_limb; + rsize++; + } + } + else if (res->d != rp) + { + MPN_COPY (res->d, rp, rsize); + rp = res->d; + } + + if ( rsize >= msize ) + { + _gcry_mpih_divrem(rp + msize, 0, rp, rsize, mp, msize); + rsize = msize; + } + + /* Remove any leading zero words from the result. */ + if ( mod_shift_cnt ) + _gcry_mpih_rshift( rp, rp, rsize, mod_shift_cnt); + MPN_NORMALIZE (rp, rsize); + + _gcry_mpih_release_karatsuba_ctx (&karactx ); + for (i = 0; i < (1 << (W - 1)) - 1; i++) + _gcry_mpi_free_limb_space( b_2i3[i], esec ? b_2i3size[i] : 0 ); + } + + /* Fixup for negative results. */ + if ( negative_result && rsize ) + { + if ( mod_shift_cnt ) + _gcry_mpih_rshift( mp, mp, msize, mod_shift_cnt); + _gcry_mpih_sub( rp, mp, msize, rp, rsize); + rsize = msize; + rsign = msign; + MPN_NORMALIZE(rp, rsize); + } + gcry_assert (res->d == rp); + res->nlimbs = rsize; + res->sign = rsign; + + leave: + if (mp_marker) + _gcry_mpi_free_limb_space( mp_marker, mp_nlimbs ); + if (bp_marker) + _gcry_mpi_free_limb_space( bp_marker, bp_nlimbs ); + if (ep_marker) + _gcry_mpi_free_limb_space( ep_marker, ep_nlimbs ); + if (xp_marker) + _gcry_mpi_free_limb_space( xp_marker, xp_nlimbs ); +} +#endif -- 1.7.10.4 From cvs at cvs.gnupg.org Wed Oct 2 14:42:52 2013 From: cvs at cvs.gnupg.org (by Dmitry Eremin-Solenikov) Date: Wed, 02 Oct 2013 14:42:52 +0200 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-289-g33757c1 Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". 
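The left-to-right k-ary exponentiation implemented by NIIBE's mpi-pow patch above (Handbook of Applied Cryptography, Algorithm 14.83 and the related sliding-window variants) is easier to follow with arbitrary-precision integers. The sketch below is an illustrative model only, not the libgcrypt code: it uses Python big integers instead of limb arrays, a sliding rather than fixed window, and plain squaring, so it deliberately omits the patch's side-channel countermeasure of always calling the multiplication routine; the names powm_kary and w are invented here for illustration.

```python
def powm_kary(base, expo, mod, w=5):
    """Left-to-right sliding-window (k-ary) modular exponentiation.

    Illustrative model of the approach in the patch; libgcrypt works on
    limb arrays, uses fixed windows, and keeps the multiplication
    pattern side-channel aware, all of which is elided here.
    """
    if mod == 0:
        raise ZeroDivisionError("modulus is zero")
    if expo == 0:
        return 1 % mod  # mirrors the esize == 0 case in the patch
    base %= mod
    # Precompute the odd powers base^1, base^3, ..., base^(2^w - 1),
    # analogous to the patch's B_2I3[] table (BASE^3, ^5, ^7, ...).
    sq = base * base % mod
    odd = {1: base}
    for k in range(3, 1 << w, 2):
        odd[k] = odd[k - 2] * sq % mod
    result = 1
    i = expo.bit_length() - 1
    while i >= 0:
        if not (expo >> i) & 1:
            result = result * result % mod  # zero bit: square only
            i -= 1
        else:
            # Take the longest window (at most w bits) ending in a set
            # bit, square once per window bit, then multiply in the
            # precomputed odd power selected by the window value.
            j = max(i - w + 1, 0)
            while not (expo >> j) & 1:
                j += 1
            for _ in range(i - j + 1):
                result = result * result % mod
            window = (expo >> j) & ((1 << (i - j + 1)) - 1)
            result = result * odd[window] % mod
            i = j - 1
    return result
```

In the same spirit as the patch's choice of W, a wide window only pays off for long exponents: the patch scales W from 5 for exponents above 512 bits down to 1 below 64 bits, trading table precomputation against saved multiplications.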
The branch, master has been updated via 33757c1e03f1d885920633edf543cd1c77999455 (commit) from 628ed5ba0ef4b1f04b5a77e29e4bc49a1fe13c07 (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit 33757c1e03f1d885920633edf543cd1c77999455 Author: Dmitry Eremin-Solenikov Date: Fri Sep 20 01:14:12 2013 +0400 Fix errors when building with Clang on PPC * mpi/longlong.h (add_ssaaaa, sub_ddmmss, count_leading_zeros, umul_ppmm): Do not cast asm output to USItype. Signed-off-by: Dmitry Eremin-Solenikov diff --git a/mpi/longlong.h b/mpi/longlong.h index 4e315df..c2ab9c5 100644 --- a/mpi/longlong.h +++ b/mpi/longlong.h @@ -862,22 +862,22 @@ typedef unsigned int UTItype __attribute__ ((mode (TI))); do { \ if (__builtin_constant_p (bh) && (bh) == 0) \ __asm__ ("{a%I4|add%I4c} %1,%3,%4\n\t{aze|addze} %0,%2" \ - : "=r" ((USItype)(sh)), \ - "=&r" ((USItype)(sl)) \ + : "=r" ((sh)), \ + "=&r" ((sl)) \ : "%r" ((USItype)(ah)), \ "%r" ((USItype)(al)), \ "rI" ((USItype)(bl))); \ else if (__builtin_constant_p (bh) && (bh) ==~(USItype) 0) \ __asm__ ("{a%I4|add%I4c} %1,%3,%4\n\t{ame|addme} %0,%2" \ - : "=r" ((USItype)(sh)), \ - "=&r" ((USItype)(sl)) \ + : "=r" ((sh)), \ + "=&r" ((sl)) \ : "%r" ((USItype)(ah)), \ "%r" ((USItype)(al)), \ "rI" ((USItype)(bl))); \ else \ __asm__ ("{a%I5|add%I5c} %1,%4,%5\n\t{ae|adde} %0,%2,%3" \ - : "=r" ((USItype)(sh)), \ - "=&r" ((USItype)(sl)) \ + : "=r" ((sh)), \ + "=&r" ((sl)) \ : "%r" ((USItype)(ah)), \ "r" ((USItype)(bh)), \ "%r" ((USItype)(al)), \ @@ -887,36 +887,36 @@ typedef unsigned int UTItype __attribute__ ((mode (TI))); do { \ if (__builtin_constant_p (ah) && (ah) == 0) \ __asm__ ("{sf%I3|subf%I3c} %1,%4,%3\n\t{sfze|subfze} %0,%2" \ - : "=r" ((USItype)(sh)), \ - "=&r" ((USItype)(sl)) \ + : "=r" ((sh)), \ + "=&r" ((sl)) \ : "r" ((USItype)(bh)), \ "rI" ((USItype)(al)), 
\ "r" ((USItype)(bl))); \ else if (__builtin_constant_p (ah) && (ah) ==~(USItype) 0) \ __asm__ ("{sf%I3|subf%I3c} %1,%4,%3\n\t{sfme|subfme} %0,%2" \ - : "=r" ((USItype)(sh)), \ - "=&r" ((USItype)(sl)) \ + : "=r" ((sh)), \ + "=&r" ((sl)) \ : "r" ((USItype)(bh)), \ "rI" ((USItype)(al)), \ "r" ((USItype)(bl))); \ else if (__builtin_constant_p (bh) && (bh) == 0) \ __asm__ ("{sf%I3|subf%I3c} %1,%4,%3\n\t{ame|addme} %0,%2" \ - : "=r" ((USItype)(sh)), \ - "=&r" ((USItype)(sl)) \ + : "=r" ((sh)), \ + "=&r" ((sl)) \ : "r" ((USItype)(ah)), \ "rI" ((USItype)(al)), \ "r" ((USItype)(bl))); \ else if (__builtin_constant_p (bh) && (bh) ==~(USItype) 0) \ __asm__ ("{sf%I3|subf%I3c} %1,%4,%3\n\t{aze|addze} %0,%2" \ - : "=r" ((USItype)(sh)), \ - "=&r" ((USItype)(sl)) \ + : "=r" ((sh)), \ + "=&r" ((sl)) \ : "r" ((USItype)(ah)), \ "rI" ((USItype)(al)), \ "r" ((USItype)(bl))); \ else \ __asm__ ("{sf%I4|subf%I4c} %1,%5,%4\n\t{sfe|subfe} %0,%3,%2" \ - : "=r" ((USItype)(sh)), \ - "=&r" ((USItype)(sl)) \ + : "=r" ((sh)), \ + "=&r" ((sl)) \ : "r" ((USItype)(ah)), \ "r" ((USItype)(bh)), \ "rI" ((USItype)(al)), \ @@ -924,7 +924,7 @@ typedef unsigned int UTItype __attribute__ ((mode (TI))); } while (0) #define count_leading_zeros(count, x) \ __asm__ ("{cntlz|cntlzw} %0,%1" \ - : "=r" ((USItype)(count)) \ + : "=r" ((count)) \ : "r" ((USItype)(x))) #define COUNT_LEADING_ZEROS_0 32 #if defined (_ARCH_PPC) @@ -932,7 +932,7 @@ typedef unsigned int UTItype __attribute__ ((mode (TI))); do { \ USItype __m0 = (m0), __m1 = (m1); \ __asm__ ("mulhwu %0,%1,%2" \ - : "=r" ((USItype) ph) \ + : "=r" (ph) \ : "%r" (__m0), \ "r" (__m1)); \ (pl) = __m0 * __m1; \ @@ -954,8 +954,8 @@ typedef unsigned int UTItype __attribute__ ((mode (TI))); do { \ USItype __m0 = (m0), __m1 = (m1); \ __asm__ ("mul %0,%2,%3" \ - : "=r" ((USItype)(xh)), \ - "=q" ((USItype)(xl)) \ + : "=r" ((xh)), \ + "=q" ((xl)) \ : "r" (__m0), \ "r" (__m1)); \ (xh) += ((((SItype) __m0 >> 31) & __m1) \ 
----------------------------------------------------------------------- Summary of changes: mpi/longlong.h | 40 ++++++++++++++++++++-------------------- 1 files changed, 20 insertions(+), 20 deletions(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From cvs at cvs.gnupg.org Wed Oct 2 14:52:26 2013 From: cvs at cvs.gnupg.org (by Werner Koch) Date: Wed, 02 Oct 2013 14:52:26 +0200 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-290-gf04a1db Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via f04a1db22d982627ba87da4e5df52df9b994c779 (commit) from 33757c1e03f1d885920633edf543cd1c77999455 (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit f04a1db22d982627ba87da4e5df52df9b994c779 Author: Werner Koch Date: Wed Oct 2 14:14:57 2013 +0200 Remove deprecated control codes. * src/gcrypt.h.in (GCRYCTL_SET_KEY): Remove. (GCRYCTL_SET_IV): Remove. (GCRYCTL_SET_CTR): Remove. * cipher/md.c (gcry_md_ctl): Remove deprecated GCRYCTL_SET_KEY. * cipher/cipher.c (gcry_cipher_ctl): Remove deprecated GCRYCTL_SET_KEY, GCRYCTL_SET_IV, GCRYCTL_SET_CTR. -- Real functions are available for a long time now thus there is no more point in supporting the control code hacks. We have an ABI break anyway thus this is a good time to get rid of them. 
Signed-off-by: Werner Koch diff --git a/NEWS b/NEWS index c232a99..ab326eb 100644 --- a/NEWS +++ b/NEWS @@ -10,6 +10,8 @@ Noteworthy changes in version 1.6.0 (unreleased) * The deprecated message digest debug macros have been removed. Use gcry_md_debug instead. + * Removed deprecated control codes. + * Added support for the IDEA cipher algorithm. * Added support for the Salsa20 and reduced Salsa20/12 stream ciphers. @@ -24,10 +26,12 @@ Noteworthy changes in version 1.6.0 (unreleased) * Added support for the SCRYPT algorithm. - * Mitigate the Yarom/Falkner flush+reload side-channel attack on RSA + * Mitigated the Yarom/Falkner flush+reload side-channel attack on RSA secret keys. See [CVE-2013-4242]. - * Support Deterministic DSA as per RFC-6969. + * Added support for Deterministic DSA as per RFC-6969. + + * Added support for curve Ed25519. * Added a scatter gather hash convenience function. @@ -41,20 +45,24 @@ Noteworthy changes in version 1.6.0 (unreleased) * Interface changes relative to the 1.5.0 release: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - gcry_ac_* REMOVED. - GCRY_AC_* REMOVED. - gcry_module_t REMOVED. - gcry_cipher_register REMOVED. - gcry_cipher_unregister REMOVED. - gcry_cipher_list REMOVED. - gcry_pk_register REMOVED. - gcry_pk_unregister REMOVED. - gcry_pk_list REMOVED. - gcry_md_register REMOVED. - gcry_md_unregister REMOVED. - gcry_md_list REMOVED. - gcry_md_start_debug REMOVED (macro). - gcry_md_stop_debug REMOVED (macro). + gcry_ac_* REMOVED. + GCRY_AC_* REMOVED. + gcry_module_t REMOVED. + gcry_cipher_register REMOVED. + gcry_cipher_unregister REMOVED. + gcry_cipher_list REMOVED. + gcry_pk_register REMOVED. + gcry_pk_unregister REMOVED. + gcry_pk_list REMOVED. + gcry_md_register REMOVED. + gcry_md_unregister REMOVED. + gcry_md_list REMOVED. + gcry_md_start_debug REMOVED (macro). + gcry_md_stop_debug REMOVED (macro). + GCRYCTL_SET_KEY REMOVED. + GCRYCTL_SET_IV REMOVED. + GCRYCTL_SET_CTR REMOVED. 
+ GCRYCTL_DISABLE_ALGO CHANGED: Not anymore thread-safe. gcry_md_hash_buffers NEW. gcry_buffer_t NEW. GCRYCTL_SET_ENFORCED_FIPS_FLAG NEW. @@ -105,7 +113,6 @@ Noteworthy changes in version 1.6.0 (unreleased) GCRY_MD_GOSTR3411_94 NEW. GCRY_MD_STRIBOG256 NEW. GCRY_MD_STRIBOG512 NEW. - GCRYCTL_DISABLE_ALGO CHANGED: Not anymore thread-safe. GCRY_PK_ECC NEW. gcry_log_debug NEW. gcry_log_debughex NEW. diff --git a/cipher/cipher.c b/cipher/cipher.c index ca61375..75d42d1 100644 --- a/cipher/cipher.c +++ b/cipher/cipher.c @@ -918,14 +918,6 @@ gcry_cipher_ctl( gcry_cipher_hd_t h, int cmd, void *buffer, size_t buflen) switch (cmd) { - case GCRYCTL_SET_KEY: /* Deprecated; use gcry_cipher_setkey. */ - rc = cipher_setkey( h, buffer, buflen ); - break; - - case GCRYCTL_SET_IV: /* Deprecated; use gcry_cipher_setiv. */ - cipher_setiv( h, buffer, buflen ); - break; - case GCRYCTL_RESET: cipher_reset (h); break; @@ -962,10 +954,6 @@ gcry_cipher_ctl( gcry_cipher_hd_t h, int cmd, void *buffer, size_t buflen) disable_cipher_algo( *(int*)buffer ); break; - case GCRYCTL_SET_CTR: /* Deprecated; use gcry_cipher_setctr. */ - rc = gpg_err_code (_gcry_cipher_setctr (h, buffer, buflen)); - break; - case 61: /* Disable weak key detection (private). */ if (h->spec->set_extra_info) rc = h->spec->set_extra_info diff --git a/cipher/md.c b/cipher/md.c index e3cc6c6..5c66397 100644 --- a/cipher/md.c +++ b/cipher/md.c @@ -717,14 +717,13 @@ gcry_md_ctl (gcry_md_hd_t hd, int cmd, void *buffer, size_t buflen) { gcry_err_code_t rc = 0; + (void)buflen; /* Currently not used. 
*/ + switch (cmd) { case GCRYCTL_FINALIZE: md_final (hd); break; - case GCRYCTL_SET_KEY: - rc = gcry_err_code (gcry_md_setkey (hd, buffer, buflen)); - break; case GCRYCTL_START_DUMP: md_start_debug (hd, buffer); break; diff --git a/src/gcrypt.h.in b/src/gcrypt.h.in index 8fca52e..719b951 100644 --- a/src/gcrypt.h.in +++ b/src/gcrypt.h.in @@ -257,8 +257,7 @@ const char *gcry_check_version (const char *req_version); /* Codes used with the gcry_control function. */ enum gcry_ctl_cmds { - GCRYCTL_SET_KEY = 1, - GCRYCTL_SET_IV = 2, + /* Note: 1 .. 2 are not anymore used. */ GCRYCTL_CFB_SYNC = 3, GCRYCTL_RESET = 4, /* e.g. for MDs */ GCRYCTL_FINALIZE = 5, @@ -298,7 +297,7 @@ enum gcry_ctl_cmds GCRYCTL_ANY_INITIALIZATION_P = 40, GCRYCTL_SET_CBC_CTS = 41, GCRYCTL_SET_CBC_MAC = 42, - GCRYCTL_SET_CTR = 43, + /* Note: 43 is not anymore used. */ GCRYCTL_ENABLE_QUICK_RANDOM = 44, GCRYCTL_SET_RANDOM_SEED_FILE = 45, GCRYCTL_UPDATE_RANDOM_SEED_FILE = 46, ----------------------------------------------------------------------- Summary of changes: NEWS | 41 ++++++++++++++++++++++++----------------- cipher/cipher.c | 12 ------------ cipher/md.c | 5 ++--- src/gcrypt.h.in | 5 ++--- 4 files changed, 28 insertions(+), 35 deletions(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From cvs at cvs.gnupg.org Wed Oct 2 15:32:17 2013 From: cvs at cvs.gnupg.org (by Jussi Kivilinna) Date: Wed, 02 Oct 2013 15:32:17 +0200 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-291-g6410152 Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". 
The branch, master has been updated via 6410152338a2b2ac1216e70c153cd16f9199c94e (commit) from f04a1db22d982627ba87da4e5df52df9b994c779 (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit 6410152338a2b2ac1216e70c153cd16f9199c94e Author: Jussi Kivilinna Date: Wed Oct 2 15:59:59 2013 +0300 sha512: fix building on ARM * cipher/sha512.c (transform) [USE_ARM_NEON_ASM]: Fix 'hd' to 'ctx'. -- Signed-off-by: Jussi Kivilinna diff --git a/cipher/sha512.c b/cipher/sha512.c index af30775..97fb203 100644 --- a/cipher/sha512.c +++ b/cipher/sha512.c @@ -488,7 +488,7 @@ transform (void *context, const unsigned char *data) SHA512_CONTEXT *ctx = context; #ifdef USE_ARM_NEON_ASM - if (hd->use_neon) + if (ctx->use_neon) { _gcry_sha512_transform_armv7_neon(&ctx->state, data, k); ----------------------------------------------------------------------- Summary of changes: cipher/sha512.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From dbaryshkov at gmail.com Wed Oct 2 17:09:14 2013 From: dbaryshkov at gmail.com (Dmitry Eremin-Solenikov) Date: Wed, 2 Oct 2013 19:09:14 +0400 Subject: GOST ECC pubkey In-Reply-To: <8761thb4wk.fsf@vigenere.g10code.de> References: <1379653630.3179.2.camel@cfw2.gniibe.org> <1379655225.3179.3.camel@cfw2.gniibe.org> <8761thb4wk.fsf@vigenere.g10code.de> Message-ID: On Tue, Oct 1, 2013 at 3:35 PM, Werner Koch wrote: > On Fri, 20 Sep 2013 07:33, gniibe at fsij.org said: > >> I think that it is possible to represent GOST3410 without extending >> the structure ecc_domain_parms_t. 
Just redefine "n" as order of >> cyclic subgroup of elliptic curve points group for GOST3410. > > Anyone else with comments? > > I haven't read the specs, if it is different to ECDSA we might want to > add another signature scheme similar to the (flag eddsa). I sent a patch a few days ago. It just adds a (flag gost) to the data and signature s-exps. There is no difference with ECDSA in the keys S-expressions. -- With best wishes Dmitry From cvs at cvs.gnupg.org Wed Oct 2 18:10:40 2013 From: cvs at cvs.gnupg.org (by Werner Koch) Date: Wed, 02 Oct 2013 18:10:40 +0200 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-292-g2f767f6 Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via 2f767f6a17f7e99da4075882f7fe3ca597b31bdb (commit) from 6410152338a2b2ac1216e70c153cd16f9199c94e (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit 2f767f6a17f7e99da4075882f7fe3ca597b31bdb Author: Werner Koch Date: Wed Oct 2 16:56:46 2013 +0200 Provide Pth compatibility for use with GnuPG 2.0. * src/ath.c (ath_install): Call ath_init and declare Pth as compatible. -- GnuPG 2.0 requires GNU Pth which is a plain userland thread implementation. Given that decent versions of GNU Pth seem to work together with pthread, we can declare Pth as compatible. Native pthreads in Libgcrypt are only used internally to Libgcrypt and any internal blocking should be invisible to Pth.
Signed-off-by: Werner Koch diff --git a/src/ath.c b/src/ath.c index 1363d9e..7a7035d 100644 --- a/src/ath.c +++ b/src/ath.c @@ -134,8 +134,14 @@ ath_get_model (int *r_model) gpg_err_code_t ath_install (struct ath_ops *ath_ops) { + gpg_err_code_t rc; unsigned int thread_option; + /* Fist call ath_init so that we know our thread model. */ + rc = ath_init (); + if (rc) + return rc; + /* Check if the requested thread option is compatible to the thread option we are already committed to. */ thread_option = ath_ops? GET_OPTION (ath_ops->option) : 0; @@ -149,8 +155,15 @@ ath_install (struct ath_ops *ath_ops) { if (thread_option == ATH_THREAD_OPTION_PTHREAD) return 0; /* Okay - compatible. */ + if (thread_option == ATH_THREAD_OPTION_PTH) + return 0; /* Okay - compatible. */ } #endif /*USE_POSIX_THREADS_WEAK*/ + else if (thread_option == ATH_THREAD_OPTION_PTH) + { + if (thread_model == ath_model_none) + return 0; /* Okay - compatible. */ + } else if (thread_option == ATH_THREAD_OPTION_DEFAULT) return 0; /* No thread support requested. */ ----------------------------------------------------------------------- Summary of changes: src/ath.c | 13 +++++++++++++ 1 files changed, 13 insertions(+), 0 deletions(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From wk at gnupg.org Wed Oct 2 17:39:20 2013 From: wk at gnupg.org (Werner Koch) Date: Wed, 02 Oct 2013 17:39:20 +0200 Subject: [PATCH] Add support for 128-bit keys in RC2 In-Reply-To: <1452407.XMrRVjiz6F@al> (Peter Wu's message of "Tue, 01 Oct 2013 14:59:21 +0200") References: <1380230432-13431-1-git-send-email-lekensteyn@gmail.com> <87y56d9p83.fsf@vigenere.g10code.de> <1452407.XMrRVjiz6F@al> Message-ID: <87txgz7kd3.fsf@vigenere.g10code.de> On Tue, 1 Oct 2013 14:59, lekensteyn at gmail.com said: > Can you also update the documentation? 
In doc/gcrypt.texi from line 1563 it > says: Will do so. > If keys of any length (ranging from 0 to 1024?) are allowed, it could be > specified in there. Please view that as an undocumented feature. If we document it we would need to write test cases ;-). Shalom-Salam, Werner -- Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz. From wk at gnupg.org Wed Oct 2 17:37:26 2013 From: wk at gnupg.org (Werner Koch) Date: Wed, 02 Oct 2013 17:37:26 +0200 Subject: ../../src/visibility.c:498:3: warning: implicit declaration of function `_gcry_mpi_ec_new' [-Wimplicit-function-declaration] In-Reply-To: <22656519.CJENjD6aAd@al> (Peter Wu's message of "Tue, 01 Oct 2013 15:21:09 +0200") References: <22656519.CJENjD6aAd@al> Message-ID: <87y56b7kg9.fsf@vigenere.g10code.de> On Tue, 1 Oct 2013 15:21, lekensteyn at gmail.com said: > While building the latest git master (libgcrypt-1.5.0-283-g738177e), I got a > warning about an implicit declaration of a function. As far as I can see, it Meanwhile fixed. You do not need to report such warnings on standard platforms (i.e. using gcc on a Unix platform). We see them as well and they are usually fixed along with other minor fixes. Salam-Shalom, Werner -- Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz. From cvs at cvs.gnupg.org Wed Oct 2 18:17:45 2013 From: cvs at cvs.gnupg.org (by Werner Koch) Date: Wed, 02 Oct 2013 18:17:45 +0200 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-293-g9981040 Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via 99810404bee86aa55822740ea5ae670848074856 (commit) from 2f767f6a17f7e99da4075882f7fe3ca597b31bdb (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below.
- Log ----------------------------------------------------------------- commit 99810404bee86aa55822740ea5ae670848074856 Author: Werner Koch Date: Wed Oct 2 17:45:13 2013 +0200 doc: Remove note that RC2/128 is not yet supported. -- diff --git a/doc/gcrypt.texi b/doc/gcrypt.texi index 0590a26..c80a6cc 100644 --- a/doc/gcrypt.texi +++ b/doc/gcrypt.texi @@ -1560,9 +1560,7 @@ The Serpent cipher from the AES contest. @itemx GCRY_CIPHER_RFC2268_128 @cindex rfc-2268 @cindex RC2 -Ron's Cipher 2 in the 40 and 128 bit variants. Note, that we currently -only support the 40 bit variant. The identifier for 128 is reserved for -future use. +Ron's Cipher 2 in the 40 and 128 bit variants. @item GCRY_CIPHER_SEED @cindex Seed (cipher) ----------------------------------------------------------------------- Summary of changes: doc/gcrypt.texi | 4 +--- 1 files changed, 1 insertions(+), 3 deletions(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From nmav at gnutls.org Wed Oct 2 18:12:55 2013 From: nmav at gnutls.org (Nikos Mavrogiannopoulos) Date: Wed, 02 Oct 2013 18:12:55 +0200 Subject: [PATCH] Add support for 128-bit keys in RC2 In-Reply-To: <1452407.XMrRVjiz6F@al> References: <1380230432-13431-1-git-send-email-lekensteyn@gmail.com> <87y56d9p83.fsf@vigenere.g10code.de> <1452407.XMrRVjiz6F@al> Message-ID: <524C4607.5020308@gnutls.org> On 10/01/2013 02:59 PM, Peter Wu wrote: > Can you also update the documentation? In doc/gcrypt.texi from line 1563 it > says: > Ron's Cipher 2 in the 40 and 128 bit variants. Note, that we currently > only support the 40 bit variant. The identifier for 128 is reserved for > future use. > If keys of any length (ranging from 0 to 1024?) are allowed, it could be > specified in there. They are allowed, but it isn't the normal RC2 cipher. 
It is RC2-40, which uses a different key expansion. There is no point in using this variant with longer keys. regards, Nikos From cvs at cvs.gnupg.org Wed Oct 2 20:21:27 2013 From: cvs at cvs.gnupg.org (by Jussi Kivilinna) Date: Wed, 02 Oct 2013 20:21:27 +0200 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-294-gd3fa0bc Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via d3fa0bcf62bdb77b70fc96034d86b9f76ba4d4c1 (commit) from 99810404bee86aa55822740ea5ae670848074856 (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit d3fa0bcf62bdb77b70fc96034d86b9f76ba4d4c1 Author: Jussi Kivilinna Date: Wed Oct 2 20:47:56 2013 +0300 Prevent tail call optimization with _gcry_burn_stack * configure.ac: New check, HAVE_GCC_ASM_VOLATILE_MEMORY. * src/g10lib.h (_gcry_burn_stack): Rename to __gcry_burn_stack. (__gcry_burn_stack_dummy): New. (_gcry_burn_stack): New macro. * src/misc.c (_gcry_burn_stack): Rename to __gcry_burn_stack. (__gcry_burn_stack_dummy): New. -- Tail call optimization can turn a _gcry_burn_stack call into a tail jump. When this happens, the stack pointer is restored to the state at entry to the current function. This causes a problem for _gcry_burn_stack because its callers do not account for the current function's stack depth. One solution is to prevent _gcry_burn_stack from being tail-call optimized by inserting a dummy function call after it. Another would be to add the memory barrier 'asm volatile("":::"memory")' after every _gcry_burn_stack call. This, however, requires GCC asm support in the compiler.
The patch adds detection for memory barrier support and, when available, uses a memory barrier to prevent the tail call optimization. If not available, a dummy function call is used instead. Signed-off-by: Jussi Kivilinna diff --git a/configure.ac b/configure.ac index 2c92028..1460dfd 100644 --- a/configure.ac +++ b/configure.ac @@ -921,7 +921,7 @@ fi # # Check whether the compiler supports 'asm' or '__asm__' keyword for -# assembler blocks +# assembler blocks. # AC_CACHE_CHECK([whether 'asm' assembler keyword is supported], [gcry_cv_have_asm], @@ -945,6 +945,32 @@ fi # +# Check whether the compiler supports inline assembly memory barrier. +# +if test "$gcry_cv_have_asm" = "no" ; then + if test "$gcry_cv_have___asm__" = "yes" ; then + AC_CACHE_CHECK([whether inline assembly memory barrier is supported], + [gcry_cv_have_asm_volatile_memory], + [gcry_cv_have_asm_volatile_memory=no + AC_COMPILE_IFELSE([AC_LANG_SOURCE( + [[void a(void) { __asm__ volatile("":::"memory"); }]])], + [gcry_cv_have_asm_volatile_memory=yes])]) + fi +else + AC_CACHE_CHECK([whether inline assembly memory barrier is supported], + [gcry_cv_have_asm_volatile_memory], + [gcry_cv_have_asm_volatile_memory=no + AC_COMPILE_IFELSE([AC_LANG_SOURCE( + [[void a(void) { asm volatile("":::"memory"); }]])], + [gcry_cv_have_asm_volatile_memory=yes])]) +fi +if test "$gcry_cv_have_asm_volatile_memory" = "yes" ; then + AC_DEFINE(HAVE_GCC_ASM_VOLATILE_MEMORY,1, + [Define if inline asm memory barrier is supported]) +fi + + +# # Check whether GCC inline assembler supports SSSE3 instructions # This is required for the AES-NI instructions. # diff --git a/src/g10lib.h b/src/g10lib.h index 43281ad..85bd93b 100644 --- a/src/g10lib.h +++ b/src/g10lib.h @@ -254,7 +254,16 @@ int strcasecmp (const char *a, const char *b) _GCRY_GCC_ATTR_PURE; /* Stack burning.
*/ -void _gcry_burn_stack (unsigned int bytes); +#ifdef HAVE_GCC_ASM_VOLATILE_MEMORY +#define __gcry_burn_stack_dummy() asm volatile ("":::"memory") +#else +void __gcry_burn_stack_dummy (void); +#endif + +void __gcry_burn_stack (unsigned int bytes); +#define _gcry_burn_stack(bytes) \ + do { __gcry_burn_stack (bytes); \ + __gcry_burn_stack_dummy (); } while(0) /* To avoid that a compiler optimizes certain memset calls away, these diff --git a/src/misc.c b/src/misc.c index 912039a..9b30ac3 100644 --- a/src/misc.c +++ b/src/misc.c @@ -438,7 +438,7 @@ _gcry_log_printsxp (const char *text, gcry_sexp_t sexp) void -_gcry_burn_stack (unsigned int bytes) +__gcry_burn_stack (unsigned int bytes) { #ifdef HAVE_VLA /* (bytes == 0 ? 1 : bytes) == (!bytes + bytes) */ @@ -456,6 +456,13 @@ _gcry_burn_stack (unsigned int bytes) #endif } +#ifndef HAVE_GCC_ASM_VOLATILE_MEMORY +void +__gcry_burn_stack_dummy (void) +{ +} +#endif + void _gcry_divide_by_zero (void) { ----------------------------------------------------------------------- Summary of changes: configure.ac | 28 +++++++++++++++++++++++++++- src/g10lib.h | 11 ++++++++++- src/misc.c | 9 ++++++++- 3 files changed, 45 insertions(+), 3 deletions(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From cvs at cvs.gnupg.org Wed Oct 2 14:25:58 2013 From: cvs at cvs.gnupg.org (by Werner Koch) Date: Wed, 02 Oct 2013 14:25:58 +0200 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-288-g628ed5b Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". 
The branch, master has been updated via 628ed5ba0ef4b1f04b5a77e29e4bc49a1fe13c07 (commit) via 52783d483293d48cd468143ae6ae2cccbfe17200 (commit) via 0d39997932617ba20656f8bcc230ba744b76c87e (commit) from 3ca180b25e8df252fc16f802cfdc27496e307830 (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit 628ed5ba0ef4b1f04b5a77e29e4bc49a1fe13c07 Author: Werner Koch Date: Wed Oct 2 13:53:07 2013 +0200 Remove last remains of the former module system. * src/gcrypt-module.h, src/module.c: Remove. * src/visibility.h: Do not include gcrypt-module.h. * src/g10lib.h: Remove all prototypes from module.c (gcry_module): Remove. * cipher/cipher-internal.h (gcry_cipher_handle): Remove unused field. Signed-off-by: Werner Koch diff --git a/cipher/cipher-internal.h b/cipher/cipher-internal.h index cabcd1f..b60ef38 100644 --- a/cipher/cipher-internal.h +++ b/cipher/cipher-internal.h @@ -61,7 +61,6 @@ struct gcry_cipher_handle size_t actual_handle_size; /* Allocated size of this handle. */ size_t handle_offset; /* Offset to the malloced block. */ gcry_cipher_spec_t *spec; - gcry_module_t module; /* The algorithm id. This is a hack required because the module interface does not easily allow to retrieve this value. 
*/ diff --git a/src/Makefile.am b/src/Makefile.am index 3bc27c8..c020239 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -56,10 +56,10 @@ endif libgcrypt_la_CFLAGS = $(GPG_ERROR_CFLAGS) libgcrypt_la_SOURCES = \ gcrypt-int.h g10lib.h visibility.c visibility.h types.h \ - cipher.h cipher-proto.h gcrypt-module.h \ + cipher.h cipher-proto.h \ misc.c global.c sexp.c hwfeatures.c hwf-common.h \ stdmem.c stdmem.h secmem.c secmem.h \ - mpi.h missing-string.c module.c fips.c \ + mpi.h missing-string.c fips.c \ hmac256.c hmac256.h context.c context.h \ ec-context.h \ ath.h ath.c diff --git a/src/g10lib.h b/src/g10lib.h index ff7f4b3..43281ad 100644 --- a/src/g10lib.h +++ b/src/g10lib.h @@ -351,52 +351,7 @@ void _gcry_burn_stack (unsigned int bytes); || (*(a) >= 'A' && *(a) <= 'F') \ || (*(a) >= 'a' && *(a) <= 'f')) -/* Management for ciphers/digests/pubkey-ciphers. */ - -/* Structure for each registered `module'. */ -struct gcry_module -{ - struct gcry_module *next; /* List pointers. */ - struct gcry_module **prevp; - void *spec; /* Pointer to the subsystem-specific - specification structure. */ - void *extraspec; /* Pointer to the subsystem-specific - extra specification structure. */ - int flags; /* Associated flags. */ - int counter; /* Use counter. */ - unsigned int mod_id; /* ID of this module. */ -}; - -/* Flags for the `flags' member of gcry_module_t. */ -#define FLAG_MODULE_DISABLED (1 << 0) - -gcry_err_code_t _gcry_module_add (gcry_module_t *entries, - unsigned int id, - void *spec, - void *extraspec, - gcry_module_t *module); - -typedef int (*gcry_module_lookup_t) (void *spec, void *data); - -/* Lookup a module specification by it's ID. After a successful - lookup, the module has it's resource counter incremented. */ -gcry_module_t _gcry_module_lookup_id (gcry_module_t entries, - unsigned int id); - -/* Internal function. Lookup a module specification. 
*/ -gcry_module_t _gcry_module_lookup (gcry_module_t entries, void *data, - gcry_module_lookup_t func); - -/* Release a module. In case the use-counter reaches zero, destroy - the module. */ -void _gcry_module_release (gcry_module_t entry); - -/* Add a reference to a module. */ -void _gcry_module_use (gcry_module_t module); - -/* Return a list of module IDs. */ -gcry_err_code_t _gcry_module_list (gcry_module_t modules, - int *list, int *list_length); +/* Init functions. */ gcry_err_code_t _gcry_cipher_init (void); gcry_err_code_t _gcry_md_init (void); diff --git a/src/gcrypt-module.h b/src/gcrypt-module.h deleted file mode 100644 index 35a928c..0000000 --- a/src/gcrypt-module.h +++ /dev/null @@ -1,57 +0,0 @@ -/* gcrypt-module.h - GNU Cryptographic Library Interface - Copyright (C) 2003, 2007 Free Software Foundation, Inc. - - This file is part of Libgcrypt. - - Libgcrypt is free software; you can redistribute it and/or modify - it under the terms of the GNU Lesser General Public License as - published by the Free Software Foundation; either version 2.1 of - the License, or (at your option) any later version. - - Libgcrypt is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with this program; if not, see . - */ - -/* - This file contains the necessary declarations/definitions for - working with Libgcrypt modules. Since 1.6 this is an internal - interface and will eventually be merged into another header or - entirely removed. - */ - -#ifndef GCRYPT_MODULE_H -#define GCRYPT_MODULE_H - -#ifdef __cplusplus -extern "C" { -#if 0 /* keep Emacsens's auto-indent happy */ -} -#endif -#endif - -/* The interfaces using the module system reserve a certain range of - IDs for application use. 
These IDs are not valid within Libgcrypt - but Libgcrypt makes sure never to allocate such a module ID. */ -#define GCRY_MODULE_ID_USER 1024 -#define GCRY_MODULE_ID_USER_LAST 4095 - - -/* This type represents a `module'. */ -typedef struct gcry_module *gcry_module_t; - - -/* ********************** */ - - -#if 0 /* keep Emacsens's auto-indent happy */ -{ -#endif -#ifdef __cplusplus -} -#endif -#endif /*GCRYPT_MODULE_H*/ diff --git a/src/module.c b/src/module.c deleted file mode 100644 index 32f668d..0000000 --- a/src/module.c +++ /dev/null @@ -1,212 +0,0 @@ -/* module.c - Module management for libgcrypt. - * Copyright (C) 2003, 2008 Free Software Foundation, Inc. - * - * This file is part of Libgcrypt. - * - * Libgcrypt is free software; you can redistribute it and/or modify - * it under the terms of the GNU Lesser general Public License as - * published by the Free Software Foundation; either version 2.1 of - * the License, or (at your option) any later version. - * - * Libgcrypt is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public - * License along with this program; if not, see . - */ - -#include -#include -#include "g10lib.h" - -/* Please match these numbers with the allocated algorithm - numbers. */ -#define MODULE_ID_MIN 600 -#define MODULE_ID_LAST 65500 -#define MODULE_ID_USER GCRY_MODULE_ID_USER -#define MODULE_ID_USER_LAST GCRY_MODULE_ID_USER_LAST - -#if MODULE_ID_MIN >= MODULE_ID_USER -#error Need to implement a different search strategy -#endif - -/* Internal function. Generate a new, unique module ID for a module - that should be inserted into the module chain starting at - MODULES. 
*/ -static gcry_err_code_t -_gcry_module_id_new (gcry_module_t modules, unsigned int *id_new) -{ - unsigned int mod_id; - gcry_err_code_t err = GPG_ERR_NO_ERROR; - gcry_module_t module; - - /* Search for unused ID. */ - for (mod_id = MODULE_ID_MIN; mod_id < MODULE_ID_LAST; mod_id++) - { - if (mod_id == MODULE_ID_USER) - { - mod_id = MODULE_ID_USER_LAST; - continue; - } - - /* Search for a module with the current ID. */ - for (module = modules; module; module = module->next) - if (mod_id == module->mod_id) - break; - - if (! module) - /* None found -> the ID is available for use. */ - break; - } - - if (mod_id < MODULE_ID_LAST) - /* Done. */ - *id_new = mod_id; - else - /* No free ID found. */ - err = GPG_ERR_INTERNAL; - - return err; -} - -/* Add a module specification to the list ENTRIES. The new module has - it's use-counter set to one. */ -gcry_err_code_t -_gcry_module_add (gcry_module_t *entries, unsigned int mod_id, - void *spec, void *extraspec, gcry_module_t *module) -{ - gcry_err_code_t err = 0; - gcry_module_t entry; - - if (! mod_id) - err = _gcry_module_id_new (*entries, &mod_id); - - if (! err) - { - entry = gcry_malloc (sizeof (struct gcry_module)); - if (! entry) - err = gpg_err_code_from_errno (errno); - } - - if (! err) - { - /* Fill new module entry. */ - entry->flags = 0; - entry->counter = 1; - entry->spec = spec; - entry->extraspec = extraspec; - entry->mod_id = mod_id; - - /* Link it into the list. */ - entry->next = *entries; - entry->prevp = entries; - if (*entries) - (*entries)->prevp = &entry->next; - *entries = entry; - - /* And give it to the caller. */ - if (module) - *module = entry; - } - return err; -} - -/* Internal function. Unlink CIPHER_ENTRY from the list of registered - ciphers and destroy it. */ -static void -_gcry_module_drop (gcry_module_t entry) -{ - *entry->prevp = entry->next; - if (entry->next) - entry->next->prevp = entry->prevp; - - gcry_free (entry); -} - -/* Lookup a module specification by it's ID. 
After a successful - lookup, the module has it's resource counter incremented. */ -gcry_module_t -_gcry_module_lookup_id (gcry_module_t entries, unsigned int mod_id) -{ - gcry_module_t entry; - - for (entry = entries; entry; entry = entry->next) - if (entry->mod_id == mod_id) - { - entry->counter++; - break; - } - - return entry; -} - -/* Lookup a module specification. After a successful lookup, the - module has it's resource counter incremented. FUNC is a function - provided by the caller, which is responsible for identifying the - wanted module. */ -gcry_module_t -_gcry_module_lookup (gcry_module_t entries, void *data, - gcry_module_lookup_t func) -{ - gcry_module_t entry; - - for (entry = entries; entry; entry = entry->next) - if ((*func) (entry->spec, data)) - { - entry->counter++; - break; - } - - return entry; -} - -/* Release a module. In case the use-counter reaches zero, destroy - the module. Passing MODULE as NULL is a dummy operation (similar - to free()). */ -void -_gcry_module_release (gcry_module_t module) -{ - if (module && ! --module->counter) - _gcry_module_drop (module); -} - -/* Add a reference to a module. */ -void -_gcry_module_use (gcry_module_t module) -{ - ++module->counter; -} - -/* If LIST is zero, write the number of modules identified by MODULES - to LIST_LENGTH and return. If LIST is non-zero, the first - *LIST_LENGTH algorithm IDs are stored in LIST, which must be of - according size. In case there are less cipher modules than - *LIST_LENGTH, *LIST_LENGTH is updated to the correct number. 
*/ -gcry_err_code_t -_gcry_module_list (gcry_module_t modules, - int *list, int *list_length) -{ - gcry_err_code_t err = GPG_ERR_NO_ERROR; - gcry_module_t module; - int length, i; - - for (module = modules, length = 0; module; module = module->next, length++); - - if (list) - { - if (length > *list_length) - length = *list_length; - - for (module = modules, i = 0; i < length; module = module->next, i++) - list[i] = module->mod_id; - - if (length < *list_length) - *list_length = length; - } - else - *list_length = length; - - return err; -} diff --git a/src/visibility.h b/src/visibility.h index 7e7793b..cd2a60f 100644 --- a/src/visibility.h +++ b/src/visibility.h @@ -208,7 +208,6 @@ #else # include "gcrypt-int.h" #endif -#include "gcrypt-module.h" /* Prototypes of functions exported but not ready for use. */ gcry_err_code_t gcry_md_get (gcry_md_hd_t hd, int algo, commit 52783d483293d48cd468143ae6ae2cccbfe17200 Author: Werner Koch Date: Wed Oct 2 13:44:46 2013 +0200 Fix missing prototype warning in visibility.c * src/ec-context.h (_gcry_mpi_ec_new): Move prototype to mpi.h. 
diff --git a/src/ec-context.h b/src/ec-context.h index ba6bdfc..a118608 100644 --- a/src/ec-context.h +++ b/src/ec-context.h @@ -71,9 +71,6 @@ void _gcry_mpi_ec_get_reset (mpi_ec_t ec); /*-- cipher/ecc-curves.c --*/ -gpg_err_code_t _gcry_mpi_ec_new (gcry_ctx_t *r_ctx, - gcry_sexp_t keyparam, const char *curvename); - gcry_mpi_t _gcry_ecc_get_mpi (const char *name, mpi_ec_t ec, int copy); gcry_mpi_point_t _gcry_ecc_get_point (const char *name, mpi_ec_t ec); gpg_err_code_t _gcry_ecc_set_mpi (const char *name, diff --git a/src/mpi.h b/src/mpi.h index 780d5eb..15fb542 100644 --- a/src/mpi.h +++ b/src/mpi.h @@ -342,6 +342,10 @@ gpg_err_code_t _gcry_mpi_ec_set_point (const char *name, gcry_mpi_point_t newvalue, gcry_ctx_t ctx); +/*-- ecc-curves.c --*/ +gpg_err_code_t _gcry_mpi_ec_new (gcry_ctx_t *r_ctx, + gcry_sexp_t keyparam, const char *curvename); + #endif /*G10_MPI_H*/ commit 0d39997932617ba20656f8bcc230ba744b76c87e Author: Werner Koch Date: Wed Oct 2 13:39:47 2013 +0200 md: Simplify the message digest dispatcher md.c. * src/gcrypt-module.h (gcry_md_spec_t): Move to ... * src/cipher-proto.h: here. Merge with md_extra_spec_t. Add fields ALGO and FLAGS. Set these fields in all digest modules. * cipher/md.c: Change most code to replace the former module system by a simpler system to gain information about the algorithms. Signed-off-by: Werner Koch diff --git a/cipher/crc.c b/cipher/crc.c index 9e406f1..4f72ffb 100644 --- a/cipher/crc.c +++ b/cipher/crc.c @@ -272,8 +272,12 @@ crc24rfc2440_final (void *context) ctx->buf[2] = (ctx->CRC ) & 0xFF; } +/* We allow the CRC algorithms even in FIPS mode because they are + actually no cryptographic primitives. 
*/ + gcry_md_spec_t _gcry_digest_spec_crc32 = { + GCRY_MD_CRC32, {0, 1}, "CRC32", NULL, 0, NULL, 4, crc32_init, crc32_write, crc32_final, crc32_read, sizeof (CRC_CONTEXT) @@ -281,6 +285,7 @@ gcry_md_spec_t _gcry_digest_spec_crc32 = gcry_md_spec_t _gcry_digest_spec_crc32_rfc1510 = { + GCRY_MD_CRC32_RFC1510, {0, 1}, "CRC32RFC1510", NULL, 0, NULL, 4, crc32rfc1510_init, crc32_write, crc32rfc1510_final, crc32_read, @@ -289,6 +294,7 @@ gcry_md_spec_t _gcry_digest_spec_crc32_rfc1510 = gcry_md_spec_t _gcry_digest_spec_crc24_rfc2440 = { + GCRY_MD_CRC24_RFC2440, {0, 1}, "CRC24RFC2440", NULL, 0, NULL, 3, crc24rfc2440_init, crc24rfc2440_write, crc24rfc2440_final, crc32_read, diff --git a/cipher/gostr3411-94.c b/cipher/gostr3411-94.c index 368fc01..1267216 100644 --- a/cipher/gostr3411-94.c +++ b/cipher/gostr3411-94.c @@ -277,6 +277,7 @@ gost3411_read (void *context) } gcry_md_spec_t _gcry_digest_spec_gost3411_94 = { + GCRY_MD_GOSTR3411_94, {0, 0}, "GOSTR3411_94", NULL, 0, NULL, 32, gost3411_init, _gcry_md_block_write, gost3411_final, gost3411_read, sizeof (GOSTR3411_CONTEXT) diff --git a/cipher/md.c b/cipher/md.c index c65eb70..e3cc6c6 100644 --- a/cipher/md.c +++ b/cipher/md.c @@ -31,105 +31,64 @@ #include "rmd.h" -/* A dummy extraspec so that we do not need to tests the extraspec - field from the module specification against NULL and instead - directly test the respective fields of extraspecs. */ -static md_extra_spec_t dummy_extra_spec; - /* This is the list of the digest implementations included in libgcrypt. */ -static struct digest_table_entry -{ - gcry_md_spec_t *digest; - md_extra_spec_t *extraspec; - unsigned int algorithm; - int fips_allowed; -} digest_table[] = +static gcry_md_spec_t *digest_list[] = { #if USE_CRC - /* We allow the CRC algorithms even in FIPS mode because they are - actually no cryptographic primitives. 
*/ - { &_gcry_digest_spec_crc32, - &dummy_extra_spec, GCRY_MD_CRC32, 1 }, - { &_gcry_digest_spec_crc32_rfc1510, - &dummy_extra_spec, GCRY_MD_CRC32_RFC1510, 1 }, - { &_gcry_digest_spec_crc24_rfc2440, - &dummy_extra_spec, GCRY_MD_CRC24_RFC2440, 1 }, + &_gcry_digest_spec_crc32, + &_gcry_digest_spec_crc32_rfc1510, + &_gcry_digest_spec_crc24_rfc2440, #endif -#ifdef USE_GOST_R_3411_94 - { &_gcry_digest_spec_gost3411_94, - &dummy_extra_spec, GCRY_MD_GOSTR3411_94 }, -#endif -#ifdef USE_GOST_R_3411_12 - { &_gcry_digest_spec_stribog_256, - &dummy_extra_spec, GCRY_MD_STRIBOG256 }, - { &_gcry_digest_spec_stribog_512, - &dummy_extra_spec, GCRY_MD_STRIBOG512 }, +#if USE_SHA1 + &_gcry_digest_spec_sha1, #endif -#if USE_MD4 - { &_gcry_digest_spec_md4, - &dummy_extra_spec, GCRY_MD_MD4 }, +#if USE_SHA256 + &_gcry_digest_spec_sha256, + &_gcry_digest_spec_sha224, #endif -#if USE_MD5 - { &_gcry_digest_spec_md5, - &dummy_extra_spec, GCRY_MD_MD5, 1 }, +#if USE_SHA512 + &_gcry_digest_spec_sha512, + &_gcry_digest_spec_sha384, #endif -#if USE_RMD160 - { &_gcry_digest_spec_rmd160, - &dummy_extra_spec, GCRY_MD_RMD160 }, +#ifdef USE_GOST_R_3411_94 + &_gcry_digest_spec_gost3411_94, #endif -#if USE_SHA1 - { &_gcry_digest_spec_sha1, - &_gcry_digest_extraspec_sha1, GCRY_MD_SHA1, 1 }, +#ifdef USE_GOST_R_3411_12 + &_gcry_digest_spec_stribog_256, + &_gcry_digest_spec_stribog_512, #endif -#if USE_SHA256 - { &_gcry_digest_spec_sha256, - &_gcry_digest_extraspec_sha256, GCRY_MD_SHA256, 1 }, - { &_gcry_digest_spec_sha224, - &_gcry_digest_extraspec_sha224, GCRY_MD_SHA224, 1 }, +#if USE_WHIRLPOOL + &_gcry_digest_spec_whirlpool, #endif -#if USE_SHA512 - { &_gcry_digest_spec_sha512, - &_gcry_digest_extraspec_sha512, GCRY_MD_SHA512, 1 }, - { &_gcry_digest_spec_sha384, - &_gcry_digest_extraspec_sha384, GCRY_MD_SHA384, 1 }, +#if USE_RMD160 + &_gcry_digest_spec_rmd160, #endif #if USE_TIGER - { &_gcry_digest_spec_tiger, - &dummy_extra_spec, GCRY_MD_TIGER }, - { &_gcry_digest_spec_tiger1, - &dummy_extra_spec, 
GCRY_MD_TIGER1 }, - { &_gcry_digest_spec_tiger2, - &dummy_extra_spec, GCRY_MD_TIGER2 }, + &_gcry_digest_spec_tiger, + &_gcry_digest_spec_tiger1, + &_gcry_digest_spec_tiger2, #endif -#if USE_WHIRLPOOL - { &_gcry_digest_spec_whirlpool, - &dummy_extra_spec, GCRY_MD_WHIRLPOOL }, +#if USE_MD5 + &_gcry_digest_spec_md5, +#endif +#if USE_MD4 + &_gcry_digest_spec_md4, #endif - { NULL }, + NULL }; -/* List of registered digests. */ -static gcry_module_t digests_registered; - -/* This is the lock protecting DIGESTS_REGISTERED. */ -static ath_mutex_t digests_registered_lock; - -/* Flag to check whether the default ciphers have already been - registered. */ -static int default_digests_registered; typedef struct gcry_md_list { - gcry_md_spec_t *digest; - gcry_module_t module; + gcry_md_spec_t *spec; struct gcry_md_list *next; size_t actual_struct_size; /* Allocated size of this structure. */ PROPERLY_ALIGNED_TYPE context; } GcryDigestEntry; -/* this structure is put right after the gcry_md_hd_t buffer, so that +/* This structure is put right after the gcry_md_hd_t buffer, so that * only one memory block is needed. */ struct gcry_md_context { @@ -147,248 +106,135 @@ struct gcry_md_context #define CTX_MAGIC_NORMAL 0x11071961 #define CTX_MAGIC_SECURE 0x16917011 -/* Convenient macro for registering the default digests. */ -#define REGISTER_DEFAULT_DIGESTS \ - do \ - { \ - ath_mutex_lock (&digests_registered_lock); \ - if (! 
default_digests_registered) \ - { \ - md_register_default (); \ - default_digests_registered = 1; \ - } \ - ath_mutex_unlock (&digests_registered_lock); \ - } \ - while (0) - - -static const char * digest_algo_to_string( int algo ); -static gcry_err_code_t check_digest_algo (int algo); -static gcry_err_code_t md_open (gcry_md_hd_t *h, int algo, - int secure, int hmac); static gcry_err_code_t md_enable (gcry_md_hd_t hd, int algo); -static gcry_err_code_t md_copy (gcry_md_hd_t a, gcry_md_hd_t *b); static void md_close (gcry_md_hd_t a); static void md_write (gcry_md_hd_t a, const void *inbuf, size_t inlen); -static void md_final(gcry_md_hd_t a); static byte *md_read( gcry_md_hd_t a, int algo ); static int md_get_algo( gcry_md_hd_t a ); static int md_digest_length( int algo ); -static const byte *md_asn_oid( int algo, size_t *asnlen, size_t *mdlen ); static void md_start_debug ( gcry_md_hd_t a, const char *suffix ); static void md_stop_debug ( gcry_md_hd_t a ); + +static int +map_algo (int algo) +{ + return algo; +} -/* Internal function. Register all the ciphers included in - CIPHER_TABLE. Returns zero on success or an error code. */ -static void -md_register_default (void) +/* Return the spec structure for the hash algorithm ALGO. For an + unknown algorithm NULL is returned. */ +static gcry_md_spec_t * +spec_from_algo (int algo) { - gcry_err_code_t err = 0; - int i; + int idx; + gcry_md_spec_t *spec; - for (i = 0; !err && digest_table[i].digest; i++) - { - if ( fips_mode ()) - { - if (!digest_table[i].fips_allowed) - continue; - if (digest_table[i].algorithm == GCRY_MD_MD5 - && _gcry_enforced_fips_mode () ) - continue; /* Do not register in enforced fips mode. 
*/ - } + algo = map_algo (algo); - err = _gcry_module_add (&digests_registered, - digest_table[i].algorithm, - (void *) digest_table[i].digest, - (void *) digest_table[i].extraspec, - NULL); - } - - if (err) - BUG (); + for (idx = 0; (spec = digest_list[idx]); idx++) + if (algo == spec->algo) + return spec; + return NULL; } -/* Internal callback function. */ -static int -gcry_md_lookup_func_name (void *spec, void *data) -{ - gcry_md_spec_t *digest = (gcry_md_spec_t *) spec; - char *name = (char *) data; - return (! stricmp (digest->name, name)); -} - -/* Internal callback function. Used via _gcry_module_lookup. */ -static int -gcry_md_lookup_func_oid (void *spec, void *data) +/* Lookup a hash's spec by its name. */ +static gcry_md_spec_t * +spec_from_name (const char *name) { - gcry_md_spec_t *digest = (gcry_md_spec_t *) spec; - char *oid = (char *) data; - gcry_md_oid_spec_t *oid_specs = digest->oids; - int ret = 0, i; + gcry_md_spec_t *spec; + int idx; - if (oid_specs) + for (idx=0; (spec = digest_list[idx]); idx++) { - for (i = 0; oid_specs[i].oidstring && (! ret); i++) - if (! stricmp (oid, oid_specs[i].oidstring)) - ret = 1; + if (!stricmp (name, spec->name)) + return spec; } - return ret; -} - -/* Internal function. Lookup a digest entry by it's name. */ -static gcry_module_t -gcry_md_lookup_name (const char *name) -{ - gcry_module_t digest; - - digest = _gcry_module_lookup (digests_registered, (void *) name, - gcry_md_lookup_func_name); - - return digest; + return NULL; } -/* Internal function. Lookup a cipher entry by it's oid. */ -static gcry_module_t -gcry_md_lookup_oid (const char *oid) -{ - gcry_module_t digest; - - digest = _gcry_module_lookup (digests_registered, (void *) oid, - gcry_md_lookup_func_oid); - - return digest; -} -/* Register a new digest module whose specification can be found in - DIGEST. On success, a new algorithm ID is stored in ALGORITHM_ID - and a pointer representhing this module is stored in MODULE. 
*/ -gcry_error_t -_gcry_md_register (gcry_md_spec_t *digest, - md_extra_spec_t *extraspec, - unsigned int *algorithm_id, - gcry_module_t *module) +/* Lookup a hash's spec by its OID. */ +static gcry_md_spec_t * +spec_from_oid (const char *oid) { - gcry_err_code_t err = 0; - gcry_module_t mod; - - /* We do not support module loading in fips mode. */ - if (fips_mode ()) - return gpg_error (GPG_ERR_NOT_SUPPORTED); + gcry_md_spec_t *spec; + gcry_md_oid_spec_t *oid_specs; + int idx, j; - ath_mutex_lock (&digests_registered_lock); - err = _gcry_module_add (&digests_registered, 0, - (void *) digest, - (void *)(extraspec? extraspec : &dummy_extra_spec), - &mod); - ath_mutex_unlock (&digests_registered_lock); - - if (! err) + for (idx=0; (spec = digest_list[idx]); idx++) { - *module = mod; - *algorithm_id = mod->mod_id; + oid_specs = spec->oids; + if (oid_specs) + { + for (j = 0; oid_specs[j].oidstring; j++) + if (!stricmp (oid, oid_specs[j].oidstring)) + return spec; + } } - return gcry_error (err); + return NULL; } -static int -search_oid (const char *oid, int *algorithm, gcry_md_oid_spec_t *oid_spec) +static gcry_md_spec_t * +search_oid (const char *oid, gcry_md_oid_spec_t *oid_spec) { - gcry_module_t module; - int ret = 0; + gcry_md_spec_t *spec; + int i; if (oid && ((! strncmp (oid, "oid.", 4)) || (! strncmp (oid, "OID.", 4)))) oid += 4; - module = gcry_md_lookup_oid (oid); - if (module) + spec = spec_from_oid (oid); + if (spec && spec->oids) { - gcry_md_spec_t *digest = module->spec; - int i; - - for (i = 0; digest->oids[i].oidstring && !ret; i++) - if (! 
stricmp (oid, digest->oids[i].oidstring)) + for (i = 0; spec->oids[i].oidstring; i++) + if (stricmp (oid, spec->oids[i].oidstring)) { - if (algorithm) - *algorithm = module->mod_id; if (oid_spec) - *oid_spec = digest->oids[i]; - ret = 1; + *oid_spec = spec->oids[i]; + return spec; } - _gcry_module_release (module); } - return ret; + return NULL; } + /**************** * Map a string to the digest algo */ int gcry_md_map_name (const char *string) { - gcry_module_t digest; - int ret, algorithm = 0; + gcry_md_spec_t *spec; - if (! string) + if (!string) return 0; - REGISTER_DEFAULT_DIGESTS; - /* If the string starts with a digit (optionally prefixed with either "OID." or "oid."), we first look into our table of ASN.1 object identifiers to figure out the algorithm */ + spec = search_oid (string, NULL); + if (spec) + return spec->algo; - ath_mutex_lock (&digests_registered_lock); + /* Not found, search a matching digest name. */ + spec = spec_from_name (string); + if (spec) + return spec->algo; - ret = search_oid (string, &algorithm, NULL); - if (! ret) - { - /* Not found, search a matching digest name. */ - digest = gcry_md_lookup_name (string); - if (digest) - { - algorithm = digest->mod_id; - _gcry_module_release (digest); - } - } - ath_mutex_unlock (&digests_registered_lock); - - return algorithm; + return 0; } /**************** - * Map a digest algo to a string - */ -static const char * -digest_algo_to_string (int algorithm) -{ - const char *name = NULL; - gcry_module_t digest; - - REGISTER_DEFAULT_DIGESTS; - - ath_mutex_lock (&digests_registered_lock); - digest = _gcry_module_lookup_id (digests_registered, algorithm); - if (digest) - { - name = ((gcry_md_spec_t *) digest->spec)->name; - _gcry_module_release (digest); - } - ath_mutex_unlock (&digests_registered_lock); - - return name; -} - -/**************** * This function simply returns the name of the algorithm or some constant * string when there is no algo. It will never return NULL. 
* Use the macro gcry_md_test_algo() to check whether the algorithm @@ -397,32 +243,27 @@ digest_algo_to_string (int algorithm) const char * gcry_md_algo_name (int algorithm) { - const char *s = digest_algo_to_string (algorithm); - return s ? s : "?"; + gcry_md_spec_t *spec; + + spec = spec_from_algo (algorithm); + return spec ? spec->name : "?"; } static gcry_err_code_t check_digest_algo (int algorithm) { - gcry_err_code_t rc = 0; - gcry_module_t digest; + gcry_md_spec_t *spec; - REGISTER_DEFAULT_DIGESTS; + spec = spec_from_algo (algorithm); + if (spec && !spec->flags.disabled) + return 0; - ath_mutex_lock (&digests_registered_lock); - digest = _gcry_module_lookup_id (digests_registered, algorithm); - if (digest) - _gcry_module_release (digest); - else - rc = GPG_ERR_DIGEST_ALGO; - ath_mutex_unlock (&digests_registered_lock); + return GPG_ERR_DIGEST_ALGO; - return rc; } - /**************** * Open a message digest handle for use with algorithm ALGO. * More algorithms may be added by md_enable(). The initial algorithm @@ -525,7 +366,7 @@ md_open (gcry_md_hd_t *h, int algo, int secure, int hmac) gcry_error_t gcry_md_open (gcry_md_hd_t *h, int algo, unsigned int flags) { - gcry_err_code_t err = GPG_ERR_NO_ERROR; + gcry_err_code_t err; gcry_md_hd_t hd; if ((flags & ~(GCRY_MD_FLAG_SECURE | GCRY_MD_FLAG_HMAC))) @@ -546,27 +387,20 @@ static gcry_err_code_t md_enable (gcry_md_hd_t hd, int algorithm) { struct gcry_md_context *h = hd->ctx; - gcry_md_spec_t *digest = NULL; + gcry_md_spec_t *spec; GcryDigestEntry *entry; - gcry_module_t module; gcry_err_code_t err = 0; for (entry = h->list; entry; entry = entry->next) - if (entry->module->mod_id == algorithm) - return err; /* already enabled */ + if (entry->spec->algo == algorithm) + return 0; /* Already enabled */ - REGISTER_DEFAULT_DIGESTS; - - ath_mutex_lock (&digests_registered_lock); - module = _gcry_module_lookup_id (digests_registered, algorithm); - ath_mutex_unlock (&digests_registered_lock); - if (! 
module) + spec = spec_from_algo (algorithm); + if (!spec) { log_debug ("md_enable: algorithm %d not available\n", algorithm); err = GPG_ERR_DIGEST_ALGO; } - else - digest = (gcry_md_spec_t *) module->spec; if (!err && algorithm == GCRY_MD_MD5 && fips_mode ()) @@ -583,7 +417,7 @@ md_enable (gcry_md_hd_t hd, int algorithm) if (!err) { size_t size = (sizeof (*entry) - + digest->contextsize + + spec->contextsize - sizeof (entry->context)); /* And allocate a new list entry. */ @@ -596,24 +430,13 @@ md_enable (gcry_md_hd_t hd, int algorithm) err = gpg_err_code_from_errno (errno); else { - entry->digest = digest; - entry->module = module; + entry->spec = spec; entry->next = h->list; entry->actual_struct_size = size; h->list = entry; /* And init this instance. */ - entry->digest->init (&entry->context.c); - } - } - - if (err) - { - if (module) - { - ath_mutex_lock (&digests_registered_lock); - _gcry_module_release (module); - ath_mutex_unlock (&digests_registered_lock); + entry->spec->init (&entry->context.c); } } @@ -627,10 +450,11 @@ gcry_md_enable (gcry_md_hd_t hd, int algorithm) return gcry_error (md_enable (hd, algorithm)); } + static gcry_err_code_t md_copy (gcry_md_hd_t ahd, gcry_md_hd_t *b_hd) { - gcry_err_code_t err = GPG_ERR_NO_ERROR; + gcry_err_code_t err = 0; struct gcry_md_context *a = ahd->ctx; struct gcry_md_context *b; GcryDigestEntry *ar, *br; @@ -681,11 +505,11 @@ md_copy (gcry_md_hd_t ahd, gcry_md_hd_t *b_hd) { if (a->secure) br = gcry_malloc_secure (sizeof *br - + ar->digest->contextsize + + ar->spec->contextsize - sizeof(ar->context)); else br = gcry_malloc (sizeof *br - + ar->digest->contextsize + + ar->spec->contextsize - sizeof (ar->context)); if (!br) { @@ -694,15 +518,10 @@ md_copy (gcry_md_hd_t ahd, gcry_md_hd_t *b_hd) break; } - memcpy (br, ar, (sizeof (*br) + ar->digest->contextsize + memcpy (br, ar, (sizeof (*br) + ar->spec->contextsize - sizeof (ar->context))); br->next = b->list; b->list = br; - - /* Add a reference to the module. 
*/ - ath_mutex_lock (&digests_registered_lock); - _gcry_module_use (br->module); - ath_mutex_unlock (&digests_registered_lock); } } @@ -715,6 +534,7 @@ md_copy (gcry_md_hd_t ahd, gcry_md_hd_t *b_hd) return err; } + gcry_error_t gcry_md_copy (gcry_md_hd_t *handle, gcry_md_hd_t hd) { @@ -726,6 +546,7 @@ gcry_md_copy (gcry_md_hd_t *handle, gcry_md_hd_t hd) return gcry_error (err); } + /* * Reset all contexts and discard any buffered stuff. This may be used * instead of a md_close(); md_open(). @@ -741,13 +562,14 @@ gcry_md_reset (gcry_md_hd_t a) for (r = a->ctx->list; r; r = r->next) { - memset (r->context.c, 0, r->digest->contextsize); - (*r->digest->init) (&r->context.c); + memset (r->context.c, 0, r->spec->contextsize); + (*r->spec->init) (&r->context.c); } if (a->ctx->macpads) md_write (a, a->ctx->macpads, a->ctx->macpads_Bsize); /* inner pad */ } + static void md_close (gcry_md_hd_t a) { @@ -760,9 +582,6 @@ md_close (gcry_md_hd_t a) for (r = a->ctx->list; r; r = r2) { r2 = r->next; - ath_mutex_lock (&digests_registered_lock); - _gcry_module_release (r->module); - ath_mutex_unlock (&digests_registered_lock); wipememory (r, r->actual_struct_size); gcry_free (r); } @@ -777,6 +596,7 @@ md_close (gcry_md_hd_t a) gcry_free(a); } + void gcry_md_close (gcry_md_hd_t hd) { @@ -784,6 +604,7 @@ gcry_md_close (gcry_md_hd_t hd) md_close (hd); } + static void md_write (gcry_md_hd_t a, const void *inbuf, size_t inlen) { @@ -800,18 +621,20 @@ md_write (gcry_md_hd_t a, const void *inbuf, size_t inlen) for (r = a->ctx->list; r; r = r->next) { if (a->bufpos) - (*r->digest->write) (&r->context.c, a->buf, a->bufpos); - (*r->digest->write) (&r->context.c, inbuf, inlen); + (*r->spec->write) (&r->context.c, a->buf, a->bufpos); + (*r->spec->write) (&r->context.c, inbuf, inlen); } a->bufpos = 0; } + void gcry_md_write (gcry_md_hd_t hd, const void *inbuf, size_t inlen) { md_write (hd, inbuf, inlen); } + static void md_final (gcry_md_hd_t a) { @@ -824,7 +647,7 @@ md_final (gcry_md_hd_t a) 
md_write (a, NULL, 0); for (r = a->ctx->list; r; r = r->next) - (*r->digest->final) (&r->context.c); + (*r->spec->final) (&r->context.c); a->ctx->finalized = 1; @@ -850,6 +673,7 @@ md_final (gcry_md_hd_t a) } } + static gcry_err_code_t prepare_macpads (gcry_md_hd_t hd, const unsigned char *key, size_t keylen) { @@ -884,9 +708,10 @@ prepare_macpads (gcry_md_hd_t hd, const unsigned char *key, size_t keylen) } gcry_free (helpkey); - return GPG_ERR_NO_ERROR; + return 0; } + gcry_error_t gcry_md_ctl (gcry_md_hd_t hd, int cmd, void *buffer, size_t buflen) { @@ -912,10 +737,11 @@ gcry_md_ctl (gcry_md_hd_t hd, int cmd, void *buffer, size_t buflen) return gcry_error (rc); } + gcry_error_t gcry_md_setkey (gcry_md_hd_t hd, const void *key, size_t keylen) { - gcry_err_code_t rc = GPG_ERR_NO_ERROR; + gcry_err_code_t rc; if (!hd->ctx->macpads) rc = GPG_ERR_CONFLICT; @@ -929,6 +755,7 @@ gcry_md_setkey (gcry_md_hd_t hd, const void *key, size_t keylen) return gcry_error (rc); } + /* The new debug interface. If SUFFIX is a string it creates an debug file for the context HD. IF suffix is NULL, the file is closed and debugging is stopped. */ @@ -942,9 +769,9 @@ gcry_md_debug (gcry_md_hd_t hd, const char *suffix) } - /**************** - * if ALGO is null get the digest for the used algo (which should be only one) + * If ALGO is null get the digest for the used algo (which should be + * only one) */ static byte * md_read( gcry_md_hd_t a, int algo ) @@ -958,19 +785,20 @@ md_read( gcry_md_hd_t a, int algo ) { if (r->next) log_debug ("more than one algorithm in md_read(0)\n"); - return r->digest->read (&r->context.c); + return r->spec->read (&r->context.c); } } else { for (r = a->ctx->list; r; r = r->next) - if (r->module->mod_id == algo) - return r->digest->read (&r->context.c); + if (r->spec->algo == algo) + return r->spec->read (&r->context.c); } BUG(); return NULL; } + /* * Read out the complete digest, this function implictly finalizes * the hash. 
@@ -1133,9 +961,10 @@ md_get_algo (gcry_md_hd_t a) fips_signal_error ("possible usage error"); log_error ("WARNING: more than one algorithm in md_get_algo()\n"); } - return r ? r->module->mod_id : 0; + return r ? r->spec->algo : 0; } + int gcry_md_get_algo (gcry_md_hd_t hd) { @@ -1149,23 +978,13 @@ gcry_md_get_algo (gcry_md_hd_t hd) static int md_digest_length (int algorithm) { - gcry_module_t digest; - int mdlen = 0; - - REGISTER_DEFAULT_DIGESTS; - - ath_mutex_lock (&digests_registered_lock); - digest = _gcry_module_lookup_id (digests_registered, algorithm); - if (digest) - { - mdlen = ((gcry_md_spec_t *) digest->spec)->mdlen; - _gcry_module_release (digest); - } - ath_mutex_unlock (&digests_registered_lock); + gcry_md_spec_t *spec; - return mdlen; + spec = spec_from_algo (algorithm); + return spec? spec->mdlen : 0; } + /**************** * Return the length of the digest in bytes. * This function will return 0 in case of errors. @@ -1182,31 +1001,25 @@ gcry_md_get_algo_dlen (int algorithm) static const byte * md_asn_oid (int algorithm, size_t *asnlen, size_t *mdlen) { + gcry_md_spec_t *spec; const byte *asnoid = NULL; - gcry_module_t digest; - - REGISTER_DEFAULT_DIGESTS; - ath_mutex_lock (&digests_registered_lock); - digest = _gcry_module_lookup_id (digests_registered, algorithm); - if (digest) + spec = spec_from_algo (algorithm); + if (spec) { if (asnlen) - *asnlen = ((gcry_md_spec_t *) digest->spec)->asnlen; + *asnlen = spec->asnlen; if (mdlen) - *mdlen = ((gcry_md_spec_t *) digest->spec)->mdlen; - asnoid = ((gcry_md_spec_t *) digest->spec)->asnoid; - _gcry_module_release (digest); + *mdlen = spec->mdlen; + asnoid = spec->asnoid; } else log_bug ("no ASN.1 OID for md algo %d\n", algorithm); - ath_mutex_unlock (&digests_registered_lock); return asnoid; } - /**************** * Return information about the given cipher algorithm * WHAT select the kind of information returned: @@ -1226,7 +1039,7 @@ md_asn_oid (int algorithm, size_t *asnlen, size_t *mdlen) 
gcry_error_t gcry_md_algo_info (int algo, int what, void *buffer, size_t *nbytes) { - gcry_err_code_t err = GPG_ERR_NO_ERROR; + gcry_err_code_t err; switch (what) { @@ -1248,10 +1061,10 @@ gcry_md_algo_info (int algo, int what, void *buffer, size_t *nbytes) asn = md_asn_oid (algo, &asnlen, NULL); if (buffer && (*nbytes >= asnlen)) - { - memcpy (buffer, asn, asnlen); - *nbytes = asnlen; - } + { + memcpy (buffer, asn, asnlen); + *nbytes = asnlen; + } else if (!buffer && nbytes) *nbytes = asnlen; else @@ -1264,8 +1077,9 @@ gcry_md_algo_info (int algo, int what, void *buffer, size_t *nbytes) } break; - default: - err = GPG_ERR_INV_OP; + default: + err = GPG_ERR_INV_OP; + break; } return gcry_error (err); @@ -1293,6 +1107,7 @@ md_start_debug ( gcry_md_hd_t md, const char *suffix ) log_debug("md debug: can't open %s\n", buf ); } + static void md_stop_debug( gcry_md_hd_t md ) { @@ -1329,7 +1144,7 @@ md_stop_debug( gcry_md_hd_t md ) gcry_error_t gcry_md_info (gcry_md_hd_t h, int cmd, void *buffer, size_t *nbytes) { - gcry_err_code_t err = GPG_ERR_NO_ERROR; + gcry_err_code_t err = 0; switch (cmd) { @@ -1350,7 +1165,7 @@ gcry_md_info (gcry_md_hd_t h, int cmd, void *buffer, size_t *nbytes) *nbytes = 0; for(r=h->ctx->list; r; r = r->next ) { - if (r->module->mod_id == algo) + if (r->spec->algo == algo) { *nbytes = 1; break; @@ -1372,15 +1187,7 @@ gcry_md_info (gcry_md_hd_t h, int cmd, void *buffer, size_t *nbytes) gcry_err_code_t _gcry_md_init (void) { - gcry_err_code_t err; - - err = ath_mutex_init (&digests_registered_lock); - if (err) - return gpg_err_code_from_errno (err); - - REGISTER_DEFAULT_DIGESTS; - - return err; + return 0; } @@ -1413,34 +1220,21 @@ gcry_md_is_enabled (gcry_md_hd_t a, int algo) gpg_error_t _gcry_md_selftest (int algo, int extended, selftest_report_func_t report) { - gcry_module_t module = NULL; - md_extra_spec_t *extraspec = NULL; gcry_err_code_t ec = 0; + gcry_md_spec_t *spec; - REGISTER_DEFAULT_DIGESTS; - - ath_mutex_lock 
(&digests_registered_lock); - module = _gcry_module_lookup_id (digests_registered, algo); - if (module && !(module->flags & FLAG_MODULE_DISABLED)) - extraspec = module->extraspec; - ath_mutex_unlock (&digests_registered_lock); - if (extraspec && extraspec->selftest) - ec = extraspec->selftest (algo, extended, report); + spec = spec_from_algo (algo); + if (spec && !spec->flags.disabled && spec->selftest) + ec = spec->selftest (algo, extended, report); else { ec = GPG_ERR_DIGEST_ALGO; if (report) report ("digest", algo, "module", - module && !(module->flags & FLAG_MODULE_DISABLED)? + (spec && !spec->flags.disabled)? "no selftest available" : - module? "algorithm disabled" : "algorithm not found"); + spec? "algorithm disabled" : "algorithm not found"); } - if (module) - { - ath_mutex_lock (&digests_registered_lock); - _gcry_module_release (module); - ath_mutex_unlock (&digests_registered_lock); - } return gpg_error (ec); } diff --git a/cipher/md4.c b/cipher/md4.c index e2d096c..ab32b14 100644 --- a/cipher/md4.c +++ b/cipher/md4.c @@ -261,6 +261,7 @@ static gcry_md_oid_spec_t oid_spec_md4[] = gcry_md_spec_t _gcry_digest_spec_md4 = { + GCRY_MD_MD4, {0, 0}, "MD4", asn, DIM (asn), oid_spec_md4,16, md4_init, _gcry_md_block_write, md4_final, md4_read, sizeof (MD4_CONTEXT) diff --git a/cipher/md5.c b/cipher/md5.c index db0f315..1b6ad48 100644 --- a/cipher/md5.c +++ b/cipher/md5.c @@ -287,6 +287,7 @@ static gcry_md_oid_spec_t oid_spec_md5[] = gcry_md_spec_t _gcry_digest_spec_md5 = { + GCRY_MD_MD5, {0, 1}, "MD5", asn, DIM (asn), oid_spec_md5, 16, md5_init, _gcry_md_block_write, md5_final, md5_read, sizeof (MD5_CONTEXT) diff --git a/cipher/rmd160.c b/cipher/rmd160.c index d156e61..f7430ea 100644 --- a/cipher/rmd160.c +++ b/cipher/rmd160.c @@ -504,6 +504,7 @@ static gcry_md_oid_spec_t oid_spec_rmd160[] = gcry_md_spec_t _gcry_digest_spec_rmd160 = { + GCRY_MD_RMD160, {0, 0}, "RIPEMD160", asn, DIM (asn), oid_spec_rmd160, 20, _gcry_rmd160_init, _gcry_md_block_write, rmd160_final, 
rmd160_read, sizeof (RMD160_CONTEXT) diff --git a/cipher/sha1.c b/cipher/sha1.c index aef9f05..95591eb 100644 --- a/cipher/sha1.c +++ b/cipher/sha1.c @@ -411,11 +411,9 @@ static gcry_md_oid_spec_t oid_spec_sha1[] = gcry_md_spec_t _gcry_digest_spec_sha1 = { + GCRY_MD_SHA1, {0, 1}, "SHA1", asn, DIM (asn), oid_spec_sha1, 20, sha1_init, _gcry_md_block_write, sha1_final, sha1_read, - sizeof (SHA1_CONTEXT) - }; -md_extra_spec_t _gcry_digest_extraspec_sha1 = - { + sizeof (SHA1_CONTEXT), run_selftests }; diff --git a/cipher/sha256.c b/cipher/sha256.c index ad08cc7..d3917e4 100644 --- a/cipher/sha256.c +++ b/cipher/sha256.c @@ -475,22 +475,18 @@ static gcry_md_oid_spec_t oid_spec_sha256[] = gcry_md_spec_t _gcry_digest_spec_sha224 = { + GCRY_MD_SHA224, {0, 1}, "SHA224", asn224, DIM (asn224), oid_spec_sha224, 28, sha224_init, _gcry_md_block_write, sha256_final, sha256_read, - sizeof (SHA256_CONTEXT) - }; -md_extra_spec_t _gcry_digest_extraspec_sha224 = - { + sizeof (SHA256_CONTEXT), run_selftests }; gcry_md_spec_t _gcry_digest_spec_sha256 = { + GCRY_MD_SHA256, {0, 1}, "SHA256", asn256, DIM (asn256), oid_spec_sha256, 32, sha256_init, _gcry_md_block_write, sha256_final, sha256_read, - sizeof (SHA256_CONTEXT) - }; -md_extra_spec_t _gcry_digest_extraspec_sha256 = - { + sizeof (SHA256_CONTEXT), run_selftests }; diff --git a/cipher/sha512.c b/cipher/sha512.c index 505a1e4..af30775 100644 --- a/cipher/sha512.c +++ b/cipher/sha512.c @@ -731,12 +731,10 @@ static gcry_md_oid_spec_t oid_spec_sha512[] = gcry_md_spec_t _gcry_digest_spec_sha512 = { + GCRY_MD_SHA512, {0, 1}, "SHA512", sha512_asn, DIM (sha512_asn), oid_spec_sha512, 64, sha512_init, _gcry_md_block_write, sha512_final, sha512_read, sizeof (SHA512_CONTEXT), - }; -md_extra_spec_t _gcry_digest_extraspec_sha512 = - { run_selftests }; @@ -759,11 +757,9 @@ static gcry_md_oid_spec_t oid_spec_sha384[] = gcry_md_spec_t _gcry_digest_spec_sha384 = { + GCRY_MD_SHA384, {0, 1}, "SHA384", sha384_asn, DIM (sha384_asn), oid_spec_sha384, 48, 
sha384_init, _gcry_md_block_write, sha512_final, sha512_read, sizeof (SHA512_CONTEXT), - }; -md_extra_spec_t _gcry_digest_extraspec_sha384 = - { run_selftests }; diff --git a/cipher/stribog.c b/cipher/stribog.c index 61aa222..7dcdfd6 100644 --- a/cipher/stribog.c +++ b/cipher/stribog.c @@ -1387,6 +1387,7 @@ stribog_read_256 (void *context) gcry_md_spec_t _gcry_digest_spec_stribog_256 = { + GCRY_MD_STRIBOG256, {0, 0}, "STRIBOG256", NULL, 0, NULL, 32, stribog_init_256, _gcry_md_block_write, stribog_final, stribog_read_256, sizeof (STRIBOG_CONTEXT) @@ -1394,6 +1395,7 @@ gcry_md_spec_t _gcry_digest_spec_stribog_256 = gcry_md_spec_t _gcry_digest_spec_stribog_512 = { + GCRY_MD_STRIBOG512, {0, 0}, "STRIBOG512", NULL, 0, NULL, 64, stribog_init_512, _gcry_md_block_write, stribog_final, stribog_read_512, sizeof (STRIBOG_CONTEXT) diff --git a/cipher/tiger.c b/cipher/tiger.c index df16098..a70a3f2 100644 --- a/cipher/tiger.c +++ b/cipher/tiger.c @@ -810,6 +810,7 @@ tiger_read( void *context ) an OID anymore because that would not be correct. */ gcry_md_spec_t _gcry_digest_spec_tiger = { + GCRY_MD_TIGER, {0, 0}, "TIGER192", NULL, 0, NULL, 24, tiger_init, _gcry_md_block_write, tiger_final, tiger_read, sizeof (TIGER_CONTEXT) @@ -832,6 +833,7 @@ static gcry_md_oid_spec_t oid_spec_tiger1[] = gcry_md_spec_t _gcry_digest_spec_tiger1 = { + GCRY_MD_TIGER1, {0, 0}, "TIGER", asn1, DIM (asn1), oid_spec_tiger1, 24, tiger1_init, _gcry_md_block_write, tiger_final, tiger_read, sizeof (TIGER_CONTEXT) @@ -842,6 +844,7 @@ gcry_md_spec_t _gcry_digest_spec_tiger1 = /* This is TIGER2 which usues a changed padding algorithm. 
*/ gcry_md_spec_t _gcry_digest_spec_tiger2 = { + GCRY_MD_TIGER2, {0, 0}, "TIGER2", NULL, 0, NULL, 24, tiger2_init, _gcry_md_block_write, tiger_final, tiger_read, sizeof (TIGER_CONTEXT) diff --git a/cipher/whirlpool.c b/cipher/whirlpool.c index fa632f9..168c38f 100644 --- a/cipher/whirlpool.c +++ b/cipher/whirlpool.c @@ -1351,6 +1351,7 @@ whirlpool_read (void *ctx) gcry_md_spec_t _gcry_digest_spec_whirlpool = { + GCRY_MD_WHIRLPOOL, {0, 0}, "WHIRLPOOL", NULL, 0, NULL, 64, whirlpool_init, whirlpool_write, whirlpool_final, whirlpool_read, sizeof (whirlpool_context_t) diff --git a/src/cipher-proto.h b/src/cipher-proto.h index 62bc8b9..f4b9959 100644 --- a/src/cipher-proto.h +++ b/src/cipher-proto.h @@ -230,20 +230,46 @@ typedef struct gcry_cipher_spec * */ -typedef struct md_extra_spec +/* Type for the md_init function. */ +typedef void (*gcry_md_init_t) (void *c); + +/* Type for the md_write function. */ +typedef void (*gcry_md_write_t) (void *c, const void *buf, size_t nbytes); + +/* Type for the md_final function. */ +typedef void (*gcry_md_final_t) (void *c); + +/* Type for the md_read function. */ +typedef unsigned char *(*gcry_md_read_t) (void *c); + +typedef struct gcry_md_oid_spec { - selftest_func_t selftest; -} md_extra_spec_t; + const char *oidstring; +} gcry_md_oid_spec_t; +/* Module specification structure for message digests. */ +typedef struct gcry_md_spec +{ + int algo; + struct { + unsigned int disabled:1; + unsigned int fips:1; + } flags; + const char *name; + unsigned char *asnoid; + int asnlen; + gcry_md_oid_spec_t *oids; + int mdlen; + gcry_md_init_t init; + gcry_md_write_t write; + gcry_md_final_t final; + gcry_md_read_t read; + size_t contextsize; /* allocate this amount of context */ + selftest_func_t selftest; +} gcry_md_spec_t; -/* The private register functions. */ -gcry_error_t _gcry_md_register (gcry_md_spec_t *cipher, - md_extra_spec_t *extraspec, - unsigned int *algorithm_id, - gcry_module_t *module); - /* The selftest functions. 
*/ gcry_error_t _gcry_cipher_selftest (int algo, int extended, selftest_report_func_t report); diff --git a/src/cipher.h b/src/cipher.h index d080e72..3b7744a 100644 --- a/src/cipher.h +++ b/src/cipher.h @@ -224,12 +224,6 @@ extern gcry_md_spec_t _gcry_digest_spec_tiger1; extern gcry_md_spec_t _gcry_digest_spec_tiger2; extern gcry_md_spec_t _gcry_digest_spec_whirlpool; -extern md_extra_spec_t _gcry_digest_extraspec_sha1; -extern md_extra_spec_t _gcry_digest_extraspec_sha224; -extern md_extra_spec_t _gcry_digest_extraspec_sha256; -extern md_extra_spec_t _gcry_digest_extraspec_sha384; -extern md_extra_spec_t _gcry_digest_extraspec_sha512; - /* Declarations for the pubkey cipher specifications. */ extern gcry_pk_spec_t _gcry_pubkey_spec_rsa; extern gcry_pk_spec_t _gcry_pubkey_spec_elg; diff --git a/src/gcrypt-module.h b/src/gcrypt-module.h index 621a3a4..35a928c 100644 --- a/src/gcrypt-module.h +++ b/src/gcrypt-module.h @@ -47,37 +47,6 @@ typedef struct gcry_module *gcry_module_t; /* ********************** */ -/* Type for the md_init function. */ -typedef void (*gcry_md_init_t) (void *c); - -/* Type for the md_write function. */ -typedef void (*gcry_md_write_t) (void *c, const void *buf, size_t nbytes); - -/* Type for the md_final function. */ -typedef void (*gcry_md_final_t) (void *c); - -/* Type for the md_read function. */ -typedef unsigned char *(*gcry_md_read_t) (void *c); - -typedef struct gcry_md_oid_spec -{ - const char *oidstring; -} gcry_md_oid_spec_t; - -/* Module specification structure for message digests. 
*/ -typedef struct gcry_md_spec -{ - const char *name; - unsigned char *asnoid; - int asnlen; - gcry_md_oid_spec_t *oids; - int mdlen; - gcry_md_init_t init; - gcry_md_write_t write; - gcry_md_final_t final; - gcry_md_read_t read; - size_t contextsize; /* allocate this amount of context */ -} gcry_md_spec_t; #if 0 /* keep Emacsens's auto-indent happy */ { ----------------------------------------------------------------------- Summary of changes: cipher/cipher-internal.h | 1 - cipher/crc.c | 6 + cipher/gostr3411-94.c | 1 + cipher/md.c | 550 +++++++++++++++------------------------------- cipher/md4.c | 1 + cipher/md5.c | 1 + cipher/rmd160.c | 1 + cipher/sha1.c | 6 +- cipher/sha256.c | 12 +- cipher/sha512.c | 8 +- cipher/stribog.c | 2 + cipher/tiger.c | 3 + cipher/whirlpool.c | 1 + src/Makefile.am | 4 +- src/cipher-proto.h | 44 +++- src/cipher.h | 6 - src/ec-context.h | 3 - src/g10lib.h | 47 +---- src/gcrypt-module.h | 88 -------- src/module.c | 212 ------------------ src/mpi.h | 4 + src/visibility.h | 1 - 22 files changed, 238 insertions(+), 764 deletions(-) delete mode 100644 src/gcrypt-module.h delete mode 100644 src/module.c hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From gniibe at fsij.org Thu Oct 3 06:14:13 2013 From: gniibe at fsij.org (NIIBE Yutaka) Date: Thu, 03 Oct 2013 13:14:13 +0900 Subject: [PATCH 3/3] Add support for GOST R 34.10-2001/-2012 signatures In-Reply-To: <1380197263-750-3-git-send-email-dbaryshkov@gmail.com> References: <1380197263-750-1-git-send-email-dbaryshkov@gmail.com> <1380197263-750-3-git-send-email-dbaryshkov@gmail.com> Message-ID: <1380773653.32263.2.camel@cfw2.gniibe.org> Hello, Patch looks great. Here is my comment. I think that it's better to add ECC_DIALECT_GOST_R34_10 (or something like that) to ecc_dialects in src/mpi.h. 
That's because it's a part of a curve definition, and domain parameters (only) make sense with this.
--

From dbaryshkov at gmail.com Thu Oct 3 11:43:28 2013
From: dbaryshkov at gmail.com (Dmitry Eremin-Solenikov)
Date: Thu, 3 Oct 2013 13:43:28 +0400
Subject: [PATCH 3/3] Add support for GOST R 34.10-2001/-2012 signatures
In-Reply-To: <1380773653.32263.2.camel@cfw2.gniibe.org>
References: <1380197263-750-1-git-send-email-dbaryshkov@gmail.com> <1380197263-750-3-git-send-email-dbaryshkov@gmail.com> <1380773653.32263.2.camel@cfw2.gniibe.org>
Message-ID:

Hello,

On Thu, Oct 3, 2013 at 8:14 AM, NIIBE Yutaka wrote:
> Patch looks great. Here is my comment.
>
> I think that it's better to add ECC_DIALECT_GOST_R34_10 (or something
> like that) to ecc_dialects in src/mpi.h. That's because it's a part
> of a curve definition, and domain parameters (only) make sense with
> this.

Hmm. Interesting suggestion. Maybe I failed to understand the purpose of the domain field. From my understanding it is used only for key generation, isn't it? Does it have any other use cases?

I thought about GOST R 34.10 signatures as if they were 'just another format' of ECDSA signatures. Thus I added a (flag gost) to data/signature generation. IIUC, there is no significant difference between ECDSA and GOST curves/keys.

What do you think?

--
With best wishes
Dmitry

From gniibe at fsij.org Thu Oct 3 15:11:14 2013
From: gniibe at fsij.org (NIIBE Yutaka)
Date: Thu, 03 Oct 2013 22:11:14 +0900
Subject: [PATCH 3/3] Add support for GOST R 34.10-2001/-2012 signatures
In-Reply-To:
References: <1380197263-750-1-git-send-email-dbaryshkov@gmail.com> <1380197263-750-3-git-send-email-dbaryshkov@gmail.com> <1380773653.32263.2.camel@cfw2.gniibe.org>
Message-ID: <1380805874.3589.0.camel@latx1.gniibe.org>

Hello,

Let me explain my thought.
When I see the portion of your code:

--- a/cipher/ecc-curves.c
+++ b/cipher/ecc-curves.c
@@ -263,6 +263,34 @@ static const ecc_domain_parms_t domain_parms[] =
     "0x7dde385d566332ecc0eabfa9cf7822fdf209f70024a57b1aa000c55b881f8111"
     "b2dcde494a5f485e5bca4bd88a2763aed1ca2b2fa8f0540678cd1e0f3ad80892" },
+  {
+    "GOST2001-test", 256, 0,
+    MPI_EC_WEIERSTRASS, ECC_DIALECT_STANDARD,
+    "0x8000000000000000000000000000000000000000000000000000000000000431", /* p */
+    "0x0000000000000000000000000000000000000000000000000000000000000007", /* a */
+    "0x5fbff498aa938ce739b8e022fbafef40563f6e6a3472fc2a514c0ce9dae23b7e", /* b */
+    "0x8000000000000000000000000000000150fe8a1892976154c59cfc193accf5b3", /* q */
+
+    "0x0000000000000000000000000000000000000000000000000000000000000002", /* x */
+    "0x08e2a8a0e65147d4bd6316030e16d19c85c97f0a9ca267122b96abbcea7e8fc8", /* y */
+  },

I felt it's not correct to use ECC_DIALECT_STANDARD. Then, I thought that if it used ECC_DIALECT_GOST_R34_10, it would be correct. That is, the interpretation of p, a, b, g (q for GOST R 34.10), and x, y (of the generator) is determined by the combination of curve model and dialect.

On 2013-10-03 at 13:43 +0400, Dmitry Eremin-Solenikov wrote:
> Hmm. Interesting suggestion. Maybe I failed to understand the purpose
> of the domain field. From my understanding it is used only for key generation,
> isn't it? Does it have any other usecases?

ecc_domain_parms_t is only used internally in ecc-curves.c, but it is the basis of the type elliptic_curve_t (defined in ecc-common.h), which also has model and dialect.

> I thought about gost r 34.10 signatures as if they are 'just another format' of
> ecdsa signatures. Thus I added a (flag gost) to data/signature generation.
> IIUC, there is no any significant difference between ECDSA and GOST
> curves/keys.

I understand your approach of adding routines for GOST R 34.10. I have no objection to that. I agree about the similarity between the computation of an ECDSA signature and that of a GOST signature.
My point is that the semantics of a curve definition can be different.

Common: p, a, b define a curve.

ECC_DIALECT_STANDARD: g is the order of the group.
G is the generator, where gG = O.

ECC_DIALECT_GOST_R34_10: q is the order of the subgroup.
P is the generator of the subgroup, where qP = O.

If q = g and P = G for a GOST curve, there is nothing to distinguish ECC_DIALECT_STANDARD and ECC_DIALECT_GOST_R34_10.

If q < g and P /= G for a GOST curve, we need to distinguish the dialects.

If we have optional fields and let a GOST curve have g and G too, we can compute an ECDSA signature with a GOST curve. Usually, with a curve for ECDSA, the order g is prime, so the only meaningful subgroup is the group itself. We can compute a GOST signature with such an ECDSA curve with q = g, P = G.
--

From wk at gnupg.org Thu Oct 3 17:14:40 2013
From: wk at gnupg.org (Werner Koch)
Date: Thu, 03 Oct 2013 17:14:40 +0200
Subject: [PATCH 3/3] Add support for GOST R 34.10-2001/-2012 signatures
In-Reply-To: (Dmitry Eremin-Solenikov's message of "Thu, 3 Oct 2013 13:43:28 +0400")
References: <1380197263-750-1-git-send-email-dbaryshkov@gmail.com> <1380197263-750-3-git-send-email-dbaryshkov@gmail.com> <1380773653.32263.2.camel@cfw2.gniibe.org>
Message-ID: <87txgy5qu7.fsf@vigenere.g10code.de>

On Thu, 3 Oct 2013 11:43, dbaryshkov at gmail.com said:
> of the domain field. From my understanding it is used only for key generation,
> isn't it? Does it have any other usecases?

There are two purposes for the dialect field:

- Allow tweaking the computation - for example in the case of EdDSA.
- Allow diverting to optimized maths for a specific curve.

> ecdsa signatures. Thus I added a (flag gost) to data/signature generation.
> IIUC, there is no any significant difference between ECDSA and GOST
> curves/keys.

Even then it might be useful to flag it as a dialect (e.g. ..DIALECT_SUBGROUP) - it might come in handy later.

Shalom-Salam,

Werner

--
Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz.
From dbaryshkov at gmail.com  Thu Oct  3 21:56:22 2013
From: dbaryshkov at gmail.com (Dmitry Eremin-Solenikov)
Date: Thu, 3 Oct 2013 23:56:22 +0400
Subject: [PATCH 3/3] Add support for GOST R 34.10-2001/-2012 signatures
In-Reply-To: <1380805874.3589.0.camel@latx1.gniibe.org>
References: <1380197263-750-1-git-send-email-dbaryshkov@gmail.com> <1380197263-750-3-git-send-email-dbaryshkov@gmail.com> <1380773653.32263.2.camel@cfw2.gniibe.org> <1380805874.3589.0.camel@latx1.gniibe.org>
Message-ID: 

Hello,

On Thu, Oct 3, 2013 at 5:11 PM, NIIBE Yutaka wrote:
>
> If q = g and P = G for a GOST curve, there is nothing to distinguish
> ECC_DIALECT_STANDARD and ECC_DIALECT_GOST_R34_10.
>
> If q < g and P /= G for a GOST curve, we need to distinguish dialects.
>
> If we have optional fields and let a GOST curve have g and G too, we
> can compute an ECDSA signature with a GOST curve.

Hmm. It's my fault. In the standard itself, there are two distinct
values: m (the group order) and q (the subgroup order). However, two
facts distracted me. First, in both curves defined in the standard,
m = q. Second, RFC 4357 (which supplements the standards with exact
parameters, values, etc.) defines only the q parameter for the curves
that are used/defined.

It looks like there is a possibility for m and q to differ. Thank you
very much for pointing me to it!

Even if we verify that the curves defined in RFC 4357 use m = q, there
is absolutely no guarantee that future curves will follow. So it really
looks like a separate domain. At least from the 'pure math' perspective.

I like Werner's idea of DIALECT_SUBGROUP. It defines the curve
parameters and still leaves enough space for possible standards/curves
which decide to use a subgroup instead of the full group.

Werner, would you take the first two patches in this series?
-- 
With best wishes
Dmitry

From gniibe at fsij.org  Fri Oct  4 03:04:18 2013
From: gniibe at fsij.org (NIIBE Yutaka)
Date: Fri, 04 Oct 2013 10:04:18 +0900
Subject: [PATCH 3/3] Add support for GOST R 34.10-2001/-2012 signatures
In-Reply-To: 
References: <1380197263-750-1-git-send-email-dbaryshkov@gmail.com> <1380197263-750-3-git-send-email-dbaryshkov@gmail.com> <1380773653.32263.2.camel@cfw2.gniibe.org> <1380805874.3589.0.camel@latx1.gniibe.org>
Message-ID: <1380848658.3346.1.camel@cfw2.gniibe.org>

Hello,

I withdraw my original suggestion (using DIALECT_*) if we don't have an
actual curve at hand where m /= q.

On 2013-10-03 at 23:56 +0400, Dmitry Eremin-Solenikov wrote:
> First, in both curves defined in the standard, m = q. Second,
> RFC 4357 (which supplements the standards with exact parameters,
> values, etc.) defines only the q parameter for the curves that are
> used/defined.

Thank you for your explanation. I misunderstood as if m = q were just
an example, and general cases of m /= q should be handled.

> So it really looks like a separate domain.

I understand.

> I like Werner's idea of DIALECT_SUBGROUP. It defines the curve
> parameters and still leaves enough space for possible
> standards/curves which decide to use a subgroup instead of the full
> group.

IIUC, this means:

  We reserve DIALECT_SUBGROUP for future use (cases of m /= q).

  A curve with DIALECT_STANDARD will be able to compute a GOST
  signature, as well as an ECDSA signature.

  A curve with DIALECT_SUBGROUP will be able to compute a GOST
  signature, but not an ECDSA signature.

Let's do that when we have a curve with m /= q.
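The conclusion above — a curve whose base point generates a group of prime order can serve both signature schemes with the same key — can be checked with toy numbers. A minimal sketch (not libgcrypt code; the textbook curve y^2 = x^3 + 2x + 2 over GF(17), base point of prime order 19 — illustrative parameters, not a real GOST or ECDSA curve, and all names are invented here):

```c
/* Toy curve y^2 = x^3 + 2x + 2 over GF(17); G = (5,1) has prime order 19. */
enum { FP = 17, CA = 2, QORD = 19 };

typedef struct { int x, y, inf; } pt;

int md (int v, int m) { v %= m; return v < 0 ? v + m : v; }

int inv (int v, int m)           /* modular inverse, m prime (Fermat) */
{
  int r = 1, b = md (v, m), e;
  for (e = m - 2; e; e >>= 1, b = b * b % m)
    if (e & 1)
      r = r * b % m;
  return r;
}

pt ec_add (pt a, pt b)           /* point addition, affine coordinates */
{
  pt r = { 0, 0, 1 };
  int l;
  if (a.inf) return b;
  if (b.inf) return a;
  if (a.x == b.x && md (a.y + b.y, FP) == 0)
    return r;                    /* b == -a: point at infinity */
  if (a.x == b.x)
    l = md ((3 * a.x * a.x + CA) * inv (2 * a.y, FP), FP);   /* doubling */
  else
    l = md ((b.y - a.y) * inv (b.x - a.x, FP), FP);
  r.x = md (l * l - a.x - b.x, FP);
  r.y = md (l * (a.x - r.x) - a.y, FP);
  r.inf = 0;
  return r;
}

pt ec_mul (int k, pt p)          /* double-and-add scalar multiply */
{
  pt r = { 0, 0, 1 };
  for (; k; k >>= 1, p = ec_add (p, p))
    if (k & 1)
      r = ec_add (r, p);
  return r;
}

const pt G = { 5, 1, 0 };

/* ECDSA: s = k^-1 (e + r d) mod q. */
void ecdsa_sign (int e, int d, int k, int *r, int *s)
{
  *r = md (ec_mul (k, G).x, QORD);
  *s = md (inv (k, QORD) * (e + *r * d), QORD);
}

int ecdsa_verify (int e, pt y, int r, int s)
{
  int w = inv (s, QORD);
  pt c = ec_add (ec_mul (md (e * w, QORD), G),
                 ec_mul (md (r * w, QORD), y));
  return !c.inf && md (c.x, QORD) == r;
}

/* GOST R 34.10-2001: s = (r d + k e) mod q. */
void gost_sign (int e, int d, int k, int *r, int *s)
{
  *r = md (ec_mul (k, G).x, QORD);
  *s = md (*r * d + k * e, QORD);
}

int gost_verify (int e, pt y, int r, int s)
{
  int v = inv (e, QORD);
  pt c = ec_add (ec_mul (md (s * v, QORD), G),
                 ec_mul (md (-r * v, QORD), y));
  return !c.inf && md (c.x, QORD) == r;
}
```

With secret key d = 7 (public key Q = 7G), digest e = 6 and nonce k = 5, both schemes produce r = 9 (the x-coordinate of 5G reduced mod 19) and verify against the same public point; only s differs.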
-- 

From wk at gnupg.org  Mon Oct  7 11:56:00 2013
From: wk at gnupg.org (Werner Koch)
Date: Mon, 07 Oct 2013 11:56:00 +0200
Subject: GOST ECC pubkey
In-Reply-To: (Dmitry Eremin-Solenikov's message of "Wed, 2 Oct 2013 19:09:14 +0400")
References: <1379653630.3179.2.camel@cfw2.gniibe.org> <1379655225.3179.3.camel@cfw2.gniibe.org> <8761thb4wk.fsf@vigenere.g10code.de>
Message-ID: <8761t91k27.fsf@vigenere.g10code.de>

On Wed, 2 Oct 2013 17:09, dbaryshkov at gmail.com said:

> I have sent a patch a few days ago. It just adds a (flag gost) to the data
> and signature s-exps. There is no difference with ECDSA in the keys
> S-expressions.

Please have some patience. I need to restructure the ecc code first,
and then there is an urgent request from the GNUnet folks to solve a
problem with their special needs.

Also the git server had a hardware failure and I need some time to
migrate it to a new machine.

Salam-Shalom,

   Werner

-- 
Die Gedanken sind frei.  Ausnahmen regelt ein Bundesgesetz.

From dbaryshkov at gmail.com  Mon Oct  7 12:19:55 2013
From: dbaryshkov at gmail.com (Dmitry Eremin-Solenikov)
Date: Mon, 7 Oct 2013 14:19:55 +0400
Subject: GOST ECC pubkey
In-Reply-To: <8761t91k27.fsf@vigenere.g10code.de>
References: <1379653630.3179.2.camel@cfw2.gniibe.org> <1379655225.3179.3.camel@cfw2.gniibe.org> <8761thb4wk.fsf@vigenere.g10code.de> <8761t91k27.fsf@vigenere.g10code.de>
Message-ID: 

On Mon, Oct 7, 2013 at 1:56 PM, Werner Koch wrote:
> On Wed, 2 Oct 2013 17:09, dbaryshkov at gmail.com said:
>
>> I have sent a patch a few days ago. It just adds a (flag gost) to the data
>> and signature s-exps. There is no difference with ECDSA in the keys
>> S-expressions.
>
> Please have some patience. I need to restructure the ecc code first,
> and then there is an urgent request from the GNUnet folks to solve a
> problem with their special needs.
>
> Also the git server had a hardware failure and I need some time to
> migrate it to a new machine.
No problem with waiting, I was just describing what was done. Hopefully
all issues will be sorted out soon. At least this gives me some time to
finish GCM :)

-- 
With best wishes
Dmitry

From wk at gnupg.org  Wed Oct  9 23:11:39 2013
From: wk at gnupg.org (Werner Koch)
Date: Wed, 09 Oct 2013 23:11:39 +0200
Subject: [admin] Please do not poll the git server
Message-ID: <87siwaup2s.fsf@vigenere.g10code.de>

Hi,

while looking at the new git server I found that someone is polling for
changes in libgcrypt. Every minute! Please don't do that - subscribe to
the commit list instead. For those who want an immediate notification,
they may join the Jabber MUC gnupg-devel at conference.jabber.gnupg.org .

If you really think that you need to run a poll script, please be nice
and don't run it more often than every hour or so. The IP address in
question seems to be from a Scandinavian shell server provider - please
go and fix your cronjob now.

Shalom-Salam,

   Werner

-- 
Die Gedanken sind frei.  Ausnahmen regelt ein Bundesgesetz.

From cvs at cvs.gnupg.org  Thu Oct 10 15:00:28 2013
From: cvs at cvs.gnupg.org (by Jussi Kivilinna)
Date: Thu, 10 Oct 2013 15:00:28 +0200
Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-300-g150c031
Message-ID: 

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "The GNU crypto library".

The branch, master has been updated
       via  150c0313f971bcea62d2802f0389c883e11ebb31 (commit)
      from  94b652ecb006c29fa2ffb1badc9f02b758581737 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.
- Log -----------------------------------------------------------------
commit 150c0313f971bcea62d2802f0389c883e11ebb31
Author: Jussi Kivilinna
Date:   Wed Oct 2 20:47:56 2013 +0300

    Prevent tail call optimization with _gcry_burn_stack

    * configure.ac: New check, HAVE_GCC_ASM_VOLATILE_MEMORY.
    * src/g10lib.h (_gcry_burn_stack): Rename to __gcry_burn_stack.
    (__gcry_burn_stack_dummy): New.
    (_gcry_burn_stack): New macro.
    * src/misc.c (_gcry_burn_stack): Rename to __gcry_burn_stack.
    (__gcry_burn_stack_dummy): New.
    --

    Tail call optimization can turn a _gcry_burn_stack call into a tail
    jump. When this happens, the stack pointer is restored to the
    initial state of the current function. This causes a problem for
    _gcry_burn_stack because its callers do not count in the current
    function's stack depth.

    One solution is to prevent _gcry_burn_stack from being tail-call
    optimized by inserting a dummy function call behind it. Another
    would be to add the memory barrier 'asm volatile("":::"memory")'
    behind every _gcry_burn_stack call. This however requires GCC asm
    support from the compiler.

    The patch adds detection of memory barrier support and, when
    available, uses the memory barrier to prevent the tail call
    optimization. If not available, a dummy function call is used
    instead.

    Signed-off-by: Jussi Kivilinna

diff --git a/configure.ac b/configure.ac
index 2c92028..1460dfd 100644
--- a/configure.ac
+++ b/configure.ac
@@ -921,7 +921,7 @@ fi
 #
 # Check whether the compiler supports 'asm' or '__asm__' keyword for
-# assembler blocks
+# assembler blocks.
 #
 AC_CACHE_CHECK([whether 'asm' assembler keyword is supported],
        [gcry_cv_have_asm],
@@ -945,6 +945,32 @@ fi
 #
+# Check whether the compiler supports inline assembly memory barrier.
+# +if test "$gcry_cv_have_asm" = "no" ; then + if test "$gcry_cv_have___asm__" = "yes" ; then + AC_CACHE_CHECK([whether inline assembly memory barrier is supported], + [gcry_cv_have_asm_volatile_memory], + [gcry_cv_have_asm_volatile_memory=no + AC_COMPILE_IFELSE([AC_LANG_SOURCE( + [[void a(void) { __asm__ volatile("":::"memory"); }]])], + [gcry_cv_have_asm_volatile_memory=yes])]) + fi +else + AC_CACHE_CHECK([whether inline assembly memory barrier is supported], + [gcry_cv_have_asm_volatile_memory], + [gcry_cv_have_asm_volatile_memory=no + AC_COMPILE_IFELSE([AC_LANG_SOURCE( + [[void a(void) { asm volatile("":::"memory"); }]])], + [gcry_cv_have_asm_volatile_memory=yes])]) +fi +if test "$gcry_cv_have_asm_volatile_memory" = "yes" ; then + AC_DEFINE(HAVE_GCC_ASM_VOLATILE_MEMORY,1, + [Define if inline asm memory barrier is supported]) +fi + + +# # Check whether GCC inline assembler supports SSSE3 instructions # This is required for the AES-NI instructions. # diff --git a/src/g10lib.h b/src/g10lib.h index 0ada30a..c1ba2f7 100644 --- a/src/g10lib.h +++ b/src/g10lib.h @@ -259,7 +259,16 @@ int strcasecmp (const char *a, const char *b) _GCRY_GCC_ATTR_PURE; /* Stack burning. */ -void _gcry_burn_stack (unsigned int bytes); +#ifdef HAVE_GCC_ASM_VOLATILE_MEMORY +#define __gcry_burn_stack_dummy() asm volatile ("":::"memory") +#else +void __gcry_burn_stack_dummy (void); +#endif + +void __gcry_burn_stack (unsigned int bytes); +#define _gcry_burn_stack(bytes) \ + do { __gcry_burn_stack (bytes); \ + __gcry_burn_stack_dummy (); } while(0) /* To avoid that a compiler optimizes certain memset calls away, these diff --git a/src/misc.c b/src/misc.c index 912039a..9b30ac3 100644 --- a/src/misc.c +++ b/src/misc.c @@ -438,7 +438,7 @@ _gcry_log_printsxp (const char *text, gcry_sexp_t sexp) void -_gcry_burn_stack (unsigned int bytes) +__gcry_burn_stack (unsigned int bytes) { #ifdef HAVE_VLA /* (bytes == 0 ? 
1 : bytes) == (!bytes + bytes) */
@@ -456,6 +456,13 @@ _gcry_burn_stack (unsigned int bytes)
 #endif
 }

+#ifndef HAVE_GCC_ASM_VOLATILE_MEMORY
+void
+__gcry_burn_stack_dummy (void)
+{
+}
+#endif
+
 void
 _gcry_divide_by_zero (void)
 {

-----------------------------------------------------------------------

Summary of changes:
 configure.ac | 28 +++++++++++++++++++++++++++-
 src/g10lib.h | 11 ++++++++++-
 src/misc.c   |  9 ++++++++-
 3 files changed, 45 insertions(+), 3 deletions(-)

hooks/post-receive
-- 
The GNU crypto library
http://git.gnupg.org

_______________________________________________
Gnupg-commits mailing list
Gnupg-commits at gnupg.org
http://lists.gnupg.org/mailman/listinfo/gnupg-commits

From jussi.kivilinna at iki.fi  Thu Oct 10 16:43:25 2013
From: jussi.kivilinna at iki.fi (Jussi Kivilinna)
Date: Thu, 10 Oct 2013 17:43:25 +0300
Subject: [PATCH] arcfour: more optimized version for non-i386 architectures
Message-ID: <20131010144325.30641.99663.stgit@localhost6.localdomain6>

* cipher/arcfour.c (ARCFOUR_context): Reorder members.
(do_encrypt_stream) [!__i386__]: Faster implementation for non-i386.
(do_arcfour_setkey): Avoid modulo operations.
--

The patch adds a faster arcfour implementation for non-i386
architectures. The new code is not activated on i386, as performance
would regress there. This is because i386 does not have enough
registers to hold the new variables.
Speed up on Intel i5-4570 (x86_64): 1.56x Speed up on ARM Cortex-A8: 1.18x Signed-off-by: Jussi Kivilinna --- cipher/arcfour.c | 40 +++++++++++++++++++++++++++++++++++----- 1 file changed, 35 insertions(+), 5 deletions(-) diff --git a/cipher/arcfour.c b/cipher/arcfour.c index dc32b07..e8a5484 100644 --- a/cipher/arcfour.c +++ b/cipher/arcfour.c @@ -34,14 +34,39 @@ static const char *selftest(void); typedef struct { - int idx_i, idx_j; byte sbox[256]; + int idx_i, idx_j; } ARCFOUR_context; static void do_encrypt_stream( ARCFOUR_context *ctx, byte *outbuf, const byte *inbuf, unsigned int length ) { +#ifndef __i386__ + register unsigned int i = ctx->idx_i; + register byte j = ctx->idx_j; + register byte *sbox = ctx->sbox; + register byte t, u; + + while ( length-- ) + { + i++; + t = sbox[(byte)i]; + j += t; + u = sbox[j]; + sbox[(byte)i] = u; + u += t; + sbox[j] = t; + *outbuf++ = sbox[u] ^ *inbuf++; + } + + ctx->idx_i = (byte)i; + ctx->idx_j = (byte)j; +#else /*__i386__*/ + /* Old implementation of arcfour is faster on i386 than the version above. + * This is because version above increases register pressure which on i386 + * would push some of the variables to memory/stack. Therefore keep this + * version for i386 to avoid regressing performance. 
 */
  register int i = ctx->idx_i;
  register int j = ctx->idx_j;
  register byte *sbox = ctx->sbox;
@@ -59,6 +84,7 @@ do_encrypt_stream( ARCFOUR_context *ctx,

   ctx->idx_i = i;
   ctx->idx_j = j;
+#endif
 }

 static void
@@ -96,17 +122,21 @@ do_arcfour_setkey (void *context, const byte *key, unsigned int keylen)
   ctx->idx_i = ctx->idx_j = 0;
   for (i=0; i < 256; i++ )
     ctx->sbox[i] = i;
-  for (i=0; i < 256; i++ )
-    karr[i] = key[i%keylen];
+  for (i=j=0; i < 256; i++,j++ )
+    {
+      if (j >= keylen)
+        j = 0;
+      karr[i] = key[j];
+    }
   for (i=j=0; i < 256; i++ )
     {
       int t;
-      j = (j + ctx->sbox[i] + karr[i]) % 256;
+      j = (j + ctx->sbox[i] + karr[i]) & 255;
       t = ctx->sbox[i];
       ctx->sbox[i] = ctx->sbox[j];
       ctx->sbox[j] = t;
     }
-  memset( karr, 0, 256 );
+  wipememory( karr, sizeof(karr) );

   return GPG_ERR_NO_ERROR;
 }

From jussi.kivilinna at iki.fi  Sun Oct 13 12:02:28 2013
From: jussi.kivilinna at iki.fi (Jussi Kivilinna)
Date: Sun, 13 Oct 2013 13:02:28 +0300
Subject: [RFC PATCH 1/3] Add API to support AEAD cipher modes
Message-ID: <20131013100228.32014.526.stgit@localhost6.localdomain6>

From: Dmitry Eremin-Solenikov

* cipher/cipher.c (_gcry_cipher_authenticate, _gcry_cipher_tag): New.
* src/visibility.c (gcry_cipher_authenticate, gcry_cipher_tag): New.
* src/gcrypt.h.in, src/visibility.h: Add declarations of these
functions.
* src/libgcrypt.defs, src/libgcrypt.vers: Export functions.
--

Authenticated Encryption with Associated Data (AEAD) cipher modes
provide an authentication tag that can be used to authenticate the
message. At the same time they allow one to specify additional
(unencrypted) data that will be authenticated together with the
message. This class of cipher modes requires the additional API added
in this commit.
Signed-off-by: Dmitry Eremin-Solenikov --- cipher/cipher.c | 15 +++++++++++++++ src/gcrypt.h.in | 7 +++++++ src/libgcrypt.def | 2 ++ src/libgcrypt.vers | 1 + src/visibility.c | 18 ++++++++++++++++++ src/visibility.h | 6 ++++++ 6 files changed, 49 insertions(+) diff --git a/cipher/cipher.c b/cipher/cipher.c index 75d42d1..2d3a457 100644 --- a/cipher/cipher.c +++ b/cipher/cipher.c @@ -910,6 +910,21 @@ _gcry_cipher_setctr (gcry_cipher_hd_t hd, const void *ctr, size_t ctrlen) return 0; } +gcry_error_t +_gcry_cipher_authenticate (gcry_cipher_hd_t hd, + const void *aad, size_t aadsize) +{ + log_fatal ("gcry_cipher_tag: invalid mode %d\n", hd->mode ); + return gpg_error (GPG_ERR_INV_CIPHER_MODE); +} + +gcry_error_t +_gcry_cipher_tag (gcry_cipher_hd_t hd, void *out, size_t outsize) +{ + log_fatal ("gcry_cipher_tag: invalid mode %d\n", hd->mode ); + return gpg_error (GPG_ERR_INV_CIPHER_MODE); +} + gcry_error_t gcry_cipher_ctl( gcry_cipher_hd_t h, int cmd, void *buffer, size_t buflen) diff --git a/src/gcrypt.h.in b/src/gcrypt.h.in index 8646f43..a33dc08 100644 --- a/src/gcrypt.h.in +++ b/src/gcrypt.h.in @@ -940,6 +940,13 @@ gcry_error_t gcry_cipher_setkey (gcry_cipher_hd_t hd, gcry_error_t gcry_cipher_setiv (gcry_cipher_hd_t hd, const void *iv, size_t ivlen); +/* Provide additional authentication data for AEAD modes/ciphers. */ +gcry_error_t gcry_cipher_authenticate (gcry_cipher_hd_t h, + const void *aad, size_t aadlen); + +/* Get authentication tag for AEAD modes/ciphers. */ +gcry_error_t gcry_cipher_tag (gcry_cipher_hd_t h, + void *out, size_t outsize); /* Reset the handle to the state after open. 
*/ #define gcry_cipher_reset(h) gcry_cipher_ctl ((h), GCRYCTL_RESET, NULL, 0) diff --git a/src/libgcrypt.def b/src/libgcrypt.def index 7efb3b9..7d8e679 100644 --- a/src/libgcrypt.def +++ b/src/libgcrypt.def @@ -253,5 +253,7 @@ EXPORTS gcry_log_debugpnt @223 gcry_log_debugsxp @224 + gcry_cipher_authenticate @225 + gcry_cipher_tag @226 ;; end of file with public symbols for Windows. diff --git a/src/libgcrypt.vers b/src/libgcrypt.vers index b1669fd..be20f51 100644 --- a/src/libgcrypt.vers +++ b/src/libgcrypt.vers @@ -51,6 +51,7 @@ GCRYPT_1.6 { gcry_cipher_info; gcry_cipher_map_name; gcry_cipher_mode_from_oid; gcry_cipher_open; gcry_cipher_setkey; gcry_cipher_setiv; gcry_cipher_setctr; + gcry_cipher_authenticate; gcry_cipher_tag; gcry_pk_algo_info; gcry_pk_algo_name; gcry_pk_ctl; gcry_pk_decrypt; gcry_pk_encrypt; gcry_pk_genkey; diff --git a/src/visibility.c b/src/visibility.c index 6e3c755..669537f 100644 --- a/src/visibility.c +++ b/src/visibility.c @@ -689,6 +689,24 @@ gcry_cipher_setiv (gcry_cipher_hd_t hd, const void *iv, size_t ivlen) return _gcry_cipher_setiv (hd, iv, ivlen); } +gcry_error_t +gcry_cipher_tag (gcry_cipher_hd_t hd, void *out, size_t outsize) +{ + if (!fips_is_operational ()) + return gpg_error (fips_not_operational ()); + + return _gcry_cipher_tag (hd, out, outsize); +} + +gcry_error_t +gcry_cipher_authenticate (gcry_cipher_hd_t hd, const void *aad, size_t aadsize) +{ + if (!fips_is_operational ()) + return gpg_error (fips_not_operational ()); + + return _gcry_cipher_authenticate (hd, aad, aadsize); +} + gpg_error_t gcry_cipher_setctr (gcry_cipher_hd_t hd, const void *ctr, size_t ctrlen) { diff --git a/src/visibility.h b/src/visibility.h index cd2a60f..d4db258 100644 --- a/src/visibility.h +++ b/src/visibility.h @@ -81,6 +81,8 @@ #define gcry_cipher_setkey _gcry_cipher_setkey #define gcry_cipher_setiv _gcry_cipher_setiv #define gcry_cipher_setctr _gcry_cipher_setctr +#define gcry_cipher_authenticate _gcry_cipher_authenticate +#define 
gcry_cipher_tag _gcry_cipher_tag
 #define gcry_cipher_ctl _gcry_cipher_ctl
 #define gcry_cipher_decrypt _gcry_cipher_decrypt
 #define gcry_cipher_encrypt _gcry_cipher_encrypt
@@ -297,6 +299,8 @@ gcry_err_code_t gcry_md_get (gcry_md_hd_t hd, int algo,
 #undef gcry_cipher_setkey
 #undef gcry_cipher_setiv
 #undef gcry_cipher_setctr
+#undef gcry_cipher_authenticate
+#undef gcry_cipher_tag
 #undef gcry_cipher_ctl
 #undef gcry_cipher_decrypt
 #undef gcry_cipher_encrypt
@@ -474,6 +478,8 @@ MARK_VISIBLE (gcry_cipher_close)
 MARK_VISIBLE (gcry_cipher_setkey)
 MARK_VISIBLE (gcry_cipher_setiv)
 MARK_VISIBLE (gcry_cipher_setctr)
+MARK_VISIBLE (gcry_cipher_authenticate)
+MARK_VISIBLE (gcry_cipher_tag)
 MARK_VISIBLE (gcry_cipher_ctl)
 MARK_VISIBLE (gcry_cipher_decrypt)
 MARK_VISIBLE (gcry_cipher_encrypt)

From jussi.kivilinna at iki.fi  Sun Oct 13 12:02:33 2013
From: jussi.kivilinna at iki.fi (Jussi Kivilinna)
Date: Sun, 13 Oct 2013 13:02:33 +0300
Subject: [RFC PATCH 2/3] Add API for initializing AEAD modes
In-Reply-To: <20131013100228.32014.526.stgit@localhost6.localdomain6>
References: <20131013100228.32014.526.stgit@localhost6.localdomain6>
Message-ID: <20131013100233.32014.24561.stgit@localhost6.localdomain6>

--

AEAD modes may have different requirements for initialization. For
example, CCM mode needs to know the length of the encrypted data in
advance. So, would it make sense to add a variadic API function for
initializing AEAD modes? The one that this patch adds is:

  gcry_error_t gcry_cipher_aead_init (gcry_cipher_hd_t hd, ...);

With this API, CCM mode could be initialized by calling
gcry_cipher_aead_init with the arguments 'gcry_cipher_hd_t hd, void
*nonce, int noncelen, int cryptlen, int taglen' (CCM needs the length
of the encrypted data, and the length of the authentication tag, at
the beginning). GCM mode, on the other hand, could omit the lengths of
data and tag from initialization and just provide nonce and nonce_len.
Signed-off-by: Jussi Kivilinna --- cipher/cipher.c | 21 +++++++++++++++++++++ src/g10lib.h | 4 ++++ src/gcrypt.h.in | 5 ++++- src/libgcrypt.def | 1 + src/libgcrypt.vers | 2 +- src/visibility.c | 16 ++++++++++++++++ src/visibility.h | 3 +++ 7 files changed, 50 insertions(+), 2 deletions(-) diff --git a/cipher/cipher.c b/cipher/cipher.c index 2d3a457..8ebab7c 100644 --- a/cipher/cipher.c +++ b/cipher/cipher.c @@ -910,18 +910,39 @@ _gcry_cipher_setctr (gcry_cipher_hd_t hd, const void *ctr, size_t ctrlen) return 0; } + gcry_error_t _gcry_cipher_authenticate (gcry_cipher_hd_t hd, const void *aad, size_t aadsize) { log_fatal ("gcry_cipher_tag: invalid mode %d\n", hd->mode ); + + (void)aad; + (void)aadsize; + return gpg_error (GPG_ERR_INV_CIPHER_MODE); } + gcry_error_t _gcry_cipher_tag (gcry_cipher_hd_t hd, void *out, size_t outsize) { log_fatal ("gcry_cipher_tag: invalid mode %d\n", hd->mode ); + + (void)out; + (void)outsize; + + return gpg_error (GPG_ERR_INV_CIPHER_MODE); +} + + +gcry_error_t +_gcry_cipher_aead_vinit (gcry_cipher_hd_t hd, va_list arg_ptr) +{ + log_fatal ("gcry_cipher_aead_init: invalid mode %d\n", hd->mode ); + + (void)arg_ptr; + return gpg_error (GPG_ERR_INV_CIPHER_MODE); } diff --git a/src/g10lib.h b/src/g10lib.h index c1ba2f7..e4f9e7e 100644 --- a/src/g10lib.h +++ b/src/g10lib.h @@ -190,6 +190,10 @@ void _gcry_detect_hw_features (unsigned int); const char *_gcry_mpi_get_hw_config (void); +/*-- cipher/cipher.c --*/ +gcry_error_t _gcry_cipher_aead_vinit (gcry_cipher_hd_t hd, va_list arg_ptr); + + /*-- cipher/pubkey.c --*/ /* FIXME: shouldn't this go into mpi.h? */ diff --git a/src/gcrypt.h.in b/src/gcrypt.h.in index a33dc08..2fffd69 100644 --- a/src/gcrypt.h.in +++ b/src/gcrypt.h.in @@ -940,7 +940,7 @@ gcry_error_t gcry_cipher_setkey (gcry_cipher_hd_t hd, gcry_error_t gcry_cipher_setiv (gcry_cipher_hd_t hd, const void *iv, size_t ivlen); -/* Provide additional authentication data for AEAD modes/ciphers. 
*/ +/* Provide additional authentication data for AEAD modes/ciphers. */ gcry_error_t gcry_cipher_authenticate (gcry_cipher_hd_t h, const void *aad, size_t aadlen); @@ -948,6 +948,9 @@ gcry_error_t gcry_cipher_authenticate (gcry_cipher_hd_t h, gcry_error_t gcry_cipher_tag (gcry_cipher_hd_t h, void *out, size_t outsize); +/* Initialization for different AEAD modes. */ +gcry_error_t gcry_cipher_aead_init (gcry_cipher_hd_t hd, ...); + /* Reset the handle to the state after open. */ #define gcry_cipher_reset(h) gcry_cipher_ctl ((h), GCRYCTL_RESET, NULL, 0) diff --git a/src/libgcrypt.def b/src/libgcrypt.def index 7d8e679..176958e 100644 --- a/src/libgcrypt.def +++ b/src/libgcrypt.def @@ -255,5 +255,6 @@ EXPORTS gcry_cipher_authenticate @225 gcry_cipher_tag @226 + gcry_cipher_aead_init @227 ;; end of file with public symbols for Windows. diff --git a/src/libgcrypt.vers b/src/libgcrypt.vers index be20f51..5b837d6 100644 --- a/src/libgcrypt.vers +++ b/src/libgcrypt.vers @@ -51,7 +51,7 @@ GCRYPT_1.6 { gcry_cipher_info; gcry_cipher_map_name; gcry_cipher_mode_from_oid; gcry_cipher_open; gcry_cipher_setkey; gcry_cipher_setiv; gcry_cipher_setctr; - gcry_cipher_authenticate; gcry_cipher_tag; + gcry_cipher_authenticate; gcry_cipher_tag; gcry_cipher_aead_init; gcry_pk_algo_info; gcry_pk_algo_name; gcry_pk_ctl; gcry_pk_decrypt; gcry_pk_encrypt; gcry_pk_genkey; diff --git a/src/visibility.c b/src/visibility.c index 669537f..ed53a14 100644 --- a/src/visibility.c +++ b/src/visibility.c @@ -707,6 +707,22 @@ gcry_cipher_authenticate (gcry_cipher_hd_t hd, const void *aad, size_t aadsize) return _gcry_cipher_authenticate (hd, aad, aadsize); } +gcry_error_t +gcry_cipher_aead_init (gcry_cipher_hd_t hd, ...) 
+{ + va_list arg_ptr; + gcry_error_t err; + + if (!fips_is_operational ()) + return gpg_error (fips_not_operational ()); + + va_start (arg_ptr, hd); + err = _gcry_cipher_aead_vinit (hd, arg_ptr); + va_end (arg_ptr); + + return err; +} + gpg_error_t gcry_cipher_setctr (gcry_cipher_hd_t hd, const void *ctr, size_t ctrlen) { diff --git a/src/visibility.h b/src/visibility.h index d4db258..a992ef5 100644 --- a/src/visibility.h +++ b/src/visibility.h @@ -83,6 +83,7 @@ #define gcry_cipher_setctr _gcry_cipher_setctr #define gcry_cipher_authenticate _gcry_cipher_authenticate #define gcry_cipher_tag _gcry_cipher_tag +#define gcry_cipher_aead_init _gcry_cipher_aead_init #define gcry_cipher_ctl _gcry_cipher_ctl #define gcry_cipher_decrypt _gcry_cipher_decrypt #define gcry_cipher_encrypt _gcry_cipher_encrypt @@ -301,6 +302,7 @@ gcry_err_code_t gcry_md_get (gcry_md_hd_t hd, int algo, #undef gcry_cipher_setctr #undef gcry_cipher_authenticate #undef gcry_cipher_tag +#undef gcry_cipher_aead_init #undef gcry_cipher_ctl #undef gcry_cipher_decrypt #undef gcry_cipher_encrypt @@ -480,6 +482,7 @@ MARK_VISIBLE (gcry_cipher_setiv) MARK_VISIBLE (gcry_cipher_setctr) MARK_VISIBLE (gcry_cipher_authenticate) MARK_VISIBLE (gcry_cipher_tag) +MARK_VISIBLE (gcry_cipher_aead_init) MARK_VISIBLE (gcry_cipher_ctl) MARK_VISIBLE (gcry_cipher_decrypt) MARK_VISIBLE (gcry_cipher_encrypt) From jussi.kivilinna at iki.fi Sun Oct 13 12:02:38 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Sun, 13 Oct 2013 13:02:38 +0300 Subject: [RFC PATCH 3/3] Add Counter with CBC-MAC mode (CCM) In-Reply-To: <20131013100228.32014.526.stgit@localhost6.localdomain6> References: <20131013100228.32014.526.stgit@localhost6.localdomain6> Message-ID: <20131013100238.32014.88403.stgit@localhost6.localdomain6> -- Patch adds CCM (Counter with CBC-MAC) mode as defined in RFC 3610 and NIST Special Publication 800-38C. 
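The flags and length packing that _gcry_cipher_ccm_aead_init performs on the IV and counter blocks follows the B_0 / A_0 layout of RFC 3610, section 2. A standalone sketch of just that encoding (a hypothetical helper, not part of the patch; M is the tag length, L = 15 - nonce length):

```c
typedef unsigned char u8;

/* Build the CCM B_0 block and initial counter A_0 (16 bytes each) as
 * defined in RFC 3610.  Returns 0 on success, -1 on invalid M or L. */
int ccm_b0_a0 (u8 *b0, u8 *a0, const u8 *nonce, unsigned int noncelen,
               unsigned int taglen /* M */, unsigned long long msglen,
               int have_aad)
{
  unsigned int L = 15 - noncelen;
  int i;

  if (taglen < 4 || taglen > 16 || (taglen & 1))
    return -1;                  /* M must be 4, 6, ..., 16 */
  if (L < 2 || L > 8)
    return -1;                  /* L must be 2..8 */

  /* B_0: flags = 64*Adata + 8*M' + L', with M' = (M-2)/2, L' = L-1. */
  b0[0] = (have_aad ? 64 : 0) | (((taglen - 2) / 2) << 3) | (L - 1);
  for (i = 0; i < (int) noncelen; i++)
    b0[1 + i] = nonce[i];
  for (i = 15; i >= (int) (1 + noncelen); i--)
    {
      b0[i] = msglen & 0xff;    /* big-endian message length */
      msglen >>= 8;
    }

  /* A_0: flags = L', then the nonce, then a zero counter field. */
  a0[0] = L - 1;
  for (i = 0; i < (int) noncelen; i++)
    a0[1 + i] = nonce[i];
  for (i = 1 + noncelen; i < 16; i++)
    a0[i] = 0;
  return 0;
}
```

For RFC 3610's first test vector (13-byte nonce, M = 8, 23 message bytes, AAD present) this yields the flags octet 0x59 and a two-byte length field 0x0017, matching what the patch's iv[0] computation plus the later `|= 64` for AAD produce.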
--- cipher/Makefile.am | 1 cipher/cipher-ccm.c | 349 ++++++++++++++++++++++++++++++ cipher/cipher-internal.h | 41 +++- cipher/cipher.c | 88 ++++++-- src/gcrypt.h.in | 5 tests/basic.c | 532 ++++++++++++++++++++++++++++++++++++++++++++++ tests/benchmark.c | 67 +++++- 7 files changed, 1063 insertions(+), 20 deletions(-) create mode 100644 cipher/cipher-ccm.c diff --git a/cipher/Makefile.am b/cipher/Makefile.am index a2b2c8a..b0efd89 100644 --- a/cipher/Makefile.am +++ b/cipher/Makefile.am @@ -40,6 +40,7 @@ libcipher_la_LIBADD = $(GCRYPT_MODULES) libcipher_la_SOURCES = \ cipher.c cipher-internal.h \ cipher-cbc.c cipher-cfb.c cipher-ofb.c cipher-ctr.c cipher-aeswrap.c \ +cipher-ccm.c \ cipher-selftest.c cipher-selftest.h \ pubkey.c pubkey-internal.h pubkey-util.c \ md.c \ diff --git a/cipher/cipher-ccm.c b/cipher/cipher-ccm.c new file mode 100644 index 0000000..94ffbea --- /dev/null +++ b/cipher/cipher-ccm.c @@ -0,0 +1,349 @@ +/* cipher-ccm.c - CTR mode with CBC-MAC mode implementation + * Copyright ? 2013 Jussi Kivilinna + * + * This file is part of Libgcrypt. + * + * Libgcrypt is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser general Public License as + * published by the Free Software Foundation; either version 2.1 of + * the License, or (at your option) any later version. + * + * Libgcrypt is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this program; if not, see . 
+ */ + +#include +#include +#include +#include +#include + +#include "g10lib.h" +#include "cipher.h" +#include "ath.h" +#include "bufhelp.h" +#include "./cipher-internal.h" + + +#define set_burn(burn, nburn) do { \ + unsigned int __nburn = (nburn); \ + (burn) = (burn) > __nburn ? (burn) : __nburn; } while (0) + + +static unsigned int +do_cbc_mac (gcry_cipher_hd_t c, const unsigned char *inbuf, unsigned int inlen, + int do_padding) +{ + unsigned int burn = 0; + unsigned int unused = c->u_mode.ccm.mac_unused; + unsigned int blocksize = 16; + + if (inlen == 0 && unused == 0) + return 0; + + do + { + if (inlen + unused < blocksize || unused > 0) + { + for (; inlen && unused < blocksize; inlen--) + c->u_mode.ccm.macbuf[unused++] = *inbuf++; + } + if (!inlen) + { + if (!do_padding) + break; + + while (unused < blocksize) + c->u_mode.ccm.macbuf[unused++] = 0; + } + + if (unused > 0) + { + /* Process one block from macbuf. */ + buf_xor(c->u_iv.iv, c->u_iv.iv, c->u_mode.ccm.macbuf, blocksize); + set_burn (burn, c->spec->encrypt ( &c->context.c, c->u_iv.iv, + c->u_iv.iv )); + + unused = 0; + } + + while (inlen >= blocksize) + { + buf_xor(c->u_iv.iv, c->u_iv.iv, inbuf, blocksize); + + set_burn (burn, c->spec->encrypt ( &c->context.c, c->u_iv.iv, + c->u_iv.iv )); + + inlen -= blocksize; + inbuf += blocksize; + } + } + while (inlen > 0); + + c->u_mode.ccm.mac_unused = unused; + + if (burn) + burn += 4 * sizeof(void *); + + return burn; +} + + +gcry_err_code_t +_gcry_cipher_ccm_aead_init (gcry_cipher_hd_t c, const unsigned char *nonce, + unsigned int noncelen, unsigned int authlen, + size_t encryptlen) +{ + unsigned int L = 15 - noncelen; + unsigned int M = authlen; + unsigned int L_, M_; + int i; + + M_ = (M - 2) / 2; + L_ = L - 1; + + if (!nonce) + return GPG_ERR_INV_ARG; + /* Authentication field must be 4, 6, 8, 10, 12, 14 or 16. */ + if ((M_ * 2 + 2) != M || M < 4 || M > 16) + return GPG_ERR_INV_LENGTH; + /* Length field must be 2, 3, ..., or 8. 
 */
+  if (L < 2 || L > 8)
+    return GPG_ERR_INV_LENGTH;
+
+  /* Reset state */
+  memset (&c->u_mode, 0, sizeof(c->u_mode));
+  memset (&c->marks, 0, sizeof(c->marks));
+  memset (&c->u_iv, 0, sizeof(c->u_iv));
+  memset (&c->u_ctr, 0, sizeof(c->u_ctr));
+  memset (c->lastiv, 0, sizeof(c->lastiv));
+  c->unused = 0;
+
+  c->u_mode.ccm.authlen = authlen;
+  c->u_mode.ccm.encryptlen = encryptlen;
+
+  /* Setup IV & CTR */
+  c->u_iv.iv[0] = M_ * 8 + L_; /* Do not yet know if addlen > 0. */
+  memcpy (&c->u_iv.iv[1], nonce, noncelen);
+  for (i = 16 - 1; i >= 1 + noncelen; i--)
+    {
+      c->u_iv.iv[i] = encryptlen & 0xff;
+      encryptlen >>= 8;
+    }
+  c->u_ctr.ctr[0] = L_;
+  memcpy (&c->u_ctr.ctr[1], nonce, noncelen);
+  memset (&c->u_ctr.ctr[1 + noncelen], 0, L);
+
+  c->u_mode.ccm.aead = 1;
+
+  return GPG_ERR_NO_ERROR;
+}
+
+
+gcry_err_code_t
+_gcry_cipher_ccm_authenticate (gcry_cipher_hd_t c, const unsigned char *aadbuf,
+                               unsigned int aadbuflen)
+{
+  unsigned int burn = 0;
+  unsigned char b0[16];
+
+  if (aadbuflen > 0 && !aadbuf)
+    return GPG_ERR_INV_ARG;
+  if (!c->u_mode.ccm.aead || c->u_mode.ccm.tag)
+    return GPG_ERR_INV_STATE;
+  if (c->u_mode.ccm.aad)
+    return GPG_ERR_INV_STATE;
+
+  memcpy (b0, c->u_iv.iv, 16);
+  memset (c->u_iv.iv, 0, 16);
+
+  b0[0] |= (aadbuflen > 0) * 64;
+
+  set_burn (burn, do_cbc_mac (c, b0, 16, 0));
+
+  if (aadbuflen > 0 && aadbuflen <= (unsigned int)0xfeff)
+    {
+      b0[0] = (aadbuflen >> 8) & 0xff;
+      b0[1] = aadbuflen & 0xff;
+      set_burn (burn, do_cbc_mac (c, b0, 2, 0));
+    }
+  else if (aadbuflen > 0xfeff && aadbuflen <= (unsigned int)0xffffffff)
+    {
+      b0[0] = 0xff;
+      b0[1] = 0xfe;
+      buf_put_be32(&b0[2], aadbuflen);
+      set_burn (burn, do_cbc_mac (c, b0, 6, 0));
+    }
+#ifdef HAVE_U64_TYPEDEF
+  else if (aadbuflen > (unsigned int)0xffffffff)
+    {
+      b0[0] = 0xff;
+      b0[1] = 0xff;
+      buf_put_be64(&b0[2], aadbuflen);
+      set_burn (burn, do_cbc_mac (c, b0, 10, 0));
+    }
+#endif
+
+  if (aadbuflen > 0)
+    set_burn (burn, do_cbc_mac (c, aadbuf, aadbuflen, 1));
+
+  /* Generate S_0 and increase counter.  */
+  set_burn (burn, c->spec->encrypt ( &c->context.c, c->u_mode.ccm.s0,
+                                     c->u_ctr.ctr ));
+  c->u_ctr.ctr[15]++;
+
+  if (burn)
+    _gcry_burn_stack (burn + sizeof(void *) * 5);
+
+  c->u_mode.ccm.aad = 1;
+
+  return GPG_ERR_NO_ERROR;
+}
+
+
+gcry_err_code_t
+_gcry_cipher_ccm_tag (gcry_cipher_hd_t c, unsigned char *outbuf,
+                      unsigned int outbuflen, int check)
+{
+  gcry_err_code_t err;
+  unsigned int burn;
+
+  if (!outbuf || outbuflen == 0)
+    return GPG_ERR_INV_ARG;
+  /* Tag length must be same as initial authlen.  */
+  if (c->u_mode.ccm.authlen != outbuflen)
+    return GPG_ERR_INV_LENGTH;
+  if (!c->u_mode.ccm.aead)
+    return GPG_ERR_INV_STATE;
+  /* Initial encrypt length must match with length of actual data processed. */
+  if (c->u_mode.ccm.encryptlen > 0)
+    return GPG_ERR_UNFINISHED;
+
+  if (!c->u_mode.ccm.aad)
+    {
+      err = _gcry_cipher_ccm_authenticate (c, NULL, 0);
+      if (err)
+        return err;
+    }
+
+  if (!c->u_mode.ccm.tag)
+    {
+      burn = do_cbc_mac (c, NULL, 0, 1); /* Perform final padding.  */
+
+      /* Add S_0 */
+      buf_xor (c->u_iv.iv, c->u_iv.iv, c->u_mode.ccm.s0, 16);
+
+      wipememory (c->u_ctr.ctr, 16);
+      wipememory (c->u_mode.ccm.s0, 16);
+      wipememory (c->u_mode.ccm.macbuf, 16);
+
+      if (burn)
+        _gcry_burn_stack (burn + sizeof(void *) * 5);
+    }
+
+  if (!check)
+    {
+      memcpy (outbuf, c->u_iv.iv, outbuflen);
+      return GPG_ERR_NO_ERROR;
+    }
+  else
+    {
+      int diff, i;
+
+      /* Constant-time compare. */
+      for (i = 0, diff = 0; i < outbuflen; i++)
+        diff -= !!(outbuf[i] - c->u_iv.iv[i]);
+
+      return !diff ? GPG_ERR_NO_ERROR : GPG_ERR_CHECKSUM;
+    }
+}
+
+
+gcry_err_code_t
+_gcry_cipher_ccm_encrypt (gcry_cipher_hd_t c, unsigned char *outbuf,
+                          unsigned int outbuflen, const unsigned char *inbuf,
+                          unsigned int inbuflen)
+{
+  gcry_err_code_t err;
+  unsigned int burn;
+
+  if (outbuflen < inbuflen + c->u_mode.ccm.authlen)
+    return GPG_ERR_BUFFER_TOO_SHORT;
+  if (!c->u_mode.ccm.aead || c->u_mode.ccm.tag)
+    return GPG_ERR_INV_STATE;
+  if (inbuflen != c->u_mode.ccm.encryptlen)
+    return GPG_ERR_INV_LENGTH;
+
+  if (!c->u_mode.ccm.aad)
+    {
+      err = _gcry_cipher_ccm_authenticate (c, NULL, 0);
+      if (err)
+        return err;
+    }
+
+  c->u_mode.ccm.encryptlen -= inbuflen;
+  burn = do_cbc_mac (c, inbuf, inbuflen, 0);
+  if (burn)
+    _gcry_burn_stack (burn + sizeof(void *) * 5);
+
+  err = _gcry_cipher_ctr_encrypt (c, outbuf, outbuflen, inbuf, inbuflen);
+  if (err)
+    return err;
+
+  /* Generate and append MAC. */
+  return _gcry_cipher_ccm_tag (c, &outbuf[inbuflen], c->u_mode.ccm.authlen, 0);
+}
+
+
+gcry_err_code_t
+_gcry_cipher_ccm_decrypt (gcry_cipher_hd_t c, unsigned char *outbuf,
+                          unsigned int outbuflen, const unsigned char *inbuf,
+                          unsigned int inbuflen)
+{
+  gcry_err_code_t err;
+  unsigned int burn;
+  unsigned int datalen;
+
+  if (inbuflen < c->u_mode.ccm.authlen)
+    return GPG_ERR_BUFFER_TOO_SHORT;
+  if (outbuflen < inbuflen - c->u_mode.ccm.authlen)
+    return GPG_ERR_BUFFER_TOO_SHORT;
+  if (!c->u_mode.ccm.aead || c->u_mode.ccm.tag)
+    return GPG_ERR_INV_STATE;
+  datalen = inbuflen - c->u_mode.ccm.authlen;
+  if (datalen != c->u_mode.ccm.encryptlen)
+    return GPG_ERR_INV_LENGTH;
+
+  if (!c->u_mode.ccm.aad)
+    {
+      err = _gcry_cipher_ccm_authenticate (c, NULL, 0);
+      if (err)
+        return err;
+    }
+
+  err = _gcry_cipher_ctr_encrypt (c, outbuf, outbuflen, inbuf, datalen);
+  if (err)
+    return err;
+
+  c->u_mode.ccm.encryptlen -= datalen;
+  burn = do_cbc_mac (c, outbuf, datalen, 0);
+  if (burn)
+    _gcry_burn_stack (burn + sizeof(void *) * 5);
+
+  err = _gcry_cipher_ccm_tag (c, (unsigned char *)&inbuf[datalen],
+                              c->u_mode.ccm.authlen, 1);
+  if (err)
+    {
+      /* MAC check failed! */
+      wipememory (outbuf, outbuflen);
+      wipememory (c->u_iv.iv, 16);
+    }
+  return err;
+}
+
diff --git a/cipher/cipher-internal.h b/cipher/cipher-internal.h
index b60ef38..abd70f4 100644
--- a/cipher/cipher-internal.h
+++ b/cipher/cipher-internal.h
@@ -100,7 +100,8 @@ struct gcry_cipher_handle
 
   /* The initialization vector.  For best performance we make sure
      that it is properly aligned.  In particular some implementations
-     of bulk operations expect an 16 byte aligned IV. */
+     of bulk operations expect an 16 byte aligned IV.  IV is also used
+     to store CBC-MAC in CCM mode; counter IV is stored in U_CTR. */
   union {
     cipher_context_alignment_t iv_align;
     unsigned char iv[MAX_BLOCKSIZE];
@@ -117,6 +118,24 @@ struct gcry_cipher_handle
   unsigned char lastiv[MAX_BLOCKSIZE];
   int unused;  /* Number of unused bytes in LASTIV. */
 
+  union {
+    /* Mode specific storage for CCM mode. */
+    struct {
+      /* Space to save partial input lengths for MAC. */
+      unsigned char macbuf[GCRY_CCM_BLOCK_LEN];
+      int mac_unused;  /* Number of unprocessed bytes in MACBUF. */
+
+      unsigned char s0[GCRY_CCM_BLOCK_LEN];
+
+      unsigned int authlen;
+      unsigned int encryptlen;
+
+      unsigned int aead:1; /* Set to 1 if AEAD mode has been initialized. */
+      unsigned int aad:1;  /* Set to 1 if AAD has been processed. */
+      unsigned int tag:1;  /* Set to 1 if tag has been finalized. */
+    } ccm;
+  } u_mode;
+
   /* What follows are two contexts of the cipher in use.
      The first one needs to be aligned well enough for the cipher
      operation whereas the second one is a copy created by
      cipher_setkey and
@@ -175,5 +194,25 @@ gcry_err_code_t _gcry_cipher_aeswrap_decrypt
                 const byte *inbuf, unsigned int inbuflen);
 
+/*-- cipher-ccm.c --*/
+gcry_err_code_t _gcry_cipher_ccm_encrypt
+/*           */ (gcry_cipher_hd_t c,
+                 unsigned char *outbuf, unsigned int outbuflen,
+                 const unsigned char *inbuf, unsigned int inbuflen);
+gcry_err_code_t _gcry_cipher_ccm_decrypt
+/*           */ (gcry_cipher_hd_t c,
+                 unsigned char *outbuf, unsigned int outbuflen,
+                 const unsigned char *inbuf, unsigned int inbuflen);
+gcry_err_code_t _gcry_cipher_ccm_aead_init
+/*           */ (gcry_cipher_hd_t c,
+                 const unsigned char *nonce, unsigned int noncelen,
+                 unsigned int authlen, size_t encryptlen);
+gcry_err_code_t _gcry_cipher_ccm_authenticate
+/*           */ (gcry_cipher_hd_t c,
+                 const unsigned char *aadbuf, unsigned int aadbuflen);
+gcry_err_code_t _gcry_cipher_ccm_tag
+/*           */ (gcry_cipher_hd_t c,
+                 unsigned char *outbuf, unsigned int outbuflen, int check);
+
 #endif /*G10_CIPHER_INTERNAL_H*/
diff --git a/cipher/cipher.c b/cipher/cipher.c
index 8ebab7c..42fecce 100644
--- a/cipher/cipher.c
+++ b/cipher/cipher.c
@@ -375,6 +375,13 @@ gcry_cipher_open (gcry_cipher_hd_t *handle,
       if (! err)
        switch (mode)
          {
+         case GCRY_CIPHER_MODE_CCM:
+           if (spec->blocksize != GCRY_CCM_BLOCK_LEN)
+             err = GPG_ERR_INV_CIPHER_MODE;
+           if (!spec->encrypt || !spec->decrypt)
+             err = GPG_ERR_INV_CIPHER_MODE;
+           break;
+
          case GCRY_CIPHER_MODE_ECB:
          case GCRY_CIPHER_MODE_CBC:
          case GCRY_CIPHER_MODE_CFB:
@@ -718,6 +725,10 @@ cipher_encrypt (gcry_cipher_hd_t c, byte *outbuf, unsigned int outbuflen,
                                    inbuf, inbuflen);
       break;
 
+    case GCRY_CIPHER_MODE_CCM:
+      rc = _gcry_cipher_ccm_encrypt (c, outbuf, outbuflen, inbuf, inbuflen);
+      break;
+
     case GCRY_CIPHER_MODE_STREAM:
       c->spec->stencrypt (&c->context.c,
                           outbuf, (byte*)/*arggg*/inbuf, inbuflen);
@@ -811,6 +822,10 @@ cipher_decrypt (gcry_cipher_hd_t c, byte *outbuf, unsigned int outbuflen,
                                    inbuf, inbuflen);
       break;
 
+    case GCRY_CIPHER_MODE_CCM:
+      rc = _gcry_cipher_ccm_decrypt (c, outbuf, outbuflen, inbuf, inbuflen);
+      break;
+
     case GCRY_CIPHER_MODE_STREAM:
       c->spec->stdecrypt (&c->context.c,
                           outbuf, (byte*)/*arggg*/inbuf, inbuflen);
@@ -885,8 +900,20 @@ _gcry_cipher_setkey (gcry_cipher_hd_t hd, const void *key, size_t keylen)
 gcry_error_t
 _gcry_cipher_setiv (gcry_cipher_hd_t hd, const void *iv, size_t ivlen)
 {
-  cipher_setiv (hd, iv, ivlen);
-  return 0;
+  gcry_err_code_t rc = GPG_ERR_NO_ERROR;
+
+  switch (hd->mode)
+    {
+    case GCRY_CIPHER_MODE_CCM:
+      log_fatal ("cipher_setiv: invalid mode %d\n", hd->mode );
+      rc = GPG_ERR_INV_CIPHER_MODE;
+      break;
+
+    default:
+      cipher_setiv (hd, iv, ivlen);
+      break;
+    }
+  return gpg_error (rc);
 }
 
 /* Set counter for CTR mode.  (CTR,CTRLEN) must denote a buffer of
@@ -915,35 +942,70 @@ gcry_error_t
 _gcry_cipher_authenticate (gcry_cipher_hd_t hd, const void *aad,
                            size_t aadsize)
 {
-  log_fatal ("gcry_cipher_tag: invalid mode %d\n", hd->mode );
+  gcry_err_code_t rc = GPG_ERR_NO_ERROR;
+
+  switch (hd->mode)
+    {
+    case GCRY_CIPHER_MODE_CCM:
+      rc = _gcry_cipher_ccm_authenticate (hd, aad, aadsize);
+      break;
 
-  (void)aad;
-  (void)aadsize;
+    default:
+      log_fatal ("gcry_cipher_authenticate: invalid mode %d\n", hd->mode );
+      rc = GPG_ERR_INV_CIPHER_MODE;
+      break;
+    }
 
-  return gpg_error (GPG_ERR_INV_CIPHER_MODE);
+  return gpg_error (rc);
 }
 
 gcry_error_t
 _gcry_cipher_tag (gcry_cipher_hd_t hd, void *out, size_t outsize)
 {
-  log_fatal ("gcry_cipher_tag: invalid mode %d\n", hd->mode );
+  gcry_err_code_t rc = GPG_ERR_NO_ERROR;
 
-  (void)out;
-  (void)outsize;
+  switch (hd->mode)
+    {
+    case GCRY_CIPHER_MODE_CCM:
+      rc = _gcry_cipher_ccm_tag (hd, out, outsize, 0);
+      break;
+
+    default:
+      log_fatal ("gcry_cipher_tag: invalid mode %d\n", hd->mode );
+      rc = GPG_ERR_INV_CIPHER_MODE;
+      break;
+    }
 
-  return gpg_error (GPG_ERR_INV_CIPHER_MODE);
+  return gpg_error (rc);
 }
 
 gcry_error_t
 _gcry_cipher_aead_vinit (gcry_cipher_hd_t hd, va_list arg_ptr)
 {
-  log_fatal ("gcry_cipher_aead_init: invalid mode %d\n", hd->mode );
+  gcry_err_code_t rc;
 
-  (void)arg_ptr;
+  switch (hd->mode)
+    {
+    case GCRY_CIPHER_MODE_CCM:
+      {
+        const void *nonce = va_arg(arg_ptr, const void *);
+        unsigned int nonce_len = va_arg(arg_ptr, unsigned int);
+        unsigned int auth_len = va_arg(arg_ptr, unsigned int);
+        size_t encrypt_len = va_arg(arg_ptr, size_t);
+
+        rc = _gcry_cipher_ccm_aead_init (hd, nonce, nonce_len, auth_len,
+                                         encrypt_len);
+        break;
+      }
+    default:
+      log_fatal ("gcry_cipher_aead_init: invalid mode %d\n", hd->mode );
+      rc = GPG_ERR_INV_CIPHER_MODE;
+      break;
+    }
 
-  return gpg_error (GPG_ERR_INV_CIPHER_MODE);
+  return gpg_error (rc);
 }
diff --git a/src/gcrypt.h.in b/src/gcrypt.h.in
index 2fffd69..88e9c36 100644
--- a/src/gcrypt.h.in
+++ b/src/gcrypt.h.in
@@ -871,7 +871,8 @@ enum gcry_cipher_modes
     GCRY_CIPHER_MODE_STREAM = 4,  /* Used with stream ciphers. */
     GCRY_CIPHER_MODE_OFB    = 5,  /* Outer feedback. */
     GCRY_CIPHER_MODE_CTR    = 6,  /* Counter. */
-    GCRY_CIPHER_MODE_AESWRAP= 7   /* AES-WRAP algorithm. */
+    GCRY_CIPHER_MODE_AESWRAP= 7,  /* AES-WRAP algorithm. */
+    GCRY_CIPHER_MODE_CCM    = 8   /* Counter with CBC-MAC. */
   };
 
 /* Flags used with the open function. */
@@ -883,6 +884,8 @@ enum gcry_cipher_flags
     GCRY_CIPHER_CBC_MAC   = 8  /* Enable CBC message auth. code (MAC). */
   };
 
+/* CCM works only with blocks of 128 bits. */
+#define GCRY_CCM_BLOCK_LEN  (128 / 8)
 
 /* Create a handle for algorithm ALGO to be used in MODE.  FLAGS may
    be given as an bitwise OR of the gcry_cipher_flags values. */
diff --git a/tests/basic.c b/tests/basic.c
index ee04900..78f6a48 100644
--- a/tests/basic.c
+++ b/tests/basic.c
@@ -1139,6 +1139,537 @@ check_ofb_cipher (void)
 
 
 static void
+check_ccm_cipher (void)
+{
+  static const struct tv
+    {
+      int algo;
+      int keylen;
+      const char *key;
+      int noncelen;
+      const char *nonce;
+      int aadlen;
+      const char *aad;
+      int plainlen;
+      const char *plaintext;
+      int cipherlen;
+      const char *ciphertext;
+    } tv[] =
+    {
+      /* RFC 3610 */
+      { GCRY_CIPHER_AES, /* Packet Vector #1 */
+        16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF",
+        13, "\x00\x00\x00\x03\x02\x01\x00\xA0\xA1\xA2\xA3\xA4\xA5",
+        8, "\x00\x01\x02\x03\x04\x05\x06\x07",
+        23,
+        "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E",
+        31,
+        "\x58\x8C\x97\x9A\x61\xC6\x63\xD2\xF0\x66\xD0\xC2\xC0\xF9\x89\x80\x6D\x5F\x6B\x61\xDA\xC3\x84\x17\xE8\xD1\x2C\xFD\xF9\x26\xE0"},
+      { GCRY_CIPHER_AES, /* Packet Vector #2 */
+        16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF",
+        13, "\x00\x00\x00\x04\x03\x02\x01\xA0\xA1\xA2\xA3\xA4\xA5",
+        8, "\x00\x01\x02\x03\x04\x05\x06\x07",
+        24,
+        "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F",
+        32,
+        "\x72\xC9\x1A\x36\xE1\x35\xF8\xCF\x29\x1C\xA8\x94\x08\x5C\x87\xE3\xCC\x15\xC4\x39\xC9\xE4\x3A\x3B\xA0\x91\xD5\x6E\x10\x40\x09\x16"},
+      { GCRY_CIPHER_AES, /* Packet Vector #3 */
+        16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF",
+        13, "\x00\x00\x00\x05\x04\x03\x02\xA0\xA1\xA2\xA3\xA4\xA5",
+        8, "\x00\x01\x02\x03\x04\x05\x06\x07",
+        25,
+        "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20",
+        33,
+        "\x51\xB1\xE5\xF4\x4A\x19\x7D\x1D\xA4\x6B\x0F\x8E\x2D\x28\x2A\xE8\x71\xE8\x38\xBB\x64\xDA\x85\x96\x57\x4A\xDA\xA7\x6F\xBD\x9F\xB0\xC5"},
+      { GCRY_CIPHER_AES, /* Packet Vector #4 */
+        16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF",
+        13, "\x00\x00\x00\x06\x05\x04\x03\xA0\xA1\xA2\xA3\xA4\xA5",
+        12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B",
+        19,
+        "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E",
+        27,
+        "\xA2\x8C\x68\x65\x93\x9A\x9A\x79\xFA\xAA\x5C\x4C\x2A\x9D\x4A\x91\xCD\xAC\x8C\x96\xC8\x61\xB9\xC9\xE6\x1E\xF1"},
+      { GCRY_CIPHER_AES, /* Packet Vector #5 */
+        16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF",
+        13, "\x00\x00\x00\x07\x06\x05\x04\xA0\xA1\xA2\xA3\xA4\xA5",
+        12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B",
+        20,
+        "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F",
+        28,
+        "\xDC\xF1\xFB\x7B\x5D\x9E\x23\xFB\x9D\x4E\x13\x12\x53\x65\x8A\xD8\x6E\xBD\xCA\x3E\x51\xE8\x3F\x07\x7D\x9C\x2D\x93"},
+      { GCRY_CIPHER_AES, /* Packet Vector #6 */
+        16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF",
+        13, "\x00\x00\x00\x08\x07\x06\x05\xA0\xA1\xA2\xA3\xA4\xA5",
+        12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B",
+        21,
+        "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20",
+        29,
+        "\x6F\xC1\xB0\x11\xF0\x06\x56\x8B\x51\x71\xA4\x2D\x95\x3D\x46\x9B\x25\x70\xA4\xBD\x87\x40\x5A\x04\x43\xAC\x91\xCB\x94"},
+      { GCRY_CIPHER_AES, /* Packet Vector #7 */
+        16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF",
+        13, "\x00\x00\x00\x09\x08\x07\x06\xA0\xA1\xA2\xA3\xA4\xA5",
+        8, "\x00\x01\x02\x03\x04\x05\x06\x07",
+        23,
+        "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E",
+        33,
+        "\x01\x35\xD1\xB2\xC9\x5F\x41\xD5\xD1\xD4\xFE\xC1\x85\xD1\x66\xB8\x09\x4E\x99\x9D\xFE\xD9\x6C\x04\x8C\x56\x60\x2C\x97\xAC\xBB\x74\x90"},
+      { GCRY_CIPHER_AES, /* Packet Vector #8 */
+        16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF",
+        13, "\x00\x00\x00\x0A\x09\x08\x07\xA0\xA1\xA2\xA3\xA4\xA5",
+        8, "\x00\x01\x02\x03\x04\x05\x06\x07",
+        24,
+        "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F",
+        34,
+        "\x7B\x75\x39\x9A\xC0\x83\x1D\xD2\xF0\xBB\xD7\x58\x79\xA2\xFD\x8F\x6C\xAE\x6B\x6C\xD9\xB7\xDB\x24\xC1\x7B\x44\x33\xF4\x34\x96\x3F\x34\xB4"},
+      { GCRY_CIPHER_AES, /* Packet Vector #9 */
+        16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF",
+        13, "\x00\x00\x00\x0B\x0A\x09\x08\xA0\xA1\xA2\xA3\xA4\xA5",
+        8, "\x00\x01\x02\x03\x04\x05\x06\x07",
+        25,
+        "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20",
+        35,
+        "\x82\x53\x1A\x60\xCC\x24\x94\x5A\x4B\x82\x79\x18\x1A\xB5\xC8\x4D\xF2\x1C\xE7\xF9\xB7\x3F\x42\xE1\x97\xEA\x9C\x07\xE5\x6B\x5E\xB1\x7E\x5F\x4E"},
+      { GCRY_CIPHER_AES, /* Packet Vector #10 */
+        16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF",
+        13, "\x00\x00\x00\x0C\x0B\x0A\x09\xA0\xA1\xA2\xA3\xA4\xA5",
+        12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B",
+        19,
+        "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E",
+        29,
+        "\x07\x34\x25\x94\x15\x77\x85\x15\x2B\x07\x40\x98\x33\x0A\xBB\x14\x1B\x94\x7B\x56\x6A\xA9\x40\x6B\x4D\x99\x99\x88\xDD"},
+      { GCRY_CIPHER_AES, /* Packet Vector #11 */
+        16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF",
+        13, "\x00\x00\x00\x0D\x0C\x0B\x0A\xA0\xA1\xA2\xA3\xA4\xA5",
+        12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B",
+        20,
+        "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F",
+        30,
+        "\x67\x6B\xB2\x03\x80\xB0\xE3\x01\xE8\xAB\x79\x59\x0A\x39\x6D\xA7\x8B\x83\x49\x34\xF5\x3A\xA2\xE9\x10\x7A\x8B\x6C\x02\x2C"},
+      { GCRY_CIPHER_AES, /* Packet Vector #12 */
+        16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF",
+        13, "\x00\x00\x00\x0E\x0D\x0C\x0B\xA0\xA1\xA2\xA3\xA4\xA5",
+        12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B",
+        21,
+        "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20",
+        31,
+        "\xC0\xFF\xA0\xD6\xF0\x5B\xDB\x67\xF2\x4D\x43\xA4\x33\x8D\x2A\xA4\xBE\xD7\xB2\x0E\x43\xCD\x1A\xA3\x16\x62\xE7\xAD\x65\xD6\xDB"},
+      { GCRY_CIPHER_AES, /* Packet Vector #13 */
+        16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B",
+        13, "\x00\x41\x2B\x4E\xA9\xCD\xBE\x3C\x96\x96\x76\x6C\xFA",
+        8, "\x0B\xE1\xA8\x8B\xAC\xE0\x18\xB1",
+        23,
+        "\x08\xE8\xCF\x97\xD8\x20\xEA\x25\x84\x60\xE9\x6A\xD9\xCF\x52\x89\x05\x4D\x89\x5C\xEA\xC4\x7C",
+        31,
+        "\x4C\xB9\x7F\x86\xA2\xA4\x68\x9A\x87\x79\x47\xAB\x80\x91\xEF\x53\x86\xA6\xFF\xBD\xD0\x80\xF8\xE7\x8C\xF7\xCB\x0C\xDD\xD7\xB3"},
+      { GCRY_CIPHER_AES, /* Packet Vector #14 */
+        16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B",
+        13, "\x00\x33\x56\x8E\xF7\xB2\x63\x3C\x96\x96\x76\x6C\xFA",
+        8, "\x63\x01\x8F\x76\xDC\x8A\x1B\xCB",
+        24,
+        "\x90\x20\xEA\x6F\x91\xBD\xD8\x5A\xFA\x00\x39\xBA\x4B\xAF\xF9\xBF\xB7\x9C\x70\x28\x94\x9C\xD0\xEC",
+        32,
+        "\x4C\xCB\x1E\x7C\xA9\x81\xBE\xFA\xA0\x72\x6C\x55\xD3\x78\x06\x12\x98\xC8\x5C\x92\x81\x4A\xBC\x33\xC5\x2E\xE8\x1D\x7D\x77\xC0\x8A"},
+      { GCRY_CIPHER_AES, /* Packet Vector #15 */
+        16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B",
+        13, "\x00\x10\x3F\xE4\x13\x36\x71\x3C\x96\x96\x76\x6C\xFA",
+        8, "\xAA\x6C\xFA\x36\xCA\xE8\x6B\x40",
+        25,
+        "\xB9\x16\xE0\xEA\xCC\x1C\x00\xD7\xDC\xEC\x68\xEC\x0B\x3B\xBB\x1A\x02\xDE\x8A\x2D\x1A\xA3\x46\x13\x2E",
+        33,
+        "\xB1\xD2\x3A\x22\x20\xDD\xC0\xAC\x90\x0D\x9A\xA0\x3C\x61\xFC\xF4\xA5\x59\xA4\x41\x77\x67\x08\x97\x08\xA7\x76\x79\x6E\xDB\x72\x35\x06"},
+      { GCRY_CIPHER_AES, /* Packet Vector #16 */
+        16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B",
+        13, "\x00\x76\x4C\x63\xB8\x05\x8E\x3C\x96\x96\x76\x6C\xFA",
+        12, "\xD0\xD0\x73\x5C\x53\x1E\x1B\xEC\xF0\x49\xC2\x44",
+        19,
+        "\x12\xDA\xAC\x56\x30\xEF\xA5\x39\x6F\x77\x0C\xE1\xA6\x6B\x21\xF7\xB2\x10\x1C",
+        27,
+        "\x14\xD2\x53\xC3\x96\x7B\x70\x60\x9B\x7C\xBB\x7C\x49\x91\x60\x28\x32\x45\x26\x9A\x6F\x49\x97\x5B\xCA\xDE\xAF"},
+      { GCRY_CIPHER_AES, /* Packet Vector #17 */
+        16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B",
+        13, "\x00\xF8\xB6\x78\x09\x4E\x3B\x3C\x96\x96\x76\x6C\xFA",
+        12, "\x77\xB6\x0F\x01\x1C\x03\xE1\x52\x58\x99\xBC\xAE",
+        20,
+        "\xE8\x8B\x6A\x46\xC7\x8D\x63\xE5\x2E\xB8\xC5\x46\xEF\xB5\xDE\x6F\x75\xE9\xCC\x0D",
+        28,
+        "\x55\x45\xFF\x1A\x08\x5E\xE2\xEF\xBF\x52\xB2\xE0\x4B\xEE\x1E\x23\x36\xC7\x3E\x3F\x76\x2C\x0C\x77\x44\xFE\x7E\x3C"},
+      { GCRY_CIPHER_AES, /* Packet Vector #18 */
+        16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B",
+        13, "\x00\xD5\x60\x91\x2D\x3F\x70\x3C\x96\x96\x76\x6C\xFA",
+        12, "\xCD\x90\x44\xD2\xB7\x1F\xDB\x81\x20\xEA\x60\xC0",
+        21,
+        "\x64\x35\xAC\xBA\xFB\x11\xA8\x2E\x2F\x07\x1D\x7C\xA4\xA5\xEB\xD9\x3A\x80\x3B\xA8\x7F",
+        29,
+        "\x00\x97\x69\xEC\xAB\xDF\x48\x62\x55\x94\xC5\x92\x51\xE6\x03\x57\x22\x67\x5E\x04\xC8\x47\x09\x9E\x5A\xE0\x70\x45\x51"},
+      { GCRY_CIPHER_AES, /* Packet Vector #19 */
+        16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B",
+        13, "\x00\x42\xFF\xF8\xF1\x95\x1C\x3C\x96\x96\x76\x6C\xFA",
+        8, "\xD8\x5B\xC7\xE6\x9F\x94\x4F\xB8",
+        23,
+        "\x8A\x19\xB9\x50\xBC\xF7\x1A\x01\x8E\x5E\x67\x01\xC9\x17\x87\x65\x98\x09\xD6\x7D\xBE\xDD\x18",
+        33,
+        "\xBC\x21\x8D\xAA\x94\x74\x27\xB6\xDB\x38\x6A\x99\xAC\x1A\xEF\x23\xAD\xE0\xB5\x29\x39\xCB\x6A\x63\x7C\xF9\xBE\xC2\x40\x88\x97\xC6\xBA"},
+      { GCRY_CIPHER_AES, /* Packet Vector #20 */
+        16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B",
+        13, "\x00\x92\x0F\x40\xE5\x6C\xDC\x3C\x96\x96\x76\x6C\xFA",
+        8, "\x74\xA0\xEB\xC9\x06\x9F\x5B\x37",
+        24,
+        "\x17\x61\x43\x3C\x37\xC5\xA3\x5F\xC1\xF3\x9F\x40\x63\x02\xEB\x90\x7C\x61\x63\xBE\x38\xC9\x84\x37",
+        34,
+        "\x58\x10\xE6\xFD\x25\x87\x40\x22\xE8\x03\x61\xA4\x78\xE3\xE9\xCF\x48\x4A\xB0\x4F\x44\x7E\xFF\xF6\xF0\xA4\x77\xCC\x2F\xC9\xBF\x54\x89\x44"},
+      { GCRY_CIPHER_AES, /* Packet Vector #21 */
+        16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B",
+        13, "\x00\x27\xCA\x0C\x71\x20\xBC\x3C\x96\x96\x76\x6C\xFA",
+        8, "\x44\xA3\xAA\x3A\xAE\x64\x75\xCA",
+        25,
+        "\xA4\x34\xA8\xE5\x85\x00\xC6\xE4\x15\x30\x53\x88\x62\xD6\x86\xEA\x9E\x81\x30\x1B\x5A\xE4\x22\x6B\xFA",
+        35,
+        "\xF2\xBE\xED\x7B\xC5\x09\x8E\x83\xFE\xB5\xB3\x16\x08\xF8\xE2\x9C\x38\x81\x9A\x89\xC8\xE7\x76\xF1\x54\x4D\x41\x51\xA4\xED\x3A\x8B\x87\xB9\xCE"},
+      { GCRY_CIPHER_AES, /* Packet Vector #22 */
+        16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B",
+        13, "\x00\x5B\x8C\xCB\xCD\x9A\xF8\x3C\x96\x96\x76\x6C\xFA",
+        12, "\xEC\x46\xBB\x63\xB0\x25\x20\xC3\x3C\x49\xFD\x70",
+        19,
+        "\xB9\x6B\x49\xE2\x1D\x62\x17\x41\x63\x28\x75\xDB\x7F\x6C\x92\x43\xD2\xD7\xC2",
+        29,
+        "\x31\xD7\x50\xA0\x9D\xA3\xED\x7F\xDD\xD4\x9A\x20\x32\xAA\xBF\x17\xEC\x8E\xBF\x7D\x22\xC8\x08\x8C\x66\x6B\xE5\xC1\x97"},
+      { GCRY_CIPHER_AES, /* Packet Vector #23 */
+        16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B",
+        13, "\x00\x3E\xBE\x94\x04\x4B\x9A\x3C\x96\x96\x76\x6C\xFA",
+        12, "\x47\xA6\x5A\xC7\x8B\x3D\x59\x42\x27\xE8\x5E\x71",
+        20,
+        "\xE2\xFC\xFB\xB8\x80\x44\x2C\x73\x1B\xF9\x51\x67\xC8\xFF\xD7\x89\x5E\x33\x70\x76",
+        30,
+        "\xE8\x82\xF1\xDB\xD3\x8C\xE3\xED\xA7\xC2\x3F\x04\xDD\x65\x07\x1E\xB4\x13\x42\xAC\xDF\x7E\x00\xDC\xCE\xC7\xAE\x52\x98\x7D"},
+      { GCRY_CIPHER_AES, /* Packet Vector #24 */
+        16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B",
+        13, "\x00\x8D\x49\x3B\x30\xAE\x8B\x3C\x96\x96\x76\x6C\xFA",
+        12, "\x6E\x37\xA6\xEF\x54\x6D\x95\x5D\x34\xAB\x60\x59",
+        21,
+        "\xAB\xF2\x1C\x0B\x02\xFE\xB8\x8F\x85\x6D\xF4\xA3\x73\x81\xBC\xE3\xCC\x12\x85\x17\xD4",
+        31,
+        "\xF3\x29\x05\xB8\x8A\x64\x1B\x04\xB9\xC9\xFF\xB5\x8C\xC3\x90\x90\x0F\x3D\xA1\x2A\xB1\x6D\xCE\x9E\x82\xEF\xA1\x6D\xA6\x20\x59"},
+      /* RFC 5528 */
+      { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #1 */
+        16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF",
+        13, "\x00\x00\x00\x03\x02\x01\x00\xA0\xA1\xA2\xA3\xA4\xA5",
+        8, "\x00\x01\x02\x03\x04\x05\x06\x07",
+        23,
+        "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E",
+        31,
+        "\xBA\x73\x71\x85\xE7\x19\x31\x04\x92\xF3\x8A\x5F\x12\x51\xDA\x55\xFA\xFB\xC9\x49\x84\x8A\x0D\xFC\xAE\xCE\x74\x6B\x3D\xB9\xAD"},
+      { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #2 */
+        16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF",
+        13, "\x00\x00\x00\x04\x03\x02\x01\xA0\xA1\xA2\xA3\xA4\xA5",
+        8, "\x00\x01\x02\x03\x04\x05\x06\x07",
+        24,
+        "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F",
+        32,
+        "\x5D\x25\x64\xBF\x8E\xAF\xE1\xD9\x95\x26\xEC\x01\x6D\x1B\xF0\x42\x4C\xFB\xD2\xCD\x62\x84\x8F\x33\x60\xB2\x29\x5D\xF2\x42\x83\xE8"},
+      { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #3 */
+        16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF",
+        13, "\x00\x00\x00\x05\x04\x03\x02\xA0\xA1\xA2\xA3\xA4\xA5",
+        8, "\x00\x01\x02\x03\x04\x05\x06\x07",
+        25,
+        "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20",
+        33,
+        "\x81\xF6\x63\xD6\xC7\x78\x78\x17\xF9\x20\x36\x08\xB9\x82\xAD\x15\xDC\x2B\xBD\x87\xD7\x56\xF7\x92\x04\xF5\x51\xD6\x68\x2F\x23\xAA\x46"},
+      { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #4 */
+        16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF",
+        13, "\x00\x00\x00\x06\x05\x04\x03\xA0\xA1\xA2\xA3\xA4\xA5",
+        12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B",
+        19,
+        "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E",
+        27,
+        "\xCA\xEF\x1E\x82\x72\x11\xB0\x8F\x7B\xD9\x0F\x08\xC7\x72\x88\xC0\x70\xA4\xA0\x8B\x3A\x93\x3A\x63\xE4\x97\xA0"},
+      { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #5 */
+        16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF",
+        13, "\x00\x00\x00\x07\x06\x05\x04\xA0\xA1\xA2\xA3\xA4\xA5",
+        12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B",
+        20,
+        "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F",
+        28,
+        "\x2A\xD3\xBA\xD9\x4F\xC5\x2E\x92\xBE\x43\x8E\x82\x7C\x10\x23\xB9\x6A\x8A\x77\x25\x8F\xA1\x7B\xA7\xF3\x31\xDB\x09"},
+      { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #6 */
+        16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF",
+        13, "\x00\x00\x00\x08\x07\x06\x05\xA0\xA1\xA2\xA3\xA4\xA5",
+        12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B",
+        21,
+        "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20",
+        29,
+        "\xFE\xA5\x48\x0B\xA5\x3F\xA8\xD3\xC3\x44\x22\xAA\xCE\x4D\xE6\x7F\xFA\x3B\xB7\x3B\xAB\xAB\x36\xA1\xEE\x4F\xE0\xFE\x28"},
+      { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #7 */
+        16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF",
+        13, "\x00\x00\x00\x09\x08\x07\x06\xA0\xA1\xA2\xA3\xA4\xA5",
+        8, "\x00\x01\x02\x03\x04\x05\x06\x07",
+        23,
+        "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E",
+        33,
+        "\x54\x53\x20\x26\xE5\x4C\x11\x9A\x8D\x36\xD9\xEC\x6E\x1E\xD9\x74\x16\xC8\x70\x8C\x4B\x5C\x2C\xAC\xAF\xA3\xBC\xCF\x7A\x4E\xBF\x95\x73"},
+      { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #8 */
+        16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF",
+        13, "\x00\x00\x00\x0A\x09\x08\x07\xA0\xA1\xA2\xA3\xA4\xA5",
+        8, "\x00\x01\x02\x03\x04\x05\x06\x07",
+        24,
+        "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F",
+        34,
+        "\x8A\xD1\x9B\x00\x1A\x87\xD1\x48\xF4\xD9\x2B\xEF\x34\x52\x5C\xCC\xE3\xA6\x3C\x65\x12\xA6\xF5\x75\x73\x88\xE4\x91\x3E\xF1\x47\x01\xF4\x41"},
+      { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #9 */
+        16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF",
+        13, "\x00\x00\x00\x0B\x0A\x09\x08\xA0\xA1\xA2\xA3\xA4\xA5",
+        8, "\x00\x01\x02\x03\x04\x05\x06\x07",
+        25,
+        "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20",
+        35,
+        "\x5D\xB0\x8D\x62\x40\x7E\x6E\x31\xD6\x0F\x9C\xA2\xC6\x04\x74\x21\x9A\xC0\xBE\x50\xC0\xD4\xA5\x77\x87\x94\xD6\xE2\x30\xCD\x25\xC9\xFE\xBF\x87"},
+      { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #10 */
+        16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF",
+        13, "\x00\x00\x00\x0C\x0B\x0A\x09\xA0\xA1\xA2\xA3\xA4\xA5",
+        12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B",
+        19,
+        "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E",
+        29,
+        "\xDB\x11\x8C\xCE\xC1\xB8\x76\x1C\x87\x7C\xD8\x96\x3A\x67\xD6\xF3\xBB\xBC\x5C\xD0\x92\x99\xEB\x11\xF3\x12\xF2\x32\x37"},
+      { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #11 */
+        16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF",
+        13, "\x00\x00\x00\x0D\x0C\x0B\x0A\xA0\xA1\xA2\xA3\xA4\xA5",
+        12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B",
+        20,
+        "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F",
+        30,
+        "\x7C\xC8\x3D\x8D\xC4\x91\x03\x52\x5B\x48\x3D\xC5\xCA\x7E\xA9\xAB\x81\x2B\x70\x56\x07\x9D\xAF\xFA\xDA\x16\xCC\xCF\x2C\x4E"},
+      { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #12 */
+        16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF",
+        13, "\x00\x00\x00\x0E\x0D\x0C\x0B\xA0\xA1\xA2\xA3\xA4\xA5",
+        12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B",
+        21,
+        "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20",
+        31,
+        "\x2C\xD3\x5B\x88\x20\xD2\x3E\x7A\xA3\x51\xB0\xE9\x2F\xC7\x93\x67\x23\x8B\x2C\xC7\x48\xCB\xB9\x4C\x29\x47\x79\x3D\x64\xAF\x75"},
+      { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #13 */
+        16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD",
+        13, "\x00\xA9\x70\x11\x0E\x19\x27\xB1\x60\xB6\xA3\x1C\x1C",
+        8, "\x6B\x7F\x46\x45\x07\xFA\xE4\x96",
+        23,
+        "\xC6\xB5\xF3\xE6\xCA\x23\x11\xAE\xF7\x47\x2B\x20\x3E\x73\x5E\xA5\x61\xAD\xB1\x7D\x56\xC5\xA3",
+        31,
+        "\xA4\x35\xD7\x27\x34\x8D\xDD\x22\x90\x7F\x7E\xB8\xF5\xFD\xBB\x4D\x93\x9D\xA6\x52\x4D\xB4\xF6\x45\x58\xC0\x2D\x25\xB1\x27\xEE"},
+      { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #14 */
+        16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD",
+        13, "\x00\x83\xCD\x8C\xE0\xCB\x42\xB1\x60\xB6\xA3\x1C\x1C",
+        8, "\x98\x66\x05\xB4\x3D\xF1\x5D\xE7",
+        24,
+        "\x01\xF6\xCE\x67\x64\xC5\x74\x48\x3B\xB0\x2E\x6B\xBF\x1E\x0A\xBD\x26\xA2\x25\x72\xB4\xD8\x0E\xE7",
+        32,
+        "\x8A\xE0\x52\x50\x8F\xBE\xCA\x93\x2E\x34\x6F\x05\xE0\xDC\x0D\xFB\xCF\x93\x9E\xAF\xFA\x3E\x58\x7C\x86\x7D\x6E\x1C\x48\x70\x38\x06"},
+      { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #15 */
+        16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD",
+        13, "\x00\x5F\x54\x95\x0B\x18\xF2\xB1\x60\xB6\xA3\x1C\x1C",
+        8, "\x48\xF2\xE7\xE1\xA7\x67\x1A\x51",
+        25,
+        "\xCD\xF1\xD8\x40\x6F\xC2\xE9\x01\x49\x53\x89\x70\x05\xFB\xFB\x8B\xA5\x72\x76\xF9\x24\x04\x60\x8E\x08",
+        33,
+        "\x08\xB6\x7E\xE2\x1C\x8B\xF2\x6E\x47\x3E\x40\x85\x99\xE9\xC0\x83\x6D\x6A\xF0\xBB\x18\xDF\x55\x46\x6C\xA8\x08\x78\xA7\x90\x47\x6D\xE5"},
+      { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #16 */
+        16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD",
+        13, "\x00\xEC\x60\x08\x63\x31\x9A\xB1\x60\xB6\xA3\x1C\x1C",
+        12, "\xDE\x97\xDF\x3B\x8C\xBD\x6D\x8E\x50\x30\xDA\x4C",
+        19,
+        "\xB0\x05\xDC\xFA\x0B\x59\x18\x14\x26\xA9\x61\x68\x5A\x99\x3D\x8C\x43\x18\x5B",
+        27,
+        "\x63\xB7\x8B\x49\x67\xB1\x9E\xDB\xB7\x33\xCD\x11\x14\xF6\x4E\xB2\x26\x08\x93\x68\xC3\x54\x82\x8D\x95\x0C\xC5"},
+      { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #17 */
+        16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD",
+        13, "\x00\x60\xCF\xF1\xA3\x1E\xA1\xB1\x60\xB6\xA3\x1C\x1C",
+        12, "\xA5\xEE\x93\xE4\x57\xDF\x05\x46\x6E\x78\x2D\xCF",
+        20,
+        "\x2E\x20\x21\x12\x98\x10\x5F\x12\x9D\x5E\xD9\x5B\x93\xF7\x2D\x30\xB2\xFA\xCC\xD7",
+        28,
+        "\x0B\xC6\xBB\xE2\xA8\xB9\x09\xF4\x62\x9E\xE6\xDC\x14\x8D\xA4\x44\x10\xE1\x8A\xF4\x31\x47\x38\x32\x76\xF6\x6A\x9F"},
+      { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #18 */
+        16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD",
+        13, "\x00\x0F\x85\xCD\x99\x5C\x97\xB1\x60\xB6\xA3\x1C\x1C",
+        12, "\x24\xAA\x1B\xF9\xA5\xCD\x87\x61\x82\xA2\x50\x74",
+        21,
+        "\x26\x45\x94\x1E\x75\x63\x2D\x34\x91\xAF\x0F\xC0\xC9\x87\x6C\x3B\xE4\xAA\x74\x68\xC9",
+        29,
+        "\x22\x2A\xD6\x32\xFA\x31\xD6\xAF\x97\x0C\x34\x5F\x7E\x77\xCA\x3B\xD0\xDC\x25\xB3\x40\xA1\xA3\xD3\x1F\x8D\x4B\x44\xB7"},
+      { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #19 */
+        16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD",
+        13, "\x00\xC2\x9B\x2C\xAA\xC4\xCD\xB1\x60\xB6\xA3\x1C\x1C",
+        8, "\x69\x19\x46\xB9\xCA\x07\xBE\x87",
+        23,
+        "\x07\x01\x35\xA6\x43\x7C\x9D\xB1\x20\xCD\x61\xD8\xF6\xC3\x9C\x3E\xA1\x25\xFD\x95\xA0\xD2\x3D",
+        33,
+        "\x05\xB8\xE1\xB9\xC4\x9C\xFD\x56\xCF\x13\x0A\xA6\x25\x1D\xC2\xEC\xC0\x6C\xCC\x50\x8F\xE6\x97\xA0\x06\x6D\x57\xC8\x4B\xEC\x18\x27\x68"},
+      { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #20 */
+        16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD",
+        13, "\x00\x2C\x6B\x75\x95\xEE\x62\xB1\x60\xB6\xA3\x1C\x1C",
+        8, "\xD0\xC5\x4E\xCB\x84\x62\x7D\xC4",
+        24,
+        "\xC8\xC0\x88\x0E\x6C\x63\x6E\x20\x09\x3D\xD6\x59\x42\x17\xD2\xE1\x88\x77\xDB\x26\x4E\x71\xA5\xCC",
+        34,
+        "\x54\xCE\xB9\x68\xDE\xE2\x36\x11\x57\x5E\xC0\x03\xDF\xAA\x1C\xD4\x88\x49\xBD\xF5\xAE\x2E\xDB\x6B\x7F\xA7\x75\xB1\x50\xED\x43\x83\xC5\xA9"},
+      { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #21 */
+        16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD",
+        13, "\x00\xC5\x3C\xD4\xC2\xAA\x24\xB1\x60\xB6\xA3\x1C\x1C",
+        8, "\xE2\x85\xE0\xE4\x80\x8C\xDA\x3D",
+        25,
+        "\xF7\x5D\xAA\x07\x10\xC4\xE6\x42\x97\x79\x4D\xC2\xB7\xD2\xA2\x07\x57\xB1\xAA\x4E\x44\x80\x02\xFF\xAB",
+        35,
+        "\xB1\x40\x45\x46\xBF\x66\x72\x10\xCA\x28\xE3\x09\xB3\x9B\xD6\xCA\x7E\x9F\xC8\x28\x5F\xE6\x98\xD4\x3C\xD2\x0A\x02\xE0\xBD\xCA\xED\x20\x10\xD3"},
+      { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #22 */
+        16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD",
+        13, "\x00\xBE\xE9\x26\x7F\xBA\xDC\xB1\x60\xB6\xA3\x1C\x1C",
+        12, "\x6C\xAE\xF9\x94\x11\x41\x57\x0D\x7C\x81\x34\x05",
+        19,
+        "\xC2\x38\x82\x2F\xAC\x5F\x98\xFF\x92\x94\x05\xB0\xAD\x12\x7A\x4E\x41\x85\x4E",
+        29,
+        "\x94\xC8\x95\x9C\x11\x56\x9A\x29\x78\x31\xA7\x21\x00\x58\x57\xAB\x61\xB8\x7A\x2D\xEA\x09\x36\xB6\xEB\x5F\x62\x5F\x5D"},
+      { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #23 */
+        16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD",
+        13, "\x00\xDF\xA8\xB1\x24\x50\x07\xB1\x60\xB6\xA3\x1C\x1C",
+        12, "\x36\xA5\x2C\xF1\x6B\x19\xA2\x03\x7A\xB7\x01\x1E",
+        20,
+        "\x4D\xBF\x3E\x77\x4A\xD2\x45\xE5\xD5\x89\x1F\x9D\x1C\x32\xA0\xAE\x02\x2C\x85\xD7",
+        30,
+        "\x58\x69\xE3\xAA\xD2\x44\x7C\x74\xE0\xFC\x05\xF9\xA4\xEA\x74\x57\x7F\x4D\xE8\xCA\x89\x24\x76\x42\x96\xAD\x04\x11\x9C\xE7"},
+      { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #24 */
+        16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD",
+        13, "\x00\x3B\x8F\xD8\xD3\xA9\x37\xB1\x60\xB6\xA3\x1C\x1C",
+        12, "\xA4\xD4\x99\xF7\x84\x19\x72\x8C\x19\x17\x8B\x0C",
+        21,
+        "\x9D\xC9\xED\xAE\x2F\xF5\xDF\x86\x36\xE8\xC6\xDE\x0E\xED\x55\xF7\x86\x7E\x33\x33\x7D",
+        31,
+        "\x4B\x19\x81\x56\x39\x3B\x0F\x77\x96\x08\x6A\xAF\xB4\x54\xF8\xC3\xF0\x34\xCC\xA9\x66\x94\x5F\x1F\xCE\xA7\xE1\x1B\xEE\x6A\x2F"}
+    };
+  gcry_cipher_hd_t hde, hdd;
+  unsigned char out[MAX_DATA_LEN];
+  int i, keylen, blklen, authlen;
+  gcry_error_t err = 0;
+
+  if (verbose)
+    fprintf (stderr, "  Starting CCM checks.\n");
+
+  for (i = 0; i < sizeof (tv) / sizeof (tv[0]); i++)
+    {
+      if (verbose)
+        fprintf (stderr, "    checking CCM mode for %s [%i]\n",
+                 gcry_cipher_algo_name (tv[i].algo),
+                 tv[i].algo);
+      err = gcry_cipher_open (&hde, tv[i].algo, GCRY_CIPHER_MODE_CCM, 0);
+      if (!err)
+        err = gcry_cipher_open (&hdd, tv[i].algo, GCRY_CIPHER_MODE_CCM, 0);
+      if (err)
+        {
+          fail ("cipher-ccm, gcry_cipher_open failed: %s\n",
+                gpg_strerror (err));
+          return;
+        }
+
+      keylen = gcry_cipher_get_algo_keylen(tv[i].algo);
+      if (!keylen)
+        {
+          fail ("cipher-ccm, gcry_cipher_get_algo_keylen failed\n");
+          return;
+        }
+
+      err = gcry_cipher_setkey (hde, tv[i].key, keylen);
+      if (!err)
+        err = gcry_cipher_setkey (hdd, tv[i].key, keylen);
+      if (err)
+        {
+          fail ("cipher-ccm, gcry_cipher_setkey failed: %s\n",
+                gpg_strerror (err));
+          gcry_cipher_close (hde);
+          gcry_cipher_close (hdd);
+          return;
+        }
+
+      blklen = gcry_cipher_get_algo_blklen(tv[i].algo);
+      if (!blklen)
+        {
+          fail ("cipher-ccm, gcry_cipher_get_algo_blklen failed\n");
+          return;
+        }
+
+      authlen = tv[i].cipherlen - tv[i].plainlen;
+      err = gcry_cipher_aead_init (hde, tv[i].nonce, tv[i].noncelen,
+                                   authlen, tv[i].plainlen);
+      if (!err)
+        err = gcry_cipher_aead_init (hdd,
tv[i].nonce, tv[i].noncelen, + authlen, tv[i].plainlen); + if (err) + { + fail ("cipher-ccm, gcry_cipher_aead_init failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + err = gcry_cipher_authenticate (hde, tv[i].aad, tv[i].aadlen); + if (!err) + err = gcry_cipher_authenticate (hdd, tv[i].aad, tv[i].aadlen); + if (err) + { + fail ("cipher-ccm, gcry_cipher_authenticate failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + err = gcry_cipher_encrypt (hde, out, MAX_DATA_LEN, tv[i].plaintext, + tv[i].plainlen); + if (err) + { + fail ("cipher-ccm, gcry_cipher_encrypt (%d) failed: %s\n", + i, gpg_strerror (err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + if (memcmp (tv[i].ciphertext, out, tv[i].cipherlen)) + fail ("cipher-ccm, encrypt mismatch entry %d\n", i); + + err = gcry_cipher_decrypt (hdd, out, tv[i].cipherlen, NULL, 0); + if (err) + { + fail ("cipher-ccm, gcry_cipher_decrypt (%d) failed: %s\n", + i, gpg_strerror (err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + if (memcmp (tv[i].plaintext, out, tv[i].plainlen)) + fail ("cipher-ccm, decrypt mismatch entry %d:%d\n", i); + + memset (out, 0, sizeof(out)); + err = gcry_cipher_tag (hdd, out, authlen); + if (err) + { + fail ("cipher-ccm, gcry_cipher_tag (%d) failed: %s\n", + i, gpg_strerror (err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + if (memcmp (&tv[i].ciphertext[tv[i].plainlen], out, authlen)) + fail ("cipher-ccm, decrypt auth-tag mismatch entry %d\n", i); + + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + } + if (verbose) + fprintf (stderr, " Completed CCM checks.\n"); +} + + +static void check_stream_cipher (void) { struct tv @@ -2455,6 +2986,7 @@ check_cipher_modes(void) check_ctr_cipher (); check_cfb_cipher (); check_ofb_cipher (); + check_ccm_cipher (); check_stream_cipher (); 
check_stream_cipher_large_block (); diff --git a/tests/benchmark.c b/tests/benchmark.c index 5d1434a..21ae176 100644 --- a/tests/benchmark.c +++ b/tests/benchmark.c @@ -435,6 +435,36 @@ md_bench ( const char *algoname ) fflush (stdout); } + +static void ccm_aead_init(gcry_cipher_hd_t hd, size_t buflen, int authlen) +{ + const int _L = 4; + const int noncelen = 15 - _L; + char nonce[noncelen]; + gcry_error_t err = GPG_ERR_NO_ERROR; + + memset (nonce, 0x33, noncelen); + + err = gcry_cipher_aead_init (hd, nonce, noncelen, authlen, buflen - authlen); + if (err) + { + fprintf (stderr, "gcry_cipher_aead_init failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } + + err = gcry_cipher_authenticate (hd, NULL, 0); + if (err) + { + fprintf (stderr, "gcry_cipher_authenticate failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } +} + + static void cipher_bench ( const char *algoname ) { @@ -446,14 +476,23 @@ cipher_bench ( const char *algoname ) char key[128]; char *outbuf, *buf; char *raw_outbuf, *raw_buf; - size_t allocated_buflen, buflen; + size_t allocated_buflen, buflen, plainlen; int repetitions; - static struct { int mode; const char *name; int blocked; } modes[] = { + static const struct { + int mode; + const char *name; + int blocked; + void (* const aead_init)(gcry_cipher_hd_t hd, size_t buflen, int authlen); + int req_blocksize; + int authlen; + } modes[] = { { GCRY_CIPHER_MODE_ECB, " ECB/Stream", 1 }, { GCRY_CIPHER_MODE_CBC, " CBC", 1 }, { GCRY_CIPHER_MODE_CFB, " CFB", 0 }, { GCRY_CIPHER_MODE_OFB, " OFB", 0 }, { GCRY_CIPHER_MODE_CTR, " CTR", 0 }, + { GCRY_CIPHER_MODE_CCM, " CCM", 0, + ccm_aead_init, GCRY_CCM_BLOCK_LEN, 8 }, { GCRY_CIPHER_MODE_STREAM, "", 0 }, {0} }; @@ -542,9 +581,16 @@ cipher_bench ( const char *algoname ) for (modeidx=0; modes[modeidx].mode; modeidx++) { if ((blklen > 1 && modes[modeidx].mode == GCRY_CIPHER_MODE_STREAM) - | (blklen == 1 && modes[modeidx].mode != GCRY_CIPHER_MODE_STREAM)) + || 
(blklen == 1 && modes[modeidx].mode != GCRY_CIPHER_MODE_STREAM)) continue; + if (modes[modeidx].req_blocksize > 0 + && blklen != modes[modeidx].req_blocksize) + { + printf (" %7s %7s", "-", "-" ); + continue; + } + for (i=0; i < sizeof buf; i++) buf[i] = i; @@ -570,6 +616,7 @@ cipher_bench ( const char *algoname ) buflen = allocated_buflen; if (modes[modeidx].blocked) buflen = (buflen / blklen) * blklen; + plainlen = buflen - modes[modeidx].authlen; start_timer (); for (i=err=0; !err && i < repetitions; i++) @@ -585,7 +632,9 @@ cipher_bench ( const char *algoname ) exit (1); } } - err = gcry_cipher_encrypt ( hd, outbuf, buflen, buf, buflen); + if (modes[modeidx].aead_init) + (*modes[modeidx].aead_init) (hd, buflen, modes[modeidx].authlen); + err = gcry_cipher_encrypt (hd, outbuf, buflen, buf, plainlen); } stop_timer (); @@ -632,7 +681,15 @@ cipher_bench ( const char *algoname ) exit (1); } } - err = gcry_cipher_decrypt ( hd, outbuf, buflen, buf, buflen); + if (modes[modeidx].aead_init) + { + (*modes[modeidx].aead_init) (hd, buflen, modes[modeidx].authlen); + err = gcry_cipher_decrypt (hd, outbuf, plainlen, buf, buflen); + if (gpg_err_code (err) == GPG_ERR_CHECKSUM) + err = gpg_error (GPG_ERR_NO_ERROR); + } + else + err = gcry_cipher_decrypt (hd, outbuf, plainlen, buf, buflen); } stop_timer (); printf (" %s", elapsed_time ()); From wk at gnupg.org Sun Oct 13 12:45:07 2013 From: wk at gnupg.org (Werner Koch) Date: Sun, 13 Oct 2013 12:45:07 +0200 Subject: [RFC PATCH 2/3] Add API for initializing AEAD modes In-Reply-To: <20131013100233.32014.24561.stgit@localhost6.localdomain6> (Jussi Kivilinna's message of "Sun, 13 Oct 2013 13:02:33 +0300") References: <20131013100228.32014.526.stgit@localhost6.localdomain6> <20131013100233.32014.24561.stgit@localhost6.localdomain6> Message-ID: <87iox1phzg.fsf@vigenere.g10code.de> On Sun, 13 Oct 2013 12:02, jussi.kivilinna at iki.fi said: > CCM mode needs to know length of encrypted data in advance. 
So, would it make
> sense to add a variadic API function for initializing AEAD mode? The one that

Let's talk about this API first.  I need to look closer at it.

Did you notice the new gcry_buffer_t ?

Salam-Shalom,

   Werner

--
Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz.

From wk at gnupg.org  Sun Oct 13 13:57:53 2013
From: wk at gnupg.org (Werner Koch)
Date: Sun, 13 Oct 2013 13:57:53 +0200
Subject: [PATCH 3/3] Add support for GOST R 34.10-2001/-2012 signatures
In-Reply-To: (Dmitry Eremin-Solenikov's message of "Thu, 3 Oct 2013 23:56:22 +0400")
References: <1380197263-750-1-git-send-email-dbaryshkov@gmail.com>
 <1380197263-750-3-git-send-email-dbaryshkov@gmail.com>
 <1380773653.32263.2.camel@cfw2.gniibe.org>
 <1380805874.3589.0.camel@latx1.gniibe.org>
Message-ID: <87bo2tpem6.fsf@vigenere.g10code.de>

On Thu, 3 Oct 2013 21:56, dbaryshkov at gmail.com said:

> Werner, would you take the first two patches in this series?

Can you please rework them to fit into the new module-does-parsing framework?

Shalom-Salam,

   Werner

--
Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz.

From jussi.kivilinna at iki.fi  Mon Oct 14 13:20:10 2013
From: jussi.kivilinna at iki.fi (Jussi Kivilinna)
Date: Mon, 14 Oct 2013 14:20:10 +0300
Subject: [RFC PATCH 2/3] Add API for initializing AEAD modes
In-Reply-To: <87iox1phzg.fsf@vigenere.g10code.de>
References: <20131013100228.32014.526.stgit@localhost6.localdomain6>
 <20131013100233.32014.24561.stgit@localhost6.localdomain6>
 <87iox1phzg.fsf@vigenere.g10code.de>
Message-ID: <525BD36A.9010507@iki.fi>

On 13.10.2013 13:45, Werner Koch wrote:
> On Sun, 13 Oct 2013 12:02, jussi.kivilinna at iki.fi said:
>
>> CCM mode needs to know length of encrypted data in advance. So, would it make
>> sense to add a variadic API function for initializing AEAD mode? The one that
>
> Let's talk about this API first.  I need to look closer at it.

Sure. I based the CCM patchset on the AEAD API patch Dmitry sent earlier for GCM.
Since CCM has more restrictions (need to know data lengths in advance) than GCM, I added gcry_cipher_aead_init. With this patchset, to encrypt a buffer using CCM, you'd first need to initialize/reset the CCM state with:

 gcry_cipher_aead_init (hd, nonce_buf, nonce_len, authtag_len, plaintext_len)

CCM needs the tag and plaintext lengths for MAC initialization. CCM also needs the length of the AAD (additional authenticated data) for the MAC, so this call is followed by:

 gcry_cipher_authenticate (hd, aadbuf, aadbuflen)

which does the actual MAC initialization. If aadbuflen == 0, the above call can be omitted and gcry_cipher_(en|de)crypt will call gcry_cipher_authenticate with zero length.

Plaintext can then be encrypted with:

 gcry_cipher_encrypt (hd, ciphertext_buf, ciphertext_len, plaintext_buf, plaintext_len)

where ciphertext_len >= plaintext_len + authtag_len. Ciphertext can be decrypted with:

 gcry_cipher_decrypt (hd, plaintext_buf, plaintext_len, ciphertext_buf, ciphertext_len)

The NIST paper and RFC 3610 define the CCM ciphertext as [ctr-enc(plaintext) || authtag] and require that decryption must not reveal any information (plaintext or authtag) if the authtag is not correct. Therefore full buffers, matching the plaintext_len and authtag_len given to gcry_cipher_aead_init, have to be used. If the authentication check fails, the decrypt function clears the output buffer and the internal authtag buffer.

> Did you notice the new gcry_buffer_t ?

Yes, I did. Would it be better to add functions that do the AEAD encrypt/decrypt in a single go and use gcry_buffer_t? This would avoid having internal state machines in the AEAD modes and having to call different functions in the correct order.

 gcry_cipher_aead_encrypt (hd, gcry_buffer_t ct_buf, ct_len, gcry_buffer_t pt_buf, pt_len, gcry_buffer_t aad_buf, aad_len, nonce, nonce_len)

For CCM, the authentication tag length could be calculated from ct_len - pt_len.
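The reason all of these lengths must be supplied before any data is processed is that CCM bakes them into the very first CBC-MAC input block, B0 (RFC 3610, section 2.2): the flags octet encodes the tag length, and the last L octets encode the message length. A minimal standalone sketch of that block construction (illustrative only; ccm_build_b0 is not a function from the patchset or from libgcrypt):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Build the first CBC-MAC input block B0 per RFC 3610, section 2.2.
   The tag length M and the message length l(m) are encoded into this
   block, which is why CCM must know both before MAC processing can
   start.  L is the size in octets of the length field; the nonce
   occupies the remaining 15 - L octets.  */
static void
ccm_build_b0 (uint8_t b0[16], const uint8_t *nonce, unsigned int L,
              unsigned int taglen, size_t msglen, int have_aad)
{
  unsigned int i;

  /* Flags octet: 64*Adata + 8*M' + L', with M' = (M-2)/2, L' = L-1. */
  b0[0] = (have_aad ? 0x40 : 0x00)
          | (((taglen - 2) / 2) << 3)
          | (L - 1);

  /* Octets 1 .. 15-L carry the nonce. */
  memcpy (b0 + 1, nonce, 15 - L);

  /* The last L octets carry l(m), big-endian. */
  for (i = 0; i < L; i++)
    b0[15 - i] = (uint8_t)((msglen >> (8 * i)) & 0xff);
}
```

For RFC 3610 Packet Vector #1 (13-octet nonce 00 00 00 03 02 01 00 A0 A1 A2 A3 A4 A5, L = 2, 8-octet tag, 23-octet message, AAD present) this yields B0 = 59 00 00 00 03 02 01 00 A0 A1 A2 A3 A4 A5 00 17, matching the vector.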
-Jussi > > > Salam-Shalom, > > Werner > From cvs at cvs.gnupg.org Tue Oct 15 09:10:34 2013 From: cvs at cvs.gnupg.org (by Werner Koch) Date: Tue, 15 Oct 2013 09:10:34 +0200 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-308-g537969f Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via 537969fbbb1104b8305a7edb331b7666d54eff2c (commit) via d3a605d7827b8a73ef844e9e5183590bd6b1389a (commit) via 5be2345ddec4147e535d5b039ee74f84bcacf9e4 (commit) via 0cd551faa775ad5309a40629ae30bf86b75fca09 (commit) from a951c061523e1c13f1358c9760fc3a9d787ab2d4 (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit 537969fbbb1104b8305a7edb331b7666d54eff2c Author: Werner Koch Date: Tue Oct 15 09:08:31 2013 +0200 ecc: Support use of Ed25519 with ECDSA. * src/cipher.h (PUBKEY_FLAG_ECDSA): New. * cipher/pubkey-util.c (_gcry_pk_util_parse_flaglist): Add flag "ecdsa". * cipher/ecc.c (verify_ecdsa, verify_eddsa): Remove some debug output. (ecc_generate, ecc_sign, ecc_verify): Support Ed25519 with ECDSA. * tests/keygen.c (check_ecc_keys): Create such a test key. * tests/pubkey.c (fail, info, data_from_hex, extract_cmp_data): New. Take from dsa-6979.c (check_ed25519ecdsa_sample_key): new. (main): Call new test. 
Signed-off-by: Werner Koch diff --git a/cipher/ecc.c b/cipher/ecc.c index da384e8..3b75fea 100644 --- a/cipher/ecc.c +++ b/cipher/ecc.c @@ -558,13 +558,10 @@ verify_ecdsa (gcry_mpi_t input, ECC_public_key *pkey, log_mpidump (" x", x); log_mpidump (" r", r); log_mpidump (" s", s); - log_debug ("ecc verify: Not verified\n"); } err = GPG_ERR_BAD_SIGNATURE; goto leave; } - if (DBG_CIPHER) - log_debug ("ecc verify: Accepted\n"); leave: _gcry_mpi_ec_free (ctx); @@ -1208,14 +1205,10 @@ verify_eddsa (gcry_mpi_t input, ECC_public_key *pkey, goto leave; if (tlen != rlen || memcmp (tbuf, rbuf, tlen)) { - if (DBG_CIPHER) - log_debug ("eddsa verify: Not verified\n"); rc = GPG_ERR_BAD_SIGNATURE; goto leave; } - if (DBG_CIPHER) - log_debug ("eddsa verify: Accepted\n"); rc = 0; leave: @@ -1250,10 +1243,12 @@ ecc_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey) gcry_random_level_t random_level; mpi_ec_t ctx = NULL; gcry_sexp_t curve_info = NULL; + gcry_sexp_t curve_flags = NULL; gcry_mpi_t base = NULL; gcry_mpi_t public = NULL; gcry_mpi_t secret = NULL; int flags = 0; + int ed25519_with_ecdsa = 0; memset (&E, 0, sizeof E); memset (&sk, 0, sizeof sk); @@ -1328,7 +1323,13 @@ ecc_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey) rc = nist_generate_key (&sk, &E, ctx, random_level, nbits); break; case ECC_DIALECT_ED25519: - rc = eddsa_generate_key (&sk, &E, ctx, random_level); + if ((flags & PUBKEY_FLAG_ECDSA)) + { + ed25519_with_ecdsa = 1; + rc = nist_generate_key (&sk, &E, ctx, random_level, nbits); + } + else + rc = eddsa_generate_key (&sk, &E, ctx, random_level); break; default: rc = GPG_ERR_INTERNAL; @@ -1341,7 +1342,7 @@ ecc_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey) if (_gcry_mpi_ec_get_affine (x, y, &sk.E.G, ctx)) log_fatal ("ecgen: Failed to get affine coordinates for %s\n", "G"); base = _gcry_ecc_ec2os (x, y, sk.E.p); - if (sk.E.dialect == ECC_DIALECT_ED25519) + if (sk.E.dialect == ECC_DIALECT_ED25519 && !ed25519_with_ecdsa) { unsigned char 
*encpk; unsigned int encpklen; @@ -1367,16 +1368,23 @@ ecc_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey) goto leave; } + if (ed25519_with_ecdsa) + { + rc = gcry_sexp_build (&curve_info, NULL, "(flags ecdsa)"); + if (rc) + goto leave; + } + rc = gcry_sexp_build (r_skey, NULL, "(key-data" " (public-key" - " (ecc%S(p%m)(a%m)(b%m)(g%m)(n%m)(q%m)))" + " (ecc%S%S(p%m)(a%m)(b%m)(g%m)(n%m)(q%m)))" " (private-key" - " (ecc%S(p%m)(a%m)(b%m)(g%m)(n%m)(q%m)(d%m)))" + " (ecc%S%S(p%m)(a%m)(b%m)(g%m)(n%m)(q%m)(d%m)))" " )", - curve_info, + curve_info, curve_flags, sk.E.p, sk.E.a, sk.E.b, base, sk.E.n, public, - curve_info, + curve_info, curve_flags, sk.E.p, sk.E.a, sk.E.b, base, sk.E.n, public, secret); if (rc) goto leave; @@ -1390,6 +1398,8 @@ ecc_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey) log_printmpi ("ecgen result n", sk.E.n); log_printmpi ("ecgen result Q", public); log_printmpi ("ecgen result d", secret); + if (ed25519_with_ecdsa) + log_debug ("ecgen result using Ed25519/ECDSA\n"); } leave: @@ -1580,9 +1590,11 @@ ecc_sign (gcry_sexp_t *r_sig, gcry_sexp_t s_data, gcry_sexp_t keyparms) } if (DBG_CIPHER) { - log_debug ("ecc_sign info: %s/%s\n", + log_debug ("ecc_sign info: %s/%s%s\n", _gcry_ecc_model2str (sk.E.model), - _gcry_ecc_dialect2str (sk.E.dialect)); + _gcry_ecc_dialect2str (sk.E.dialect), + (sk.E.dialect == ECC_DIALECT_ED25519 + && (ctx.flags & PUBKEY_FLAG_ECDSA))? "ECDSA":""); if (sk.E.name) log_debug ("ecc_sign name: %s\n", sk.E.name); log_printmpi ("ecc_sign p", sk.E.p); @@ -1733,9 +1745,11 @@ ecc_verify (gcry_sexp_t s_sig, gcry_sexp_t s_data, gcry_sexp_t s_keyparms) if (DBG_CIPHER) { - log_debug ("ecc_verify info: %s/%s\n", + log_debug ("ecc_verify info: %s/%s%s\n", _gcry_ecc_model2str (pk.E.model), - _gcry_ecc_dialect2str (pk.E.dialect)); + _gcry_ecc_dialect2str (pk.E.dialect), + (pk.E.dialect == ECC_DIALECT_ED25519 + && !(sigflags & PUBKEY_FLAG_EDDSA))? 
"/ECDSA":""); if (pk.E.name) log_debug ("ecc_verify name: %s\n", pk.E.name); log_printmpi ("ecc_verify p", pk.E.p); diff --git a/cipher/pubkey-util.c b/cipher/pubkey-util.c index 3dfc027..caf715e 100644 --- a/cipher/pubkey-util.c +++ b/cipher/pubkey-util.c @@ -75,6 +75,10 @@ _gcry_pk_util_parse_flaglist (gcry_sexp_t list, encoding = PUBKEY_ENC_RAW; flags |= PUBKEY_FLAG_EDDSA; } + else if (n == 5 && !memcmp (s, "ecdsa", 5)) + { + flags |= PUBKEY_FLAG_ECDSA; + } else if (n == 3 && !memcmp (s, "raw", 3) && encoding == PUBKEY_ENC_UNKNOWN) { diff --git a/src/cipher.h b/src/cipher.h index b3469e5..077af98 100644 --- a/src/cipher.h +++ b/src/cipher.h @@ -28,14 +28,15 @@ #define PUBKEY_FLAG_NO_BLINDING (1 << 0) #define PUBKEY_FLAG_RFC6979 (1 << 1) -#define PUBKEY_FLAG_EDDSA (1 << 2) -#define PUBKEY_FLAG_FIXEDLEN (1 << 3) -#define PUBKEY_FLAG_LEGACYRESULT (1 << 4) -#define PUBKEY_FLAG_RAW_FLAG (1 << 5) -#define PUBKEY_FLAG_TRANSIENT_KEY (1 << 6) -#define PUBKEY_FLAG_USE_X931 (1 << 7) -#define PUBKEY_FLAG_USE_FIPS186 (1 << 8) -#define PUBKEY_FLAG_USE_FIPS186_2 (1 << 9) +#define PUBKEY_FLAG_FIXEDLEN (1 << 2) +#define PUBKEY_FLAG_LEGACYRESULT (1 << 3) +#define PUBKEY_FLAG_RAW_FLAG (1 << 4) +#define PUBKEY_FLAG_TRANSIENT_KEY (1 << 5) +#define PUBKEY_FLAG_USE_X931 (1 << 6) +#define PUBKEY_FLAG_USE_FIPS186 (1 << 7) +#define PUBKEY_FLAG_USE_FIPS186_2 (1 << 8) +#define PUBKEY_FLAG_ECDSA (1 << 9) +#define PUBKEY_FLAG_EDDSA (1 << 10) enum pk_operation diff --git a/tests/keygen.c b/tests/keygen.c index b955116..2b98c42 100644 --- a/tests/keygen.c +++ b/tests/keygen.c @@ -394,6 +394,23 @@ check_ecc_keys (void) gcry_sexp_release (key); } + + if (verbose) + show ("creating ECC key using curve Ed25519 for ECDSA\n"); + rc = gcry_sexp_build (&keyparm, NULL, + "(genkey(ecc(curve Ed25519)(flags ecdsa)))"); + if (rc) + die ("error creating S-expression: %s\n", gpg_strerror (rc)); + rc = gcry_pk_genkey (&key, keyparm); + gcry_sexp_release (keyparm); + if (rc) + die ("error generating ECC key 
using curve Ed25519 for ECDSA: %s\n", + gpg_strerror (rc)); + + if (verbose > 1) + show_sexp ("ECC key:\n", key); + + gcry_sexp_release (key); } diff --git a/tests/pubkey.c b/tests/pubkey.c index baf234c..4dadf88 100644 --- a/tests/pubkey.c +++ b/tests/pubkey.c @@ -28,6 +28,18 @@ #include "../src/gcrypt-int.h" +#define my_isascii(c) (!((c) & 0x80)) +#define digitp(p) (*(p) >= '0' && *(p) <= '9') +#define hexdigitp(a) (digitp (a) \ + || (*(a) >= 'A' && *(a) <= 'F') \ + || (*(a) >= 'a' && *(a) <= 'f')) +#define xtoi_1(p) (*(p) <= '9'? (*(p)- '0'): \ + *(p) <= 'F'? (*(p)-'A'+10):(*(p)-'a'+10)) +#define xtoi_2(p) ((xtoi_1(p) * 16) + xtoi_1((p)+1)) +#define DIM(v) (sizeof(v)/sizeof((v)[0])) +#define DIMof(type,member) DIM(((type *)0)->member) + + /* Sample RSA keys, taken from basic.c. */ static const char sample_private_key_1[] = @@ -101,6 +113,7 @@ static const char sample_public_key_1[] = static int verbose; +static int error_count; static void die (const char *format, ...) @@ -116,6 +129,27 @@ die (const char *format, ...) } static void +fail (const char *format, ...) +{ + va_list arg_ptr; + + va_start (arg_ptr, format); + vfprintf (stderr, format, arg_ptr); + va_end (arg_ptr); + error_count++; +} + +static void +info (const char *format, ...) +{ + va_list arg_ptr; + + va_start (arg_ptr, format); + vfprintf (stderr, format, arg_ptr); + va_end (arg_ptr); +} + +static void show_sexp (const char *prefix, gcry_sexp_t a) { char *buf; @@ -132,6 +166,59 @@ show_sexp (const char *prefix, gcry_sexp_t a) } +/* Convert STRING consisting of hex characters into its binary + representation and return it as an allocated buffer. The valid + length of the buffer is returned at R_LENGTH. The string is + delimited by end of string. The function returns NULL on + error. 
*/ +static void * +data_from_hex (const char *string, size_t *r_length) +{ + const char *s; + unsigned char *buffer; + size_t length; + + buffer = gcry_xmalloc (strlen(string)/2+1); + length = 0; + for (s=string; *s; s +=2 ) + { + if (!hexdigitp (s) || !hexdigitp (s+1)) + die ("error parsing hex string `%s'\n", string); + ((unsigned char*)buffer)[length++] = xtoi_2 (s); + } + *r_length = length; + return buffer; +} + + +static void +extract_cmp_data (gcry_sexp_t sexp, const char *name, const char *expected) +{ + gcry_sexp_t l1; + const void *a; + size_t alen; + void *b; + size_t blen; + + l1 = gcry_sexp_find_token (sexp, name, 0); + a = gcry_sexp_nth_data (l1, 1, &alen); + b = data_from_hex (expected, &blen); + if (!a) + fail ("parameter \"%s\" missing in key\n", name); + else if ( alen != blen || memcmp (a, b, alen) ) + { + fail ("parameter \"%s\" does not match expected value\n", name); + if (verbose) + { + info ("expected: %s\n", expected); + show_sexp ("sexp: ", sexp); + } + } + gcry_free (b); + gcry_sexp_release (l1); +} + + static void check_keys_crypt (gcry_sexp_t pkey, gcry_sexp_t skey, gcry_sexp_t plain0, gpg_err_code_t decrypt_fail_code) @@ -939,6 +1026,85 @@ check_ecc_sample_key (void) } +static void +check_ed25519ecdsa_sample_key (void) +{ + static const char ecc_private_key[] = + "(private-key\n" + " (ecc\n" + " (curve \"Ed25519\")\n" + " (q #044C056555BE4084BB3D8D8895FDF7C2893DFE0256251923053010977D12658321" + " 156D1ADDC07987713A418783658B476358D48D582DB53233D9DED3C1C2577B04#)" + " (d #09A0C38E0F1699073541447C19DA12E3A07A7BFDB0C186E4AC5BCE6F23D55252#)" + "))"; + static const char ecc_private_key_wo_q[] = + "(private-key\n" + " (ecc\n" + " (curve \"Ed25519\")\n" + " (d #09A0C38E0F1699073541447C19DA12E3A07A7BFDB0C186E4AC5BCE6F23D55252#)" + "))"; + static const char ecc_public_key[] = + "(public-key\n" + " (ecc\n" + " (curve \"Ed25519\")\n" + " (q #044C056555BE4084BB3D8D8895FDF7C2893DFE0256251923053010977D12658321" + " 
156D1ADDC07987713A418783658B476358D48D582DB53233D9DED3C1C2577B04#)" + "))"; + static const char hash_string[] = + "(data (flags ecdsa rfc6979)\n" + " (hash sha256 #00112233445566778899AABBCCDDEEFF" + /* */ "000102030405060708090A0B0C0D0E0F#))"; + + gpg_error_t err; + gcry_sexp_t key, hash, sig; + + if (verbose) + fprintf (stderr, "Checking sample Ed25519/ECDSA key.\n"); + + if ((err = gcry_sexp_new (&hash, hash_string, 0, 1))) + die ("line %d: %s", __LINE__, gpg_strerror (err)); + + if ((err = gcry_sexp_new (&key, ecc_private_key, 0, 1))) + die ("line %d: %s", __LINE__, gpg_strerror (err)); + + if ((err = gcry_pk_sign (&sig, hash, key))) + die ("gcry_pk_sign failed: %s", gpg_strerror (err)); + + gcry_sexp_release (key); + if ((err = gcry_sexp_new (&key, ecc_public_key, 0, 1))) + die ("line %d: %s", __LINE__, gpg_strerror (err)); + + if ((err = gcry_pk_verify (sig, hash, key))) + die ("gcry_pk_verify failed: %s", gpg_strerror (err)); + + /* Now try signing without the Q parameter. */ + + gcry_sexp_release (key); + if ((err = gcry_sexp_new (&key, ecc_private_key_wo_q, 0, 1))) + die ("line %d: %s", __LINE__, gpg_strerror (err)); + + gcry_sexp_release (sig); + if ((err = gcry_pk_sign (&sig, hash, key))) + die ("gcry_pk_sign without Q failed: %s", gpg_strerror (err)); + + gcry_sexp_release (key); + if ((err = gcry_sexp_new (&key, ecc_public_key, 0, 1))) + die ("line %d: %s", __LINE__, gpg_strerror (err)); + + if ((err = gcry_pk_verify (sig, hash, key))) + die ("gcry_pk_verify signed without Q failed: %s", gpg_strerror (err)); + + extract_cmp_data (sig, "r", ("a63123a783ef29b8276e08987daca4" + "655d0179e22199bf63691fd88eb64e15")); + extract_cmp_data (sig, "s", ("0d9b45c696ab90b96b08812b485df185" + "623ddaf5d02fa65ca5056cb6bd0f16f1")); + + gcry_sexp_release (sig); + gcry_sexp_release (key); + gcry_sexp_release (hash); +} + + int main (int argc, char **argv) { @@ -969,6 +1135,7 @@ main (int argc, char **argv) check_x931_derived_key (i); check_ecc_sample_key (); + 
check_ed25519ecdsa_sample_key (); - return 0; + return !!error_count; } commit d3a605d7827b8a73ef844e9e5183590bd6b1389a Author: Werner Koch Date: Mon Oct 14 19:48:10 2013 +0200 pubkey: Support flags list in gcry_pk_genkey. * src/cipher.h (PUBKEY_FLAG_TRANSIENT_KEY): New. (PUBKEY_FLAG_USE_X931): New. (PUBKEY_FLAG_USE_FIPS186): New. (PUBKEY_FLAG_USE_FIPS186_2): New. * cipher/pubkey-util.c (_gcry_pk_util_parse_flaglist): Rename from parse_flags_list. Parse new flags. * cipher/dsa.c (dsa_generate): Support flag list. * cipher/ecc.c (ecc_generate): Ditto. * cipher/rsa.c (rsa_generate): Ditto. Signed-off-by: Werner Koch diff --git a/cipher/dsa.c b/cipher/dsa.c index f86ff15..e43bdf4 100644 --- a/cipher/dsa.c +++ b/cipher/dsa.c @@ -710,9 +710,7 @@ dsa_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey) gcry_sexp_t deriveparms = NULL; gcry_sexp_t seedinfo = NULL; gcry_sexp_t misc_info = NULL; - int transient_key = 0; - int use_fips186_2 = 0; - int use_fips186 = 0; + int flags = 0; dsa_domain_t domain; gcry_mpi_t *factors = NULL; @@ -723,6 +721,16 @@ dsa_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey) if (rc) return rc; + /* Parse the optional flags list. */ + l1 = gcry_sexp_find_token (genparms, "flags", 0); + if (l1) + { + rc = _gcry_pk_util_parse_flaglist (l1, &flags, NULL); + gcry_sexp_release (l1); + if (rc) + return rc;\ + } + /* Parse the optional qbits element. */ l1 = gcry_sexp_find_token (genparms, "qbits", 0); if (l1) @@ -744,28 +752,37 @@ dsa_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey) } /* Parse the optional transient-key flag. */ - l1 = gcry_sexp_find_token (genparms, "transient-key", 0); - if (l1) + if (!(flags & PUBKEY_FLAG_TRANSIENT_KEY)) { - transient_key = 1; - gcry_sexp_release (l1); + l1 = gcry_sexp_find_token (genparms, "transient-key", 0); + if (l1) + { + flags |= PUBKEY_FLAG_TRANSIENT_KEY; + gcry_sexp_release (l1); + } } /* Get the optional derive parameters. 
*/ deriveparms = gcry_sexp_find_token (genparms, "derive-parms", 0); /* Parse the optional "use-fips186" flags. */ - l1 = gcry_sexp_find_token (genparms, "use-fips186", 0); - if (l1) + if (!(flags & PUBKEY_FLAG_USE_FIPS186)) { - use_fips186 = 1; - gcry_sexp_release (l1); + l1 = gcry_sexp_find_token (genparms, "use-fips186", 0); + if (l1) + { + flags |= PUBKEY_FLAG_USE_FIPS186; + gcry_sexp_release (l1); + } } - l1 = gcry_sexp_find_token (genparms, "use-fips186-2", 0); - if (l1) + if (!(flags & PUBKEY_FLAG_USE_FIPS186_2)) { - use_fips186_2 = 1; - gcry_sexp_release (l1); + l1 = gcry_sexp_find_token (genparms, "use-fips186-2", 0); + if (l1) + { + flags |= PUBKEY_FLAG_USE_FIPS186_2; + gcry_sexp_release (l1); + } } /* Check whether domain parameters are given. */ @@ -809,14 +826,18 @@ dsa_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey) qbits = mpi_get_nbits (domain.q); } - if (deriveparms || use_fips186 || use_fips186_2 || fips_mode ()) + if (deriveparms + || (flags & PUBKEY_FLAG_USE_FIPS186) + || (flags & PUBKEY_FLAG_USE_FIPS186_2) + || fips_mode ()) { int counter; void *seed; size_t seedlen; gcry_mpi_t h_value; - rc = generate_fips186 (&sk, nbits, qbits, deriveparms, use_fips186_2, + rc = generate_fips186 (&sk, nbits, qbits, deriveparms, + !!(flags & PUBKEY_FLAG_USE_FIPS186_2), &domain, &counter, &seed, &seedlen, &h_value); if (!rc && h_value) @@ -832,7 +853,9 @@ dsa_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey) } else { - rc = generate (&sk, nbits, qbits, transient_key, &domain, &factors); + rc = generate (&sk, nbits, qbits, + !!(flags & PUBKEY_FLAG_TRANSIENT_KEY), + &domain, &factors); } if (!rc) diff --git a/cipher/ecc.c b/cipher/ecc.c index bd4d253..da384e8 100644 --- a/cipher/ecc.c +++ b/cipher/ecc.c @@ -1247,13 +1247,13 @@ ecc_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey) gcry_mpi_t y = NULL; char *curve_name = NULL; gcry_sexp_t l1; - int transient_key = 0; gcry_random_level_t random_level; mpi_ec_t ctx = NULL; gcry_sexp_t 
curve_info = NULL; gcry_mpi_t base = NULL; gcry_mpi_t public = NULL; gcry_mpi_t secret = NULL; + int flags = 0; memset (&E, 0, sizeof E); memset (&sk, 0, sizeof sk); @@ -1276,10 +1276,20 @@ ecc_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey) l1 = gcry_sexp_find_token (genparms, "transient-key", 0); if (l1) { - transient_key = 1; + flags |= PUBKEY_FLAG_TRANSIENT_KEY; gcry_sexp_release (l1); } + /* Parse the optional flags list. */ + l1 = gcry_sexp_find_token (genparms, "flags", 0); + if (l1) + { + rc = _gcry_pk_util_parse_flaglist (l1, &flags, NULL); + gcry_sexp_release (l1); + if (rc) + goto leave; + } + /* NBITS is required if no curve name has been given. */ if (!nbits && !curve_name) return GPG_ERR_NO_OBJ; /* No NBITS parameter. */ @@ -1303,7 +1313,11 @@ ecc_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey) log_printpnt ("ecgen curve G", &E.G, NULL); } - random_level = transient_key ? GCRY_STRONG_RANDOM : GCRY_VERY_STRONG_RANDOM; + if ((flags & PUBKEY_FLAG_TRANSIENT_KEY)) + random_level = GCRY_STRONG_RANDOM; + else + random_level = GCRY_VERY_STRONG_RANDOM; + ctx = _gcry_mpi_ec_p_internal_new (E.model, E.dialect, E.p, E.a, E.b); x = mpi_new (0); y = mpi_new (0); diff --git a/cipher/pubkey-internal.h b/cipher/pubkey-internal.h index 7e3667e..cb2721d 100644 --- a/cipher/pubkey-internal.h +++ b/cipher/pubkey-internal.h @@ -21,6 +21,9 @@ #define GCRY_PUBKEY_INTERNAL_H /*-- pubkey-util.c --*/ +gpg_err_code_t _gcry_pk_util_parse_flaglist (gcry_sexp_t list, + int *r_flags, + enum pk_encoding *r_encoding); gpg_err_code_t _gcry_pk_util_get_nbits (gcry_sexp_t list, unsigned int *r_nbits); gpg_err_code_t _gcry_pk_util_get_rsa_use_e (gcry_sexp_t list, diff --git a/cipher/pubkey-util.c b/cipher/pubkey-util.c index 52d69cf..3dfc027 100644 --- a/cipher/pubkey-util.c +++ b/cipher/pubkey-util.c @@ -50,9 +50,9 @@ pss_verify_cmp (void *opaque, gcry_mpi_t tmp) R_ENCODING and the flags are stored at R_FLAGS. if any of them is not needed, NULL may be passed. 
The function returns 0 on success or an error code. */ -static gpg_err_code_t -parse_flag_list (gcry_sexp_t list, - int *r_flags, enum pk_encoding *r_encoding) +gpg_err_code_t +_gcry_pk_util_parse_flaglist (gcry_sexp_t list, + int *r_flags, enum pk_encoding *r_encoding) { gpg_err_code_t rc = 0; const char *s; @@ -101,6 +101,14 @@ parse_flag_list (gcry_sexp_t list, } else if (n == 11 && ! memcmp (s, "no-blinding", 11)) flags |= PUBKEY_FLAG_NO_BLINDING; + else if (n == 13 && ! memcmp (s, "transient-key", 13)) + flags |= PUBKEY_FLAG_TRANSIENT_KEY; + else if (n == 8 && ! memcmp (s, "use-x931", 8)) + flags |= PUBKEY_FLAG_USE_X931; + else if (n == 11 && ! memcmp (s, "use-fips186", 11)) + flags |= PUBKEY_FLAG_USE_FIPS186; + else if (n == 13 && ! memcmp (s, "use-fips186-2", 13)) + flags |= PUBKEY_FLAG_USE_FIPS186_2; else rc = GPG_ERR_INV_FLAG; } @@ -524,7 +532,7 @@ _gcry_pk_util_preparse_encval (gcry_sexp_t sexp, const char **algo_names, const char *s; /* There is a flags element - process it. */ - rc = parse_flag_list (l2, &parsed_flags, &ctx->encoding); + rc = _gcry_pk_util_parse_flaglist (l2, &parsed_flags, &ctx->encoding); if (rc) goto leave; if (ctx->encoding == PUBKEY_ENC_PSS) @@ -701,12 +709,13 @@ _gcry_pk_util_data_to_mpi (gcry_sexp_t input, gcry_mpi_t *ret_mpi, return *ret_mpi ? GPG_ERR_NO_ERROR : GPG_ERR_INV_OBJ; } - /* see whether there is a flags object */ + /* See whether there is a flags list. 
*/ { gcry_sexp_t lflags = gcry_sexp_find_token (ldata, "flags", 0); if (lflags) { - if (parse_flag_list (lflags, &parsed_flags, &ctx->encoding)) + if (_gcry_pk_util_parse_flaglist (lflags, + &parsed_flags, &ctx->encoding)) unknown_flag = 1; gcry_sexp_release (lflags); } diff --git a/cipher/rsa.c b/cipher/rsa.c index fc6bbe5..d4d2a0a 100644 --- a/cipher/rsa.c +++ b/cipher/rsa.c @@ -760,8 +760,7 @@ rsa_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey) unsigned long evalue; RSA_secret_key sk; gcry_sexp_t deriveparms; - int transient_key = 0; - int use_x931 = 0; + int flags = 0; gcry_sexp_t l1; gcry_sexp_t swap_info = NULL; @@ -775,6 +774,16 @@ rsa_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey) if (ec) return ec; + /* Parse the optional flags list. */ + l1 = gcry_sexp_find_token (genparms, "flags", 0); + if (l1) + { + ec = _gcry_pk_util_parse_flaglist (l1, &flags, NULL); + gcry_sexp_release (l1); + if (ec) + return ec; + } + deriveparms = (genparms? gcry_sexp_find_token (genparms, "derive-parms", 0) : NULL); if (!deriveparms) @@ -783,12 +792,12 @@ rsa_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey) l1 = gcry_sexp_find_token (genparms, "use-x931", 0); if (l1) { - use_x931 = 1; + flags |= PUBKEY_FLAG_USE_X931; gcry_sexp_release (l1); } } - if (deriveparms || use_x931 || fips_mode ()) + if (deriveparms || (flags & PUBKEY_FLAG_USE_X931) || fips_mode ()) { int swapped; ec = generate_x931 (&sk, nbits, evalue, deriveparms, &swapped); @@ -799,14 +808,18 @@ rsa_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey) else { /* Parse the optional "transient-key" flag. */ - l1 = gcry_sexp_find_token (genparms, "transient-key", 0); - if (l1) + if (!(flags & PUBKEY_FLAG_TRANSIENT_KEY)) { - transient_key = 1; - gcry_sexp_release (l1); + l1 = gcry_sexp_find_token (genparms, "transient-key", 0); + if (l1) + { + flags |= PUBKEY_FLAG_TRANSIENT_KEY; + gcry_sexp_release (l1); + } } /* Generate. 
*/ - ec = generate_std (&sk, nbits, evalue, transient_key); + ec = generate_std (&sk, nbits, evalue, + !!(flags & PUBKEY_FLAG_TRANSIENT_KEY)); } if (!ec) diff --git a/src/cipher.h b/src/cipher.h index 28f5070..b3469e5 100644 --- a/src/cipher.h +++ b/src/cipher.h @@ -32,6 +32,11 @@ #define PUBKEY_FLAG_FIXEDLEN (1 << 3) #define PUBKEY_FLAG_LEGACYRESULT (1 << 4) #define PUBKEY_FLAG_RAW_FLAG (1 << 5) +#define PUBKEY_FLAG_TRANSIENT_KEY (1 << 6) +#define PUBKEY_FLAG_USE_X931 (1 << 7) +#define PUBKEY_FLAG_USE_FIPS186 (1 << 8) +#define PUBKEY_FLAG_USE_FIPS186_2 (1 << 9) + enum pk_operation { commit 5be2345ddec4147e535d5b039ee74f84bcacf9e4 Author: Werner Koch Date: Mon Oct 14 10:21:53 2013 +0200 pubkey: Remove duplicated flag parsing code. * cipher/pubkey-util.c (_gcry_pk_util_preparse_encval) (_gcry_pk_util_data_to_mpi): Factor flag parsing code out to .. (parse_flag_list): New. * src/cipher.h (PUBKEY_FLAG_RAW_FLAG): New. -- A minor disadvantage of that code is that invalid flags are not anymore detected depending on the use. According to the documentation this is anyway the expected behavior. Signed-off-by: Werner Koch diff --git a/cipher/pubkey-util.c b/cipher/pubkey-util.c index 2838802..52d69cf 100644 --- a/cipher/pubkey-util.c +++ b/cipher/pubkey-util.c @@ -46,6 +46,74 @@ pss_verify_cmp (void *opaque, gcry_mpi_t tmp) } +/* Parser for a flag list. On return the encoding is stored at + R_ENCODING and the flags are stored at R_FLAGS. if any of them is + not needed, NULL may be passed. The function returns 0 on success + or an error code. 
*/ +static gpg_err_code_t +parse_flag_list (gcry_sexp_t list, + int *r_flags, enum pk_encoding *r_encoding) +{ + gpg_err_code_t rc = 0; + const char *s; + size_t n; + int i; + int encoding = PUBKEY_ENC_UNKNOWN; + int flags = 0; + + for (i=list?gcry_sexp_length (list)-1:0; i > 0; i--) + { + s = gcry_sexp_nth_data (list, i, &n); + if (!s) + ; /* not a data element*/ + else if (n == 7 && !memcmp (s, "rfc6979", 7)) + { + flags |= PUBKEY_FLAG_RFC6979; + } + else if (n == 5 && !memcmp (s, "eddsa", 5)) + { + encoding = PUBKEY_ENC_RAW; + flags |= PUBKEY_FLAG_EDDSA; + } + else if (n == 3 && !memcmp (s, "raw", 3) + && encoding == PUBKEY_ENC_UNKNOWN) + { + encoding = PUBKEY_ENC_RAW; + flags |= PUBKEY_FLAG_RAW_FLAG; /* Explicitly given. */ + } + else if (n == 5 && !memcmp (s, "pkcs1", 5) + && encoding == PUBKEY_ENC_UNKNOWN) + { + encoding = PUBKEY_ENC_PKCS1; + flags |= PUBKEY_FLAG_FIXEDLEN; + } + else if (n == 4 && !memcmp (s, "oaep", 4) + && encoding == PUBKEY_ENC_UNKNOWN) + { + encoding = PUBKEY_ENC_OAEP; + flags |= PUBKEY_FLAG_FIXEDLEN; + } + else if (n == 3 && !memcmp (s, "pss", 3) + && encoding == PUBKEY_ENC_UNKNOWN) + { + encoding = PUBKEY_ENC_PSS; + flags |= PUBKEY_FLAG_FIXEDLEN; + } + else if (n == 11 && ! memcmp (s, "no-blinding", 11)) + flags |= PUBKEY_FLAG_NO_BLINDING; + else + rc = GPG_ERR_INV_FLAG; + } + + if (r_flags) + *r_flags = flags; + if (r_encoding) + *r_encoding = encoding; + + return rc; +} + + static int get_hash_algo (const char *s, size_t n) { @@ -453,36 +521,16 @@ _gcry_pk_util_preparse_encval (gcry_sexp_t sexp, const char **algo_names, if (!strcmp (name, "flags")) { - /* There is a flags element - process it. */ const char *s; - for (i = gcry_sexp_length (l2) - 1; i > 0; i--) + /* There is a flags element - process it. */ + rc = parse_flag_list (l2, &parsed_flags, &ctx->encoding); + if (rc) + goto leave; + if (ctx->encoding == PUBKEY_ENC_PSS) { - s = gcry_sexp_nth_data (l2, i, &n); - if (! s) - ; /* Not a data element - ignore. 
*/ - else if (n == 3 && !memcmp (s, "raw", 3) - && ctx->encoding == PUBKEY_ENC_UNKNOWN) - ctx->encoding = PUBKEY_ENC_RAW; - else if (n == 5 && !memcmp (s, "pkcs1", 5) - && ctx->encoding == PUBKEY_ENC_UNKNOWN) - ctx->encoding = PUBKEY_ENC_PKCS1; - else if (n == 4 && !memcmp (s, "oaep", 4) - && ctx->encoding == PUBKEY_ENC_UNKNOWN) - ctx->encoding = PUBKEY_ENC_OAEP; - else if (n == 3 && !memcmp (s, "pss", 3) - && ctx->encoding == PUBKEY_ENC_UNKNOWN) - { - rc = GPG_ERR_CONFLICT; - goto leave; - } - else if (n == 11 && !memcmp (s, "no-blinding", 11)) - parsed_flags |= PUBKEY_FLAG_NO_BLINDING; - else - { - rc = GPG_ERR_INV_FLAG; - goto leave; - } + rc = GPG_ERR_CONFLICT; + goto leave; } /* Get the OAEP parameters HASH-ALGO and LABEL, if any. */ @@ -640,12 +688,10 @@ _gcry_pk_util_data_to_mpi (gcry_sexp_t input, gcry_mpi_t *ret_mpi, { gcry_err_code_t rc = 0; gcry_sexp_t ldata, lhash, lvalue; - int i; size_t n; const char *s; int unknown_flag = 0; int parsed_flags = 0; - int explicit_raw = 0; *ret_mpi = NULL; ldata = gcry_sexp_find_token (input, "data", 0); @@ -659,48 +705,9 @@ _gcry_pk_util_data_to_mpi (gcry_sexp_t input, gcry_mpi_t *ret_mpi, { gcry_sexp_t lflags = gcry_sexp_find_token (ldata, "flags", 0); if (lflags) - { /* parse the flags list. 
*/ - for (i=gcry_sexp_length (lflags)-1; i > 0; i--) - { - s = gcry_sexp_nth_data (lflags, i, &n); - if (!s) - ; /* not a data element*/ - else if (n == 7 && !memcmp (s, "rfc6979", 7)) - parsed_flags |= PUBKEY_FLAG_RFC6979; - else if (n == 5 && !memcmp (s, "eddsa", 5)) - { - ctx->encoding = PUBKEY_ENC_RAW; - parsed_flags |= PUBKEY_FLAG_EDDSA; - } - else if ( n == 3 && !memcmp (s, "raw", 3) - && ctx->encoding == PUBKEY_ENC_UNKNOWN) - { - ctx->encoding = PUBKEY_ENC_RAW; - explicit_raw = 1; - } - else if ( n == 5 && !memcmp (s, "pkcs1", 5) - && ctx->encoding == PUBKEY_ENC_UNKNOWN) - { - ctx->encoding = PUBKEY_ENC_PKCS1; - parsed_flags |= PUBKEY_FLAG_FIXEDLEN; - } - else if ( n == 4 && !memcmp (s, "oaep", 4) - && ctx->encoding == PUBKEY_ENC_UNKNOWN) - { - ctx->encoding = PUBKEY_ENC_OAEP; - parsed_flags |= PUBKEY_FLAG_FIXEDLEN; - } - else if ( n == 3 && !memcmp (s, "pss", 3) - && ctx->encoding == PUBKEY_ENC_UNKNOWN) - { - ctx->encoding = PUBKEY_ENC_PSS; - parsed_flags |= PUBKEY_FLAG_FIXEDLEN; - } - else if (n == 11 && ! memcmp (s, "no-blinding", 11)) - parsed_flags |= PUBKEY_FLAG_NO_BLINDING; - else - unknown_flag = 1; - } + { + if (parse_flag_list (lflags, &parsed_flags, &ctx->encoding)) + unknown_flag = 1; gcry_sexp_release (lflags); } } @@ -773,7 +780,8 @@ _gcry_pk_util_data_to_mpi (gcry_sexp_t input, gcry_mpi_t *ret_mpi, *ret_mpi = gcry_mpi_set_opaque (NULL, value, valuelen*8); } else if (ctx->encoding == PUBKEY_ENC_RAW && lhash - && (explicit_raw || (parsed_flags & PUBKEY_FLAG_RFC6979))) + && ((parsed_flags & PUBKEY_FLAG_RAW_FLAG) + || (parsed_flags & PUBKEY_FLAG_RFC6979))) { /* Raw encoding along with a hash element. This is commonly used for DSA. For better backward error compatibility we diff --git a/doc/gcrypt.texi b/doc/gcrypt.texi index 4585a32..79d4d74 100644 --- a/doc/gcrypt.texi +++ b/doc/gcrypt.texi @@ -2181,33 +2181,78 @@ or @code{oid.}. 
@section Cryptographic Functions @noindent -Note that we will in future allow to use keys without p,q and u -specified and may also support other parameters for performance -reasons. - -@noindent - -Some functions operating on S-expressions support `flags', that -influence the operation. These flags have to be listed in a -sub-S-expression named `flags'; the following flags are known: +Some functions operating on S-expressions support `flags' to influence +the operation. These flags have to be listed in a sub-S-expression +named `flags'. Flag names are case-sensitive. The following flags +are known: @table @code @item pkcs1 +@cindex PKCS1 Use PKCS#1 block type 2 padding for encryption, block type 1 padding for signing. + @item oaep +@cindex OAEP Use RSA-OAEP padding for encryption. + @item pss +@cindex PSS Use RSA-PSS padding for signing. + +@item ecdsa +@cindex ECDSA +Create an ECDSA public key instead of using the default key generation +of the specified curve. + @item eddsa -Use the EdDSA scheme instead of ECDSA. +@cindex EdDSA +Use the EdDSA scheme instead of the default signature algorithm of the +used curve. + @item rfc6979 +@cindex RFC6979 For DSA and ECDSA use a deterministic scheme for the k parameter. + @item no-blinding +@cindex no-blinding Do not use a technique called `blinding', which is used by default in order to prevent leaking of secret information. Blinding is only implemented by RSA, but it might be implemented by other algorithms in the future as well, when necessary. + +@item transient-key +@cindex transient-key +This flag is only meaningful for RSA, DSA, and ECC key generation. If +given the key is created using a faster and a somewhat less secure +random number generator. This flag may be used for keys which are +only used for a short time or per-message and do not require full +cryptographic strength.
+ + at item use-x931 + at cindex X9.31 +Force the use of the ANSI X9.31 key generation algorithm instead of +the default algorithm. This flag is only meaningful for RSA key +generation and usually not required. Note that this algorithm is +implicitly used if either @code{derive-parms} is given or Libgcrypt is +in FIPS mode. + + at item use-fips186 + at cindex FIPS 186 +Force the use of the FIPS 186 key generation algorithm instead of the +default algorithm. This flag is only meaningful for DSA and usually +not required. Note that this algorithm is implicitly used if either + at code{derive-parms} is given or Libgcrypt is in FIPS mode. As of now +FIPS 186-2 is implemented; after the approval of FIPS 186-3 the code +will be changed to implement 186-3. + + at item use-fips186-2 + at cindex FIPS 186-2 +Force the use of the FIPS 186-2 key generation algorithm instead of +the default algorithm. This algorithm is slighlty different from +FIPS 186-3 and allows only 1024 bit keys. This flag is only meaningful +for DSA and only required for FIPS testing backward compatibility. + @end table @noindent @@ -2641,10 +2686,10 @@ default selection Libgcrypt would have taken if @code{nbits} has been given. The available names are listed with the description of the ECC public key parameters. - at item rsa-use-e + at item rsa-use-e @var{value} This is only used with RSA to give a hint for the public exponent. The -value will be used as a base to test for a usable exponent. Some values -are special: + at var{value} will be used as a base to test for a usable exponent. Some +values are special: @table @samp @item 0 @@ -2662,10 +2707,10 @@ Use the given value. If this parameter is not used, Libgcrypt uses for historic reasons 65537. - at item qbits + at item qbits @var{n} This is only meanigful for DSA keys. If it is given the DSA key is -generated with a Q parameyer of this size. 
If it is not given or zero -Q is deduced from NBITS in this way: +generated with a Q parameyer of size @var{n} bits. If it is not given +or zero Q is deduced from NBITS in this way: @table @samp @item 512 <= N <= 1024 Q = 160 @@ -2682,14 +2727,7 @@ Note that in this case only the values for N, as given in the table, are allowed. When specifying Q all values of N in the range 512 to 15680 are valid as long as they are multiples of 8. - at item transient-key -This is only meaningful for RSA, DSA, and ECC keys. This is a flag -with no value. If given the key is created using a faster and a -somewhat less secure random number generator. This flag may be used -for keys which are only used for a short time or per-message and do -not require full cryptographic strength. - - at item domain + at item domain @var{list} This is only meaningful for DLP algorithms. If specified keys are generated with domain parameters taken from this list. The exact format of this parameter depends on the actual algorithm. It is @@ -2707,7 +2745,7 @@ currently only implemented for DSA using this format: @code{nbits} and @code{qbits} may not be specified because they are derived from the domain parameters. - at item derive-parms + at item derive-parms @var{list} This is currently only implemented for RSA and DSA keys. It is not allowed to use this together with a @code{domain} specification. If given, it is used to derive the keys using the given parameters. @@ -2745,28 +2783,25 @@ FIPS 186 algorithm is used even if libgcrypt is not in FIPS mode. @end example - at item use-x931 - at cindex X9.31 -Force the use of the ANSI X9.31 key generation algorithm instead of -the default algorithm. This flag is only meaningful for RSA and -usually not required. Note that this algorithm is implicitly used if -either @code{derive-parms} is given or Libgcrypt is in FIPS mode. - - at item use-fips186 - at cindex FIPS 186 -Force the use of the FIPS 186 key generation algorithm instead of the -default algorithm. 
This flag is only meaningful for DSA and usually -not required. Note that this algorithm is implicitly used if either -@code{derive-parms} is given or Libgcrypt is in FIPS mode. As of now -FIPS 186-2 is implemented; after the approval of FIPS 186-3 the code -will be changed to implement 186-3. +@item flags @var{flaglist} +This is the preferred way to define flags. @var{flaglist} may contain any +number of flags. See above for a specification of these flags. +Here is an example of how to create a key using curve Ed25519 with the +ECDSA signature algorithm. Note that the use of ECDSA with that curve +is in general not recommended. +@example +(genkey + (ecc + (flags transient-key ecdsa))) +@end example -@item use-fips186-2 -Force the use of the FIPS 186-2 key generation algorithm instead of -the default algorithm. This algorithm is slighlty different from -FIPS 186-3 and allows only 1024 bit keys. This flag is only meaningful -for DSA and only required for FIPS testing backward compatibility. +@item transient-key +@itemx use-x931 +@itemx use-fips186 +@itemx use-fips186-2 +These are deprecated ways to set a flag with that name; see above for +a description of each flag. @end table diff --git a/src/cipher.h b/src/cipher.h index 3b7744a..28f5070 100644 --- a/src/cipher.h +++ b/src/cipher.h @@ -31,6 +31,7 @@ #define PUBKEY_FLAG_EDDSA (1 << 2) #define PUBKEY_FLAG_FIXEDLEN (1 << 3) #define PUBKEY_FLAG_LEGACYRESULT (1 << 4) +#define PUBKEY_FLAG_RAW_FLAG (1 << 5) enum pk_operation { commit 0cd551faa775ad5309a40629ae30bf86b75fca09 Author: Werner Koch Date: Mon Oct 14 08:09:31 2013 +0200 mpicalc: Accept lowercase hex digits. * src/mpicalc.c (main): Test for lowercase hex digits. -- This change makes the use of Libgcrypt debug code easier.
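The widened character test that this mpicalc change introduces can be sketched as a one-line predicate. This is an illustrative Python mirror of the C condition in the patch, not the code itself (note that C's isdigit is ASCII-only, unlike Python's str.isdigit):

```python
def starts_hex_number(c):
    # Mirrors the patched test in src/mpicalc.c:
    #   isdigit (c) || (c >= 'A' && c <= 'F') || (c >= 'a' && c <= 'f')
    # i.e. decimal digits plus upper- AND lowercase hex digits.
    return c.isdigit() or 'A' <= c <= 'F' or 'a' <= c <= 'f'
```

Before the patch the lowercase range was missing, so a value starting with 'a'..'f' was not recognized as the beginning of a number.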
diff --git a/src/mpicalc.c b/src/mpicalc.c index dd85227..335b7c3 100644 --- a/src/mpicalc.c +++ b/src/mpicalc.c @@ -399,7 +399,9 @@ main (int argc, char **argv) case '-': if ((c = my_getc ()) == '-') do_dec (); - else if (isdigit (c) || (c >= 'A' && c <= 'F')) + else if (isdigit (c) + || (c >= 'A' && c <= 'F') + || (c >= 'a' && c <= 'f')) { state = 1; ungetc (c, stdin); ----------------------------------------------------------------------- Summary of changes: cipher/dsa.c | 59 +++++++++++----- cipher/ecc.c | 68 +++++++++++++------ cipher/pubkey-internal.h | 3 + cipher/pubkey-util.c | 167 +++++++++++++++++++++++++-------------------- cipher/rsa.c | 31 ++++++--- doc/gcrypt.texi | 125 ++++++++++++++++++++++------------ src/cipher.h | 13 +++- src/mpicalc.c | 4 +- tests/keygen.c | 17 +++++ tests/pubkey.c | 169 +++++++++++++++++++++++++++++++++++++++++++++- 10 files changed, 486 insertions(+), 170 deletions(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From wk at gnupg.org Tue Oct 15 15:49:44 2013 From: wk at gnupg.org (Werner Koch) Date: Tue, 15 Oct 2013 15:49:44 +0200 Subject: possible mpi-pow improvement In-Reply-To: <1380674512.3342.2.camel@cfw2.gniibe.org> (NIIBE Yutaka's message of "Wed, 02 Oct 2013 09:41:52 +0900") References: <1378456897.3188.14.camel@cfw2.gniibe.org> <87a9itb52z.fsf@vigenere.g10code.de> <1380674512.3342.2.camel@cfw2.gniibe.org> Message-ID: <87d2n6myo7.fsf@vigenere.g10code.de> On Wed, 2 Oct 2013 02:41, gniibe at fsij.org said: > Yes, at least on machine. 
And on mine ;-)

Before:

Algorithm       generate  100*sign  100*verify
------------------------------------------------
RSA 1024 bit       20ms     120ms      10ms
RSA 2048 bit      120ms     630ms      10ms
RSA 3072 bit      450ms    1780ms      10ms
RSA 4096 bit     4110ms    3700ms      30ms
DSA 1024/160        -        60ms      60ms
DSA 2048/224        -       210ms     200ms
DSA 3072/256        -       470ms     440ms

After:

Algorithm       generate  100*sign  100*verify
------------------------------------------------
RSA 1024 bit       10ms      80ms       0ms
RSA 2048 bit       60ms     370ms      10ms
RSA 3072 bit     1710ms    1060ms      10ms
RSA 4096 bit     1310ms    2220ms      20ms
DSA 1024/160        -        40ms      60ms
DSA 2048/224        -       140ms     200ms
DSA 3072/256        -       300ms     430ms

> With compilation option -DUSE_ALGORITHM_SIMPLE_EXPONENTIATION, we can
> use old implementation.
>
> Please test it out.

Please push your patch to master. Salam-Shalom, Werner -- Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz. From wk at gnupg.org Tue Oct 15 15:46:28 2013 From: wk at gnupg.org (Werner Koch) Date: Tue, 15 Oct 2013 15:46:28 +0200 Subject: [RFC PATCH 2/3] Add API for initializing AEAD modes In-Reply-To: <525BD36A.9010507@iki.fi> (Jussi Kivilinna's message of "Mon, 14 Oct 2013 14:20:10 +0300") References: <20131013100228.32014.526.stgit@localhost6.localdomain6> <20131013100233.32014.24561.stgit@localhost6.localdomain6> <87iox1phzg.fsf@vigenere.g10code.de> <525BD36A.9010507@iki.fi> Message-ID: <87hacimytn.fsf@vigenere.g10code.de> On Mon, 14 Oct 2013 13:20, jussi.kivilinna at iki.fi said:

> I based CCM patchset on the AEAD API patch Dmitry sent earlier for
> GCM. Since CCM has more restrictions (need to know data lengths in
> advance) than GCM, I added gcry_cipher_aead_init.

I forgot about this because I delayed that far too long. Sorry.

> With this patchset to encrypt a buffer using CCM, you'd first need to
> initialize/reset CCM state with:
>
> gcry_cipher_aead_init (hd, nonce_buf, nonce_len, authtag_len, plaintext_len)
>
> CCM needs tag and plaintext lengths for MAC initialization.
CCM also Up until now we use separate functions to set key, IV, and counter. This function changes this pattern. Why not reuse setiv for the nonce and add a function for the authentication tag? Is the plaintext length really required in advance? That is embarrassing in particular because the authenticated data is appended to the ciphertext. > gcry_cipher_authenticate (hd, aadbuf, aadbuflen) > > which does the actual MAC initialization. If aadbuflen == 0, then > above call can be omitted and gcry_cipher_(en|de)crypt will call > gcry_cipher_authenticate with zero length. What about extending this function to also take the authentication tag and, if the plaintext length is required for the MAC setup, also that length? That would group the information together. > NIST paper and RFC 3610 define CCM ciphertext as [ctr-enc(plaintext) > || authtag] and that decryption must not reveal any information > (plaintext or authtag) if authtag is not correct. Therefore full > buffers, matching with length of plaintext_len and authtag_len given > in gcry_cipher_aead_init, have to be used. If authentication check I don't see that. That is an implementation detail and the requirement could also be achieved by putting it into the security policy. > Would it be better to add functions to do AEAD encrypt/decrypt in > single go and use gcry_buffer_t? This would avoid having internal > state machines in AEAD modes and having to call different functions in > correct order. And it also means that you can't use that mode with large amounts of data. Not a good idea; there are still lots of platforms with a quite limited amount of memory but in need to encrypt large data taken from somewhere. Yes, the need for the plaintext length makes it hard to use but I can imagine systems which know that length in advance even if the data does not fit into memory. I mentioned gcry_buffer_t having the AAD in mind; a scatter/gather style operation may be useful here.
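On the question whether the plaintext length is really needed in advance: for CCM it is, because RFC 3610 (cited above) encodes both the message length and the tag length into the very first CBC-MAC block B_0, before any data is processed. A rough Python sketch of that block layout, purely illustrative and independent of any Libgcrypt API (the function name and argument order are mine):

```python
def ccm_b0(nonce, msg_len, tag_len, have_aad):
    # RFC 3610: the length field occupies L octets, the nonce the
    # remaining 15 - L octets of the 16-byte block B_0.
    L = 15 - len(nonce)
    assert 2 <= L <= 8, "nonce must be 7..13 bytes"
    assert tag_len in (4, 6, 8, 10, 12, 14, 16)
    # Flags octet: bit 6 = Adata, bits 5..3 = (M - 2) / 2, bits 2..0 = L - 1.
    flags = (0x40 if have_aad else 0x00) | ((tag_len - 2) // 2) << 3 | (L - 1)
    return bytes([flags]) + nonce + msg_len.to_bytes(L, "big")
```

Since msg_len is baked into B_0, the MAC over the data cannot even start before the total plaintext length is known, which is why the proposed gcry_cipher_aead_init takes plaintext_len up front.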
We should also keep in mind that adding support for OCB is desirable. I had in mind to require the use of a new GCRYCTL_ENABLE_GPL_CODE to state that Libgcrypt is used under the terms of the GPL. However, meanwhile Phil Rogaway relaxed the requirement for royalty free patent licensing to all free software licenses and thus we could simply go forth and implement OCB. Any new API should make it easy to do just that. Shalom-Salam, Werner -- Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz. From dbaryshkov at gmail.com Tue Oct 15 21:56:42 2013 From: dbaryshkov at gmail.com (Dmitry Eremin-Solenikov) Date: Tue, 15 Oct 2013 23:56:42 +0400 Subject: [PATCH v2 0/2] Support for GOST R 34.10-2001/-2012 signatures Message-ID: <1381867004-21231-1-git-send-email-dbaryshkov@gmail.com> Hello, I have forward-ported my patchset to current ECC/Pubkey interfaces. As agreed with NIIBE Yutaka, I did not add a 'subgroup' domain (at least for now), because for "test" curves subgroup is equal to the whole group. -- With best wishes Dmitry From dbaryshkov at gmail.com Tue Oct 15 21:56:44 2013 From: dbaryshkov at gmail.com (Dmitry Eremin-Solenikov) Date: Tue, 15 Oct 2013 23:56:44 +0400 Subject: [PATCH v2 2/2] Add support for GOST R 34.10-2001/-2012 signatures In-Reply-To: <1381867004-21231-1-git-send-email-dbaryshkov@gmail.com> References: <1381867004-21231-1-git-send-email-dbaryshkov@gmail.com> Message-ID: <1381867004-21231-3-git-send-email-dbaryshkov@gmail.com> * src/cipher.h: define PUBKEY_FLAG_GOST * cipher/ecc-curves.c: add GOST2001-test and GOST2012-test curves defined in standards. Typical applications would use either those curves, or curves defined in RFC 4357 (will be added later). * cipher/ecc.c (sign_gost, verify_gost): New. (ecc_sign, ecc_verify): use sign_gost/verify_gost if PUBKEY_FLAG_GOST is set. (ecc_names): add "gost" for gost signatures.
* cipher/pubkey-util.c (_gcry_pk_util_parse_flaglist, _gcry_pk_util_preparse_sigval): set PUBKEY_FLAG_GOST if gost flag is present in s-exp. * tests/benchmark.c (ecc_bench): also benchmark GOST signatures. * tests/basic.c (check_pubkey): add two public keys from GOST R 34.10-2012 standard. (check_pubkey_sign_ecdsa): add two data sets to check gost signatures. * tests/curves.c: correct N_CURVES as we now have 2 more curves. Signed-off-by: Dmitry Eremin-Solenikov --- cipher/ecc-curves.c | 28 +++++++ cipher/ecc.c | 214 +++++++++++++++++++++++++++++++++++++++++++++++++++ cipher/pubkey-util.c | 7 ++ src/cipher.h | 1 + tests/basic.c | 75 ++++++++++++++++++ tests/benchmark.c | 17 +++- tests/curves.c | 2 +- 7 files changed, 342 insertions(+), 2 deletions(-) diff --git a/cipher/ecc-curves.c b/cipher/ecc-curves.c index 53433a2..9fbd721 100644 --- a/cipher/ecc-curves.c +++ b/cipher/ecc-curves.c @@ -267,6 +267,34 @@ static const ecc_domain_parms_t domain_parms[] = "0x7dde385d566332ecc0eabfa9cf7822fdf209f70024a57b1aa000c55b881f8111" "b2dcde494a5f485e5bca4bd88a2763aed1ca2b2fa8f0540678cd1e0f3ad80892" }, + { + "GOST2001-test", 256, 0, + MPI_EC_WEIERSTRASS, ECC_DIALECT_STANDARD, + "0x8000000000000000000000000000000000000000000000000000000000000431", /* p */ + "0x0000000000000000000000000000000000000000000000000000000000000007", /* a */ + "0x5fbff498aa938ce739b8e022fbafef40563f6e6a3472fc2a514c0ce9dae23b7e", /* b */ + "0x8000000000000000000000000000000150fe8a1892976154c59cfc193accf5b3", /* q */ + + "0x0000000000000000000000000000000000000000000000000000000000000002", /* x */ + "0x08e2a8a0e65147d4bd6316030e16d19c85c97f0a9ca267122b96abbcea7e8fc8", /* y */ + }, + + { + "GOST2012-test", 511, 0, + MPI_EC_WEIERSTRASS, ECC_DIALECT_STANDARD, + "0x4531acd1fe0023c7550d267b6b2fee80922b14b2ffb90f04d4eb7c09b5d2d15d" + "f1d852741af4704a0458047e80e4546d35b8336fac224dd81664bbf528be6373", /* p */ + "0x0000000000000000000000000000000000000000000000000000000000000007", /* a */ + 
"0x1cff0806a31116da29d8cfa54e57eb748bc5f377e49400fdd788b649eca1ac4" + "361834013b2ad7322480a89ca58e0cf74bc9e540c2add6897fad0a3084f302adc", /* b */ + "0x4531acd1fe0023c7550d267b6b2fee80922b14b2ffb90f04d4eb7c09b5d2d15d" + "a82f2d7ecb1dbac719905c5eecc423f1d86e25edbe23c595d644aaf187e6e6df", /* q */ + + "0x24d19cc64572ee30f396bf6ebbfd7a6c5213b3b3d7057cc825f91093a68cd762" + "fd60611262cd838dc6b60aa7eee804e28bc849977fac33b4b530f1b120248a9a", /* x */ + "0x2bb312a43bd2ce6e0d020613c857acddcfbf061e91e5f2c3f32447c259f39b2" + "c83ab156d77f1496bf7eb3351e1ee4e43dc1a18b91b24640b6dbb92cb1add371e", /* y */ + }, { NULL, 0, 0, 0, 0, NULL, NULL, NULL, NULL } }; diff --git a/cipher/ecc.c b/cipher/ecc.c index 3b75fea..d241770 100644 --- a/cipher/ecc.c +++ b/cipher/ecc.c @@ -71,6 +71,7 @@ static const char *ecc_names[] = "ecdsa", "ecdh", "eddsa", + "gost", NULL, }; @@ -575,6 +576,203 @@ verify_ecdsa (gcry_mpi_t input, ECC_public_key *pkey, return err; } +/* Compute an GOST R 34.10-01/-12 signature. + * Return the signature struct (r,s) from the message hash. The caller + * must have allocated R and S. + */ +static gpg_err_code_t +sign_gost (gcry_mpi_t input, ECC_secret_key *skey, gcry_mpi_t r, gcry_mpi_t s) +{ + gpg_err_code_t err = 0; + gcry_mpi_t k, dr, sum, ke, x, e; + mpi_point_struct I; + gcry_mpi_t hash; + const void *abuf; + unsigned int abits, qbits; + mpi_ec_t ctx; + + if (DBG_CIPHER) + log_mpidump ("gost sign hash ", input ); + + qbits = mpi_get_nbits (skey->E.n); + + /* Convert the INPUT into an MPI if needed. 
*/ + if (mpi_is_opaque (input)) + { + abuf = gcry_mpi_get_opaque (input, &abits); + err = gpg_err_code (gcry_mpi_scan (&hash, GCRYMPI_FMT_USG, + abuf, (abits+7)/8, NULL)); + if (err) + return err; + if (abits > qbits) + gcry_mpi_rshift (hash, hash, abits - qbits); + } + else + hash = input; + + + k = NULL; + dr = mpi_alloc (0); + sum = mpi_alloc (0); + ke = mpi_alloc (0); + e = mpi_alloc (0); + x = mpi_alloc (0); + point_init (&I); + + ctx = _gcry_mpi_ec_p_internal_new (skey->E.model, skey->E.dialect, + skey->E.p, skey->E.a, skey->E.b); + + mpi_mod (e, input, skey->E.n); /* e = hash mod n */ + + if (!mpi_cmp_ui (e, 0)) + mpi_set_ui (e, 1); + + /* Two loops to avoid R or S are zero. This is more of a joke than + a real demand because the probability of them being zero is less + than any hardware failure. Some specs however require it. */ + do + { + do + { + mpi_free (k); + k = _gcry_dsa_gen_k (skey->E.n, GCRY_STRONG_RANDOM); + + _gcry_mpi_ec_mul_point (&I, k, &skey->E.G, ctx); + if (_gcry_mpi_ec_get_affine (x, NULL, &I, ctx)) + { + if (DBG_CIPHER) + log_debug ("ecc sign: Failed to get affine coordinates\n"); + err = GPG_ERR_BAD_SIGNATURE; + goto leave; + } + mpi_mod (r, x, skey->E.n); /* r = x mod n */ + } + while (!mpi_cmp_ui (r, 0)); + mpi_mulm (dr, skey->d, r, skey->E.n); /* dr = d*r mod n */ + mpi_mulm (ke, k, e, skey->E.n); /* ke = k*e mod n */ + mpi_addm (s, ke, dr, skey->E.n); /* sum = (k*e+ d*r) mod n */ + } + while (!mpi_cmp_ui (s, 0)); + + if (DBG_CIPHER) + { + log_mpidump ("gost sign result r ", r); + log_mpidump ("gost sign result s ", s); + } + + leave: + _gcry_mpi_ec_free (ctx); + point_free (&I); + mpi_free (x); + mpi_free (e); + mpi_free (ke); + mpi_free (sum); + mpi_free (dr); + mpi_free (k); + + if (hash != input) + mpi_free (hash); + + return err; +} + +/* Verify a GOST R 34.10-01/-12 signature. + * Check if R and S verifies INPUT. 
+ */ +static gpg_err_code_t +verify_gost (gcry_mpi_t input, ECC_public_key *pkey, + gcry_mpi_t r, gcry_mpi_t s) +{ + gpg_err_code_t err = 0; + gcry_mpi_t e, x, z1, z2, v, rv, zero; + mpi_point_struct Q, Q1, Q2; + mpi_ec_t ctx; + + if( !(mpi_cmp_ui (r, 0) > 0 && mpi_cmp (r, pkey->E.n) < 0) ) + return GPG_ERR_BAD_SIGNATURE; /* Assertion 0 < r < n failed. */ + if( !(mpi_cmp_ui (s, 0) > 0 && mpi_cmp (s, pkey->E.n) < 0) ) + return GPG_ERR_BAD_SIGNATURE; /* Assertion 0 < s < n failed. */ + + x = mpi_alloc (0); + e = mpi_alloc (0); + z1 = mpi_alloc (0); + z2 = mpi_alloc (0); + v = mpi_alloc (0); + rv = mpi_alloc (0); + zero = mpi_alloc (0); + + point_init (&Q); + point_init (&Q1); + point_init (&Q2); + + ctx = _gcry_mpi_ec_p_internal_new (pkey->E.model, pkey->E.dialect, + pkey->E.p, pkey->E.a, pkey->E.b); + + mpi_mod (e, input, pkey->E.n); /* e = hash mod n */ + if (!mpi_cmp_ui (e, 0)) + mpi_set_ui (e, 1); + mpi_invm (v, e, pkey->E.n); /* v = e^(-1) (mod n) */ + mpi_mulm (z1, s, v, pkey->E.n); /* z1 = s*v (mod n) */ + mpi_mulm (rv, r, v, pkey->E.n); /* rv = s*v (mod n) */ + mpi_subm (z2, zero, rv, pkey->E.n); /* z2 = -r*v (mod n) */ + + _gcry_mpi_ec_mul_point (&Q1, z1, &pkey->E.G, ctx); +/* log_mpidump ("Q1.x", Q1.x); */ +/* log_mpidump ("Q1.y", Q1.y); */ +/* log_mpidump ("Q1.z", Q1.z); */ + _gcry_mpi_ec_mul_point (&Q2, z2, &pkey->Q, ctx); +/* log_mpidump ("Q2.x", Q2.x); */ +/* log_mpidump ("Q2.y", Q2.y); */ +/* log_mpidump ("Q2.z", Q2.z); */ + _gcry_mpi_ec_add_points (&Q, &Q1, &Q2, ctx); +/* log_mpidump (" Q.x", Q.x); */ +/* log_mpidump (" Q.y", Q.y); */ +/* log_mpidump (" Q.z", Q.z); */ + + if (!mpi_cmp_ui (Q.z, 0)) + { + if (DBG_CIPHER) + log_debug ("ecc verify: Rejected\n"); + err = GPG_ERR_BAD_SIGNATURE; + goto leave; + } + if (_gcry_mpi_ec_get_affine (x, NULL, &Q, ctx)) + { + if (DBG_CIPHER) + log_debug ("ecc verify: Failed to get affine coordinates\n"); + err = GPG_ERR_BAD_SIGNATURE; + goto leave; + } + mpi_mod (x, x, pkey->E.n); /* x = x mod E_n */ + if (mpi_cmp 
(x, r)) /* x != r */ + { + if (DBG_CIPHER) + { + log_mpidump (" x", x); + log_mpidump (" r", r); + log_mpidump (" s", s); + log_debug ("ecc verify: Not verified\n"); + } + err = GPG_ERR_BAD_SIGNATURE; + goto leave; + } + if (DBG_CIPHER) + log_debug ("ecc verify: Accepted\n"); + + leave: + _gcry_mpi_ec_free (ctx); + point_free (&Q2); + point_free (&Q1); + point_free (&Q); + mpi_free (zero); + mpi_free (rv); + mpi_free (v); + mpi_free (z2); + mpi_free (z1); + mpi_free (x); + mpi_free (e); + return err; +} static void @@ -1623,6 +1821,13 @@ ecc_sign (gcry_sexp_t *r_sig, gcry_sexp_t s_data, gcry_sexp_t keyparms) rc = gcry_sexp_build (r_sig, NULL, "(sig-val(eddsa(r%M)(s%M)))", sig_r, sig_s); } + else if ((ctx.flags & PUBKEY_FLAG_GOST)) + { + rc = sign_gost (data, &sk, sig_r, sig_s); + if (!rc) + rc = gcry_sexp_build (r_sig, NULL, + "(sig-val(gost(r%M)(s%M)))", sig_r, sig_s); + } else { rc = sign_ecdsa (data, &sk, sig_r, sig_s, ctx.flags, ctx.hash_algo); @@ -1773,6 +1978,15 @@ ecc_verify (gcry_sexp_t s_sig, gcry_sexp_t s_data, gcry_sexp_t s_keyparms) { rc = verify_eddsa (data, &pk, sig_r, sig_s, ctx.hash_algo, mpi_q); } + else if ((sigflags & PUBKEY_FLAG_GOST)) + { + point_init (&pk.Q); + rc = _gcry_ecc_os2ec (&pk.Q, mpi_q); + if (rc) + goto leave; + + rc = verify_gost (data, &pk, sig_r, sig_s); + } else { point_init (&pk.Q); diff --git a/cipher/pubkey-util.c b/cipher/pubkey-util.c index 54832ec..c4058bd 100644 --- a/cipher/pubkey-util.c +++ b/cipher/pubkey-util.c @@ -79,6 +79,11 @@ _gcry_pk_util_parse_flaglist (gcry_sexp_t list, { flags |= PUBKEY_FLAG_ECDSA; } + else if (n == 4 && !memcmp (s, "gost", 4)) + { + encoding = PUBKEY_ENC_RAW; + flags |= PUBKEY_FLAG_GOST; + } else if (n == 3 && !memcmp (s, "raw", 3) && encoding == PUBKEY_ENC_UNKNOWN) { @@ -460,6 +465,8 @@ _gcry_pk_util_preparse_sigval (gcry_sexp_t s_sig, const char **algo_names, { if (!strcmp (name, "eddsa")) *r_eccflags = PUBKEY_FLAG_EDDSA; + if (!strcmp (name, "gost")) + *r_eccflags = PUBKEY_FLAG_GOST; } 
*r_parms = l2; diff --git a/src/cipher.h b/src/cipher.h index 077af98..20818ba 100644 --- a/src/cipher.h +++ b/src/cipher.h @@ -37,6 +37,7 @@ #define PUBKEY_FLAG_USE_FIPS186_2 (1 << 8) #define PUBKEY_FLAG_ECDSA (1 << 9) #define PUBKEY_FLAG_EDDSA (1 << 10) +#define PUBKEY_FLAG_GOST (1 << 11) enum pk_operation diff --git a/tests/basic.c b/tests/basic.c index 899dae5..1267831 100644 --- a/tests/basic.c +++ b/tests/basic.c @@ -3677,6 +3677,30 @@ check_pubkey_sign_ecdsa (int n, gcry_sexp_t skey, gcry_sexp_t pkey) /* */ "000102030405060708090A0B0C0D0E0F#))", 0 }, + { 256, + "(data (flags gost)\n" + " (value #00112233445566778899AABBCCDDEEFF" + /* */ "000102030405060708090A0B0C0D0E0F#))", + 0, + "(data (flags gost)\n" + " (value #80112233445566778899AABBCCDDEEFF" + /* */ "000102030405060708090A0B0C0D0E0F#))", + 0 + }, + { 512, + "(data (flags gost)\n" + " (value #00112233445566778899AABBCCDDEEFF" + /* */ "000102030405060708090A0B0C0D0E0F" + /* */ "000102030405060708090A0B0C0D0E0F" + /* */ "000102030405060708090A0B0C0D0E0F#))", + 0, + "(data (flags gost)\n" + " (value #80112233445566778899AABBCCDDEEFF" + /* */ "000102030405060708090A0B0C0D0E0F" + /* */ "000102030405060708090A0B0C0D0E0F" + /* */ "000102030405060708090A0B0C0D0E0F#))", + 0 + }, { 0, NULL } }; @@ -4264,6 +4288,57 @@ check_pubkey (void) "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" } + }, + { /* GOST R 34.10-2001/2012 test 256 bit. 
*/ + GCRY_PK_ECDSA, FLAG_SIGN, + { + "(private-key\n" + " (ecc\n" + " (curve GOST2001-test)\n" + " (q #047F2B49E270DB6D90D8595BEC458B50C58585BA1D4E9B78" + " 8F6689DBD8E56FD80B26F1B489D6701DD185C8413A977B3C" + " BBAF64D1C593D26627DFFB101A87FF77DA#)\n" + " (d #7A929ADE789BB9BE10ED359DD39A72C11B60961F49397EEE" + " 1D19CE9891EC3B28#)))\n", + + "(public-key\n" + " (ecc\n" + " (curve GOST2001-test)\n" + " (q #047F2B49E270DB6D90D8595BEC458B50C58585BA1D4E9B78" + " 8F6689DBD8E56FD80B26F1B489D6701DD185C8413A977B3C" + " BBAF64D1C593D26627DFFB101A87FF77DA#)))\n", + + "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" + "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" } + }, + { /* GOST R 34.10-2012 test 512 bit. */ + GCRY_PK_ECDSA, FLAG_SIGN, + { + "(private-key\n" + " (ecc\n" + " (curve GOST2012-test)\n" + " (q #04115DC5BC96760C7B48598D8AB9E740D4C4A85A65BE33C1" + " 815B5C320C854621DD5A515856D13314AF69BC5B924C8B" + " 4DDFF75C45415C1D9DD9DD33612CD530EFE137C7C90CD4" + " 0B0F5621DC3AC1B751CFA0E2634FA0503B3D52639F5D7F" + " B72AFD61EA199441D943FFE7F0C70A2759A3CDB84C114E" + " 1F9339FDF27F35ECA93677BEEC#)\n" + " (d #0BA6048AADAE241BA40936D47756D7C93091A0E851466970" + " 0EE7508E508B102072E8123B2200A0563322DAD2827E2714" + " A2636B7BFD18AADFC62967821FA18DD4#)))\n", + + "(public-key\n" + " (ecc\n" + " (curve GOST2001-test)\n" + " (q #04115DC5BC96760C7B48598D8AB9E740D4C4A85A65BE33C1" + " 815B5C320C854621DD5A515856D13314AF69BC5B924C8B" + " 4DDFF75C45415C1D9DD9DD33612CD530EFE137C7C90CD4" + " 0B0F5621DC3AC1B751CFA0E2634FA0503B3D52639F5D7F" + " B72AFD61EA199441D943FFE7F0C70A2759A3CDB84C114E" + " 1F9339FDF27F35ECA93677BEEC#)))\n" + + "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" + "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" } } }; int i; diff --git a/tests/benchmark.c b/tests/benchmark.c index 5d1434a..ecda0d3 100644 --- a/tests/benchmark.c +++ b/tests/benchmark.c @@ -883,7 +883,8 @@ ecc_bench (int iterations, int print_header) { #if USE_ECC gpg_error_t err; - const char *p_sizes[] = { "192", "224", "256", 
"384", "521", "Ed25519" }; + const char *p_sizes[] = { "192", "224", "256", "384", "521", "Ed25519", + "gost256", "gost512" }; int testno; if (print_header) @@ -899,14 +900,22 @@ ecc_bench (int iterations, int print_header) int count; int p_size; int is_ed25519; + int is_gost; is_ed25519 = !strcmp (p_sizes[testno], "Ed25519"); + is_gost = !strncmp (p_sizes[testno], "gost", 4); if (is_ed25519) { p_size = 256; printf ("EdDSA Ed25519 "); fflush (stdout); } + else if (is_gost) + { + p_size = atoi (p_sizes[testno] + 4); + printf ("GOST %3d bit ", p_size); + fflush (stdout); + } else { p_size = atoi (p_sizes[testno]); @@ -917,6 +926,10 @@ ecc_bench (int iterations, int print_header) if (is_ed25519) err = gcry_sexp_build (&key_spec, NULL, "(genkey (ecdsa (curve \"Ed25519\")))"); + else if (is_gost) + err = gcry_sexp_build (&key_spec, NULL, + "(genkey (ecdsa (curve %s)))", + p_size == 256 ? "GOST2001-test" : "GOST2012-test"); else err = gcry_sexp_build (&key_spec, NULL, "(genkey (ECDSA (nbits %d)))", p_size); @@ -950,6 +963,8 @@ ecc_bench (int iterations, int print_header) err = gcry_sexp_build (&data, NULL, "(data (flags eddsa)(hash-algo sha512)" " (value %m))", x); + else if (is_gost) + err = gcry_sexp_build (&data, NULL, "(data (flags gost) (value %m))", x); else err = gcry_sexp_build (&data, NULL, "(data (flags raw) (value %m))", x); gcry_mpi_release (x); diff --git a/tests/curves.c b/tests/curves.c index 2c3ae53..198693e 100644 --- a/tests/curves.c +++ b/tests/curves.c @@ -29,7 +29,7 @@ #include "../src/gcrypt-int.h" /* Number of curves defined in ../cipger/ecc.c */ -#define N_CURVES 13 +#define N_CURVES 15 /* A real world sample public key. 
*/ static char const sample_key_1[] = -- 1.8.4.rc3 From dbaryshkov at gmail.com Tue Oct 15 21:56:43 2013 From: dbaryshkov at gmail.com (Dmitry Eremin-Solenikov) Date: Tue, 15 Oct 2013 23:56:43 +0400 Subject: [PATCH v2 1/2] Fix 256-bit ecdsa test key definition In-Reply-To: <1381867004-21231-1-git-send-email-dbaryshkov@gmail.com> References: <1381867004-21231-1-git-send-email-dbaryshkov@gmail.com> Message-ID: <1381867004-21231-2-git-send-email-dbaryshkov@gmail.com> * tests/basic.c (check_pubkey): fix nistp256 testing key declaration - add missing comma. Signed-off-by: Dmitry Eremin-Solenikov --- tests/basic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/basic.c b/tests/basic.c index 3ffcefe..899dae5 100644 --- a/tests/basic.c +++ b/tests/basic.c @@ -4260,7 +4260,7 @@ check_pubkey (void) " (curve nistp256)\n" " (q #04D4F6A6738D9B8D3A7075C1E4EE95015FC0C9B7E4272D2B" " EB6644D3609FC781B71F9A8072F58CB66AE2F89BB1245187" - " 3ABF7D91F9E1FBF96BF2F70E73AAC9A283#)))\n" + " 3ABF7D91F9E1FBF96BF2F70E73AAC9A283#)))\n", "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" } -- 1.8.4.rc3 From cvs at cvs.gnupg.org Wed Oct 16 02:12:54 2013 From: cvs at cvs.gnupg.org (by NIIBE Yutaka) Date: Wed, 16 Oct 2013 02:12:54 +0200 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-309-g45aa613 Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via 45aa6131e93fac89d46733b3436d960f35fb99b2 (commit) from 537969fbbb1104b8305a7edb331b7666d54eff2c (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. 
- Log ----------------------------------------------------------------- commit 45aa6131e93fac89d46733b3436d960f35fb99b2 Author: NIIBE Yutaka Date: Wed Oct 2 09:27:09 2013 +0900 mpi: mpi-pow improvement. * mpi/mpi-pow.c (gcry_mpi_powm): New implementation of left-to-right k-ary exponentiation. -- Signed-off-by: NIIBE Yutaka For the Yarom/Falkner flush+reload cache side-channel attack, we changed the code so that it always calls the multiplication routine (even if we can skip it to get result). This results some performance regression. This change is for recovering performance with efficient algorithm. diff --git a/mpi/mpi-pow.c b/mpi/mpi-pow.c index 85d6fd8..469c382 100644 --- a/mpi/mpi-pow.c +++ b/mpi/mpi-pow.c @@ -34,6 +34,14 @@ #include "longlong.h" +/* + * When you need old implementation, please add compilation option + * -DUSE_ALGORITHM_SIMPLE_EXPONENTIATION + * or expose this line: +#define USE_ALGORITHM_SIMPLE_EXPONENTIATION 1 + */ + +#if defined(USE_ALGORITHM_SIMPLE_EXPONENTIATION) /**************** * RES = BASE ^ EXPO mod MOD */ @@ -336,3 +344,449 @@ gcry_mpi_powm (gcry_mpi_t res, if (tspace) _gcry_mpi_free_limb_space( tspace, 0 ); } +#else +/** + * Internal function to compute + * + * X = R * S mod M + * + * and set the size of X at the pointer XSIZE_P. + * Use karatsuba structure at KARACTX_P. + * + * Condition: + * RSIZE >= SSIZE + * Enough space for X is allocated beforehand. + * + * For generic cases, we can/should use gcry_mpi_mulm. + * This function is use for specific internal case. 
+ */ +static void +mul_mod (mpi_ptr_t xp, mpi_size_t *xsize_p, + mpi_ptr_t rp, mpi_size_t rsize, + mpi_ptr_t sp, mpi_size_t ssize, + mpi_ptr_t mp, mpi_size_t msize, + struct karatsuba_ctx *karactx_p) +{ + if( ssize < KARATSUBA_THRESHOLD ) + _gcry_mpih_mul ( xp, rp, rsize, sp, ssize ); + else + _gcry_mpih_mul_karatsuba_case (xp, rp, rsize, sp, ssize, karactx_p); + + if (rsize + ssize > msize) + { + _gcry_mpih_divrem (xp + msize, 0, xp, rsize + ssize, mp, msize); + *xsize_p = msize; + } + else + *xsize_p = rsize + ssize; +} + +#define SIZE_B_2I3 ((1 << (5 - 1)) - 1) + +/**************** + * RES = BASE ^ EXPO mod MOD + * + * To mitigate the Yarom/Falkner flush+reload cache side-channel + * attack on the RSA secret exponent, we don't use the square + * routine but multiplication. + * + * Reference: + * Handbook of Applied Cryptography + * Algorithm 14.83: Modified left-to-right k-ary exponentiation + */ +void +gcry_mpi_powm (gcry_mpi_t res, + gcry_mpi_t base, gcry_mpi_t expo, gcry_mpi_t mod) +{ + /* Pointer to the limbs of the arguments, their size and signs. */ + mpi_ptr_t rp, ep, mp, bp; + mpi_size_t esize, msize, bsize, rsize; + int msign, bsign, rsign; + /* Flags telling the secure allocation status of the arguments. */ + int esec, msec, bsec; + /* Size of the result including space for temporary values. */ + mpi_size_t size; + /* Helper. */ + int mod_shift_cnt; + int negative_result; + mpi_ptr_t mp_marker = NULL; + mpi_ptr_t bp_marker = NULL; + mpi_ptr_t ep_marker = NULL; + mpi_ptr_t xp_marker = NULL; + unsigned int mp_nlimbs = 0; + unsigned int bp_nlimbs = 0; + unsigned int ep_nlimbs = 0; + unsigned int xp_nlimbs = 0; + mpi_ptr_t b_2i3[SIZE_B_2I3]; /* Pre-computed array: BASE^3, ^5, ^7, ... 
*/ + mpi_size_t b_2i3size[SIZE_B_2I3]; + mpi_size_t W; + mpi_ptr_t base_u; + mpi_size_t base_u_size; + + esize = expo->nlimbs; + msize = mod->nlimbs; + size = 2 * msize; + msign = mod->sign; + + if (esize * BITS_PER_MPI_LIMB > 512) + W = 5; + else if (esize * BITS_PER_MPI_LIMB > 256) + W = 4; + else if (esize * BITS_PER_MPI_LIMB > 128) + W = 3; + else if (esize * BITS_PER_MPI_LIMB > 64) + W = 2; + else + W = 1; + + esec = mpi_is_secure(expo); + msec = mpi_is_secure(mod); + bsec = mpi_is_secure(base); + + rp = res->d; + ep = expo->d; + + if (!msize) + _gcry_divide_by_zero(); + + if (!esize) + { + /* Exponent is zero, result is 1 mod MOD, i.e., 1 or 0 depending + on if MOD equals 1. */ + res->nlimbs = (msize == 1 && mod->d[0] == 1) ? 0 : 1; + if (res->nlimbs) + { + RESIZE_IF_NEEDED (res, 1); + rp = res->d; + rp[0] = 1; + } + res->sign = 0; + goto leave; + } + + /* Normalize MOD (i.e. make its most significant bit set) as + required by mpn_divrem. This will make the intermediate values + in the calculation slightly larger, but the correct result is + obtained after a final reduction using the original MOD value. */ + mp_nlimbs = msec? msize:0; + mp = mp_marker = mpi_alloc_limb_space(msize, msec); + count_leading_zeros (mod_shift_cnt, mod->d[msize-1]); + if (mod_shift_cnt) + _gcry_mpih_lshift (mp, mod->d, msize, mod_shift_cnt); + else + MPN_COPY( mp, mod->d, msize ); + + bsize = base->nlimbs; + bsign = base->sign; + if (bsize > msize) + { + /* The base is larger than the module. Reduce it. + + Allocate (BSIZE + 1) with space for remainder and quotient. + (The quotient is (bsize - msize + 1) limbs.) */ + bp_nlimbs = bsec ? (bsize + 1):0; + bp = bp_marker = mpi_alloc_limb_space( bsize + 1, bsec ); + MPN_COPY ( bp, base->d, bsize ); + /* We don't care about the quotient, store it above the + * remainder, at BP + MSIZE. 
*/ + _gcry_mpih_divrem( bp + msize, 0, bp, bsize, mp, msize ); + bsize = msize; + /* Canonicalize the base, since we are going to multiply with it + quite a few times. */ + MPN_NORMALIZE( bp, bsize ); + } + else + bp = base->d; + + if (!bsize) + { + res->nlimbs = 0; + res->sign = 0; + goto leave; + } + + + /* Make BASE, EXPO and MOD not overlap with RES. */ + if ( rp == bp ) + { + /* RES and BASE are identical. Allocate temp. space for BASE. */ + gcry_assert (!bp_marker); + bp_nlimbs = bsec? bsize:0; + bp = bp_marker = mpi_alloc_limb_space( bsize, bsec ); + MPN_COPY(bp, rp, bsize); + } + if ( rp == ep ) + { + /* RES and EXPO are identical. Allocate temp. space for EXPO. */ + ep_nlimbs = esec? esize:0; + ep = ep_marker = mpi_alloc_limb_space( esize, esec ); + MPN_COPY(ep, rp, esize); + } + if ( rp == mp ) + { + /* RES and MOD are identical. Allocate temporary space for MOD.*/ + gcry_assert (!mp_marker); + mp_nlimbs = msec?msize:0; + mp = mp_marker = mpi_alloc_limb_space( msize, msec ); + MPN_COPY(mp, rp, msize); + } + + /* Copy base to the result. */ + if (res->alloced < size) + { + mpi_resize (res, size); + rp = res->d; + } + + /* Main processing. */ + { + mpi_size_t i, j; + mpi_ptr_t xp; + mpi_size_t xsize; + int c; + mpi_limb_t e; + mpi_limb_t carry_limb; + struct karatsuba_ctx karactx; + mpi_ptr_t tp; + + xp_nlimbs = msec? (2 * (msize + 1)):0; + xp = xp_marker = mpi_alloc_limb_space( 2 * (msize + 1), msec ); + + memset( &karactx, 0, sizeof karactx ); + negative_result = (ep[0] & 1) && bsign; + + /* Precompute B_2I3[], BASE^(2 * i + 3), BASE^3, ^5, ^7, ... 
*/ + if (W > 1) /* X := BASE^2 */ + mul_mod (xp, &xsize, bp, bsize, bp, bsize, mp, msize, &karactx); + for (i = 0; i < (1 << (W - 1)) - 1; i++) + { /* B_2I3[i] = BASE^(2 * i + 3) */ + if (i == 0) + { + base_u = bp; + base_u_size = bsize; + } + else + { + base_u = b_2i3[i-1]; + base_u_size = b_2i3size[i-1]; + } + + if (xsize >= base_u_size) + mul_mod (rp, &rsize, xp, xsize, base_u, base_u_size, + mp, msize, &karactx); + else + mul_mod (rp, &rsize, base_u, base_u_size, xp, xsize, + mp, msize, &karactx); + b_2i3[i] = mpi_alloc_limb_space (rsize, esec); + b_2i3size[i] = rsize; + MPN_COPY (b_2i3[i], rp, rsize); + } + + i = esize - 1; + + /* Main loop. + + Make the result be pointed to alternately by XP and RP. This + helps us avoid block copying, which would otherwise be + necessary with the overlap restrictions of + _gcry_mpih_divmod. With 50% probability the result after this + loop will be in the area originally pointed by RP (==RES->d), + and with 50% probability in the area originally pointed to by XP. 
*/ + rsign = 0; + if (W == 1) + { + rsize = bsize; + } + else + { + rsize = msize; + MPN_ZERO (rp, rsize); + } + MPN_COPY ( rp, bp, bsize ); + + e = ep[i]; + count_leading_zeros (c, e); + e = (e << c) << 1; + c = BITS_PER_MPI_LIMB - 1 - c; + + j = 0; + + for (;;) + if (e == 0) + { + j += c; + i--; + if ( i < 0 ) + { + c = 0; + break; + } + + e = ep[i]; + c = BITS_PER_MPI_LIMB; + } + else + { + int c0; + mpi_limb_t e0; + + count_leading_zeros (c0, e); + e = (e << c0); + c -= c0; + j += c0; + + if (c >= W) + { + e0 = (e >> (BITS_PER_MPI_LIMB - W)); + e = (e << W); + c -= W; + } + else + { + i--; + if ( i < 0 ) + { + e = (e >> (BITS_PER_MPI_LIMB - c)); + break; + } + + c0 = c; + e0 = (e >> (BITS_PER_MPI_LIMB - W)) + | (ep[i] >> (BITS_PER_MPI_LIMB - W + c0)); + e = (ep[i] << (W - c0)); + c = BITS_PER_MPI_LIMB - W + c0; + } + + count_trailing_zeros (c0, e0); + e0 = (e0 >> c0) >> 1; + + for (j += W - c0; j; j--) + { + mul_mod (xp, &xsize, rp, rsize, rp, rsize, mp, msize, &karactx); + tp = rp; rp = xp; xp = tp; + rsize = xsize; + } + + if (e0 == 0) + { + base_u = bp; + base_u_size = bsize; + } + else + { + base_u = b_2i3[e0 - 1]; + base_u_size = b_2i3size[e0 -1]; + } + + mul_mod (xp, &xsize, rp, rsize, base_u, base_u_size, + mp, msize, &karactx); + tp = rp; rp = xp; xp = tp; + rsize = xsize; + + j = c0; + } + + if (c != 0) + { + j += c; + count_trailing_zeros (c, e); + e = (e >> c); + j -= c; + } + + while (j--) + { + mul_mod (xp, &xsize, rp, rsize, rp, rsize, mp, msize, &karactx); + tp = rp; rp = xp; xp = tp; + rsize = xsize; + } + + if (e != 0) + { + if ((e>>1) == 0) + { + base_u = bp; + base_u_size = bsize; + } + else + { + base_u = b_2i3[(e>>1) - 1]; + base_u_size = b_2i3size[(e>>1) -1]; + } + + mul_mod (xp, &xsize, rp, rsize, base_u, base_u_size, + mp, msize, &karactx); + tp = rp; rp = xp; xp = tp; + rsize = xsize; + + for (; c; c--) + { + mul_mod (xp, &xsize, rp, rsize, rp, rsize, mp, msize, &karactx); + tp = rp; rp = xp; xp = tp; + rsize = xsize; + } + } + + /* We 
shifted MOD, the modulo reduction argument, left + MOD_SHIFT_CNT steps. Adjust the result by reducing it with the + original MOD. + + Also make sure the result is put in RES->d (where it already + might be, see above). */ + if ( mod_shift_cnt ) + { + carry_limb = _gcry_mpih_lshift( res->d, rp, rsize, mod_shift_cnt); + rp = res->d; + if ( carry_limb ) + { + rp[rsize] = carry_limb; + rsize++; + } + } + else if (res->d != rp) + { + MPN_COPY (res->d, rp, rsize); + rp = res->d; + } + + if ( rsize >= msize ) + { + _gcry_mpih_divrem(rp + msize, 0, rp, rsize, mp, msize); + rsize = msize; + } + + /* Remove any leading zero words from the result. */ + if ( mod_shift_cnt ) + _gcry_mpih_rshift( rp, rp, rsize, mod_shift_cnt); + MPN_NORMALIZE (rp, rsize); + + _gcry_mpih_release_karatsuba_ctx (&karactx ); + for (i = 0; i < (1 << (W - 1)) - 1; i++) + _gcry_mpi_free_limb_space( b_2i3[i], esec ? b_2i3size[i] : 0 ); + } + + /* Fixup for negative results. */ + if ( negative_result && rsize ) + { + if ( mod_shift_cnt ) + _gcry_mpih_rshift( mp, mp, msize, mod_shift_cnt); + _gcry_mpih_sub( rp, mp, msize, rp, rsize); + rsize = msize; + rsign = msign; + MPN_NORMALIZE(rp, rsize); + } + gcry_assert (res->d == rp); + res->nlimbs = rsize; + res->sign = rsign; + + leave: + if (mp_marker) + _gcry_mpi_free_limb_space( mp_marker, mp_nlimbs ); + if (bp_marker) + _gcry_mpi_free_limb_space( bp_marker, bp_nlimbs ); + if (ep_marker) + _gcry_mpi_free_limb_space( ep_marker, ep_nlimbs ); + if (xp_marker) + _gcry_mpi_free_limb_space( xp_marker, xp_nlimbs ); +} +#endif ----------------------------------------------------------------------- Summary of changes: mpi/mpi-pow.c | 454 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 454 insertions(+) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From gniibe at 
fsij.org Wed Oct 16 02:26:43 2013 From: gniibe at fsij.org (NIIBE Yutaka) Date: Wed, 16 Oct 2013 09:26:43 +0900 Subject: possible mpi-pow improvement In-Reply-To: <87d2n6myo7.fsf@vigenere.g10code.de> References: <1378456897.3188.14.camel@cfw2.gniibe.org> <87a9itb52z.fsf@vigenere.g10code.de> <1380674512.3342.2.camel@cfw2.gniibe.org> <87d2n6myo7.fsf@vigenere.g10code.de> Message-ID: <1381883203.3175.0.camel@cfw2.gniibe.org> Thank you for your testing. On 2013-10-15 at 15:49 +0200, Werner Koch wrote: > Please push your patch to master. Done. -- From jussi.kivilinna at iki.fi Wed Oct 16 11:05:06 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Wed, 16 Oct 2013 12:05:06 +0300 Subject: [RFC PATCH 2/3] Add API for initializing AEAD modes In-Reply-To: <87hacimytn.fsf@vigenere.g10code.de> References: <20131013100228.32014.526.stgit@localhost6.localdomain6> <20131013100233.32014.24561.stgit@localhost6.localdomain6> <87iox1phzg.fsf@vigenere.g10code.de> <525BD36A.9010507@iki.fi> <87hacimytn.fsf@vigenere.g10code.de> Message-ID: <525E56C2.6060704@iki.fi> On 15.10.2013 16:46, Werner Koch wrote: > On Mon, 14 Oct 2013 13:20, jussi.kivilinna at iki.fi said: > >> I based CCM patchset on the AEAD API patch Dmitry sent earlier for >> GCM. Since CCM has more restrictions (need to know data lengths in >> advance) than GCM, I added gcry_cipher_aead_init. > > I forgot about this because I delayed that far too long. Sorry. > >> With this patchset to encrypt a buffer using CCM, you'd first need to >> initialize/reset CCM state with: >> >> gcry_cipher_aead_init (hd, nonce_buf, nonce_len, authtag_len, plaintext_len) >> >> CCM needs tag and plaintext lengths for MAC initialization. CCM also > > Up until now we use separate functions to set key, IV, and counter. > This function changes this pattern. Why not reusing setiv for the > nonce and adding a function for the authentication tag? Sure, setiv can be reused. > > Is the plaintext length really required in advance? 
That is > embarrassing in particular because the authenticated data is appended to > the ciphertext. Yes, this is really the case with CCM. The first block to CBC-MAC contains length of encrypted message.. from RFC: The first block B_0 is formatted as follows, where l(m) is encoded in most-significant-byte first order: Octet Number Contents ------------ --------- 0 Flags 1 ... 15-L Nonce N 16-L ... 15 l(m) Flags encodes the length of length field (L), tag length and an AAD flag. > >> gcry_cipher_authenticate (hd, aadbuf, aadbuflen) >> >> which does the actual MAC initialization. If aadbuflen == 0, then >> above call can be omitted and gcry_cipher_(en|de)crypt will call >> gcry_cipher_authenticate with zero length. > > What about extending this fucntion to also take the authentication tag > and, if the plaintext length is required for the MAC setup, also that > length? That would group the information together. Ok, so we'd have gcry_cipher_authenticate (hd, const void *aadbuf, size_t aadbuflen, count void *tag, size_t taglen, size_t crypt_len) For encryption, tag is NULL pointer and taglen is zero and after encryption authentication tag can be read with 'gcry_cipher_tag'. For decryption, tag is given for authentication check with above function. Also (at least) for CCM, encrypt_len needs to be given. > >> NIST paper and RFC 3610 define CCM ciphertext as [ctr-enc(plaintext) >> || authtag] and that decryption must not reveal any information >> (plaintext or authtag) if authtag is not correct. Therefore full >> buffers, matching with length of plaintext_len and authtag_len given >> in gcry_cipher_aead_init, have to be used. If authentication check > > I don't see that. That is an implementaion detail and the requirement > could also be achieved by putting it into the security policy. Ok. > >> Would it be better to add functions to do AEAD encrypt/decrypt in >> single go and use gcry_buffer_t? 
This would avoid having internal >> state machines in AEAD modes and having to call different functions in >> correct order. > > And it also means that you can't use that mode with large amounts of > data. Not a good idea; there are still lots of platforms with a quite > limited amount of memory but in need to encrypt large data take from > somewhere. Yes, the need for the plaintext length makes it hard to use > but I can image systems which know that length on advance even if the > data does not fit into memory. Agreed and not all AEAD modes require plaintext length in advance. > > I mentioned gcry_buffer_t having the AAD in mind; a scatter/gather style > operation may be useful here. > > We should also keep in mind that adding support for OCB is desirable. I > had in mind to require the use of a new GCRYCTL_ENABLE_GPL_CODE to state > that Libgcrypt is used under the terms of the GPL. However, meanwhile > Phil Rogaway relaxed the requirement for royalty free patent licensing > to all free software licenses and thus we could simply got forth and > implement OCB. Any new API should make it easy to do just that. 
> > > Shalom-Salam, > > Werner > From wk at gnupg.org Wed Oct 16 11:25:36 2013 From: wk at gnupg.org (Werner Koch) Date: Wed, 16 Oct 2013 11:25:36 +0200 Subject: [RFC PATCH 2/3] Add API for initializing AEAD modes In-Reply-To: <525E56C2.6060704@iki.fi> (Jussi Kivilinna's message of "Wed, 16 Oct 2013 12:05:06 +0300") References: <20131013100228.32014.526.stgit@localhost6.localdomain6> <20131013100233.32014.24561.stgit@localhost6.localdomain6> <87iox1phzg.fsf@vigenere.g10code.de> <525BD36A.9010507@iki.fi> <87hacimytn.fsf@vigenere.g10code.de> <525E56C2.6060704@iki.fi> Message-ID: <87eh7llg8f.fsf@vigenere.g10code.de> On Wed, 16 Oct 2013 11:05, jussi.kivilinna at iki.fi said: > Ok, so we'd have > gcry_cipher_authenticate (hd, const void *aadbuf, size_t aadbuflen, > const void *tag, size_t taglen, size_t crypt_len) > > For encryption, tag is NULL pointer and taglen is zero and after encryption > authentication tag can be read with 'gcry_cipher_tag'. For decryption, tag > is given for authentication check with above function. A last idea: What about two functions gcry_cipher_settag () -- To be used before decryption gcry_cipher_gettag () -- to be used after encryption. gcry_cipher_set_tag would actually look prettier but we already use setkey and setiv. With these functions gcry_cipher_authenticate (hd, const void *aadbuf, size_t aadbuflen, size_t crypt_len) would be pretty easy to describe. And a very last idea: What about renaming gcry_cipher_authenticate to gcry_cipher_setaad ? Shalom-Salam, Werner -- Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz.
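[Editor's note: the reason CCM must know both the tag length and the total plaintext length up front, as debated in this thread, is visible in the B_0 layout quoted above from RFC 3610: the flags octet encodes the tag length M and the AAD flag, and the last L octets carry l(m). A self-contained sketch of building that first CBC-MAC block — ccm_build_b0 is a hypothetical helper written for illustration, not part of the libgcrypt API:]

```c
#include <assert.h>   /* only needed by the checks below */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Build the first CBC-MAC block B_0 per RFC 3610.  Illustrative only,
 * not a libgcrypt function.
 *
 *   taglen   - tag length M in octets (4..16, even)
 *   nonce    - nonce N, noncelen octets (7..13); L = 15 - noncelen
 *   msglen   - l(m), total plaintext length in octets
 *   have_aad - nonzero if additional authenticated data is present
 *
 * Returns 0 on success, -1 on invalid parameters.  */
static int
ccm_build_b0 (uint8_t b0[16], size_t taglen,
              const uint8_t *nonce, size_t noncelen,
              uint64_t msglen, int have_aad)
{
  size_t L = 15 - noncelen;     /* octets reserved for l(m) */
  size_t i;

  if (noncelen < 7 || noncelen > 13
      || taglen < 4 || taglen > 16 || (taglen & 1))
    return -1;
  if (L < 8 && (msglen >> (8 * L)) != 0)
    return -1;                  /* l(m) does not fit into L octets */

  /* Flags octet: bit 6 = Adata, bits 5..3 = (M-2)/2, bits 2..0 = L-1.  */
  b0[0] = (uint8_t)((have_aad ? 0x40 : 0x00)
                    | (((taglen - 2) / 2) << 3)
                    | (L - 1));
  /* Nonce N occupies octets 1 .. 15-L.  */
  memcpy (b0 + 1, nonce, noncelen);
  /* l(m), most-significant-byte first, in octets 16-L .. 15.  */
  for (i = 0; i < L; i++)
    b0[15 - i] = (uint8_t)(msglen >> (8 * i));
  return 0;
}
```

Checked against RFC 3610 packet vector #1 (13-octet nonce, M = 8, 23-octet message, AAD present), this yields B_0 = 59 00 00 00 03 02 01 00 A0 A1 A2 A3 A4 A5 00 17; since l(m) sits in B_0, the whole CBC-MAC restarts if the plaintext length changes, which is why the API discussion above needs crypt_len before the first encrypt call.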
From dbaryshkov at gmail.com Wed Oct 16 13:20:15 2013 From: dbaryshkov at gmail.com (Dmitry Eremin-Solenikov) Date: Wed, 16 Oct 2013 15:20:15 +0400 Subject: [RFC PATCH 2/3] Add API for initializing AEAD modes In-Reply-To: <525E56C2.6060704@iki.fi> References: <20131013100228.32014.526.stgit@localhost6.localdomain6> <20131013100233.32014.24561.stgit@localhost6.localdomain6> <87iox1phzg.fsf@vigenere.g10code.de> <525BD36A.9010507@iki.fi> <87hacimytn.fsf@vigenere.g10code.de> <525E56C2.6060704@iki.fi> Message-ID: Hello, On Wed, Oct 16, 2013 at 1:05 PM, Jussi Kivilinna wrote: > On 15.10.2013 16:46, Werner Koch wrote: >> On Mon, 14 Oct 2013 13:20, jussi.kivilinna at iki.fi said: >> >>> gcry_cipher_authenticate (hd, aadbuf, aadbuflen) >>> >>> which does the actual MAC initialization. If aadbuflen == 0, then >>> above call can be omitted and gcry_cipher_(en|de)crypt will call >>> gcry_cipher_authenticate with zero length. >> >> What about extending this fucntion to also take the authentication tag >> and, if the plaintext length is required for the MAC setup, also that >> length? That would group the information together. > > Ok, so we'd have > gcry_cipher_authenticate (hd, const void *aadbuf, size_t aadbuflen, > count void *tag, size_t taglen, size_t crypt_len) > > For encryption, tag is NULL pointer and taglen is zero and after encryption > authentication tag can be read with 'gcry_cipher_tag'. For decryption, tag > is given for authentication check with above function. Hmm. That would require for the tag to be stored in the context to be validated after we process all enciphered data. I would suggest to move tag validation to upper layer: * setiv/setkey/etc. * authenticate(AAD, crypt_len) * while (has_data) enc_data = encrypt(data) * tag = tag() -> returns tag for AAD and passed data Upper layer passes AAD, enc_data and tag to other side. 
Upper layer received AAD, enc_data, and tag * setiv/setkey/etc * authenticate(AAD, crypt_len) * while (has_enc_data) data = decrypt(enc_data) * new_tag = tag() -> returns tag for AAD and unencrypted data Then upper layer can compare tag with new_tag and thus verify that data is authenticated. What do you think? BTW: Looking at GCM/GMAC I have the feeling that single authenticate() might be enough for now (and should be enough for e.g. TLS), but in future to support GMAC (GCM working in auth-only mode, no crypto data) we might want to support several sequential authenticate() calls (if all of them come before first encrypt()/decrypt() call). -- With best wishes Dmitry From jussi.kivilinna at iki.fi Wed Oct 16 13:26:19 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Wed, 16 Oct 2013 14:26:19 +0300 Subject: [RFC PATCH 2/3] Add API for initializing AEAD modes In-Reply-To: <87eh7llg8f.fsf@vigenere.g10code.de> References: <20131013100228.32014.526.stgit@localhost6.localdomain6> <20131013100233.32014.24561.stgit@localhost6.localdomain6> <87iox1phzg.fsf@vigenere.g10code.de> <525BD36A.9010507@iki.fi> <87hacimytn.fsf@vigenere.g10code.de> <525E56C2.6060704@iki.fi> <87eh7llg8f.fsf@vigenere.g10code.de> Message-ID: <525E77DB.8030402@iki.fi> On 16.10.2013 12:25, Werner Koch wrote: > On Wed, 16 Oct 2013 11:05, jussi.kivilinna at iki.fi said: > >> Ok, so we'd have >> gcry_cipher_authenticate (hd, const void *aadbuf, size_t aadbuflen, >> count void *tag, size_t taglen, size_t crypt_len) >> >> For encryption, tag is NULL pointer and taglen is zero and after encryption >> authentication tag can be read with 'gcry_cipher_tag'. For decryption, tag >> is given for authentication check with above function. > > A last idea: What about two functions > > gcry_cipher_settag () -- To be used before decryption > gcry_cipher_gettag () -- to be used after encryption. > > gcry_cipher_set_tag would actually look prettier but we already use > setkey and setiv. 
With these functions > > gcry_cipher_authenticate (hd, const void *aadbuf, size_t aadbuflen, > size_t crypt_len) > > would be pretty easy to describe. And a very last idea: What about > renaming > > gcry_cipher_authenticate to gcry_cipher_setaad > > ? I started writing the following example to check whether these would work for CCM. The problem here is that CCM needs the authentication tag length for the first CBC-MAC block. Maybe taglen could be given to CCM mode encryption with gcry_cipher_settag(hd, NULL, taglen)? CCM encryption, without AAD: gcry_cipher_setkey (hd, key, key_len); /* Set nonce. */ gcry_cipher_setiv (hd, nonce, nonce_len); /* No AAD, but for CCM need to set crypt_len. */ gcry_cipher_setaad (hd, NULL, 0, inbuf_len_1 + inbuf_len_2); <-- cannot initialize CBC-MAC, needs tag_len. /* Do encryption. */ gcry_cipher_encrypt (hd, outbuf_1, outbuf_len_1, inbuf_1, inbuf_len_1); /* More data... */ gcry_cipher_encrypt (hd, outbuf_2, outbuf_len_2, inbuf_2, inbuf_len_2); /* Finalize and read tag. */ gcry_cipher_gettag (hd, tag, tag_len); With OCB, if the AAD stays the same between messages, one can reuse the preprocessed HASH(Key, AAD). The following example would process three messages, where the first two have the same AAD and the last one has a zero-length AAD. Does this look ok? OCB encryption: gcry_cipher_setkey (hd, key, key_len); /* Process packet/message #1. */ /* Set nonce. */ gcry_cipher_setiv (hd, nonce_1, nonce_len); /* Set AAD. */ gcry_cipher_setaad (hd, aad, aadlen, 0); /* Do encryption. */ gcry_cipher_encrypt (hd, outbuf_1, inbuf_len_1, inbuf_1, inbuf_len_1); /* Finalize and read tag. */ gcry_cipher_gettag (hd, tag_1, tag_len); /*** Process next packet/message, #2. */ /* Same key and AAD (preprocessed). */ /* Set next nonce. */ gcry_cipher_setiv (hd, nonce_2, nonce_len); /* Do encryption. */ gcry_cipher_encrypt (hd, outbuf_2, inbuf_len_2, inbuf_2, inbuf_len_2); /* Finalize and read tag. */ gcry_cipher_gettag (hd, tag_2, tag_len); /*** Process next packet/message, #3.
*/ /* Same key and new AAD (empty). */ /* Set next nonce. */ gcry_cipher_setiv (hd, nonce_3, nonce_len); /* Set/clear AAD. */ gcry_cipher_setaad (hd, NULL, 0, 0); /* Do encryption. */ gcry_cipher_encrypt (hd, outbuf_3, inbuf_len_3, inbuf_3, inbuf_len_3); /* Finalize and read tag. */ gcry_cipher_gettag (hd, tag_3, tag_len); -Jussi > > > > Shalom-Salam, > > Werner > > From jussi.kivilinna at iki.fi Wed Oct 16 13:49:13 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Wed, 16 Oct 2013 14:49:13 +0300 Subject: [RFC PATCH 2/3] Add API for initializing AEAD modes In-Reply-To: References: <20131013100228.32014.526.stgit@localhost6.localdomain6> <20131013100233.32014.24561.stgit@localhost6.localdomain6> <87iox1phzg.fsf@vigenere.g10code.de> <525BD36A.9010507@iki.fi> <87hacimytn.fsf@vigenere.g10code.de> <525E56C2.6060704@iki.fi> Message-ID: <525E7D39.4080407@iki.fi> On 16.10.2013 14:20, Dmitry Eremin-Solenikov wrote: > Hello, > > On Wed, Oct 16, 2013 at 1:05 PM, Jussi Kivilinna wrote: >> On 15.10.2013 16:46, Werner Koch wrote: >>> On Mon, 14 Oct 2013 13:20, jussi.kivilinna at iki.fi said: >>> >>>> gcry_cipher_authenticate (hd, aadbuf, aadbuflen) >>>> >>>> which does the actual MAC initialization. If aadbuflen == 0, then >>>> above call can be omitted and gcry_cipher_(en|de)crypt will call >>>> gcry_cipher_authenticate with zero length. >>> >>> What about extending this fucntion to also take the authentication tag >>> and, if the plaintext length is required for the MAC setup, also that >>> length? That would group the information together. >> >> Ok, so we'd have >> gcry_cipher_authenticate (hd, const void *aadbuf, size_t aadbuflen, >> count void *tag, size_t taglen, size_t crypt_len) >> >> For encryption, tag is NULL pointer and taglen is zero and after encryption >> authentication tag can be read with 'gcry_cipher_tag'. For decryption, tag >> is given for authentication check with above function. > > Hmm. 
That would require for the tag to be stored in the context to be validated > after we process all enciphered data. I would suggest to move tag validation > to upper layer: > > * setiv/setkey/etc. > * authenticate(AAD, crypt_len) > * while (has_data) enc_data = encrypt(data) > * tag = tag() -> returns tag for AAD and passed data > Upper layer passes AAD, enc_data and tag to other side. > > Upper layer received AAD, enc_data, and tag > * setiv/setkey/etc > * authenticate(AAD, crypt_len) > * while (has_enc_data) data = decrypt(enc_data) > * new_tag = tag() -> returns tag for AAD and unencrypted data > Then upper layer can compare tag with new_tag and thus verify that data > is authenticated. What do you think? > > BTW: Looking at GCM/GMAC I have the feeling that single > authenticate() might be enough for now (and should be enough for > e.g. TLS), but in future to support GMAC (GCM working in auth-only mode, > no crypto data) we might want to support several sequential authenticate() > calls (if all of them come before first encrypt()/decrypt() call). > I've been looking at adding CMAC to libgcrypt, maybe gcry_cipher_authenticate() could be used for this too. -Jussi From wk at gnupg.org Wed Oct 16 15:18:32 2013 From: wk at gnupg.org (Werner Koch) Date: Wed, 16 Oct 2013 15:18:32 +0200 Subject: [RFC PATCH 2/3] Add API for initializing AEAD modes In-Reply-To: <525E77DB.8030402@iki.fi> (Jussi Kivilinna's message of "Wed, 16 Oct 2013 14:26:19 +0300") References: <20131013100228.32014.526.stgit@localhost6.localdomain6> <20131013100233.32014.24561.stgit@localhost6.localdomain6> <87iox1phzg.fsf@vigenere.g10code.de> <525BD36A.9010507@iki.fi> <87hacimytn.fsf@vigenere.g10code.de> <525E56C2.6060704@iki.fi> <87eh7llg8f.fsf@vigenere.g10code.de> <525E77DB.8030402@iki.fi> Message-ID: <87txghjqvr.fsf@vigenere.g10code.de> On Wed, 16 Oct 2013 13:26, jussi.kivilinna at iki.fi said: > I started writing following example to check is for CCM would work with > these. 
Problem here is that CCM needs authentication tag length for > first CBC-MAC block. Maybe taglen could be given to CCM mode encryption > with gcry_cipher_settag(hd, NULL, taglen)? Yes, I think this is okay. > With OCB, if AAD stays the same between messages, one can reuse the > preprocessed HASH(Key, AAD). Following example would process three messages, > where first two have same AAD and last one has zero length AAD. Does this > look ok? Yes. We may eventually put such examples into the manual. Salam-Shalom, Werner -- Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz. From cvs at cvs.gnupg.org Wed Oct 16 16:36:37 2013 From: cvs at cvs.gnupg.org (by Dmitry Eremin-Solenikov) Date: Wed, 16 Oct 2013 16:36:37 +0200 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-312-g83902f1 Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via 83902f1f1dbc8263a0c3f61be59cd2eb95293c97 (commit) via 187b2bb541b985255aee262d181434a7cb4ae2e7 (commit) from a329b6abf00c990faf1986f9fbad7b4d71c13bcb (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit 83902f1f1dbc8263a0c3f61be59cd2eb95293c97 Author: Dmitry Eremin-Solenikov Date: Tue Oct 15 23:56:44 2013 +0400 ecc: Add support for GOST R 34.10-2001/-2012 signatures * src/cipher.h: define PUBKEY_FLAG_GOST * cipher/ecc-curves.c: Add GOST2001-test and GOST2012-test curves defined in standards. Typical applications would use either those curves, or curves defined in RFC 4357 (will be added later). * cipher/ecc.c (sign_gost, verify_gost): New. (ecc_sign, ecc_verify): use sign_gost/verify_gost if PUBKEY_FLAG_GOST is set. 
(ecc_names): add "gost" for gost signatures. * cipher/pubkey-util.c (_gcry_pk_util_parse_flaglist, _gcry_pk_util_preparse_sigval): set PUBKEY_FLAG_GOST if gost flag is present in s-exp. * tests/benchmark.c (ecc_bench): also benchmark GOST signatures. * tests/basic.c (check_pubkey): add two public keys from GOST R 34.10-2012 standard. (check_pubkey_sign_ecdsa): add two data sets to check gost signatures. * tests/curves.c: correct N_CURVES as we now have 2 more curves. Signed-off-by: Dmitry Eremin-Solenikov Removed some comments from the new curve definitions in ecc-curves.c to avoid line wrapping. Eventually we will develop a precompiler to avoid parsing those hex strings. -wk diff --git a/cipher/ecc-curves.c b/cipher/ecc-curves.c index 2cdb9b4..fb0db3b 100644 --- a/cipher/ecc-curves.c +++ b/cipher/ecc-curves.c @@ -267,6 +267,34 @@ static const ecc_domain_parms_t domain_parms[] = "0x7dde385d566332ecc0eabfa9cf7822fdf209f70024a57b1aa000c55b881f8111" "b2dcde494a5f485e5bca4bd88a2763aed1ca2b2fa8f0540678cd1e0f3ad80892" }, + { + "GOST2001-test", 256, 0, + MPI_EC_WEIERSTRASS, ECC_DIALECT_STANDARD, + "0x8000000000000000000000000000000000000000000000000000000000000431", + "0x0000000000000000000000000000000000000000000000000000000000000007", + "0x5fbff498aa938ce739b8e022fbafef40563f6e6a3472fc2a514c0ce9dae23b7e", + "0x8000000000000000000000000000000150fe8a1892976154c59cfc193accf5b3", + + "0x0000000000000000000000000000000000000000000000000000000000000002", + "0x08e2a8a0e65147d4bd6316030e16d19c85c97f0a9ca267122b96abbcea7e8fc8", + }, + + { + "GOST2012-test", 511, 0, + MPI_EC_WEIERSTRASS, ECC_DIALECT_STANDARD, + "0x4531acd1fe0023c7550d267b6b2fee80922b14b2ffb90f04d4eb7c09b5d2d15d" + "f1d852741af4704a0458047e80e4546d35b8336fac224dd81664bbf528be6373", + "0x0000000000000000000000000000000000000000000000000000000000000007", + "0x1cff0806a31116da29d8cfa54e57eb748bc5f377e49400fdd788b649eca1ac4" + "361834013b2ad7322480a89ca58e0cf74bc9e540c2add6897fad0a3084f302adc", + 
"0x4531acd1fe0023c7550d267b6b2fee80922b14b2ffb90f04d4eb7c09b5d2d15d" + "a82f2d7ecb1dbac719905c5eecc423f1d86e25edbe23c595d644aaf187e6e6df", + + "0x24d19cc64572ee30f396bf6ebbfd7a6c5213b3b3d7057cc825f91093a68cd762" + "fd60611262cd838dc6b60aa7eee804e28bc849977fac33b4b530f1b120248a9a", + "0x2bb312a43bd2ce6e0d020613c857acddcfbf061e91e5f2c3f32447c259f39b2" + "c83ab156d77f1496bf7eb3351e1ee4e43dc1a18b91b24640b6dbb92cb1add371e", + }, { NULL, 0, 0, 0, 0, NULL, NULL, NULL, NULL } }; diff --git a/cipher/ecc.c b/cipher/ecc.c index 1323d00..8b61ae4 100644 --- a/cipher/ecc.c +++ b/cipher/ecc.c @@ -71,6 +71,7 @@ static const char *ecc_names[] = "ecdsa", "ecdh", "eddsa", + "gost", NULL, }; @@ -575,6 +576,203 @@ verify_ecdsa (gcry_mpi_t input, ECC_public_key *pkey, return err; } +/* Compute an GOST R 34.10-01/-12 signature. + * Return the signature struct (r,s) from the message hash. The caller + * must have allocated R and S. + */ +static gpg_err_code_t +sign_gost (gcry_mpi_t input, ECC_secret_key *skey, gcry_mpi_t r, gcry_mpi_t s) +{ + gpg_err_code_t err = 0; + gcry_mpi_t k, dr, sum, ke, x, e; + mpi_point_struct I; + gcry_mpi_t hash; + const void *abuf; + unsigned int abits, qbits; + mpi_ec_t ctx; + + if (DBG_CIPHER) + log_mpidump ("gost sign hash ", input ); + + qbits = mpi_get_nbits (skey->E.n); + + /* Convert the INPUT into an MPI if needed. 
*/ + if (mpi_is_opaque (input)) + { + abuf = gcry_mpi_get_opaque (input, &abits); + err = gpg_err_code (gcry_mpi_scan (&hash, GCRYMPI_FMT_USG, + abuf, (abits+7)/8, NULL)); + if (err) + return err; + if (abits > qbits) + gcry_mpi_rshift (hash, hash, abits - qbits); + } + else + hash = input; + + + k = NULL; + dr = mpi_alloc (0); + sum = mpi_alloc (0); + ke = mpi_alloc (0); + e = mpi_alloc (0); + x = mpi_alloc (0); + point_init (&I); + + ctx = _gcry_mpi_ec_p_internal_new (skey->E.model, skey->E.dialect, + skey->E.p, skey->E.a, skey->E.b); + + mpi_mod (e, input, skey->E.n); /* e = hash mod n */ + + if (!mpi_cmp_ui (e, 0)) + mpi_set_ui (e, 1); + + /* Two loops to avoid R or S are zero. This is more of a joke than + a real demand because the probability of them being zero is less + than any hardware failure. Some specs however require it. */ + do + { + do + { + mpi_free (k); + k = _gcry_dsa_gen_k (skey->E.n, GCRY_STRONG_RANDOM); + + _gcry_mpi_ec_mul_point (&I, k, &skey->E.G, ctx); + if (_gcry_mpi_ec_get_affine (x, NULL, &I, ctx)) + { + if (DBG_CIPHER) + log_debug ("ecc sign: Failed to get affine coordinates\n"); + err = GPG_ERR_BAD_SIGNATURE; + goto leave; + } + mpi_mod (r, x, skey->E.n); /* r = x mod n */ + } + while (!mpi_cmp_ui (r, 0)); + mpi_mulm (dr, skey->d, r, skey->E.n); /* dr = d*r mod n */ + mpi_mulm (ke, k, e, skey->E.n); /* ke = k*e mod n */ + mpi_addm (s, ke, dr, skey->E.n); /* sum = (k*e+ d*r) mod n */ + } + while (!mpi_cmp_ui (s, 0)); + + if (DBG_CIPHER) + { + log_mpidump ("gost sign result r ", r); + log_mpidump ("gost sign result s ", s); + } + + leave: + _gcry_mpi_ec_free (ctx); + point_free (&I); + mpi_free (x); + mpi_free (e); + mpi_free (ke); + mpi_free (sum); + mpi_free (dr); + mpi_free (k); + + if (hash != input) + mpi_free (hash); + + return err; +} + +/* Verify a GOST R 34.10-01/-12 signature. + * Check if R and S verifies INPUT. 
+ */ +static gpg_err_code_t +verify_gost (gcry_mpi_t input, ECC_public_key *pkey, + gcry_mpi_t r, gcry_mpi_t s) +{ + gpg_err_code_t err = 0; + gcry_mpi_t e, x, z1, z2, v, rv, zero; + mpi_point_struct Q, Q1, Q2; + mpi_ec_t ctx; + + if( !(mpi_cmp_ui (r, 0) > 0 && mpi_cmp (r, pkey->E.n) < 0) ) + return GPG_ERR_BAD_SIGNATURE; /* Assertion 0 < r < n failed. */ + if( !(mpi_cmp_ui (s, 0) > 0 && mpi_cmp (s, pkey->E.n) < 0) ) + return GPG_ERR_BAD_SIGNATURE; /* Assertion 0 < s < n failed. */ + + x = mpi_alloc (0); + e = mpi_alloc (0); + z1 = mpi_alloc (0); + z2 = mpi_alloc (0); + v = mpi_alloc (0); + rv = mpi_alloc (0); + zero = mpi_alloc (0); + + point_init (&Q); + point_init (&Q1); + point_init (&Q2); + + ctx = _gcry_mpi_ec_p_internal_new (pkey->E.model, pkey->E.dialect, + pkey->E.p, pkey->E.a, pkey->E.b); + + mpi_mod (e, input, pkey->E.n); /* e = hash mod n */ + if (!mpi_cmp_ui (e, 0)) + mpi_set_ui (e, 1); + mpi_invm (v, e, pkey->E.n); /* v = e^(-1) (mod n) */ + mpi_mulm (z1, s, v, pkey->E.n); /* z1 = s*v (mod n) */ + mpi_mulm (rv, r, v, pkey->E.n); /* rv = s*v (mod n) */ + mpi_subm (z2, zero, rv, pkey->E.n); /* z2 = -r*v (mod n) */ + + _gcry_mpi_ec_mul_point (&Q1, z1, &pkey->E.G, ctx); +/* log_mpidump ("Q1.x", Q1.x); */ +/* log_mpidump ("Q1.y", Q1.y); */ +/* log_mpidump ("Q1.z", Q1.z); */ + _gcry_mpi_ec_mul_point (&Q2, z2, &pkey->Q, ctx); +/* log_mpidump ("Q2.x", Q2.x); */ +/* log_mpidump ("Q2.y", Q2.y); */ +/* log_mpidump ("Q2.z", Q2.z); */ + _gcry_mpi_ec_add_points (&Q, &Q1, &Q2, ctx); +/* log_mpidump (" Q.x", Q.x); */ +/* log_mpidump (" Q.y", Q.y); */ +/* log_mpidump (" Q.z", Q.z); */ + + if (!mpi_cmp_ui (Q.z, 0)) + { + if (DBG_CIPHER) + log_debug ("ecc verify: Rejected\n"); + err = GPG_ERR_BAD_SIGNATURE; + goto leave; + } + if (_gcry_mpi_ec_get_affine (x, NULL, &Q, ctx)) + { + if (DBG_CIPHER) + log_debug ("ecc verify: Failed to get affine coordinates\n"); + err = GPG_ERR_BAD_SIGNATURE; + goto leave; + } + mpi_mod (x, x, pkey->E.n); /* x = x mod E_n */ + if (mpi_cmp 
(x, r)) /* x != r */ + { + if (DBG_CIPHER) + { + log_mpidump (" x", x); + log_mpidump (" r", r); + log_mpidump (" s", s); + log_debug ("ecc verify: Not verified\n"); + } + err = GPG_ERR_BAD_SIGNATURE; + goto leave; + } + if (DBG_CIPHER) + log_debug ("ecc verify: Accepted\n"); + + leave: + _gcry_mpi_ec_free (ctx); + point_free (&Q2); + point_free (&Q1); + point_free (&Q); + mpi_free (zero); + mpi_free (rv); + mpi_free (v); + mpi_free (z2); + mpi_free (z1); + mpi_free (x); + mpi_free (e); + return err; +} static void @@ -1623,6 +1821,13 @@ ecc_sign (gcry_sexp_t *r_sig, gcry_sexp_t s_data, gcry_sexp_t keyparms) rc = gcry_sexp_build (r_sig, NULL, "(sig-val(eddsa(r%M)(s%M)))", sig_r, sig_s); } + else if ((ctx.flags & PUBKEY_FLAG_GOST)) + { + rc = sign_gost (data, &sk, sig_r, sig_s); + if (!rc) + rc = gcry_sexp_build (r_sig, NULL, + "(sig-val(gost(r%M)(s%M)))", sig_r, sig_s); + } else { rc = sign_ecdsa (data, &sk, sig_r, sig_s, ctx.flags, ctx.hash_algo); @@ -1773,6 +1978,15 @@ ecc_verify (gcry_sexp_t s_sig, gcry_sexp_t s_data, gcry_sexp_t s_keyparms) { rc = verify_eddsa (data, &pk, sig_r, sig_s, ctx.hash_algo, mpi_q); } + else if ((sigflags & PUBKEY_FLAG_GOST)) + { + point_init (&pk.Q); + rc = _gcry_ecc_os2ec (&pk.Q, mpi_q); + if (rc) + goto leave; + + rc = verify_gost (data, &pk, sig_r, sig_s); + } else { point_init (&pk.Q); diff --git a/cipher/pubkey-util.c b/cipher/pubkey-util.c index 0b90054..0db5840 100644 --- a/cipher/pubkey-util.c +++ b/cipher/pubkey-util.c @@ -79,6 +79,11 @@ _gcry_pk_util_parse_flaglist (gcry_sexp_t list, { flags |= PUBKEY_FLAG_ECDSA; } + else if (n == 4 && !memcmp (s, "gost", 4)) + { + encoding = PUBKEY_ENC_RAW; + flags |= PUBKEY_FLAG_GOST; + } else if (n == 3 && !memcmp (s, "raw", 3) && encoding == PUBKEY_ENC_UNKNOWN) { @@ -347,6 +352,8 @@ _gcry_pk_util_preparse_sigval (gcry_sexp_t s_sig, const char **algo_names, { if (!strcmp (name, "eddsa")) *r_eccflags = PUBKEY_FLAG_EDDSA; + if (!strcmp (name, "gost")) + *r_eccflags = PUBKEY_FLAG_GOST; } 
*r_parms = l2; diff --git a/src/cipher.h b/src/cipher.h index 077af98..20818ba 100644 --- a/src/cipher.h +++ b/src/cipher.h @@ -37,6 +37,7 @@ #define PUBKEY_FLAG_USE_FIPS186_2 (1 << 8) #define PUBKEY_FLAG_ECDSA (1 << 9) #define PUBKEY_FLAG_EDDSA (1 << 10) +#define PUBKEY_FLAG_GOST (1 << 11) enum pk_operation diff --git a/tests/basic.c b/tests/basic.c index 991309a..1d6e637 100644 --- a/tests/basic.c +++ b/tests/basic.c @@ -3439,6 +3439,30 @@ check_pubkey_sign_ecdsa (int n, gcry_sexp_t skey, gcry_sexp_t pkey) /* */ "000102030405060708090A0B0C0D0E0F#))", 0 }, + { 256, + "(data (flags gost)\n" + " (value #00112233445566778899AABBCCDDEEFF" + /* */ "000102030405060708090A0B0C0D0E0F#))", + 0, + "(data (flags gost)\n" + " (value #80112233445566778899AABBCCDDEEFF" + /* */ "000102030405060708090A0B0C0D0E0F#))", + 0 + }, + { 512, + "(data (flags gost)\n" + " (value #00112233445566778899AABBCCDDEEFF" + /* */ "000102030405060708090A0B0C0D0E0F" + /* */ "000102030405060708090A0B0C0D0E0F" + /* */ "000102030405060708090A0B0C0D0E0F#))", + 0, + "(data (flags gost)\n" + " (value #80112233445566778899AABBCCDDEEFF" + /* */ "000102030405060708090A0B0C0D0E0F" + /* */ "000102030405060708090A0B0C0D0E0F" + /* */ "000102030405060708090A0B0C0D0E0F#))", + 0 + }, { 0, NULL } }; @@ -4021,6 +4045,57 @@ check_pubkey (void) "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" } + }, + { /* GOST R 34.10-2001/2012 test 256 bit. 
*/ + GCRY_PK_ECDSA, FLAG_SIGN, + { + "(private-key\n" + " (ecc\n" + " (curve GOST2001-test)\n" + " (q #047F2B49E270DB6D90D8595BEC458B50C58585BA1D4E9B78" + " 8F6689DBD8E56FD80B26F1B489D6701DD185C8413A977B3C" + " BBAF64D1C593D26627DFFB101A87FF77DA#)\n" + " (d #7A929ADE789BB9BE10ED359DD39A72C11B60961F49397EEE" + " 1D19CE9891EC3B28#)))\n", + + "(public-key\n" + " (ecc\n" + " (curve GOST2001-test)\n" + " (q #047F2B49E270DB6D90D8595BEC458B50C58585BA1D4E9B78" + " 8F6689DBD8E56FD80B26F1B489D6701DD185C8413A977B3C" + " BBAF64D1C593D26627DFFB101A87FF77DA#)))\n", + + "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" + "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" } + }, + { /* GOST R 34.10-2012 test 512 bit. */ + GCRY_PK_ECDSA, FLAG_SIGN, + { + "(private-key\n" + " (ecc\n" + " (curve GOST2012-test)\n" + " (q #04115DC5BC96760C7B48598D8AB9E740D4C4A85A65BE33C1" + " 815B5C320C854621DD5A515856D13314AF69BC5B924C8B" + " 4DDFF75C45415C1D9DD9DD33612CD530EFE137C7C90CD4" + " 0B0F5621DC3AC1B751CFA0E2634FA0503B3D52639F5D7F" + " B72AFD61EA199441D943FFE7F0C70A2759A3CDB84C114E" + " 1F9339FDF27F35ECA93677BEEC#)\n" + " (d #0BA6048AADAE241BA40936D47756D7C93091A0E851466970" + " 0EE7508E508B102072E8123B2200A0563322DAD2827E2714" + " A2636B7BFD18AADFC62967821FA18DD4#)))\n", + + "(public-key\n" + " (ecc\n" + " (curve GOST2001-test)\n" + " (q #04115DC5BC96760C7B48598D8AB9E740D4C4A85A65BE33C1" + " 815B5C320C854621DD5A515856D13314AF69BC5B924C8B" + " 4DDFF75C45415C1D9DD9DD33612CD530EFE137C7C90CD4" + " 0B0F5621DC3AC1B751CFA0E2634FA0503B3D52639F5D7F" + " B72AFD61EA199441D943FFE7F0C70A2759A3CDB84C114E" + " 1F9339FDF27F35ECA93677BEEC#)))\n" + + "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" + "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" } } }; int i; diff --git a/tests/benchmark.c b/tests/benchmark.c index 5d1434a..ecda0d3 100644 --- a/tests/benchmark.c +++ b/tests/benchmark.c @@ -883,7 +883,8 @@ ecc_bench (int iterations, int print_header) { #if USE_ECC gpg_error_t err; - const char *p_sizes[] = { "192", "224", "256", 
"384", "521", "Ed25519" }; + const char *p_sizes[] = { "192", "224", "256", "384", "521", "Ed25519", + "gost256", "gost512" }; int testno; if (print_header) @@ -899,14 +900,22 @@ ecc_bench (int iterations, int print_header) int count; int p_size; int is_ed25519; + int is_gost; is_ed25519 = !strcmp (p_sizes[testno], "Ed25519"); + is_gost = !strncmp (p_sizes[testno], "gost", 4); if (is_ed25519) { p_size = 256; printf ("EdDSA Ed25519 "); fflush (stdout); } + else if (is_gost) + { + p_size = atoi (p_sizes[testno] + 4); + printf ("GOST %3d bit ", p_size); + fflush (stdout); + } else { p_size = atoi (p_sizes[testno]); @@ -917,6 +926,10 @@ ecc_bench (int iterations, int print_header) if (is_ed25519) err = gcry_sexp_build (&key_spec, NULL, "(genkey (ecdsa (curve \"Ed25519\")))"); + else if (is_gost) + err = gcry_sexp_build (&key_spec, NULL, + "(genkey (ecdsa (curve %s)))", + p_size == 256 ? "GOST2001-test" : "GOST2012-test"); else err = gcry_sexp_build (&key_spec, NULL, "(genkey (ECDSA (nbits %d)))", p_size); @@ -950,6 +963,8 @@ ecc_bench (int iterations, int print_header) err = gcry_sexp_build (&data, NULL, "(data (flags eddsa)(hash-algo sha512)" " (value %m))", x); + else if (is_gost) + err = gcry_sexp_build (&data, NULL, "(data (flags gost) (value %m))", x); else err = gcry_sexp_build (&data, NULL, "(data (flags raw) (value %m))", x); gcry_mpi_release (x); diff --git a/tests/curves.c b/tests/curves.c index 2c3ae53..198693e 100644 --- a/tests/curves.c +++ b/tests/curves.c @@ -29,7 +29,7 @@ #include "../src/gcrypt-int.h" /* Number of curves defined in ../cipger/ecc.c */ -#define N_CURVES 13 +#define N_CURVES 15 /* A real world sample public key. */ static char const sample_key_1[] = commit 187b2bb541b985255aee262d181434a7cb4ae2e7 Author: Dmitry Eremin-Solenikov Date: Tue Oct 15 23:56:43 2013 +0400 Fix 256-bit ecdsa test key definition * tests/basic.c (check_pubkey): fix nistp256 testing key declaration - add missing comma. 
Signed-off-by: Dmitry Eremin-Solenikov diff --git a/tests/basic.c b/tests/basic.c index ee04900..991309a 100644 --- a/tests/basic.c +++ b/tests/basic.c @@ -4017,7 +4017,7 @@ check_pubkey (void) " (curve nistp256)\n" " (q #04D4F6A6738D9B8D3A7075C1E4EE95015FC0C9B7E4272D2B" " EB6644D3609FC781B71F9A8072F58CB66AE2F89BB1245187" - " 3ABF7D91F9E1FBF96BF2F70E73AAC9A283#)))\n" + " 3ABF7D91F9E1FBF96BF2F70E73AAC9A283#)))\n", "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" } ----------------------------------------------------------------------- Summary of changes: cipher/ecc-curves.c | 28 +++++++ cipher/ecc.c | 214 ++++++++++++++++++++++++++++++++++++++++++++++++++ cipher/pubkey-util.c | 7 ++ src/cipher.h | 1 + tests/basic.c | 77 +++++++++++++++++- tests/benchmark.c | 17 +++- tests/curves.c | 2 +- 7 files changed, 343 insertions(+), 3 deletions(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From wk at gnupg.org Wed Oct 16 16:35:25 2013 From: wk at gnupg.org (Werner Koch) Date: Wed, 16 Oct 2013 16:35:25 +0200 Subject: [PATCH v2 2/2] Add support for GOST R 34.10-2001/-2012 signatures In-Reply-To: <1381867004-21231-3-git-send-email-dbaryshkov@gmail.com> (Dmitry Eremin-Solenikov's message of "Tue, 15 Oct 2013 23:56:44 +0400") References: <1381867004-21231-1-git-send-email-dbaryshkov@gmail.com> <1381867004-21231-3-git-send-email-dbaryshkov@gmail.com> Message-ID: <87ppr5jnbm.fsf@vigenere.g10code.de> Hi, Thanks for the patches. I just pushed them. > + "GOST2001-test", 256, 0, What is the reason that you used the "-test" suffix? Is there a standard name for this curve? I am fine with GOSTxxxx, though. Shalom-Salam, Werner p.s. "make check" currently often fails in keygen. This is due to a problem with Ed25519 with ECDSA - I will look into that. -- Die Gedanken sind frei.
Ausnahmen regelt ein Bundesgesetz. From dbaryshkov at gmail.com Wed Oct 16 18:13:12 2013 From: dbaryshkov at gmail.com (Dmitry Eremin-Solenikov) Date: Wed, 16 Oct 2013 20:13:12 +0400 Subject: [PATCH v2 2/2] Add support for GOST R 34.10-2001/-2012 signatures In-Reply-To: <87ppr5jnbm.fsf@vigenere.g10code.de> References: <1381867004-21231-1-git-send-email-dbaryshkov@gmail.com> <1381867004-21231-3-git-send-email-dbaryshkov@gmail.com> <87ppr5jnbm.fsf@vigenere.g10code.de> Message-ID: Hello, On Wed, Oct 16, 2013 at 6:35 PM, Werner Koch wrote: > Hi, > > Thanks for the patches. I just pushed them. > >> + "GOST2001-test", 256, 0, > > What is the reason that you used the "-test" suffix? Is there a standard > name for this curve? I am fine with GOSTxxxx, though. Because they are "test" curves defined by the GOST standards. RFC 4357 e.g. names the first curve 'id-GostR3410-2001-TestParamSet' and adds that 'Use of the test parameter sets [...] is NOT RECOMMENDED.' Unfortunately these parameter sets are the only parameters defined by the standard, and thus they are used to verify the implementation. > p.s. > "make check" currently often fails in keygen. This is due to a > problem with Ed25519 with ECDSA - I will look into that. And strangely enough it aborts in 50% of runs. Sometimes it does, sometimes it just outputs a note regarding the test key and exits normally. I failed to capture the problem either via gdb or via valgrind. -- With best wishes Dmitry From cvs at cvs.gnupg.org Wed Oct 16 20:22:16 2013 From: cvs at cvs.gnupg.org (by Jussi Kivilinna) Date: Wed, 16 Oct 2013 20:22:16 +0200 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-313-gc89ab92 Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library".
The branch, master has been updated via c89ab921ccfaefe6c4f6a724d01e0df41a1a381f (commit) from 83902f1f1dbc8263a0c3f61be59cd2eb95293c97 (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit c89ab921ccfaefe6c4f6a724d01e0df41a1a381f Author: Jussi Kivilinna Date: Wed Oct 16 21:16:15 2013 +0300 Avoid void* pointer arithmetic * tests/tsexp.c (check_extract_param): Cast void* pointers to char* before doing arithmetic. -- GCC was complaining: tsexp.c: In function 'check_extract_param': tsexp.c:938:44: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith] tsexp.c:944:46: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith] tsexp.c:955:44: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith] tsexp.c:961:46: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith] Signed-off-by: Jussi Kivilinna diff --git a/tests/tsexp.c b/tests/tsexp.c index 8a6b912..afa79ff 100644 --- a/tests/tsexp.c +++ b/tests/tsexp.c @@ -935,13 +935,14 @@ check_extract_param (void) fail ("gcry_sexp_extract_param/desc failed: A off changed"); else if (ioarray[1].len != 1) fail ("gcry_sexp_extract_param/desc failed: A has wrong length"); - else if (cmp_bufhex (ioarray[1].data + ioarray[1].off, ioarray[1].len, - sample1_a)) + else if (cmp_bufhex ((char *)ioarray[1].data + ioarray[1].off, + ioarray[1].len, sample1_a)) { fail ("gcry_sexp_extract_param/desc failed: A mismatch"); gcry_log_debug ("expected: %s\n", sample1_a); gcry_log_debughex (" got", - ioarray[1].data + ioarray[1].off, ioarray[1].len); + (char *)ioarray[1].data + ioarray[1].off, + ioarray[1].len); } if (!ioarray[2].data) @@ -952,13 +953,14 @@ check_extract_param (void) fail ("gcry_sexp_extract_param/desc failed: B off changed"); else if (ioarray[2].len != 32) fail
("gcry_sexp_extract_param/desc failed: B has wrong length"); - else if (cmp_bufhex (ioarray[2].data + ioarray[2].off, ioarray[2].len, - sample1_b)) + else if (cmp_bufhex ((char *)ioarray[2].data + ioarray[2].off, + ioarray[2].len, sample1_b)) { fail ("gcry_sexp_extract_param/desc failed: B mismatch"); gcry_log_debug ("expected: %s\n", sample1_b); gcry_log_debughex (" got", - ioarray[2].data + ioarray[2].off, ioarray[2].len); + (char *)ioarray[2].data + ioarray[2].off, + ioarray[2].len); } xfree (ioarray[0].data); ----------------------------------------------------------------------- Summary of changes: tests/tsexp.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From cvs at cvs.gnupg.org Wed Oct 16 21:40:36 2013 From: cvs at cvs.gnupg.org (by Jussi Kivilinna) Date: Wed, 16 Oct 2013 21:40:36 +0200 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-314-gf9371c0 Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via f9371c026aad09ff48746d22c8333746c886e773 (commit) from c89ab921ccfaefe6c4f6a724d01e0df41a1a381f (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit f9371c026aad09ff48746d22c8333746c886e773 Author: Jussi Kivilinna Date: Wed Oct 16 21:23:15 2013 +0300 arcfour: more optimized version for non-i386 architectures * cipher/arcfour.c (ARCFOUR_context): Reorder members. (do_encrypt_stream) [!__i386__]: Faster implementation for non-i386. 
(do_arcfour_setkey): Avoid modulo operations. -- Patch adds faster arcfour implementation for non-i386 architectures. New code is not activated on i386 as performance would regress. This is because i386 does not have enough registers to hold the new variables. Speed up on Intel i5-4570 (x86_64): 1.56x Speed up on ARM Cortex-A8: 1.18x Signed-off-by: Jussi Kivilinna diff --git a/cipher/arcfour.c b/cipher/arcfour.c index dc32b07..e8a5484 100644 --- a/cipher/arcfour.c +++ b/cipher/arcfour.c @@ -34,14 +34,39 @@ static const char *selftest(void); typedef struct { - int idx_i, idx_j; byte sbox[256]; + int idx_i, idx_j; } ARCFOUR_context; static void do_encrypt_stream( ARCFOUR_context *ctx, byte *outbuf, const byte *inbuf, unsigned int length ) { +#ifndef __i386__ + register unsigned int i = ctx->idx_i; + register byte j = ctx->idx_j; + register byte *sbox = ctx->sbox; + register byte t, u; + + while ( length-- ) + { + i++; + t = sbox[(byte)i]; + j += t; + u = sbox[j]; + sbox[(byte)i] = u; + u += t; + sbox[j] = t; + *outbuf++ = sbox[u] ^ *inbuf++; + } + + ctx->idx_i = (byte)i; + ctx->idx_j = (byte)j; +#else /*__i386__*/ + /* Old implementation of arcfour is faster on i386 than the version above. + * This is because version above increases register pressure which on i386 + * would push some of the variables to memory/stack. Therefore keep this + * version for i386 to avoid regressing performance.
*/ register int i = ctx->idx_i; register int j = ctx->idx_j; register byte *sbox = ctx->sbox; @@ -59,6 +84,7 @@ do_encrypt_stream( ARCFOUR_context *ctx, ctx->idx_i = i; ctx->idx_j = j; +#endif } static void @@ -96,17 +122,21 @@ do_arcfour_setkey (void *context, const byte *key, unsigned int keylen) ctx->idx_i = ctx->idx_j = 0; for (i=0; i < 256; i++ ) ctx->sbox[i] = i; - for (i=0; i < 256; i++ ) - karr[i] = key[i%keylen]; + for (i=j=0; i < 256; i++,j++ ) + { + if (j >= keylen) + j = 0; + karr[i] = key[j]; + } for (i=j=0; i < 256; i++ ) { int t; - j = (j + ctx->sbox[i] + karr[i]) % 256; + j = (j + ctx->sbox[i] + karr[i]) & 255; t = ctx->sbox[i]; ctx->sbox[i] = ctx->sbox[j]; ctx->sbox[j] = t; } - memset( karr, 0, 256 ); + wipememory( karr, sizeof(karr) ); return GPG_ERR_NO_ERROR; } ----------------------------------------------------------------------- Summary of changes: cipher/arcfour.c | 40 +++++++++++++++++++++++++++++++++++----- 1 file changed, 35 insertions(+), 5 deletions(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From cvs at cvs.gnupg.org Wed Oct 16 16:23:13 2013 From: cvs at cvs.gnupg.org (by Werner Koch) Date: Wed, 16 Oct 2013 16:23:13 +0200 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-310-ga329b6a Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via a329b6abf00c990faf1986f9fbad7b4d71c13bcb (commit) from 45aa6131e93fac89d46733b3436d960f35fb99b2 (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. 
- Log ----------------------------------------------------------------- commit a329b6abf00c990faf1986f9fbad7b4d71c13bcb Author: Werner Koch Date: Wed Oct 16 16:20:56 2013 +0200 sexp: Add function gcry_sexp_extract_param. * src/gcrypt.h.in (_GCRY_GCC_ATTR_SENTINEL): New. (gcry_sexp_extract_param): New. * src/visibility.c (gcry_sexp_extract_param): New. * src/visibility.h (gcry_sexp_extract_param): Add hack to detect internal use. * cipher/pubkey-util.c (_gcry_pk_util_extract_mpis): Move and split into ... * src/sexp.c (_gcry_sexp_vextract_param) (_gcry_sexp_extract_param): this. Change all callers. Add support for buffer descriptors and a path option. * tests/tsexp.c (die, hex2buffer, hex2mpi, hex2mpiopa): New. (cmp_mpihex, cmp_bufhex): New. (check_extract_param): New. Signed-off-by: Werner Koch diff --git a/NEWS b/NEWS index ab326eb..d60e067 100644 --- a/NEWS +++ b/NEWS @@ -35,13 +35,13 @@ Noteworthy changes in version 1.6.0 (unreleased) * Added a scatter gather hash convenience function. - * Added several MPI helper functions. + * Added several MPI amd SEXP helper functions. * Added support for negative numbers to gcry_mpi_print, gcry_mpi_aprint and gcry_mpi_scan. * The algorithm ids GCRY_PK_ECDSA and GCRY_PK_ECDH are now - deprecated. Use GCRY_PK_ECC instead. + deprecated. Use GCRY_PK_ECC if you need an algorithm id. * Interface changes relative to the 1.5.0 release: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -108,6 +108,7 @@ Noteworthy changes in version 1.6.0 (unreleased) GCRYCTL_DISABLE_PRIV_DROP NEW. GCRY_CIPHER_SALSA20 NEW. gcry_sexp_nth_buffer NEW. + gcry_sexp_extract_param NEW. GCRY_CIPHER_SALSA20R12 NEW. GCRY_CIPHER_GOST28147 NEW. GCRY_MD_GOSTR3411_94 NEW.
diff --git a/cipher/dsa.c b/cipher/dsa.c index e43bdf4..45ad97a 100644 --- a/cipher/dsa.c +++ b/cipher/dsa.c @@ -956,9 +956,9 @@ dsa_check_secret_key (gcry_sexp_t keyparms) gcry_err_code_t rc; DSA_secret_key sk = {NULL, NULL, NULL, NULL, NULL}; - rc = _gcry_pk_util_extract_mpis (keyparms, "pqgyx", - &sk.p, &sk.q, &sk.g, &sk.y, &sk.x, - NULL); + rc = _gcry_sexp_extract_param (keyparms, NULL, "pqgyx", + &sk.p, &sk.q, &sk.g, &sk.y, &sk.x, + NULL); if (rc) goto leave; @@ -998,8 +998,8 @@ dsa_sign (gcry_sexp_t *r_sig, gcry_sexp_t s_data, gcry_sexp_t keyparms) log_mpidump ("dsa_sign data", data); /* Extract the key. */ - rc = _gcry_pk_util_extract_mpis (keyparms, "pqgyx", - &sk.p, &sk.q, &sk.g, &sk.y, &sk.x, NULL); + rc = _gcry_sexp_extract_param (keyparms, NULL, "pqgyx", + &sk.p, &sk.q, &sk.g, &sk.y, &sk.x, NULL); if (rc) goto leave; if (DBG_CIPHER) @@ -1065,7 +1065,7 @@ dsa_verify (gcry_sexp_t s_sig, gcry_sexp_t s_data, gcry_sexp_t s_keyparms) rc = _gcry_pk_util_preparse_sigval (s_sig, dsa_names, &l1, NULL); if (rc) goto leave; - rc = _gcry_pk_util_extract_mpis (l1, "rs", &sig_r, &sig_s, NULL); + rc = _gcry_sexp_extract_param (l1, NULL, "rs", &sig_r, &sig_s, NULL); if (rc) goto leave; if (DBG_CIPHER) @@ -1075,8 +1075,8 @@ dsa_verify (gcry_sexp_t s_sig, gcry_sexp_t s_data, gcry_sexp_t s_keyparms) } /* Extract the key. */ - rc = _gcry_pk_util_extract_mpis (s_keyparms, "pqgy", - &pk.p, &pk.q, &pk.g, &pk.y, NULL); + rc = _gcry_sexp_extract_param (s_keyparms, NULL, "pqgy", + &pk.p, &pk.q, &pk.g, &pk.y, NULL); if (rc) goto leave; if (DBG_CIPHER) diff --git a/cipher/ecc-curves.c b/cipher/ecc-curves.c index 53433a2..2cdb9b4 100644 --- a/cipher/ecc-curves.c +++ b/cipher/ecc-curves.c @@ -436,9 +436,9 @@ _gcry_ecc_get_curve (gcry_sexp_t keyparms, int iterator, unsigned int *r_nbits) /* * Extract the curve parameters.. 
*/ - if (_gcry_pk_util_extract_mpis (keyparms, "-pabgn", - &E.p, &E.a, &E.b, &mpi_g, &E.n, - NULL)) + if (_gcry_sexp_extract_param (keyparms, NULL, "-pabgn", + &E.p, &E.a, &E.b, &mpi_g, &E.n, + NULL)) goto leave; if (mpi_g) { diff --git a/cipher/ecc-misc.c b/cipher/ecc-misc.c index d89971f..fa0bded 100644 --- a/cipher/ecc-misc.c +++ b/cipher/ecc-misc.c @@ -55,6 +55,7 @@ _gcry_ecc_curve_copy (elliptic_curve_t E) R.model = E.model; R.dialect = E.dialect; + R.name = E.name; R.p = mpi_copy (E.p); R.a = mpi_copy (E.a); R.b = mpi_copy (E.b); diff --git a/cipher/ecc.c b/cipher/ecc.c index 3b75fea..1323d00 100644 --- a/cipher/ecc.c +++ b/cipher/ecc.c @@ -1435,9 +1435,9 @@ ecc_check_secret_key (gcry_sexp_t keyparms) /* * Extract the key. */ - rc = _gcry_pk_util_extract_mpis (keyparms, "-p?a?b?g?n?/q?+d", - &sk.E.p, &sk.E.a, &sk.E.b, &mpi_g, &sk.E.n, - &mpi_q, &sk.d, NULL); + rc = _gcry_sexp_extract_param (keyparms, NULL, "-p?a?b?g?n?/q?+d", + &sk.E.p, &sk.E.a, &sk.E.b, &mpi_g, &sk.E.n, + &mpi_q, &sk.d, NULL); if (rc) goto leave; if (mpi_g) @@ -1552,9 +1552,9 @@ ecc_sign (gcry_sexp_t *r_sig, gcry_sexp_t s_data, gcry_sexp_t keyparms) /* * Extract the key. */ - rc = _gcry_pk_util_extract_mpis (keyparms, "-p?a?b?g?n?/q?+d", - &sk.E.p, &sk.E.a, &sk.E.b, &mpi_g, &sk.E.n, - &mpi_q, &sk.d, NULL); + rc = _gcry_sexp_extract_param (keyparms, NULL, "-p?a?b?g?n?/q?+d", + &sk.E.p, &sk.E.a, &sk.E.b, &mpi_g, &sk.E.n, + &mpi_q, &sk.d, NULL); if (rc) goto leave; if (mpi_g) @@ -1686,9 +1686,9 @@ ecc_verify (gcry_sexp_t s_sig, gcry_sexp_t s_data, gcry_sexp_t s_keyparms) rc = _gcry_pk_util_preparse_sigval (s_sig, ecc_names, &l1, &sigflags); if (rc) goto leave; - rc = _gcry_pk_util_extract_mpis (l1, - (sigflags & PUBKEY_FLAG_EDDSA)? "/rs":"rs", - &sig_r, &sig_s, NULL); + rc = _gcry_sexp_extract_param (l1, NULL, + (sigflags & PUBKEY_FLAG_EDDSA)? 
"/rs":"rs", + &sig_r, &sig_s, NULL); if (rc) goto leave; if (DBG_CIPHER) @@ -1706,9 +1706,9 @@ ecc_verify (gcry_sexp_t s_sig, gcry_sexp_t s_data, gcry_sexp_t s_keyparms) /* * Extract the key. */ - rc = _gcry_pk_util_extract_mpis (s_keyparms, "-p?a?b?g?n?/q?", - &pk.E.p, &pk.E.a, &pk.E.b, &mpi_g, &pk.E.n, - &mpi_q, NULL); + rc = _gcry_sexp_extract_param (s_keyparms, NULL, "-p?a?b?g?n?/q?", + &pk.E.p, &pk.E.a, &pk.E.b, &mpi_g, &pk.E.n, + &mpi_q, NULL); if (rc) goto leave; if (mpi_g) @@ -1890,9 +1890,9 @@ ecc_encrypt_raw (gcry_sexp_t *r_ciph, gcry_sexp_t s_data, gcry_sexp_t keyparms) /* * Extract the key. */ - rc = _gcry_pk_util_extract_mpis (keyparms, "-p?a?b?g?n?+q", - &pk.E.p, &pk.E.a, &pk.E.b, &mpi_g, &pk.E.n, - &mpi_q, NULL); + rc = _gcry_sexp_extract_param (keyparms, NULL, "-p?a?b?g?n?+q", + &pk.E.p, &pk.E.a, &pk.E.b, &mpi_g, &pk.E.n, + &mpi_q, NULL); if (rc) goto leave; if (mpi_g) @@ -2044,7 +2044,7 @@ ecc_decrypt_raw (gcry_sexp_t *r_plain, gcry_sexp_t s_data, gcry_sexp_t keyparms) rc = _gcry_pk_util_preparse_encval (s_data, ecc_names, &l1, &ctx); if (rc) goto leave; - rc = _gcry_pk_util_extract_mpis (l1, "e", &data_e, NULL); + rc = _gcry_sexp_extract_param (l1, NULL, "e", &data_e, NULL); if (rc) goto leave; if (DBG_CIPHER) @@ -2058,9 +2058,9 @@ ecc_decrypt_raw (gcry_sexp_t *r_plain, gcry_sexp_t s_data, gcry_sexp_t keyparms) /* * Extract the key. 
*/ - rc = _gcry_pk_util_extract_mpis (keyparms, "-p?a?b?g?n?+d", - &sk.E.p, &sk.E.a, &sk.E.b, &mpi_g, &sk.E.n, - &sk.d, NULL); + rc = _gcry_sexp_extract_param (keyparms, NULL, "-p?a?b?g?n?+d", + &sk.E.p, &sk.E.a, &sk.E.b, &mpi_g, &sk.E.n, + &sk.d, NULL); if (rc) goto leave; if (mpi_g) diff --git a/cipher/elgamal.c b/cipher/elgamal.c index 691e122..432ba6f 100644 --- a/cipher/elgamal.c +++ b/cipher/elgamal.c @@ -735,9 +735,9 @@ elg_check_secret_key (gcry_sexp_t keyparms) gcry_err_code_t rc; ELG_secret_key sk = {NULL, NULL, NULL, NULL}; - rc = _gcry_pk_util_extract_mpis (keyparms, "pgyx", - &sk.p, &sk.g, &sk.y, &sk.x, - NULL); + rc = _gcry_sexp_extract_param (keyparms, NULL, "pgyx", + &sk.p, &sk.g, &sk.y, &sk.x, + NULL); if (rc) goto leave; @@ -781,7 +781,8 @@ elg_encrypt (gcry_sexp_t *r_ciph, gcry_sexp_t s_data, gcry_sexp_t keyparms) } /* Extract the key. */ - rc = _gcry_pk_util_extract_mpis (keyparms, "pgy", &pk.p, &pk.g, &pk.y, NULL); + rc = _gcry_sexp_extract_param (keyparms, NULL, "pgy", + &pk.p, &pk.g, &pk.y, NULL); if (rc) goto leave; if (DBG_CIPHER) @@ -831,7 +832,7 @@ elg_decrypt (gcry_sexp_t *r_plain, gcry_sexp_t s_data, gcry_sexp_t keyparms) rc = _gcry_pk_util_preparse_encval (s_data, elg_names, &l1, &ctx); if (rc) goto leave; - rc = _gcry_pk_util_extract_mpis (l1, "ab", &data_a, &data_b, NULL); + rc = _gcry_sexp_extract_param (l1, NULL, "ab", &data_a, &data_b, NULL); if (rc) goto leave; if (DBG_CIPHER) @@ -846,9 +847,9 @@ elg_decrypt (gcry_sexp_t *r_plain, gcry_sexp_t s_data, gcry_sexp_t keyparms) } /* Extract the key. */ - rc = _gcry_pk_util_extract_mpis (keyparms, "pgyx", - &sk.p, &sk.g, &sk.y, &sk.x, - NULL); + rc = _gcry_sexp_extract_param (keyparms, NULL, "pgyx", + &sk.p, &sk.g, &sk.y, &sk.x, + NULL); if (rc) goto leave; if (DBG_CIPHER) @@ -940,8 +941,8 @@ elg_sign (gcry_sexp_t *r_sig, gcry_sexp_t s_data, gcry_sexp_t keyparms) } /* Extract the key. 
*/ - rc = _gcry_pk_util_extract_mpis (keyparms, "pgyx", - &sk.p, &sk.g, &sk.y, &sk.x, NULL); + rc = _gcry_sexp_extract_param (keyparms, NULL, "pgyx", + &sk.p, &sk.g, &sk.y, &sk.x, NULL); if (rc) goto leave; if (DBG_CIPHER) @@ -1008,7 +1009,7 @@ elg_verify (gcry_sexp_t s_sig, gcry_sexp_t s_data, gcry_sexp_t s_keyparms) rc = _gcry_pk_util_preparse_sigval (s_sig, elg_names, &l1, NULL); if (rc) goto leave; - rc = _gcry_pk_util_extract_mpis (l1, "rs", &sig_r, &sig_s, NULL); + rc = _gcry_sexp_extract_param (l1, NULL, "rs", &sig_r, &sig_s, NULL); if (rc) goto leave; if (DBG_CIPHER) @@ -1018,8 +1019,8 @@ elg_verify (gcry_sexp_t s_sig, gcry_sexp_t s_data, gcry_sexp_t s_keyparms) } /* Extract the key. */ - rc = _gcry_pk_util_extract_mpis (s_keyparms, "pgy", - &pk.p, &pk.g, &pk.y, NULL); + rc = _gcry_sexp_extract_param (s_keyparms, NULL, "pgy", + &pk.p, &pk.g, &pk.y, NULL); if (rc) goto leave; if (DBG_CIPHER) diff --git a/cipher/pubkey-internal.h b/cipher/pubkey-internal.h index cb2721d..db1399d 100644 --- a/cipher/pubkey-internal.h +++ b/cipher/pubkey-internal.h @@ -28,9 +28,6 @@ gpg_err_code_t _gcry_pk_util_get_nbits (gcry_sexp_t list, unsigned int *r_nbits); gpg_err_code_t _gcry_pk_util_get_rsa_use_e (gcry_sexp_t list, unsigned long *r_e); -gpg_err_code_t _gcry_pk_util_extract_mpis (gcry_sexp_t sexp, - const char *list, ...) - GCC_ATTR_SENTINEL(0); gpg_err_code_t _gcry_pk_util_preparse_sigval (gcry_sexp_t s_sig, const char **algo_names, gcry_sexp_t *r_parms, diff --git a/cipher/pubkey-util.c b/cipher/pubkey-util.c index caf715e..0b90054 100644 --- a/cipher/pubkey-util.c +++ b/cipher/pubkey-util.c @@ -273,119 +273,6 @@ _gcry_pk_util_get_rsa_use_e (gcry_sexp_t list, unsigned long *r_e) } -/* Extract MPIs from an s-expression using a list of one letter - * parameters. The names of these parameters are given by the string - * LIST. Some special characters may be given to control the - * conversion: - * - * + :: Switch to unsigned integer format (default). 
- * - :: Switch to standard signed format. - * / :: Switch to opaque format. - * ? :: The previous parameter is optional. - * - * For each parameter name a pointer to an MPI variable is expected - * and finally a NULL is expected. Example: - * - * _gcry_pk_util_extract_mpis (key, "n/x+ed", &mpi_n, &mpi_x, &mpi_e, NULL) - * - * This stores the parameter "N" from KEY as an unsigned MPI into - * MPI_N, the parameter "X" as an opaque MPI into MPI_X, and the - * parameter "E" again as an unsigned MPI into MPI_E. - * - * The function returns NULL on success. On error an error code is - * returned and the passed MPIs are either unchanged or set to NULL. - */ -gpg_err_code_t -_gcry_pk_util_extract_mpis (gcry_sexp_t sexp, const char *list, ...) -{ - va_list arg_ptr; - const char *s; - gcry_mpi_t *array[10]; - int idx; - gcry_sexp_t l1; - enum gcry_mpi_format mpifmt = GCRYMPI_FMT_USG; - - /* First copy all the args into an array. This is required so that - we are able to release already allocated MPIs if later an error - was found. */ - va_start (arg_ptr, list) ; - for (s=list, idx=0; *s && idx < DIM (array); s++) - { - if (*s == '+' || *s == '-' || *s == '/' || *s == '?') - ; - else - { - array[idx] = va_arg (arg_ptr, gcry_mpi_t *); - if (!array[idx]) - { - va_end (arg_ptr); - return GPG_ERR_INTERNAL; /* NULL pointer given. */ - } - idx++; - } - } - if (*s) - { - va_end (arg_ptr); - return GPG_ERR_INTERNAL; /* Too many list elements. */ - } - if (va_arg (arg_ptr, gcry_mpi_t *)) - { - va_end (arg_ptr); - return GPG_ERR_INTERNAL; /* Not enough list elemends. */ - } - va_end (arg_ptr); - - /* Now extract all parameters. */ - for (s=list, idx=0; *s; s++) - { - if (*s == '+') - mpifmt = GCRYMPI_FMT_USG; - else if (*s == '-') - mpifmt = GCRYMPI_FMT_STD; - else if (*s == '/') - mpifmt = GCRYMPI_FMT_HEX; /* Used to indicate opaque. */ - else if (*s == '?') - ; /* Only used via lookahead. 
*/ - else - { - l1 = gcry_sexp_find_token (sexp, s, 1); - if (!l1 && s[1] == '?') - *array[idx] = NULL; /* Optional element not found. */ - else if (!l1) - { - while (idx--) - { - gcry_mpi_release (*array[idx]); - *array[idx] = NULL; - } - return GPG_ERR_NO_OBJ; /* List element not found. */ - } - else - { - if (mpifmt == GCRYMPI_FMT_HEX) - *array[idx] = _gcry_sexp_nth_opaque_mpi (l1, 1); - else - *array[idx] = gcry_sexp_nth_mpi (l1, 1, mpifmt); - gcry_sexp_release (l1); - if (!*array[idx]) - { - while (idx--) - { - gcry_mpi_release (*array[idx]); - *array[idx] = NULL; - } - return GPG_ERR_INV_OBJ; /* Conversion failed. */ - } - } - idx++; - } - } - - return 0; -} - - /* Parse a "sig-val" s-expression and store the inner parameter list at R_PARMS. ALGO_NAMES is used to verify that the algorithm in "sig-val" is valid. Returns 0 on success and stores a new list at diff --git a/cipher/rsa.c b/cipher/rsa.c index d4d2a0a..fed52a1 100644 --- a/cipher/rsa.c +++ b/cipher/rsa.c @@ -855,9 +855,9 @@ rsa_check_secret_key (gcry_sexp_t keyparms) RSA_secret_key sk = {NULL, NULL, NULL, NULL, NULL, NULL}; /* To check the key we need the optional parameters. */ - rc = _gcry_pk_util_extract_mpis (keyparms, "nedpqu", - &sk.n, &sk.e, &sk.d, &sk.p, &sk.q, &sk.u, - NULL); + rc = _gcry_sexp_extract_param (keyparms, NULL, "nedpqu", + &sk.n, &sk.e, &sk.d, &sk.p, &sk.q, &sk.u, + NULL); if (rc) goto leave; @@ -902,7 +902,7 @@ rsa_encrypt (gcry_sexp_t *r_ciph, gcry_sexp_t s_data, gcry_sexp_t keyparms) } /* Extract the key. 
*/ - rc = _gcry_pk_util_extract_mpis (keyparms, "ne", &pk.n, &pk.e, NULL); + rc = _gcry_sexp_extract_param (keyparms, NULL, "ne", &pk.n, &pk.e, NULL); if (rc) goto leave; if (DBG_CIPHER) @@ -969,7 +969,7 @@ rsa_decrypt (gcry_sexp_t *r_plain, gcry_sexp_t s_data, gcry_sexp_t keyparms) rc = _gcry_pk_util_preparse_encval (s_data, rsa_names, &l1, &ctx); if (rc) goto leave; - rc = _gcry_pk_util_extract_mpis (l1, "a", &data, NULL); + rc = _gcry_sexp_extract_param (l1, NULL, "a", &data, NULL); if (rc) goto leave; if (DBG_CIPHER) @@ -981,9 +981,9 @@ rsa_decrypt (gcry_sexp_t *r_plain, gcry_sexp_t s_data, gcry_sexp_t keyparms) } /* Extract the key. */ - rc = _gcry_pk_util_extract_mpis (keyparms, "nedp?q?u?", - &sk.n, &sk.e, &sk.d, &sk.p, &sk.q, &sk.u, - NULL); + rc = _gcry_sexp_extract_param (keyparms, NULL, "nedp?q?u?", + &sk.n, &sk.e, &sk.d, &sk.p, &sk.q, &sk.u, + NULL); if (rc) goto leave; if (DBG_CIPHER) @@ -1125,9 +1125,9 @@ rsa_sign (gcry_sexp_t *r_sig, gcry_sexp_t s_data, gcry_sexp_t keyparms) } /* Extract the key. */ - rc = _gcry_pk_util_extract_mpis (keyparms, "nedp?q?u?", - &sk.n, &sk.e, &sk.d, &sk.p, &sk.q, &sk.u, - NULL); + rc = _gcry_sexp_extract_param (keyparms, NULL, "nedp?q?u?", + &sk.n, &sk.e, &sk.d, &sk.p, &sk.q, &sk.u, + NULL); if (rc) goto leave; if (DBG_CIPHER) @@ -1213,14 +1213,14 @@ rsa_verify (gcry_sexp_t s_sig, gcry_sexp_t s_data, gcry_sexp_t keyparms) rc = _gcry_pk_util_preparse_sigval (s_sig, rsa_names, &l1, NULL); if (rc) goto leave; - rc = _gcry_pk_util_extract_mpis (l1, "s", &sig, NULL); + rc = _gcry_sexp_extract_param (l1, NULL, "s", &sig, NULL); if (rc) goto leave; if (DBG_CIPHER) log_printmpi ("rsa_verify sig", sig); /* Extract the key. 
*/ - rc = _gcry_pk_util_extract_mpis (keyparms, "ne", &pk.n, &pk.e, NULL); + rc = _gcry_sexp_extract_param (keyparms, NULL, "ne", &pk.n, &pk.e, NULL); if (rc) goto leave; if (DBG_CIPHER) diff --git a/doc/gcrypt.texi b/doc/gcrypt.texi index 79d4d74..473c484 100644 --- a/doc/gcrypt.texi +++ b/doc/gcrypt.texi @@ -3748,6 +3748,66 @@ this function to parse results of a public key function, you most likely want to use @code{GCRYMPI_FMT_USG}. @end deftypefun +@deftypefun gpg_error_t gcry_sexp_extract_param ( @ + @w{gcry_sexp_t @var{sexp}}, @ + @w{const char *@var{path}}, @ + @w{const char *@var{list}}, ...) + +Extract parameters from an S-expression using a list of single letter +parameter names. The names of these parameters are specified in +LIST. Some special characters may be given to control the +conversion: + +@table @samp +@item + +Switch to unsigned integer format (GCRYMPI_FMT_USG). This is the +default mode. +@item - +Switch to standard signed format (GCRYMPI_FMT_STD). +@item / +Switch to opaque MPI format. The resulting MPIs may not be used for +computations; see @code{gcry_mpi_get_opaque} for details. +@item & +Switch to buffer descriptor mode. See below for details. +@item ? +If immediately following a parameter letter, that parameter is +considered optional. +@end table + +Unless in buffer descriptor mode, for each parameter name a pointer to +a @code{gcry_mpi_t} variable is expected, finally followed by a @code{NULL}. +For example +@example + gcry_sexp_extract_param (key, NULL, "n/x+ed", + &mpi_n, &mpi_x, &mpi_e, NULL) +@end example + +stores the parameter 'n' from @var{key} as an unsigned MPI into +@var{mpi_n}, the parameter 'x' as an opaque MPI into @var{mpi_x}, and +the parameter 'e' again as an unsigned MPI into @var{mpi_e}. + +@var{path} is an optional string used to locate a token. The +exclamation mark separated tokens are used via +@code{gcry_sexp_find_token} to find a start point inside the +S-expression.
+ +In buffer descriptor mode a pointer to a @code{gcry_buffer_t} +descriptor is expected instead of a pointer to an MPI. The caller may +use two different operation modes here: If the @var{data} field of the +provided descriptor is @code{NULL}, the function allocates a new +buffer and stores it at @var{data}; the other fields are set +accordingly with @var{off} set to 0. If @var{data} is not +@code{NULL}, the function assumes that the @var{data}, @var{size}, and +@var{off} fields specify a buffer where to put the value of the +respective parameter; on return the @var{len} field receives the +number of bytes copied to that buffer; in case the buffer is too +small, the function immediately returns with an error code (and +@var{len} is set to 0). + +The function returns 0 on success. On error an error code is +returned and the passed MPIs are either unchanged or set to NULL. +@end deftypefun + @c ********************************************************** @c ******************* MPIs ******** *********************** diff --git a/src/g10lib.h b/src/g10lib.h index c1ba2f7..3b09448 100644 --- a/src/g10lib.h +++ b/src/g10lib.h @@ -75,13 +75,6 @@ #define GCC_ATTR_UNUSED #endif -#if __GNUC__ >= 4 -# define GCC_ATTR_SENTINEL(a) __attribute__ ((sentinel(a))) -#else -# define GCC_ATTR_SENTINEL(a) -#endif - - /* Gettext macros. */ #define _(a) _gcry_gettext(a) @@ -382,6 +375,12 @@ gcry_err_code_t _gcry_sexp_vbuild (gcry_sexp_t *retsexp, size_t *erroff, const char *format, va_list arg_ptr); gcry_mpi_t _gcry_sexp_nth_opaque_mpi (gcry_sexp_t list, int number); char *_gcry_sexp_nth_string (const gcry_sexp_t list, int number); +gpg_err_code_t _gcry_sexp_vextract_param (gcry_sexp_t sexp, const char *path, + const char *list, va_list arg_ptr); +gpg_err_code_t _gcry_sexp_extract_param (gcry_sexp_t sexp, + const char *path, + const char *list, + ...)
_GCRY_GCC_ATTR_SENTINEL(0); /*-- fips.c --*/ diff --git a/src/gcrypt.h.in b/src/gcrypt.h.in index 8646f43..64cc0e4 100644 --- a/src/gcrypt.h.in +++ b/src/gcrypt.h.in @@ -102,6 +102,10 @@ extern "C" { #define _GCRY_GCC_ATTR_PRINTF(f,a) __attribute__ ((format (printf,f,a))) +#if _GCRY_GCC_VERSION >= 40000 +#define _GCRY_GCC_ATTR_SENTINEL(a) __attribute__ ((sentinel(a))) +#endif + #endif /*__GNUC__*/ #ifndef _GCRY_GCC_ATTR_DEPRECATED @@ -114,7 +118,10 @@ extern "C" { #define _GCRY_GCC_ATTR_MALLOC #endif #ifndef _GCRY_GCC_ATTR_PRINTF -#define _GCRY_GCC_ATTR_PRINTF +#define _GCRY_GCC_ATTR_PRINTF(f,a) +#endif +#ifndef _GCRY_GCC_ATTR_SENTINEL +#define _GCRY_GCC_ATTR_SENTINEL(a) #endif /* Make up an attribute to mark functions and types as deprecated but @@ -459,6 +466,12 @@ char *gcry_sexp_nth_string (gcry_sexp_t list, int number); value can't be converted to an MPI, `NULL' is returned. */ gcry_mpi_t gcry_sexp_nth_mpi (gcry_sexp_t list, int number, int mpifmt); +/* Convenience function to extract parameters from an S-expression + * using a list of single letter parameters. */ +gpg_error_t gcry_sexp_extract_param (gcry_sexp_t sexp, + const char *path, + const char *list, + ...) _GCRY_GCC_ATTR_SENTINEL(0); /******************************************* diff --git a/src/libgcrypt.def b/src/libgcrypt.def index 7efb3b9..ec0c1e3 100644 --- a/src/libgcrypt.def +++ b/src/libgcrypt.def @@ -253,5 +253,8 @@ EXPORTS gcry_log_debugpnt @223 gcry_log_debugsxp @224 + gcry_sexp_extract_param @225 + + ;; end of file with public symbols for Windows.
diff --git a/src/libgcrypt.vers b/src/libgcrypt.vers index b1669fd..be72aad 100644 --- a/src/libgcrypt.vers +++ b/src/libgcrypt.vers @@ -76,7 +76,7 @@ GCRYPT_1.6 { gcry_sexp_new; gcry_sexp_nth; gcry_sexp_nth_buffer; gcry_sexp_nth_data; gcry_sexp_nth_mpi; gcry_sexp_prepend; gcry_sexp_release; gcry_sexp_sprint; gcry_sexp_sscan; gcry_sexp_vlist; - gcry_sexp_nth_string; + gcry_sexp_nth_string; gcry_sexp_extract_param; gcry_mpi_is_neg; gcry_mpi_neg; gcry_mpi_abs; gcry_mpi_add; gcry_mpi_add_ui; gcry_mpi_addm; gcry_mpi_aprint; diff --git a/src/sexp.c b/src/sexp.c index 6a2a9be..6e4ff27 100644 --- a/src/sexp.c +++ b/src/sexp.c @@ -2117,3 +2117,238 @@ gcry_sexp_canon_len (const unsigned char *buffer, size_t length, } } } + + +/* Extract MPIs from an s-expression using a list of one letter + * parameters. The names of these parameters are given by the string + * LIST. Some special characters may be given to control the + * conversion: + * + * + :: Switch to unsigned integer format (default). + * - :: Switch to standard signed format. + * / :: Switch to opaque format. + * & :: Switch to buffer descriptor mode - see below. + * ? :: The previous parameter is optional. + * + * Unless in gcry_buffer_t mode for each parameter name a pointer to + * an MPI variable is expected and finally a NULL is expected. + * Example: + * + * _gcry_sexp_extract_param (key, NULL, "n/x+ed", + * &mpi_n, &mpi_x, &mpi_e, NULL) + * + * This stores the parameter "N" from KEY as an unsigned MPI into + * MPI_N, the parameter "X" as an opaque MPI into MPI_X, and the + * parameter "E" again as an unsigned MPI into MPI_E. + * + * If in buffer descriptor mode a pointer to gcry_buffer_t descriptor + * is expected instead of a pointer to an MPI. The caller may use two + * different operation modes: If the DATA field of the provided buffer + * descriptor is NULL, the function allocates a new buffer and stores + * it at DATA; the other fields are set accordingly with OFF being 0. 
+ * If DATA is not NULL, the function assumes that DATA, SIZE, and OFF + * describe a buffer where to put the data; on return the LEN field + * receives the number of bytes copied to that buffer; if the buffer + * is too small, the function immediately returns with an error code + * (and LEN set to 0). + * + * PATH is an optional string used to locate a token. The exclamation + * mark separated tokens are used via gcry_sexp_find_token to find + * a start point inside SEXP. + * + * The function returns 0 on success. On error an error code is + * returned and the passed MPIs are either unchanged or set to NULL. + */ +gpg_err_code_t +_gcry_sexp_vextract_param (gcry_sexp_t sexp, const char *path, + const char *list, va_list arg_ptr) +{ + gpg_err_code_t rc; + const char *s; + gcry_mpi_t *array[20]; + char arrayisdesc[20]; + int idx; + gcry_sexp_t l1; + int mode = '+'; /* Default to GCRYMPI_FMT_USG. */ + gcry_sexp_t freethis = NULL; + + memset (arrayisdesc, 0, sizeof arrayisdesc); + + /* First copy all the args into an array. This is required so that + we are able to release already allocated MPIs if later an error + was found. */ + for (s=list, idx=0; *s && idx < DIM (array); s++) + { + if (*s == '&' || *s == '+' || *s == '-' || *s == '/' || *s == '?' ) + ; + else + { + array[idx] = va_arg (arg_ptr, gcry_mpi_t *); + if (!array[idx]) + return GPG_ERR_MISSING_VALUE; /* NULL pointer given. */ + idx++; + } + } + if (*s) + return GPG_ERR_LIMIT_REACHED; /* Too many list elements. */ + if (va_arg (arg_ptr, gcry_mpi_t *)) + return GPG_ERR_INV_ARG; /* Not enough list elements. */ + + /* Drill down. */ + while (path && *path) + { + size_t n; + + s = strchr (path, '!'); + if (s == path) + { + rc = GPG_ERR_NOT_FOUND; + goto cleanup; + } + n = s?
s - path : 0; + l1 = gcry_sexp_find_token (sexp, path, n); + if (!l1) + { + rc = GPG_ERR_NOT_FOUND; + goto cleanup; + } + sexp = l1; l1 = NULL; + gcry_sexp_release (freethis); + freethis = sexp; + if (n) + path += n + 1; + else + path = NULL; + } + + + /* Now extract all parameters. */ + for (s=list, idx=0; *s; s++) + { + if (*s == '&' || *s == '+' || *s == '-' || *s == '/') + mode = *s; + else if (*s == '?') + ; /* Only used via lookahead. */ + else + { + l1 = gcry_sexp_find_token (sexp, s, 1); + if (!l1 && s[1] == '?') + { + /* Optional element not found. */ + if (mode == '&') + { + gcry_buffer_t *spec = (gcry_buffer_t*)array[idx]; + if (!spec->data) + { + spec->size = 0; + spec->off = 0; + } + spec->len = 0; + } + else + *array[idx] = NULL; + } + else if (!l1) + { + rc = GPG_ERR_NO_OBJ; /* List element not found. */ + goto cleanup; + } + else + { + if (mode == '&') + { + gcry_buffer_t *spec = (gcry_buffer_t*)array[idx]; + + if (spec->data) + { + const char *pbuf; + size_t nbuf; + + pbuf = gcry_sexp_nth_data (l1, 1, &nbuf); + if (!pbuf || !nbuf) + { + rc = GPG_ERR_INV_OBJ; + goto cleanup; + } + if (spec->off + nbuf > spec->size) + { + rc = GPG_ERR_BUFFER_TOO_SHORT; + goto cleanup; + } + memcpy ((char*)spec->data + spec->off, pbuf, nbuf); + spec->len = nbuf; + arrayisdesc[idx] = 1; + } + else + { + spec->data = gcry_sexp_nth_buffer (l1, 1, &spec->size); + if (!spec->data) + { + rc = GPG_ERR_INV_OBJ; /* Or out of core. */ + goto cleanup; + } + spec->len = spec->size; + spec->off = 0; + arrayisdesc[idx] = 2; + } + } + else if (mode == '/') + *array[idx] = _gcry_sexp_nth_opaque_mpi (l1, 1); + else if (mode == '-') + *array[idx] = gcry_sexp_nth_mpi (l1, 1, GCRYMPI_FMT_STD); + else + *array[idx] = gcry_sexp_nth_mpi (l1, 1, GCRYMPI_FMT_USG); + gcry_sexp_release (l1); l1 = NULL; + if (!*array[idx]) + { + rc = GPG_ERR_INV_OBJ; /* Conversion failed. 
*/ + goto cleanup; + } + } + idx++; + } + } + + gcry_sexp_release (freethis); + return 0; + + cleanup: + gcry_sexp_release (freethis); + gcry_sexp_release (l1); + while (idx--) + { + if (!arrayisdesc[idx]) + { + gcry_mpi_release (*array[idx]); + *array[idx] = NULL; + } + else if (arrayisdesc[idx] == 1) + { + /* Caller provided buffer. */ + gcry_buffer_t *spec = (gcry_buffer_t*)array[idx]; + spec->len = 0; + } + else + { + /* We might have allocated a buffer. */ + gcry_buffer_t *spec = (gcry_buffer_t*)array[idx]; + gcry_free (spec->data); + spec->data = NULL; + spec->size = spec->off = spec->len = 0; + } + } + return rc; +} + +gpg_error_t +_gcry_sexp_extract_param (gcry_sexp_t sexp, const char *path, + const char *list, ...) +{ + gcry_err_code_t rc; + va_list arg_ptr; + + va_start (arg_ptr, list); + rc = _gcry_sexp_vextract_param (sexp, path, list, arg_ptr); + va_end (arg_ptr); + return gpg_error (rc); +} diff --git a/src/visibility.c b/src/visibility.c index 6e3c755..848925e 100644 --- a/src/visibility.c +++ b/src/visibility.c @@ -250,6 +250,21 @@ gcry_sexp_nth_mpi (gcry_sexp_t list, int number, int mpifmt) return _gcry_sexp_nth_mpi (list, number, mpifmt); } +gpg_error_t +gcry_sexp_extract_param (gcry_sexp_t sexp, const char *path, + const char *list, ...)
+{ + gcry_err_code_t rc; + va_list arg_ptr; + + va_start (arg_ptr, list); + rc = _gcry_sexp_vextract_param (sexp, path, list, arg_ptr); + va_end (arg_ptr); + return gpg_error (rc); +} + + + gcry_mpi_t gcry_mpi_new (unsigned int nbits) { diff --git a/src/visibility.h b/src/visibility.h index cd2a60f..1c8f047 100644 --- a/src/visibility.h +++ b/src/visibility.h @@ -537,6 +537,7 @@ MARK_VISIBLE (gcry_sexp_release) MARK_VISIBLE (gcry_sexp_sprint) MARK_VISIBLE (gcry_sexp_sscan) MARK_VISIBLE (gcry_sexp_vlist) +MARK_VISIBLEX(gcry_sexp_extract_param) MARK_VISIBLEX(gcry_mpi_abs) MARK_VISIBLE (gcry_mpi_add) @@ -615,6 +616,16 @@ MARK_VISIBLEX(_gcry_mpi_get_const) #undef MARK_VISIBLE -#endif /*_GCRY_INCLUDED_BY_VISIBILITY_C*/ +#else /*!_GCRY_INCLUDED_BY_VISIBILITY_C*/ + +/* To avoid accidental use of the public functions inside Libgcrypt, + we redefine them to catch such errors. The usual difference + between a public and an internal version is that the internal + version use gpg_err_code_t and the public version gpg_error_t. */ + +#define gcry_sexp_extract_param _gcry_USE_THE_UNDERSCORED_FUNCTION + + +#endif /*!_GCRY_INCLUDED_BY_VISIBILITY_C*/ #endif /*GCRY_VISIBILITY_H*/ diff --git a/tests/tsexp.c b/tests/tsexp.c index 7c4f7c8..8a6b912 100644 --- a/tests/tsexp.c +++ b/tests/tsexp.c @@ -25,14 +25,48 @@ #include #include #include +#include #include "../src/gcrypt-int.h" #define PGMNAME "tsexp" +#ifndef DIM +# define DIM(v) (sizeof(v)/sizeof((v)[0])) +#endif +#define my_isascii(c) (!((c) & 0x80)) +#define digitp(p) (*(p) >= '0' && *(p) <= '9') +#define hexdigitp(a) (digitp (a) \ + || (*(a) >= 'A' && *(a) <= 'F') \ + || (*(a) >= 'a' && *(a) <= 'f')) +#define xtoi_1(p) (*(p) <= '9'? (*(p)- '0'): \ + *(p) <= 'F'? 
(*(p)-'A'+10):(*(p)-'a'+10)) +#define xtoi_2(p) ((xtoi_1(p) * 16) + xtoi_1((p)+1)) +#define xmalloc(a) gcry_xmalloc ((a)) +#define xcalloc(a,b) gcry_xcalloc ((a),(b)) +#define xstrdup(a) gcry_xstrdup ((a)) +#define xfree(a) gcry_free ((a)) +#define pass() do { ; } while (0) + + static int verbose; static int error_count; static void +die (const char *format, ...) +{ + va_list arg_ptr ; + + fflush (stdout); + fprintf (stderr, "%s: ", PGMNAME); + va_start( arg_ptr, format ) ; + vfprintf (stderr, format, arg_ptr ); + va_end(arg_ptr); + if (*format && format[strlen(format)-1] != '\n') + putc ('\n', stderr); + exit (1); +} + +static void info (const char *format, ...) { va_list arg_ptr; @@ -42,6 +76,8 @@ info (const char *format, ...) va_start( arg_ptr, format ) ; vfprintf (stderr, format, arg_ptr ); va_end(arg_ptr); + if (*format && format[strlen(format)-1] != '\n') + putc ('\n', stderr); } } @@ -54,10 +90,110 @@ fail ( const char *format, ... ) va_start( arg_ptr, format ) ; vfprintf (stderr, format, arg_ptr ); va_end(arg_ptr); + if (*format && format[strlen(format)-1] != '\n') + putc ('\n', stderr); error_count++; } + +/* Convert STRING consisting of hex characters into its binary + representation and return it as an allocated buffer. The valid + length of the buffer is returned at R_LENGTH. The string is + delimited by end of string. The function returns NULL on + error. */ +static void * +hex2buffer (const char *string, size_t *r_length) +{ + const char *s; + unsigned char *buffer; + size_t length; + + buffer = xmalloc (strlen(string)/2+1); + length = 0; + for (s=string; *s; s +=2 ) + { + if (!hexdigitp (s) || !hexdigitp (s+1)) + return NULL; /* Invalid hex digits. 
*/ + ((unsigned char*)buffer)[length++] = xtoi_2 (s); + } + *r_length = length; + return buffer; +} + + +static gcry_mpi_t +hex2mpi (const char *string) +{ + gpg_error_t err; + gcry_mpi_t val; + + err = gcry_mpi_scan (&val, GCRYMPI_FMT_HEX, string, 0, NULL); + if (err) + die ("hex2mpi '%s' failed: %s\n", string, gpg_strerror (err)); + return val; +} + +static gcry_mpi_t +hex2mpiopa (const char *string) +{ + char *buffer; + size_t buflen; + gcry_mpi_t val; + + buffer = hex2buffer (string, &buflen); + if (!buffer) + die ("hex2mpiopa '%s' failed: parser error\n", string); + val = gcry_mpi_set_opaque (NULL, buffer, buflen*8); + if (!val) + die ("hex2mpiopa '%s' failed: set_opaque error\n", string); + return val; +} + + +/* Compare A to B, where B is given as a hex string. */ +static int +cmp_mpihex (gcry_mpi_t a, const char *b) +{ + gcry_mpi_t bval; + int res; + + if (gcry_mpi_get_flag (a, GCRYMPI_FLAG_OPAQUE)) + bval = hex2mpiopa (b); + else + bval = hex2mpi (b); + res = gcry_mpi_cmp (a, bval); + gcry_mpi_release (bval); + return res; +} + +/* Compare A to B, where A is a buffer and B a hex string. */ +static int +cmp_bufhex (const void *a, size_t alen, const char *b) +{ + void *bbuf; + size_t blen; + int res; + + if (!a && !b) + return 0; + if (a && !b) + return 1; + if (!a && b) + return -1; + + bbuf = hex2buffer (b, &blen); + if (!bbuf) + die ("cmp_bufhex: error converting hex string\n"); + if (alen != blen) + return alen < blen?
-1 : 1; + res = memcmp (a, bbuf, alen); + xfree (bbuf); + return res; +} + + + /* fixme: we need better tests */ static void basic (void) @@ -195,7 +331,7 @@ basic (void) fail ("no car for `%s'\n", token); continue; } - info ("car=`%.*s'\n", (int)n, p); + /* info ("car=`%.*s'\n", (int)n, p); */ s2 = gcry_sexp_cdr (s1); if (!s2) @@ -230,7 +366,7 @@ basic (void) fail("no car for `%s'\n", parm ); continue; } - info ("car=`%.*s'\n", (int)n, p); + /* info ("car=`%.*s'\n", (int)n, p); */ p = gcry_sexp_nth_data (s2, 1, &n); if (!p) { @@ -238,7 +374,7 @@ basic (void) fail("no cdr for `%s'\n", parm ); continue; } - info ("cdr=`%.*s'\n", (int)n, p); + /* info ("cdr=`%.*s'\n", (int)n, p); */ a = gcry_sexp_nth_mpi (s2, 0, GCRYMPI_FMT_USG); gcry_sexp_release (s2); @@ -457,6 +593,379 @@ check_sscan (void) } +static void +check_extract_param (void) +{ + /* This sample data is a real key but with some parameters of the + public key modified. */ + static char sample1[] = + "(key-data" + " (public-key" + " (ecc" + " (curve Ed25519)" + " (p #6FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFED#)" + " (a #EF#)" + " (b #C2036CEE2B6FFE738CC740797779E89800700A4D4141D8AB75EB4DCA135978B6#)" + " (g #14" + " 216936D3CD6E53FEC0A4E231FDD6DC5C692CC7609525A7B2C9562D608F25D51A" + " 6666666666666666666666666666666666666666666666666666666666666658#)" + " (n #0000000000000000000000000000000014DEF9DEA2F79CD65812631A5CF5D3ED#)" + " (q #20B37806015CA06B3AEB9423EE84A41D7F31AA65F4148553755206D679F8BF62#)" + "))" + " (private-key" + " (ecc" + " (curve Ed25519)" + " (p #7FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFED#)" + " (a #FF#)" + " (b #D2036CEE2B6FFE738CC740797779E89800700A4D4141D8AB75EB4DCA135978B6#)" + " (g #04" + " 216936D3CD6E53FEC0A4E231FDD6DC5C692CC7609525A7B2C9562D608F25D51A" + " 6666666666666666666666666666666666666666666666666666666666666658#)" + " (n #1000000000000000000000000000000014DEF9DEA2F79CD65812631A5CF5D3ED#)" + " (q 
#30B37806015CA06B3AEB9423EE84A41D7F31AA65F4148553755206D679F8BF62#)" + " (d #56BEA284A22F443A7AEA8CEFA24DA5055CDF1D490C94D8C568FE0802C9169276#)" + ")))"; + + static char sample1_p[] = + "7FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFED"; + static char sample1_px[] = + "6FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFED"; + static char sample1_a[] = "FF"; + static char sample1_ax[] = "EF"; + static char sample1_b[] = + "D2036CEE2B6FFE738CC740797779E89800700A4D4141D8AB75EB4DCA135978B6"; + static char sample1_bx[] = + "C2036CEE2B6FFE738CC740797779E89800700A4D4141D8AB75EB4DCA135978B6"; + static char sample1_g[] = + "04" + "216936D3CD6E53FEC0A4E231FDD6DC5C692CC7609525A7B2C9562D608F25D51A" + "6666666666666666666666666666666666666666666666666666666666666658"; + static char sample1_gx[] = + "14" + "216936D3CD6E53FEC0A4E231FDD6DC5C692CC7609525A7B2C9562D608F25D51A" + "6666666666666666666666666666666666666666666666666666666666666658"; + static char sample1_n[] = + "1000000000000000000000000000000014DEF9DEA2F79CD65812631A5CF5D3ED"; + static char sample1_nx[] = + "0000000000000000000000000000000014DEF9DEA2F79CD65812631A5CF5D3ED"; + static char sample1_q[] = + "30B37806015CA06B3AEB9423EE84A41D7F31AA65F4148553755206D679F8BF62"; + static char sample1_qx[] = + "20B37806015CA06B3AEB9423EE84A41D7F31AA65F4148553755206D679F8BF62"; + static char sample1_d[] = + "56BEA284A22F443A7AEA8CEFA24DA5055CDF1D490C94D8C568FE0802C9169276"; + + static struct { + const char *sexp_str; + const char *path; + const char *list; + int nparam; + gpg_err_code_t expected_err; + const char *exp_p; + const char *exp_a; + const char *exp_b; + const char *exp_g; + const char *exp_n; + const char *exp_q; + const char *exp_d; + } tests[] = { + { + sample1, + NULL, + "pabgnqd", 6, + GPG_ERR_MISSING_VALUE, + }, + { + sample1, + NULL, + "pabgnq", 7, + GPG_ERR_INV_ARG + }, + { + sample1, + NULL, + "pabgnqd", 7, + 0, + sample1_px, sample1_ax, sample1_bx, sample1_gx, sample1_nx, + 
sample1_qx, sample1_d + }, + { + sample1, + NULL, + "abg", 3, + 0, + sample1_ax, sample1_bx, sample1_gx + }, + { + sample1, + NULL, + "x?abg", 4, + 0, + NULL, sample1_ax, sample1_bx, sample1_gx + }, + { + sample1, + NULL, + "p?abg", 4, + GPG_ERR_USER_1, + NULL, sample1_ax, sample1_bx, sample1_gx + }, + { + sample1, + NULL, + "pax?gnqd", 7, + 0, + sample1_px, sample1_ax, NULL, sample1_gx, sample1_nx, + sample1_qx, sample1_d + }, + { + sample1, + "public-key", + "pabgnqd", 7, + GPG_ERR_NO_OBJ, /* d is not in public key. */ + sample1_px, sample1_ax, sample1_bx, sample1_gx, sample1_nx, + sample1_qx, sample1_d + }, + { + sample1, + "private-key", + "pabgnqd", 7, + 0, + sample1_p, sample1_a, sample1_b, sample1_g, sample1_n, + sample1_q, sample1_d + }, + { + sample1, + "public-key!ecc", + "pabgnq", 6, + 0, + sample1_px, sample1_ax, sample1_bx, sample1_gx, sample1_nx, + sample1_qx + }, + { + sample1, + "public-key!ecc!foo", + "pabgnq", 6, + GPG_ERR_NOT_FOUND + }, + { + sample1, + "public-key!!ecc", + "pabgnq", 6, + GPG_ERR_NOT_FOUND + }, + { + sample1, + "private-key", + "pa/bgnqd", 7, + 0, + sample1_p, sample1_a, sample1_b, sample1_g, sample1_n, + sample1_q, sample1_d + }, + { + sample1, + "private-key", + "p-a+bgnqd", 7, + 0, + sample1_p, "-01", sample1_b, sample1_g, sample1_n, + sample1_q, sample1_d + }, + {NULL} + }; + int idx, i; + const char *paramstr; + int paramidx; + gpg_error_t err; + gcry_sexp_t sxp; + gcry_mpi_t mpis[7]; + gcry_buffer_t ioarray[7]; + char iobuffer[200]; + + info ("checking gcry_sexp_extract_param\n"); + for (idx=0; tests[idx].sexp_str; idx++) + { + err = gcry_sexp_new (&sxp, tests[idx].sexp_str, 0, 1); + if (err) + die ("converting string to sexp failed: %s", gpg_strerror (err)); + + memset (mpis, 0, sizeof mpis); + switch (tests[idx].nparam) + { + case 0: + err = gcry_sexp_extract_param (sxp, tests[idx].path, tests[idx].list, + NULL); + break; + case 1: + err = gcry_sexp_extract_param (sxp, tests[idx].path, tests[idx].list, + mpis+0, NULL); + 
break; + case 2: + err = gcry_sexp_extract_param (sxp, tests[idx].path, tests[idx].list, + mpis+0, mpis+1, NULL); + break; + case 3: + err = gcry_sexp_extract_param (sxp, tests[idx].path, tests[idx].list, + mpis+0, mpis+1, mpis+2, NULL); + break; + case 4: + err = gcry_sexp_extract_param (sxp, tests[idx].path, tests[idx].list, + mpis+0, mpis+1, mpis+2, mpis+3, NULL); + break; + case 5: + err = gcry_sexp_extract_param (sxp, tests[idx].path, tests[idx].list, + mpis+0, mpis+1, mpis+2, mpis+3, mpis+4, + NULL); + break; + case 6: + err = gcry_sexp_extract_param (sxp, tests[idx].path, tests[idx].list, + mpis+0, mpis+1, mpis+2, mpis+3, mpis+4, + mpis+5, NULL); + break; + case 7: + err = gcry_sexp_extract_param (sxp, tests[idx].path, tests[idx].list, + mpis+0, mpis+1, mpis+2, mpis+3, mpis+4, + mpis+5, mpis+6, NULL); + break; + default: + die ("test %d: internal error", idx); + } + + if (tests[idx].expected_err + && tests[idx].expected_err != GPG_ERR_USER_1) + { + if (tests[idx].expected_err != gpg_err_code (err)) + fail ("gcry_sexp_extract_param test %d failed: " + "expected error '%s' - got '%s'", idx, + gpg_strerror (tests[idx].expected_err),gpg_strerror (err)); + + } + else if (err) + { + fail ("gcry_sexp_extract_param test %d failed: %s", + idx, gpg_strerror (err)); + } + else /* No error - check the extracted values. */ + { + for (paramidx=0; paramidx < DIM (mpis); paramidx++) + { + switch (paramidx) + { + case 0: paramstr = tests[idx].exp_p; break; + case 1: paramstr = tests[idx].exp_a; break; + case 2: paramstr = tests[idx].exp_b; break; + case 3: paramstr = tests[idx].exp_g; break; + case 4: paramstr = tests[idx].exp_n; break; + case 5: paramstr = tests[idx].exp_q; break; + case 6: paramstr = tests[idx].exp_d; break; + default: + die ("test %d: internal error: bad param %d", + idx, paramidx); + } + + if (tests[idx].expected_err == GPG_ERR_USER_1 + && mpis[paramidx] && !paramstr && paramidx == 0) + ; /* Okay Special case error for param 0. 
*/ + else if (!mpis[paramidx] && !paramstr) + ; /* Okay. */ + else if (!mpis[paramidx] && paramstr) + fail ("test %d: value for param %d expected but not returned", + idx, paramidx); + else if (mpis[paramidx] && !paramstr) + fail ("test %d: value for param %d not expected", + idx, paramidx); + else if (cmp_mpihex (mpis[paramidx], paramstr)) + { + fail ("test %d: param %d mismatch", idx, paramidx); + gcry_log_debug ("expected: %s\n", paramstr); + gcry_log_debugmpi (" got", mpis[paramidx]); + } + else if (tests[idx].expected_err && paramidx == 0) + fail ("test %d: param %d: expected error '%s' - got 'Success'", + idx, paramidx, gpg_strerror (tests[idx].expected_err)); + } + + } + + for (i=0; i < DIM (mpis); i++) + gcry_mpi_release (mpis[i]); + gcry_sexp_release (sxp); + } + + info ("checking gcry_sexp_extract_param/desc\n"); + + memset (ioarray, 0, sizeof ioarray); + + err = gcry_sexp_new (&sxp, sample1, 0, 1); + if (err) + die ("converting string to sexp failed: %s", gpg_strerror (err)); + + ioarray[1].size = sizeof iobuffer; + ioarray[1].data = iobuffer; + ioarray[1].off = 0; + ioarray[2].size = sizeof iobuffer; + ioarray[2].data = iobuffer; + ioarray[2].off = 50; + assert (ioarray[2].off < sizeof iobuffer); + err = gcry_sexp_extract_param (sxp, "key-data!private-key", "&pab", + ioarray+0, ioarray+1, ioarray+2, NULL); + if (err) + fail ("gcry_sexp_extract_param with desc failed: %s", gpg_strerror (err)); + else + { + if (!ioarray[0].data) + fail ("gcry_sexp_extract_param/desc failed: no P"); + else if (ioarray[0].size != 32) + fail ("gcry_sexp_extract_param/desc failed: P has wrong size"); + else if (ioarray[0].len != 32) + fail ("gcry_sexp_extract_param/desc failed: P has wrong length"); + else if (ioarray[0].off) + fail ("gcry_sexp_extract_param/desc failed: P has OFF set"); + else if (cmp_bufhex (ioarray[0].data, ioarray[0].len, sample1_p)) + { + fail ("gcry_sexp_extract_param/desc failed: P mismatch"); + gcry_log_debug ("expected: %s\n", sample1_p); + 
gcry_log_debughex (" got", ioarray[0].data, ioarray[0].len); + } + + if (!ioarray[1].data) + fail ("gcry_sexp_extract_param/desc failed: A buffer lost"); + else if (ioarray[1].size != sizeof iobuffer) + fail ("gcry_sexp_extract_param/desc failed: A size changed"); + else if (ioarray[1].off != 0) + fail ("gcry_sexp_extract_param/desc failed: A off changed"); + else if (ioarray[1].len != 1) + fail ("gcry_sexp_extract_param/desc failed: A has wrong length"); + else if (cmp_bufhex (ioarray[1].data + ioarray[1].off, ioarray[1].len, + sample1_a)) + { + fail ("gcry_sexp_extract_param/desc failed: A mismatch"); + gcry_log_debug ("expected: %s\n", sample1_a); + gcry_log_debughex (" got", + ioarray[1].data + ioarray[1].off, ioarray[1].len); + } + + if (!ioarray[2].data) + fail ("gcry_sexp_extract_param/desc failed: B buffer lost"); + else if (ioarray[2].size != sizeof iobuffer) + fail ("gcry_sexp_extract_param/desc failed: B size changed"); + else if (ioarray[2].off != 50) + fail ("gcry_sexp_extract_param/desc failed: B off changed"); + else if (ioarray[2].len != 32) + fail ("gcry_sexp_extract_param/desc failed: B has wrong length"); + else if (cmp_bufhex (ioarray[2].data + ioarray[2].off, ioarray[2].len, + sample1_b)) + { + fail ("gcry_sexp_extract_param/desc failed: B mismatch"); + gcry_log_debug ("expected: %s\n", sample1_b); + gcry_log_debughex (" got", + ioarray[2].data + ioarray[2].off, ioarray[2].len); + } + + xfree (ioarray[0].data); + } + + gcry_sexp_release (sxp); +} int @@ -472,6 +981,7 @@ main (int argc, char **argv) canon_len (); back_and_forth (); check_sscan (); + check_extract_param (); return error_count? 
1:0; } ----------------------------------------------------------------------- Summary of changes: NEWS | 5 +- cipher/dsa.c | 16 +- cipher/ecc-curves.c | 6 +- cipher/ecc-misc.c | 1 + cipher/ecc.c | 38 ++-- cipher/elgamal.c | 27 +-- cipher/pubkey-internal.h | 3 - cipher/pubkey-util.c | 113 ---------- cipher/rsa.c | 26 +-- doc/gcrypt.texi | 60 ++++++ src/g10lib.h | 13 +- src/gcrypt.h.in | 15 +- src/libgcrypt.def | 3 + src/libgcrypt.vers | 2 +- src/sexp.c | 235 +++++++++++++++++++++ src/visibility.c | 15 ++ src/visibility.h | 13 +- tests/tsexp.c | 516 +++++++++++++++++++++++++++++++++++++++++++++- 18 files changed, 920 insertions(+), 187 deletions(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From dbaryshkov at gmail.com Wed Oct 16 23:05:36 2013 From: dbaryshkov at gmail.com (Dmitry Eremin-Solenikov) Date: Thu, 17 Oct 2013 01:05:36 +0400 Subject: Question regarding GCM/block alignment Message-ID: Hello, While implementing/improving GCM code, I have stumbled upon the following question/problem. My current code only supports encryption/decryption of data if data is submitted in full-blocksize quantities (except the last data block, of course). The manual vaguely mentions that 'Depending on the selected algorithms and encryption mode, the length of the buffers must be a multiple of the block size.' Should I add buffering of non-full blocks (by the cost of additional -- With best wishes Dmitry From dbaryshkov at gmail.com Wed Oct 16 23:38:54 2013 From: dbaryshkov at gmail.com (Dmitry Eremin-Solenikov) Date: Thu, 17 Oct 2013 01:38:54 +0400 Subject: [PATCH] Drop _gcry_cipher_ofb_decrypt as it duplicates _gcry_cipher_ofb_encrypt Message-ID: <1381959534-14943-1-git-send-email-dbaryshkov@gmail.com> * cipher/cipher.c (cipher_decrypt): Use _gcry_cipher_ofb_encrypt for OFB decryption.
* cipher/cipher-internal.h: Remove _gcry_cipher_ofb_decrypt declaration. * cipher/cipher-ofb.c (_gcry_cipher_ofb_decrypt): Remove. (_gcry_cipher_ofb_encrypt): remove copying of IV to lastiv, it's unused there. Signed-off-by: Dmitry Eremin-Solenikov --- cipher/cipher-internal.h | 4 --- cipher/cipher-ofb.c | 70 +----------------------------------------------- cipher/cipher.c | 2 +- 3 files changed, 2 insertions(+), 74 deletions(-) diff --git a/cipher/cipher-internal.h b/cipher/cipher-internal.h index b60ef38..95f9759 100644 --- a/cipher/cipher-internal.h +++ b/cipher/cipher-internal.h @@ -152,10 +152,6 @@ gcry_err_code_t _gcry_cipher_ofb_encrypt /* */ (gcry_cipher_hd_t c, unsigned char *outbuf, unsigned int outbuflen, const unsigned char *inbuf, unsigned int inbuflen); -gcry_err_code_t _gcry_cipher_ofb_decrypt -/* */ (gcry_cipher_hd_t c, - unsigned char *outbuf, unsigned int outbuflen, - const unsigned char *inbuf, unsigned int inbuflen); /*-- cipher-ctr.c --*/ gcry_err_code_t _gcry_cipher_ctr_encrypt diff --git a/cipher/cipher-ofb.c b/cipher/cipher-ofb.c index 3d9d54c..ab426bd 100644 --- a/cipher/cipher-ofb.c +++ b/cipher/cipher-ofb.c @@ -47,7 +47,7 @@ _gcry_cipher_ofb_encrypt (gcry_cipher_hd_t c, { /* Short enough to be encoded by the remaining XOR mask. */ /* XOR the input with the IV */ - ivp = c->u_iv.iv + c->spec->blocksize - c->unused; + ivp = c->u_iv.iv + blocksize - c->unused; buf_xor(outbuf, ivp, inbuf, inbuflen); c->unused -= inbuflen; return 0; @@ -69,7 +69,6 @@ _gcry_cipher_ofb_encrypt (gcry_cipher_hd_t c, while ( inbuflen >= blocksize ) { /* Encrypt the IV (and save the current one). */ - memcpy( c->lastiv, c->u_iv.iv, blocksize ); nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? 
nburn : burn; buf_xor(outbuf, c->u_iv.iv, inbuf, blocksize); @@ -79,73 +78,6 @@ _gcry_cipher_ofb_encrypt (gcry_cipher_hd_t c, } if ( inbuflen ) { /* process the remaining bytes */ - memcpy( c->lastiv, c->u_iv.iv, blocksize ); - nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); - burn = nburn > burn ? nburn : burn; - c->unused = blocksize; - c->unused -= inbuflen; - buf_xor(outbuf, c->u_iv.iv, inbuf, inbuflen); - outbuf += inbuflen; - inbuf += inbuflen; - inbuflen = 0; - } - - if (burn > 0) - _gcry_burn_stack (burn + 4 * sizeof(void *)); - - return 0; -} - - -gcry_err_code_t -_gcry_cipher_ofb_decrypt (gcry_cipher_hd_t c, - unsigned char *outbuf, unsigned int outbuflen, - const unsigned char *inbuf, unsigned int inbuflen) -{ - unsigned char *ivp; - size_t blocksize = c->spec->blocksize; - unsigned int burn, nburn; - - if (outbuflen < inbuflen) - return GPG_ERR_BUFFER_TOO_SHORT; - - if( inbuflen <= c->unused ) - { - /* Short enough to be encoded by the remaining XOR mask. */ - ivp = c->u_iv.iv + blocksize - c->unused; - buf_xor(outbuf, ivp, inbuf, inbuflen); - c->unused -= inbuflen; - return 0; - } - - burn = 0; - - if ( c->unused ) - { - inbuflen -= c->unused; - ivp = c->u_iv.iv + blocksize - c->unused; - buf_xor(outbuf, ivp, inbuf, c->unused); - outbuf += c->unused; - inbuf += c->unused; - c->unused = 0; - } - - /* Now we can process complete blocks. */ - while ( inbuflen >= blocksize ) - { - /* Encrypt the IV (and save the current one). */ - memcpy( c->lastiv, c->u_iv.iv, blocksize ); - nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); - burn = nburn > burn ? nburn : burn; - buf_xor(outbuf, c->u_iv.iv, inbuf, blocksize); - outbuf += blocksize; - inbuf += blocksize; - inbuflen -= blocksize; - } - if ( inbuflen ) - { /* Process the remaining bytes. */ - /* Encrypt the IV (and save the current one). 
*/ - memcpy( c->lastiv, c->u_iv.iv, blocksize ); nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; c->unused = blocksize; diff --git a/cipher/cipher.c b/cipher/cipher.c index 75d42d1..a02affb 100644 --- a/cipher/cipher.c +++ b/cipher/cipher.c @@ -799,7 +799,7 @@ cipher_decrypt (gcry_cipher_hd_t c, byte *outbuf, unsigned int outbuflen, break; case GCRY_CIPHER_MODE_OFB: - rc = _gcry_cipher_ofb_decrypt (c, outbuf, outbuflen, inbuf, inbuflen); + rc = _gcry_cipher_ofb_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); break; case GCRY_CIPHER_MODE_CTR: -- 1.8.4.rc3 From wk at gnupg.org Thu Oct 17 08:43:03 2013 From: wk at gnupg.org (Werner Koch) Date: Thu, 17 Oct 2013 08:43:03 +0200 Subject: Question regarding GCM/block alignment In-Reply-To: (Dmitry Eremin-Solenikov's message of "Thu, 17 Oct 2013 01:05:36 +0400") References: Message-ID: <87mwm8ieiw.fsf@vigenere.g10code.de> On Wed, 16 Oct 2013 23:05, dbaryshkov at gmail.com said: > Should I add buffering of non-full blocks (by the cost of additional This is a long-standing open question. Until now there has been no real requirement for this. Buffering should be put on top of the existing functions (probably by using a new flag for gcry_cipher_open). However, I think it is more important to stabilize the current API with all the changes we did for 1.6 and work towards a release of 1.6. Adding a few features can be done later. Shalom-Salam, Werner -- Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz.
From wk at gnupg.org Thu Oct 17 08:44:51 2013 From: wk at gnupg.org (Werner Koch) Date: Thu, 17 Oct 2013 08:44:51 +0200 Subject: [PATCH v2 2/2] Add support for GOST R 34.10-2001/-2012 signatures In-Reply-To: (Dmitry Eremin-Solenikov's message of "Wed, 16 Oct 2013 20:13:12 +0400") References: <1381867004-21231-1-git-send-email-dbaryshkov@gmail.com> <1381867004-21231-3-git-send-email-dbaryshkov@gmail.com> <87ppr5jnbm.fsf@vigenere.g10code.de> Message-ID: <87iowwiefw.fsf@vigenere.g10code.de> On Wed, 16 Oct 2013 18:13, dbaryshkov at gmail.com said: > Because they are "test" curves defined by the GOST standards. RFC 4357 > e.g. names the first curve as 'id-GostR3410-2001-TestParamSet' and tells that > And adds that 'Use of the test parameter sets [...] is NOT RECOMMENDED.' Funny standard. > And strangely enough it aborts in 50% of runs. Sometimes it does, sometimes > it just outputs a note regarding testkey and exits normally. > I failed to capture a problem either via gdb or via valgrind. It is an algorithmic problem. Salam-Shalom, Werner -- Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz. From cvs at cvs.gnupg.org Thu Oct 17 10:50:24 2013 From: cvs at cvs.gnupg.org (by Werner Koch) Date: Thu, 17 Oct 2013 10:50:24 +0200 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-315-gb224171 Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via b22417158c50ec3a0b2ff55b4ade063b42a87e8f (commit) from f9371c026aad09ff48746d22c8333746c886e773 (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below.
- Log ----------------------------------------------------------------- commit b22417158c50ec3a0b2ff55b4ade063b42a87e8f Author: Werner Koch Date: Thu Oct 17 10:45:14 2013 +0200 ecc: Support Weierstrass curves in gcry_mpi_ec_curve_point. * mpi/ec.c (_gcry_mpi_ec_curve_point): Support MPI_EC_WEIERSTRASS. diff --git a/mpi/ec.c b/mpi/ec.c index 889df8e..39ab5eb 100644 --- a/mpi/ec.c +++ b/mpi/ec.c @@ -1216,8 +1216,23 @@ _gcry_mpi_ec_curve_point (gcry_mpi_point_t point, mpi_ec_t ctx) switch (ctx->model) { case MPI_EC_WEIERSTRASS: - log_fatal ("%s: %s not yet supported\n", - "_gcry_mpi_ec_curve_point", "Weierstrass"); + { + gcry_mpi_t xx = mpi_new (0); + + /* y^2 == x^3 + a·x^2 + b */ + ec_pow2 (y, y, ctx); + + ec_pow2 (xx, x, ctx); + ec_mulm (w, ctx->a, xx, ctx); + ec_addm (w, w, ctx->b, ctx); + ec_mulm (xx, xx, x, ctx); + ec_addm (w, w, xx, ctx); + + if (!mpi_cmp (y, w)) + res = 1; + + gcry_mpi_release (xx); + } break; case MPI_EC_MONTGOMERY: log_fatal ("%s: %s not yet supported\n", ----------------------------------------------------------------------- Summary of changes: mpi/ec.c | 19 +++++++++++++++++-- 1 file changed, 17 insertions(+), 2 deletions(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From dbaryshkov at gmail.com Thu Oct 17 11:02:21 2013 From: dbaryshkov at gmail.com (Dmitry Eremin-Solenikov) Date: Thu, 17 Oct 2013 13:02:21 +0400 Subject: Question regarding GCM/block alignment In-Reply-To: <87mwm8ieiw.fsf@vigenere.g10code.de> References: <87mwm8ieiw.fsf@vigenere.g10code.de> Message-ID: Hello, On Thu, Oct 17, 2013 at 10:43 AM, Werner Koch wrote: > On Wed, 16 Oct 2013 23:05, dbaryshkov at gmail.com said: > >> Should I add buffering of non-full blocks (by the cost of additional > > This is a long-standing open question. Until now there has been no > real requirement for this.
Buffering should be put on top of the > existing functions (probably by using a new flag for gcry_cipher_open). Ack. > > However, I think it is more important to stabilize the current API with > all the changes we did for 1.6 and work towards a release of 1.6. > Adding a few feature can be done later. Do you plan to merge AEAD support (GCM/CCM/whatever) before 1.6? Or we better postpone those modes till next release? -- With best wishes Dmitry From wk at gnupg.org Thu Oct 17 12:06:04 2013 From: wk at gnupg.org (Werner Koch) Date: Thu, 17 Oct 2013 12:06:04 +0200 Subject: Question regarding GCM/block alignment In-Reply-To: (Dmitry Eremin-Solenikov's message of "Thu, 17 Oct 2013 13:02:21 +0400") References: <87mwm8ieiw.fsf@vigenere.g10code.de> Message-ID: <87a9i8i54j.fsf@vigenere.g10code.de> On Thu, 17 Oct 2013 11:02, dbaryshkov at gmail.com said: > Do you plan to merge AEAD support (GCM/CCM/whatever) before 1.6? Yes. Salam-Shalom, Werner -- Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz. From cvs at cvs.gnupg.org Fri Oct 18 13:09:01 2013 From: cvs at cvs.gnupg.org (by Werner Koch) Date: Fri, 18 Oct 2013 13:09:01 +0200 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-318-gf7711e6 Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via f7711e6eb5f02d03c74911f6f037ab28075e7c0d (commit) via 91e007606f1f6f8e1416c403fe809d47fddf9b1f (commit) via 4776dcd394ce59fa50d959921857b3427c5a63c8 (commit) from b22417158c50ec3a0b2ff55b4ade063b42a87e8f (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. 
- Log ----------------------------------------------------------------- commit f7711e6eb5f02d03c74911f6f037ab28075e7c0d Author: Werner Koch Date: Thu Oct 17 18:08:59 2013 +0200 tests: Add test options to keygen. * tests/keygen.c (usage): New. (main): Print usage info. Allow running just one algo. Signed-off-by: Werner Koch diff --git a/tests/keygen.c b/tests/keygen.c index 2b98c42..5ab8e9d 100644 --- a/tests/keygen.c +++ b/tests/keygen.c @@ -470,6 +470,19 @@ progress_cb (void *cb_data, const char *what, int printchar, } +static void +usage (int mode) +{ + fputs ("usage: " PGM " [options] [{rsa|elg|dsa|ecc|nonce}]\n" + "Options:\n" + " --verbose be verbose\n" + " --debug flyswatter\n" + " --progress print progress indicators\n", + mode? stderr : stdout); + if (mode) + exit (1); +} + int main (int argc, char **argv) { @@ -489,12 +502,7 @@ main (int argc, char **argv) } else if (!strcmp (*argv, "--help")) { - fputs ("usage: " PGM " [options]\n" - "Options:\n" - " --verbose be verbose\n" - " --debug flyswatter\n" - " --progress print progress indicators\n", - stdout); + usage (0); exit (0); } else if (!strcmp (*argv, "--verbose")) @@ -515,6 +523,8 @@ main (int argc, char **argv) } else if (!strncmp (*argv, "--", 2)) die ("unknown option '%s'", *argv); + else + break; } if (!gcry_check_version (GCRYPT_VERSION)) @@ -528,11 +538,30 @@ main (int argc, char **argv) if (with_progress) gcry_set_progress_handler (progress_cb, NULL); - check_rsa_keys (); - check_elg_keys (); - check_dsa_keys (); - check_ecc_keys (); - check_nonce (); + if (!argc) + { + check_rsa_keys (); + check_elg_keys (); + check_dsa_keys (); + check_ecc_keys (); + check_nonce (); + } + else + { + for (; argc; argc--, argv++) + if (!strcmp (*argv, "rsa")) + check_rsa_keys (); + else if (!strcmp (*argv, "elg")) + check_elg_keys (); + else if (!strcmp (*argv, "dsa")) + check_dsa_keys (); + else if (!strcmp (*argv, "ecc")) + check_ecc_keys (); + else if (!strcmp (*argv, "nonce")) + check_nonce (); + else + 
usage (1); + } return error_count? 1:0; } commit 91e007606f1f6f8e1416c403fe809d47fddf9b1f Author: Werner Koch Date: Thu Oct 17 18:07:28 2013 +0200 mpi: Do not clear the sign of the mpi_mod result. * mpi/mpi-mod.c (_gcry_mpi_mod): Remove sign setting. Signed-off-by: Werner Koch diff --git a/mpi/mpi-mod.c b/mpi/mpi-mod.c index 3d6248b..d48d7f4 100644 --- a/mpi/mpi-mod.c +++ b/mpi/mpi-mod.c @@ -47,7 +47,6 @@ void _gcry_mpi_mod (gcry_mpi_t rem, gcry_mpi_t dividend, gcry_mpi_t divisor) { _gcry_mpi_fdiv_r (rem, dividend, divisor); - rem->sign = 0; } commit 4776dcd394ce59fa50d959921857b3427c5a63c8 Author: Werner Koch Date: Thu Oct 17 18:01:48 2013 +0200 ecc: Put the curve name again into the output of gcry_pk_genkey. * cipher/ecc.c (ecc_generate): Use the correct var. Release CURVE_FLAGS. Signed-off-by: Werner Koch diff --git a/cipher/ecc.c b/cipher/ecc.c index 8b61ae4..6f3cbbd 100644 --- a/cipher/ecc.c +++ b/cipher/ecc.c @@ -1568,7 +1568,7 @@ ecc_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey) if (ed25519_with_ecdsa) { - rc = gcry_sexp_build (&curve_info, NULL, "(flags ecdsa)"); + rc = gcry_sexp_build (&curve_flags, NULL, "(flags ecdsa)"); if (rc) goto leave; } @@ -1613,6 +1613,7 @@ ecc_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey) mpi_free (x); mpi_free (y); _gcry_mpi_ec_free (ctx); + gcry_sexp_release (curve_flags); gcry_sexp_release (curve_info); return rc; } ----------------------------------------------------------------------- Summary of changes: cipher/ecc.c | 3 ++- mpi/mpi-mod.c | 1 - tests/keygen.c | 51 ++++++++++++++++++++++++++++++++++++++++----------- 3 files changed, 42 insertions(+), 13 deletions(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From jussi.kivilinna at iki.fi Sat Oct 19 16:36:48 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Sat, 
19 Oct 2013 17:36:48 +0300 Subject: [RFC PATCH 2/3] Add API for initializing AEAD modes In-Reply-To: <87eh7llg8f.fsf@vigenere.g10code.de> References: <20131013100228.32014.526.stgit@localhost6.localdomain6> <20131013100233.32014.24561.stgit@localhost6.localdomain6> <87iox1phzg.fsf@vigenere.g10code.de> <525BD36A.9010507@iki.fi> <87hacimytn.fsf@vigenere.g10code.de> <525E56C2.6060704@iki.fi> <87eh7llg8f.fsf@vigenere.g10code.de> Message-ID: <52629900.7070501@iki.fi> On 16.10.2013 12:25, Werner Koch wrote: > On Wed, 16 Oct 2013 11:05, jussi.kivilinna at iki.fi said: > >> Ok, so we'd have >> gcry_cipher_authenticate (hd, const void *aadbuf, size_t aadbuflen, >> const void *tag, size_t taglen, size_t crypt_len) >> >> For encryption, tag is NULL pointer and taglen is zero and after encryption >> authentication tag can be read with 'gcry_cipher_tag'. For decryption, tag >> is given for authentication check with above function. > > A last idea: What about two functions > > gcry_cipher_settag () -- To be used before decryption > gcry_cipher_gettag () -- to be used after encryption. For some modes, gettag would need to be used for decryption too. For example, GCM does not need the encrypted data length beforehand, so for decryption (and for encryption) one needs to mark the end of the encrypted data with a gcry_cipher_gettag call. So we'd have GCM encryption as: gcry_cipher_setiv(h, nonce, noncelen); gcry_cipher_setaad(h, aad, addlen, 0); gcry_cipher_encrypt(h, buf1, len1, NULL, 0); gcry_cipher_encrypt(h, buf2, len2, NULL, 0); ... gcry_cipher_encrypt(h, bufX, lenX, NULL, 0); gcry_cipher_gettag(h, tag, 8); /* Mark end of data stream, output tag. */ and decryption: gcry_cipher_setiv(h, nonce, noncelen); gcry_cipher_setaad(h, aad, addlen, 0); gcry_cipher_settag(h, tag, 8); gcry_cipher_decrypt(h, buf1, len1, NULL, 0); gcry_cipher_decrypt(h, buf2, len2, NULL, 0); ...
gcry_cipher_decrypt(h, bufX, lenX, NULL, 0); gcry_cipher_gettag(h, NULL, 0); /* Mark end of data stream, return 'checksum failed' if tags mismatch. */ So, renaming settag to checktag might be better: gcry_cipher_setiv(h, nonce, noncelen); gcry_cipher_setaad(h, aad, addlen, 0); gcry_cipher_decrypt(h, buf1, len1, NULL, 0); gcry_cipher_decrypt(h, buf2, len2, NULL, 0); ... gcry_cipher_decrypt(h, bufX, lenX, NULL, 0); gcry_cipher_checktag(h, tag, 8); /* Mark end of data stream, return 'checksum failed' if tags mismatch. */ But CCM would still need the tag length passed in before setaad. So do I add a 'taglen' argument to setaad? Or just add a gcry_cipher_ctl command to pass CCM-specific values (encryptlen, taglen)? -Jussi > > gcry_cipher_set_tag would actually look prettier but we already use > setkey and setiv. With these functions > > gcry_cipher_authenticate (hd, const void *aadbuf, size_t aadbuflen, > size_t crypt_len) > > would be pretty easy to describe. And a very last idea: What about > renaming > > gcry_cipher_authenticate to gcry_cipher_setaad > > ? > > > > Shalom-Salam, > > Werner > > From jussi.kivilinna at iki.fi Sun Oct 20 14:03:13 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Sun, 20 Oct 2013 15:03:13 +0300 Subject: [PATCH 1/2] [v2] Add API to support AEAD cipher modes Message-ID: <20131020120313.21970.15918.stgit@localhost6.localdomain6> * cipher/cipher.c (_gcry_cipher_setaad, _gcry_cipher_checktag) (_gcry_cipher_gettag): New. * doc/gcrypt.texi: Add documentation for new API functions. * src/visibility.c (gcry_cipher_setaad, gcry_cipher_checktag) (gcry_cipher_gettag): New. * src/gcrypt.h.in, src/visibility.h: add declarations of these functions. * src/libgcrypt.defs, src/libgcrypt.vers: export functions. -- Authenticated Encryption with Associated Data (AEAD) cipher modes provide an authentication tag that can be used to authenticate the message.
At the same time it allows one to specify additional (unencrypted) data that will be authenticated together with the message. This class of cipher modes requires the additional API added in this commit. This patch is based on an original patch by Dmitry Eremin-Solenikov. Changes in v2: - Change gcry_cipher_tag to gcry_cipher_checktag and gcry_cipher_gettag for giving the tag (checktag) for decryption and reading the tag (gettag) after encryption. - Change gcry_cipher_authenticate to gcry_cipher_setaad, since additional parameters are needed for some AEAD modes (in this case CCM, which needs the length of the encrypted data and the tag for MAC initialization). - Add some documentation. Signed-off-by: Jussi Kivilinna --- cipher/cipher.c | 36 ++++++++++++++++++++++++++++++++++++ doc/gcrypt.texi | 42 ++++++++++++++++++++++++++++++++++++++++++ src/gcrypt.h.in | 14 ++++++++++++++ src/libgcrypt.def | 3 +++ src/libgcrypt.vers | 1 + src/visibility.c | 28 ++++++++++++++++++++++++++++ src/visibility.h | 9 +++++++++ 7 files changed, 133 insertions(+) diff --git a/cipher/cipher.c b/cipher/cipher.c index 75d42d1..60f1f0e 100644 --- a/cipher/cipher.c +++ b/cipher/cipher.c @@ -910,6 +910,42 @@ _gcry_cipher_setctr (gcry_cipher_hd_t hd, const void *ctr, size_t ctrlen) return 0; } +gcry_error_t +_gcry_cipher_setaad (gcry_cipher_hd_t hd, const void *aad, size_t aadlen, + size_t encryptedlen, size_t taglen) +{ + log_fatal ("gcry_cipher_setaad: invalid mode %d\n", hd->mode ); + + (void)aad; + (void)aadlen; + (void)encryptedlen; + (void)taglen; + + return gpg_error (GPG_ERR_INV_CIPHER_MODE); +} + +gcry_error_t +_gcry_cipher_gettag (gcry_cipher_hd_t hd, void *outtag, size_t taglen) +{ + log_fatal ("gcry_cipher_gettag: invalid mode %d\n", hd->mode ); + + (void)outtag; + (void)taglen; + + return gpg_error (GPG_ERR_INV_CIPHER_MODE); +} + +gcry_error_t +_gcry_cipher_checktag (gcry_cipher_hd_t hd, const void *intag, size_t taglen) +{ + log_fatal ("gcry_cipher_checktag: invalid mode %d\n", hd->mode ); + + (void)intag;
+ (void)taglen; + + return gpg_error (GPG_ERR_INV_CIPHER_MODE); +} + gcry_error_t gcry_cipher_ctl( gcry_cipher_hd_t h, int cmd, void *buffer, size_t buflen) diff --git a/doc/gcrypt.texi b/doc/gcrypt.texi index 473c484..1e5f414 100644 --- a/doc/gcrypt.texi +++ b/doc/gcrypt.texi @@ -1731,6 +1731,10 @@ matches the requirement of the selected algorithm and mode. This function is also used with the Salsa20 stream cipher to set or update the required nonce. In this case it needs to be called after setting the key. + +This function is also used with the AEAD cipher modes to set or +update the required nonce. + @end deftypefun @deftypefun gcry_error_t gcry_cipher_setctr (gcry_cipher_hd_t @var{h}, const void *@var{c}, size_t @var{l}) @@ -1750,6 +1754,44 @@ call to gcry_cipher_setkey and clear the initialization vector. Note that gcry_cipher_reset is implemented as a macro. @end deftypefun +Authenticated Encryption with Associated Data (AEAD) block cipher +modes require the handling of the authentication tag and the additional +authenticated data, which can be done by using the following +functions: + +@deftypefun gcry_error_t gcry_cipher_setaad (gcry_cipher_hd_t @var{h}, const void *@var{aad}, size_t @var{aadlen}, size_t @var{enclen}, size_t @var{taglen}) + +Process the buffer @var{aad} of length @var{aadlen} as the additional +authenticated data (AAD) for AEAD cipher modes. + +Some modes require more information at the beginning of operation; such +as the length of the data to be encrypted/decrypted or the length of +authentication tag. The length of encrypted data can be passed through +@var{enclen} and the length of authentication tag through @var{taglen}. +If currently used cipher mode does not require such information at the +early stages of the operation, these fields may be set to zero.
+ + at end deftypefun + + at deftypefun gcry_error_t gcry_cipher_gettag (gcry_cipher_hd_t @var{h}, void *@var{tag}, size_t @var{taglen}) + +This function is used to read the authentication tag after encryption. +The function finalizes and outputs theauthentication tag to the buffer + at var{tag} of length @var{taglen} bytes. + + at end deftypefun + + at deftypefun gcry_error_t gcry_cipher_checktag (gcry_cipher_hd_t @var{h}, const void *@var{tag}, size_t @var{taglen}) + +Check the authentication tag after decryption. The authentication +tag is passed as the buffer @var{tag} of length @var{taglen} bytes +and compared to internal authentication tag computed during +decryption. Error code @code{GPG_ERR_CHECKSUM} is returned if +the authentication tag in the buffer @var{tag} does not match +the authentication tag calculated during decryption. + + at end deftypefun + The actual encryption and decryption is done by using one of the following functions. They may be used as often as required to process all the data. diff --git a/src/gcrypt.h.in b/src/gcrypt.h.in index 64cc0e4..6132868 100644 --- a/src/gcrypt.h.in +++ b/src/gcrypt.h.in @@ -953,6 +953,20 @@ gcry_error_t gcry_cipher_setkey (gcry_cipher_hd_t hd, gcry_error_t gcry_cipher_setiv (gcry_cipher_hd_t hd, const void *iv, size_t ivlen); +/* Provide additional authentication data for AEAD modes/ciphers. Also + provides the length of encrypted data and authentication tag for modes that + require this information at early stage. */ +gcry_error_t gcry_cipher_setaad (gcry_cipher_hd_t hd, const void *aad, + size_t aadlen, size_t encryptedlen, + size_t taglen); + +/* Get authentication tag for AEAD modes/ciphers. */ +gcry_error_t gcry_cipher_gettag (gcry_cipher_hd_t hd, void *outtag, + size_t taglen); + +/* Check authentication tag for AEAD modes/ciphers. */ +gcry_error_t gcry_cipher_checktag (gcry_cipher_hd_t hd, const void *intag, + size_t taglen); /* Reset the handle to the state after open. 
*/ #define gcry_cipher_reset(h) gcry_cipher_ctl ((h), GCRYCTL_RESET, NULL, 0) diff --git a/src/libgcrypt.def b/src/libgcrypt.def index ec0c1e3..ac2dfb8 100644 --- a/src/libgcrypt.def +++ b/src/libgcrypt.def @@ -255,6 +255,9 @@ EXPORTS gcry_sexp_extract_param @225 + gcry_cipher_setaad @226 + gcry_cipher_gettag @227 + gcry_cipher_checktag @228 ;; end of file with public symbols for Windows. diff --git a/src/libgcrypt.vers b/src/libgcrypt.vers index be72aad..0990ff2 100644 --- a/src/libgcrypt.vers +++ b/src/libgcrypt.vers @@ -51,6 +51,7 @@ GCRYPT_1.6 { gcry_cipher_info; gcry_cipher_map_name; gcry_cipher_mode_from_oid; gcry_cipher_open; gcry_cipher_setkey; gcry_cipher_setiv; gcry_cipher_setctr; + gcry_cipher_setaad; gcry_cipher_gettag; gcry_cipher_checktag; gcry_pk_algo_info; gcry_pk_algo_name; gcry_pk_ctl; gcry_pk_decrypt; gcry_pk_encrypt; gcry_pk_genkey; diff --git a/src/visibility.c b/src/visibility.c index 848925e..5daf2ea 100644 --- a/src/visibility.c +++ b/src/visibility.c @@ -713,6 +713,34 @@ gcry_cipher_setctr (gcry_cipher_hd_t hd, const void *ctr, size_t ctrlen) return _gcry_cipher_setctr (hd, ctr, ctrlen); } +gcry_error_t +gcry_cipher_setaad (gcry_cipher_hd_t hd, const void *aad, size_t aadlen, + size_t encryptedlen, size_t taglen) +{ + if (!fips_is_operational ()) + return gpg_error (fips_not_operational ()); + + return _gcry_cipher_setaad (hd, aad, aadlen, encryptedlen, taglen); +} + +gcry_error_t +gcry_cipher_gettag (gcry_cipher_hd_t hd, void *outtag, size_t taglen) +{ + if (!fips_is_operational ()) + return gpg_error (fips_not_operational ()); + + return _gcry_cipher_gettag (hd, outtag, taglen); +} + +gcry_error_t +gcry_cipher_checktag (gcry_cipher_hd_t hd, const void *intag, size_t taglen) +{ + if (!fips_is_operational ()) + return gpg_error (fips_not_operational ()); + + return _gcry_cipher_checktag (hd, intag, taglen); +} + gcry_error_t gcry_cipher_ctl (gcry_cipher_hd_t h, int cmd, void *buffer, size_t buflen) diff --git a/src/visibility.h 
b/src/visibility.h index 1c8f047..e406d7d 100644 --- a/src/visibility.h +++ b/src/visibility.h @@ -81,6 +81,9 @@ #define gcry_cipher_setkey _gcry_cipher_setkey #define gcry_cipher_setiv _gcry_cipher_setiv #define gcry_cipher_setctr _gcry_cipher_setctr +#define gcry_cipher_setaad _gcry_cipher_setaad +#define gcry_cipher_checktag _gcry_cipher_checktag +#define gcry_cipher_gettag _gcry_cipher_gettag #define gcry_cipher_ctl _gcry_cipher_ctl #define gcry_cipher_decrypt _gcry_cipher_decrypt #define gcry_cipher_encrypt _gcry_cipher_encrypt @@ -297,6 +300,9 @@ gcry_err_code_t gcry_md_get (gcry_md_hd_t hd, int algo, #undef gcry_cipher_setkey #undef gcry_cipher_setiv #undef gcry_cipher_setctr +#undef gcry_cipher_setaad +#undef gcry_cipher_checktag +#undef gcry_cipher_gettag #undef gcry_cipher_ctl #undef gcry_cipher_decrypt #undef gcry_cipher_encrypt @@ -474,6 +480,9 @@ MARK_VISIBLE (gcry_cipher_close) MARK_VISIBLE (gcry_cipher_setkey) MARK_VISIBLE (gcry_cipher_setiv) MARK_VISIBLE (gcry_cipher_setctr) +MARK_VISIBLE (gcry_cipher_setaad) +MARK_VISIBLE (gcry_cipher_checktag) +MARK_VISIBLE (gcry_cipher_gettag) MARK_VISIBLE (gcry_cipher_ctl) MARK_VISIBLE (gcry_cipher_decrypt) MARK_VISIBLE (gcry_cipher_encrypt) From jussi.kivilinna at iki.fi Sun Oct 20 14:03:18 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Sun, 20 Oct 2013 15:03:18 +0300 Subject: [PATCH 2/2] Add Counter with CBC-MAC mode (CCM) In-Reply-To: <20131020120313.21970.15918.stgit@localhost6.localdomain6> References: <20131020120313.21970.15918.stgit@localhost6.localdomain6> Message-ID: <20131020120318.21970.31300.stgit@localhost6.localdomain6> * cipher/Makefile.am: Add 'cipher-ccm.c'. * cipher/cipher-ccm.c: New. * cipher/cipher-internal.h (gcry_cipher_handle): Add 'u_mode'. (_gcry_cipher_ccm_encrypt, _gcry_cipher_ccm_decrypt) (_gcry_cipher_ccm_set_nonce, _gcry_cipher_ccm_set_aad) (_gcry_cipher_ccm_get_tag, _gcry_cipher_ccm_check_tag): New prototypes. 
* cipher/cipher.c (gcry_cipher_open, cipher_encrypt, cipher_decrypt)
(_gcry_cipher_setiv, _gcry_cipher_setaad, _gcry_cipher_gettag)
(_gcry_cipher_checktag): Add handling for CCM mode.
* doc/gcrypt.texi: Add documentation for GCRY_CIPHER_MODE_CCM.
* src/gcrypt.h.in (gcry_cipher_modes): Add 'GCRY_CIPHER_MODE_CCM'.
(GCRY_CCM_BLOCK_LEN): New.
* tests/basic.c (check_ccm_cipher): New.
(check_cipher_modes): Call 'check_ccm_cipher'.
* tests/benchmark.c (ccm_aead_init): New.
(cipher_bench): Add handling for AEAD modes and add CCM benchmarking.
--

Patch adds CCM (Counter with CBC-MAC) mode as defined in RFC 3610 and
NIST Special Publication 800-38C.

Example for encrypting a message (split into two buffers, buf1 and buf2)
and authenticating additional non-encrypted data (aadbuf) with an
authentication tag length of eight bytes:

  taglen = 8;
  gcry_cipher_setkey(h, key, len(key));
  gcry_cipher_setiv(h, nonce, len(nonce));
  gcry_cipher_setaad(h, aadbuf, len(aadbuf), len(buf1) + len(buf2), taglen);
  gcry_cipher_encrypt(h, buf1, len(buf1), buf1, len(buf1));
  gcry_cipher_encrypt(h, buf2, len(buf2), buf2, len(buf2));
  gcry_cipher_gettag(h, tag, taglen);

Example for decrypting the above message and checking the
authentication tag:

  taglen = 8;
  gcry_cipher_setkey(h, key, len(key));
  gcry_cipher_setiv(h, nonce, len(nonce));
  gcry_cipher_setaad(h, aadbuf, len(aadbuf), len(buf1) + len(buf2), taglen);
  gcry_cipher_decrypt(h, buf1, len(buf1), buf1, len(buf1));
  gcry_cipher_decrypt(h, buf2, len(buf2), buf2, len(buf2));
  err = gcry_cipher_checktag(h, tag, taglen);
  if (gpg_err_code (err) == GPG_ERR_CHECKSUM)
    {
      /* Authentication failed. */
    }
  else if (err == 0)
    {
      /* Authentication ok.
 */
    }

Example for encrypting a message without additional authenticated data:

  taglen = 10;
  gcry_cipher_setkey(h, key, len(key));
  gcry_cipher_setiv(h, nonce, len(nonce));
  gcry_cipher_setaad(h, NULL, 0, len(buf1), taglen);
  gcry_cipher_encrypt(h, buf1, len(buf1), buf1, len(buf1));
  gcry_cipher_gettag(h, tag, taglen);

To reset the CCM state for a cipher handle, either set a new nonce or
use 'gcry_cipher_reset'.

This implementation reuses the existing CTR mode code for
encryption/decryption and is therefore able to process multiple
buffers whose lengths are not a multiple of the block size.

Signed-off-by: Jussi Kivilinna
---
 cipher/Makefile.am       |    1
 cipher/cipher-ccm.c      |  346 +++++++++++++++++++++++++++++
 cipher/cipher-internal.h |   44 ++++
 cipher/cipher.c          |   85 ++++++-
 doc/gcrypt.texi          |   16 +
 src/gcrypt.h.in          |    5
 tests/basic.c            |  556 ++++++++++++++++++++++++++++++++++++++++++++++
 tests/benchmark.c        |   76 ++++++
 8 files changed, 1103 insertions(+), 26 deletions(-)
 create mode 100644 cipher/cipher-ccm.c

diff --git a/cipher/Makefile.am b/cipher/Makefile.am
index a2b2c8a..b0efd89 100644
--- a/cipher/Makefile.am
+++ b/cipher/Makefile.am
@@ -40,6 +40,7 @@ libcipher_la_LIBADD = $(GCRYPT_MODULES)
 libcipher_la_SOURCES = \
 cipher.c cipher-internal.h \
 cipher-cbc.c cipher-cfb.c cipher-ofb.c cipher-ctr.c cipher-aeswrap.c \
+cipher-ccm.c \
 cipher-selftest.c cipher-selftest.h \
 pubkey.c pubkey-internal.h pubkey-util.c \
 md.c \
diff --git a/cipher/cipher-ccm.c b/cipher/cipher-ccm.c
new file mode 100644
index 0000000..64cce5f
--- /dev/null
+++ b/cipher/cipher-ccm.c
@@ -0,0 +1,346 @@
+/* cipher-ccm.c - CTR mode with CBC-MAC mode implementation
+ * Copyright (C) 2013 Jussi Kivilinna
+ *
+ * This file is part of Libgcrypt.
+ *
+ * Libgcrypt is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as
+ * published by the Free Software Foundation; either version 2.1 of
+ * the License, or (at your option) any later version.
+ * + * Libgcrypt is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this program; if not, see . + */ + +#include +#include +#include +#include +#include + +#include "g10lib.h" +#include "cipher.h" +#include "ath.h" +#include "bufhelp.h" +#include "./cipher-internal.h" + + +#define set_burn(burn, nburn) do { \ + unsigned int __nburn = (nburn); \ + (burn) = (burn) > __nburn ? (burn) : __nburn; } while (0) + + +static unsigned int +do_cbc_mac (gcry_cipher_hd_t c, const unsigned char *inbuf, size_t inlen, + int do_padding) +{ + const unsigned int blocksize = 16; + unsigned char tmp[blocksize]; + unsigned int burn = 0; + unsigned int unused = c->u_mode.ccm.mac_unused; + size_t nblocks; + + if (inlen == 0 && (unused == 0 || !do_padding)) + return 0; + + do + { + if (inlen + unused < blocksize || unused > 0) + { + for (; inlen && unused < blocksize; inlen--) + c->u_mode.ccm.macbuf[unused++] = *inbuf++; + } + if (!inlen) + { + if (!do_padding) + break; + + while (unused < blocksize) + c->u_mode.ccm.macbuf[unused++] = 0; + } + + if (unused > 0) + { + /* Process one block from macbuf. 
*/ + buf_xor(c->u_iv.iv, c->u_iv.iv, c->u_mode.ccm.macbuf, blocksize); + set_burn (burn, c->spec->encrypt ( &c->context.c, c->u_iv.iv, + c->u_iv.iv )); + + unused = 0; + } + + if (c->bulk.cbc_enc) + { + nblocks = inlen / blocksize; + c->bulk.cbc_enc (&c->context.c, c->u_iv.iv, tmp, inbuf, nblocks, 1); + inbuf += nblocks * blocksize; + inlen -= nblocks * blocksize; + + wipememory (tmp, sizeof(tmp)); + } + else + { + while (inlen >= blocksize) + { + buf_xor(c->u_iv.iv, c->u_iv.iv, inbuf, blocksize); + + set_burn (burn, c->spec->encrypt ( &c->context.c, c->u_iv.iv, + c->u_iv.iv )); + + inlen -= blocksize; + inbuf += blocksize; + } + } + } + while (inlen > 0); + + c->u_mode.ccm.mac_unused = unused; + + if (burn) + burn += 4 * sizeof(void *); + + return burn; +} + + +gcry_err_code_t +_gcry_cipher_ccm_set_nonce (gcry_cipher_hd_t c, const unsigned char *nonce, + size_t noncelen) +{ + size_t L = 15 - noncelen; + size_t L_; + + L_ = L - 1; + + if (!nonce) + return GPG_ERR_INV_ARG; + /* Length field must be 2, 3, ..., or 8. */ + if (L < 2 || L > 8) + return GPG_ERR_INV_LENGTH; + + /* Reset state */ + memset (&c->u_mode, 0, sizeof(c->u_mode)); + memset (&c->marks, 0, sizeof(c->marks)); + memset (&c->u_iv, 0, sizeof(c->u_iv)); + memset (&c->u_ctr, 0, sizeof(c->u_ctr)); + memset (c->lastiv, 0, sizeof(c->lastiv)); + c->unused = 0; + + /* Setup CTR */ + c->u_ctr.ctr[0] = L_; + memcpy (&c->u_ctr.ctr[1], nonce, noncelen); + memset (&c->u_ctr.ctr[1 + noncelen], 0, L); + + /* Setup IV */ + c->u_iv.iv[0] = L_; + memcpy (&c->u_iv.iv[1], nonce, noncelen); + /* Add (8 * M_ + 64 * flags) to iv[0] and set iv[noncelen + 1 ... 15] later + in set_aad. 
*/ + memset (&c->u_iv.iv[1 + noncelen], 0, L); + + c->u_mode.ccm.nonce = 1; + + return GPG_ERR_NO_ERROR; +} + + +gcry_err_code_t +_gcry_cipher_ccm_set_aad (gcry_cipher_hd_t c, const unsigned char *aadbuf, + size_t aadbuflen, size_t encryptlen, size_t taglen) +{ + unsigned int burn = 0; + unsigned char b0[16]; + size_t noncelen = 15 - (c->u_iv.iv[0] + 1); + size_t M = taglen; + size_t M_; + int i; + + M_ = (M - 2) / 2; + + if (aadbuflen > 0 && !aadbuf) + return GPG_ERR_INV_ARG; + /* Authentication field must be 4, 6, 8, 10, 12, 14 or 16. */ + if ((M_ * 2 + 2) != M || M < 4 || M > 16) + return GPG_ERR_INV_LENGTH; + if (!c->u_mode.ccm.nonce || c->u_mode.ccm.tag) + return GPG_ERR_INV_STATE; + if (c->u_mode.ccm.aad) + return GPG_ERR_INV_STATE; + + c->u_mode.ccm.authlen = taglen; + c->u_mode.ccm.encryptlen = encryptlen; + + /* Complete IV setup. */ + c->u_iv.iv[0] += (aadbuflen > 0) * 64 + M_ * 8; + for (i = 16 - 1; i >= 1 + noncelen; i--) + { + c->u_iv.iv[i] = encryptlen & 0xff; + encryptlen >>= 8; + } + + memcpy (b0, c->u_iv.iv, 16); + memset (c->u_iv.iv, 0, 16); + + set_burn (burn, do_cbc_mac (c, b0, 16, 0)); + + if (aadbuflen > 0 && aadbuflen <= (unsigned int)0xfeff) + { + b0[0] = (aadbuflen >> 8) & 0xff; + b0[1] = aadbuflen & 0xff; + set_burn (burn, do_cbc_mac (c, b0, 2, 0)); + set_burn (burn, do_cbc_mac (c, aadbuf, aadbuflen, 1)); + } + else if (aadbuflen > 0xfeff && aadbuflen <= (unsigned int)0xffffffff) + { + b0[0] = 0xff; + b0[1] = 0xfe; + buf_put_be32(&b0[2], aadbuflen); + set_burn (burn, do_cbc_mac (c, b0, 6, 0)); + set_burn (burn, do_cbc_mac (c, aadbuf, aadbuflen, 1)); + } +#ifdef HAVE_U64_TYPEDEF + else if (aadbuflen > (unsigned int)0xffffffff) + { + b0[0] = 0xff; + b0[1] = 0xff; + buf_put_be64(&b0[2], aadbuflen); + set_burn (burn, do_cbc_mac (c, b0, 10, 0)); + set_burn (burn, do_cbc_mac (c, aadbuf, aadbuflen, 1)); + } +#endif + + /* Generate S_0 and increase counter. 
*/ + set_burn (burn, c->spec->encrypt ( &c->context.c, c->u_mode.ccm.s0, + c->u_ctr.ctr )); + c->u_ctr.ctr[15]++; + + if (burn) + _gcry_burn_stack (burn + sizeof(void *) * 5); + + c->u_mode.ccm.aad = 1; + + return GPG_ERR_NO_ERROR; +} + + +gcry_err_code_t +_gcry_cipher_ccm_tag (gcry_cipher_hd_t c, unsigned char *outbuf, + size_t outbuflen, int check) +{ + unsigned int burn; + + if (!outbuf || outbuflen == 0) + return GPG_ERR_INV_ARG; + /* Tag length must be same as initial authlen. */ + if (c->u_mode.ccm.authlen != outbuflen) + return GPG_ERR_INV_LENGTH; + if (!c->u_mode.ccm.nonce || !c->u_mode.ccm.aad) + return GPG_ERR_INV_STATE; + /* Initial encrypt length must match with length of actual data processed. */ + if (c->u_mode.ccm.encryptlen > 0) + return GPG_ERR_UNFINISHED; + + if (!c->u_mode.ccm.tag) + { + burn = do_cbc_mac (c, NULL, 0, 1); /* Perform final padding. */ + + /* Add S_0 */ + buf_xor (c->u_iv.iv, c->u_iv.iv, c->u_mode.ccm.s0, 16); + + wipememory (c->u_ctr.ctr, 16); + wipememory (c->u_mode.ccm.s0, 16); + wipememory (c->u_mode.ccm.macbuf, 16); + + if (burn) + _gcry_burn_stack (burn + sizeof(void *) * 5); + } + + if (!check) + { + memcpy (outbuf, c->u_iv.iv, outbuflen); + return GPG_ERR_NO_ERROR; + } + else + { + int diff, i; + + /* Constant-time compare. */ + for (i = 0, diff = 0; i < outbuflen; i++) + diff -= !!(outbuf[i] - c->u_iv.iv[i]); + + return !diff ? 
GPG_ERR_NO_ERROR : GPG_ERR_CHECKSUM; + } +} + + +gcry_err_code_t +_gcry_cipher_ccm_get_tag (gcry_cipher_hd_t c, unsigned char *outtag, + size_t taglen) +{ + return _gcry_cipher_ccm_tag (c, outtag, taglen, 0); +} + + +gcry_err_code_t +_gcry_cipher_ccm_check_tag (gcry_cipher_hd_t c, const unsigned char *intag, + size_t taglen) +{ + return _gcry_cipher_ccm_tag (c, (unsigned char *)intag, taglen, 1); +} + + +gcry_err_code_t +_gcry_cipher_ccm_encrypt (gcry_cipher_hd_t c, unsigned char *outbuf, + unsigned int outbuflen, const unsigned char *inbuf, + unsigned int inbuflen) +{ + unsigned int burn; + + if (outbuflen < inbuflen) + return GPG_ERR_BUFFER_TOO_SHORT; + if (!c->u_mode.ccm.nonce || c->u_mode.ccm.tag || !c->u_mode.ccm.aad) + return GPG_ERR_INV_STATE; + if (inbuflen > c->u_mode.ccm.encryptlen) + return GPG_ERR_INV_LENGTH; + + c->u_mode.ccm.encryptlen -= inbuflen; + burn = do_cbc_mac (c, inbuf, inbuflen, 0); + if (burn) + _gcry_burn_stack (burn + sizeof(void *) * 5); + + return _gcry_cipher_ctr_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); +} + + +gcry_err_code_t +_gcry_cipher_ccm_decrypt (gcry_cipher_hd_t c, unsigned char *outbuf, + unsigned int outbuflen, const unsigned char *inbuf, + unsigned int inbuflen) +{ + gcry_err_code_t err; + unsigned int burn; + + if (outbuflen < inbuflen) + return GPG_ERR_BUFFER_TOO_SHORT; + if (!c->u_mode.ccm.nonce || c->u_mode.ccm.tag || !c->u_mode.ccm.aad) + return GPG_ERR_INV_STATE; + if (inbuflen > c->u_mode.ccm.encryptlen) + return GPG_ERR_INV_LENGTH; + + err = _gcry_cipher_ctr_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); + if (err) + return err; + + c->u_mode.ccm.encryptlen -= inbuflen; + burn = do_cbc_mac (c, outbuf, inbuflen, 0); + if (burn) + _gcry_burn_stack (burn + sizeof(void *) * 5); + + return err; +} + diff --git a/cipher/cipher-internal.h b/cipher/cipher-internal.h index b60ef38..6dc8385 100644 --- a/cipher/cipher-internal.h +++ b/cipher/cipher-internal.h @@ -100,7 +100,8 @@ struct gcry_cipher_handle /* The 
initialization vector. For best performance we make sure that it is properly aligned. In particular some implementations - of bulk operations expect an 16 byte aligned IV. */ + of bulk operations expect an 16 byte aligned IV. IV is also used + to store CBC-MAC in CCM mode; counter IV is stored in U_CTR. */ union { cipher_context_alignment_t iv_align; unsigned char iv[MAX_BLOCKSIZE]; @@ -117,6 +118,24 @@ struct gcry_cipher_handle unsigned char lastiv[MAX_BLOCKSIZE]; int unused; /* Number of unused bytes in LASTIV. */ + union { + /* Mode specific storage for CCM mode. */ + struct { + size_t encryptlen; + unsigned int authlen; + + /* Space to save partial input lengths for MAC. */ + unsigned char macbuf[GCRY_CCM_BLOCK_LEN]; + int mac_unused; /* Number of unprocessed bytes in MACBUF. */ + + unsigned char s0[GCRY_CCM_BLOCK_LEN]; + + unsigned int nonce:1;/* Set to 1 if nonce has been set. */ + unsigned int aad:1; /* Set to 1 if AAD has been processed. */ + unsigned int tag:1; /* Set to 1 if tag has been finalized. */ + } ccm; + } u_mode; + /* What follows are two contexts of the cipher in use. 
The first one needs to be aligned well enough for the cipher operation whereas the second one is a copy created by cipher_setkey and @@ -175,5 +194,28 @@ gcry_err_code_t _gcry_cipher_aeswrap_decrypt const byte *inbuf, unsigned int inbuflen); +/*-- cipher-ccm.c --*/ +gcry_err_code_t _gcry_cipher_ccm_encrypt +/* */ (gcry_cipher_hd_t c, + unsigned char *outbuf, unsigned int outbuflen, + const unsigned char *inbuf, unsigned int inbuflen); +gcry_err_code_t _gcry_cipher_ccm_decrypt +/* */ (gcry_cipher_hd_t c, + unsigned char *outbuf, unsigned int outbuflen, + const unsigned char *inbuf, unsigned int inbuflen); +gcry_err_code_t _gcry_cipher_ccm_set_nonce +/* */ (gcry_cipher_hd_t c, const unsigned char *nonce, + size_t noncelen); +gcry_err_code_t _gcry_cipher_ccm_set_aad +/* */ (gcry_cipher_hd_t c, const unsigned char *aadbuf, + size_t aadbuflen, size_t encryptedlen, size_t taglen); +gcry_err_code_t _gcry_cipher_ccm_get_tag +/* */ (gcry_cipher_hd_t c, + unsigned char *outtag, size_t taglen); +gcry_err_code_t _gcry_cipher_ccm_check_tag +/* */ (gcry_cipher_hd_t c, + const unsigned char *intag, size_t taglen); + + #endif /*G10_CIPHER_INTERNAL_H*/ diff --git a/cipher/cipher.c b/cipher/cipher.c index 60f1f0e..29796bd 100644 --- a/cipher/cipher.c +++ b/cipher/cipher.c @@ -375,6 +375,13 @@ gcry_cipher_open (gcry_cipher_hd_t *handle, if (! 
err) switch (mode) { + case GCRY_CIPHER_MODE_CCM: + if (spec->blocksize != GCRY_CCM_BLOCK_LEN) + err = GPG_ERR_INV_CIPHER_MODE; + if (!spec->encrypt || !spec->decrypt) + err = GPG_ERR_INV_CIPHER_MODE; + break; + case GCRY_CIPHER_MODE_ECB: case GCRY_CIPHER_MODE_CBC: case GCRY_CIPHER_MODE_CFB: @@ -613,6 +620,8 @@ cipher_reset (gcry_cipher_hd_t c) memset (c->u_iv.iv, 0, c->spec->blocksize); memset (c->lastiv, 0, c->spec->blocksize); memset (c->u_ctr.ctr, 0, c->spec->blocksize); + memset (&c->u_mode, 0, sizeof c->u_mode); + c->unused = 0; } @@ -718,6 +727,10 @@ cipher_encrypt (gcry_cipher_hd_t c, byte *outbuf, unsigned int outbuflen, inbuf, inbuflen); break; + case GCRY_CIPHER_MODE_CCM: + rc = _gcry_cipher_ccm_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); + break; + case GCRY_CIPHER_MODE_STREAM: c->spec->stencrypt (&c->context.c, outbuf, (byte*)/*arggg*/inbuf, inbuflen); @@ -811,6 +824,10 @@ cipher_decrypt (gcry_cipher_hd_t c, byte *outbuf, unsigned int outbuflen, inbuf, inbuflen); break; + case GCRY_CIPHER_MODE_CCM: + rc = _gcry_cipher_ccm_decrypt (c, outbuf, outbuflen, inbuf, inbuflen); + break; + case GCRY_CIPHER_MODE_STREAM: c->spec->stdecrypt (&c->context.c, outbuf, (byte*)/*arggg*/inbuf, inbuflen); @@ -885,8 +902,19 @@ _gcry_cipher_setkey (gcry_cipher_hd_t hd, const void *key, size_t keylen) gcry_error_t _gcry_cipher_setiv (gcry_cipher_hd_t hd, const void *iv, size_t ivlen) { - cipher_setiv (hd, iv, ivlen); - return 0; + gcry_err_code_t rc = GPG_ERR_NO_ERROR; + + switch (hd->mode) + { + case GCRY_CIPHER_MODE_CCM: + rc = _gcry_cipher_ccm_set_nonce (hd, iv, ivlen); + break; + + default: + cipher_setiv (hd, iv, ivlen); + break; + } + return gpg_error (rc); } /* Set counter for CTR mode. 
(CTR,CTRLEN) must denote a buffer of @@ -914,36 +942,61 @@ gcry_error_t _gcry_cipher_setaad (gcry_cipher_hd_t hd, const void *aad, size_t aadlen, size_t encryptedlen, size_t taglen) { - log_fatal ("gcry_cipher_setaad: invalid mode %d\n", hd->mode ); + gcry_err_code_t rc; - (void)aad; - (void)aadlen; - (void)encryptedlen; - (void)taglen; + switch (hd->mode) + { + case GCRY_CIPHER_MODE_CCM: + rc = _gcry_cipher_ccm_set_aad (hd, aad, aadlen, encryptedlen, taglen); + break; - return gpg_error (GPG_ERR_INV_CIPHER_MODE); + default: + log_fatal ("gcry_cipher_setaad: invalid mode %d\n", hd->mode ); + rc = GPG_ERR_INV_CIPHER_MODE; + break; + } + + return gpg_error (rc); } gcry_error_t _gcry_cipher_gettag (gcry_cipher_hd_t hd, void *outtag, size_t taglen) { - log_fatal ("gcry_cipher_gettag: invalid mode %d\n", hd->mode ); + gcry_err_code_t rc; - (void)outtag; - (void)taglen; + switch (hd->mode) + { + case GCRY_CIPHER_MODE_CCM: + rc = _gcry_cipher_ccm_get_tag (hd, outtag, taglen); + break; + + default: + log_fatal ("gcry_cipher_gettag: invalid mode %d\n", hd->mode ); + rc = GPG_ERR_INV_CIPHER_MODE; + break; + } - return gpg_error (GPG_ERR_INV_CIPHER_MODE); + return gpg_error (rc); } gcry_error_t _gcry_cipher_checktag (gcry_cipher_hd_t hd, const void *intag, size_t taglen) { - log_fatal ("gcry_cipher_checktag: invalid mode %d\n", hd->mode ); + gcry_err_code_t rc; + + switch (hd->mode) + { + case GCRY_CIPHER_MODE_CCM: + rc = _gcry_cipher_ccm_check_tag (hd, intag, taglen); + break; - (void)intag; - (void)taglen; + default: + log_fatal ("gcry_cipher_checktag: invalid mode %d\n", hd->mode ); + rc = GPG_ERR_INV_CIPHER_MODE; + break; + } - return gpg_error (GPG_ERR_INV_CIPHER_MODE); + return gpg_error (rc); } diff --git a/doc/gcrypt.texi b/doc/gcrypt.texi index 1e5f414..8b8867a 100644 --- a/doc/gcrypt.texi +++ b/doc/gcrypt.texi @@ -1635,6 +1635,12 @@ may be specified 64 bit (8 byte) shorter than then input buffer. 
 As per specs the input length must be at least 128 bits and the
 length must be a multiple of 64 bits.
 
+@item GCRY_CIPHER_MODE_CCM
+@cindex CCM, Counter with CBC-MAC mode
+Counter with CBC-MAC mode is an Authenticated Encryption with
+Associated Data (AEAD) block cipher mode, which is specified in
+'NIST Special Publication 800-38C' and RFC 3610.
+
 @end table
 
 @node Working with cipher handles
@@ -1661,11 +1667,13 @@ The cipher mode to use must be specified via @var{mode}.  See
 @xref{Available cipher modes}, for a list of supported cipher modes
 and the according constants.  Note that some modes are incompatible
 with some algorithms - in particular, stream mode
-(@code{GCRY_CIPHER_MODE_STREAM}) only works with stream ciphers.  Any
-block cipher mode (@code{GCRY_CIPHER_MODE_ECB},
+(@code{GCRY_CIPHER_MODE_STREAM}) only works with stream ciphers.  The
+block cipher modes (@code{GCRY_CIPHER_MODE_ECB},
 @code{GCRY_CIPHER_MODE_CBC}, @code{GCRY_CIPHER_MODE_CFB},
-@code{GCRY_CIPHER_MODE_OFB} or @code{GCRY_CIPHER_MODE_CTR}) will work
-with any block cipher algorithm.
+@code{GCRY_CIPHER_MODE_OFB} and @code{GCRY_CIPHER_MODE_CTR}) will work
+with any block cipher algorithm.  The @code{GCRY_CIPHER_MODE_CCM} will
+only work with block cipher algorithms which have the block size of
+16 bytes.
 
 The third argument @var{flags} can either be passed as @code{0} or
 as the bit-wise OR of the following constants.

diff --git a/src/gcrypt.h.in b/src/gcrypt.h.in
index 6132868..ac42276 100644
--- a/src/gcrypt.h.in
+++ b/src/gcrypt.h.in
@@ -884,7 +884,8 @@ enum gcry_cipher_modes
   GCRY_CIPHER_MODE_STREAM = 4,   /* Used with stream ciphers. */
   GCRY_CIPHER_MODE_OFB    = 5,   /* Outer feedback. */
   GCRY_CIPHER_MODE_CTR    = 6,   /* Counter. */
-  GCRY_CIPHER_MODE_AESWRAP= 7    /* AES-WRAP algorithm. */
+  GCRY_CIPHER_MODE_AESWRAP= 7,   /* AES-WRAP algorithm. */
+  GCRY_CIPHER_MODE_CCM    = 8    /* Counter with CBC-MAC. */
 };
 
 /* Flags used with the open function.
*/ @@ -896,6 +897,8 @@ enum gcry_cipher_flags GCRY_CIPHER_CBC_MAC = 8 /* Enable CBC message auth. code (MAC). */ }; +/* CCM works only with blocks of 128 bits. */ +#define GCRY_CCM_BLOCK_LEN (128 / 8) /* Create a handle for algorithm ALGO to be used in MODE. FLAGS may be given as an bitwise OR of the gcry_cipher_flags values. */ diff --git a/tests/basic.c b/tests/basic.c index 1d6e637..6228c0e 100644 --- a/tests/basic.c +++ b/tests/basic.c @@ -1139,6 +1139,561 @@ check_ofb_cipher (void) static void +check_ccm_cipher (void) +{ + static const struct tv + { + int algo; + int keylen; + const char *key; + int noncelen; + const char *nonce; + int aadlen; + const char *aad; + int plainlen; + const char *plaintext; + int cipherlen; + const char *ciphertext; + } tv[] = + { + /* RFC 3610 */ + { GCRY_CIPHER_AES, /* Packet Vector #1 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x03\x02\x01\x00\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 23, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 31, + "\x58\x8C\x97\x9A\x61\xC6\x63\xD2\xF0\x66\xD0\xC2\xC0\xF9\x89\x80\x6D\x5F\x6B\x61\xDA\xC3\x84\x17\xE8\xD1\x2C\xFD\xF9\x26\xE0"}, + { GCRY_CIPHER_AES, /* Packet Vector #2 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x04\x03\x02\x01\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 24, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 32, + "\x72\xC9\x1A\x36\xE1\x35\xF8\xCF\x29\x1C\xA8\x94\x08\x5C\x87\xE3\xCC\x15\xC4\x39\xC9\xE4\x3A\x3B\xA0\x91\xD5\x6E\x10\x40\x09\x16"}, + { GCRY_CIPHER_AES, /* Packet Vector #3 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x05\x04\x03\x02\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 25, + 
"\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 33, + "\x51\xB1\xE5\xF4\x4A\x19\x7D\x1D\xA4\x6B\x0F\x8E\x2D\x28\x2A\xE8\x71\xE8\x38\xBB\x64\xDA\x85\x96\x57\x4A\xDA\xA7\x6F\xBD\x9F\xB0\xC5"}, + { GCRY_CIPHER_AES, /* Packet Vector #4 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x06\x05\x04\x03\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 19, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 27, + "\xA2\x8C\x68\x65\x93\x9A\x9A\x79\xFA\xAA\x5C\x4C\x2A\x9D\x4A\x91\xCD\xAC\x8C\x96\xC8\x61\xB9\xC9\xE6\x1E\xF1"}, + { GCRY_CIPHER_AES, /* Packet Vector #5 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x07\x06\x05\x04\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 20, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 28, + "\xDC\xF1\xFB\x7B\x5D\x9E\x23\xFB\x9D\x4E\x13\x12\x53\x65\x8A\xD8\x6E\xBD\xCA\x3E\x51\xE8\x3F\x07\x7D\x9C\x2D\x93"}, + { GCRY_CIPHER_AES, /* Packet Vector #6 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x08\x07\x06\x05\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 21, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 29, + "\x6F\xC1\xB0\x11\xF0\x06\x56\x8B\x51\x71\xA4\x2D\x95\x3D\x46\x9B\x25\x70\xA4\xBD\x87\x40\x5A\x04\x43\xAC\x91\xCB\x94"}, + { GCRY_CIPHER_AES, /* Packet Vector #7 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x09\x08\x07\x06\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 23, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 33, + 
"\x01\x35\xD1\xB2\xC9\x5F\x41\xD5\xD1\xD4\xFE\xC1\x85\xD1\x66\xB8\x09\x4E\x99\x9D\xFE\xD9\x6C\x04\x8C\x56\x60\x2C\x97\xAC\xBB\x74\x90"}, + { GCRY_CIPHER_AES, /* Packet Vector #8 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0A\x09\x08\x07\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 24, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 34, + "\x7B\x75\x39\x9A\xC0\x83\x1D\xD2\xF0\xBB\xD7\x58\x79\xA2\xFD\x8F\x6C\xAE\x6B\x6C\xD9\xB7\xDB\x24\xC1\x7B\x44\x33\xF4\x34\x96\x3F\x34\xB4"}, + { GCRY_CIPHER_AES, /* Packet Vector #9 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0B\x0A\x09\x08\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 25, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 35, + "\x82\x53\x1A\x60\xCC\x24\x94\x5A\x4B\x82\x79\x18\x1A\xB5\xC8\x4D\xF2\x1C\xE7\xF9\xB7\x3F\x42\xE1\x97\xEA\x9C\x07\xE5\x6B\x5E\xB1\x7E\x5F\x4E"}, + { GCRY_CIPHER_AES, /* Packet Vector #10 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0C\x0B\x0A\x09\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 19, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 29, + "\x07\x34\x25\x94\x15\x77\x85\x15\x2B\x07\x40\x98\x33\x0A\xBB\x14\x1B\x94\x7B\x56\x6A\xA9\x40\x6B\x4D\x99\x99\x88\xDD"}, + { GCRY_CIPHER_AES, /* Packet Vector #11 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0D\x0C\x0B\x0A\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 20, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 30, + 
"\x67\x6B\xB2\x03\x80\xB0\xE3\x01\xE8\xAB\x79\x59\x0A\x39\x6D\xA7\x8B\x83\x49\x34\xF5\x3A\xA2\xE9\x10\x7A\x8B\x6C\x02\x2C"}, + { GCRY_CIPHER_AES, /* Packet Vector #12 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0E\x0D\x0C\x0B\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 21, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 31, + "\xC0\xFF\xA0\xD6\xF0\x5B\xDB\x67\xF2\x4D\x43\xA4\x33\x8D\x2A\xA4\xBE\xD7\xB2\x0E\x43\xCD\x1A\xA3\x16\x62\xE7\xAD\x65\xD6\xDB"}, + { GCRY_CIPHER_AES, /* Packet Vector #13 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x41\x2B\x4E\xA9\xCD\xBE\x3C\x96\x96\x76\x6C\xFA", + 8, "\x0B\xE1\xA8\x8B\xAC\xE0\x18\xB1", + 23, + "\x08\xE8\xCF\x97\xD8\x20\xEA\x25\x84\x60\xE9\x6A\xD9\xCF\x52\x89\x05\x4D\x89\x5C\xEA\xC4\x7C", + 31, + "\x4C\xB9\x7F\x86\xA2\xA4\x68\x9A\x87\x79\x47\xAB\x80\x91\xEF\x53\x86\xA6\xFF\xBD\xD0\x80\xF8\xE7\x8C\xF7\xCB\x0C\xDD\xD7\xB3"}, + { GCRY_CIPHER_AES, /* Packet Vector #14 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x33\x56\x8E\xF7\xB2\x63\x3C\x96\x96\x76\x6C\xFA", + 8, "\x63\x01\x8F\x76\xDC\x8A\x1B\xCB", + 24, + "\x90\x20\xEA\x6F\x91\xBD\xD8\x5A\xFA\x00\x39\xBA\x4B\xAF\xF9\xBF\xB7\x9C\x70\x28\x94\x9C\xD0\xEC", + 32, + "\x4C\xCB\x1E\x7C\xA9\x81\xBE\xFA\xA0\x72\x6C\x55\xD3\x78\x06\x12\x98\xC8\x5C\x92\x81\x4A\xBC\x33\xC5\x2E\xE8\x1D\x7D\x77\xC0\x8A"}, + { GCRY_CIPHER_AES, /* Packet Vector #15 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x10\x3F\xE4\x13\x36\x71\x3C\x96\x96\x76\x6C\xFA", + 8, "\xAA\x6C\xFA\x36\xCA\xE8\x6B\x40", + 25, + "\xB9\x16\xE0\xEA\xCC\x1C\x00\xD7\xDC\xEC\x68\xEC\x0B\x3B\xBB\x1A\x02\xDE\x8A\x2D\x1A\xA3\x46\x13\x2E", + 33, + 
"\xB1\xD2\x3A\x22\x20\xDD\xC0\xAC\x90\x0D\x9A\xA0\x3C\x61\xFC\xF4\xA5\x59\xA4\x41\x77\x67\x08\x97\x08\xA7\x76\x79\x6E\xDB\x72\x35\x06"}, + { GCRY_CIPHER_AES, /* Packet Vector #16 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x76\x4C\x63\xB8\x05\x8E\x3C\x96\x96\x76\x6C\xFA", + 12, "\xD0\xD0\x73\x5C\x53\x1E\x1B\xEC\xF0\x49\xC2\x44", + 19, + "\x12\xDA\xAC\x56\x30\xEF\xA5\x39\x6F\x77\x0C\xE1\xA6\x6B\x21\xF7\xB2\x10\x1C", + 27, + "\x14\xD2\x53\xC3\x96\x7B\x70\x60\x9B\x7C\xBB\x7C\x49\x91\x60\x28\x32\x45\x26\x9A\x6F\x49\x97\x5B\xCA\xDE\xAF"}, + { GCRY_CIPHER_AES, /* Packet Vector #17 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\xF8\xB6\x78\x09\x4E\x3B\x3C\x96\x96\x76\x6C\xFA", + 12, "\x77\xB6\x0F\x01\x1C\x03\xE1\x52\x58\x99\xBC\xAE", + 20, + "\xE8\x8B\x6A\x46\xC7\x8D\x63\xE5\x2E\xB8\xC5\x46\xEF\xB5\xDE\x6F\x75\xE9\xCC\x0D", + 28, + "\x55\x45\xFF\x1A\x08\x5E\xE2\xEF\xBF\x52\xB2\xE0\x4B\xEE\x1E\x23\x36\xC7\x3E\x3F\x76\x2C\x0C\x77\x44\xFE\x7E\x3C"}, + { GCRY_CIPHER_AES, /* Packet Vector #18 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\xD5\x60\x91\x2D\x3F\x70\x3C\x96\x96\x76\x6C\xFA", + 12, "\xCD\x90\x44\xD2\xB7\x1F\xDB\x81\x20\xEA\x60\xC0", + 21, + "\x64\x35\xAC\xBA\xFB\x11\xA8\x2E\x2F\x07\x1D\x7C\xA4\xA5\xEB\xD9\x3A\x80\x3B\xA8\x7F", + 29, + "\x00\x97\x69\xEC\xAB\xDF\x48\x62\x55\x94\xC5\x92\x51\xE6\x03\x57\x22\x67\x5E\x04\xC8\x47\x09\x9E\x5A\xE0\x70\x45\x51"}, + { GCRY_CIPHER_AES, /* Packet Vector #19 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x42\xFF\xF8\xF1\x95\x1C\x3C\x96\x96\x76\x6C\xFA", + 8, "\xD8\x5B\xC7\xE6\x9F\x94\x4F\xB8", + 23, + "\x8A\x19\xB9\x50\xBC\xF7\x1A\x01\x8E\x5E\x67\x01\xC9\x17\x87\x65\x98\x09\xD6\x7D\xBE\xDD\x18", + 33, + "\xBC\x21\x8D\xAA\x94\x74\x27\xB6\xDB\x38\x6A\x99\xAC\x1A\xEF\x23\xAD\xE0\xB5\x29\x39\xCB\x6A\x63\x7C\xF9\xBE\xC2\x40\x88\x97\xC6\xBA"}, + { GCRY_CIPHER_AES, /* 
Packet Vector #20 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x92\x0F\x40\xE5\x6C\xDC\x3C\x96\x96\x76\x6C\xFA", + 8, "\x74\xA0\xEB\xC9\x06\x9F\x5B\x37", + 24, + "\x17\x61\x43\x3C\x37\xC5\xA3\x5F\xC1\xF3\x9F\x40\x63\x02\xEB\x90\x7C\x61\x63\xBE\x38\xC9\x84\x37", + 34, + "\x58\x10\xE6\xFD\x25\x87\x40\x22\xE8\x03\x61\xA4\x78\xE3\xE9\xCF\x48\x4A\xB0\x4F\x44\x7E\xFF\xF6\xF0\xA4\x77\xCC\x2F\xC9\xBF\x54\x89\x44"}, + { GCRY_CIPHER_AES, /* Packet Vector #21 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x27\xCA\x0C\x71\x20\xBC\x3C\x96\x96\x76\x6C\xFA", + 8, "\x44\xA3\xAA\x3A\xAE\x64\x75\xCA", + 25, + "\xA4\x34\xA8\xE5\x85\x00\xC6\xE4\x15\x30\x53\x88\x62\xD6\x86\xEA\x9E\x81\x30\x1B\x5A\xE4\x22\x6B\xFA", + 35, + "\xF2\xBE\xED\x7B\xC5\x09\x8E\x83\xFE\xB5\xB3\x16\x08\xF8\xE2\x9C\x38\x81\x9A\x89\xC8\xE7\x76\xF1\x54\x4D\x41\x51\xA4\xED\x3A\x8B\x87\xB9\xCE"}, + { GCRY_CIPHER_AES, /* Packet Vector #22 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x5B\x8C\xCB\xCD\x9A\xF8\x3C\x96\x96\x76\x6C\xFA", + 12, "\xEC\x46\xBB\x63\xB0\x25\x20\xC3\x3C\x49\xFD\x70", + 19, + "\xB9\x6B\x49\xE2\x1D\x62\x17\x41\x63\x28\x75\xDB\x7F\x6C\x92\x43\xD2\xD7\xC2", + 29, + "\x31\xD7\x50\xA0\x9D\xA3\xED\x7F\xDD\xD4\x9A\x20\x32\xAA\xBF\x17\xEC\x8E\xBF\x7D\x22\xC8\x08\x8C\x66\x6B\xE5\xC1\x97"}, + { GCRY_CIPHER_AES, /* Packet Vector #23 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x3E\xBE\x94\x04\x4B\x9A\x3C\x96\x96\x76\x6C\xFA", + 12, "\x47\xA6\x5A\xC7\x8B\x3D\x59\x42\x27\xE8\x5E\x71", + 20, + "\xE2\xFC\xFB\xB8\x80\x44\x2C\x73\x1B\xF9\x51\x67\xC8\xFF\xD7\x89\x5E\x33\x70\x76", + 30, + "\xE8\x82\xF1\xDB\xD3\x8C\xE3\xED\xA7\xC2\x3F\x04\xDD\x65\x07\x1E\xB4\x13\x42\xAC\xDF\x7E\x00\xDC\xCE\xC7\xAE\x52\x98\x7D"}, + { GCRY_CIPHER_AES, /* Packet Vector #24 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, 
"\x00\x8D\x49\x3B\x30\xAE\x8B\x3C\x96\x96\x76\x6C\xFA", + 12, "\x6E\x37\xA6\xEF\x54\x6D\x95\x5D\x34\xAB\x60\x59", + 21, + "\xAB\xF2\x1C\x0B\x02\xFE\xB8\x8F\x85\x6D\xF4\xA3\x73\x81\xBC\xE3\xCC\x12\x85\x17\xD4", + 31, + "\xF3\x29\x05\xB8\x8A\x64\x1B\x04\xB9\xC9\xFF\xB5\x8C\xC3\x90\x90\x0F\x3D\xA1\x2A\xB1\x6D\xCE\x9E\x82\xEF\xA1\x6D\xA6\x20\x59"}, + /* RFC 5528 */ + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #1 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x03\x02\x01\x00\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 23, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 31, + "\xBA\x73\x71\x85\xE7\x19\x31\x04\x92\xF3\x8A\x5F\x12\x51\xDA\x55\xFA\xFB\xC9\x49\x84\x8A\x0D\xFC\xAE\xCE\x74\x6B\x3D\xB9\xAD"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #2 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x04\x03\x02\x01\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 24, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 32, + "\x5D\x25\x64\xBF\x8E\xAF\xE1\xD9\x95\x26\xEC\x01\x6D\x1B\xF0\x42\x4C\xFB\xD2\xCD\x62\x84\x8F\x33\x60\xB2\x29\x5D\xF2\x42\x83\xE8"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #3 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x05\x04\x03\x02\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 25, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 33, + "\x81\xF6\x63\xD6\xC7\x78\x78\x17\xF9\x20\x36\x08\xB9\x82\xAD\x15\xDC\x2B\xBD\x87\xD7\x56\xF7\x92\x04\xF5\x51\xD6\x68\x2F\x23\xAA\x46"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #4 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x06\x05\x04\x03\xA0\xA1\xA2\xA3\xA4\xA5", + 12, 
"\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 19, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 27, + "\xCA\xEF\x1E\x82\x72\x11\xB0\x8F\x7B\xD9\x0F\x08\xC7\x72\x88\xC0\x70\xA4\xA0\x8B\x3A\x93\x3A\x63\xE4\x97\xA0"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #5 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x07\x06\x05\x04\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 20, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 28, + "\x2A\xD3\xBA\xD9\x4F\xC5\x2E\x92\xBE\x43\x8E\x82\x7C\x10\x23\xB9\x6A\x8A\x77\x25\x8F\xA1\x7B\xA7\xF3\x31\xDB\x09"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #6 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x08\x07\x06\x05\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 21, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 29, + "\xFE\xA5\x48\x0B\xA5\x3F\xA8\xD3\xC3\x44\x22\xAA\xCE\x4D\xE6\x7F\xFA\x3B\xB7\x3B\xAB\xAB\x36\xA1\xEE\x4F\xE0\xFE\x28"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #7 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x09\x08\x07\x06\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 23, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 33, + "\x54\x53\x20\x26\xE5\x4C\x11\x9A\x8D\x36\xD9\xEC\x6E\x1E\xD9\x74\x16\xC8\x70\x8C\x4B\x5C\x2C\xAC\xAF\xA3\xBC\xCF\x7A\x4E\xBF\x95\x73"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #8 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0A\x09\x08\x07\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 24, + 
"\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 34, + "\x8A\xD1\x9B\x00\x1A\x87\xD1\x48\xF4\xD9\x2B\xEF\x34\x52\x5C\xCC\xE3\xA6\x3C\x65\x12\xA6\xF5\x75\x73\x88\xE4\x91\x3E\xF1\x47\x01\xF4\x41"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #9 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0B\x0A\x09\x08\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 25, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 35, + "\x5D\xB0\x8D\x62\x40\x7E\x6E\x31\xD6\x0F\x9C\xA2\xC6\x04\x74\x21\x9A\xC0\xBE\x50\xC0\xD4\xA5\x77\x87\x94\xD6\xE2\x30\xCD\x25\xC9\xFE\xBF\x87"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #10 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0C\x0B\x0A\x09\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 19, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 29, + "\xDB\x11\x8C\xCE\xC1\xB8\x76\x1C\x87\x7C\xD8\x96\x3A\x67\xD6\xF3\xBB\xBC\x5C\xD0\x92\x99\xEB\x11\xF3\x12\xF2\x32\x37"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #11 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0D\x0C\x0B\x0A\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 20, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 30, + "\x7C\xC8\x3D\x8D\xC4\x91\x03\x52\x5B\x48\x3D\xC5\xCA\x7E\xA9\xAB\x81\x2B\x70\x56\x07\x9D\xAF\xFA\xDA\x16\xCC\xCF\x2C\x4E"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #12 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0E\x0D\x0C\x0B\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 21, + 
"\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 31, + "\x2C\xD3\x5B\x88\x20\xD2\x3E\x7A\xA3\x51\xB0\xE9\x2F\xC7\x93\x67\x23\x8B\x2C\xC7\x48\xCB\xB9\x4C\x29\x47\x79\x3D\x64\xAF\x75"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #13 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\xA9\x70\x11\x0E\x19\x27\xB1\x60\xB6\xA3\x1C\x1C", + 8, "\x6B\x7F\x46\x45\x07\xFA\xE4\x96", + 23, + "\xC6\xB5\xF3\xE6\xCA\x23\x11\xAE\xF7\x47\x2B\x20\x3E\x73\x5E\xA5\x61\xAD\xB1\x7D\x56\xC5\xA3", + 31, + "\xA4\x35\xD7\x27\x34\x8D\xDD\x22\x90\x7F\x7E\xB8\xF5\xFD\xBB\x4D\x93\x9D\xA6\x52\x4D\xB4\xF6\x45\x58\xC0\x2D\x25\xB1\x27\xEE"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #14 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\x83\xCD\x8C\xE0\xCB\x42\xB1\x60\xB6\xA3\x1C\x1C", + 8, "\x98\x66\x05\xB4\x3D\xF1\x5D\xE7", + 24, + "\x01\xF6\xCE\x67\x64\xC5\x74\x48\x3B\xB0\x2E\x6B\xBF\x1E\x0A\xBD\x26\xA2\x25\x72\xB4\xD8\x0E\xE7", + 32, + "\x8A\xE0\x52\x50\x8F\xBE\xCA\x93\x2E\x34\x6F\x05\xE0\xDC\x0D\xFB\xCF\x93\x9E\xAF\xFA\x3E\x58\x7C\x86\x7D\x6E\x1C\x48\x70\x38\x06"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #15 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\x5F\x54\x95\x0B\x18\xF2\xB1\x60\xB6\xA3\x1C\x1C", + 8, "\x48\xF2\xE7\xE1\xA7\x67\x1A\x51", + 25, + "\xCD\xF1\xD8\x40\x6F\xC2\xE9\x01\x49\x53\x89\x70\x05\xFB\xFB\x8B\xA5\x72\x76\xF9\x24\x04\x60\x8E\x08", + 33, + "\x08\xB6\x7E\xE2\x1C\x8B\xF2\x6E\x47\x3E\x40\x85\x99\xE9\xC0\x83\x6D\x6A\xF0\xBB\x18\xDF\x55\x46\x6C\xA8\x08\x78\xA7\x90\x47\x6D\xE5"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #16 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\xEC\x60\x08\x63\x31\x9A\xB1\x60\xB6\xA3\x1C\x1C", + 12, "\xDE\x97\xDF\x3B\x8C\xBD\x6D\x8E\x50\x30\xDA\x4C", + 19, + "\xB0\x05\xDC\xFA\x0B\x59\x18\x14\x26\xA9\x61\x68\x5A\x99\x3D\x8C\x43\x18\x5B", + 
27, + "\x63\xB7\x8B\x49\x67\xB1\x9E\xDB\xB7\x33\xCD\x11\x14\xF6\x4E\xB2\x26\x08\x93\x68\xC3\x54\x82\x8D\x95\x0C\xC5"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #17 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\x60\xCF\xF1\xA3\x1E\xA1\xB1\x60\xB6\xA3\x1C\x1C", + 12, "\xA5\xEE\x93\xE4\x57\xDF\x05\x46\x6E\x78\x2D\xCF", + 20, + "\x2E\x20\x21\x12\x98\x10\x5F\x12\x9D\x5E\xD9\x5B\x93\xF7\x2D\x30\xB2\xFA\xCC\xD7", + 28, + "\x0B\xC6\xBB\xE2\xA8\xB9\x09\xF4\x62\x9E\xE6\xDC\x14\x8D\xA4\x44\x10\xE1\x8A\xF4\x31\x47\x38\x32\x76\xF6\x6A\x9F"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #18 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\x0F\x85\xCD\x99\x5C\x97\xB1\x60\xB6\xA3\x1C\x1C", + 12, "\x24\xAA\x1B\xF9\xA5\xCD\x87\x61\x82\xA2\x50\x74", + 21, + "\x26\x45\x94\x1E\x75\x63\x2D\x34\x91\xAF\x0F\xC0\xC9\x87\x6C\x3B\xE4\xAA\x74\x68\xC9", + 29, + "\x22\x2A\xD6\x32\xFA\x31\xD6\xAF\x97\x0C\x34\x5F\x7E\x77\xCA\x3B\xD0\xDC\x25\xB3\x40\xA1\xA3\xD3\x1F\x8D\x4B\x44\xB7"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #19 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\xC2\x9B\x2C\xAA\xC4\xCD\xB1\x60\xB6\xA3\x1C\x1C", + 8, "\x69\x19\x46\xB9\xCA\x07\xBE\x87", + 23, + "\x07\x01\x35\xA6\x43\x7C\x9D\xB1\x20\xCD\x61\xD8\xF6\xC3\x9C\x3E\xA1\x25\xFD\x95\xA0\xD2\x3D", + 33, + "\x05\xB8\xE1\xB9\xC4\x9C\xFD\x56\xCF\x13\x0A\xA6\x25\x1D\xC2\xEC\xC0\x6C\xCC\x50\x8F\xE6\x97\xA0\x06\x6D\x57\xC8\x4B\xEC\x18\x27\x68"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #20 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\x2C\x6B\x75\x95\xEE\x62\xB1\x60\xB6\xA3\x1C\x1C", + 8, "\xD0\xC5\x4E\xCB\x84\x62\x7D\xC4", + 24, + "\xC8\xC0\x88\x0E\x6C\x63\x6E\x20\x09\x3D\xD6\x59\x42\x17\xD2\xE1\x88\x77\xDB\x26\x4E\x71\xA5\xCC", + 34, + 
"\x54\xCE\xB9\x68\xDE\xE2\x36\x11\x57\x5E\xC0\x03\xDF\xAA\x1C\xD4\x88\x49\xBD\xF5\xAE\x2E\xDB\x6B\x7F\xA7\x75\xB1\x50\xED\x43\x83\xC5\xA9"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #21 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\xC5\x3C\xD4\xC2\xAA\x24\xB1\x60\xB6\xA3\x1C\x1C", + 8, "\xE2\x85\xE0\xE4\x80\x8C\xDA\x3D", + 25, + "\xF7\x5D\xAA\x07\x10\xC4\xE6\x42\x97\x79\x4D\xC2\xB7\xD2\xA2\x07\x57\xB1\xAA\x4E\x44\x80\x02\xFF\xAB", + 35, + "\xB1\x40\x45\x46\xBF\x66\x72\x10\xCA\x28\xE3\x09\xB3\x9B\xD6\xCA\x7E\x9F\xC8\x28\x5F\xE6\x98\xD4\x3C\xD2\x0A\x02\xE0\xBD\xCA\xED\x20\x10\xD3"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #22 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\xBE\xE9\x26\x7F\xBA\xDC\xB1\x60\xB6\xA3\x1C\x1C", + 12, "\x6C\xAE\xF9\x94\x11\x41\x57\x0D\x7C\x81\x34\x05", + 19, + "\xC2\x38\x82\x2F\xAC\x5F\x98\xFF\x92\x94\x05\xB0\xAD\x12\x7A\x4E\x41\x85\x4E", + 29, + "\x94\xC8\x95\x9C\x11\x56\x9A\x29\x78\x31\xA7\x21\x00\x58\x57\xAB\x61\xB8\x7A\x2D\xEA\x09\x36\xB6\xEB\x5F\x62\x5F\x5D"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #23 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\xDF\xA8\xB1\x24\x50\x07\xB1\x60\xB6\xA3\x1C\x1C", + 12, "\x36\xA5\x2C\xF1\x6B\x19\xA2\x03\x7A\xB7\x01\x1E", + 20, + "\x4D\xBF\x3E\x77\x4A\xD2\x45\xE5\xD5\x89\x1F\x9D\x1C\x32\xA0\xAE\x02\x2C\x85\xD7", + 30, + "\x58\x69\xE3\xAA\xD2\x44\x7C\x74\xE0\xFC\x05\xF9\xA4\xEA\x74\x57\x7F\x4D\xE8\xCA\x89\x24\x76\x42\x96\xAD\x04\x11\x9C\xE7"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #24 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\x3B\x8F\xD8\xD3\xA9\x37\xB1\x60\xB6\xA3\x1C\x1C", + 12, "\xA4\xD4\x99\xF7\x84\x19\x72\x8C\x19\x17\x8B\x0C", + 21, + "\x9D\xC9\xED\xAE\x2F\xF5\xDF\x86\x36\xE8\xC6\xDE\x0E\xED\x55\xF7\x86\x7E\x33\x33\x7D", + 31, + 
"\x4B\x19\x81\x56\x39\x3B\x0F\x77\x96\x08\x6A\xAF\xB4\x54\xF8\xC3\xF0\x34\xCC\xA9\x66\x94\x5F\x1F\xCE\xA7\xE1\x1B\xEE\x6A\x2F"}
+  };
+  static const int cut[] = { 0, 1, 8, 10, 16, 19, -1 };
+  gcry_cipher_hd_t hde, hdd;
+  unsigned char out[MAX_DATA_LEN];
+  int split, j, i, keylen, blklen, authlen;
+  gcry_error_t err = 0;
+
+  if (verbose)
+    fprintf (stderr, " Starting CCM checks.\n");
+
+  for (i = 0; i < sizeof (tv) / sizeof (tv[0]); i++)
+    {
+      if (verbose)
+        fprintf (stderr, " checking CCM mode for %s [%i]\n",
+                 gcry_cipher_algo_name (tv[i].algo),
+                 tv[i].algo);
+
+      for (j = 0; j < sizeof (cut) / sizeof (cut[0]); j++)
+        {
+          split = cut[j] < 0 ? tv[i].plainlen : cut[j];
+          if (tv[i].plainlen < split)
+            continue;
+
+          err = gcry_cipher_open (&hde, tv[i].algo, GCRY_CIPHER_MODE_CCM, 0);
+          if (!err)
+            err = gcry_cipher_open (&hdd, tv[i].algo, GCRY_CIPHER_MODE_CCM, 0);
+          if (err)
+            {
+              fail ("cipher-ccm, gcry_cipher_open failed: %s\n",
+                    gpg_strerror (err));
+              return;
+            }
+
+          keylen = gcry_cipher_get_algo_keylen(tv[i].algo);
+          if (!keylen)
+            {
+              fail ("cipher-ccm, gcry_cipher_get_algo_keylen failed\n");
+              return;
+            }
+
+          err = gcry_cipher_setkey (hde, tv[i].key, keylen);
+          if (!err)
+            err = gcry_cipher_setkey (hdd, tv[i].key, keylen);
+          if (err)
+            {
+              fail ("cipher-ccm, gcry_cipher_setkey failed: %s\n",
+                    gpg_strerror (err));
+              gcry_cipher_close (hde);
+              gcry_cipher_close (hdd);
+              return;
+            }
+
+          blklen = gcry_cipher_get_algo_blklen(tv[i].algo);
+          if (!blklen)
+            {
+              fail ("cipher-ccm, gcry_cipher_get_algo_blklen failed\n");
+              return;
+            }
+
+          err = gcry_cipher_setiv (hde, tv[i].nonce, tv[i].noncelen);
+          if (!err)
+            err = gcry_cipher_setiv (hdd, tv[i].nonce, tv[i].noncelen);
+          if (err)
+            {
+              fail ("cipher-ccm, gcry_cipher_setiv failed: %s\n",
+                    gpg_strerror (err));
+              gcry_cipher_close (hde);
+              gcry_cipher_close (hdd);
+              return;
+            }
+
+          authlen = tv[i].cipherlen - tv[i].plainlen;
+          err = gcry_cipher_setaad (hde, tv[i].aad, tv[i].aadlen,
+                                    tv[i].plainlen, authlen);
+          if (!err)
+            err = gcry_cipher_setaad (hdd, tv[i].aad, tv[i].aadlen,
+                                      tv[i].plainlen, authlen);
+          if (err)
+            {
+              fail ("cipher-ccm, gcry_cipher_setaad failed: %s\n",
+                    gpg_strerror (err));
+              gcry_cipher_close (hde);
+              gcry_cipher_close (hdd);
+              return;
+            }
+
+          err = gcry_cipher_encrypt (hde, out, MAX_DATA_LEN, tv[i].plaintext,
+                                     tv[i].plainlen - split);
+          if (!err)
+            err = gcry_cipher_encrypt (hde, &out[tv[i].plainlen - split],
+                                       MAX_DATA_LEN - (tv[i].plainlen - split),
+                                       &tv[i].plaintext[tv[i].plainlen - split],
+                                       split);
+          if (err)
+            {
+              fail ("cipher-ccm, gcry_cipher_encrypt (%d:%d) failed: %s\n",
+                    i, j, gpg_strerror (err));
+              gcry_cipher_close (hde);
+              gcry_cipher_close (hdd);
+              return;
+            }
+
+          err = gcry_cipher_gettag (hde, &out[tv[i].plainlen], authlen);
+          if (err)
+            {
+              fail ("cipher-ccm, gcry_cipher_gettag (%d:%d) failed: %s\n",
+                    i, j, gpg_strerror (err));
+              gcry_cipher_close (hde);
+              gcry_cipher_close (hdd);
+              return;
+            }
+
+          if (memcmp (tv[i].ciphertext, out, tv[i].cipherlen))
+            fail ("cipher-ccm, encrypt mismatch entry %d:%d\n", i, j);
+
+          err = gcry_cipher_decrypt (hdd, out, tv[i].plainlen - split, NULL, 0);
+          if (!err)
+            err = gcry_cipher_decrypt (hdd, &out[tv[i].plainlen - split], split,
+                                       NULL, 0);
+          if (err)
+            {
+              fail ("cipher-ccm, gcry_cipher_decrypt (%d:%d) failed: %s\n",
+                    i, j, gpg_strerror (err));
+              gcry_cipher_close (hde);
+              gcry_cipher_close (hdd);
+              return;
+            }
+
+          if (memcmp (tv[i].plaintext, out, tv[i].plainlen))
+            fail ("cipher-ccm, decrypt mismatch entry %d:%d\n", i, j);
+
+          err = gcry_cipher_checktag (hdd, &out[tv[i].plainlen], authlen);
+          if (err)
+            {
+              fail ("cipher-ccm, gcry_cipher_checktag (%d:%d) failed: %s\n",
+                    i, j, gpg_strerror (err));
+              gcry_cipher_close (hde);
+              gcry_cipher_close (hdd);
+              return;
+            }
+
+          gcry_cipher_close (hde);
+          gcry_cipher_close (hdd);
+        }
+    }
+
+  if (verbose)
+    fprintf (stderr, " Completed CCM checks.\n");
+}
+
+
 static void
 check_stream_cipher (void)
 {
   struct tv
@@ -2455,6 +3010,7 @@ check_cipher_modes(void)
   check_ctr_cipher ();
   check_cfb_cipher ();
   check_ofb_cipher ();
+  check_ccm_cipher ();
   check_stream_cipher ();
   check_stream_cipher_large_block ();

diff --git a/tests/benchmark.c b/tests/benchmark.c
index ecda0d3..5d7e7fd 100644
--- a/tests/benchmark.c
+++ b/tests/benchmark.c
@@ -435,6 +435,36 @@ md_bench ( const char *algoname )
   fflush (stdout);
 }

+
+static void ccm_aead_init(gcry_cipher_hd_t hd, size_t buflen, int authlen)
+{
+  const int _L = 4;
+  const int noncelen = 15 - _L;
+  char nonce[noncelen];
+  gcry_error_t err = GPG_ERR_NO_ERROR;
+
+  memset (nonce, 0x33, noncelen);
+
+  err = gcry_cipher_setiv (hd, nonce, noncelen);
+  if (err)
+    {
+      fprintf (stderr, "gcry_cipher_setiv failed: %s\n",
+               gpg_strerror (err));
+      gcry_cipher_close (hd);
+      exit (1);
+    }
+
+  err = gcry_cipher_setaad (hd, NULL, 0, buflen, authlen);
+  if (err)
+    {
+      fprintf (stderr, "gcry_cipher_setaad failed: %s\n",
+               gpg_strerror (err));
+      gcry_cipher_close (hd);
+      exit (1);
+    }
+}
+
+
 static void
 cipher_bench ( const char *algoname )
 {
@@ -448,12 +478,21 @@ cipher_bench ( const char *algoname )
   char *raw_outbuf, *raw_buf;
   size_t allocated_buflen, buflen;
   int repetitions;
-  static struct { int mode; const char *name; int blocked; } modes[] = {
+  static const struct {
+    int mode;
+    const char *name;
+    int blocked;
+    void (* const aead_init)(gcry_cipher_hd_t hd, size_t buflen, int authlen);
+    int req_blocksize;
+    int authlen;
+  } modes[] = {
     { GCRY_CIPHER_MODE_ECB, " ECB/Stream", 1 },
     { GCRY_CIPHER_MODE_CBC, " CBC", 1 },
     { GCRY_CIPHER_MODE_CFB, " CFB", 0 },
     { GCRY_CIPHER_MODE_OFB, " OFB", 0 },
     { GCRY_CIPHER_MODE_CTR, " CTR", 0 },
+    { GCRY_CIPHER_MODE_CCM, " CCM", 0,
+      ccm_aead_init, GCRY_CCM_BLOCK_LEN, 8 },
     { GCRY_CIPHER_MODE_STREAM, "", 0 },
     {0}
   };
@@ -542,9 +581,16 @@ cipher_bench ( const char *algoname )
   for (modeidx=0; modes[modeidx].mode; modeidx++)
     {
       if ((blklen > 1 && modes[modeidx].mode == GCRY_CIPHER_MODE_STREAM)
-          | (blklen == 1 && modes[modeidx].mode != GCRY_CIPHER_MODE_STREAM))
+          || (blklen == 1 && modes[modeidx].mode != GCRY_CIPHER_MODE_STREAM))
         continue;

+      if (modes[modeidx].req_blocksize > 0
+          && blklen != modes[modeidx].req_blocksize)
+        {
+          printf (" %7s %7s", "-", "-" );
+          continue;
+        }
+
       for (i=0; i < sizeof buf; i++)
         buf[i] = i;
@@ -585,7 +631,18 @@ cipher_bench ( const char *algoname )
               exit (1);
             }
         }
-      err = gcry_cipher_encrypt ( hd, outbuf, buflen, buf, buflen);
+      if (modes[modeidx].aead_init)
+        {
+          (*modes[modeidx].aead_init) (hd, buflen, modes[modeidx].authlen);
+          err = gcry_cipher_encrypt (hd, outbuf, buflen, buf, buflen);
+          if (err)
+            break;
+          err = gcry_cipher_gettag (hd, outbuf, modes[modeidx].authlen);
+        }
+      else
+        {
+          err = gcry_cipher_encrypt (hd, outbuf, buflen, buf, buflen);
+        }
     }
   stop_timer ();
@@ -632,7 +689,18 @@ cipher_bench ( const char *algoname )
               exit (1);
             }
         }
-      err = gcry_cipher_decrypt ( hd, outbuf, buflen, buf, buflen);
+      if (modes[modeidx].aead_init)
+        {
+          (*modes[modeidx].aead_init) (hd, buflen, modes[modeidx].authlen);
+          err = gcry_cipher_decrypt (hd, outbuf, buflen, buf, buflen);
+          if (err)
+            break;
+          err = gcry_cipher_checktag (hd, outbuf, modes[modeidx].authlen);
+          if (gpg_err_code (err) == GPG_ERR_CHECKSUM)
+            err = gpg_error (GPG_ERR_NO_ERROR);
+        }
+      else
+        err = gcry_cipher_decrypt (hd, outbuf, buflen, buf, buflen);
     }
   stop_timer ();
   printf (" %s", elapsed_time ());

From jussi.kivilinna at iki.fi  Sun Oct 20 20:01:42 2013
From: jussi.kivilinna at iki.fi (Jussi Kivilinna)
Date: Sun, 20 Oct 2013 21:01:42 +0300
Subject: [PATCH] serpent-amd64: do not use GAS macros
Message-ID: <20131020180142.19195.21145.stgit@localhost6.localdomain6>

* cipher/serpent-avx2-amd64.S: Remove use of GAS macros.
* cipher/serpent-sse2-amd64.S: Ditto.
* configure.ac [HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS]: Do not check
  for GAS macros.
--
This way we have better portability; for example, when compiling with
clang on x86-64, the assembly implementations are now enabled and
working.
Signed-off-by: Jussi Kivilinna --- cipher/serpent-avx2-amd64.S | 519 ++++++++++++++++++------------------------- cipher/serpent-sse2-amd64.S | 507 ++++++++++++++++++------------------------ configure.ac | 6 3 files changed, 439 insertions(+), 593 deletions(-) diff --git a/cipher/serpent-avx2-amd64.S b/cipher/serpent-avx2-amd64.S index c726e7b..8a76ab1 100644 --- a/cipher/serpent-avx2-amd64.S +++ b/cipher/serpent-avx2-amd64.S @@ -36,51 +36,36 @@ #define CTX %rdi /* vector registers */ -.set RA0, %ymm0 -.set RA1, %ymm1 -.set RA2, %ymm2 -.set RA3, %ymm3 -.set RA4, %ymm4 - -.set RB0, %ymm5 -.set RB1, %ymm6 -.set RB2, %ymm7 -.set RB3, %ymm8 -.set RB4, %ymm9 - -.set RNOT, %ymm10 -.set RTMP0, %ymm11 -.set RTMP1, %ymm12 -.set RTMP2, %ymm13 -.set RTMP3, %ymm14 -.set RTMP4, %ymm15 - -.set RNOTx, %xmm10 -.set RTMP0x, %xmm11 -.set RTMP1x, %xmm12 -.set RTMP2x, %xmm13 -.set RTMP3x, %xmm14 -.set RTMP4x, %xmm15 +#define RA0 %ymm0 +#define RA1 %ymm1 +#define RA2 %ymm2 +#define RA3 %ymm3 +#define RA4 %ymm4 + +#define RB0 %ymm5 +#define RB1 %ymm6 +#define RB2 %ymm7 +#define RB3 %ymm8 +#define RB4 %ymm9 + +#define RNOT %ymm10 +#define RTMP0 %ymm11 +#define RTMP1 %ymm12 +#define RTMP2 %ymm13 +#define RTMP3 %ymm14 +#define RTMP4 %ymm15 + +#define RNOTx %xmm10 +#define RTMP0x %xmm11 +#define RTMP1x %xmm12 +#define RTMP2x %xmm13 +#define RTMP3x %xmm14 +#define RTMP4x %xmm15 /********************************************************************** helper macros **********************************************************************/ -/* preprocessor macro for renaming vector registers using GAS macros */ -#define sbox_reg_rename(r0, r1, r2, r3, r4, \ - new_r0, new_r1, new_r2, new_r3, new_r4) \ - .set rename_reg0, new_r0; \ - .set rename_reg1, new_r1; \ - .set rename_reg2, new_r2; \ - .set rename_reg3, new_r3; \ - .set rename_reg4, new_r4; \ - \ - .set r0, rename_reg0; \ - .set r1, rename_reg1; \ - .set r2, rename_reg2; \ - .set r3, rename_reg3; \ - .set r4, rename_reg4; - /* vector 32-bit 
rotation to left */ #define vec_rol(reg, nleft, tmp) \ vpslld $(nleft), reg, tmp; \ @@ -128,9 +113,7 @@ vpxor r4, r2, r2; vpxor RNOT, r4, r4; \ vpor r1, r4, r4; vpxor r3, r1, r1; \ vpxor r4, r1, r1; vpor r0, r3, r3; \ - vpxor r3, r1, r1; vpxor r3, r4, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r4,r2,r0,r3); + vpxor r3, r1, r1; vpxor r3, r4, r4; #define SBOX0_INVERSE(r0, r1, r2, r3, r4) \ vpxor RNOT, r2, r2; vmovdqa r1, r4; \ @@ -143,9 +126,7 @@ vpxor r1, r2, r2; vpxor r0, r3, r3; \ vpxor r1, r3, r3; \ vpand r3, r2, r2; \ - vpxor r2, r4, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r0,r4,r1,r3,r2); + vpxor r2, r4, r4; #define SBOX1(r0, r1, r2, r3, r4) \ vpxor RNOT, r0, r0; vpxor RNOT, r2, r2; \ @@ -157,9 +138,7 @@ vpand r4, r2, r2; vpxor r1, r0, r0; \ vpand r2, r1, r1; \ vpxor r0, r1, r1; vpand r2, r0, r0; \ - vpxor r4, r0, r0; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r2,r0,r3,r1,r4); + vpxor r4, r0, r0; #define SBOX1_INVERSE(r0, r1, r2, r3, r4) \ vmovdqa r1, r4; vpxor r3, r1, r1; \ @@ -172,9 +151,7 @@ vpxor r1, r4, r4; vpor r0, r1, r1; \ vpxor r0, r1, r1; \ vpor r4, r1, r1; \ - vpxor r1, r3, r3; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r4,r0,r3,r2,r1); + vpxor r1, r3, r3; #define SBOX2(r0, r1, r2, r3, r4) \ vmovdqa r0, r4; vpand r2, r0, r0; \ @@ -184,9 +161,7 @@ vmovdqa r3, r1; vpor r4, r3, r3; \ vpxor r0, r3, r3; vpand r1, r0, r0; \ vpxor r0, r4, r4; vpxor r3, r1, r1; \ - vpxor r4, r1, r1; vpxor RNOT, r4, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r2,r3,r1,r4,r0); + vpxor r4, r1, r1; vpxor RNOT, r4, r4; #define SBOX2_INVERSE(r0, r1, r2, r3, r4) \ vpxor r3, r2, r2; vpxor r0, r3, r3; \ @@ -198,9 +173,7 @@ vpor r0, r2, r2; vpxor RNOT, r3, r3; \ vpxor r3, r2, r2; vpxor r3, r0, r0; \ vpand r1, r0, r0; vpxor r4, r3, r3; \ - vpxor r0, r3, r3; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r4,r2,r3,r0); + vpxor r0, r3, r3; #define SBOX3(r0, r1, r2, r3, r4) \ vmovdqa r0, r4; vpor r3, r0, r0; \ @@ -212,9 +185,7 @@ vpxor r2, r4, r4; vpor r0, r1, r1; \ vpxor r2, r1, r1; vpxor 
r3, r0, r0; \ vmovdqa r1, r2; vpor r3, r1, r1; \ - vpxor r0, r1, r1; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r2,r3,r4,r0); + vpxor r0, r1, r1; #define SBOX3_INVERSE(r0, r1, r2, r3, r4) \ vmovdqa r2, r4; vpxor r1, r2, r2; \ @@ -226,9 +197,7 @@ vpxor r1, r3, r3; vpxor r0, r1, r1; \ vpor r2, r1, r1; vpxor r3, r0, r0; \ vpxor r4, r1, r1; \ - vpxor r1, r0, r0; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r2,r1,r3,r0,r4); + vpxor r1, r0, r0; #define SBOX4(r0, r1, r2, r3, r4) \ vpxor r3, r1, r1; vpxor RNOT, r3, r3; \ @@ -240,9 +209,7 @@ vpxor r0, r3, r3; vpor r1, r4, r4; \ vpxor r0, r4, r4; vpor r3, r0, r0; \ vpxor r2, r0, r0; vpand r3, r2, r2; \ - vpxor RNOT, r0, r0; vpxor r2, r4, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r4,r0,r3,r2); + vpxor RNOT, r0, r0; vpxor r2, r4, r4; #define SBOX4_INVERSE(r0, r1, r2, r3, r4) \ vmovdqa r2, r4; vpand r3, r2, r2; \ @@ -255,9 +222,7 @@ vpand r0, r2, r2; vpxor r0, r3, r3; \ vpxor r4, r2, r2; \ vpor r3, r2, r2; vpxor r0, r3, r3; \ - vpxor r1, r2, r2; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r0,r3,r2,r4,r1); + vpxor r1, r2, r2; #define SBOX5(r0, r1, r2, r3, r4) \ vpxor r1, r0, r0; vpxor r3, r1, r1; \ @@ -269,9 +234,7 @@ vpxor r2, r4, r4; vpxor r0, r2, r2; \ vpand r3, r0, r0; vpxor RNOT, r2, r2; \ vpxor r4, r0, r0; vpor r3, r4, r4; \ - vpxor r4, r2, r2; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r3,r0,r2,r4); + vpxor r4, r2, r2; #define SBOX5_INVERSE(r0, r1, r2, r3, r4) \ vpxor RNOT, r1, r1; vmovdqa r3, r4; \ @@ -283,9 +246,7 @@ vpxor r3, r1, r1; vpxor r2, r4, r4; \ vpand r4, r3, r3; vpxor r1, r4, r4; \ vpxor r4, r3, r3; vpxor RNOT, r4, r4; \ - vpxor r0, r3, r3; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r4,r3,r2,r0); + vpxor r0, r3, r3; #define SBOX6(r0, r1, r2, r3, r4) \ vpxor RNOT, r2, r2; vmovdqa r3, r4; \ @@ -297,9 +258,7 @@ vpxor r2, r0, r0; vpxor r3, r4, r4; \ vpxor r0, r4, r4; vpxor RNOT, r3, r3; \ vpand r4, r2, r2; \ - vpxor r3, r2, r2; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r0,r1,r4,r2,r3); + vpxor r3, r2, r2; #define 
SBOX6_INVERSE(r0, r1, r2, r3, r4) \ vpxor r2, r0, r0; vmovdqa r2, r4; \ @@ -310,9 +269,7 @@ vpxor r1, r4, r4; vpand r3, r1, r1; \ vpxor r0, r1, r1; vpxor r3, r0, r0; \ vpor r2, r0, r0; vpxor r1, r3, r3; \ - vpxor r0, r4, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r2,r4,r3,r0); + vpxor r0, r4, r4; #define SBOX7(r0, r1, r2, r3, r4) \ vmovdqa r1, r4; vpor r2, r1, r1; \ @@ -325,9 +282,7 @@ vpxor r1, r2, r2; vpand r0, r1, r1; \ vpxor r4, r1, r1; vpxor RNOT, r2, r2; \ vpor r0, r2, r2; \ - vpxor r2, r4, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r4,r3,r1,r0,r2); + vpxor r2, r4, r4; #define SBOX7_INVERSE(r0, r1, r2, r3, r4) \ vmovdqa r2, r4; vpxor r0, r2, r2; \ @@ -339,9 +294,7 @@ vpor r2, r0, r0; vpxor r1, r4, r4; \ vpxor r3, r0, r0; vpxor r4, r3, r3; \ vpor r0, r4, r4; vpxor r2, r3, r3; \ - vpxor r2, r4, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r3,r0,r1,r4,r2); + vpxor r2, r4, r4; /* Apply SBOX number WHICH to to the block. */ #define SBOX(which, r0, r1, r2, r3, r4) \ @@ -402,49 +355,51 @@ /* Apply a Serpent round to sixteen parallel blocks. This macro increments `round'. */ -#define ROUND(which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ - BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ - SBOX (which, a0, a1, a2, a3, a4); \ - BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ - SBOX (which, b0, b1, b2, b3, b4); \ - LINEAR_TRANSFORMATION (a0, a1, a2, a3, a4); \ - LINEAR_TRANSFORMATION (b0, b1, b2, b3, b4); \ - .set round, (round + 1); +#define ROUND(round, which, a0, a1, a2, a3, a4, na0, na1, na2, na3, na4, \ + b0, b1, b2, b3, b4, nb0, nb1, nb2, nb3, nb4) \ + BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ + SBOX (which, a0, a1, a2, a3, a4); \ + BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ + SBOX (which, b0, b1, b2, b3, b4); \ + LINEAR_TRANSFORMATION (na0, na1, na2, na3, na4); \ + LINEAR_TRANSFORMATION (nb0, nb1, nb2, nb3, nb4); /* Apply the last Serpent round to sixteen parallel blocks. This macro increments `round'. 
*/ -#define ROUND_LAST(which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ - BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ - SBOX (which, a0, a1, a2, a3, a4); \ - BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ - SBOX (which, b0, b1, b2, b3, b4); \ - .set round, (round + 1); \ - BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ - BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ - .set round, (round + 1); +#define ROUND_LAST(round, which, a0, a1, a2, a3, a4, na0, na1, na2, na3, na4, \ + b0, b1, b2, b3, b4, nb0, nb1, nb2, nb3, nb4) \ + BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ + SBOX (which, a0, a1, a2, a3, a4); \ + BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ + SBOX (which, b0, b1, b2, b3, b4); \ + BLOCK_XOR_KEY (na0, na1, na2, na3, na4, ((round) + 1)); \ + BLOCK_XOR_KEY (nb0, nb1, nb2, nb3, nb4, ((round) + 1)); /* Apply an inverse Serpent round to sixteen parallel blocks. This macro increments `round'. */ -#define ROUND_INVERSE(which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ +#define ROUND_INVERSE(round, which, a0, a1, a2, a3, a4, \ + na0, na1, na2, na3, na4, \ + b0, b1, b2, b3, b4, \ + nb0, nb1, nb2, nb3, nb4) \ LINEAR_TRANSFORMATION_INVERSE (a0, a1, a2, a3, a4); \ LINEAR_TRANSFORMATION_INVERSE (b0, b1, b2, b3, b4); \ SBOX_INVERSE (which, a0, a1, a2, a3, a4); \ - BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ + BLOCK_XOR_KEY (na0, na1, na2, na3, na4, round); \ SBOX_INVERSE (which, b0, b1, b2, b3, b4); \ - BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ - .set round, (round - 1); + BLOCK_XOR_KEY (nb0, nb1, nb2, nb3, nb4, round); /* Apply the first inverse Serpent round to sixteen parallel blocks. This macro increments `round'. 
*/ -#define ROUND_FIRST_INVERSE(which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ - BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ - BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ - .set round, (round - 1); \ +#define ROUND_FIRST_INVERSE(round, which, a0, a1, a2, a3, a4, \ + na0, na1, na2, na3, na4, \ + b0, b1, b2, b3, b4, \ + nb0, nb1, nb2, nb3, nb4) \ + BLOCK_XOR_KEY (a0, a1, a2, a3, a4, ((round) + 1)); \ + BLOCK_XOR_KEY (b0, b1, b2, b3, b4, ((round) + 1)); \ SBOX_INVERSE (which, a0, a1, a2, a3, a4); \ - BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ + BLOCK_XOR_KEY (na0, na1, na2, na3, na4, round); \ SBOX_INVERSE (which, b0, b1, b2, b3, b4); \ - BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ - .set round, (round - 1); + BLOCK_XOR_KEY (nb0, nb1, nb2, nb3, nb4, round); .text @@ -456,72 +411,82 @@ __serpent_enc_blk16: * RA0, RA1, RA2, RA3, RB0, RB1, RB2, RB3: sixteen parallel * plaintext blocks * output: - * RA0, RA1, RA2, RA3, RB0, RB1, RB2, RB3: sixteen parallel + * RA4, RA1, RA2, RA0, RB4, RB1, RB2, RB0: sixteen parallel * ciphertext blocks */ - /* record input vector names for __serpent_enc_blk16 */ - .set enc_in_a0, RA0 - .set enc_in_a1, RA1 - .set enc_in_a2, RA2 - .set enc_in_a3, RA3 - .set enc_in_b0, RB0 - .set enc_in_b1, RB1 - .set enc_in_b2, RB2 - .set enc_in_b3, RB3 - vpcmpeqd RNOT, RNOT, RNOT; transpose_4x4(RA0, RA1, RA2, RA3, RA4, RTMP0, RTMP1); transpose_4x4(RB0, RB1, RB2, RB3, RB4, RTMP0, RTMP1); - .set round, 0 - ROUND (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, 
RB3, RB4); - ROUND (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - - ROUND_LAST (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - - transpose_4x4(RA0, RA1, RA2, RA3, RA4, RTMP0, RTMP1); - transpose_4x4(RB0, RB1, RB2, RB3, RB4, RTMP0, RTMP1); - - /* record output vector names for __serpent_enc_blk16 */ - .set enc_out_a0, RA0 - .set enc_out_a1, RA1 - .set enc_out_a2, RA2 - .set enc_out_a3, RA3 - .set enc_out_b0, RB0 - .set enc_out_b1, RB1 - .set enc_out_b2, RB2 - .set enc_out_b3, RB3 + ROUND (0, 0, RA0, RA1, RA2, RA3, RA4, RA1, RA4, RA2, RA0, RA3, + RB0, RB1, RB2, RB3, RB4, RB1, RB4, RB2, RB0, RB3); + ROUND (1, 1, RA1, RA4, RA2, RA0, RA3, RA2, RA1, RA0, 
RA4, RA3, + RB1, RB4, RB2, RB0, RB3, RB2, RB1, RB0, RB4, RB3); + ROUND (2, 2, RA2, RA1, RA0, RA4, RA3, RA0, RA4, RA1, RA3, RA2, + RB2, RB1, RB0, RB4, RB3, RB0, RB4, RB1, RB3, RB2); + ROUND (3, 3, RA0, RA4, RA1, RA3, RA2, RA4, RA1, RA3, RA2, RA0, + RB0, RB4, RB1, RB3, RB2, RB4, RB1, RB3, RB2, RB0); + ROUND (4, 4, RA4, RA1, RA3, RA2, RA0, RA1, RA0, RA4, RA2, RA3, + RB4, RB1, RB3, RB2, RB0, RB1, RB0, RB4, RB2, RB3); + ROUND (5, 5, RA1, RA0, RA4, RA2, RA3, RA0, RA2, RA1, RA4, RA3, + RB1, RB0, RB4, RB2, RB3, RB0, RB2, RB1, RB4, RB3); + ROUND (6, 6, RA0, RA2, RA1, RA4, RA3, RA0, RA2, RA3, RA1, RA4, + RB0, RB2, RB1, RB4, RB3, RB0, RB2, RB3, RB1, RB4); + ROUND (7, 7, RA0, RA2, RA3, RA1, RA4, RA4, RA1, RA2, RA0, RA3, + RB0, RB2, RB3, RB1, RB4, RB4, RB1, RB2, RB0, RB3); + ROUND (8, 0, RA4, RA1, RA2, RA0, RA3, RA1, RA3, RA2, RA4, RA0, + RB4, RB1, RB2, RB0, RB3, RB1, RB3, RB2, RB4, RB0); + ROUND (9, 1, RA1, RA3, RA2, RA4, RA0, RA2, RA1, RA4, RA3, RA0, + RB1, RB3, RB2, RB4, RB0, RB2, RB1, RB4, RB3, RB0); + ROUND (10, 2, RA2, RA1, RA4, RA3, RA0, RA4, RA3, RA1, RA0, RA2, + RB2, RB1, RB4, RB3, RB0, RB4, RB3, RB1, RB0, RB2); + ROUND (11, 3, RA4, RA3, RA1, RA0, RA2, RA3, RA1, RA0, RA2, RA4, + RB4, RB3, RB1, RB0, RB2, RB3, RB1, RB0, RB2, RB4); + ROUND (12, 4, RA3, RA1, RA0, RA2, RA4, RA1, RA4, RA3, RA2, RA0, + RB3, RB1, RB0, RB2, RB4, RB1, RB4, RB3, RB2, RB0); + ROUND (13, 5, RA1, RA4, RA3, RA2, RA0, RA4, RA2, RA1, RA3, RA0, + RB1, RB4, RB3, RB2, RB0, RB4, RB2, RB1, RB3, RB0); + ROUND (14, 6, RA4, RA2, RA1, RA3, RA0, RA4, RA2, RA0, RA1, RA3, + RB4, RB2, RB1, RB3, RB0, RB4, RB2, RB0, RB1, RB3); + ROUND (15, 7, RA4, RA2, RA0, RA1, RA3, RA3, RA1, RA2, RA4, RA0, + RB4, RB2, RB0, RB1, RB3, RB3, RB1, RB2, RB4, RB0); + ROUND (16, 0, RA3, RA1, RA2, RA4, RA0, RA1, RA0, RA2, RA3, RA4, + RB3, RB1, RB2, RB4, RB0, RB1, RB0, RB2, RB3, RB4); + ROUND (17, 1, RA1, RA0, RA2, RA3, RA4, RA2, RA1, RA3, RA0, RA4, + RB1, RB0, RB2, RB3, RB4, RB2, RB1, RB3, RB0, RB4); + ROUND (18, 2, RA2, RA1, RA3, RA0, RA4, 
RA3, RA0, RA1, RA4, RA2, + RB2, RB1, RB3, RB0, RB4, RB3, RB0, RB1, RB4, RB2); + ROUND (19, 3, RA3, RA0, RA1, RA4, RA2, RA0, RA1, RA4, RA2, RA3, + RB3, RB0, RB1, RB4, RB2, RB0, RB1, RB4, RB2, RB3); + ROUND (20, 4, RA0, RA1, RA4, RA2, RA3, RA1, RA3, RA0, RA2, RA4, + RB0, RB1, RB4, RB2, RB3, RB1, RB3, RB0, RB2, RB4); + ROUND (21, 5, RA1, RA3, RA0, RA2, RA4, RA3, RA2, RA1, RA0, RA4, + RB1, RB3, RB0, RB2, RB4, RB3, RB2, RB1, RB0, RB4); + ROUND (22, 6, RA3, RA2, RA1, RA0, RA4, RA3, RA2, RA4, RA1, RA0, + RB3, RB2, RB1, RB0, RB4, RB3, RB2, RB4, RB1, RB0); + ROUND (23, 7, RA3, RA2, RA4, RA1, RA0, RA0, RA1, RA2, RA3, RA4, + RB3, RB2, RB4, RB1, RB0, RB0, RB1, RB2, RB3, RB4); + ROUND (24, 0, RA0, RA1, RA2, RA3, RA4, RA1, RA4, RA2, RA0, RA3, + RB0, RB1, RB2, RB3, RB4, RB1, RB4, RB2, RB0, RB3); + ROUND (25, 1, RA1, RA4, RA2, RA0, RA3, RA2, RA1, RA0, RA4, RA3, + RB1, RB4, RB2, RB0, RB3, RB2, RB1, RB0, RB4, RB3); + ROUND (26, 2, RA2, RA1, RA0, RA4, RA3, RA0, RA4, RA1, RA3, RA2, + RB2, RB1, RB0, RB4, RB3, RB0, RB4, RB1, RB3, RB2); + ROUND (27, 3, RA0, RA4, RA1, RA3, RA2, RA4, RA1, RA3, RA2, RA0, + RB0, RB4, RB1, RB3, RB2, RB4, RB1, RB3, RB2, RB0); + ROUND (28, 4, RA4, RA1, RA3, RA2, RA0, RA1, RA0, RA4, RA2, RA3, + RB4, RB1, RB3, RB2, RB0, RB1, RB0, RB4, RB2, RB3); + ROUND (29, 5, RA1, RA0, RA4, RA2, RA3, RA0, RA2, RA1, RA4, RA3, + RB1, RB0, RB4, RB2, RB3, RB0, RB2, RB1, RB4, RB3); + ROUND (30, 6, RA0, RA2, RA1, RA4, RA3, RA0, RA2, RA3, RA1, RA4, + RB0, RB2, RB1, RB4, RB3, RB0, RB2, RB3, RB1, RB4); + ROUND_LAST (31, 7, RA0, RA2, RA3, RA1, RA4, RA4, RA1, RA2, RA0, RA3, + RB0, RB2, RB3, RB1, RB4, RB4, RB1, RB2, RB0, RB3); + + transpose_4x4(RA4, RA1, RA2, RA0, RA3, RTMP0, RTMP1); + transpose_4x4(RB4, RB1, RB2, RB0, RB3, RTMP0, RTMP1); ret; .size __serpent_enc_blk16,.-__serpent_enc_blk16; @@ -538,69 +503,81 @@ __serpent_dec_blk16: * plaintext blocks */ - /* record input vector names for __serpent_dec_blk16 */ - .set dec_in_a0, RA0 - .set dec_in_a1, RA1 - .set dec_in_a2, RA2 - .set 
dec_in_a3, RA3 - .set dec_in_b0, RB0 - .set dec_in_b1, RB1 - .set dec_in_b2, RB2 - .set dec_in_b3, RB3 - vpcmpeqd RNOT, RNOT, RNOT; transpose_4x4(RA0, RA1, RA2, RA3, RA4, RTMP0, RTMP1); transpose_4x4(RB0, RB1, RB2, RB3, RB4, RTMP0, RTMP1); - .set round, 32 - ROUND_FIRST_INVERSE (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - - ROUND_INVERSE (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (7, RA0, RA1, 
RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); + ROUND_FIRST_INVERSE (31, 7, RA0, RA1, RA2, RA3, RA4, + RA3, RA0, RA1, RA4, RA2, + RB0, RB1, RB2, RB3, RB4, + RB3, RB0, RB1, RB4, RB2); + ROUND_INVERSE (30, 6, RA3, RA0, RA1, RA4, RA2, RA0, RA1, RA2, RA4, RA3, + RB3, RB0, RB1, RB4, RB2, RB0, RB1, RB2, RB4, RB3); + ROUND_INVERSE (29, 5, RA0, RA1, RA2, RA4, RA3, RA1, RA3, RA4, RA2, RA0, + RB0, RB1, RB2, RB4, RB3, RB1, RB3, RB4, RB2, RB0); + ROUND_INVERSE (28, 4, RA1, RA3, RA4, RA2, RA0, RA1, RA2, RA4, RA0, RA3, + RB1, RB3, RB4, RB2, RB0, RB1, RB2, RB4, RB0, RB3); + ROUND_INVERSE (27, 3, RA1, RA2, RA4, RA0, RA3, RA4, RA2, RA0, RA1, RA3, + RB1, RB2, RB4, RB0, RB3, RB4, RB2, RB0, RB1, RB3); + ROUND_INVERSE (26, 2, RA4, RA2, RA0, RA1, RA3, RA2, RA3, RA0, RA1, RA4, + RB4, RB2, RB0, RB1, RB3, RB2, RB3, RB0, RB1, RB4); + ROUND_INVERSE (25, 1, RA2, RA3, RA0, RA1, RA4, RA4, RA2, RA1, RA0, RA3, + RB2, RB3, RB0, RB1, RB4, RB4, RB2, RB1, RB0, RB3); + ROUND_INVERSE (24, 0, RA4, RA2, RA1, RA0, RA3, RA4, RA3, RA2, RA0, RA1, + RB4, RB2, RB1, RB0, RB3, RB4, RB3, RB2, RB0, RB1); + ROUND_INVERSE (23, 7, RA4, RA3, RA2, RA0, RA1, RA0, RA4, RA3, RA1, RA2, + RB4, RB3, RB2, RB0, RB1, RB0, RB4, RB3, RB1, RB2); + ROUND_INVERSE (22, 6, RA0, RA4, RA3, RA1, RA2, RA4, RA3, RA2, RA1, RA0, + RB0, RB4, RB3, RB1, RB2, RB4, RB3, RB2, RB1, RB0); + ROUND_INVERSE (21, 5, RA4, RA3, RA2, RA1, RA0, RA3, RA0, RA1, RA2, RA4, + RB4, RB3, RB2, RB1, RB0, RB3, RB0, RB1, RB2, RB4); + ROUND_INVERSE (20, 4, RA3, RA0, RA1, RA2, RA4, RA3, 
RA2, RA1, RA4, RA0, + RB3, RB0, RB1, RB2, RB4, RB3, RB2, RB1, RB4, RB0); + ROUND_INVERSE (19, 3, RA3, RA2, RA1, RA4, RA0, RA1, RA2, RA4, RA3, RA0, + RB3, RB2, RB1, RB4, RB0, RB1, RB2, RB4, RB3, RB0); + ROUND_INVERSE (18, 2, RA1, RA2, RA4, RA3, RA0, RA2, RA0, RA4, RA3, RA1, + RB1, RB2, RB4, RB3, RB0, RB2, RB0, RB4, RB3, RB1); + ROUND_INVERSE (17, 1, RA2, RA0, RA4, RA3, RA1, RA1, RA2, RA3, RA4, RA0, + RB2, RB0, RB4, RB3, RB1, RB1, RB2, RB3, RB4, RB0); + ROUND_INVERSE (16, 0, RA1, RA2, RA3, RA4, RA0, RA1, RA0, RA2, RA4, RA3, + RB1, RB2, RB3, RB4, RB0, RB1, RB0, RB2, RB4, RB3); + ROUND_INVERSE (15, 7, RA1, RA0, RA2, RA4, RA3, RA4, RA1, RA0, RA3, RA2, + RB1, RB0, RB2, RB4, RB3, RB4, RB1, RB0, RB3, RB2); + ROUND_INVERSE (14, 6, RA4, RA1, RA0, RA3, RA2, RA1, RA0, RA2, RA3, RA4, + RB4, RB1, RB0, RB3, RB2, RB1, RB0, RB2, RB3, RB4); + ROUND_INVERSE (13, 5, RA1, RA0, RA2, RA3, RA4, RA0, RA4, RA3, RA2, RA1, + RB1, RB0, RB2, RB3, RB4, RB0, RB4, RB3, RB2, RB1); + ROUND_INVERSE (12, 4, RA0, RA4, RA3, RA2, RA1, RA0, RA2, RA3, RA1, RA4, + RB0, RB4, RB3, RB2, RB1, RB0, RB2, RB3, RB1, RB4); + ROUND_INVERSE (11, 3, RA0, RA2, RA3, RA1, RA4, RA3, RA2, RA1, RA0, RA4, + RB0, RB2, RB3, RB1, RB4, RB3, RB2, RB1, RB0, RB4); + ROUND_INVERSE (10, 2, RA3, RA2, RA1, RA0, RA4, RA2, RA4, RA1, RA0, RA3, + RB3, RB2, RB1, RB0, RB4, RB2, RB4, RB1, RB0, RB3); + ROUND_INVERSE (9, 1, RA2, RA4, RA1, RA0, RA3, RA3, RA2, RA0, RA1, RA4, + RB2, RB4, RB1, RB0, RB3, RB3, RB2, RB0, RB1, RB4); + ROUND_INVERSE (8, 0, RA3, RA2, RA0, RA1, RA4, RA3, RA4, RA2, RA1, RA0, + RB3, RB2, RB0, RB1, RB4, RB3, RB4, RB2, RB1, RB0); + ROUND_INVERSE (7, 7, RA3, RA4, RA2, RA1, RA0, RA1, RA3, RA4, RA0, RA2, + RB3, RB4, RB2, RB1, RB0, RB1, RB3, RB4, RB0, RB2); + ROUND_INVERSE (6, 6, RA1, RA3, RA4, RA0, RA2, RA3, RA4, RA2, RA0, RA1, + RB1, RB3, RB4, RB0, RB2, RB3, RB4, RB2, RB0, RB1); + ROUND_INVERSE (5, 5, RA3, RA4, RA2, RA0, RA1, RA4, RA1, RA0, RA2, RA3, + RB3, RB4, RB2, RB0, RB1, RB4, RB1, RB0, RB2, RB3); + ROUND_INVERSE (4, 4, 
RA4, RA1, RA0, RA2, RA3, RA4, RA2, RA0, RA3, RA1, + RB4, RB1, RB0, RB2, RB3, RB4, RB2, RB0, RB3, RB1); + ROUND_INVERSE (3, 3, RA4, RA2, RA0, RA3, RA1, RA0, RA2, RA3, RA4, RA1, + RB4, RB2, RB0, RB3, RB1, RB0, RB2, RB3, RB4, RB1); + ROUND_INVERSE (2, 2, RA0, RA2, RA3, RA4, RA1, RA2, RA1, RA3, RA4, RA0, + RB0, RB2, RB3, RB4, RB1, RB2, RB1, RB3, RB4, RB0); + ROUND_INVERSE (1, 1, RA2, RA1, RA3, RA4, RA0, RA0, RA2, RA4, RA3, RA1, + RB2, RB1, RB3, RB4, RB0, RB0, RB2, RB4, RB3, RB1); + ROUND_INVERSE (0, 0, RA0, RA2, RA4, RA3, RA1, RA0, RA1, RA2, RA3, RA4, + RB0, RB2, RB4, RB3, RB1, RB0, RB1, RB2, RB3, RB4); transpose_4x4(RA0, RA1, RA2, RA3, RA4, RTMP0, RTMP1); transpose_4x4(RB0, RB1, RB2, RB3, RB4, RTMP0, RTMP1); - /* record output vector names for __serpent_dec_blk16 */ - .set dec_out_a0, RA0 - .set dec_out_a1, RA1 - .set dec_out_a2, RA2 - .set dec_out_a3, RA3 - .set dec_out_b0, RB0 - .set dec_out_b1, RB1 - .set dec_out_b2, RB2 - .set dec_out_b3, RB3 - ret; .size __serpent_dec_blk16,.-__serpent_dec_blk16; @@ -623,15 +600,6 @@ _gcry_serpent_avx2_ctr_enc: vzeroupper; - .set RA0, enc_in_a0 - .set RA1, enc_in_a1 - .set RA2, enc_in_a2 - .set RA3, enc_in_a3 - .set RB0, enc_in_b0 - .set RB1, enc_in_b1 - .set RB2, enc_in_b2 - .set RB3, enc_in_b3 - vbroadcasti128 .Lbswap128_mask RIP, RTMP3; vpcmpeqd RNOT, RNOT, RNOT; vpsrldq $8, RNOT, RNOT; /* ab: -1:0 ; cd: -1:0 */ @@ -703,32 +671,23 @@ _gcry_serpent_avx2_ctr_enc: call __serpent_enc_blk16; - .set RA0, enc_out_a0 - .set RA1, enc_out_a1 - .set RA2, enc_out_a2 - .set RA3, enc_out_a3 - .set RB0, enc_out_b0 - .set RB1, enc_out_b1 - .set RB2, enc_out_b2 - .set RB3, enc_out_b3 - - vpxor (0 * 32)(%rdx), RA0, RA0; + vpxor (0 * 32)(%rdx), RA4, RA4; vpxor (1 * 32)(%rdx), RA1, RA1; vpxor (2 * 32)(%rdx), RA2, RA2; - vpxor (3 * 32)(%rdx), RA3, RA3; - vpxor (4 * 32)(%rdx), RB0, RB0; + vpxor (3 * 32)(%rdx), RA0, RA0; + vpxor (4 * 32)(%rdx), RB4, RB4; vpxor (5 * 32)(%rdx), RB1, RB1; vpxor (6 * 32)(%rdx), RB2, RB2; - vpxor (7 * 32)(%rdx), RB3, 
RB3; + vpxor (7 * 32)(%rdx), RB0, RB0; - vmovdqu RA0, (0 * 32)(%rsi); + vmovdqu RA4, (0 * 32)(%rsi); vmovdqu RA1, (1 * 32)(%rsi); vmovdqu RA2, (2 * 32)(%rsi); - vmovdqu RA3, (3 * 32)(%rsi); - vmovdqu RB0, (4 * 32)(%rsi); + vmovdqu RA0, (3 * 32)(%rsi); + vmovdqu RB4, (4 * 32)(%rsi); vmovdqu RB1, (5 * 32)(%rsi); vmovdqu RB2, (6 * 32)(%rsi); - vmovdqu RB3, (7 * 32)(%rsi); + vmovdqu RB0, (7 * 32)(%rsi); vzeroall; @@ -748,15 +707,6 @@ _gcry_serpent_avx2_cbc_dec: vzeroupper; - .set RA0, dec_in_a0 - .set RA1, dec_in_a1 - .set RA2, dec_in_a2 - .set RA3, dec_in_a3 - .set RB0, dec_in_b0 - .set RB1, dec_in_b1 - .set RB2, dec_in_b2 - .set RB3, dec_in_b3 - vmovdqu (0 * 32)(%rdx), RA0; vmovdqu (1 * 32)(%rdx), RA1; vmovdqu (2 * 32)(%rdx), RA2; @@ -768,15 +718,6 @@ _gcry_serpent_avx2_cbc_dec: call __serpent_dec_blk16; - .set RA0, dec_out_a0 - .set RA1, dec_out_a1 - .set RA2, dec_out_a2 - .set RA3, dec_out_a3 - .set RB0, dec_out_b0 - .set RB1, dec_out_b1 - .set RB2, dec_out_b2 - .set RB3, dec_out_b3 - vmovdqu (%rcx), RNOTx; vinserti128 $1, (%rdx), RNOT, RNOT; vpxor RNOT, RA0, RA0; @@ -817,15 +758,6 @@ _gcry_serpent_avx2_cfb_dec: vzeroupper; - .set RA0, enc_in_a0 - .set RA1, enc_in_a1 - .set RA2, enc_in_a2 - .set RA3, enc_in_a3 - .set RB0, enc_in_b0 - .set RB1, enc_in_b1 - .set RB2, enc_in_b2 - .set RB3, enc_in_b3 - /* Load input */ vmovdqu (%rcx), RNOTx; vinserti128 $1, (%rdx), RNOT, RA0; @@ -843,32 +775,23 @@ _gcry_serpent_avx2_cfb_dec: call __serpent_enc_blk16; - .set RA0, enc_out_a0 - .set RA1, enc_out_a1 - .set RA2, enc_out_a2 - .set RA3, enc_out_a3 - .set RB0, enc_out_b0 - .set RB1, enc_out_b1 - .set RB2, enc_out_b2 - .set RB3, enc_out_b3 - - vpxor (0 * 32)(%rdx), RA0, RA0; + vpxor (0 * 32)(%rdx), RA4, RA4; vpxor (1 * 32)(%rdx), RA1, RA1; vpxor (2 * 32)(%rdx), RA2, RA2; - vpxor (3 * 32)(%rdx), RA3, RA3; - vpxor (4 * 32)(%rdx), RB0, RB0; + vpxor (3 * 32)(%rdx), RA0, RA0; + vpxor (4 * 32)(%rdx), RB4, RB4; vpxor (5 * 32)(%rdx), RB1, RB1; vpxor (6 * 32)(%rdx), RB2, RB2; - vpxor (7 
* 32)(%rdx), RB3, RB3; + vpxor (7 * 32)(%rdx), RB0, RB0; - vmovdqu RA0, (0 * 32)(%rsi); + vmovdqu RA4, (0 * 32)(%rsi); vmovdqu RA1, (1 * 32)(%rsi); vmovdqu RA2, (2 * 32)(%rsi); - vmovdqu RA3, (3 * 32)(%rsi); - vmovdqu RB0, (4 * 32)(%rsi); + vmovdqu RA0, (3 * 32)(%rsi); + vmovdqu RB4, (4 * 32)(%rsi); vmovdqu RB1, (5 * 32)(%rsi); vmovdqu RB2, (6 * 32)(%rsi); - vmovdqu RB3, (7 * 32)(%rsi); + vmovdqu RB0, (7 * 32)(%rsi); vzeroall; diff --git a/cipher/serpent-sse2-amd64.S b/cipher/serpent-sse2-amd64.S index a5cf353..516126b 100644 --- a/cipher/serpent-sse2-amd64.S +++ b/cipher/serpent-sse2-amd64.S @@ -35,42 +35,27 @@ #define CTX %rdi /* vector registers */ -.set RA0, %xmm0 -.set RA1, %xmm1 -.set RA2, %xmm2 -.set RA3, %xmm3 -.set RA4, %xmm4 - -.set RB0, %xmm5 -.set RB1, %xmm6 -.set RB2, %xmm7 -.set RB3, %xmm8 -.set RB4, %xmm9 - -.set RNOT, %xmm10 -.set RTMP0, %xmm11 -.set RTMP1, %xmm12 -.set RTMP2, %xmm13 +#define RA0 %xmm0 +#define RA1 %xmm1 +#define RA2 %xmm2 +#define RA3 %xmm3 +#define RA4 %xmm4 + +#define RB0 %xmm5 +#define RB1 %xmm6 +#define RB2 %xmm7 +#define RB3 %xmm8 +#define RB4 %xmm9 + +#define RNOT %xmm10 +#define RTMP0 %xmm11 +#define RTMP1 %xmm12 +#define RTMP2 %xmm13 /********************************************************************** helper macros **********************************************************************/ -/* preprocessor macro for renaming vector registers using GAS macros */ -#define sbox_reg_rename(r0, r1, r2, r3, r4, \ - new_r0, new_r1, new_r2, new_r3, new_r4) \ - .set rename_reg0, new_r0; \ - .set rename_reg1, new_r1; \ - .set rename_reg2, new_r2; \ - .set rename_reg3, new_r3; \ - .set rename_reg4, new_r4; \ - \ - .set r0, rename_reg0; \ - .set r1, rename_reg1; \ - .set r2, rename_reg2; \ - .set r3, rename_reg3; \ - .set r4, rename_reg4; - /* vector 32-bit rotation to left */ #define vec_rol(reg, nleft, tmp) \ movdqa reg, tmp; \ @@ -147,9 +132,7 @@ pxor r4, r2; pxor RNOT, r4; \ por r1, r4; pxor r3, r1; \ pxor r4, r1; por r0, r3; \ - 
pxor r3, r1; pxor r3, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r4,r2,r0,r3); + pxor r3, r1; pxor r3, r4; #define SBOX0_INVERSE(r0, r1, r2, r3, r4) \ pxor RNOT, r2; movdqa r1, r4; \ @@ -162,9 +145,7 @@ pxor r1, r2; pxor r0, r3; \ pxor r1, r3; \ pand r3, r2; \ - pxor r2, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r0,r4,r1,r3,r2); + pxor r2, r4; #define SBOX1(r0, r1, r2, r3, r4) \ pxor RNOT, r0; pxor RNOT, r2; \ @@ -176,9 +157,7 @@ pand r4, r2; pxor r1, r0; \ pand r2, r1; \ pxor r0, r1; pand r2, r0; \ - pxor r4, r0; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r2,r0,r3,r1,r4); + pxor r4, r0; #define SBOX1_INVERSE(r0, r1, r2, r3, r4) \ movdqa r1, r4; pxor r3, r1; \ @@ -191,9 +170,7 @@ pxor r1, r4; por r0, r1; \ pxor r0, r1; \ por r4, r1; \ - pxor r1, r3; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r4,r0,r3,r2,r1); + pxor r1, r3; #define SBOX2(r0, r1, r2, r3, r4) \ movdqa r0, r4; pand r2, r0; \ @@ -203,9 +180,7 @@ movdqa r3, r1; por r4, r3; \ pxor r0, r3; pand r1, r0; \ pxor r0, r4; pxor r3, r1; \ - pxor r4, r1; pxor RNOT, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r2,r3,r1,r4,r0); + pxor r4, r1; pxor RNOT, r4; #define SBOX2_INVERSE(r0, r1, r2, r3, r4) \ pxor r3, r2; pxor r0, r3; \ @@ -217,9 +192,7 @@ por r0, r2; pxor RNOT, r3; \ pxor r3, r2; pxor r3, r0; \ pand r1, r0; pxor r4, r3; \ - pxor r0, r3; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r4,r2,r3,r0); + pxor r0, r3; #define SBOX3(r0, r1, r2, r3, r4) \ movdqa r0, r4; por r3, r0; \ @@ -231,9 +204,7 @@ pxor r2, r4; por r0, r1; \ pxor r2, r1; pxor r3, r0; \ movdqa r1, r2; por r3, r1; \ - pxor r0, r1; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r2,r3,r4,r0); + pxor r0, r1; #define SBOX3_INVERSE(r0, r1, r2, r3, r4) \ movdqa r2, r4; pxor r1, r2; \ @@ -245,9 +216,7 @@ pxor r1, r3; pxor r0, r1; \ por r2, r1; pxor r3, r0; \ pxor r4, r1; \ - pxor r1, r0; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r2,r1,r3,r0,r4); + pxor r1, r0; #define SBOX4(r0, r1, r2, r3, r4) \ pxor r3, r1; pxor RNOT, r3; \ @@ -259,9 +228,7 @@ pxor r0, r3; por 
r1, r4; \ pxor r0, r4; por r3, r0; \ pxor r2, r0; pand r3, r2; \ - pxor RNOT, r0; pxor r2, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r4,r0,r3,r2); + pxor RNOT, r0; pxor r2, r4; #define SBOX4_INVERSE(r0, r1, r2, r3, r4) \ movdqa r2, r4; pand r3, r2; \ @@ -274,9 +241,7 @@ pand r0, r2; pxor r0, r3; \ pxor r4, r2; \ por r3, r2; pxor r0, r3; \ - pxor r1, r2; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r0,r3,r2,r4,r1); + pxor r1, r2; #define SBOX5(r0, r1, r2, r3, r4) \ pxor r1, r0; pxor r3, r1; \ @@ -288,9 +253,7 @@ pxor r2, r4; pxor r0, r2; \ pand r3, r0; pxor RNOT, r2; \ pxor r4, r0; por r3, r4; \ - pxor r4, r2; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r3,r0,r2,r4); + pxor r4, r2; #define SBOX5_INVERSE(r0, r1, r2, r3, r4) \ pxor RNOT, r1; movdqa r3, r4; \ @@ -302,9 +265,7 @@ pxor r3, r1; pxor r2, r4; \ pand r4, r3; pxor r1, r4; \ pxor r4, r3; pxor RNOT, r4; \ - pxor r0, r3; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r4,r3,r2,r0); + pxor r0, r3; #define SBOX6(r0, r1, r2, r3, r4) \ pxor RNOT, r2; movdqa r3, r4; \ @@ -316,9 +277,7 @@ pxor r2, r0; pxor r3, r4; \ pxor r0, r4; pxor RNOT, r3; \ pand r4, r2; \ - pxor r3, r2; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r0,r1,r4,r2,r3); + pxor r3, r2; #define SBOX6_INVERSE(r0, r1, r2, r3, r4) \ pxor r2, r0; movdqa r2, r4; \ @@ -329,9 +288,7 @@ pxor r1, r4; pand r3, r1; \ pxor r0, r1; pxor r3, r0; \ por r2, r0; pxor r1, r3; \ - pxor r0, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r2,r4,r3,r0); + pxor r0, r4; #define SBOX7(r0, r1, r2, r3, r4) \ movdqa r1, r4; por r2, r1; \ @@ -344,9 +301,7 @@ pxor r1, r2; pand r0, r1; \ pxor r4, r1; pxor RNOT, r2; \ por r0, r2; \ - pxor r2, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r4,r3,r1,r0,r2); + pxor r2, r4; #define SBOX7_INVERSE(r0, r1, r2, r3, r4) \ movdqa r2, r4; pxor r0, r2; \ @@ -358,9 +313,7 @@ por r2, r0; pxor r1, r4; \ pxor r3, r0; pxor r4, r3; \ por r0, r4; pxor r2, r3; \ - pxor r2, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r3,r0,r1,r4,r2); + pxor r2, r4; /* Apply SBOX number 
WHICH to the block. */ #define SBOX(which, r0, r1, r2, r3, r4) \ @@ -425,49 +378,51 @@ /* Apply a Serpent round to eight parallel blocks. This macro increments `round'. */ -#define ROUND(which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ - BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ - SBOX (which, a0, a1, a2, a3, a4); \ - BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ - SBOX (which, b0, b1, b2, b3, b4); \ - LINEAR_TRANSFORMATION (a0, a1, a2, a3, a4); \ - LINEAR_TRANSFORMATION (b0, b1, b2, b3, b4); \ - .set round, (round + 1); +#define ROUND(round, which, a0, a1, a2, a3, a4, na0, na1, na2, na3, na4, \ + b0, b1, b2, b3, b4, nb0, nb1, nb2, nb3, nb4) \ + BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ + SBOX (which, a0, a1, a2, a3, a4); \ + BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ + SBOX (which, b0, b1, b2, b3, b4); \ + LINEAR_TRANSFORMATION (na0, na1, na2, na3, na4); \ + LINEAR_TRANSFORMATION (nb0, nb1, nb2, nb3, nb4); /* Apply the last Serpent round to eight parallel blocks. This macro increments `round'. */ -#define ROUND_LAST(which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ - BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ - SBOX (which, a0, a1, a2, a3, a4); \ - BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ - SBOX (which, b0, b1, b2, b3, b4); \ - .set round, (round + 1); \ - BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ - BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ - .set round, (round + 1); +#define ROUND_LAST(round, which, a0, a1, a2, a3, a4, na0, na1, na2, na3, na4, \ + b0, b1, b2, b3, b4, nb0, nb1, nb2, nb3, nb4) \ + BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ + SBOX (which, a0, a1, a2, a3, a4); \ + BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ + SBOX (which, b0, b1, b2, b3, b4); \ + BLOCK_XOR_KEY (na0, na1, na2, na3, na4, ((round) + 1)); \ + BLOCK_XOR_KEY (nb0, nb1, nb2, nb3, nb4, ((round) + 1)); /* Apply an inverse Serpent round to eight parallel blocks. This macro increments `round'.
*/ -#define ROUND_INVERSE(which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ +#define ROUND_INVERSE(round, which, a0, a1, a2, a3, a4, \ + na0, na1, na2, na3, na4, \ + b0, b1, b2, b3, b4, \ + nb0, nb1, nb2, nb3, nb4) \ LINEAR_TRANSFORMATION_INVERSE (a0, a1, a2, a3, a4); \ LINEAR_TRANSFORMATION_INVERSE (b0, b1, b2, b3, b4); \ SBOX_INVERSE (which, a0, a1, a2, a3, a4); \ - BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ + BLOCK_XOR_KEY (na0, na1, na2, na3, na4, round); \ SBOX_INVERSE (which, b0, b1, b2, b3, b4); \ - BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ - .set round, (round - 1); + BLOCK_XOR_KEY (nb0, nb1, nb2, nb3, nb4, round); /* Apply the first inverse Serpent round to eight parallel blocks. This macro increments `round'. */ -#define ROUND_FIRST_INVERSE(which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ - BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ - BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ - .set round, (round - 1); \ +#define ROUND_FIRST_INVERSE(round, which, a0, a1, a2, a3, a4, \ + na0, na1, na2, na3, na4, \ + b0, b1, b2, b3, b4, \ + nb0, nb1, nb2, nb3, nb4) \ + BLOCK_XOR_KEY (a0, a1, a2, a3, a4, ((round) + 1)); \ + BLOCK_XOR_KEY (b0, b1, b2, b3, b4, ((round) + 1)); \ SBOX_INVERSE (which, a0, a1, a2, a3, a4); \ - BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ + BLOCK_XOR_KEY (na0, na1, na2, na3, na4, round); \ SBOX_INVERSE (which, b0, b1, b2, b3, b4); \ - BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ - .set round, (round - 1); + BLOCK_XOR_KEY (nb0, nb1, nb2, nb3, nb4, round); .text @@ -479,72 +434,82 @@ __serpent_enc_blk8: * RA0, RA1, RA2, RA3, RB0, RB1, RB2, RB3: eight parallel plaintext * blocks * output: - * RA0, RA1, RA2, RA3, RB0, RB1, RB2, RB3: eight parallel + * RA4, RA1, RA2, RA0, RB4, RB1, RB2, RB0: eight parallel * ciphertext blocks */ - /* record input vector names for __serpent_enc_blk8 */ - .set enc_in_a0, RA0 - .set enc_in_a1, RA1 - .set enc_in_a2, RA2 - .set enc_in_a3, RA3 - .set enc_in_b0, RB0 - .set enc_in_b1, RB1 - .set enc_in_b2, RB2 - .set 
enc_in_b3, RB3 - pcmpeqd RNOT, RNOT; transpose_4x4(RA0, RA1, RA2, RA3, RA4, RTMP0, RTMP1); transpose_4x4(RB0, RB1, RB2, RB3, RB4, RTMP0, RTMP1); - .set round, 0 - ROUND (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (5, 
RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - - ROUND_LAST (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - - transpose_4x4(RA0, RA1, RA2, RA3, RA4, RTMP0, RTMP1); - transpose_4x4(RB0, RB1, RB2, RB3, RB4, RTMP0, RTMP1); - - /* record output vector names for __serpent_enc_blk8 */ - .set enc_out_a0, RA0 - .set enc_out_a1, RA1 - .set enc_out_a2, RA2 - .set enc_out_a3, RA3 - .set enc_out_b0, RB0 - .set enc_out_b1, RB1 - .set enc_out_b2, RB2 - .set enc_out_b3, RB3 + ROUND (0, 0, RA0, RA1, RA2, RA3, RA4, RA1, RA4, RA2, RA0, RA3, + RB0, RB1, RB2, RB3, RB4, RB1, RB4, RB2, RB0, RB3); + ROUND (1, 1, RA1, RA4, RA2, RA0, RA3, RA2, RA1, RA0, RA4, RA3, + RB1, RB4, RB2, RB0, RB3, RB2, RB1, RB0, RB4, RB3); + ROUND (2, 2, RA2, RA1, RA0, RA4, RA3, RA0, RA4, RA1, RA3, RA2, + RB2, RB1, RB0, RB4, RB3, RB0, RB4, RB1, RB3, RB2); + ROUND (3, 3, RA0, RA4, RA1, RA3, RA2, RA4, RA1, RA3, RA2, RA0, + RB0, RB4, RB1, RB3, RB2, RB4, RB1, RB3, RB2, RB0); + ROUND (4, 4, RA4, RA1, RA3, RA2, RA0, RA1, RA0, RA4, RA2, RA3, + RB4, RB1, RB3, RB2, RB0, RB1, RB0, RB4, RB2, RB3); + ROUND (5, 5, RA1, RA0, RA4, RA2, RA3, RA0, RA2, RA1, RA4, RA3, + RB1, RB0, RB4, RB2, RB3, RB0, RB2, RB1, RB4, RB3); + ROUND (6, 6, RA0, RA2, RA1, RA4, RA3, RA0, RA2, RA3, RA1, RA4, + RB0, RB2, RB1, RB4, RB3, RB0, RB2, RB3, RB1, RB4); + ROUND (7, 7, RA0, RA2, RA3, RA1, RA4, RA4, RA1, RA2, RA0, RA3, + RB0, RB2, RB3, RB1, RB4, RB4, RB1, RB2, RB0, RB3); + ROUND (8, 0, RA4, RA1, RA2, RA0, RA3, RA1, RA3, RA2, RA4, RA0, + RB4, RB1, RB2, RB0, RB3, RB1, RB3, RB2, RB4, RB0); + ROUND (9, 1, RA1, RA3, RA2, RA4, RA0, RA2, RA1, RA4, RA3, RA0, + RB1, RB3, RB2, RB4, RB0, RB2, RB1, RB4, RB3, RB0); + ROUND (10, 2, RA2, RA1, RA4, RA3, RA0, RA4, RA3, RA1, RA0, RA2, + RB2, RB1, RB4, RB3, RB0, RB4, RB3, RB1, RB0, RB2); + ROUND (11, 3, RA4, RA3, RA1, RA0, RA2, RA3, RA1, RA0, RA2, RA4, + RB4, RB3, RB1, RB0, RB2, RB3, RB1, RB0, RB2, RB4); + ROUND (12, 4, RA3, RA1, RA0, RA2, 
RA4, RA1, RA4, RA3, RA2, RA0, + RB3, RB1, RB0, RB2, RB4, RB1, RB4, RB3, RB2, RB0); + ROUND (13, 5, RA1, RA4, RA3, RA2, RA0, RA4, RA2, RA1, RA3, RA0, + RB1, RB4, RB3, RB2, RB0, RB4, RB2, RB1, RB3, RB0); + ROUND (14, 6, RA4, RA2, RA1, RA3, RA0, RA4, RA2, RA0, RA1, RA3, + RB4, RB2, RB1, RB3, RB0, RB4, RB2, RB0, RB1, RB3); + ROUND (15, 7, RA4, RA2, RA0, RA1, RA3, RA3, RA1, RA2, RA4, RA0, + RB4, RB2, RB0, RB1, RB3, RB3, RB1, RB2, RB4, RB0); + ROUND (16, 0, RA3, RA1, RA2, RA4, RA0, RA1, RA0, RA2, RA3, RA4, + RB3, RB1, RB2, RB4, RB0, RB1, RB0, RB2, RB3, RB4); + ROUND (17, 1, RA1, RA0, RA2, RA3, RA4, RA2, RA1, RA3, RA0, RA4, + RB1, RB0, RB2, RB3, RB4, RB2, RB1, RB3, RB0, RB4); + ROUND (18, 2, RA2, RA1, RA3, RA0, RA4, RA3, RA0, RA1, RA4, RA2, + RB2, RB1, RB3, RB0, RB4, RB3, RB0, RB1, RB4, RB2); + ROUND (19, 3, RA3, RA0, RA1, RA4, RA2, RA0, RA1, RA4, RA2, RA3, + RB3, RB0, RB1, RB4, RB2, RB0, RB1, RB4, RB2, RB3); + ROUND (20, 4, RA0, RA1, RA4, RA2, RA3, RA1, RA3, RA0, RA2, RA4, + RB0, RB1, RB4, RB2, RB3, RB1, RB3, RB0, RB2, RB4); + ROUND (21, 5, RA1, RA3, RA0, RA2, RA4, RA3, RA2, RA1, RA0, RA4, + RB1, RB3, RB0, RB2, RB4, RB3, RB2, RB1, RB0, RB4); + ROUND (22, 6, RA3, RA2, RA1, RA0, RA4, RA3, RA2, RA4, RA1, RA0, + RB3, RB2, RB1, RB0, RB4, RB3, RB2, RB4, RB1, RB0); + ROUND (23, 7, RA3, RA2, RA4, RA1, RA0, RA0, RA1, RA2, RA3, RA4, + RB3, RB2, RB4, RB1, RB0, RB0, RB1, RB2, RB3, RB4); + ROUND (24, 0, RA0, RA1, RA2, RA3, RA4, RA1, RA4, RA2, RA0, RA3, + RB0, RB1, RB2, RB3, RB4, RB1, RB4, RB2, RB0, RB3); + ROUND (25, 1, RA1, RA4, RA2, RA0, RA3, RA2, RA1, RA0, RA4, RA3, + RB1, RB4, RB2, RB0, RB3, RB2, RB1, RB0, RB4, RB3); + ROUND (26, 2, RA2, RA1, RA0, RA4, RA3, RA0, RA4, RA1, RA3, RA2, + RB2, RB1, RB0, RB4, RB3, RB0, RB4, RB1, RB3, RB2); + ROUND (27, 3, RA0, RA4, RA1, RA3, RA2, RA4, RA1, RA3, RA2, RA0, + RB0, RB4, RB1, RB3, RB2, RB4, RB1, RB3, RB2, RB0); + ROUND (28, 4, RA4, RA1, RA3, RA2, RA0, RA1, RA0, RA4, RA2, RA3, + RB4, RB1, RB3, RB2, RB0, RB1, RB0, RB4, RB2, RB3); + ROUND (29, 
5, RA1, RA0, RA4, RA2, RA3, RA0, RA2, RA1, RA4, RA3, + RB1, RB0, RB4, RB2, RB3, RB0, RB2, RB1, RB4, RB3); + ROUND (30, 6, RA0, RA2, RA1, RA4, RA3, RA0, RA2, RA3, RA1, RA4, + RB0, RB2, RB1, RB4, RB3, RB0, RB2, RB3, RB1, RB4); + ROUND_LAST (31, 7, RA0, RA2, RA3, RA1, RA4, RA4, RA1, RA2, RA0, RA3, + RB0, RB2, RB3, RB1, RB4, RB4, RB1, RB2, RB0, RB3); + + transpose_4x4(RA4, RA1, RA2, RA0, RA3, RTMP0, RTMP1); + transpose_4x4(RB4, RB1, RB2, RB0, RB3, RTMP0, RTMP1); ret; .size __serpent_enc_blk8,.-__serpent_enc_blk8; @@ -561,69 +526,81 @@ __serpent_dec_blk8: * blocks */ - /* record input vector names for __serpent_dec_blk8 */ - .set dec_in_a0, RA0 - .set dec_in_a1, RA1 - .set dec_in_a2, RA2 - .set dec_in_a3, RA3 - .set dec_in_b0, RB0 - .set dec_in_b1, RB1 - .set dec_in_b2, RB2 - .set dec_in_b3, RB3 - pcmpeqd RNOT, RNOT; transpose_4x4(RA0, RA1, RA2, RA3, RA4, RTMP0, RTMP1); transpose_4x4(RB0, RB1, RB2, RB3, RB4, RTMP0, RTMP1); - .set round, 32 - ROUND_FIRST_INVERSE (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - - ROUND_INVERSE (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (1, RA0, RA1, RA2, RA3, RA4, 
RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); + ROUND_FIRST_INVERSE (31, 7, RA0, RA1, RA2, RA3, RA4, + RA3, RA0, RA1, RA4, RA2, + RB0, RB1, RB2, RB3, RB4, + RB3, RB0, RB1, RB4, RB2); + ROUND_INVERSE (30, 6, RA3, RA0, RA1, RA4, RA2, RA0, RA1, RA2, RA4, RA3, + RB3, RB0, RB1, RB4, RB2, RB0, RB1, RB2, RB4, RB3); + ROUND_INVERSE (29, 5, RA0, RA1, RA2, RA4, RA3, RA1, RA3, RA4, RA2, RA0, + RB0, RB1, RB2, RB4, RB3, RB1, RB3, RB4, RB2, RB0); + ROUND_INVERSE (28, 4, RA1, RA3, RA4, RA2, RA0, RA1, RA2, RA4, RA0, RA3, + RB1, RB3, RB4, RB2, RB0, RB1, RB2, RB4, RB0, RB3); + ROUND_INVERSE (27, 3, RA1, RA2, RA4, RA0, RA3, RA4, RA2, RA0, RA1, RA3, + RB1, RB2, RB4, RB0, RB3, RB4, RB2, RB0, RB1, RB3); + ROUND_INVERSE (26, 2, RA4, RA2, RA0, RA1, RA3, RA2, RA3, RA0, RA1, RA4, + RB4, RB2, RB0, RB1, RB3, RB2, RB3, RB0, RB1, 
RB4); + ROUND_INVERSE (25, 1, RA2, RA3, RA0, RA1, RA4, RA4, RA2, RA1, RA0, RA3, + RB2, RB3, RB0, RB1, RB4, RB4, RB2, RB1, RB0, RB3); + ROUND_INVERSE (24, 0, RA4, RA2, RA1, RA0, RA3, RA4, RA3, RA2, RA0, RA1, + RB4, RB2, RB1, RB0, RB3, RB4, RB3, RB2, RB0, RB1); + ROUND_INVERSE (23, 7, RA4, RA3, RA2, RA0, RA1, RA0, RA4, RA3, RA1, RA2, + RB4, RB3, RB2, RB0, RB1, RB0, RB4, RB3, RB1, RB2); + ROUND_INVERSE (22, 6, RA0, RA4, RA3, RA1, RA2, RA4, RA3, RA2, RA1, RA0, + RB0, RB4, RB3, RB1, RB2, RB4, RB3, RB2, RB1, RB0); + ROUND_INVERSE (21, 5, RA4, RA3, RA2, RA1, RA0, RA3, RA0, RA1, RA2, RA4, + RB4, RB3, RB2, RB1, RB0, RB3, RB0, RB1, RB2, RB4); + ROUND_INVERSE (20, 4, RA3, RA0, RA1, RA2, RA4, RA3, RA2, RA1, RA4, RA0, + RB3, RB0, RB1, RB2, RB4, RB3, RB2, RB1, RB4, RB0); + ROUND_INVERSE (19, 3, RA3, RA2, RA1, RA4, RA0, RA1, RA2, RA4, RA3, RA0, + RB3, RB2, RB1, RB4, RB0, RB1, RB2, RB4, RB3, RB0); + ROUND_INVERSE (18, 2, RA1, RA2, RA4, RA3, RA0, RA2, RA0, RA4, RA3, RA1, + RB1, RB2, RB4, RB3, RB0, RB2, RB0, RB4, RB3, RB1); + ROUND_INVERSE (17, 1, RA2, RA0, RA4, RA3, RA1, RA1, RA2, RA3, RA4, RA0, + RB2, RB0, RB4, RB3, RB1, RB1, RB2, RB3, RB4, RB0); + ROUND_INVERSE (16, 0, RA1, RA2, RA3, RA4, RA0, RA1, RA0, RA2, RA4, RA3, + RB1, RB2, RB3, RB4, RB0, RB1, RB0, RB2, RB4, RB3); + ROUND_INVERSE (15, 7, RA1, RA0, RA2, RA4, RA3, RA4, RA1, RA0, RA3, RA2, + RB1, RB0, RB2, RB4, RB3, RB4, RB1, RB0, RB3, RB2); + ROUND_INVERSE (14, 6, RA4, RA1, RA0, RA3, RA2, RA1, RA0, RA2, RA3, RA4, + RB4, RB1, RB0, RB3, RB2, RB1, RB0, RB2, RB3, RB4); + ROUND_INVERSE (13, 5, RA1, RA0, RA2, RA3, RA4, RA0, RA4, RA3, RA2, RA1, + RB1, RB0, RB2, RB3, RB4, RB0, RB4, RB3, RB2, RB1); + ROUND_INVERSE (12, 4, RA0, RA4, RA3, RA2, RA1, RA0, RA2, RA3, RA1, RA4, + RB0, RB4, RB3, RB2, RB1, RB0, RB2, RB3, RB1, RB4); + ROUND_INVERSE (11, 3, RA0, RA2, RA3, RA1, RA4, RA3, RA2, RA1, RA0, RA4, + RB0, RB2, RB3, RB1, RB4, RB3, RB2, RB1, RB0, RB4); + ROUND_INVERSE (10, 2, RA3, RA2, RA1, RA0, RA4, RA2, RA4, RA1, RA0, RA3, + RB3, RB2, 
RB1, RB0, RB4, RB2, RB4, RB1, RB0, RB3); + ROUND_INVERSE (9, 1, RA2, RA4, RA1, RA0, RA3, RA3, RA2, RA0, RA1, RA4, + RB2, RB4, RB1, RB0, RB3, RB3, RB2, RB0, RB1, RB4); + ROUND_INVERSE (8, 0, RA3, RA2, RA0, RA1, RA4, RA3, RA4, RA2, RA1, RA0, + RB3, RB2, RB0, RB1, RB4, RB3, RB4, RB2, RB1, RB0); + ROUND_INVERSE (7, 7, RA3, RA4, RA2, RA1, RA0, RA1, RA3, RA4, RA0, RA2, + RB3, RB4, RB2, RB1, RB0, RB1, RB3, RB4, RB0, RB2); + ROUND_INVERSE (6, 6, RA1, RA3, RA4, RA0, RA2, RA3, RA4, RA2, RA0, RA1, + RB1, RB3, RB4, RB0, RB2, RB3, RB4, RB2, RB0, RB1); + ROUND_INVERSE (5, 5, RA3, RA4, RA2, RA0, RA1, RA4, RA1, RA0, RA2, RA3, + RB3, RB4, RB2, RB0, RB1, RB4, RB1, RB0, RB2, RB3); + ROUND_INVERSE (4, 4, RA4, RA1, RA0, RA2, RA3, RA4, RA2, RA0, RA3, RA1, + RB4, RB1, RB0, RB2, RB3, RB4, RB2, RB0, RB3, RB1); + ROUND_INVERSE (3, 3, RA4, RA2, RA0, RA3, RA1, RA0, RA2, RA3, RA4, RA1, + RB4, RB2, RB0, RB3, RB1, RB0, RB2, RB3, RB4, RB1); + ROUND_INVERSE (2, 2, RA0, RA2, RA3, RA4, RA1, RA2, RA1, RA3, RA4, RA0, + RB0, RB2, RB3, RB4, RB1, RB2, RB1, RB3, RB4, RB0); + ROUND_INVERSE (1, 1, RA2, RA1, RA3, RA4, RA0, RA0, RA2, RA4, RA3, RA1, + RB2, RB1, RB3, RB4, RB0, RB0, RB2, RB4, RB3, RB1); + ROUND_INVERSE (0, 0, RA0, RA2, RA4, RA3, RA1, RA0, RA1, RA2, RA3, RA4, + RB0, RB2, RB4, RB3, RB1, RB0, RB1, RB2, RB3, RB4); transpose_4x4(RA0, RA1, RA2, RA3, RA4, RTMP0, RTMP1); transpose_4x4(RB0, RB1, RB2, RB3, RB4, RTMP0, RTMP1); - /* record output vector names for __serpent_dec_blk8 */ - .set dec_out_a0, RA0 - .set dec_out_a1, RA1 - .set dec_out_a2, RA2 - .set dec_out_a3, RA3 - .set dec_out_b0, RB0 - .set dec_out_b1, RB1 - .set dec_out_b2, RB2 - .set dec_out_b3, RB3 - ret; .size __serpent_dec_blk8,.-__serpent_dec_blk8; @@ -638,15 +615,6 @@ _gcry_serpent_sse2_ctr_enc: * %rcx: iv (big endian, 128bit) */ - .set RA0, enc_in_a0 - .set RA1, enc_in_a1 - .set RA2, enc_in_a2 - .set RA3, enc_in_a3 - .set RB0, enc_in_b0 - .set RB1, enc_in_b1 - .set RB2, enc_in_b2 - .set RB3, enc_in_b3 - /* load IV and byteswap */ 
movdqu (%rcx), RA0; movdqa RA0, RTMP0; @@ -729,42 +697,35 @@ _gcry_serpent_sse2_ctr_enc: call __serpent_enc_blk8; - .set RA0, enc_out_a0 - .set RA1, enc_out_a1 - .set RA2, enc_out_a2 - .set RA3, enc_out_a3 - .set RB0, enc_out_b0 - .set RB1, enc_out_b1 - .set RB2, enc_out_b2 - .set RB3, enc_out_b3 - - pxor_u((0 * 16)(%rdx), RA0, RTMP0); + pxor_u((0 * 16)(%rdx), RA4, RTMP0); pxor_u((1 * 16)(%rdx), RA1, RTMP0); pxor_u((2 * 16)(%rdx), RA2, RTMP0); - pxor_u((3 * 16)(%rdx), RA3, RTMP0); - pxor_u((4 * 16)(%rdx), RB0, RTMP0); + pxor_u((3 * 16)(%rdx), RA0, RTMP0); + pxor_u((4 * 16)(%rdx), RB4, RTMP0); pxor_u((5 * 16)(%rdx), RB1, RTMP0); pxor_u((6 * 16)(%rdx), RB2, RTMP0); - pxor_u((7 * 16)(%rdx), RB3, RTMP0); + pxor_u((7 * 16)(%rdx), RB0, RTMP0); - movdqu RA0, (0 * 16)(%rsi); + movdqu RA4, (0 * 16)(%rsi); movdqu RA1, (1 * 16)(%rsi); movdqu RA2, (2 * 16)(%rsi); - movdqu RA3, (3 * 16)(%rsi); - movdqu RB0, (4 * 16)(%rsi); + movdqu RA0, (3 * 16)(%rsi); + movdqu RB4, (4 * 16)(%rsi); movdqu RB1, (5 * 16)(%rsi); movdqu RB2, (6 * 16)(%rsi); - movdqu RB3, (7 * 16)(%rsi); + movdqu RB0, (7 * 16)(%rsi); /* clear the used registers */ pxor RA0, RA0; pxor RA1, RA1; pxor RA2, RA2; pxor RA3, RA3; + pxor RA4, RA4; pxor RB0, RB0; pxor RB1, RB1; pxor RB2, RB2; pxor RB3, RB3; + pxor RB4, RB4; pxor RTMP0, RTMP0; pxor RTMP1, RTMP1; pxor RTMP2, RTMP2; @@ -784,15 +745,6 @@ _gcry_serpent_sse2_cbc_dec: * %rcx: iv */ - .set RA0, dec_in_a0 - .set RA1, dec_in_a1 - .set RA2, dec_in_a2 - .set RA3, dec_in_a3 - .set RB0, dec_in_b0 - .set RB1, dec_in_b1 - .set RB2, dec_in_b2 - .set RB3, dec_in_b3 - movdqu (0 * 16)(%rdx), RA0; movdqu (1 * 16)(%rdx), RA1; movdqu (2 * 16)(%rdx), RA2; @@ -804,15 +756,6 @@ _gcry_serpent_sse2_cbc_dec: call __serpent_dec_blk8; - .set RA0, dec_out_a0 - .set RA1, dec_out_a1 - .set RA2, dec_out_a2 - .set RA3, dec_out_a3 - .set RB0, dec_out_b0 - .set RB1, dec_out_b1 - .set RB2, dec_out_b2 - .set RB3, dec_out_b3 - movdqu (7 * 16)(%rdx), RNOT; pxor_u((%rcx), RA0, RTMP0); pxor_u((0 * 
16)(%rdx), RA1, RTMP0); @@ -838,10 +781,12 @@ _gcry_serpent_sse2_cbc_dec: pxor RA1, RA1; pxor RA2, RA2; pxor RA3, RA3; + pxor RA4, RA4; pxor RB0, RB0; pxor RB1, RB1; pxor RB2, RB2; pxor RB3, RB3; + pxor RB4, RB4; pxor RTMP0, RTMP0; pxor RTMP1, RTMP1; pxor RTMP2, RTMP2; @@ -861,15 +806,6 @@ _gcry_serpent_sse2_cfb_dec: * %rcx: iv */ - .set RA0, enc_in_a0 - .set RA1, enc_in_a1 - .set RA2, enc_in_a2 - .set RA3, enc_in_a3 - .set RB0, enc_in_b0 - .set RB1, enc_in_b1 - .set RB2, enc_in_b2 - .set RB3, enc_in_b3 - /* Load input */ movdqu (%rcx), RA0; movdqu 0 * 16(%rdx), RA1; @@ -886,42 +822,35 @@ _gcry_serpent_sse2_cfb_dec: call __serpent_enc_blk8; - .set RA0, enc_out_a0 - .set RA1, enc_out_a1 - .set RA2, enc_out_a2 - .set RA3, enc_out_a3 - .set RB0, enc_out_b0 - .set RB1, enc_out_b1 - .set RB2, enc_out_b2 - .set RB3, enc_out_b3 - - pxor_u((0 * 16)(%rdx), RA0, RTMP0); + pxor_u((0 * 16)(%rdx), RA4, RTMP0); pxor_u((1 * 16)(%rdx), RA1, RTMP0); pxor_u((2 * 16)(%rdx), RA2, RTMP0); - pxor_u((3 * 16)(%rdx), RA3, RTMP0); - pxor_u((4 * 16)(%rdx), RB0, RTMP0); + pxor_u((3 * 16)(%rdx), RA0, RTMP0); + pxor_u((4 * 16)(%rdx), RB4, RTMP0); pxor_u((5 * 16)(%rdx), RB1, RTMP0); pxor_u((6 * 16)(%rdx), RB2, RTMP0); - pxor_u((7 * 16)(%rdx), RB3, RTMP0); + pxor_u((7 * 16)(%rdx), RB0, RTMP0); - movdqu RA0, (0 * 16)(%rsi); + movdqu RA4, (0 * 16)(%rsi); movdqu RA1, (1 * 16)(%rsi); movdqu RA2, (2 * 16)(%rsi); - movdqu RA3, (3 * 16)(%rsi); - movdqu RB0, (4 * 16)(%rsi); + movdqu RA0, (3 * 16)(%rsi); + movdqu RB4, (4 * 16)(%rsi); movdqu RB1, (5 * 16)(%rsi); movdqu RB2, (6 * 16)(%rsi); - movdqu RB3, (7 * 16)(%rsi); + movdqu RB0, (7 * 16)(%rsi); /* clear the used registers */ pxor RA0, RA0; pxor RA1, RA1; pxor RA2, RA2; pxor RA3, RA3; + pxor RA4, RA4; pxor RB0, RB0; pxor RB1, RB1; pxor RB2, RB2; pxor RB3, RB3; + pxor RB4, RB4; pxor RTMP0, RTMP0; pxor RTMP1, RTMP1; pxor RTMP2, RTMP2; diff --git a/configure.ac b/configure.ac index 1460dfd..a803b5f 100644 --- a/configure.ac +++ b/configure.ac @@ -1034,12 
+1034,6 @@ if test $amd64_as_feature_detection = yes; then
       [gcry_cv_gcc_amd64_platform_as_ok=no
        AC_COMPILE_IFELSE([AC_LANG_SOURCE(
         [[__asm__(
-                /* Test if '.set' is supported by underlying assembler. */
-                ".set a0, %rax\n\t"
-                ".set b0, %rdx\n\t"
-                "asmfunc:\n\t"
-                "movq a0, b0;\n\t" /* Fails here if .set ignored by as. */
-
                 /* Test if '.type' and '.size' are supported. */
                 /* These work only on ELF targets. */
                 /* TODO: add COFF (mingw64, cygwin64) support to assembly

From dbaryshkov at gmail.com Sun Oct 20 21:49:20 2013
From: dbaryshkov at gmail.com (Dmitry Eremin-Solenikov)
Date: Sun, 20 Oct 2013 23:49:20 +0400
Subject: [PATCH 1/2] [v2] Add API to support AEAD cipher modes
In-Reply-To: <20131020120313.21970.15918.stgit@localhost6.localdomain6>
References: <20131020120313.21970.15918.stgit@localhost6.localdomain6>
Message-ID:

Hello,

On Sun, Oct 20, 2013 at 4:03 PM, Jussi Kivilinna wrote:
> - Change gcry_cipher_tag to gcry_cipher_checktag and gcry_cipher_gettag
> for giving tag (checktag) for decryption and reading tag (gettag) after
> encryption.
> - Change gcry_cipher_authenticate to gcry_cipher_setaad, since
> additional parameters needed for some AEAD modes (in this case CCM,
> which needs the length of encrypted data and tag for MAC
> initialization).

I'm quite unsure that we should make this API call _that_ specific.
I would propose a separate _authenticate()/_aad() method for passing
cleartext data, plus a set of ioctls (GCRY_CTL_*) that pass additional
information depending on the selected AEAD mode.

For example, for GCM you don't need to know aadlen/enclen/taglen in advance.
--
With best wishes
Dmitry

From gniibe at fsij.org Mon Oct 21 09:46:13 2013
From: gniibe at fsij.org (NIIBE Yutaka)
Date: Mon, 21 Oct 2013 16:46:13 +0900
Subject: ECDSA for Edwards curve (was: [PATCH v2 2/2] Add support for GOST R 34.10-2001/-2012 signatures)
In-Reply-To: <87iowwiefw.fsf@vigenere.g10code.de>
References: <1381867004-21231-1-git-send-email-dbaryshkov@gmail.com> <1381867004-21231-3-git-send-email-dbaryshkov@gmail.com> <87ppr5jnbm.fsf@vigenere.g10code.de> <87iowwiefw.fsf@vigenere.g10code.de>
Message-ID: <1382341573.3497.1.camel@cfw2.gniibe.org>

On 2013-10-17 at 08:44 +0200, Werner Koch wrote:
> On Wed, 16 Oct 2013 18:13, dbaryshkov at gmail.com said:
> > And strangely enough it aborts in 50% of runs. Sometimes it does, sometimes
> > it just outputs a note regarding testkey and exits normally.
> > I failed to capture a problem either via gdb or via valgrind.
>
> It is an algorithmic problem.

I think I have figured out the cause of the failure.

In the function nist_generate_key, when we change the private key "d"
into -d, the code assumes a Weierstrass curve, where the negative of a
point (x, y) is (x, -y). However, for a Twisted Edwards curve, the
negative of a point (u, v) is (-u, v).

Perhaps the compact form would be v only for a Twisted Edwards curve.
Or, we could change the code so that we have interfaces for
getting/setting the affine point in the representation of the
corresponding Weierstrass curve (x, y) for a Twisted Edwards curve,
with the public key specified in the Weierstrass curve representation.
--

From jussi.kivilinna at iki.fi Mon Oct 21 12:30:10 2013
From: jussi.kivilinna at iki.fi (Jussi Kivilinna)
Date: Mon, 21 Oct 2013 13:30:10 +0300
Subject: [PATCH 1/2] [v2] Add API to support AEAD cipher modes
In-Reply-To:
References: <20131020120313.21970.15918.stgit@localhost6.localdomain6>
Message-ID: <52650232.7080009@iki.fi>

On 20.10.2013 22:49, Dmitry Eremin-Solenikov wrote:
> Hello,
>
> On Sun, Oct 20, 2013 at 4:03 PM, Jussi Kivilinna wrote:
>> - Change gcry_cipher_tag to gcry_cipher_checktag and gcry_cipher_gettag
>> for giving tag (checktag) for decryption and reading tag (gettag) after
>> encryption.
>> - Change gcry_cipher_authenticate to gcry_cipher_setaad, since
>> additional parameters needed for some AEAD modes (in this case CCM,
>> which needs the length of encrypted data and tag for MAC
>> initialization).
>
> I'm quite unsure that we should make this API call _that_ specific.
> I would propose to separate _authenticate()/_aad() method for passing
> cleartext data and a set of ioctl's (GCRY_CTL_*) that pass additional
> information depending on selected AEAD mode.

Ok, I changed the API back to _authenticate() and added
GCRYCTL_SET_CCM_PARAMS to the CCM patch. Looks better now, and another
benefit is that AAD can now be passed in multiple calls to
_authenticate.

-Jussi

> For example, for GCM you don't need to know aadlen/enclen/taglen in advance.

From jussi.kivilinna at iki.fi Mon Oct 21 12:34:06 2013
From: jussi.kivilinna at iki.fi (Jussi Kivilinna)
Date: Mon, 21 Oct 2013 13:34:06 +0300
Subject: [PATCH 1/2] [v3] Add API to support AEAD cipher modes
Message-ID: <20131021103406.22420.58657.stgit@localhost6.localdomain6>

* cipher/cipher.c (_gcry_cipher_authenticate, _gcry_cipher_checktag)
(_gcry_cipher_gettag): New.
* doc/gcrypt.texi: Add documentation for new API functions.
* src/visibility.c (gcry_cipher_authenticate, gcry_cipher_checktag)
(gcry_cipher_gettag): New.
* src/gcrypt.h.in, src/visibility.h: add declarations of these
functions.
* src/libgcrypt.def, src/libgcrypt.vers: export functions.
--
Authenticated Encryption with Associated Data (AEAD) cipher modes
provide an authentication tag that can be used to authenticate the
message. At the same time they allow one to specify additional
(unencrypted) data that will be authenticated together with the
message. This class of cipher modes requires the additional API added
in this commit.

This patch is based on an original patch by Dmitry Eremin-Solenikov.

Changes in v2:
- Change gcry_cipher_tag to gcry_cipher_checktag and gcry_cipher_gettag
for giving tag (checktag) for decryption and reading tag (gettag) after
encryption.
- Change gcry_cipher_authenticate to gcry_cipher_setaad, since
additional parameters needed for some AEAD modes (in this case CCM,
which needs the length of encrypted data and tag for MAC
initialization).
- Add some documentation.

Changes in v3:
- Change gcry_cipher_setaad back to gcry_cipher_authenticate. Additional
parameters (encrypt_len, tag_len, aad_len) for CCM will be given through
GCRY_CTL_SET_CCM_PARAMS.
Signed-off-by: Jussi Kivilinna --- cipher/cipher.c | 34 ++++++++++++++++++++++++++++++++++ doc/gcrypt.texi | 35 +++++++++++++++++++++++++++++++++++ src/gcrypt.h.in | 11 +++++++++++ src/libgcrypt.def | 3 +++ src/libgcrypt.vers | 1 + src/visibility.c | 27 +++++++++++++++++++++++++++ src/visibility.h | 9 +++++++++ 7 files changed, 120 insertions(+) diff --git a/cipher/cipher.c b/cipher/cipher.c index 75d42d1..36c79db 100644 --- a/cipher/cipher.c +++ b/cipher/cipher.c @@ -910,6 +910,40 @@ _gcry_cipher_setctr (gcry_cipher_hd_t hd, const void *ctr, size_t ctrlen) return 0; } +gcry_error_t +_gcry_cipher_authenticate (gcry_cipher_hd_t hd, const void *abuf, + size_t abuflen) +{ + log_fatal ("gcry_cipher_authenticate: invalid mode %d\n", hd->mode); + + (void)abuf; + (void)abuflen; + + return gpg_error (GPG_ERR_INV_CIPHER_MODE); +} + +gcry_error_t +_gcry_cipher_gettag (gcry_cipher_hd_t hd, void *outtag, size_t taglen) +{ + log_fatal ("gcry_cipher_gettag: invalid mode %d\n", hd->mode); + + (void)outtag; + (void)taglen; + + return gpg_error (GPG_ERR_INV_CIPHER_MODE); +} + +gcry_error_t +_gcry_cipher_checktag (gcry_cipher_hd_t hd, const void *intag, size_t taglen) +{ + log_fatal ("gcry_cipher_checktag: invalid mode %d\n", hd->mode); + + (void)intag; + (void)taglen; + + return gpg_error (GPG_ERR_INV_CIPHER_MODE); +} + gcry_error_t gcry_cipher_ctl( gcry_cipher_hd_t h, int cmd, void *buffer, size_t buflen) diff --git a/doc/gcrypt.texi b/doc/gcrypt.texi index 473c484..0049fa0 100644 --- a/doc/gcrypt.texi +++ b/doc/gcrypt.texi @@ -1731,6 +1731,10 @@ matches the requirement of the selected algorithm and mode. This function is also used with the Salsa20 stream cipher to set or update the required nonce. In this case it needs to be called after setting the key. + +This function is also used with the AEAD cipher modes to set or +update the required nonce. 
+
 @end deftypefun

 @deftypefun gcry_error_t gcry_cipher_setctr (gcry_cipher_hd_t @var{h}, const void *@var{c}, size_t @var{l})
@@ -1750,6 +1754,37 @@ call to gcry_cipher_setkey and clear the initialization vector.
 Note that gcry_cipher_reset is implemented as a macro.
 @end deftypefun

+Authenticated Encryption with Associated Data (AEAD) block cipher
+modes require the handling of the authentication tag and the additional
+authenticated data, which can be done by using the following
+functions:
+
+@deftypefun gcry_error_t gcry_cipher_authenticate (gcry_cipher_hd_t @var{h}, const void *@var{abuf}, size_t @var{abuflen})
+
+Process the buffer @var{abuf} of length @var{abuflen} as the additional
+authenticated data (AAD) for AEAD cipher modes.
+
+@end deftypefun
+
+@deftypefun gcry_error_t gcry_cipher_gettag (gcry_cipher_hd_t @var{h}, void *@var{tag}, size_t @var{taglen})
+
+This function is used to read the authentication tag after encryption.
+The function finalizes and outputs the authentication tag to the buffer
+@var{tag} of length @var{taglen} bytes.
+
+@end deftypefun
+
+@deftypefun gcry_error_t gcry_cipher_checktag (gcry_cipher_hd_t @var{h}, const void *@var{tag}, size_t @var{taglen})
+
+Check the authentication tag after decryption. The authentication
+tag is passed as the buffer @var{tag} of length @var{taglen} bytes
+and compared to the internal authentication tag computed during
+decryption. Error code @code{GPG_ERR_CHECKSUM} is returned if
+the authentication tag in the buffer @var{tag} does not match
+the authentication tag calculated during decryption.
+
+@end deftypefun
+
 The actual encryption and decryption is done by using one of the
 following functions. They may be used as often as required to process
 all the data.
diff --git a/src/gcrypt.h.in b/src/gcrypt.h.in index 64cc0e4..f0ae927 100644 --- a/src/gcrypt.h.in +++ b/src/gcrypt.h.in @@ -953,6 +953,17 @@ gcry_error_t gcry_cipher_setkey (gcry_cipher_hd_t hd, gcry_error_t gcry_cipher_setiv (gcry_cipher_hd_t hd, const void *iv, size_t ivlen); +/* Provide additional authentication data for AEAD modes/ciphers. */ +gcry_error_t gcry_cipher_authenticate (gcry_cipher_hd_t hd, const void *abuf, + size_t abuflen); + +/* Get authentication tag for AEAD modes/ciphers. */ +gcry_error_t gcry_cipher_gettag (gcry_cipher_hd_t hd, void *outtag, + size_t taglen); + +/* Check authentication tag for AEAD modes/ciphers. */ +gcry_error_t gcry_cipher_checktag (gcry_cipher_hd_t hd, const void *intag, + size_t taglen); /* Reset the handle to the state after open. */ #define gcry_cipher_reset(h) gcry_cipher_ctl ((h), GCRYCTL_RESET, NULL, 0) diff --git a/src/libgcrypt.def b/src/libgcrypt.def index ec0c1e3..64ba370 100644 --- a/src/libgcrypt.def +++ b/src/libgcrypt.def @@ -255,6 +255,9 @@ EXPORTS gcry_sexp_extract_param @225 + gcry_cipher_authenticate @226 + gcry_cipher_gettag @227 + gcry_cipher_checktag @228 ;; end of file with public symbols for Windows. 
diff --git a/src/libgcrypt.vers b/src/libgcrypt.vers index be72aad..93eaa93 100644 --- a/src/libgcrypt.vers +++ b/src/libgcrypt.vers @@ -51,6 +51,7 @@ GCRYPT_1.6 { gcry_cipher_info; gcry_cipher_map_name; gcry_cipher_mode_from_oid; gcry_cipher_open; gcry_cipher_setkey; gcry_cipher_setiv; gcry_cipher_setctr; + gcry_cipher_authenticate; gcry_cipher_gettag; gcry_cipher_checktag; gcry_pk_algo_info; gcry_pk_algo_name; gcry_pk_ctl; gcry_pk_decrypt; gcry_pk_encrypt; gcry_pk_genkey; diff --git a/src/visibility.c b/src/visibility.c index 848925e..1f7bb3a 100644 --- a/src/visibility.c +++ b/src/visibility.c @@ -713,6 +713,33 @@ gcry_cipher_setctr (gcry_cipher_hd_t hd, const void *ctr, size_t ctrlen) return _gcry_cipher_setctr (hd, ctr, ctrlen); } +gcry_error_t +gcry_cipher_authenticate (gcry_cipher_hd_t hd, const void *abuf, size_t abuflen) +{ + if (!fips_is_operational ()) + return gpg_error (fips_not_operational ()); + + return _gcry_cipher_authenticate (hd, abuf, abuflen); +} + +gcry_error_t +gcry_cipher_gettag (gcry_cipher_hd_t hd, void *outtag, size_t taglen) +{ + if (!fips_is_operational ()) + return gpg_error (fips_not_operational ()); + + return _gcry_cipher_gettag (hd, outtag, taglen); +} + +gcry_error_t +gcry_cipher_checktag (gcry_cipher_hd_t hd, const void *intag, size_t taglen) +{ + if (!fips_is_operational ()) + return gpg_error (fips_not_operational ()); + + return _gcry_cipher_checktag (hd, intag, taglen); +} + gcry_error_t gcry_cipher_ctl (gcry_cipher_hd_t h, int cmd, void *buffer, size_t buflen) diff --git a/src/visibility.h b/src/visibility.h index 1c8f047..b2fa4c0 100644 --- a/src/visibility.h +++ b/src/visibility.h @@ -81,6 +81,9 @@ #define gcry_cipher_setkey _gcry_cipher_setkey #define gcry_cipher_setiv _gcry_cipher_setiv #define gcry_cipher_setctr _gcry_cipher_setctr +#define gcry_cipher_authenticate _gcry_cipher_authenticate +#define gcry_cipher_checktag _gcry_cipher_checktag +#define gcry_cipher_gettag _gcry_cipher_gettag #define gcry_cipher_ctl 
_gcry_cipher_ctl #define gcry_cipher_decrypt _gcry_cipher_decrypt #define gcry_cipher_encrypt _gcry_cipher_encrypt @@ -297,6 +300,9 @@ gcry_err_code_t gcry_md_get (gcry_md_hd_t hd, int algo, #undef gcry_cipher_setkey #undef gcry_cipher_setiv #undef gcry_cipher_setctr +#undef gcry_cipher_authenticate +#undef gcry_cipher_checktag +#undef gcry_cipher_gettag #undef gcry_cipher_ctl #undef gcry_cipher_decrypt #undef gcry_cipher_encrypt @@ -474,6 +480,9 @@ MARK_VISIBLE (gcry_cipher_close) MARK_VISIBLE (gcry_cipher_setkey) MARK_VISIBLE (gcry_cipher_setiv) MARK_VISIBLE (gcry_cipher_setctr) +MARK_VISIBLE (gcry_cipher_authenticate) +MARK_VISIBLE (gcry_cipher_checktag) +MARK_VISIBLE (gcry_cipher_gettag) MARK_VISIBLE (gcry_cipher_ctl) MARK_VISIBLE (gcry_cipher_decrypt) MARK_VISIBLE (gcry_cipher_encrypt) From jussi.kivilinna at iki.fi Mon Oct 21 12:34:11 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Mon, 21 Oct 2013 13:34:11 +0300 Subject: [PATCH 2/2] [v3] Add Counter with CBC-MAC mode (CCM) In-Reply-To: <20131021103406.22420.58657.stgit@localhost6.localdomain6> References: <20131021103406.22420.58657.stgit@localhost6.localdomain6> Message-ID: <20131021103411.22420.21185.stgit@localhost6.localdomain6> * cipher/Makefile.am: Add 'cipher-ccm.c'. * cipher/cipher-ccm.c: New. * cipher/cipher-internal.h (gcry_cipher_handle): Add 'u_mode'. (_gcry_cipher_ccm_encrypt, _gcry_cipher_ccm_decrypt) (_gcry_cipher_ccm_set_nonce, _gcry_cipher_ccm_authenticate) (_gcry_cipher_ccm_get_tag, _gcry_cipher_ccm_check_tag) (_gcry_cipher_ccm_set_params): New prototypes. * cipher/cipher.c (gcry_cipher_open, cipher_encrypt, cipher_decrypt) (_gcry_cipher_setiv, _gcry_cipher_authenticate, _gcry_cipher_gettag) (_gcry_cipher_checktag, gry_cipher_ctl): Add handling for CCM mode. * doc/gcrypt.texi: Add documentation for GCRY_CIPHER_MODE_CCM. * src/gcrypt.h.in (gcry_cipher_modes): Add 'GCRY_CIPHER_MODE_CCM' and 'GCRYCTL_SET_CCM_PARAMS'. (GCRY_CCM_BLOCK_LEN): New. 
* tests/basic.c (check_ccm_cipher): New.
(check_cipher_modes): Call 'check_ccm_cipher'.
* tests/benchmark.c (ccm_aead_init): New.
(cipher_bench): Add handling for AEAD modes and add CCM benchmarking.
--
Patch adds CCM (Counter with CBC-MAC) mode as defined in RFC 3610 and
NIST Special Publication 800-38C.

Example for encrypting a message (split in two buffers; buf1, buf2) and
authenticating additional non-encrypted data (split in two buffers;
aadbuf1, aadbuf2) with an authentication tag length of eight bytes:

  size_t params[3];
  taglen = 8;

  gcry_cipher_setkey(h, key, len(key));
  gcry_cipher_setiv(h, nonce, len(nonce));

  params[0] = len(buf1) + len(buf2); /* 0: enclen */
  params[1] = len(aadbuf1) + len(aadbuf2); /* 1: aadlen */
  params[2] = taglen; /* 2: authtaglen */
  gcry_cipher_ctl(h, GCRYCTL_SET_CCM_PARAMS, params, sizeof(size_t) * 3);

  gcry_cipher_authenticate(h, aadbuf1, len(aadbuf1));
  gcry_cipher_authenticate(h, aadbuf2, len(aadbuf2));
  gcry_cipher_encrypt(h, buf1, len(buf1), buf1, len(buf1));
  gcry_cipher_encrypt(h, buf2, len(buf2), buf2, len(buf2));
  gcry_cipher_gettag(h, tag, taglen);

Example for decrypting the above message and checking the authentication
tag:

  size_t params[3];
  taglen = 8;

  gcry_cipher_setkey(h, key, len(key));
  gcry_cipher_setiv(h, nonce, len(nonce));

  params[0] = len(buf1) + len(buf2); /* 0: enclen */
  params[1] = len(aadbuf1) + len(aadbuf2); /* 1: aadlen */
  params[2] = taglen; /* 2: authtaglen */
  gcry_cipher_ctl(h, GCRYCTL_SET_CCM_PARAMS, params, sizeof(size_t) * 3);

  gcry_cipher_authenticate(h, aadbuf1, len(aadbuf1));
  gcry_cipher_authenticate(h, aadbuf2, len(aadbuf2));
  gcry_cipher_decrypt(h, buf1, len(buf1), buf1, len(buf1));
  gcry_cipher_decrypt(h, buf2, len(buf2), buf2, len(buf2));
  err = gcry_cipher_checktag(h, tag, taglen);
  if (gpg_err_code (err) == GPG_ERR_CHECKSUM)
    {
      /* Authentication failed. */
    }
  else if (err == 0)
    {
      /* Authentication ok.
      */
    }

Example for encrypting a message without additional authenticated data:

  size_t params[3];
  taglen = 10;

  gcry_cipher_setkey(h, key, len(key));
  gcry_cipher_setiv(h, nonce, len(nonce));

  params[0] = len(buf1); /* 0: enclen */
  params[1] = 0; /* 1: aadlen */
  params[2] = taglen; /* 2: authtaglen */
  gcry_cipher_ctl(h, GCRYCTL_SET_CCM_PARAMS, params, sizeof(size_t) * 3);

  gcry_cipher_encrypt(h, buf1, len(buf1), buf1, len(buf1));
  gcry_cipher_gettag(h, tag, taglen);

To reset the CCM state for a cipher handle, one can either set a new
nonce or use 'gcry_cipher_reset'.

This implementation reuses the existing CTR mode code for
encryption/decryption and is therefore able to process multiple buffers
that are not a multiple of the blocksize. AAD data may also be passed
into gcry_cipher_authenticate in non-blocksize chunks.

Signed-off-by: Jussi Kivilinna
---
 cipher/Makefile.am       |    1
 cipher/cipher-ccm.c      |  371 ++++++++++++++++++++++
 cipher/cipher-internal.h |   48 +++
 cipher/cipher.c          |  107 ++++++
 doc/gcrypt.texi          |   16 +
 src/gcrypt.h.in          |    8
 tests/basic.c            |  771 ++++++++++++++++++++++++++++++++++++++++++++++
 tests/benchmark.c        |   80 +++++
 8 files changed, 1377 insertions(+), 25 deletions(-)
 create mode 100644 cipher/cipher-ccm.c

diff --git a/cipher/Makefile.am b/cipher/Makefile.am
index a2b2c8a..b0efd89 100644
--- a/cipher/Makefile.am
+++ b/cipher/Makefile.am
@@ -40,6 +40,7 @@ libcipher_la_LIBADD = $(GCRYPT_MODULES)
 libcipher_la_SOURCES = \
 cipher.c cipher-internal.h \
 cipher-cbc.c cipher-cfb.c cipher-ofb.c cipher-ctr.c cipher-aeswrap.c \
+cipher-ccm.c \
 cipher-selftest.c cipher-selftest.h \
 pubkey.c pubkey-internal.h pubkey-util.c \
 md.c \
diff --git a/cipher/cipher-ccm.c b/cipher/cipher-ccm.c
new file mode 100644
index 0000000..e51fa32
--- /dev/null
+++ b/cipher/cipher-ccm.c
@@ -0,0 +1,371 @@
+/* cipher-ccm.c - CTR mode with CBC-MAC mode implementation
+ * Copyright © 2013 Jussi Kivilinna
+ *
+ * This file is part of Libgcrypt.
+ *
+ * Libgcrypt is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as
+ * published by the Free Software Foundation; either version 2.1 of
+ * the License, or (at your option) any later version.
+ *
+ * Libgcrypt is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <config.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <errno.h>
+
+#include "g10lib.h"
+#include "cipher.h"
+#include "ath.h"
+#include "bufhelp.h"
+#include "./cipher-internal.h"
+
+
+#define set_burn(burn, nburn) do { \
+  unsigned int __nburn = (nburn); \
+  (burn) = (burn) > __nburn ? (burn) : __nburn; } while (0)
+
+
+static unsigned int
+do_cbc_mac (gcry_cipher_hd_t c, const unsigned char *inbuf, size_t inlen,
+            int do_padding)
+{
+  const unsigned int blocksize = 16;
+  unsigned char tmp[blocksize];
+  unsigned int burn = 0;
+  unsigned int unused = c->u_mode.ccm.mac_unused;
+  size_t nblocks;
+
+  if (inlen == 0 && (unused == 0 || !do_padding))
+    return 0;
+
+  do
+    {
+      if (inlen + unused < blocksize || unused > 0)
+        {
+          for (; inlen && unused < blocksize; inlen--)
+            c->u_mode.ccm.macbuf[unused++] = *inbuf++;
+        }
+      if (!inlen)
+        {
+          if (!do_padding)
+            break;
+
+          while (unused < blocksize)
+            c->u_mode.ccm.macbuf[unused++] = 0;
+        }
+
+      if (unused > 0)
+        {
+          /* Process one block from macbuf.
*/ + buf_xor(c->u_iv.iv, c->u_iv.iv, c->u_mode.ccm.macbuf, blocksize); + set_burn (burn, c->spec->encrypt ( &c->context.c, c->u_iv.iv, + c->u_iv.iv )); + + unused = 0; + } + + if (c->bulk.cbc_enc) + { + nblocks = inlen / blocksize; + c->bulk.cbc_enc (&c->context.c, c->u_iv.iv, tmp, inbuf, nblocks, 1); + inbuf += nblocks * blocksize; + inlen -= nblocks * blocksize; + + wipememory (tmp, sizeof(tmp)); + } + else + { + while (inlen >= blocksize) + { + buf_xor(c->u_iv.iv, c->u_iv.iv, inbuf, blocksize); + + set_burn (burn, c->spec->encrypt ( &c->context.c, c->u_iv.iv, + c->u_iv.iv )); + + inlen -= blocksize; + inbuf += blocksize; + } + } + } + while (inlen > 0); + + c->u_mode.ccm.mac_unused = unused; + + if (burn) + burn += 4 * sizeof(void *); + + return burn; +} + + +gcry_err_code_t +_gcry_cipher_ccm_set_nonce (gcry_cipher_hd_t c, const unsigned char *nonce, + size_t noncelen) +{ + size_t L = 15 - noncelen; + size_t L_; + + L_ = L - 1; + + if (!nonce) + return GPG_ERR_INV_ARG; + /* Length field must be 2, 3, ..., or 8. */ + if (L < 2 || L > 8) + return GPG_ERR_INV_LENGTH; + + /* Reset state */ + memset (&c->u_mode, 0, sizeof(c->u_mode)); + memset (&c->marks, 0, sizeof(c->marks)); + memset (&c->u_iv, 0, sizeof(c->u_iv)); + memset (&c->u_ctr, 0, sizeof(c->u_ctr)); + memset (c->lastiv, 0, sizeof(c->lastiv)); + c->unused = 0; + + /* Setup CTR */ + c->u_ctr.ctr[0] = L_; + memcpy (&c->u_ctr.ctr[1], nonce, noncelen); + memset (&c->u_ctr.ctr[1 + noncelen], 0, L); + + /* Setup IV */ + c->u_iv.iv[0] = L_; + memcpy (&c->u_iv.iv[1], nonce, noncelen); + /* Add (8 * M_ + 64 * flags) to iv[0] and set iv[noncelen + 1 ... 15] later + in set_aad. 
*/ + memset (&c->u_iv.iv[1 + noncelen], 0, L); + + c->u_mode.ccm.nonce = 1; + + return GPG_ERR_NO_ERROR; +} + + +gcry_err_code_t +_gcry_cipher_ccm_set_params (gcry_cipher_hd_t c, size_t encryptlen, + size_t aadlen, size_t taglen) +{ + unsigned int burn = 0; + unsigned char b0[16]; + size_t noncelen = 15 - (c->u_iv.iv[0] + 1); + size_t M = taglen; + size_t M_; + int i; + + M_ = (M - 2) / 2; + + /* Authentication field must be 4, 6, 8, 10, 12, 14 or 16. */ + if ((M_ * 2 + 2) != M || M < 4 || M > 16) + return GPG_ERR_INV_LENGTH; + if (!c->u_mode.ccm.nonce || c->u_mode.ccm.tag) + return GPG_ERR_INV_STATE; + if (c->u_mode.ccm.params) + return GPG_ERR_INV_STATE; + + c->u_mode.ccm.authlen = taglen; + c->u_mode.ccm.encryptlen = encryptlen; + c->u_mode.ccm.aadlen = aadlen; + + /* Complete IV setup. */ + c->u_iv.iv[0] += (aadlen > 0) * 64 + M_ * 8; + for (i = 16 - 1; i >= 1 + noncelen; i--) + { + c->u_iv.iv[i] = encryptlen & 0xff; + encryptlen >>= 8; + } + + memcpy (b0, c->u_iv.iv, 16); + memset (c->u_iv.iv, 0, 16); + + set_burn (burn, do_cbc_mac (c, b0, 16, 0)); + + if (aadlen == 0) + { + /* Do nothing. */ + } + else if (aadlen > 0 && aadlen <= (unsigned int)0xfeff) + { + b0[0] = (aadlen >> 8) & 0xff; + b0[1] = aadlen & 0xff; + set_burn (burn, do_cbc_mac (c, b0, 2, 0)); + } + else if (aadlen > 0xfeff && aadlen <= (unsigned int)0xffffffff) + { + b0[0] = 0xff; + b0[1] = 0xfe; + buf_put_be32(&b0[2], aadlen); + set_burn (burn, do_cbc_mac (c, b0, 6, 0)); + } +#ifdef HAVE_U64_TYPEDEF + else if (aadlen > (unsigned int)0xffffffff) + { + b0[0] = 0xff; + b0[1] = 0xff; + buf_put_be64(&b0[2], aadlen); + set_burn (burn, do_cbc_mac (c, b0, 10, 0)); + } +#endif + + /* Generate S_0 and increase counter. 
*/ + set_burn (burn, c->spec->encrypt ( &c->context.c, c->u_mode.ccm.s0, + c->u_ctr.ctr )); + c->u_ctr.ctr[15]++; + + if (burn) + _gcry_burn_stack (burn + sizeof(void *) * 5); + + c->u_mode.ccm.params = 1; + + return GPG_ERR_NO_ERROR; +} + + +gcry_err_code_t +_gcry_cipher_ccm_authenticate (gcry_cipher_hd_t c, const unsigned char *abuf, + size_t abuflen) +{ + unsigned int burn; + + if (abuflen > 0 && !abuf) + return GPG_ERR_INV_ARG; + if (!c->u_mode.ccm.nonce || !c->u_mode.ccm.params || c->u_mode.ccm.tag) + return GPG_ERR_INV_STATE; + if (abuflen > c->u_mode.ccm.aadlen) + return GPG_ERR_INV_LENGTH; + + c->u_mode.ccm.aadlen -= abuflen; + burn = do_cbc_mac (c, abuf, abuflen, c->u_mode.ccm.aadlen == 0); + + if (burn) + _gcry_burn_stack (burn + sizeof(void *) * 5); + + return GPG_ERR_NO_ERROR; +} + + +gcry_err_code_t +_gcry_cipher_ccm_tag (gcry_cipher_hd_t c, unsigned char *outbuf, + size_t outbuflen, int check) +{ + unsigned int burn; + + if (!outbuf || outbuflen == 0) + return GPG_ERR_INV_ARG; + /* Tag length must be same as initial authlen. */ + if (c->u_mode.ccm.authlen != outbuflen) + return GPG_ERR_INV_LENGTH; + if (!c->u_mode.ccm.nonce || !c->u_mode.ccm.params || c->u_mode.ccm.aadlen > 0) + return GPG_ERR_INV_STATE; + /* Initial encrypt length must match with length of actual data processed. */ + if (c->u_mode.ccm.encryptlen > 0) + return GPG_ERR_UNFINISHED; + + if (!c->u_mode.ccm.tag) + { + burn = do_cbc_mac (c, NULL, 0, 1); /* Perform final padding. */ + + /* Add S_0 */ + buf_xor (c->u_iv.iv, c->u_iv.iv, c->u_mode.ccm.s0, 16); + + wipememory (c->u_ctr.ctr, 16); + wipememory (c->u_mode.ccm.s0, 16); + wipememory (c->u_mode.ccm.macbuf, 16); + + if (burn) + _gcry_burn_stack (burn + sizeof(void *) * 5); + } + + if (!check) + { + memcpy (outbuf, c->u_iv.iv, outbuflen); + return GPG_ERR_NO_ERROR; + } + else + { + int diff, i; + + /* Constant-time compare. */ + for (i = 0, diff = 0; i < outbuflen; i++) + diff -= !!(outbuf[i] - c->u_iv.iv[i]); + + return !diff ? 
GPG_ERR_NO_ERROR : GPG_ERR_CHECKSUM; + } +} + + +gcry_err_code_t +_gcry_cipher_ccm_get_tag (gcry_cipher_hd_t c, unsigned char *outtag, + size_t taglen) +{ + return _gcry_cipher_ccm_tag (c, outtag, taglen, 0); +} + + +gcry_err_code_t +_gcry_cipher_ccm_check_tag (gcry_cipher_hd_t c, const unsigned char *intag, + size_t taglen) +{ + return _gcry_cipher_ccm_tag (c, (unsigned char *)intag, taglen, 1); +} + + +gcry_err_code_t +_gcry_cipher_ccm_encrypt (gcry_cipher_hd_t c, unsigned char *outbuf, + unsigned int outbuflen, const unsigned char *inbuf, + unsigned int inbuflen) +{ + unsigned int burn; + + if (outbuflen < inbuflen) + return GPG_ERR_BUFFER_TOO_SHORT; + if (!c->u_mode.ccm.nonce || c->u_mode.ccm.tag || !c->u_mode.ccm.params || + c->u_mode.ccm.aadlen > 0) + return GPG_ERR_INV_STATE; + if (inbuflen > c->u_mode.ccm.encryptlen) + return GPG_ERR_INV_LENGTH; + + c->u_mode.ccm.encryptlen -= inbuflen; + burn = do_cbc_mac (c, inbuf, inbuflen, 0); + if (burn) + _gcry_burn_stack (burn + sizeof(void *) * 5); + + return _gcry_cipher_ctr_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); +} + + +gcry_err_code_t +_gcry_cipher_ccm_decrypt (gcry_cipher_hd_t c, unsigned char *outbuf, + unsigned int outbuflen, const unsigned char *inbuf, + unsigned int inbuflen) +{ + gcry_err_code_t err; + unsigned int burn; + + if (outbuflen < inbuflen) + return GPG_ERR_BUFFER_TOO_SHORT; + if (!c->u_mode.ccm.nonce || c->u_mode.ccm.tag || !c->u_mode.ccm.params || + c->u_mode.ccm.aadlen > 0) + return GPG_ERR_INV_STATE; + if (inbuflen > c->u_mode.ccm.encryptlen) + return GPG_ERR_INV_LENGTH; + + err = _gcry_cipher_ctr_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); + if (err) + return err; + + c->u_mode.ccm.encryptlen -= inbuflen; + burn = do_cbc_mac (c, outbuf, inbuflen, 0); + if (burn) + _gcry_burn_stack (burn + sizeof(void *) * 5); + + return err; +} + diff --git a/cipher/cipher-internal.h b/cipher/cipher-internal.h index b60ef38..fbaef4f 100644 --- a/cipher/cipher-internal.h +++ 
b/cipher/cipher-internal.h @@ -100,7 +100,8 @@ struct gcry_cipher_handle /* The initialization vector. For best performance we make sure that it is properly aligned. In particular some implementations - of bulk operations expect an 16 byte aligned IV. */ + of bulk operations expect an 16 byte aligned IV. IV is also used + to store CBC-MAC in CCM mode; counter IV is stored in U_CTR. */ union { cipher_context_alignment_t iv_align; unsigned char iv[MAX_BLOCKSIZE]; @@ -117,6 +118,26 @@ struct gcry_cipher_handle unsigned char lastiv[MAX_BLOCKSIZE]; int unused; /* Number of unused bytes in LASTIV. */ + union { + /* Mode specific storage for CCM mode. */ + struct { + size_t encryptlen; + size_t aadlen; + unsigned int authlen; + + /* Space to save partial input lengths for MAC. */ + unsigned char macbuf[GCRY_CCM_BLOCK_LEN]; + int mac_unused; /* Number of unprocessed bytes in MACBUF. */ + + unsigned char s0[GCRY_CCM_BLOCK_LEN]; + + unsigned int nonce:1;/* Set to 1 if nonce has been set. */ + unsigned int params:1; /* Set to 1 if CCM parameters has been + processed. */ + unsigned int tag:1; /* Set to 1 if tag has been finalized. */ + } ccm; + } u_mode; + /* What follows are two contexts of the cipher in use. 
The first one needs to be aligned well enough for the cipher operation whereas the second one is a copy created by cipher_setkey and @@ -175,5 +196,30 @@ gcry_err_code_t _gcry_cipher_aeswrap_decrypt const byte *inbuf, unsigned int inbuflen); +/*-- cipher-ccm.c --*/ +gcry_err_code_t _gcry_cipher_ccm_encrypt +/* */ (gcry_cipher_hd_t c, + unsigned char *outbuf, unsigned int outbuflen, + const unsigned char *inbuf, unsigned int inbuflen); +gcry_err_code_t _gcry_cipher_ccm_decrypt +/* */ (gcry_cipher_hd_t c, + unsigned char *outbuf, unsigned int outbuflen, + const unsigned char *inbuf, unsigned int inbuflen); +gcry_err_code_t _gcry_cipher_ccm_set_nonce +/* */ (gcry_cipher_hd_t c, const unsigned char *nonce, + size_t noncelen); +gcry_err_code_t _gcry_cipher_ccm_authenticate +/* */ (gcry_cipher_hd_t c, const unsigned char *abuf, size_t abuflen); +gcry_err_code_t _gcry_cipher_ccm_set_params +/* */ (gcry_cipher_hd_t c, size_t encryptedlen, size_t aadlen, + size_t taglen); +gcry_err_code_t _gcry_cipher_ccm_get_tag +/* */ (gcry_cipher_hd_t c, + unsigned char *outtag, size_t taglen); +gcry_err_code_t _gcry_cipher_ccm_check_tag +/* */ (gcry_cipher_hd_t c, + const unsigned char *intag, size_t taglen); + + #endif /*G10_CIPHER_INTERNAL_H*/ diff --git a/cipher/cipher.c b/cipher/cipher.c index 36c79db..db7f505 100644 --- a/cipher/cipher.c +++ b/cipher/cipher.c @@ -375,6 +375,13 @@ gcry_cipher_open (gcry_cipher_hd_t *handle, if (! 
err) switch (mode) { + case GCRY_CIPHER_MODE_CCM: + if (spec->blocksize != GCRY_CCM_BLOCK_LEN) + err = GPG_ERR_INV_CIPHER_MODE; + if (!spec->encrypt || !spec->decrypt) + err = GPG_ERR_INV_CIPHER_MODE; + break; + case GCRY_CIPHER_MODE_ECB: case GCRY_CIPHER_MODE_CBC: case GCRY_CIPHER_MODE_CFB: @@ -613,6 +620,8 @@ cipher_reset (gcry_cipher_hd_t c) memset (c->u_iv.iv, 0, c->spec->blocksize); memset (c->lastiv, 0, c->spec->blocksize); memset (c->u_ctr.ctr, 0, c->spec->blocksize); + memset (&c->u_mode, 0, sizeof c->u_mode); + c->unused = 0; } @@ -718,6 +727,10 @@ cipher_encrypt (gcry_cipher_hd_t c, byte *outbuf, unsigned int outbuflen, inbuf, inbuflen); break; + case GCRY_CIPHER_MODE_CCM: + rc = _gcry_cipher_ccm_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); + break; + case GCRY_CIPHER_MODE_STREAM: c->spec->stencrypt (&c->context.c, outbuf, (byte*)/*arggg*/inbuf, inbuflen); @@ -811,6 +824,10 @@ cipher_decrypt (gcry_cipher_hd_t c, byte *outbuf, unsigned int outbuflen, inbuf, inbuflen); break; + case GCRY_CIPHER_MODE_CCM: + rc = _gcry_cipher_ccm_decrypt (c, outbuf, outbuflen, inbuf, inbuflen); + break; + case GCRY_CIPHER_MODE_STREAM: c->spec->stdecrypt (&c->context.c, outbuf, (byte*)/*arggg*/inbuf, inbuflen); @@ -885,8 +902,19 @@ _gcry_cipher_setkey (gcry_cipher_hd_t hd, const void *key, size_t keylen) gcry_error_t _gcry_cipher_setiv (gcry_cipher_hd_t hd, const void *iv, size_t ivlen) { - cipher_setiv (hd, iv, ivlen); - return 0; + gcry_err_code_t rc = GPG_ERR_NO_ERROR; + + switch (hd->mode) + { + case GCRY_CIPHER_MODE_CCM: + rc = _gcry_cipher_ccm_set_nonce (hd, iv, ivlen); + break; + + default: + cipher_setiv (hd, iv, ivlen); + break; + } + return gpg_error (rc); } /* Set counter for CTR mode. 
(CTR,CTRLEN) must denote a buffer of @@ -914,34 +942,61 @@ gcry_error_t _gcry_cipher_authenticate (gcry_cipher_hd_t hd, const void *abuf, size_t abuflen) { - log_fatal ("gcry_cipher_authenticate: invalid mode %d\n", hd->mode); + gcry_err_code_t rc; + + switch (hd->mode) + { + case GCRY_CIPHER_MODE_CCM: + rc = _gcry_cipher_ccm_authenticate (hd, abuf, abuflen); + break; - (void)abuf; - (void)abuflen; + default: + log_fatal ("gcry_cipher_authenticate: invalid mode %d\n", hd->mode); + rc = GPG_ERR_INV_CIPHER_MODE; + break; + } - return gpg_error (GPG_ERR_INV_CIPHER_MODE); + return gpg_error (rc); } gcry_error_t _gcry_cipher_gettag (gcry_cipher_hd_t hd, void *outtag, size_t taglen) { - log_fatal ("gcry_cipher_gettag: invalid mode %d\n", hd->mode); + gcry_err_code_t rc; + + switch (hd->mode) + { + case GCRY_CIPHER_MODE_CCM: + rc = _gcry_cipher_ccm_get_tag (hd, outtag, taglen); + break; - (void)outtag; - (void)taglen; + default: + log_fatal ("gcry_cipher_gettag: invalid mode %d\n", hd->mode); + rc = GPG_ERR_INV_CIPHER_MODE; + break; + } - return gpg_error (GPG_ERR_INV_CIPHER_MODE); + return gpg_error (rc); } gcry_error_t _gcry_cipher_checktag (gcry_cipher_hd_t hd, const void *intag, size_t taglen) { - log_fatal ("gcry_cipher_checktag: invalid mode %d\n", hd->mode); + gcry_err_code_t rc; + + switch (hd->mode) + { + case GCRY_CIPHER_MODE_CCM: + rc = _gcry_cipher_ccm_check_tag (hd, intag, taglen); + break; - (void)intag; - (void)taglen; + default: + log_fatal ("gcry_cipher_checktag: invalid mode %d\n", hd->mode); + rc = GPG_ERR_INV_CIPHER_MODE; + break; + } - return gpg_error (GPG_ERR_INV_CIPHER_MODE); + return gpg_error (rc); } @@ -980,6 +1035,30 @@ gcry_cipher_ctl( gcry_cipher_hd_t h, int cmd, void *buffer, size_t buflen) h->flags &= ~GCRY_CIPHER_CBC_MAC; break; + case GCRYCTL_SET_CCM_PARAMS: + { + size_t params[3]; + size_t encryptedlen; + size_t aadlen; + size_t authtaglen; + + if (h->mode != GCRY_CIPHER_MODE_CCM) + return gcry_error (GPG_ERR_INV_CIPHER_MODE); + + if 
(!buffer || buflen != 3 * sizeof(size_t)) + return gcry_error (GPG_ERR_INV_ARG); + + /* This command is used to pass additional length parameters needed + by CCM mode to initialize CBC-MAC. */ + memcpy (params, buffer, sizeof(params)); + encryptedlen = params[0]; + aadlen = params[1]; + authtaglen = params[2]; + + rc = _gcry_cipher_ccm_set_params (h, encryptedlen, aadlen, authtaglen); + } + break; + + case GCRYCTL_DISABLE_ALGO: /* This command expects NULL for H and BUFFER to point to an integer with the algo number. */ diff --git a/doc/gcrypt.texi b/doc/gcrypt.texi index 0049fa0..91fe399 100644 --- a/doc/gcrypt.texi +++ b/doc/gcrypt.texi @@ -1635,6 +1635,12 @@ may be specified 64 bit (8 byte) shorter than the input buffer. As per specs the input length must be at least 128 bits and the length must be a multiple of 64 bits. +@item GCRY_CIPHER_MODE_CCM +@cindex CCM, Counter with CBC-MAC mode +Counter with CBC-MAC mode is an Authenticated Encryption with +Associated Data (AEAD) block cipher mode, which is specified in +'NIST Special Publication 800-38C' and RFC 3610. + @end table @node Working with cipher handles @@ -1661,11 +1667,13 @@ The cipher mode to use must be specified via @var{mode}. See @xref{Available cipher modes}, for a list of supported cipher modes and the according constants. Note that some modes are incompatible with some algorithms - in particular, stream mode -(@code{GCRY_CIPHER_MODE_STREAM}) only works with stream ciphers. Any -block cipher mode (@code{GCRY_CIPHER_MODE_ECB}, +(@code{GCRY_CIPHER_MODE_STREAM}) only works with stream ciphers. The +block cipher modes (@code{GCRY_CIPHER_MODE_ECB}, @code{GCRY_CIPHER_MODE_CBC}, @code{GCRY_CIPHER_MODE_CFB}, -@code{GCRY_CIPHER_MODE_OFB} or @code{GCRY_CIPHER_MODE_CTR}) will work -with any block cipher algorithm. +@code{GCRY_CIPHER_MODE_OFB} and @code{GCRY_CIPHER_MODE_CTR}) will work +with any block cipher algorithm.
The @code{GCRY_CIPHER_MODE_CCM} will +only work with block cipher algorithms which have the block size of +16 bytes. The third argument @var{flags} can either be passed as @code{0} or as the bit-wise OR of the following constants. diff --git a/src/gcrypt.h.in b/src/gcrypt.h.in index f0ae927..63bfd24 100644 --- a/src/gcrypt.h.in +++ b/src/gcrypt.h.in @@ -325,7 +325,8 @@ enum gcry_ctl_cmds GCRYCTL_SET_PREFERRED_RNG_TYPE = 65, GCRYCTL_GET_CURRENT_RNG_TYPE = 66, GCRYCTL_DISABLE_LOCKED_SECMEM = 67, - GCRYCTL_DISABLE_PRIV_DROP = 68 + GCRYCTL_DISABLE_PRIV_DROP = 68, + GCRYCTL_SET_CCM_PARAMS = 69 }; /* Perform various operations defined by CMD. */ @@ -884,7 +885,8 @@ enum gcry_cipher_modes GCRY_CIPHER_MODE_STREAM = 4, /* Used with stream ciphers. */ GCRY_CIPHER_MODE_OFB = 5, /* Outer feedback. */ GCRY_CIPHER_MODE_CTR = 6, /* Counter. */ - GCRY_CIPHER_MODE_AESWRAP= 7 /* AES-WRAP algorithm. */ + GCRY_CIPHER_MODE_AESWRAP= 7, /* AES-WRAP algorithm. */ + GCRY_CIPHER_MODE_CCM = 8 /* Counter with CBC-MAC. */ }; /* Flags used with the open function. */ @@ -896,6 +898,8 @@ enum gcry_cipher_flags GCRY_CIPHER_CBC_MAC = 8 /* Enable CBC message auth. code (MAC). */ }; +/* CCM works only with blocks of 128 bits. */ +#define GCRY_CCM_BLOCK_LEN (128 / 8) /* Create a handle for algorithm ALGO to be used in MODE. FLAGS may be given as an bitwise OR of the gcry_cipher_flags values. 
*/ diff --git a/tests/basic.c b/tests/basic.c index 1d6e637..f7771e1 100644 --- a/tests/basic.c +++ b/tests/basic.c @@ -1139,6 +1139,776 @@ check_ofb_cipher (void) static void +check_ccm_cipher (void) +{ + static const struct tv + { + int algo; + int keylen; + const char *key; + int noncelen; + const char *nonce; + int aadlen; + const char *aad; + int plainlen; + const char *plaintext; + int cipherlen; + const char *ciphertext; + } tv[] = + { + /* RFC 3610 */ + { GCRY_CIPHER_AES, /* Packet Vector #1 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x03\x02\x01\x00\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 23, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 31, + "\x58\x8C\x97\x9A\x61\xC6\x63\xD2\xF0\x66\xD0\xC2\xC0\xF9\x89\x80\x6D\x5F\x6B\x61\xDA\xC3\x84\x17\xE8\xD1\x2C\xFD\xF9\x26\xE0"}, + { GCRY_CIPHER_AES, /* Packet Vector #2 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x04\x03\x02\x01\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 24, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 32, + "\x72\xC9\x1A\x36\xE1\x35\xF8\xCF\x29\x1C\xA8\x94\x08\x5C\x87\xE3\xCC\x15\xC4\x39\xC9\xE4\x3A\x3B\xA0\x91\xD5\x6E\x10\x40\x09\x16"}, + { GCRY_CIPHER_AES, /* Packet Vector #3 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x05\x04\x03\x02\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 25, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 33, + "\x51\xB1\xE5\xF4\x4A\x19\x7D\x1D\xA4\x6B\x0F\x8E\x2D\x28\x2A\xE8\x71\xE8\x38\xBB\x64\xDA\x85\x96\x57\x4A\xDA\xA7\x6F\xBD\x9F\xB0\xC5"}, + { GCRY_CIPHER_AES, /* Packet Vector #4 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, 
"\x00\x00\x00\x06\x05\x04\x03\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 19, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 27, + "\xA2\x8C\x68\x65\x93\x9A\x9A\x79\xFA\xAA\x5C\x4C\x2A\x9D\x4A\x91\xCD\xAC\x8C\x96\xC8\x61\xB9\xC9\xE6\x1E\xF1"}, + { GCRY_CIPHER_AES, /* Packet Vector #5 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x07\x06\x05\x04\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 20, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 28, + "\xDC\xF1\xFB\x7B\x5D\x9E\x23\xFB\x9D\x4E\x13\x12\x53\x65\x8A\xD8\x6E\xBD\xCA\x3E\x51\xE8\x3F\x07\x7D\x9C\x2D\x93"}, + { GCRY_CIPHER_AES, /* Packet Vector #6 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x08\x07\x06\x05\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 21, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 29, + "\x6F\xC1\xB0\x11\xF0\x06\x56\x8B\x51\x71\xA4\x2D\x95\x3D\x46\x9B\x25\x70\xA4\xBD\x87\x40\x5A\x04\x43\xAC\x91\xCB\x94"}, + { GCRY_CIPHER_AES, /* Packet Vector #7 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x09\x08\x07\x06\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 23, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 33, + "\x01\x35\xD1\xB2\xC9\x5F\x41\xD5\xD1\xD4\xFE\xC1\x85\xD1\x66\xB8\x09\x4E\x99\x9D\xFE\xD9\x6C\x04\x8C\x56\x60\x2C\x97\xAC\xBB\x74\x90"}, + { GCRY_CIPHER_AES, /* Packet Vector #8 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0A\x09\x08\x07\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 24, + 
"\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 34, + "\x7B\x75\x39\x9A\xC0\x83\x1D\xD2\xF0\xBB\xD7\x58\x79\xA2\xFD\x8F\x6C\xAE\x6B\x6C\xD9\xB7\xDB\x24\xC1\x7B\x44\x33\xF4\x34\x96\x3F\x34\xB4"}, + { GCRY_CIPHER_AES, /* Packet Vector #9 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0B\x0A\x09\x08\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 25, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 35, + "\x82\x53\x1A\x60\xCC\x24\x94\x5A\x4B\x82\x79\x18\x1A\xB5\xC8\x4D\xF2\x1C\xE7\xF9\xB7\x3F\x42\xE1\x97\xEA\x9C\x07\xE5\x6B\x5E\xB1\x7E\x5F\x4E"}, + { GCRY_CIPHER_AES, /* Packet Vector #10 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0C\x0B\x0A\x09\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 19, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 29, + "\x07\x34\x25\x94\x15\x77\x85\x15\x2B\x07\x40\x98\x33\x0A\xBB\x14\x1B\x94\x7B\x56\x6A\xA9\x40\x6B\x4D\x99\x99\x88\xDD"}, + { GCRY_CIPHER_AES, /* Packet Vector #11 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0D\x0C\x0B\x0A\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 20, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 30, + "\x67\x6B\xB2\x03\x80\xB0\xE3\x01\xE8\xAB\x79\x59\x0A\x39\x6D\xA7\x8B\x83\x49\x34\xF5\x3A\xA2\xE9\x10\x7A\x8B\x6C\x02\x2C"}, + { GCRY_CIPHER_AES, /* Packet Vector #12 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0E\x0D\x0C\x0B\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 21, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 31, + 
"\xC0\xFF\xA0\xD6\xF0\x5B\xDB\x67\xF2\x4D\x43\xA4\x33\x8D\x2A\xA4\xBE\xD7\xB2\x0E\x43\xCD\x1A\xA3\x16\x62\xE7\xAD\x65\xD6\xDB"}, + { GCRY_CIPHER_AES, /* Packet Vector #13 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x41\x2B\x4E\xA9\xCD\xBE\x3C\x96\x96\x76\x6C\xFA", + 8, "\x0B\xE1\xA8\x8B\xAC\xE0\x18\xB1", + 23, + "\x08\xE8\xCF\x97\xD8\x20\xEA\x25\x84\x60\xE9\x6A\xD9\xCF\x52\x89\x05\x4D\x89\x5C\xEA\xC4\x7C", + 31, + "\x4C\xB9\x7F\x86\xA2\xA4\x68\x9A\x87\x79\x47\xAB\x80\x91\xEF\x53\x86\xA6\xFF\xBD\xD0\x80\xF8\xE7\x8C\xF7\xCB\x0C\xDD\xD7\xB3"}, + { GCRY_CIPHER_AES, /* Packet Vector #14 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x33\x56\x8E\xF7\xB2\x63\x3C\x96\x96\x76\x6C\xFA", + 8, "\x63\x01\x8F\x76\xDC\x8A\x1B\xCB", + 24, + "\x90\x20\xEA\x6F\x91\xBD\xD8\x5A\xFA\x00\x39\xBA\x4B\xAF\xF9\xBF\xB7\x9C\x70\x28\x94\x9C\xD0\xEC", + 32, + "\x4C\xCB\x1E\x7C\xA9\x81\xBE\xFA\xA0\x72\x6C\x55\xD3\x78\x06\x12\x98\xC8\x5C\x92\x81\x4A\xBC\x33\xC5\x2E\xE8\x1D\x7D\x77\xC0\x8A"}, + { GCRY_CIPHER_AES, /* Packet Vector #15 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x10\x3F\xE4\x13\x36\x71\x3C\x96\x96\x76\x6C\xFA", + 8, "\xAA\x6C\xFA\x36\xCA\xE8\x6B\x40", + 25, + "\xB9\x16\xE0\xEA\xCC\x1C\x00\xD7\xDC\xEC\x68\xEC\x0B\x3B\xBB\x1A\x02\xDE\x8A\x2D\x1A\xA3\x46\x13\x2E", + 33, + "\xB1\xD2\x3A\x22\x20\xDD\xC0\xAC\x90\x0D\x9A\xA0\x3C\x61\xFC\xF4\xA5\x59\xA4\x41\x77\x67\x08\x97\x08\xA7\x76\x79\x6E\xDB\x72\x35\x06"}, + { GCRY_CIPHER_AES, /* Packet Vector #16 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x76\x4C\x63\xB8\x05\x8E\x3C\x96\x96\x76\x6C\xFA", + 12, "\xD0\xD0\x73\x5C\x53\x1E\x1B\xEC\xF0\x49\xC2\x44", + 19, + "\x12\xDA\xAC\x56\x30\xEF\xA5\x39\x6F\x77\x0C\xE1\xA6\x6B\x21\xF7\xB2\x10\x1C", + 27, + "\x14\xD2\x53\xC3\x96\x7B\x70\x60\x9B\x7C\xBB\x7C\x49\x91\x60\x28\x32\x45\x26\x9A\x6F\x49\x97\x5B\xCA\xDE\xAF"}, + { 
GCRY_CIPHER_AES, /* Packet Vector #17 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\xF8\xB6\x78\x09\x4E\x3B\x3C\x96\x96\x76\x6C\xFA", + 12, "\x77\xB6\x0F\x01\x1C\x03\xE1\x52\x58\x99\xBC\xAE", + 20, + "\xE8\x8B\x6A\x46\xC7\x8D\x63\xE5\x2E\xB8\xC5\x46\xEF\xB5\xDE\x6F\x75\xE9\xCC\x0D", + 28, + "\x55\x45\xFF\x1A\x08\x5E\xE2\xEF\xBF\x52\xB2\xE0\x4B\xEE\x1E\x23\x36\xC7\x3E\x3F\x76\x2C\x0C\x77\x44\xFE\x7E\x3C"}, + { GCRY_CIPHER_AES, /* Packet Vector #18 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\xD5\x60\x91\x2D\x3F\x70\x3C\x96\x96\x76\x6C\xFA", + 12, "\xCD\x90\x44\xD2\xB7\x1F\xDB\x81\x20\xEA\x60\xC0", + 21, + "\x64\x35\xAC\xBA\xFB\x11\xA8\x2E\x2F\x07\x1D\x7C\xA4\xA5\xEB\xD9\x3A\x80\x3B\xA8\x7F", + 29, + "\x00\x97\x69\xEC\xAB\xDF\x48\x62\x55\x94\xC5\x92\x51\xE6\x03\x57\x22\x67\x5E\x04\xC8\x47\x09\x9E\x5A\xE0\x70\x45\x51"}, + { GCRY_CIPHER_AES, /* Packet Vector #19 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x42\xFF\xF8\xF1\x95\x1C\x3C\x96\x96\x76\x6C\xFA", + 8, "\xD8\x5B\xC7\xE6\x9F\x94\x4F\xB8", + 23, + "\x8A\x19\xB9\x50\xBC\xF7\x1A\x01\x8E\x5E\x67\x01\xC9\x17\x87\x65\x98\x09\xD6\x7D\xBE\xDD\x18", + 33, + "\xBC\x21\x8D\xAA\x94\x74\x27\xB6\xDB\x38\x6A\x99\xAC\x1A\xEF\x23\xAD\xE0\xB5\x29\x39\xCB\x6A\x63\x7C\xF9\xBE\xC2\x40\x88\x97\xC6\xBA"}, + { GCRY_CIPHER_AES, /* Packet Vector #20 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x92\x0F\x40\xE5\x6C\xDC\x3C\x96\x96\x76\x6C\xFA", + 8, "\x74\xA0\xEB\xC9\x06\x9F\x5B\x37", + 24, + "\x17\x61\x43\x3C\x37\xC5\xA3\x5F\xC1\xF3\x9F\x40\x63\x02\xEB\x90\x7C\x61\x63\xBE\x38\xC9\x84\x37", + 34, + "\x58\x10\xE6\xFD\x25\x87\x40\x22\xE8\x03\x61\xA4\x78\xE3\xE9\xCF\x48\x4A\xB0\x4F\x44\x7E\xFF\xF6\xF0\xA4\x77\xCC\x2F\xC9\xBF\x54\x89\x44"}, + { GCRY_CIPHER_AES, /* Packet Vector #21 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, 
"\x00\x27\xCA\x0C\x71\x20\xBC\x3C\x96\x96\x76\x6C\xFA", + 8, "\x44\xA3\xAA\x3A\xAE\x64\x75\xCA", + 25, + "\xA4\x34\xA8\xE5\x85\x00\xC6\xE4\x15\x30\x53\x88\x62\xD6\x86\xEA\x9E\x81\x30\x1B\x5A\xE4\x22\x6B\xFA", + 35, + "\xF2\xBE\xED\x7B\xC5\x09\x8E\x83\xFE\xB5\xB3\x16\x08\xF8\xE2\x9C\x38\x81\x9A\x89\xC8\xE7\x76\xF1\x54\x4D\x41\x51\xA4\xED\x3A\x8B\x87\xB9\xCE"}, + { GCRY_CIPHER_AES, /* Packet Vector #22 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x5B\x8C\xCB\xCD\x9A\xF8\x3C\x96\x96\x76\x6C\xFA", + 12, "\xEC\x46\xBB\x63\xB0\x25\x20\xC3\x3C\x49\xFD\x70", + 19, + "\xB9\x6B\x49\xE2\x1D\x62\x17\x41\x63\x28\x75\xDB\x7F\x6C\x92\x43\xD2\xD7\xC2", + 29, + "\x31\xD7\x50\xA0\x9D\xA3\xED\x7F\xDD\xD4\x9A\x20\x32\xAA\xBF\x17\xEC\x8E\xBF\x7D\x22\xC8\x08\x8C\x66\x6B\xE5\xC1\x97"}, + { GCRY_CIPHER_AES, /* Packet Vector #23 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x3E\xBE\x94\x04\x4B\x9A\x3C\x96\x96\x76\x6C\xFA", + 12, "\x47\xA6\x5A\xC7\x8B\x3D\x59\x42\x27\xE8\x5E\x71", + 20, + "\xE2\xFC\xFB\xB8\x80\x44\x2C\x73\x1B\xF9\x51\x67\xC8\xFF\xD7\x89\x5E\x33\x70\x76", + 30, + "\xE8\x82\xF1\xDB\xD3\x8C\xE3\xED\xA7\xC2\x3F\x04\xDD\x65\x07\x1E\xB4\x13\x42\xAC\xDF\x7E\x00\xDC\xCE\xC7\xAE\x52\x98\x7D"}, + { GCRY_CIPHER_AES, /* Packet Vector #24 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x8D\x49\x3B\x30\xAE\x8B\x3C\x96\x96\x76\x6C\xFA", + 12, "\x6E\x37\xA6\xEF\x54\x6D\x95\x5D\x34\xAB\x60\x59", + 21, + "\xAB\xF2\x1C\x0B\x02\xFE\xB8\x8F\x85\x6D\xF4\xA3\x73\x81\xBC\xE3\xCC\x12\x85\x17\xD4", + 31, + "\xF3\x29\x05\xB8\x8A\x64\x1B\x04\xB9\xC9\xFF\xB5\x8C\xC3\x90\x90\x0F\x3D\xA1\x2A\xB1\x6D\xCE\x9E\x82\xEF\xA1\x6D\xA6\x20\x59"}, + /* RFC 5528 */ + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #1 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x03\x02\x01\x00\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", 
+ 23, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 31, + "\xBA\x73\x71\x85\xE7\x19\x31\x04\x92\xF3\x8A\x5F\x12\x51\xDA\x55\xFA\xFB\xC9\x49\x84\x8A\x0D\xFC\xAE\xCE\x74\x6B\x3D\xB9\xAD"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #2 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x04\x03\x02\x01\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 24, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 32, + "\x5D\x25\x64\xBF\x8E\xAF\xE1\xD9\x95\x26\xEC\x01\x6D\x1B\xF0\x42\x4C\xFB\xD2\xCD\x62\x84\x8F\x33\x60\xB2\x29\x5D\xF2\x42\x83\xE8"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #3 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x05\x04\x03\x02\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 25, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 33, + "\x81\xF6\x63\xD6\xC7\x78\x78\x17\xF9\x20\x36\x08\xB9\x82\xAD\x15\xDC\x2B\xBD\x87\xD7\x56\xF7\x92\x04\xF5\x51\xD6\x68\x2F\x23\xAA\x46"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #4 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x06\x05\x04\x03\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 19, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 27, + "\xCA\xEF\x1E\x82\x72\x11\xB0\x8F\x7B\xD9\x0F\x08\xC7\x72\x88\xC0\x70\xA4\xA0\x8B\x3A\x93\x3A\x63\xE4\x97\xA0"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #5 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x07\x06\x05\x04\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 20, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 
28, + "\x2A\xD3\xBA\xD9\x4F\xC5\x2E\x92\xBE\x43\x8E\x82\x7C\x10\x23\xB9\x6A\x8A\x77\x25\x8F\xA1\x7B\xA7\xF3\x31\xDB\x09"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #6 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x08\x07\x06\x05\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 21, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 29, + "\xFE\xA5\x48\x0B\xA5\x3F\xA8\xD3\xC3\x44\x22\xAA\xCE\x4D\xE6\x7F\xFA\x3B\xB7\x3B\xAB\xAB\x36\xA1\xEE\x4F\xE0\xFE\x28"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #7 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x09\x08\x07\x06\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 23, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 33, + "\x54\x53\x20\x26\xE5\x4C\x11\x9A\x8D\x36\xD9\xEC\x6E\x1E\xD9\x74\x16\xC8\x70\x8C\x4B\x5C\x2C\xAC\xAF\xA3\xBC\xCF\x7A\x4E\xBF\x95\x73"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #8 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0A\x09\x08\x07\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 24, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 34, + "\x8A\xD1\x9B\x00\x1A\x87\xD1\x48\xF4\xD9\x2B\xEF\x34\x52\x5C\xCC\xE3\xA6\x3C\x65\x12\xA6\xF5\x75\x73\x88\xE4\x91\x3E\xF1\x47\x01\xF4\x41"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #9 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0B\x0A\x09\x08\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 25, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 35, + 
"\x5D\xB0\x8D\x62\x40\x7E\x6E\x31\xD6\x0F\x9C\xA2\xC6\x04\x74\x21\x9A\xC0\xBE\x50\xC0\xD4\xA5\x77\x87\x94\xD6\xE2\x30\xCD\x25\xC9\xFE\xBF\x87"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #10 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0C\x0B\x0A\x09\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 19, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 29, + "\xDB\x11\x8C\xCE\xC1\xB8\x76\x1C\x87\x7C\xD8\x96\x3A\x67\xD6\xF3\xBB\xBC\x5C\xD0\x92\x99\xEB\x11\xF3\x12\xF2\x32\x37"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #11 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0D\x0C\x0B\x0A\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 20, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 30, + "\x7C\xC8\x3D\x8D\xC4\x91\x03\x52\x5B\x48\x3D\xC5\xCA\x7E\xA9\xAB\x81\x2B\x70\x56\x07\x9D\xAF\xFA\xDA\x16\xCC\xCF\x2C\x4E"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #12 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0E\x0D\x0C\x0B\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 21, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 31, + "\x2C\xD3\x5B\x88\x20\xD2\x3E\x7A\xA3\x51\xB0\xE9\x2F\xC7\x93\x67\x23\x8B\x2C\xC7\x48\xCB\xB9\x4C\x29\x47\x79\x3D\x64\xAF\x75"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #13 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\xA9\x70\x11\x0E\x19\x27\xB1\x60\xB6\xA3\x1C\x1C", + 8, "\x6B\x7F\x46\x45\x07\xFA\xE4\x96", + 23, + "\xC6\xB5\xF3\xE6\xCA\x23\x11\xAE\xF7\x47\x2B\x20\x3E\x73\x5E\xA5\x61\xAD\xB1\x7D\x56\xC5\xA3", + 31, + 
"\xA4\x35\xD7\x27\x34\x8D\xDD\x22\x90\x7F\x7E\xB8\xF5\xFD\xBB\x4D\x93\x9D\xA6\x52\x4D\xB4\xF6\x45\x58\xC0\x2D\x25\xB1\x27\xEE"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #14 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\x83\xCD\x8C\xE0\xCB\x42\xB1\x60\xB6\xA3\x1C\x1C", + 8, "\x98\x66\x05\xB4\x3D\xF1\x5D\xE7", + 24, + "\x01\xF6\xCE\x67\x64\xC5\x74\x48\x3B\xB0\x2E\x6B\xBF\x1E\x0A\xBD\x26\xA2\x25\x72\xB4\xD8\x0E\xE7", + 32, + "\x8A\xE0\x52\x50\x8F\xBE\xCA\x93\x2E\x34\x6F\x05\xE0\xDC\x0D\xFB\xCF\x93\x9E\xAF\xFA\x3E\x58\x7C\x86\x7D\x6E\x1C\x48\x70\x38\x06"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #15 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\x5F\x54\x95\x0B\x18\xF2\xB1\x60\xB6\xA3\x1C\x1C", + 8, "\x48\xF2\xE7\xE1\xA7\x67\x1A\x51", + 25, + "\xCD\xF1\xD8\x40\x6F\xC2\xE9\x01\x49\x53\x89\x70\x05\xFB\xFB\x8B\xA5\x72\x76\xF9\x24\x04\x60\x8E\x08", + 33, + "\x08\xB6\x7E\xE2\x1C\x8B\xF2\x6E\x47\x3E\x40\x85\x99\xE9\xC0\x83\x6D\x6A\xF0\xBB\x18\xDF\x55\x46\x6C\xA8\x08\x78\xA7\x90\x47\x6D\xE5"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #16 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\xEC\x60\x08\x63\x31\x9A\xB1\x60\xB6\xA3\x1C\x1C", + 12, "\xDE\x97\xDF\x3B\x8C\xBD\x6D\x8E\x50\x30\xDA\x4C", + 19, + "\xB0\x05\xDC\xFA\x0B\x59\x18\x14\x26\xA9\x61\x68\x5A\x99\x3D\x8C\x43\x18\x5B", + 27, + "\x63\xB7\x8B\x49\x67\xB1\x9E\xDB\xB7\x33\xCD\x11\x14\xF6\x4E\xB2\x26\x08\x93\x68\xC3\x54\x82\x8D\x95\x0C\xC5"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #17 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\x60\xCF\xF1\xA3\x1E\xA1\xB1\x60\xB6\xA3\x1C\x1C", + 12, "\xA5\xEE\x93\xE4\x57\xDF\x05\x46\x6E\x78\x2D\xCF", + 20, + "\x2E\x20\x21\x12\x98\x10\x5F\x12\x9D\x5E\xD9\x5B\x93\xF7\x2D\x30\xB2\xFA\xCC\xD7", + 28, + 
"\x0B\xC6\xBB\xE2\xA8\xB9\x09\xF4\x62\x9E\xE6\xDC\x14\x8D\xA4\x44\x10\xE1\x8A\xF4\x31\x47\x38\x32\x76\xF6\x6A\x9F"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #18 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\x0F\x85\xCD\x99\x5C\x97\xB1\x60\xB6\xA3\x1C\x1C", + 12, "\x24\xAA\x1B\xF9\xA5\xCD\x87\x61\x82\xA2\x50\x74", + 21, + "\x26\x45\x94\x1E\x75\x63\x2D\x34\x91\xAF\x0F\xC0\xC9\x87\x6C\x3B\xE4\xAA\x74\x68\xC9", + 29, + "\x22\x2A\xD6\x32\xFA\x31\xD6\xAF\x97\x0C\x34\x5F\x7E\x77\xCA\x3B\xD0\xDC\x25\xB3\x40\xA1\xA3\xD3\x1F\x8D\x4B\x44\xB7"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #19 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\xC2\x9B\x2C\xAA\xC4\xCD\xB1\x60\xB6\xA3\x1C\x1C", + 8, "\x69\x19\x46\xB9\xCA\x07\xBE\x87", + 23, + "\x07\x01\x35\xA6\x43\x7C\x9D\xB1\x20\xCD\x61\xD8\xF6\xC3\x9C\x3E\xA1\x25\xFD\x95\xA0\xD2\x3D", + 33, + "\x05\xB8\xE1\xB9\xC4\x9C\xFD\x56\xCF\x13\x0A\xA6\x25\x1D\xC2\xEC\xC0\x6C\xCC\x50\x8F\xE6\x97\xA0\x06\x6D\x57\xC8\x4B\xEC\x18\x27\x68"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #20 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\x2C\x6B\x75\x95\xEE\x62\xB1\x60\xB6\xA3\x1C\x1C", + 8, "\xD0\xC5\x4E\xCB\x84\x62\x7D\xC4", + 24, + "\xC8\xC0\x88\x0E\x6C\x63\x6E\x20\x09\x3D\xD6\x59\x42\x17\xD2\xE1\x88\x77\xDB\x26\x4E\x71\xA5\xCC", + 34, + "\x54\xCE\xB9\x68\xDE\xE2\x36\x11\x57\x5E\xC0\x03\xDF\xAA\x1C\xD4\x88\x49\xBD\xF5\xAE\x2E\xDB\x6B\x7F\xA7\x75\xB1\x50\xED\x43\x83\xC5\xA9"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #21 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\xC5\x3C\xD4\xC2\xAA\x24\xB1\x60\xB6\xA3\x1C\x1C", + 8, "\xE2\x85\xE0\xE4\x80\x8C\xDA\x3D", + 25, + "\xF7\x5D\xAA\x07\x10\xC4\xE6\x42\x97\x79\x4D\xC2\xB7\xD2\xA2\x07\x57\xB1\xAA\x4E\x44\x80\x02\xFF\xAB", + 35, + 
"\xB1\x40\x45\x46\xBF\x66\x72\x10\xCA\x28\xE3\x09\xB3\x9B\xD6\xCA\x7E\x9F\xC8\x28\x5F\xE6\x98\xD4\x3C\xD2\x0A\x02\xE0\xBD\xCA\xED\x20\x10\xD3"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #22 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\xBE\xE9\x26\x7F\xBA\xDC\xB1\x60\xB6\xA3\x1C\x1C", + 12, "\x6C\xAE\xF9\x94\x11\x41\x57\x0D\x7C\x81\x34\x05", + 19, + "\xC2\x38\x82\x2F\xAC\x5F\x98\xFF\x92\x94\x05\xB0\xAD\x12\x7A\x4E\x41\x85\x4E", + 29, + "\x94\xC8\x95\x9C\x11\x56\x9A\x29\x78\x31\xA7\x21\x00\x58\x57\xAB\x61\xB8\x7A\x2D\xEA\x09\x36\xB6\xEB\x5F\x62\x5F\x5D"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #23 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\xDF\xA8\xB1\x24\x50\x07\xB1\x60\xB6\xA3\x1C\x1C", + 12, "\x36\xA5\x2C\xF1\x6B\x19\xA2\x03\x7A\xB7\x01\x1E", + 20, + "\x4D\xBF\x3E\x77\x4A\xD2\x45\xE5\xD5\x89\x1F\x9D\x1C\x32\xA0\xAE\x02\x2C\x85\xD7", + 30, + "\x58\x69\xE3\xAA\xD2\x44\x7C\x74\xE0\xFC\x05\xF9\xA4\xEA\x74\x57\x7F\x4D\xE8\xCA\x89\x24\x76\x42\x96\xAD\x04\x11\x9C\xE7"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #24 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\x3B\x8F\xD8\xD3\xA9\x37\xB1\x60\xB6\xA3\x1C\x1C", + 12, "\xA4\xD4\x99\xF7\x84\x19\x72\x8C\x19\x17\x8B\x0C", + 21, + "\x9D\xC9\xED\xAE\x2F\xF5\xDF\x86\x36\xE8\xC6\xDE\x0E\xED\x55\xF7\x86\x7E\x33\x33\x7D", + 31, + "\x4B\x19\x81\x56\x39\x3B\x0F\x77\x96\x08\x6A\xAF\xB4\x54\xF8\xC3\xF0\x34\xCC\xA9\x66\x94\x5F\x1F\xCE\xA7\xE1\x1B\xEE\x6A\x2F"} + }; + static const int cut[] = { 0, 1, 8, 10, 16, 19, -1 }; + gcry_cipher_hd_t hde, hdd; + unsigned char out[MAX_DATA_LEN]; + size_t ctl_params[3]; + int split, aadsplit; + size_t j, i, keylen, blklen, authlen; + gcry_error_t err = 0; + + if (verbose) + fprintf (stderr, " Starting CCM checks.\n"); + + for (i = 0; i < sizeof (tv) / sizeof (tv[0]); i++) + { + if (verbose) + fprintf (stderr, " checking CCM mode for %s [%i]\n", + 
gcry_cipher_algo_name (tv[i].algo), + tv[i].algo); + + for (j = 0; j < sizeof (cut) / sizeof (cut[0]); j++) + { + split = cut[j] < 0 ? tv[i].plainlen : cut[j]; + if (tv[i].plainlen < split) + continue; + + err = gcry_cipher_open (&hde, tv[i].algo, GCRY_CIPHER_MODE_CCM, 0); + if (!err) + err = gcry_cipher_open (&hdd, tv[i].algo, GCRY_CIPHER_MODE_CCM, 0); + if (err) + { + fail ("cipher-ccm, gcry_cipher_open failed: %s\n", + gpg_strerror (err)); + return; + } + + keylen = gcry_cipher_get_algo_keylen(tv[i].algo); + if (!keylen) + { + fail ("cipher-ccm, gcry_cipher_get_algo_keylen failed\n"); + return; + } + + err = gcry_cipher_setkey (hde, tv[i].key, keylen); + if (!err) + err = gcry_cipher_setkey (hdd, tv[i].key, keylen); + if (err) + { + fail ("cipher-ccm, gcry_cipher_setkey failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + blklen = gcry_cipher_get_algo_blklen(tv[i].algo); + if (!blklen) + { + fail ("cipher-ccm, gcry_cipher_get_algo_blklen failed\n"); + return; + } + + err = gcry_cipher_setiv (hde, tv[i].nonce, tv[i].noncelen); + if (!err) + err = gcry_cipher_setiv (hdd, tv[i].nonce, tv[i].noncelen); + if (err) + { + fail ("cipher-ccm, gcry_cipher_setiv failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + authlen = tv[i].cipherlen - tv[i].plainlen; + ctl_params[0] = tv[i].plainlen; /* encryptedlen */ + ctl_params[1] = tv[i].aadlen; /* aadlen */ + ctl_params[2] = authlen; /* authtaglen */ + err = gcry_cipher_ctl (hde, GCRYCTL_SET_CCM_PARAMS, ctl_params, + sizeof(ctl_params)); + if (!err) + err = gcry_cipher_ctl (hdd, GCRYCTL_SET_CCM_PARAMS, ctl_params, + sizeof(ctl_params)); + if (err) + { + fail ("cipher-ccm, gcry_cipher_ctl GCRYCTL_SET_CCM_PARAMS failed:" + "%s\n", gpg_strerror (err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + aadsplit = split > tv[i].aadlen ? 
0 : split; + + err = gcry_cipher_authenticate (hde, tv[i].aad, + tv[i].aadlen - aadsplit); + if (!err) + err = gcry_cipher_authenticate (hde, + &tv[i].aad[tv[i].aadlen - aadsplit], + aadsplit); + if (!err) + err = gcry_cipher_authenticate (hdd, tv[i].aad, + tv[i].aadlen - aadsplit); + if (!err) + err = gcry_cipher_authenticate (hdd, + &tv[i].aad[tv[i].aadlen - aadsplit], + aadsplit); + if (err) + { + fail ("cipher-ccm, gcry_cipher_authenticate failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + err = gcry_cipher_encrypt (hde, out, MAX_DATA_LEN, tv[i].plaintext, + tv[i].plainlen - split); + if (!err) + err = gcry_cipher_encrypt (hde, &out[tv[i].plainlen - split], + MAX_DATA_LEN - (tv[i].plainlen - split), + &tv[i].plaintext[tv[i].plainlen - split], + split); + if (err) + { + fail ("cipher-ccm, gcry_cipher_encrypt (%d:%d) failed: %s\n", + i, j, gpg_strerror (err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + err = gcry_cipher_gettag (hde, &out[tv[i].plainlen], authlen); + if (err) + { + fail ("cipher-ccm, gcry_cipher_gettag (%d:%d) failed: %s\n", + i, j, gpg_strerror (err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + if (memcmp (tv[i].ciphertext, out, tv[i].cipherlen)) + fail ("cipher-ccm, encrypt mismatch entry %d:%d\n", i, j); + + err = gcry_cipher_decrypt (hdd, out, tv[i].plainlen - split, NULL, 0); + if (!err) + err = gcry_cipher_decrypt (hdd, &out[tv[i].plainlen - split], split, + NULL, 0); + if (err) + { + fail ("cipher-ccm, gcry_cipher_decrypt (%d:%d) failed: %s\n", + i, j, gpg_strerror (err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + if (memcmp (tv[i].plaintext, out, tv[i].plainlen)) + fail ("cipher-ccm, decrypt mismatch entry %d:%d\n", i, j); + + err = gcry_cipher_checktag (hdd, &out[tv[i].plainlen], authlen); + if (err) + { + fail ("cipher-ccm, gcry_cipher_checktag (%d:%d) failed: %s\n", + i, j, gpg_strerror 
(err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + } + } + + /* Large buffer tests. */ + + /* Test encoding of aadlen > 0xfeff. */ + { + static const char key[]={0x40,0x41,0x42,0x43,0x44,0x45,0x46,0x47, + 0x48,0x49,0x4a,0x4b,0x4c,0x4d,0x4e,0x4f}; + static const char iv[]={0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,0x18,0x19}; + static const char tag[]={0x9C,0x76,0xE7,0x33,0xD5,0x15,0xB3,0x6C, + 0xBA,0x76,0x95,0xF7,0xFB,0x91}; + char buf[1024]; + size_t enclen = 0x20000; + size_t aadlen = 0x20000; + size_t taglen = sizeof(tag); + + err = gcry_cipher_open (&hde, GCRY_CIPHER_AES, GCRY_CIPHER_MODE_CCM, 0); + if (err) + { + fail ("cipher-ccm-large, gcry_cipher_open failed: %s\n", + gpg_strerror (err)); + return; + } + + err = gcry_cipher_setkey (hde, key, sizeof (key)); + if (err) + { + fail ("cipher-ccm-large, gcry_cipher_setkey failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + + err = gcry_cipher_setiv (hde, iv, sizeof (iv)); + if (err) + { + fail ("cipher-ccm-large, gcry_cipher_setiv failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + + ctl_params[0] = enclen; /* encryptedlen */ + ctl_params[1] = aadlen; /* aadlen */ + ctl_params[2] = taglen; /* authtaglen */ + err = gcry_cipher_ctl (hde, GCRYCTL_SET_CCM_PARAMS, ctl_params, + sizeof(ctl_params)); + if (err) + { + fail ("cipher-ccm-large, gcry_cipher_ctl GCRYCTL_SET_CCM_PARAMS failed:" + "%s\n", gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + + memset (buf, 0xaa, sizeof(buf)); + + for (i = 0; i < aadlen; i += sizeof(buf)) + { + err = gcry_cipher_authenticate (hde, buf, sizeof (buf)); + if (err) + { + fail ("cipher-ccm-large, gcry_cipher_authenticate failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + } + + for (i = 0; i < enclen; i += sizeof(buf)) + { + memset (buf, 0xee, sizeof(buf)); + err = gcry_cipher_encrypt (hde, buf, sizeof 
(buf), NULL, 0); + if (err) + { + fail ("cipher-ccm-large, gcry_cipher_encrypt failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + } + + err = gcry_cipher_gettag (hde, buf, taglen); + if (err) + { + fail ("cipher-ccm-large, gcry_cipher_gettag failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + + if (memcmp (buf, tag, taglen) != 0) + fail ("cipher-ccm-large, encrypt mismatch entry\n"); + } + +#if 0 + /* Test encoding of aadlen > 0xffffffff. */ + { + static const char key[]={0x40,0x41,0x42,0x43,0x44,0x45,0x46,0x47, + 0x48,0x49,0x4a,0x4b,0x4c,0x4d,0x4e,0x4f}; + static const char iv[]={0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,0x18,0x19}; + static const char tag[]={0x01,0xB2,0xC3,0x4A,0xA6,0x6A,0x07,0x6D, + 0xBC,0xBD,0xEA,0x17,0xD3,0x73,0xD7,0xD4}; + char buf[1024]; + size_t enclen = (size_t)0xffffffff + 1 + 1024; + size_t aadlen = (size_t)0xffffffff + 1 + 1024; + size_t taglen = sizeof(tag); + + err = gcry_cipher_open (&hde, GCRY_CIPHER_AES, GCRY_CIPHER_MODE_CCM, 0); + if (err) + { + fail ("cipher-ccm-huge, gcry_cipher_open failed: %s\n", + gpg_strerror (err)); + return; + } + + err = gcry_cipher_setkey (hde, key, sizeof (key)); + if (err) + { + fail ("cipher-ccm-huge, gcry_cipher_setkey failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + + err = gcry_cipher_setiv (hde, iv, sizeof (iv)); + if (err) + { + fail ("cipher-ccm-huge, gcry_cipher_setiv failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + + ctl_params[0] = enclen; /* encryptedlen */ + ctl_params[1] = aadlen; /* aadlen */ + ctl_params[2] = taglen; /* authtaglen */ + err = gcry_cipher_ctl (hde, GCRYCTL_SET_CCM_PARAMS, ctl_params, + sizeof(ctl_params)); + if (err) + { + fail ("cipher-ccm-huge, gcry_cipher_ctl GCRYCTL_SET_CCM_PARAMS failed:" + "%s\n", gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + + memset (buf, 0xaa, sizeof(buf)); + + for (i = 0; i < aadlen; i += sizeof(buf)) + { + 
err = gcry_cipher_authenticate (hde, buf, sizeof (buf)); + if (err) + { + fail ("cipher-ccm-huge, gcry_cipher_authenticate failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + } + + for (i = 0; i < enclen; i += sizeof(buf)) + { + memset (buf, 0xee, sizeof(buf)); + err = gcry_cipher_encrypt (hde, buf, sizeof (buf), NULL, 0); + if (err) + { + fail ("cipher-ccm-huge, gcry_cipher_encrypt failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + } + + err = gcry_cipher_gettag (hde, buf, taglen); + if (err) + { + fail ("cipher-ccm-huge, gcry_cipher_gettag failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + + if (memcmp (buf, tag, taglen) != 0) + fail ("cipher-ccm-huge, encrypt mismatch entry\n"); + } +#endif + + if (verbose) + fprintf (stderr, " Completed CCM checks.\n"); +} + + +static void check_stream_cipher (void) { struct tv @@ -2455,6 +3225,7 @@ check_cipher_modes(void) check_ctr_cipher (); check_cfb_cipher (); check_ofb_cipher (); + check_ccm_cipher (); check_stream_cipher (); check_stream_cipher_large_block (); diff --git a/tests/benchmark.c b/tests/benchmark.c index ecda0d3..a9b31f7 100644 --- a/tests/benchmark.c +++ b/tests/benchmark.c @@ -435,6 +435,40 @@ md_bench ( const char *algoname ) fflush (stdout); } + +static void ccm_aead_init(gcry_cipher_hd_t hd, size_t buflen, int authlen) +{ + const int _L = 4; + const int noncelen = 15 - _L; + char nonce[noncelen]; + size_t params[3]; + gcry_error_t err = GPG_ERR_NO_ERROR; + + memset (nonce, 0x33, noncelen); + + err = gcry_cipher_setiv (hd, nonce, noncelen); + if (err) + { + fprintf (stderr, "gcry_cipher_setiv failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } + + params[0] = buflen; /* encryptedlen */ + params[1] = 0; /* aadlen */ + params[2] = authlen; /* authtaglen */ + err = gcry_cipher_ctl (hd, GCRYCTL_SET_CCM_PARAMS, params, sizeof(params)); + if (err) + { + fprintf (stderr, "gcry_cipher_setiv failed: 
%s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } +} + + static void cipher_bench ( const char *algoname ) { @@ -448,12 +482,21 @@ cipher_bench ( const char *algoname ) char *raw_outbuf, *raw_buf; size_t allocated_buflen, buflen; int repetitions; - static struct { int mode; const char *name; int blocked; } modes[] = { + static const struct { + int mode; + const char *name; + int blocked; + void (* const aead_init)(gcry_cipher_hd_t hd, size_t buflen, int authlen); + int req_blocksize; + int authlen; + } modes[] = { { GCRY_CIPHER_MODE_ECB, " ECB/Stream", 1 }, { GCRY_CIPHER_MODE_CBC, " CBC", 1 }, { GCRY_CIPHER_MODE_CFB, " CFB", 0 }, { GCRY_CIPHER_MODE_OFB, " OFB", 0 }, { GCRY_CIPHER_MODE_CTR, " CTR", 0 }, + { GCRY_CIPHER_MODE_CCM, " CCM", 0, + ccm_aead_init, GCRY_CCM_BLOCK_LEN, 8 }, { GCRY_CIPHER_MODE_STREAM, "", 0 }, {0} }; @@ -542,9 +585,16 @@ cipher_bench ( const char *algoname ) for (modeidx=0; modes[modeidx].mode; modeidx++) { if ((blklen > 1 && modes[modeidx].mode == GCRY_CIPHER_MODE_STREAM) - | (blklen == 1 && modes[modeidx].mode != GCRY_CIPHER_MODE_STREAM)) + || (blklen == 1 && modes[modeidx].mode != GCRY_CIPHER_MODE_STREAM)) continue; + if (modes[modeidx].req_blocksize > 0 + && blklen != modes[modeidx].req_blocksize) + { + printf (" %7s %7s", "-", "-" ); + continue; + } + for (i=0; i < sizeof buf; i++) buf[i] = i; @@ -585,7 +635,18 @@ cipher_bench ( const char *algoname ) exit (1); } } - err = gcry_cipher_encrypt ( hd, outbuf, buflen, buf, buflen); + if (modes[modeidx].aead_init) + { + (*modes[modeidx].aead_init) (hd, buflen, modes[modeidx].authlen); + err = gcry_cipher_encrypt (hd, outbuf, buflen, buf, buflen); + if (err) + break; + err = gcry_cipher_gettag (hd, outbuf, modes[modeidx].authlen); + } + else + { + err = gcry_cipher_encrypt (hd, outbuf, buflen, buf, buflen); + } } stop_timer (); @@ -632,7 +693,18 @@ cipher_bench ( const char *algoname ) exit (1); } } - err = gcry_cipher_decrypt ( hd, outbuf, buflen, buf, buflen); + if 
(modes[modeidx].aead_init) + { + (*modes[modeidx].aead_init) (hd, buflen, modes[modeidx].authlen); + err = gcry_cipher_decrypt (hd, outbuf, buflen, buf, buflen); + if (err) + break; + err = gcry_cipher_checktag (hd, outbuf, modes[modeidx].authlen); + if (gpg_err_code (err) == GPG_ERR_CHECKSUM) + err = gpg_error (GPG_ERR_NO_ERROR); + } + else + err = gcry_cipher_decrypt (hd, outbuf, buflen, buf, buflen); } stop_timer (); printf (" %s", elapsed_time ()); From wk at gnupg.org Mon Oct 21 12:43:58 2013 From: wk at gnupg.org (Werner Koch) Date: Mon, 21 Oct 2013 12:43:58 +0200 Subject: ECDSA for Edwards curve In-Reply-To: <1382341573.3497.1.camel@cfw2.gniibe.org> (NIIBE Yutaka's message of "Mon, 21 Oct 2013 16:46:13 +0900") References: <1381867004-21231-1-git-send-email-dbaryshkov@gmail.com> <1381867004-21231-3-git-send-email-dbaryshkov@gmail.com> <87ppr5jnbm.fsf@vigenere.g10code.de> <87iowwiefw.fsf@vigenere.g10code.de> <1382341573.3497.1.camel@cfw2.gniibe.org> Message-ID: <87sivuewep.fsf@vigenere.g10code.de> On Mon, 21 Oct 2013 09:46, gniibe at fsij.org said: > I think that I figure out the issue of failure. Me too. By assuming a bug in the code it took me lots of hours to come to the right conclusion :-(. > In the function nist_generate_key, when we change the private key "d" > into -d, it assumes Weierstrass curve, where negative point of (x, y) > is (x, -y). > > However, for Twisted Edwards curve, negative point of (u, v) is (-u, v). Yeah. The question is why we should use that compression form for the Edwards curve. The very reason for the Ed25519/ECDSA hack is a special need of GNUNET which can't be fulfilled by the standard ED25519 key generation. Jivsov's black box key generation algorithm would fix the problem but be unusable for GNUNET as well. > Or, we could change the code so that we can have interfaces of > getting/setting affine point in the representation of corresponding > Weierstrass curve (x, y) for Twisted Edwards curve. 
And public key is > specified by Weierstrass curve representation. Or forget about Ed25519 and use P-256 directly? Needs to be discussed with the GNUnet folks. Shalom-Salam, Werner -- Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz. From jussi.kivilinna at iki.fi Mon Oct 21 14:40:59 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Mon, 21 Oct 2013 15:40:59 +0300 Subject: [PATCH] mpi: allow building with clang on ARM Message-ID: <20131021124059.29895.67299.stgit@localhost6.localdomain6> * mpi/longlong.h [__arm__] (add_ssaaaa, sub_ddmmss, umul_ppmm) (count_leading_zeros): Do not cast assembly output arguments. [__arm__] (umul_ppmm): Remove the extra '%' ahead of assembly comment. [_ARM_ARCH >= 4] (umul_ppmm): Use correct inputs and outputs instead of registers. -- Signed-off-by: Jussi Kivilinna --- mpi/longlong.h | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/mpi/longlong.h b/mpi/longlong.h index c2ab9c5..8c0bd45 100644 --- a/mpi/longlong.h +++ b/mpi/longlong.h @@ -213,8 +213,8 @@ extern UDItype __udiv_qrnnd (); #define add_ssaaaa(sh, sl, ah, al, bh, bl) \ __asm__ ("adds %1, %4, %5\n" \ "adc %0, %2, %3" \ - : "=r" ((USItype)(sh)), \ - "=&r" ((USItype)(sl)) \ + : "=r" ((sh)), \ + "=&r" ((sl)) \ : "%r" ((USItype)(ah)), \ "rI" ((USItype)(bh)), \ "%r" ((USItype)(al)), \ @@ -222,15 +222,15 @@ extern UDItype __udiv_qrnnd (); #define sub_ddmmss(sh, sl, ah, al, bh, bl) \ __asm__ ("subs %1, %4, %5\n" \ "sbc %0, %2, %3" \ - : "=r" ((USItype)(sh)), \ - "=&r" ((USItype)(sl)) \ + : "=r" ((sh)), \ + "=&r" ((sl)) \ : "r" ((USItype)(ah)), \ "rI" ((USItype)(bh)), \ "r" ((USItype)(al)), \ "rI" ((USItype)(bl)) __CLOBBER_CC) #if (defined __ARM_ARCH && __ARM_ARCH <= 3) #define umul_ppmm(xh, xl, a, b) \ - __asm__ ("%@ Inlined umul_ppmm\n" \ + __asm__ ("@ Inlined umul_ppmm\n" \ "mov %|r0, %2, lsr #16 @ AAAA\n" \ "mov %|r2, %3, lsr #16 @ BBBB\n" \ "bic %|r1, %2, %|r0, lsl #16 @ aaaa\n" \ @@ -243,17 +243,17 @@ extern UDItype 
__udiv_qrnnd (); "addcs %|r2, %|r2, #65536\n" \ "adds %1, %|r1, %|r0, lsl #16\n" \ "adc %0, %|r2, %|r0, lsr #16" \ - : "=&r" ((USItype)(xh)), \ - "=r" ((USItype)(xl)) \ + : "=&r" ((xh)), \ + "=r" ((xl)) \ : "r" ((USItype)(a)), \ "r" ((USItype)(b)) \ : "r0", "r1", "r2" __CLOBBER_CC) #else /* __ARM_ARCH >= 4 */ #define umul_ppmm(xh, xl, a, b) \ - __asm__ ("%@ Inlined umul_ppmm\n" \ - "umull %r1, %r0, %r2, %r3" \ - : "=&r" ((USItype)(xh)), \ - "=r" ((USItype)(xl)) \ + __asm__ ("@ Inlined umul_ppmm\n" \ + "umull %1, %0, %2, %3" \ + : "=&r" ((xh)), \ + "=r" ((xl)) \ : "r" ((USItype)(a)), \ "r" ((USItype)(b)) \ : "r0", "r1") @@ -263,7 +263,7 @@ extern UDItype __udiv_qrnnd (); #if (defined __ARM_ARCH && __ARM_ARCH >= 5) #define count_leading_zeros(count, x) \ __asm__ ("clz %0, %1" \ - : "=r" ((USItype)(count)) \ + : "=r" ((count)) \ : "r" ((USItype)(x))) #endif /* __ARM_ARCH >= 5 */ #endif /* __arm__ */ From jussi.kivilinna at iki.fi Mon Oct 21 15:21:21 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Mon, 21 Oct 2013 16:21:21 +0300 Subject: [PATCH] twofish: add ARMv6 assembly implementation Message-ID: <20131021132121.23604.99408.stgit@localhost6.localdomain6> * cipher/Makefile.am: Add 'twofish-armv6.S'. * cipher/twofish-armv6.S: New. * cipher/twofish.c (USE_ARMV6_ASM): New macro. [USE_ARMV6_ASM] (_gcry_twofish_armv6_encrypt_block) (_gcry_twofish_armv6_decrypt_block): New prototypes. [USE_ARMV6_ASM] (twofish_encrypt, twofish_decrypt): Add. [USE_AMD64_ASM] (do_twofish_encrypt, do_twofish_decrypt): Remove. (_gcry_twofish_ctr_enc, _gcry_twofish_cfb_dec): Use 'twofish_encrypt' instead of 'do_twofish_encrypt'. (_gcry_twofish_cbc_dec): Use 'twofish_decrypt' instead of 'do_twofish_decrypt'. * configure.ac [arm]: Add 'twofish-armv6.lo'. -- Add an optimized ARMv6 assembly implementation for Twofish. The implementation is tuned for the Cortex-A8. Unaligned access handling is done in the assembly part.
For now, only enable this on little-endian systems, as big-endian correctness has not been tested yet. Old (gcc-4.8) vs new (twofish-asm), Cortex-A8 (on armhf): ECB/Stream CBC CFB OFB CTR CCM --------------- --------------- --------------- --------------- --------------- --------------- TWOFISH 1.23x 1.25x 1.16x 1.26x 1.16x 1.30x 1.18x 1.17x 1.23x 1.23x 1.22x 1.22x Signed-off-by: Jussi Kivilinna --- cipher/Makefile.am | 2 cipher/twofish-armv6.S | 365 ++++++++++++++++++++++++++++++++++++++++++++++++ cipher/twofish.c | 88 ++++++++---- configure.ac | 4 + 4 files changed, 432 insertions(+), 27 deletions(-) create mode 100644 cipher/twofish-armv6.S diff --git a/cipher/Makefile.am b/cipher/Makefile.am index b0efd89..3d8149a 100644 --- a/cipher/Makefile.am +++ b/cipher/Makefile.am @@ -80,7 +80,7 @@ sha512.c sha512-armv7-neon.S \ stribog.c \ tiger.c \ whirlpool.c \ -twofish.c twofish-amd64.S \ +twofish.c twofish-amd64.S twofish-armv6.S \ rfc2268.c \ camellia.c camellia.h camellia-glue.c camellia-aesni-avx-amd64.S \ camellia-aesni-avx2-amd64.S camellia-armv6.S diff --git a/cipher/twofish-armv6.S b/cipher/twofish-armv6.S new file mode 100644 index 0000000..b76ab37 --- /dev/null +++ b/cipher/twofish-armv6.S @@ -0,0 +1,365 @@ +/* twofish-armv6.S - ARM assembly implementation of Twofish cipher + * + * Copyright © 2013 Jussi Kivilinna + * + * This file is part of Libgcrypt. + * + * Libgcrypt is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as + * published by the Free Software Foundation; either version 2.1 of + * the License, or (at your option) any later version. + * + * Libgcrypt is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details.
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with this program; if not, see . + */ + +#include + +#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) +#ifdef HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS + +.text + +.syntax unified +.arm + +/* structure of TWOFISH_context: */ +#define s0 0 +#define s1 ((s0) + 4 * 256) +#define s2 ((s1) + 4 * 256) +#define s3 ((s2) + 4 * 256) +#define w ((s3) + 4 * 256) +#define k ((w) + 4 * 8) + +/* register macros */ +#define CTX %r0 +#define CTXs0 %r0 +#define CTXs1 %r1 +#define CTXs3 %r7 + +#define RA %r3 +#define RB %r4 +#define RC %r5 +#define RD %r6 + +#define RX %r2 +#define RY %ip + +#define RMASK %lr + +#define RT0 %r8 +#define RT1 %r9 +#define RT2 %r10 +#define RT3 %r11 + +/* helper macros */ +#define ldr_unaligned_le(rout, rsrc, offs, rtmp) \ + ldrb rout, [rsrc, #((offs) + 0)]; \ + ldrb rtmp, [rsrc, #((offs) + 1)]; \ + orr rout, rout, rtmp, lsl #8; \ + ldrb rtmp, [rsrc, #((offs) + 2)]; \ + orr rout, rout, rtmp, lsl #16; \ + ldrb rtmp, [rsrc, #((offs) + 3)]; \ + orr rout, rout, rtmp, lsl #24; + +#define str_unaligned_le(rin, rdst, offs, rtmp0, rtmp1) \ + mov rtmp0, rin, lsr #8; \ + strb rin, [rdst, #((offs) + 0)]; \ + mov rtmp1, rin, lsr #16; \ + strb rtmp0, [rdst, #((offs) + 1)]; \ + mov rtmp0, rin, lsr #24; \ + strb rtmp1, [rdst, #((offs) + 2)]; \ + strb rtmp0, [rdst, #((offs) + 3)]; + +#ifndef __ARMEL__ + /* bswap on big-endian */ + #define host_to_le(reg) \ + rev reg, reg; + #define le_to_host(reg) \ + rev reg, reg; +#else + /* nop on little-endian */ + #define host_to_le(reg) /*_*/ + #define le_to_host(reg) /*_*/ +#endif + +#define ldr_input_aligned_le(rin, a, b, c, d) \ + ldr a, [rin, #0]; \ + ldr b, [rin, #4]; \ + le_to_host(a); \ + ldr c, [rin, #8]; \ + le_to_host(b); \ + ldr d, [rin, #12]; \ + le_to_host(c); \ + le_to_host(d); + +#define str_output_aligned_le(rout, a, b, c, d) \ + le_to_host(a); \ + le_to_host(b); \ + str a, [rout, #0]; \ + le_to_host(c); \ + str b, [rout, #4]; 
\ + le_to_host(d); \ + str c, [rout, #8]; \ + str d, [rout, #12]; + +#ifdef __ARM_FEATURE_UNALIGNED + /* unaligned word reads/writes allowed */ + #define ldr_input_le(rin, ra, rb, rc, rd, rtmp) \ + ldr_input_aligned_le(rin, ra, rb, rc, rd) + + #define str_output_le(rout, ra, rb, rc, rd, rtmp0, rtmp1) \ + str_output_aligned_le(rout, ra, rb, rc, rd) +#else + /* need to handle unaligned reads/writes by byte reads */ + #define ldr_input_le(rin, ra, rb, rc, rd, rtmp0) \ + tst rin, #3; \ + beq 1f; \ + ldr_unaligned_le(ra, rin, 0, rtmp0); \ + ldr_unaligned_le(rb, rin, 4, rtmp0); \ + ldr_unaligned_le(rc, rin, 8, rtmp0); \ + ldr_unaligned_le(rd, rin, 12, rtmp0); \ + b 2f; \ + 1:;\ + ldr_input_aligned_le(rin, ra, rb, rc, rd); \ + 2:; + + #define str_output_le(rout, ra, rb, rc, rd, rtmp0, rtmp1) \ + tst rout, #3; \ + beq 1f; \ + str_unaligned_le(ra, rout, 0, rtmp0, rtmp1); \ + str_unaligned_le(rb, rout, 4, rtmp0, rtmp1); \ + str_unaligned_le(rc, rout, 8, rtmp0, rtmp1); \ + str_unaligned_le(rd, rout, 12, rtmp0, rtmp1); \ + b 2f; \ + 1:;\ + str_output_aligned_le(rout, ra, rb, rc, rd); \ + 2:; +#endif + +/********************************************************************** + 1-way twofish + **********************************************************************/ +#define encrypt_round(a, b, rc, rd, n, ror_a, adj_a) \ + and RT0, RMASK, b, lsr#(8 - 2); \ + and RY, RMASK, b, lsr#(16 - 2); \ + add RT0, RT0, #(s2 - s1); \ + and RT1, RMASK, b, lsr#(24 - 2); \ + ldr RY, [CTXs3, RY]; \ + and RT2, RMASK, b, lsl#(2); \ + ldr RT0, [CTXs1, RT0]; \ + and RT3, RMASK, a, lsr#(16 - 2 + (adj_a)); \ + ldr RT1, [CTXs0, RT1]; \ + and RX, RMASK, a, lsr#(8 - 2 + (adj_a)); \ + ldr RT2, [CTXs1, RT2]; \ + add RT3, RT3, #(s2 - s1); \ + ldr RX, [CTXs1, RX]; \ + ror_a(a); \ + \ + eor RY, RY, RT0; \ + ldr RT3, [CTXs1, RT3]; \ + and RT0, RMASK, a, lsl#(2); \ + eor RY, RY, RT1; \ + and RT1, RMASK, a, lsr#(24 - 2); \ + eor RY, RY, RT2; \ + ldr RT0, [CTXs0, RT0]; \ + eor RX, RX, RT3; \ + ldr RT1, [CTXs3, RT1]; 
\ + eor RX, RX, RT0; \ + \ + ldr RT3, [CTXs3, #(k - s3 + 8 * (n) + 4)]; \ + eor RX, RX, RT1; \ + ldr RT2, [CTXs3, #(k - s3 + 8 * (n))]; \ + \ + add RT0, RX, RY, lsl #1; \ + add RX, RX, RY; \ + add RT0, RT0, RT3; \ + add RX, RX, RT2; \ + eor rd, RT0, rd, ror #31; \ + eor rc, rc, RX; + +#define dummy(x) /*_*/ + +#define ror1(r) \ + ror r, r, #1; + +#define decrypt_round(a, b, rc, rd, n, ror_b, adj_b) \ + and RT3, RMASK, b, lsl#(2 - (adj_b)); \ + and RT1, RMASK, b, lsr#(8 - 2 + (adj_b)); \ + ror_b(b); \ + and RT2, RMASK, a, lsl#(2); \ + and RT0, RMASK, a, lsr#(8 - 2); \ + \ + ldr RY, [CTXs1, RT3]; \ + add RT1, RT1, #(s2 - s1); \ + ldr RX, [CTXs0, RT2]; \ + and RT3, RMASK, b, lsr#(16 - 2); \ + ldr RT1, [CTXs1, RT1]; \ + and RT2, RMASK, a, lsr#(16 - 2); \ + ldr RT0, [CTXs1, RT0]; \ + \ + add RT2, RT2, #(s2 - s1); \ + ldr RT3, [CTXs3, RT3]; \ + eor RY, RY, RT1; \ + \ + and RT1, RMASK, b, lsr#(24 - 2); \ + eor RX, RX, RT0; \ + ldr RT2, [CTXs1, RT2]; \ + and RT0, RMASK, a, lsr#(24 - 2); \ + \ + ldr RT1, [CTXs0, RT1]; \ + \ + eor RY, RY, RT3; \ + ldr RT0, [CTXs3, RT0]; \ + eor RX, RX, RT2; \ + eor RY, RY, RT1; \ + \ + ldr RT1, [CTXs3, #(k - s3 + 8 * (n) + 4)]; \ + eor RX, RX, RT0; \ + ldr RT2, [CTXs3, #(k - s3 + 8 * (n))]; \ + \ + add RT0, RX, RY, lsl #1; \ + add RX, RX, RY; \ + add RT0, RT0, RT1; \ + add RX, RX, RT2; \ + eor rd, rd, RT0; \ + eor rc, RX, rc, ror #31; + +#define first_encrypt_cycle(nc) \ + encrypt_round(RA, RB, RC, RD, (nc) * 2, dummy, 0); \ + encrypt_round(RC, RD, RA, RB, (nc) * 2 + 1, ror1, 1); + +#define encrypt_cycle(nc) \ + encrypt_round(RA, RB, RC, RD, (nc) * 2, ror1, 1); \ + encrypt_round(RC, RD, RA, RB, (nc) * 2 + 1, ror1, 1); + +#define last_encrypt_cycle(nc) \ + encrypt_round(RA, RB, RC, RD, (nc) * 2, ror1, 1); \ + encrypt_round(RC, RD, RA, RB, (nc) * 2 + 1, ror1, 1); \ + ror1(RA); + +#define first_decrypt_cycle(nc) \ + decrypt_round(RC, RD, RA, RB, (nc) * 2 + 1, dummy, 0); \ + decrypt_round(RA, RB, RC, RD, (nc) * 2, ror1, 1); + +#define 
decrypt_cycle(nc) \ + decrypt_round(RC, RD, RA, RB, (nc) * 2 + 1, ror1, 1); \ + decrypt_round(RA, RB, RC, RD, (nc) * 2, ror1, 1); + +#define last_decrypt_cycle(nc) \ + decrypt_round(RC, RD, RA, RB, (nc) * 2 + 1, ror1, 1); \ + decrypt_round(RA, RB, RC, RD, (nc) * 2, ror1, 1); \ + ror1(RD); + +.align 3 +.global _gcry_twofish_armv6_encrypt_block +.type _gcry_twofish_armv6_encrypt_block,%function; + +_gcry_twofish_armv6_encrypt_block: + /* input: + * %r0: ctx + * %r1: dst + * %r2: src + */ + push {%r1, %r4-%r11, %ip, %lr}; + + add RY, CTXs0, #w; + + ldr_input_le(%r2, RA, RB, RC, RD, RT0); + + /* Input whitening */ + ldm RY, {RT0, RT1, RT2, RT3}; + add CTXs3, CTXs0, #(s3 - s0); + add CTXs1, CTXs0, #(s1 - s0); + mov RMASK, #(0xff << 2); + eor RA, RA, RT0; + eor RB, RB, RT1; + eor RC, RC, RT2; + eor RD, RD, RT3; + + first_encrypt_cycle(0); + encrypt_cycle(1); + encrypt_cycle(2); + encrypt_cycle(3); + encrypt_cycle(4); + encrypt_cycle(5); + encrypt_cycle(6); + last_encrypt_cycle(7); + + add RY, CTXs3, #(w + 4*4 - s3); + pop {%r1}; /* dst */ + + /* Output whitening */ + ldm RY, {RT0, RT1, RT2, RT3}; + eor RC, RC, RT0; + eor RD, RD, RT1; + eor RA, RA, RT2; + eor RB, RB, RT3; + + str_output_le(%r1, RC, RD, RA, RB, RT0, RT1); + + pop {%r4-%r11, %ip, %lr}; + bx %lr; +.ltorg +.size _gcry_twofish_armv6_encrypt_block,.-_gcry_twofish_armv6_encrypt_block; + +.align 3 +.global _gcry_twofish_armv6_decrypt_block +.type _gcry_twofish_armv6_decrypt_block,%function; + +_gcry_twofish_armv6_decrypt_block: + /* input: + * %r0: ctx + * %r1: dst + * %r2: src + */ + push {%r1, %r4-%r11, %ip, %lr}; + + add CTXs3, CTXs0, #(s3 - s0); + + ldr_input_le(%r2, RC, RD, RA, RB, RT0); + + add RY, CTXs3, #(w + 4*4 - s3); + add CTXs3, CTXs0, #(s3 - s0); + + /* Input whitening */ + ldm RY, {RT0, RT1, RT2, RT3}; + add CTXs1, CTXs0, #(s1 - s0); + mov RMASK, #(0xff << 2); + eor RC, RC, RT0; + eor RD, RD, RT1; + eor RA, RA, RT2; + eor RB, RB, RT3; + + first_decrypt_cycle(7); + decrypt_cycle(6); + 
decrypt_cycle(5); + decrypt_cycle(4); + decrypt_cycle(3); + decrypt_cycle(2); + decrypt_cycle(1); + last_decrypt_cycle(0); + + add RY, CTXs0, #w; + pop {%r1}; /* dst */ + + /* Output whitening */ + ldm RY, {RT0, RT1, RT2, RT3}; + eor RA, RA, RT0; + eor RB, RB, RT1; + eor RC, RC, RT2; + eor RD, RD, RT3; + + str_output_le(%r1, RA, RB, RC, RD, RT0, RT1); + + pop {%r4-%r11, %ip, %lr}; + bx %lr; +.size _gcry_twofish_armv6_decrypt_block,.-_gcry_twofish_armv6_decrypt_block; + +#endif /*HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS*/ +#endif /*__ARM_ARCH >= 6*/ diff --git a/cipher/twofish.c b/cipher/twofish.c index 993ad0f..d2cabbe 100644 --- a/cipher/twofish.c +++ b/cipher/twofish.c @@ -57,6 +57,14 @@ # define USE_AMD64_ASM 1 #endif +/* USE_ARMV6_ASM indicates whether to use ARMv6 assembly code. */ +#undef USE_ARMV6_ASM +#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) +# if defined(HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS) +# define USE_ARMV6_ASM 1 +# endif +#endif + /* Prototype for the self-test function. */ static const char *selftest(void); @@ -746,7 +754,16 @@ extern void _gcry_twofish_amd64_cbc_dec(const TWOFISH_context *c, byte *out, extern void _gcry_twofish_amd64_cfb_dec(const TWOFISH_context *c, byte *out, const byte *in, byte *iv); -#else /*!USE_AMD64_ASM*/ +#elif defined(USE_ARMV6_ASM) + +/* Assembly implementations of Twofish. */ +extern void _gcry_twofish_armv6_encrypt_block(const TWOFISH_context *c, + byte *out, const byte *in); + +extern void _gcry_twofish_armv6_decrypt_block(const TWOFISH_context *c, + byte *out, const byte *in); + +#else /*!USE_AMD64_ASM && !USE_ARMV6_ASM*/ /* Macros to compute the g() function in the encryption and decryption * rounds. 
G1 is the straight g() function; G2 includes the 8-bit @@ -812,21 +829,25 @@ extern void _gcry_twofish_amd64_cfb_dec(const TWOFISH_context *c, byte *out, #ifdef USE_AMD64_ASM -static void -do_twofish_encrypt (const TWOFISH_context *ctx, byte *out, const byte *in) +static unsigned int +twofish_encrypt (void *context, byte *out, const byte *in) { + TWOFISH_context *ctx = context; _gcry_twofish_amd64_encrypt_block(ctx, out, in); + return /*burn_stack*/ (4*sizeof (void*)); } +#elif defined(USE_ARMV6_ASM) + static unsigned int twofish_encrypt (void *context, byte *out, const byte *in) { TWOFISH_context *ctx = context; - _gcry_twofish_amd64_encrypt_block(ctx, out, in); + _gcry_twofish_armv6_encrypt_block(ctx, out, in); return /*burn_stack*/ (4*sizeof (void*)); } -#else /*!USE_AMD64_ASM*/ +#else /*!USE_AMD64_ASM && !USE_ARMV6_ASM*/ static void do_twofish_encrypt (const TWOFISH_context *ctx, byte *out, const byte *in) @@ -868,28 +889,32 @@ twofish_encrypt (void *context, byte *out, const byte *in) return /*burn_stack*/ (24+3*sizeof (void*)); } -#endif /*!USE_AMD64_ASM*/ +#endif /*!USE_AMD64_ASM && !USE_ARMV6_ASM*/ /* Decrypt one block. in and out may be the same. 
*/ #ifdef USE_AMD64_ASM -static void -do_twofish_decrypt (const TWOFISH_context *ctx, byte *out, const byte *in) +static unsigned int +twofish_decrypt (void *context, byte *out, const byte *in) { + TWOFISH_context *ctx = context; _gcry_twofish_amd64_decrypt_block(ctx, out, in); + return /*burn_stack*/ (4*sizeof (void*)); } +#elif defined(USE_ARMV6_ASM) + static unsigned int twofish_decrypt (void *context, byte *out, const byte *in) { TWOFISH_context *ctx = context; - _gcry_twofish_amd64_decrypt_block(ctx, out, in); + _gcry_twofish_armv6_decrypt_block(ctx, out, in); return /*burn_stack*/ (4*sizeof (void*)); } -#else /*!USE_AMD64_ASM*/ +#else /*!USE_AMD64_ASM && !USE_ARMV6_ASM*/ static void do_twofish_decrypt (const TWOFISH_context *ctx, byte *out, const byte *in) @@ -932,7 +957,7 @@ twofish_decrypt (void *context, byte *out, const byte *in) return /*burn_stack*/ (24+3*sizeof (void*)); } -#endif /*!USE_AMD64_ASM*/ +#endif /*!USE_AMD64_ASM && !USE_ARMV6_ASM*/ @@ -947,14 +972,11 @@ _gcry_twofish_ctr_enc(void *context, unsigned char *ctr, void *outbuf_arg, unsigned char *outbuf = outbuf_arg; const unsigned char *inbuf = inbuf_arg; unsigned char tmpbuf[TWOFISH_BLOCKSIZE]; - int burn_stack_depth = 24 + 3 * sizeof (void*); + unsigned int burn, burn_stack_depth = 0; int i; #ifdef USE_AMD64_ASM { - if (nblocks >= 3 && burn_stack_depth < 8 * sizeof(void*)) - burn_stack_depth = 8 * sizeof(void*); - /* Process data in 3 block chunks. */ while (nblocks >= 3) { @@ -963,6 +985,10 @@ _gcry_twofish_ctr_enc(void *context, unsigned char *ctr, void *outbuf_arg, nblocks -= 3; outbuf += 3 * TWOFISH_BLOCKSIZE; inbuf += 3 * TWOFISH_BLOCKSIZE; + + burn = 8 * sizeof(void*); + if (burn > burn_stack_depth) + burn_stack_depth = burn; } /* Use generic code to handle smaller chunks... */ @@ -973,7 +999,10 @@ _gcry_twofish_ctr_enc(void *context, unsigned char *ctr, void *outbuf_arg, for ( ;nblocks; nblocks-- ) { /* Encrypt the counter. 
*/ - do_twofish_encrypt(ctx, tmpbuf, ctr); + burn = twofish_encrypt(ctx, tmpbuf, ctr); + if (burn > burn_stack_depth) + burn_stack_depth = burn; + /* XOR the input with the encrypted counter and store in output. */ buf_xor(outbuf, tmpbuf, inbuf, TWOFISH_BLOCKSIZE); outbuf += TWOFISH_BLOCKSIZE; @@ -1002,13 +1031,10 @@ _gcry_twofish_cbc_dec(void *context, unsigned char *iv, void *outbuf_arg, unsigned char *outbuf = outbuf_arg; const unsigned char *inbuf = inbuf_arg; unsigned char savebuf[TWOFISH_BLOCKSIZE]; - int burn_stack_depth = 24 + 3 * sizeof (void*); + unsigned int burn, burn_stack_depth = 0; #ifdef USE_AMD64_ASM { - if (nblocks >= 3 && burn_stack_depth < 9 * sizeof(void*)) - burn_stack_depth = 9 * sizeof(void*); - /* Process data in 3 block chunks. */ while (nblocks >= 3) { @@ -1017,6 +1043,10 @@ _gcry_twofish_cbc_dec(void *context, unsigned char *iv, void *outbuf_arg, nblocks -= 3; outbuf += 3 * TWOFISH_BLOCKSIZE; inbuf += 3 * TWOFISH_BLOCKSIZE; + + burn = 9 * sizeof(void*); + if (burn > burn_stack_depth) + burn_stack_depth = burn; } /* Use generic code to handle smaller chunks... */ @@ -1029,7 +1059,9 @@ _gcry_twofish_cbc_dec(void *context, unsigned char *iv, void *outbuf_arg, OUTBUF. */ memcpy(savebuf, inbuf, TWOFISH_BLOCKSIZE); - do_twofish_decrypt (ctx, outbuf, inbuf); + burn = twofish_decrypt (ctx, outbuf, inbuf); + if (burn > burn_stack_depth) + burn_stack_depth = burn; buf_xor(outbuf, outbuf, iv, TWOFISH_BLOCKSIZE); memcpy(iv, savebuf, TWOFISH_BLOCKSIZE); @@ -1051,13 +1083,10 @@ _gcry_twofish_cfb_dec(void *context, unsigned char *iv, void *outbuf_arg, TWOFISH_context *ctx = context; unsigned char *outbuf = outbuf_arg; const unsigned char *inbuf = inbuf_arg; - int burn_stack_depth = 24 + 3 * sizeof (void*); + unsigned int burn, burn_stack_depth = 0; #ifdef USE_AMD64_ASM { - if (nblocks >= 3 && burn_stack_depth < 8 * sizeof(void*)) - burn_stack_depth = 8 * sizeof(void*); - /* Process data in 3 block chunks. 
*/ while (nblocks >= 3) { @@ -1066,6 +1095,10 @@ _gcry_twofish_cfb_dec(void *context, unsigned char *iv, void *outbuf_arg, nblocks -= 3; outbuf += 3 * TWOFISH_BLOCKSIZE; inbuf += 3 * TWOFISH_BLOCKSIZE; + + burn = 8 * sizeof(void*); + if (burn > burn_stack_depth) + burn_stack_depth = burn; } /* Use generic code to handle smaller chunks... */ @@ -1074,7 +1107,10 @@ _gcry_twofish_cfb_dec(void *context, unsigned char *iv, void *outbuf_arg, for ( ;nblocks; nblocks-- ) { - do_twofish_encrypt(ctx, iv, iv); + burn = twofish_encrypt(ctx, iv, iv); + if (burn > burn_stack_depth) + burn_stack_depth = burn; + buf_xor_n_copy(outbuf, iv, inbuf, TWOFISH_BLOCKSIZE); outbuf += TWOFISH_BLOCKSIZE; inbuf += TWOFISH_BLOCKSIZE; diff --git a/configure.ac b/configure.ac index a803b5f..66fb6b9 100644 --- a/configure.ac +++ b/configure.ac @@ -1479,6 +1479,10 @@ if test "$found" = "1" ; then # Build with the assembly implementation GCRYPT_CIPHERS="$GCRYPT_CIPHERS twofish-amd64.lo" ;; + arm*-*-*) + # Build with the assembly implementation + GCRYPT_CIPHERS="$GCRYPT_CIPHERS twofish-armv6.lo" + ;; esac fi From dbaryshkov at gmail.com Mon Oct 21 17:35:53 2013 From: dbaryshkov at gmail.com (Dmitry Eremin-Solenikov) Date: Mon, 21 Oct 2013 19:35:53 +0400 Subject: [PATCH] twofish: add ARMv6 assembly implementation In-Reply-To: <20131021132121.23604.99408.stgit@localhost6.localdomain6> References: <20131021132121.23604.99408.stgit@localhost6.localdomain6> Message-ID: Hello, On Mon, Oct 21, 2013 at 5:21 PM, Jussi Kivilinna wrote: > * cipher/Makefile.am: Add 'twofish-armv6.S'. > * cipher/twofish-armv6.S: New. > * cipher/twofish.c (USE_ARMV6_ASM): New macro. > [USE_ARMV6_ASM] (_gcry_twofish_armv6_encrypt_block) > (_gcry_twofish_armv6_decrypt_block): New prototypes. > [USE_AMDV6_ASM] (twofish_encrypt, twofish_decrypt): Add. > [USE_AMD64_ASM] (do_twofish_encrypt, do_twofish_decrypt): Remove. > (_gcry_twofish_ctr_enc, _gcry_twofish_cfb_dec): Use 'twofish_encrypt' > instead of 'do_twofish_encrypt'. 
> (_gcry_twofish_cbc_dec): Use 'twofish_decrypt' instead of > 'do_twofish_decrypt'. > * configure.ac [arm]: Add 'twofish-armv6.lo'. Some time ago I looked into adapting the asm optimizations to earlier ARM cores. The main problem was the rev instruction, which I conditionally replaced with 4 instructions. My current code is present at https://github.com/GostCrypt/libgcrypt/commits/arm-opt . The main remaining issue is the HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS test in configure.ac. It is too restrictive. E.g. it fails if gcc is configured to default to armv4te. Also it has a .thumb directive, although all assembler files are compiled in ARM mode. If I comment out the add.w and .thumb/.code16 lines, I can build the code even for armv4 (no thumb), qemu-verified. Code built for armv4t/armv5te was successfully verified on an armv5te core (XScale). Would you have any suggestions on improving/adapting this configure test? -- With best wishes Dmitry From wk at gnupg.org Mon Oct 21 17:34:40 2013 From: wk at gnupg.org (Werner Koch) Date: Mon, 21 Oct 2013 17:34:40 +0200 Subject: ECDSA for Edwards curve In-Reply-To: <87sivuewep.fsf@vigenere.g10code.de> (Werner Koch's message of "Mon, 21 Oct 2013 12:43:58 +0200") References: <1381867004-21231-1-git-send-email-dbaryshkov@gmail.com> <1381867004-21231-3-git-send-email-dbaryshkov@gmail.com> <87ppr5jnbm.fsf@vigenere.g10code.de> <87iowwiefw.fsf@vigenere.g10code.de> <1382341573.3497.1.camel@cfw2.gniibe.org> <87sivuewep.fsf@vigenere.g10code.de> Message-ID: <878uxmeiy7.fsf@vigenere.g10code.de> On Mon, 21 Oct 2013 12:43, wk at gnupg.org said: > Or forget about Ed25519 and use P-256 directly? Needs to be discussed > with the GNUnet folks. There is an easier way to do that. We only have 255 bits, so there is one spare bit to represent the sign. This is similar to what EdDSA does, pretty easy and sufficient for the GNUNET case. Shalom-Salam, Werner -- Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz.
From wk at gnupg.org Mon Oct 21 17:41:37 2013 From: wk at gnupg.org (Werner Koch) Date: Mon, 21 Oct 2013 17:41:37 +0200 Subject: [PATCH 1/2] [v2] Add API to support AEAD cipher modes In-Reply-To: <52650232.7080009@iki.fi> (Jussi Kivilinna's message of "Mon, 21 Oct 2013 13:30:10 +0300") References: <20131020120313.21970.15918.stgit@localhost6.localdomain6> <52650232.7080009@iki.fi> Message-ID: <874n8aeimm.fsf@vigenere.g10code.de> On Mon, 21 Oct 2013 12:30, jussi.kivilinna at iki.fi said: > Ok, I changed API back to _authenticate() and added GCRYCTL_SET_CCM_PARAMS I still think that _setaad is a better symbol name than _authenticate. The use of new GCRYCTL codes should be limited because it bypasses type checking. However, for the length parameter of CCM, I think the use of GCRYCTL_SET_CCM_LENGTH is fine. Salam-Shalom, Werner -- Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz. From jussi.kivilinna at iki.fi Mon Oct 21 21:34:55 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Mon, 21 Oct 2013 22:34:55 +0300 Subject: [PATCH] twofish: add ARMv6 assembly implementation In-Reply-To: References: <20131021132121.23604.99408.stgit@localhost6.localdomain6> Message-ID: <526581DF.6070308@iki.fi> On 21.10.2013 18:35, Dmitry Eremin-Solenikov wrote: > Hello, > > On Mon, Oct 21, 2013 at 5:21 PM, Jussi Kivilinna wrote: >> * cipher/Makefile.am: Add 'twofish-armv6.S'. >> * cipher/twofish-armv6.S: New. >> * cipher/twofish.c (USE_ARMV6_ASM): New macro. >> [USE_ARMV6_ASM] (_gcry_twofish_armv6_encrypt_block) >> (_gcry_twofish_armv6_decrypt_block): New prototypes. >> [USE_AMDV6_ASM] (twofish_encrypt, twofish_decrypt): Add. >> [USE_AMD64_ASM] (do_twofish_encrypt, do_twofish_decrypt): Remove. >> (_gcry_twofish_ctr_enc, _gcry_twofish_cfb_dec): Use 'twofish_encrypt' >> instead of 'do_twofish_encrypt'. >> (_gcry_twofish_cbc_dec): Use 'twofish_decrypt' instead of >> 'do_twofish_decrypt'. >> * configure.ac [arm]: Add 'twofish-armv6.lo'.
> > Some time ago I have looked into adapting asm optimizations to earlier > ARM cores. > The main problem was rev instruction, which I conditionally replaced > with 4 insns. > My current code is present at > https://github.com/GostCrypt/libgcrypt/commits/arm-opt . Nice. > > The main remaining issue is HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS test > in configure.ac. It is too restrictive. E.g. it fails if gcc is > configured to default to armv4te. > Also it has .thumb directive, however all assembler files are compiled > in arm mode. > If I comment out add.w and ,thumb/.code16 lines, I can build code even for armv4 > (no thumb), qemu-verified. Code built for armv4t/armv5te successfully > verified on armv5te > core (XScale). Would you have any suggestions on improving/adapting > this configure test? Well, what I try to do here is to check that '.syntax unified' works and is not ignored by the assembler. But if the asm code works even when '.syntax unified' is ignored, that part of the check can be removed. So the following should/might be enough: AC_CACHE_CHECK([whether GCC assembler is compatible for ARM assembly implementations], [gcry_cv_gcc_arm_platform_as_ok], [gcry_cv_gcc_arm_platform_as_ok=no AC_COMPILE_IFELSE([AC_LANG_SOURCE( [[__asm__( ".syntax unified\n\t" ".arm\n\t" "asmfunc:\n\t" "add %r0, %r4, %r8, ror #12;\n\t" /* Test if '.type' and '.size' are supported.
*/ ".size asmfunc,.-asmfunc;\n\t" ".type asmfunc,%function;\n\t" );]])], [gcry_cv_gcc_arm_platform_as_ok=yes])]) -Jussi From jussi.kivilinna at iki.fi Mon Oct 21 21:40:57 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Mon, 21 Oct 2013 22:40:57 +0300 Subject: [PATCH 1/2] [v2] Add API to support AEAD cipher modes In-Reply-To: <874n8aeimm.fsf@vigenere.g10code.de> References: <20131020120313.21970.15918.stgit@localhost6.localdomain6> <52650232.7080009@iki.fi> <874n8aeimm.fsf@vigenere.g10code.de> Message-ID: <52658349.4050609@iki.fi> On 21.10.2013 18:41, Werner Koch wrote: > On Mon, 21 Oct 2013 12:30, jussi.kivilinna at iki.fi said: > >> Ok, I changed API back to _authenticate() and added GCRYCTL_SET_CCM_PARAMS > > I still think that _setaad is a better symbol name than _authenticate. > I think _authenticate would be better if block cipher based MACs are to be added. So we'd have _authenticate for buffers to MAC modes and AAD buffers to AEAD modes. _setaad makes it should more AEAD specific. > The use of new GCRYCTL codes should be limited because it bypasses type > checking. However, for the length paramteer of CCM, I think the use of > GCRYCTL_SET_CCM_LENGTH is fine. Ok. -Jussi > > > Salam-Shalom, > > Werner > From gniibe at fsij.org Tue Oct 22 07:24:55 2013 From: gniibe at fsij.org (NIIBE Yutaka) Date: Tue, 22 Oct 2013 14:24:55 +0900 Subject: ECDSA for Edwards curve In-Reply-To: <878uxmeiy7.fsf@vigenere.g10code.de> References: <1381867004-21231-1-git-send-email-dbaryshkov@gmail.com> <1381867004-21231-3-git-send-email-dbaryshkov@gmail.com> <87ppr5jnbm.fsf@vigenere.g10code.de> <87iowwiefw.fsf@vigenere.g10code.de> <1382341573.3497.1.camel@cfw2.gniibe.org> <87sivuewep.fsf@vigenere.g10code.de> <878uxmeiy7.fsf@vigenere.g10code.de> Message-ID: <1382419495.5358.2.camel@cfw2.gniibe.org> On 2013-10-21 at 17:34 +0200, Werner Koch wrote: > There is an easier way to do that. We only have 255 bit thus the there > is one spare bit to represent the sign. 
This is similar to what EdDSA > does, pretty easy and sufficient for the GNUNET case. Aside from how to encode/decode the point, here is the fix to get compliant key. This fixes the failure of keygen program. diff --git a/cipher/ecc.c b/cipher/ecc.c index 6f3cbbd..2774718 100644 --- a/cipher/ecc.c +++ b/cipher/ecc.c @@ -178,27 +178,33 @@ nist_generate_key (ECC_secret_key *sk, elliptic_curve_t *E, mpi_ec_t ctx, * dropped because we know that it's a minimum of the two * possibilities without any loss of security. */ { - gcry_mpi_t x, y, p_y; + gcry_mpi_t x, y, negative; const unsigned int pbits = mpi_get_nbits (E->p); x = mpi_new (pbits); y = mpi_new (pbits); - p_y = mpi_new (pbits); + negative = mpi_new (pbits); if (_gcry_mpi_ec_get_affine (x, y, &Q, ctx)) log_fatal ("ecgen: Failed to get affine coordinates for %s\n", "Q"); - mpi_sub (p_y, E->p, y); /* p_y = p - y */ + if (E->model == MPI_EC_WEIERSTRASS) + mpi_sub (negative, E->p, y); /* negative = p - y */ + else + mpi_sub (negative, E->p, x); /* negative = p - x */ - if (mpi_cmp (p_y, y) < 0) /* p - y < p */ + if (mpi_cmp (negative, y) < 0) /* p - y < p */ { /* We need to end up with -Q; this assures that new Q's y is the smallest one */ mpi_sub (sk->d, E->n, sk->d); /* d = order - d */ - gcry_mpi_point_snatch_set (&sk->Q, x, p_y, mpi_alloc_set_ui (1)); + if (E->model == MPI_EC_WEIERSTRASS) + gcry_mpi_point_snatch_set (&sk->Q, x, negative, mpi_alloc_set_ui (1)); + else + gcry_mpi_point_snatch_set (&sk->Q, negative, y, mpi_alloc_set_ui (1)); - if (DBG_CIPHER) - log_debug ("ecgen converted Q to a compliant point\n"); + if (DBG_CIPHER) + log_debug ("ecgen converted Q to a compliant point\n"); } else /* p - y >= p */ { @@ -207,10 +213,17 @@ nist_generate_key (ECC_secret_key *sk, elliptic_curve_t *E, mpi_ec_t ctx, if (DBG_CIPHER) log_debug ("ecgen didn't need to convert Q to a compliant point\n"); - mpi_free (p_y); - mpi_free (x); + mpi_free (negative); + if (E->model == MPI_EC_WEIERSTRASS) + mpi_free (x); + else + 
mpi_free (y); } - mpi_free (y); + + if (E->model == MPI_EC_WEIERSTRASS) + mpi_free (y); + else + mpi_free (x); } /* Now we can test our keys (this should never fail!). */ -- From wk at gnupg.org Tue Oct 22 09:17:46 2013 From: wk at gnupg.org (Werner Koch) Date: Tue, 22 Oct 2013 09:17:46 +0200 Subject: [PATCH 1/2] [v2] Add API to support AEAD cipher modes In-Reply-To: <52658349.4050609@iki.fi> (Jussi Kivilinna's message of "Mon, 21 Oct 2013 22:40:57 +0300") References: <20131020120313.21970.15918.stgit@localhost6.localdomain6> <52650232.7080009@iki.fi> <874n8aeimm.fsf@vigenere.g10code.de> <52658349.4050609@iki.fi> Message-ID: <87mwm1dbad.fsf@vigenere.g10code.de> On Mon, 21 Oct 2013 21:40, jussi.kivilinna at iki.fi said: > I think _authenticate would be better if block cipher based MACs are to be > added. So we'd have _authenticate for buffers to MAC modes and AAD buffers > to AEAD modes. _setaad makes it should more AEAD specific. I am not convinced, but I am okay with _authenticate. Salam-Shalom, Werner -- Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz. From cvs at cvs.gnupg.org Tue Oct 22 12:48:58 2013 From: cvs at cvs.gnupg.org (by NIIBE Yutaka) Date: Tue, 22 Oct 2013 12:48:58 +0200 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-319-ga5a277a Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via a5a277a9016ccb34f1858a65e0ed1791b2fc3db3 (commit) from f7711e6eb5f02d03c74911f6f037ab28075e7c0d (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. 
- Log ----------------------------------------------------------------- commit a5a277a9016ccb34f1858a65e0ed1791b2fc3db3 Author: NIIBE Yutaka Date: Tue Oct 22 12:47:11 2013 +0200 ecc: Correct compliant key generation for Edwards curves. * cipher/ecc.c: Add case for Edwards curves. Signed-off-by: Werner Koch diff --git a/cipher/ecc.c b/cipher/ecc.c index 6f3cbbd..2774718 100644 --- a/cipher/ecc.c +++ b/cipher/ecc.c @@ -178,27 +178,33 @@ nist_generate_key (ECC_secret_key *sk, elliptic_curve_t *E, mpi_ec_t ctx, * dropped because we know that it's a minimum of the two * possibilities without any loss of security. */ { - gcry_mpi_t x, y, p_y; + gcry_mpi_t x, y, negative; const unsigned int pbits = mpi_get_nbits (E->p); x = mpi_new (pbits); y = mpi_new (pbits); - p_y = mpi_new (pbits); + negative = mpi_new (pbits); if (_gcry_mpi_ec_get_affine (x, y, &Q, ctx)) log_fatal ("ecgen: Failed to get affine coordinates for %s\n", "Q"); - mpi_sub (p_y, E->p, y); /* p_y = p - y */ + if (E->model == MPI_EC_WEIERSTRASS) + mpi_sub (negative, E->p, y); /* negative = p - y */ + else + mpi_sub (negative, E->p, x); /* negative = p - x */ - if (mpi_cmp (p_y, y) < 0) /* p - y < p */ + if (mpi_cmp (negative, y) < 0) /* p - y < p */ { /* We need to end up with -Q; this assures that new Q's y is the smallest one */ mpi_sub (sk->d, E->n, sk->d); /* d = order - d */ - gcry_mpi_point_snatch_set (&sk->Q, x, p_y, mpi_alloc_set_ui (1)); + if (E->model == MPI_EC_WEIERSTRASS) + gcry_mpi_point_snatch_set (&sk->Q, x, negative, mpi_alloc_set_ui (1)); + else + gcry_mpi_point_snatch_set (&sk->Q, negative, y, mpi_alloc_set_ui (1)); - if (DBG_CIPHER) - log_debug ("ecgen converted Q to a compliant point\n"); + if (DBG_CIPHER) + log_debug ("ecgen converted Q to a compliant point\n"); } else /* p - y >= p */ { @@ -207,10 +213,17 @@ nist_generate_key (ECC_secret_key *sk, elliptic_curve_t *E, mpi_ec_t ctx, if (DBG_CIPHER) log_debug ("ecgen didn't need to convert Q to a compliant point\n"); - mpi_free (p_y); - 
mpi_free (x); + mpi_free (negative); + if (E->model == MPI_EC_WEIERSTRASS) + mpi_free (x); + else + mpi_free (y); } - mpi_free (y); + + if (E->model == MPI_EC_WEIERSTRASS) + mpi_free (y); + else + mpi_free (x); } /* Now we can test our keys (this should never fail!). */ ----------------------------------------------------------------------- Summary of changes: cipher/ecc.c | 33 +++++++++++++++++++++++---------- 1 file changed, 23 insertions(+), 10 deletions(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From wk at gnupg.org Tue Oct 22 12:44:14 2013 From: wk at gnupg.org (Werner Koch) Date: Tue, 22 Oct 2013 12:44:14 +0200 Subject: ECDSA for Edwards curve In-Reply-To: <1382419495.5358.2.camel@cfw2.gniibe.org> (NIIBE Yutaka's message of "Tue, 22 Oct 2013 14:24:55 +0900") References: <1381867004-21231-1-git-send-email-dbaryshkov@gmail.com> <1381867004-21231-3-git-send-email-dbaryshkov@gmail.com> <87ppr5jnbm.fsf@vigenere.g10code.de> <87iowwiefw.fsf@vigenere.g10code.de> <1382341573.3497.1.camel@cfw2.gniibe.org> <87sivuewep.fsf@vigenere.g10code.de> <878uxmeiy7.fsf@vigenere.g10code.de> <1382419495.5358.2.camel@cfw2.gniibe.org> Message-ID: <8738ntd1q9.fsf@vigenere.g10code.de> On Tue, 22 Oct 2013 07:24, gniibe at fsij.org said: > Aside from how to encode/decode the point, here is the fix to get > compliant key. This fixes the failure of keygen program. Thanks. Salam-Shalom, Werner -- Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz. 
From jussi.kivilinna at iki.fi Tue Oct 22 14:32:02 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Tue, 22 Oct 2013 15:32:02 +0300 Subject: [PATCH 1/2] [v3] Add API to support AEAD cipher modes Message-ID: <20131022123202.6563.15226.stgit@localhost6.localdomain6> * cipher/cipher.c (_gcry_cipher_authenticate, _gcry_cipher_checktag) (_gcry_cipher_gettag): New. * doc/gcrypt.texi: Add documentation for new API functions. * src/visibility.c (gcry_cipher_authenticate, gcry_cipher_checktag) (gcry_cipher_gettag): New. * src/gcrypt.h.in, src/visibility.h: Add declarations of these functions. * src/libgcrypt.def, src/libgcrypt.vers: Export functions. -- Authenticated Encryption with Associated Data (AEAD) cipher modes provide an authentication tag that can be used to authenticate the message. At the same time they allow one to specify additional (unencrypted) data that will be authenticated together with the message. This class of cipher modes requires the additional API added in this commit. This patch is based on an original patch by Dmitry Eremin-Solenikov. Changes in v2: - Change gcry_cipher_tag to gcry_cipher_checktag and gcry_cipher_gettag for giving the tag (checktag) for decryption and reading the tag (gettag) after encryption. - Change gcry_cipher_authenticate to gcry_cipher_setaad, since additional parameters are needed for some AEAD modes (in this case CCM, which needs the length of encrypted data and tag for MAC initialization). - Add some documentation. Changes in v3: - Change gcry_cipher_setaad back to gcry_cipher_authenticate. Additional parameters (encrypt_len, tag_len, aad_len) for CCM will be given through GCRYCTL_SET_CCM_LENGTHS.
Signed-off-by: Jussi Kivilinna --- cipher/cipher.c | 34 ++++++++++++++++++++++++++++++++++ doc/gcrypt.texi | 35 +++++++++++++++++++++++++++++++++++ src/gcrypt.h.in | 11 +++++++++++ src/libgcrypt.def | 3 +++ src/libgcrypt.vers | 1 + src/visibility.c | 27 +++++++++++++++++++++++++++ src/visibility.h | 9 +++++++++ 7 files changed, 120 insertions(+) diff --git a/cipher/cipher.c b/cipher/cipher.c index 75d42d1..36c79db 100644 --- a/cipher/cipher.c +++ b/cipher/cipher.c @@ -910,6 +910,40 @@ _gcry_cipher_setctr (gcry_cipher_hd_t hd, const void *ctr, size_t ctrlen) return 0; } +gcry_error_t +_gcry_cipher_authenticate (gcry_cipher_hd_t hd, const void *abuf, + size_t abuflen) +{ + log_fatal ("gcry_cipher_authenticate: invalid mode %d\n", hd->mode); + + (void)abuf; + (void)abuflen; + + return gpg_error (GPG_ERR_INV_CIPHER_MODE); +} + +gcry_error_t +_gcry_cipher_gettag (gcry_cipher_hd_t hd, void *outtag, size_t taglen) +{ + log_fatal ("gcry_cipher_gettag: invalid mode %d\n", hd->mode); + + (void)outtag; + (void)taglen; + + return gpg_error (GPG_ERR_INV_CIPHER_MODE); +} + +gcry_error_t +_gcry_cipher_checktag (gcry_cipher_hd_t hd, const void *intag, size_t taglen) +{ + log_fatal ("gcry_cipher_checktag: invalid mode %d\n", hd->mode); + + (void)intag; + (void)taglen; + + return gpg_error (GPG_ERR_INV_CIPHER_MODE); +} + gcry_error_t gcry_cipher_ctl( gcry_cipher_hd_t h, int cmd, void *buffer, size_t buflen) diff --git a/doc/gcrypt.texi b/doc/gcrypt.texi index 473c484..0049fa0 100644 --- a/doc/gcrypt.texi +++ b/doc/gcrypt.texi @@ -1731,6 +1731,10 @@ matches the requirement of the selected algorithm and mode. This function is also used with the Salsa20 stream cipher to set or update the required nonce. In this case it needs to be called after setting the key. + +This function is also used with the AEAD cipher modes to set or +update the required nonce. 
+ @end deftypefun @deftypefun gcry_error_t gcry_cipher_setctr (gcry_cipher_hd_t @var{h}, const void *@var{c}, size_t @var{l}) @@ -1750,6 +1754,37 @@ call to gcry_cipher_setkey and clear the initialization vector. Note that gcry_cipher_reset is implemented as a macro. @end deftypefun +Authenticated Encryption with Associated Data (AEAD) block cipher +modes require the handling of the authentication tag and the additional +authenticated data, which can be done by using the following +functions: + + at deftypefun gcry_error_t gcry_cipher_authenticate (gcry_cipher_hd_t @var{h}, const void *@var{abuf}, size_t @var{abuflen}) + +Process the buffer @var{abuf} of length @var{abuflen} as the additional +authenticated data (AAD) for AEAD cipher modes. + + at end deftypefun + + at deftypefun gcry_error_t gcry_cipher_gettag (gcry_cipher_hd_t @var{h}, void *@var{tag}, size_t @var{taglen}) + +This function is used to read the authentication tag after encryption. +The function finalizes and outputs the authentication tag to the buffer + at var{tag} of length @var{taglen} bytes. + + at end deftypefun + + at deftypefun gcry_error_t gcry_cipher_checktag (gcry_cipher_hd_t @var{h}, const void *@var{tag}, size_t @var{taglen}) + +Check the authentication tag after decryption. The authentication +tag is passed as the buffer @var{tag} of length @var{taglen} bytes +and compared to internal authentication tag computed during +decryption. Error code @code{GPG_ERR_CHECKSUM} is returned if +the authentication tag in the buffer @var{tag} does not match +the authentication tag calculated during decryption. + + at end deftypefun + The actual encryption and decryption is done by using one of the following functions. They may be used as often as required to process all the data. 
diff --git a/src/gcrypt.h.in b/src/gcrypt.h.in index 64cc0e4..f0ae927 100644 --- a/src/gcrypt.h.in +++ b/src/gcrypt.h.in @@ -953,6 +953,17 @@ gcry_error_t gcry_cipher_setkey (gcry_cipher_hd_t hd, gcry_error_t gcry_cipher_setiv (gcry_cipher_hd_t hd, const void *iv, size_t ivlen); +/* Provide additional authentication data for AEAD modes/ciphers. */ +gcry_error_t gcry_cipher_authenticate (gcry_cipher_hd_t hd, const void *abuf, + size_t abuflen); + +/* Get authentication tag for AEAD modes/ciphers. */ +gcry_error_t gcry_cipher_gettag (gcry_cipher_hd_t hd, void *outtag, + size_t taglen); + +/* Check authentication tag for AEAD modes/ciphers. */ +gcry_error_t gcry_cipher_checktag (gcry_cipher_hd_t hd, const void *intag, + size_t taglen); /* Reset the handle to the state after open. */ #define gcry_cipher_reset(h) gcry_cipher_ctl ((h), GCRYCTL_RESET, NULL, 0) diff --git a/src/libgcrypt.def b/src/libgcrypt.def index ec0c1e3..64ba370 100644 --- a/src/libgcrypt.def +++ b/src/libgcrypt.def @@ -255,6 +255,9 @@ EXPORTS gcry_sexp_extract_param @225 + gcry_cipher_authenticate @226 + gcry_cipher_gettag @227 + gcry_cipher_checktag @228 ;; end of file with public symbols for Windows. 
diff --git a/src/libgcrypt.vers b/src/libgcrypt.vers index be72aad..93eaa93 100644 --- a/src/libgcrypt.vers +++ b/src/libgcrypt.vers @@ -51,6 +51,7 @@ GCRYPT_1.6 { gcry_cipher_info; gcry_cipher_map_name; gcry_cipher_mode_from_oid; gcry_cipher_open; gcry_cipher_setkey; gcry_cipher_setiv; gcry_cipher_setctr; + gcry_cipher_authenticate; gcry_cipher_gettag; gcry_cipher_checktag; gcry_pk_algo_info; gcry_pk_algo_name; gcry_pk_ctl; gcry_pk_decrypt; gcry_pk_encrypt; gcry_pk_genkey; diff --git a/src/visibility.c b/src/visibility.c index 848925e..1f7bb3a 100644 --- a/src/visibility.c +++ b/src/visibility.c @@ -713,6 +713,33 @@ gcry_cipher_setctr (gcry_cipher_hd_t hd, const void *ctr, size_t ctrlen) return _gcry_cipher_setctr (hd, ctr, ctrlen); } +gcry_error_t +gcry_cipher_authenticate (gcry_cipher_hd_t hd, const void *abuf, size_t abuflen) +{ + if (!fips_is_operational ()) + return gpg_error (fips_not_operational ()); + + return _gcry_cipher_authenticate (hd, abuf, abuflen); +} + +gcry_error_t +gcry_cipher_gettag (gcry_cipher_hd_t hd, void *outtag, size_t taglen) +{ + if (!fips_is_operational ()) + return gpg_error (fips_not_operational ()); + + return _gcry_cipher_gettag (hd, outtag, taglen); +} + +gcry_error_t +gcry_cipher_checktag (gcry_cipher_hd_t hd, const void *intag, size_t taglen) +{ + if (!fips_is_operational ()) + return gpg_error (fips_not_operational ()); + + return _gcry_cipher_checktag (hd, intag, taglen); +} + gcry_error_t gcry_cipher_ctl (gcry_cipher_hd_t h, int cmd, void *buffer, size_t buflen) diff --git a/src/visibility.h b/src/visibility.h index 1c8f047..b2fa4c0 100644 --- a/src/visibility.h +++ b/src/visibility.h @@ -81,6 +81,9 @@ #define gcry_cipher_setkey _gcry_cipher_setkey #define gcry_cipher_setiv _gcry_cipher_setiv #define gcry_cipher_setctr _gcry_cipher_setctr +#define gcry_cipher_authenticate _gcry_cipher_authenticate +#define gcry_cipher_checktag _gcry_cipher_checktag +#define gcry_cipher_gettag _gcry_cipher_gettag #define gcry_cipher_ctl 
_gcry_cipher_ctl #define gcry_cipher_decrypt _gcry_cipher_decrypt #define gcry_cipher_encrypt _gcry_cipher_encrypt @@ -297,6 +300,9 @@ gcry_err_code_t gcry_md_get (gcry_md_hd_t hd, int algo, #undef gcry_cipher_setkey #undef gcry_cipher_setiv #undef gcry_cipher_setctr +#undef gcry_cipher_authenticate +#undef gcry_cipher_checktag +#undef gcry_cipher_gettag #undef gcry_cipher_ctl #undef gcry_cipher_decrypt #undef gcry_cipher_encrypt @@ -474,6 +480,9 @@ MARK_VISIBLE (gcry_cipher_close) MARK_VISIBLE (gcry_cipher_setkey) MARK_VISIBLE (gcry_cipher_setiv) MARK_VISIBLE (gcry_cipher_setctr) +MARK_VISIBLE (gcry_cipher_authenticate) +MARK_VISIBLE (gcry_cipher_checktag) +MARK_VISIBLE (gcry_cipher_gettag) MARK_VISIBLE (gcry_cipher_ctl) MARK_VISIBLE (gcry_cipher_decrypt) MARK_VISIBLE (gcry_cipher_encrypt) From jussi.kivilinna at iki.fi Tue Oct 22 14:32:07 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Tue, 22 Oct 2013 15:32:07 +0300 Subject: [PATCH 2/2] [v4] Add Counter with CBC-MAC mode (CCM) In-Reply-To: <20131022123202.6563.15226.stgit@localhost6.localdomain6> References: <20131022123202.6563.15226.stgit@localhost6.localdomain6> Message-ID: <20131022123207.6563.4124.stgit@localhost6.localdomain6> * cipher/Makefile.am: Add 'cipher-ccm.c'. * cipher/cipher-ccm.c: New. * cipher/cipher-internal.h (gcry_cipher_handle): Add 'u_mode'. (_gcry_cipher_ccm_encrypt, _gcry_cipher_ccm_decrypt) (_gcry_cipher_ccm_set_nonce, _gcry_cipher_ccm_authenticate) (_gcry_cipher_ccm_get_tag, _gcry_cipher_ccm_check_tag) (_gcry_cipher_ccm_set_lengths): New prototypes. * cipher/cipher.c (gcry_cipher_open, cipher_encrypt, cipher_decrypt) (_gcry_cipher_setiv, _gcry_cipher_authenticate, _gcry_cipher_gettag) (_gcry_cipher_checktag, gcry_cipher_ctl): Add handling for CCM mode. * doc/gcrypt.texi: Add documentation for GCRY_CIPHER_MODE_CCM. * src/gcrypt.h.in (gcry_cipher_modes): Add 'GCRY_CIPHER_MODE_CCM'. (gcry_ctl_cmds): Add 'GCRYCTL_SET_CCM_LENGTHS'. (GCRY_CCM_BLOCK_LEN): New.
* tests/basic.c (check_ccm_cipher): New. (check_cipher_modes): Call 'check_ccm_cipher'. * tests/benchmark.c (ccm_aead_init): New. (cipher_bench): Add handling for AEAD modes and add CCM benchmarking. -- Patch adds CCM (Counter with CBC-MAC) mode as defined in RFC 3610 and NIST Special Publication 800-38C. Example for encrypting a message (split in two buffers; buf1, buf2) and authenticating additional non-encrypted data (split in two buffers; aadbuf1, aadbuf2) with an authentication tag length of eight bytes: size_t params[3]; taglen = 8; gcry_cipher_setkey(h, key, len(key)); gcry_cipher_setiv(h, nonce, len(nonce)); params[0] = len(buf1) + len(buf2); /* 0: enclen */ params[1] = len(aadbuf1) + len(aadbuf2); /* 1: aadlen */ params[2] = taglen; /* 2: authtaglen */ gcry_cipher_ctl(h, GCRYCTL_SET_CCM_LENGTHS, params, sizeof(size_t) * 3); gcry_cipher_authenticate(h, aadbuf1, len(aadbuf1)); gcry_cipher_authenticate(h, aadbuf2, len(aadbuf2)); gcry_cipher_encrypt(h, buf1, len(buf1), buf1, len(buf1)); gcry_cipher_encrypt(h, buf2, len(buf2), buf2, len(buf2)); gcry_cipher_gettag(h, tag, taglen); Example for decrypting the above message and checking the authentication tag: size_t params[3]; taglen = 8; gcry_cipher_setkey(h, key, len(key)); gcry_cipher_setiv(h, nonce, len(nonce)); params[0] = len(buf1) + len(buf2); /* 0: enclen */ params[1] = len(aadbuf1) + len(aadbuf2); /* 1: aadlen */ params[2] = taglen; /* 2: authtaglen */ gcry_cipher_ctl(h, GCRYCTL_SET_CCM_LENGTHS, params, sizeof(size_t) * 3); gcry_cipher_authenticate(h, aadbuf1, len(aadbuf1)); gcry_cipher_authenticate(h, aadbuf2, len(aadbuf2)); gcry_cipher_decrypt(h, buf1, len(buf1), buf1, len(buf1)); gcry_cipher_decrypt(h, buf2, len(buf2), buf2, len(buf2)); err = gcry_cipher_checktag(h, tag, taglen); if (gpg_err_code (err) == GPG_ERR_CHECKSUM) { /* Authentication failed. */ } else if (err == 0) { /* Authentication ok.
       */
    }

Example for encrypting a message without additional authenticated data:

  size_t params[3];
  taglen = 10;

  gcry_cipher_setkey(h, key, len(key));
  gcry_cipher_setiv(h, nonce, len(nonce));
  params[0] = len(buf1); /* 0: enclen */
  params[1] = 0;         /* 1: aadlen */
  params[2] = taglen;    /* 2: authtaglen */
  gcry_cipher_ctl(h, GCRYCTL_SET_CCM_LENGTHS, params, sizeof(size_t) * 3);
  gcry_cipher_encrypt(h, buf1, len(buf1), buf1, len(buf1));
  gcry_cipher_gettag(h, tag, taglen);

To reset the CCM state for a cipher handle, one can either set a new
nonce or use 'gcry_cipher_reset'.

This implementation reuses the existing CTR mode code for
encryption/decryption and is therefore able to process multiple buffers
whose lengths are not a multiple of the blocksize. AAD data may also be
passed into gcry_cipher_authenticate in non-blocksize chunks.

[v4]: GCRYCTL_SET_CCM_PARAMS => GCRYCTL_SET_CCM_LENGTHS

Signed-off-by: Jussi Kivilinna
--- cipher/Makefile.am | 1 cipher/cipher-ccm.c | 371 ++++++++++++++++++++++ cipher/cipher-internal.h | 48 +++ cipher/cipher.c | 107 ++++++ doc/gcrypt.texi | 16 + src/gcrypt.h.in | 8 tests/basic.c | 771 ++++++++++++++++++++++++++++++++++++++++++++++ tests/benchmark.c | 80 +++++ 8 files changed, 1377 insertions(+), 25 deletions(-) create mode 100644 cipher/cipher-ccm.c diff --git a/cipher/Makefile.am b/cipher/Makefile.am index a2b2c8a..b0efd89 100644 --- a/cipher/Makefile.am +++ b/cipher/Makefile.am @@ -40,6 +40,7 @@ libcipher_la_LIBADD = $(GCRYPT_MODULES) libcipher_la_SOURCES = \ cipher.c cipher-internal.h \ cipher-cbc.c cipher-cfb.c cipher-ofb.c cipher-ctr.c cipher-aeswrap.c \ +cipher-ccm.c \ cipher-selftest.c cipher-selftest.h \ pubkey.c pubkey-internal.h pubkey-util.c \ md.c \ diff --git a/cipher/cipher-ccm.c b/cipher/cipher-ccm.c new file mode 100644 index 0000000..ce67b40 --- /dev/null +++ b/cipher/cipher-ccm.c @@ -0,0 +1,371 @@ +/* cipher-ccm.c - CTR mode with CBC-MAC mode implementation + * Copyright © 2013 Jussi Kivilinna + * + * This file is part of Libgcrypt.
+ * + * Libgcrypt is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as + * published by the Free Software Foundation; either version 2.1 of + * the License, or (at your option) any later version. + * + * Libgcrypt is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this program; if not, see . + */ + +#include +#include +#include +#include +#include + +#include "g10lib.h" +#include "cipher.h" +#include "ath.h" +#include "bufhelp.h" +#include "./cipher-internal.h" + + +#define set_burn(burn, nburn) do { \ + unsigned int __nburn = (nburn); \ + (burn) = (burn) > __nburn ? (burn) : __nburn; } while (0) + + +static unsigned int +do_cbc_mac (gcry_cipher_hd_t c, const unsigned char *inbuf, size_t inlen, + int do_padding) +{ + const unsigned int blocksize = 16; + unsigned char tmp[blocksize]; + unsigned int burn = 0; + unsigned int unused = c->u_mode.ccm.mac_unused; + size_t nblocks; + + if (inlen == 0 && (unused == 0 || !do_padding)) + return 0; + + do + { + if (inlen + unused < blocksize || unused > 0) + { + for (; inlen && unused < blocksize; inlen--) + c->u_mode.ccm.macbuf[unused++] = *inbuf++; + } + if (!inlen) + { + if (!do_padding) + break; + + while (unused < blocksize) + c->u_mode.ccm.macbuf[unused++] = 0; + } + + if (unused > 0) + { + /* Process one block from macbuf.
*/ + buf_xor(c->u_iv.iv, c->u_iv.iv, c->u_mode.ccm.macbuf, blocksize); + set_burn (burn, c->spec->encrypt ( &c->context.c, c->u_iv.iv, + c->u_iv.iv )); + + unused = 0; + } + + if (c->bulk.cbc_enc) + { + nblocks = inlen / blocksize; + c->bulk.cbc_enc (&c->context.c, c->u_iv.iv, tmp, inbuf, nblocks, 1); + inbuf += nblocks * blocksize; + inlen -= nblocks * blocksize; + + wipememory (tmp, sizeof(tmp)); + } + else + { + while (inlen >= blocksize) + { + buf_xor(c->u_iv.iv, c->u_iv.iv, inbuf, blocksize); + + set_burn (burn, c->spec->encrypt ( &c->context.c, c->u_iv.iv, + c->u_iv.iv )); + + inlen -= blocksize; + inbuf += blocksize; + } + } + } + while (inlen > 0); + + c->u_mode.ccm.mac_unused = unused; + + if (burn) + burn += 4 * sizeof(void *); + + return burn; +} + + +gcry_err_code_t +_gcry_cipher_ccm_set_nonce (gcry_cipher_hd_t c, const unsigned char *nonce, + size_t noncelen) +{ + size_t L = 15 - noncelen; + size_t L_; + + L_ = L - 1; + + if (!nonce) + return GPG_ERR_INV_ARG; + /* Length field must be 2, 3, ..., or 8. */ + if (L < 2 || L > 8) + return GPG_ERR_INV_LENGTH; + + /* Reset state */ + memset (&c->u_mode, 0, sizeof(c->u_mode)); + memset (&c->marks, 0, sizeof(c->marks)); + memset (&c->u_iv, 0, sizeof(c->u_iv)); + memset (&c->u_ctr, 0, sizeof(c->u_ctr)); + memset (c->lastiv, 0, sizeof(c->lastiv)); + c->unused = 0; + + /* Setup CTR */ + c->u_ctr.ctr[0] = L_; + memcpy (&c->u_ctr.ctr[1], nonce, noncelen); + memset (&c->u_ctr.ctr[1 + noncelen], 0, L); + + /* Setup IV */ + c->u_iv.iv[0] = L_; + memcpy (&c->u_iv.iv[1], nonce, noncelen); + /* Add (8 * M_ + 64 * flags) to iv[0] and set iv[noncelen + 1 ... 15] later + in set_aad. 
*/ + memset (&c->u_iv.iv[1 + noncelen], 0, L); + + c->u_mode.ccm.nonce = 1; + + return GPG_ERR_NO_ERROR; +} + + +gcry_err_code_t +_gcry_cipher_ccm_set_lengths (gcry_cipher_hd_t c, size_t encryptlen, + size_t aadlen, size_t taglen) +{ + unsigned int burn = 0; + unsigned char b0[16]; + size_t noncelen = 15 - (c->u_iv.iv[0] + 1); + size_t M = taglen; + size_t M_; + int i; + + M_ = (M - 2) / 2; + + /* Authentication field must be 4, 6, 8, 10, 12, 14 or 16. */ + if ((M_ * 2 + 2) != M || M < 4 || M > 16) + return GPG_ERR_INV_LENGTH; + if (!c->u_mode.ccm.nonce || c->u_mode.ccm.tag) + return GPG_ERR_INV_STATE; + if (c->u_mode.ccm.lengths) + return GPG_ERR_INV_STATE; + + c->u_mode.ccm.authlen = taglen; + c->u_mode.ccm.encryptlen = encryptlen; + c->u_mode.ccm.aadlen = aadlen; + + /* Complete IV setup. */ + c->u_iv.iv[0] += (aadlen > 0) * 64 + M_ * 8; + for (i = 16 - 1; i >= 1 + noncelen; i--) + { + c->u_iv.iv[i] = encryptlen & 0xff; + encryptlen >>= 8; + } + + memcpy (b0, c->u_iv.iv, 16); + memset (c->u_iv.iv, 0, 16); + + set_burn (burn, do_cbc_mac (c, b0, 16, 0)); + + if (aadlen == 0) + { + /* Do nothing. */ + } + else if (aadlen > 0 && aadlen <= (unsigned int)0xfeff) + { + b0[0] = (aadlen >> 8) & 0xff; + b0[1] = aadlen & 0xff; + set_burn (burn, do_cbc_mac (c, b0, 2, 0)); + } + else if (aadlen > 0xfeff && aadlen <= (unsigned int)0xffffffff) + { + b0[0] = 0xff; + b0[1] = 0xfe; + buf_put_be32(&b0[2], aadlen); + set_burn (burn, do_cbc_mac (c, b0, 6, 0)); + } +#ifdef HAVE_U64_TYPEDEF + else if (aadlen > (unsigned int)0xffffffff) + { + b0[0] = 0xff; + b0[1] = 0xff; + buf_put_be64(&b0[2], aadlen); + set_burn (burn, do_cbc_mac (c, b0, 10, 0)); + } +#endif + + /* Generate S_0 and increase counter. 
*/ + set_burn (burn, c->spec->encrypt ( &c->context.c, c->u_mode.ccm.s0, + c->u_ctr.ctr )); + c->u_ctr.ctr[15]++; + + if (burn) + _gcry_burn_stack (burn + sizeof(void *) * 5); + + c->u_mode.ccm.lengths = 1; + + return GPG_ERR_NO_ERROR; +} + + +gcry_err_code_t +_gcry_cipher_ccm_authenticate (gcry_cipher_hd_t c, const unsigned char *abuf, + size_t abuflen) +{ + unsigned int burn; + + if (abuflen > 0 && !abuf) + return GPG_ERR_INV_ARG; + if (!c->u_mode.ccm.nonce || !c->u_mode.ccm.lengths || c->u_mode.ccm.tag) + return GPG_ERR_INV_STATE; + if (abuflen > c->u_mode.ccm.aadlen) + return GPG_ERR_INV_LENGTH; + + c->u_mode.ccm.aadlen -= abuflen; + burn = do_cbc_mac (c, abuf, abuflen, c->u_mode.ccm.aadlen == 0); + + if (burn) + _gcry_burn_stack (burn + sizeof(void *) * 5); + + return GPG_ERR_NO_ERROR; +} + + +gcry_err_code_t +_gcry_cipher_ccm_tag (gcry_cipher_hd_t c, unsigned char *outbuf, + size_t outbuflen, int check) +{ + unsigned int burn; + + if (!outbuf || outbuflen == 0) + return GPG_ERR_INV_ARG; + /* Tag length must be same as initial authlen. */ + if (c->u_mode.ccm.authlen != outbuflen) + return GPG_ERR_INV_LENGTH; + if (!c->u_mode.ccm.nonce || !c->u_mode.ccm.lengths || c->u_mode.ccm.aadlen > 0) + return GPG_ERR_INV_STATE; + /* Initial encrypt length must match with length of actual data processed. */ + if (c->u_mode.ccm.encryptlen > 0) + return GPG_ERR_UNFINISHED; + + if (!c->u_mode.ccm.tag) + { + burn = do_cbc_mac (c, NULL, 0, 1); /* Perform final padding. */ + + /* Add S_0 */ + buf_xor (c->u_iv.iv, c->u_iv.iv, c->u_mode.ccm.s0, 16); + + wipememory (c->u_ctr.ctr, 16); + wipememory (c->u_mode.ccm.s0, 16); + wipememory (c->u_mode.ccm.macbuf, 16); + + if (burn) + _gcry_burn_stack (burn + sizeof(void *) * 5); + } + + if (!check) + { + memcpy (outbuf, c->u_iv.iv, outbuflen); + return GPG_ERR_NO_ERROR; + } + else + { + int diff, i; + + /* Constant-time compare. */ + for (i = 0, diff = 0; i < outbuflen; i++) + diff -= !!(outbuf[i] - c->u_iv.iv[i]); + + return !diff ? 
GPG_ERR_NO_ERROR : GPG_ERR_CHECKSUM; + } +} + + +gcry_err_code_t +_gcry_cipher_ccm_get_tag (gcry_cipher_hd_t c, unsigned char *outtag, + size_t taglen) +{ + return _gcry_cipher_ccm_tag (c, outtag, taglen, 0); +} + + +gcry_err_code_t +_gcry_cipher_ccm_check_tag (gcry_cipher_hd_t c, const unsigned char *intag, + size_t taglen) +{ + return _gcry_cipher_ccm_tag (c, (unsigned char *)intag, taglen, 1); +} + + +gcry_err_code_t +_gcry_cipher_ccm_encrypt (gcry_cipher_hd_t c, unsigned char *outbuf, + unsigned int outbuflen, const unsigned char *inbuf, + unsigned int inbuflen) +{ + unsigned int burn; + + if (outbuflen < inbuflen) + return GPG_ERR_BUFFER_TOO_SHORT; + if (!c->u_mode.ccm.nonce || c->u_mode.ccm.tag || !c->u_mode.ccm.lengths || + c->u_mode.ccm.aadlen > 0) + return GPG_ERR_INV_STATE; + if (inbuflen > c->u_mode.ccm.encryptlen) + return GPG_ERR_INV_LENGTH; + + c->u_mode.ccm.encryptlen -= inbuflen; + burn = do_cbc_mac (c, inbuf, inbuflen, 0); + if (burn) + _gcry_burn_stack (burn + sizeof(void *) * 5); + + return _gcry_cipher_ctr_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); +} + + +gcry_err_code_t +_gcry_cipher_ccm_decrypt (gcry_cipher_hd_t c, unsigned char *outbuf, + unsigned int outbuflen, const unsigned char *inbuf, + unsigned int inbuflen) +{ + gcry_err_code_t err; + unsigned int burn; + + if (outbuflen < inbuflen) + return GPG_ERR_BUFFER_TOO_SHORT; + if (!c->u_mode.ccm.nonce || c->u_mode.ccm.tag || !c->u_mode.ccm.lengths || + c->u_mode.ccm.aadlen > 0) + return GPG_ERR_INV_STATE; + if (inbuflen > c->u_mode.ccm.encryptlen) + return GPG_ERR_INV_LENGTH; + + err = _gcry_cipher_ctr_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); + if (err) + return err; + + c->u_mode.ccm.encryptlen -= inbuflen; + burn = do_cbc_mac (c, outbuf, inbuflen, 0); + if (burn) + _gcry_burn_stack (burn + sizeof(void *) * 5); + + return err; +} + diff --git a/cipher/cipher-internal.h b/cipher/cipher-internal.h index b60ef38..981caa8 100644 --- a/cipher/cipher-internal.h +++ 
b/cipher/cipher-internal.h @@ -100,7 +100,8 @@ struct gcry_cipher_handle /* The initialization vector. For best performance we make sure that it is properly aligned. In particular some implementations - of bulk operations expect an 16 byte aligned IV. */ + of bulk operations expect an 16 byte aligned IV. IV is also used + to store CBC-MAC in CCM mode; counter IV is stored in U_CTR. */ union { cipher_context_alignment_t iv_align; unsigned char iv[MAX_BLOCKSIZE]; @@ -117,6 +118,26 @@ struct gcry_cipher_handle unsigned char lastiv[MAX_BLOCKSIZE]; int unused; /* Number of unused bytes in LASTIV. */ + union { + /* Mode specific storage for CCM mode. */ + struct { + size_t encryptlen; + size_t aadlen; + unsigned int authlen; + + /* Space to save partial input lengths for MAC. */ + unsigned char macbuf[GCRY_CCM_BLOCK_LEN]; + int mac_unused; /* Number of unprocessed bytes in MACBUF. */ + + unsigned char s0[GCRY_CCM_BLOCK_LEN]; + + unsigned int nonce:1;/* Set to 1 if nonce has been set. */ + unsigned int lengths:1; /* Set to 1 if CCM length parameters has been + processed. */ + unsigned int tag:1; /* Set to 1 if tag has been finalized. */ + } ccm; + } u_mode; + /* What follows are two contexts of the cipher in use. 
The first one needs to be aligned well enough for the cipher operation whereas the second one is a copy created by cipher_setkey and @@ -175,5 +196,30 @@ gcry_err_code_t _gcry_cipher_aeswrap_decrypt const byte *inbuf, unsigned int inbuflen); +/*-- cipher-ccm.c --*/ +gcry_err_code_t _gcry_cipher_ccm_encrypt +/* */ (gcry_cipher_hd_t c, + unsigned char *outbuf, unsigned int outbuflen, + const unsigned char *inbuf, unsigned int inbuflen); +gcry_err_code_t _gcry_cipher_ccm_decrypt +/* */ (gcry_cipher_hd_t c, + unsigned char *outbuf, unsigned int outbuflen, + const unsigned char *inbuf, unsigned int inbuflen); +gcry_err_code_t _gcry_cipher_ccm_set_nonce +/* */ (gcry_cipher_hd_t c, const unsigned char *nonce, + size_t noncelen); +gcry_err_code_t _gcry_cipher_ccm_authenticate +/* */ (gcry_cipher_hd_t c, const unsigned char *abuf, size_t abuflen); +gcry_err_code_t _gcry_cipher_ccm_set_lengths +/* */ (gcry_cipher_hd_t c, size_t encryptedlen, size_t aadlen, + size_t taglen); +gcry_err_code_t _gcry_cipher_ccm_get_tag +/* */ (gcry_cipher_hd_t c, + unsigned char *outtag, size_t taglen); +gcry_err_code_t _gcry_cipher_ccm_check_tag +/* */ (gcry_cipher_hd_t c, + const unsigned char *intag, size_t taglen); + + #endif /*G10_CIPHER_INTERNAL_H*/ diff --git a/cipher/cipher.c b/cipher/cipher.c index 36c79db..44d19a1 100644 --- a/cipher/cipher.c +++ b/cipher/cipher.c @@ -375,6 +375,13 @@ gcry_cipher_open (gcry_cipher_hd_t *handle, if (! 
err) switch (mode) { + case GCRY_CIPHER_MODE_CCM: + if (spec->blocksize != GCRY_CCM_BLOCK_LEN) + err = GPG_ERR_INV_CIPHER_MODE; + if (!spec->encrypt || !spec->decrypt) + err = GPG_ERR_INV_CIPHER_MODE; + break; + case GCRY_CIPHER_MODE_ECB: case GCRY_CIPHER_MODE_CBC: case GCRY_CIPHER_MODE_CFB: @@ -613,6 +620,8 @@ cipher_reset (gcry_cipher_hd_t c) memset (c->u_iv.iv, 0, c->spec->blocksize); memset (c->lastiv, 0, c->spec->blocksize); memset (c->u_ctr.ctr, 0, c->spec->blocksize); + memset (&c->u_mode, 0, sizeof c->u_mode); + c->unused = 0; } @@ -718,6 +727,10 @@ cipher_encrypt (gcry_cipher_hd_t c, byte *outbuf, unsigned int outbuflen, inbuf, inbuflen); break; + case GCRY_CIPHER_MODE_CCM: + rc = _gcry_cipher_ccm_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); + break; + case GCRY_CIPHER_MODE_STREAM: c->spec->stencrypt (&c->context.c, outbuf, (byte*)/*arggg*/inbuf, inbuflen); @@ -811,6 +824,10 @@ cipher_decrypt (gcry_cipher_hd_t c, byte *outbuf, unsigned int outbuflen, inbuf, inbuflen); break; + case GCRY_CIPHER_MODE_CCM: + rc = _gcry_cipher_ccm_decrypt (c, outbuf, outbuflen, inbuf, inbuflen); + break; + case GCRY_CIPHER_MODE_STREAM: c->spec->stdecrypt (&c->context.c, outbuf, (byte*)/*arggg*/inbuf, inbuflen); @@ -885,8 +902,19 @@ _gcry_cipher_setkey (gcry_cipher_hd_t hd, const void *key, size_t keylen) gcry_error_t _gcry_cipher_setiv (gcry_cipher_hd_t hd, const void *iv, size_t ivlen) { - cipher_setiv (hd, iv, ivlen); - return 0; + gcry_err_code_t rc = GPG_ERR_NO_ERROR; + + switch (hd->mode) + { + case GCRY_CIPHER_MODE_CCM: + rc = _gcry_cipher_ccm_set_nonce (hd, iv, ivlen); + break; + + default: + cipher_setiv (hd, iv, ivlen); + break; + } + return gpg_error (rc); } /* Set counter for CTR mode. 
(CTR,CTRLEN) must denote a buffer of @@ -914,34 +942,61 @@ gcry_error_t _gcry_cipher_authenticate (gcry_cipher_hd_t hd, const void *abuf, size_t abuflen) { - log_fatal ("gcry_cipher_authenticate: invalid mode %d\n", hd->mode); + gcry_err_code_t rc; + + switch (hd->mode) + { + case GCRY_CIPHER_MODE_CCM: + rc = _gcry_cipher_ccm_authenticate (hd, abuf, abuflen); + break; - (void)abuf; - (void)abuflen; + default: + log_fatal ("gcry_cipher_authenticate: invalid mode %d\n", hd->mode); + rc = GPG_ERR_INV_CIPHER_MODE; + break; + } - return gpg_error (GPG_ERR_INV_CIPHER_MODE); + return gpg_error (rc); } gcry_error_t _gcry_cipher_gettag (gcry_cipher_hd_t hd, void *outtag, size_t taglen) { - log_fatal ("gcry_cipher_gettag: invalid mode %d\n", hd->mode); + gcry_err_code_t rc; + + switch (hd->mode) + { + case GCRY_CIPHER_MODE_CCM: + rc = _gcry_cipher_ccm_get_tag (hd, outtag, taglen); + break; - (void)outtag; - (void)taglen; + default: + log_fatal ("gcry_cipher_gettag: invalid mode %d\n", hd->mode); + rc = GPG_ERR_INV_CIPHER_MODE; + break; + } - return gpg_error (GPG_ERR_INV_CIPHER_MODE); + return gpg_error (rc); } gcry_error_t _gcry_cipher_checktag (gcry_cipher_hd_t hd, const void *intag, size_t taglen) { - log_fatal ("gcry_cipher_checktag: invalid mode %d\n", hd->mode); + gcry_err_code_t rc; + + switch (hd->mode) + { + case GCRY_CIPHER_MODE_CCM: + rc = _gcry_cipher_ccm_check_tag (hd, intag, taglen); + break; - (void)intag; - (void)taglen; + default: + log_fatal ("gcry_cipher_checktag: invalid mode %d\n", hd->mode); + rc = GPG_ERR_INV_CIPHER_MODE; + break; + } - return gpg_error (GPG_ERR_INV_CIPHER_MODE); + return gpg_error (rc); } @@ -980,6 +1035,30 @@ gcry_cipher_ctl( gcry_cipher_hd_t h, int cmd, void *buffer, size_t buflen) h->flags &= ~GCRY_CIPHER_CBC_MAC; break; + case GCRYCTL_SET_CCM_LENGTHS: + { + size_t params[3]; + size_t encryptedlen; + size_t aadlen; + size_t authtaglen; + + if (h->mode != GCRY_CIPHER_MODE_CCM) + return gcry_error (GPG_ERR_INV_CIPHER_MODE); + + if 
(!buffer || buflen != 3 * sizeof(size_t)) + return gcry_error (GPG_ERR_INV_ARG); + + /* This command is used to pass additional length parameters needed + by CCM mode to initialize CBC-MAC. */ + memcpy (params, buffer, sizeof(params)); + encryptedlen = params[0]; + aadlen = params[1]; + authtaglen = params[2]; + + rc = _gcry_cipher_ccm_set_lengths (h, encryptedlen, aadlen, authtaglen); + } + break; + case GCRYCTL_DISABLE_ALGO: /* This command expects NULL for H and BUFFER to point to an integer with the algo number. */ diff --git a/doc/gcrypt.texi b/doc/gcrypt.texi index 0049fa0..91fe399 100644 --- a/doc/gcrypt.texi +++ b/doc/gcrypt.texi @@ -1635,6 +1635,12 @@ may be specified 64 bit (8 byte) shorter than then input buffer. As per specs the input length must be at least 128 bits and the length must be a multiple of 64 bits. + at item GCRY_CIPHER_MODE_CCM + at cindex CCM, Counter with CBC-MAC mode +Counter with CBC-MAC mode is an Authenticated Encryption with +Associated Data (AEAD) block cipher mode, which is specified in +'NIST Special Publication 800-38C' and RFC 3610. + @end table @node Working with cipher handles @@ -1661,11 +1667,13 @@ The cipher mode to use must be specified via @var{mode}. See @xref{Available cipher modes}, for a list of supported cipher modes and the according constants. Note that some modes are incompatible with some algorithms - in particular, stream mode -(@code{GCRY_CIPHER_MODE_STREAM}) only works with stream ciphers. Any -block cipher mode (@code{GCRY_CIPHER_MODE_ECB}, +(@code{GCRY_CIPHER_MODE_STREAM}) only works with stream ciphers. The +block cipher modes (@code{GCRY_CIPHER_MODE_ECB}, @code{GCRY_CIPHER_MODE_CBC}, @code{GCRY_CIPHER_MODE_CFB}, - at code{GCRY_CIPHER_MODE_OFB} or @code{GCRY_CIPHER_MODE_CTR}) will work -with any block cipher algorithm. + at code{GCRY_CIPHER_MODE_OFB} and @code{GCRY_CIPHER_MODE_CTR}) will work +with any block cipher algorithm. 
The @code{GCRY_CIPHER_MODE_CCM} will +only work with block cipher algorithms which have the block size of +16 bytes. The third argument @var{flags} can either be passed as @code{0} or as the bit-wise OR of the following constants. diff --git a/src/gcrypt.h.in b/src/gcrypt.h.in index f0ae927..948202d 100644 --- a/src/gcrypt.h.in +++ b/src/gcrypt.h.in @@ -325,7 +325,8 @@ enum gcry_ctl_cmds GCRYCTL_SET_PREFERRED_RNG_TYPE = 65, GCRYCTL_GET_CURRENT_RNG_TYPE = 66, GCRYCTL_DISABLE_LOCKED_SECMEM = 67, - GCRYCTL_DISABLE_PRIV_DROP = 68 + GCRYCTL_DISABLE_PRIV_DROP = 68, + GCRYCTL_SET_CCM_LENGTHS = 69 }; /* Perform various operations defined by CMD. */ @@ -884,7 +885,8 @@ enum gcry_cipher_modes GCRY_CIPHER_MODE_STREAM = 4, /* Used with stream ciphers. */ GCRY_CIPHER_MODE_OFB = 5, /* Outer feedback. */ GCRY_CIPHER_MODE_CTR = 6, /* Counter. */ - GCRY_CIPHER_MODE_AESWRAP= 7 /* AES-WRAP algorithm. */ + GCRY_CIPHER_MODE_AESWRAP= 7, /* AES-WRAP algorithm. */ + GCRY_CIPHER_MODE_CCM = 8 /* Counter with CBC-MAC. */ }; /* Flags used with the open function. */ @@ -896,6 +898,8 @@ enum gcry_cipher_flags GCRY_CIPHER_CBC_MAC = 8 /* Enable CBC message auth. code (MAC). */ }; +/* CCM works only with blocks of 128 bits. */ +#define GCRY_CCM_BLOCK_LEN (128 / 8) /* Create a handle for algorithm ALGO to be used in MODE. FLAGS may be given as an bitwise OR of the gcry_cipher_flags values. 
*/ diff --git a/tests/basic.c b/tests/basic.c index 1d6e637..21af21d 100644 --- a/tests/basic.c +++ b/tests/basic.c @@ -1139,6 +1139,776 @@ check_ofb_cipher (void) static void +check_ccm_cipher (void) +{ + static const struct tv + { + int algo; + int keylen; + const char *key; + int noncelen; + const char *nonce; + int aadlen; + const char *aad; + int plainlen; + const char *plaintext; + int cipherlen; + const char *ciphertext; + } tv[] = + { + /* RFC 3610 */ + { GCRY_CIPHER_AES, /* Packet Vector #1 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x03\x02\x01\x00\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 23, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 31, + "\x58\x8C\x97\x9A\x61\xC6\x63\xD2\xF0\x66\xD0\xC2\xC0\xF9\x89\x80\x6D\x5F\x6B\x61\xDA\xC3\x84\x17\xE8\xD1\x2C\xFD\xF9\x26\xE0"}, + { GCRY_CIPHER_AES, /* Packet Vector #2 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x04\x03\x02\x01\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 24, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 32, + "\x72\xC9\x1A\x36\xE1\x35\xF8\xCF\x29\x1C\xA8\x94\x08\x5C\x87\xE3\xCC\x15\xC4\x39\xC9\xE4\x3A\x3B\xA0\x91\xD5\x6E\x10\x40\x09\x16"}, + { GCRY_CIPHER_AES, /* Packet Vector #3 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x05\x04\x03\x02\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 25, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 33, + "\x51\xB1\xE5\xF4\x4A\x19\x7D\x1D\xA4\x6B\x0F\x8E\x2D\x28\x2A\xE8\x71\xE8\x38\xBB\x64\xDA\x85\x96\x57\x4A\xDA\xA7\x6F\xBD\x9F\xB0\xC5"}, + { GCRY_CIPHER_AES, /* Packet Vector #4 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, 
"\x00\x00\x00\x06\x05\x04\x03\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 19, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 27, + "\xA2\x8C\x68\x65\x93\x9A\x9A\x79\xFA\xAA\x5C\x4C\x2A\x9D\x4A\x91\xCD\xAC\x8C\x96\xC8\x61\xB9\xC9\xE6\x1E\xF1"}, + { GCRY_CIPHER_AES, /* Packet Vector #5 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x07\x06\x05\x04\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 20, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 28, + "\xDC\xF1\xFB\x7B\x5D\x9E\x23\xFB\x9D\x4E\x13\x12\x53\x65\x8A\xD8\x6E\xBD\xCA\x3E\x51\xE8\x3F\x07\x7D\x9C\x2D\x93"}, + { GCRY_CIPHER_AES, /* Packet Vector #6 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x08\x07\x06\x05\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 21, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 29, + "\x6F\xC1\xB0\x11\xF0\x06\x56\x8B\x51\x71\xA4\x2D\x95\x3D\x46\x9B\x25\x70\xA4\xBD\x87\x40\x5A\x04\x43\xAC\x91\xCB\x94"}, + { GCRY_CIPHER_AES, /* Packet Vector #7 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x09\x08\x07\x06\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 23, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 33, + "\x01\x35\xD1\xB2\xC9\x5F\x41\xD5\xD1\xD4\xFE\xC1\x85\xD1\x66\xB8\x09\x4E\x99\x9D\xFE\xD9\x6C\x04\x8C\x56\x60\x2C\x97\xAC\xBB\x74\x90"}, + { GCRY_CIPHER_AES, /* Packet Vector #8 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0A\x09\x08\x07\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 24, + 
"\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 34, + "\x7B\x75\x39\x9A\xC0\x83\x1D\xD2\xF0\xBB\xD7\x58\x79\xA2\xFD\x8F\x6C\xAE\x6B\x6C\xD9\xB7\xDB\x24\xC1\x7B\x44\x33\xF4\x34\x96\x3F\x34\xB4"}, + { GCRY_CIPHER_AES, /* Packet Vector #9 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0B\x0A\x09\x08\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 25, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 35, + "\x82\x53\x1A\x60\xCC\x24\x94\x5A\x4B\x82\x79\x18\x1A\xB5\xC8\x4D\xF2\x1C\xE7\xF9\xB7\x3F\x42\xE1\x97\xEA\x9C\x07\xE5\x6B\x5E\xB1\x7E\x5F\x4E"}, + { GCRY_CIPHER_AES, /* Packet Vector #10 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0C\x0B\x0A\x09\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 19, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 29, + "\x07\x34\x25\x94\x15\x77\x85\x15\x2B\x07\x40\x98\x33\x0A\xBB\x14\x1B\x94\x7B\x56\x6A\xA9\x40\x6B\x4D\x99\x99\x88\xDD"}, + { GCRY_CIPHER_AES, /* Packet Vector #11 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0D\x0C\x0B\x0A\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 20, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 30, + "\x67\x6B\xB2\x03\x80\xB0\xE3\x01\xE8\xAB\x79\x59\x0A\x39\x6D\xA7\x8B\x83\x49\x34\xF5\x3A\xA2\xE9\x10\x7A\x8B\x6C\x02\x2C"}, + { GCRY_CIPHER_AES, /* Packet Vector #12 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0E\x0D\x0C\x0B\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 21, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 31, + 
"\xC0\xFF\xA0\xD6\xF0\x5B\xDB\x67\xF2\x4D\x43\xA4\x33\x8D\x2A\xA4\xBE\xD7\xB2\x0E\x43\xCD\x1A\xA3\x16\x62\xE7\xAD\x65\xD6\xDB"}, + { GCRY_CIPHER_AES, /* Packet Vector #13 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x41\x2B\x4E\xA9\xCD\xBE\x3C\x96\x96\x76\x6C\xFA", + 8, "\x0B\xE1\xA8\x8B\xAC\xE0\x18\xB1", + 23, + "\x08\xE8\xCF\x97\xD8\x20\xEA\x25\x84\x60\xE9\x6A\xD9\xCF\x52\x89\x05\x4D\x89\x5C\xEA\xC4\x7C", + 31, + "\x4C\xB9\x7F\x86\xA2\xA4\x68\x9A\x87\x79\x47\xAB\x80\x91\xEF\x53\x86\xA6\xFF\xBD\xD0\x80\xF8\xE7\x8C\xF7\xCB\x0C\xDD\xD7\xB3"}, + { GCRY_CIPHER_AES, /* Packet Vector #14 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x33\x56\x8E\xF7\xB2\x63\x3C\x96\x96\x76\x6C\xFA", + 8, "\x63\x01\x8F\x76\xDC\x8A\x1B\xCB", + 24, + "\x90\x20\xEA\x6F\x91\xBD\xD8\x5A\xFA\x00\x39\xBA\x4B\xAF\xF9\xBF\xB7\x9C\x70\x28\x94\x9C\xD0\xEC", + 32, + "\x4C\xCB\x1E\x7C\xA9\x81\xBE\xFA\xA0\x72\x6C\x55\xD3\x78\x06\x12\x98\xC8\x5C\x92\x81\x4A\xBC\x33\xC5\x2E\xE8\x1D\x7D\x77\xC0\x8A"}, + { GCRY_CIPHER_AES, /* Packet Vector #15 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x10\x3F\xE4\x13\x36\x71\x3C\x96\x96\x76\x6C\xFA", + 8, "\xAA\x6C\xFA\x36\xCA\xE8\x6B\x40", + 25, + "\xB9\x16\xE0\xEA\xCC\x1C\x00\xD7\xDC\xEC\x68\xEC\x0B\x3B\xBB\x1A\x02\xDE\x8A\x2D\x1A\xA3\x46\x13\x2E", + 33, + "\xB1\xD2\x3A\x22\x20\xDD\xC0\xAC\x90\x0D\x9A\xA0\x3C\x61\xFC\xF4\xA5\x59\xA4\x41\x77\x67\x08\x97\x08\xA7\x76\x79\x6E\xDB\x72\x35\x06"}, + { GCRY_CIPHER_AES, /* Packet Vector #16 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x76\x4C\x63\xB8\x05\x8E\x3C\x96\x96\x76\x6C\xFA", + 12, "\xD0\xD0\x73\x5C\x53\x1E\x1B\xEC\xF0\x49\xC2\x44", + 19, + "\x12\xDA\xAC\x56\x30\xEF\xA5\x39\x6F\x77\x0C\xE1\xA6\x6B\x21\xF7\xB2\x10\x1C", + 27, + "\x14\xD2\x53\xC3\x96\x7B\x70\x60\x9B\x7C\xBB\x7C\x49\x91\x60\x28\x32\x45\x26\x9A\x6F\x49\x97\x5B\xCA\xDE\xAF"}, + { 
GCRY_CIPHER_AES, /* Packet Vector #17 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\xF8\xB6\x78\x09\x4E\x3B\x3C\x96\x96\x76\x6C\xFA", + 12, "\x77\xB6\x0F\x01\x1C\x03\xE1\x52\x58\x99\xBC\xAE", + 20, + "\xE8\x8B\x6A\x46\xC7\x8D\x63\xE5\x2E\xB8\xC5\x46\xEF\xB5\xDE\x6F\x75\xE9\xCC\x0D", + 28, + "\x55\x45\xFF\x1A\x08\x5E\xE2\xEF\xBF\x52\xB2\xE0\x4B\xEE\x1E\x23\x36\xC7\x3E\x3F\x76\x2C\x0C\x77\x44\xFE\x7E\x3C"}, + { GCRY_CIPHER_AES, /* Packet Vector #18 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\xD5\x60\x91\x2D\x3F\x70\x3C\x96\x96\x76\x6C\xFA", + 12, "\xCD\x90\x44\xD2\xB7\x1F\xDB\x81\x20\xEA\x60\xC0", + 21, + "\x64\x35\xAC\xBA\xFB\x11\xA8\x2E\x2F\x07\x1D\x7C\xA4\xA5\xEB\xD9\x3A\x80\x3B\xA8\x7F", + 29, + "\x00\x97\x69\xEC\xAB\xDF\x48\x62\x55\x94\xC5\x92\x51\xE6\x03\x57\x22\x67\x5E\x04\xC8\x47\x09\x9E\x5A\xE0\x70\x45\x51"}, + { GCRY_CIPHER_AES, /* Packet Vector #19 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x42\xFF\xF8\xF1\x95\x1C\x3C\x96\x96\x76\x6C\xFA", + 8, "\xD8\x5B\xC7\xE6\x9F\x94\x4F\xB8", + 23, + "\x8A\x19\xB9\x50\xBC\xF7\x1A\x01\x8E\x5E\x67\x01\xC9\x17\x87\x65\x98\x09\xD6\x7D\xBE\xDD\x18", + 33, + "\xBC\x21\x8D\xAA\x94\x74\x27\xB6\xDB\x38\x6A\x99\xAC\x1A\xEF\x23\xAD\xE0\xB5\x29\x39\xCB\x6A\x63\x7C\xF9\xBE\xC2\x40\x88\x97\xC6\xBA"}, + { GCRY_CIPHER_AES, /* Packet Vector #20 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x92\x0F\x40\xE5\x6C\xDC\x3C\x96\x96\x76\x6C\xFA", + 8, "\x74\xA0\xEB\xC9\x06\x9F\x5B\x37", + 24, + "\x17\x61\x43\x3C\x37\xC5\xA3\x5F\xC1\xF3\x9F\x40\x63\x02\xEB\x90\x7C\x61\x63\xBE\x38\xC9\x84\x37", + 34, + "\x58\x10\xE6\xFD\x25\x87\x40\x22\xE8\x03\x61\xA4\x78\xE3\xE9\xCF\x48\x4A\xB0\x4F\x44\x7E\xFF\xF6\xF0\xA4\x77\xCC\x2F\xC9\xBF\x54\x89\x44"}, + { GCRY_CIPHER_AES, /* Packet Vector #21 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, 
"\x00\x27\xCA\x0C\x71\x20\xBC\x3C\x96\x96\x76\x6C\xFA", + 8, "\x44\xA3\xAA\x3A\xAE\x64\x75\xCA", + 25, + "\xA4\x34\xA8\xE5\x85\x00\xC6\xE4\x15\x30\x53\x88\x62\xD6\x86\xEA\x9E\x81\x30\x1B\x5A\xE4\x22\x6B\xFA", + 35, + "\xF2\xBE\xED\x7B\xC5\x09\x8E\x83\xFE\xB5\xB3\x16\x08\xF8\xE2\x9C\x38\x81\x9A\x89\xC8\xE7\x76\xF1\x54\x4D\x41\x51\xA4\xED\x3A\x8B\x87\xB9\xCE"}, + { GCRY_CIPHER_AES, /* Packet Vector #22 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x5B\x8C\xCB\xCD\x9A\xF8\x3C\x96\x96\x76\x6C\xFA", + 12, "\xEC\x46\xBB\x63\xB0\x25\x20\xC3\x3C\x49\xFD\x70", + 19, + "\xB9\x6B\x49\xE2\x1D\x62\x17\x41\x63\x28\x75\xDB\x7F\x6C\x92\x43\xD2\xD7\xC2", + 29, + "\x31\xD7\x50\xA0\x9D\xA3\xED\x7F\xDD\xD4\x9A\x20\x32\xAA\xBF\x17\xEC\x8E\xBF\x7D\x22\xC8\x08\x8C\x66\x6B\xE5\xC1\x97"}, + { GCRY_CIPHER_AES, /* Packet Vector #23 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x3E\xBE\x94\x04\x4B\x9A\x3C\x96\x96\x76\x6C\xFA", + 12, "\x47\xA6\x5A\xC7\x8B\x3D\x59\x42\x27\xE8\x5E\x71", + 20, + "\xE2\xFC\xFB\xB8\x80\x44\x2C\x73\x1B\xF9\x51\x67\xC8\xFF\xD7\x89\x5E\x33\x70\x76", + 30, + "\xE8\x82\xF1\xDB\xD3\x8C\xE3\xED\xA7\xC2\x3F\x04\xDD\x65\x07\x1E\xB4\x13\x42\xAC\xDF\x7E\x00\xDC\xCE\xC7\xAE\x52\x98\x7D"}, + { GCRY_CIPHER_AES, /* Packet Vector #24 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x8D\x49\x3B\x30\xAE\x8B\x3C\x96\x96\x76\x6C\xFA", + 12, "\x6E\x37\xA6\xEF\x54\x6D\x95\x5D\x34\xAB\x60\x59", + 21, + "\xAB\xF2\x1C\x0B\x02\xFE\xB8\x8F\x85\x6D\xF4\xA3\x73\x81\xBC\xE3\xCC\x12\x85\x17\xD4", + 31, + "\xF3\x29\x05\xB8\x8A\x64\x1B\x04\xB9\xC9\xFF\xB5\x8C\xC3\x90\x90\x0F\x3D\xA1\x2A\xB1\x6D\xCE\x9E\x82\xEF\xA1\x6D\xA6\x20\x59"}, + /* RFC 5528 */ + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #1 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x03\x02\x01\x00\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", 
+ 23, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 31, + "\xBA\x73\x71\x85\xE7\x19\x31\x04\x92\xF3\x8A\x5F\x12\x51\xDA\x55\xFA\xFB\xC9\x49\x84\x8A\x0D\xFC\xAE\xCE\x74\x6B\x3D\xB9\xAD"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #2 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x04\x03\x02\x01\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 24, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 32, + "\x5D\x25\x64\xBF\x8E\xAF\xE1\xD9\x95\x26\xEC\x01\x6D\x1B\xF0\x42\x4C\xFB\xD2\xCD\x62\x84\x8F\x33\x60\xB2\x29\x5D\xF2\x42\x83\xE8"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #3 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x05\x04\x03\x02\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 25, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 33, + "\x81\xF6\x63\xD6\xC7\x78\x78\x17\xF9\x20\x36\x08\xB9\x82\xAD\x15\xDC\x2B\xBD\x87\xD7\x56\xF7\x92\x04\xF5\x51\xD6\x68\x2F\x23\xAA\x46"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #4 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x06\x05\x04\x03\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 19, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 27, + "\xCA\xEF\x1E\x82\x72\x11\xB0\x8F\x7B\xD9\x0F\x08\xC7\x72\x88\xC0\x70\xA4\xA0\x8B\x3A\x93\x3A\x63\xE4\x97\xA0"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #5 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x07\x06\x05\x04\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 20, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 
28, + "\x2A\xD3\xBA\xD9\x4F\xC5\x2E\x92\xBE\x43\x8E\x82\x7C\x10\x23\xB9\x6A\x8A\x77\x25\x8F\xA1\x7B\xA7\xF3\x31\xDB\x09"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #6 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x08\x07\x06\x05\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 21, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 29, + "\xFE\xA5\x48\x0B\xA5\x3F\xA8\xD3\xC3\x44\x22\xAA\xCE\x4D\xE6\x7F\xFA\x3B\xB7\x3B\xAB\xAB\x36\xA1\xEE\x4F\xE0\xFE\x28"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #7 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x09\x08\x07\x06\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 23, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 33, + "\x54\x53\x20\x26\xE5\x4C\x11\x9A\x8D\x36\xD9\xEC\x6E\x1E\xD9\x74\x16\xC8\x70\x8C\x4B\x5C\x2C\xAC\xAF\xA3\xBC\xCF\x7A\x4E\xBF\x95\x73"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #8 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0A\x09\x08\x07\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 24, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 34, + "\x8A\xD1\x9B\x00\x1A\x87\xD1\x48\xF4\xD9\x2B\xEF\x34\x52\x5C\xCC\xE3\xA6\x3C\x65\x12\xA6\xF5\x75\x73\x88\xE4\x91\x3E\xF1\x47\x01\xF4\x41"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #9 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0B\x0A\x09\x08\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 25, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 35, + 
"\x5D\xB0\x8D\x62\x40\x7E\x6E\x31\xD6\x0F\x9C\xA2\xC6\x04\x74\x21\x9A\xC0\xBE\x50\xC0\xD4\xA5\x77\x87\x94\xD6\xE2\x30\xCD\x25\xC9\xFE\xBF\x87"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #10 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0C\x0B\x0A\x09\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 19, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 29, + "\xDB\x11\x8C\xCE\xC1\xB8\x76\x1C\x87\x7C\xD8\x96\x3A\x67\xD6\xF3\xBB\xBC\x5C\xD0\x92\x99\xEB\x11\xF3\x12\xF2\x32\x37"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #11 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0D\x0C\x0B\x0A\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 20, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 30, + "\x7C\xC8\x3D\x8D\xC4\x91\x03\x52\x5B\x48\x3D\xC5\xCA\x7E\xA9\xAB\x81\x2B\x70\x56\x07\x9D\xAF\xFA\xDA\x16\xCC\xCF\x2C\x4E"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #12 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0E\x0D\x0C\x0B\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 21, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 31, + "\x2C\xD3\x5B\x88\x20\xD2\x3E\x7A\xA3\x51\xB0\xE9\x2F\xC7\x93\x67\x23\x8B\x2C\xC7\x48\xCB\xB9\x4C\x29\x47\x79\x3D\x64\xAF\x75"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #13 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\xA9\x70\x11\x0E\x19\x27\xB1\x60\xB6\xA3\x1C\x1C", + 8, "\x6B\x7F\x46\x45\x07\xFA\xE4\x96", + 23, + "\xC6\xB5\xF3\xE6\xCA\x23\x11\xAE\xF7\x47\x2B\x20\x3E\x73\x5E\xA5\x61\xAD\xB1\x7D\x56\xC5\xA3", + 31, + 
"\xA4\x35\xD7\x27\x34\x8D\xDD\x22\x90\x7F\x7E\xB8\xF5\xFD\xBB\x4D\x93\x9D\xA6\x52\x4D\xB4\xF6\x45\x58\xC0\x2D\x25\xB1\x27\xEE"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #14 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\x83\xCD\x8C\xE0\xCB\x42\xB1\x60\xB6\xA3\x1C\x1C", + 8, "\x98\x66\x05\xB4\x3D\xF1\x5D\xE7", + 24, + "\x01\xF6\xCE\x67\x64\xC5\x74\x48\x3B\xB0\x2E\x6B\xBF\x1E\x0A\xBD\x26\xA2\x25\x72\xB4\xD8\x0E\xE7", + 32, + "\x8A\xE0\x52\x50\x8F\xBE\xCA\x93\x2E\x34\x6F\x05\xE0\xDC\x0D\xFB\xCF\x93\x9E\xAF\xFA\x3E\x58\x7C\x86\x7D\x6E\x1C\x48\x70\x38\x06"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #15 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\x5F\x54\x95\x0B\x18\xF2\xB1\x60\xB6\xA3\x1C\x1C", + 8, "\x48\xF2\xE7\xE1\xA7\x67\x1A\x51", + 25, + "\xCD\xF1\xD8\x40\x6F\xC2\xE9\x01\x49\x53\x89\x70\x05\xFB\xFB\x8B\xA5\x72\x76\xF9\x24\x04\x60\x8E\x08", + 33, + "\x08\xB6\x7E\xE2\x1C\x8B\xF2\x6E\x47\x3E\x40\x85\x99\xE9\xC0\x83\x6D\x6A\xF0\xBB\x18\xDF\x55\x46\x6C\xA8\x08\x78\xA7\x90\x47\x6D\xE5"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #16 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\xEC\x60\x08\x63\x31\x9A\xB1\x60\xB6\xA3\x1C\x1C", + 12, "\xDE\x97\xDF\x3B\x8C\xBD\x6D\x8E\x50\x30\xDA\x4C", + 19, + "\xB0\x05\xDC\xFA\x0B\x59\x18\x14\x26\xA9\x61\x68\x5A\x99\x3D\x8C\x43\x18\x5B", + 27, + "\x63\xB7\x8B\x49\x67\xB1\x9E\xDB\xB7\x33\xCD\x11\x14\xF6\x4E\xB2\x26\x08\x93\x68\xC3\x54\x82\x8D\x95\x0C\xC5"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #17 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\x60\xCF\xF1\xA3\x1E\xA1\xB1\x60\xB6\xA3\x1C\x1C", + 12, "\xA5\xEE\x93\xE4\x57\xDF\x05\x46\x6E\x78\x2D\xCF", + 20, + "\x2E\x20\x21\x12\x98\x10\x5F\x12\x9D\x5E\xD9\x5B\x93\xF7\x2D\x30\xB2\xFA\xCC\xD7", + 28, + 
"\x0B\xC6\xBB\xE2\xA8\xB9\x09\xF4\x62\x9E\xE6\xDC\x14\x8D\xA4\x44\x10\xE1\x8A\xF4\x31\x47\x38\x32\x76\xF6\x6A\x9F"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #18 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\x0F\x85\xCD\x99\x5C\x97\xB1\x60\xB6\xA3\x1C\x1C", + 12, "\x24\xAA\x1B\xF9\xA5\xCD\x87\x61\x82\xA2\x50\x74", + 21, + "\x26\x45\x94\x1E\x75\x63\x2D\x34\x91\xAF\x0F\xC0\xC9\x87\x6C\x3B\xE4\xAA\x74\x68\xC9", + 29, + "\x22\x2A\xD6\x32\xFA\x31\xD6\xAF\x97\x0C\x34\x5F\x7E\x77\xCA\x3B\xD0\xDC\x25\xB3\x40\xA1\xA3\xD3\x1F\x8D\x4B\x44\xB7"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #19 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\xC2\x9B\x2C\xAA\xC4\xCD\xB1\x60\xB6\xA3\x1C\x1C", + 8, "\x69\x19\x46\xB9\xCA\x07\xBE\x87", + 23, + "\x07\x01\x35\xA6\x43\x7C\x9D\xB1\x20\xCD\x61\xD8\xF6\xC3\x9C\x3E\xA1\x25\xFD\x95\xA0\xD2\x3D", + 33, + "\x05\xB8\xE1\xB9\xC4\x9C\xFD\x56\xCF\x13\x0A\xA6\x25\x1D\xC2\xEC\xC0\x6C\xCC\x50\x8F\xE6\x97\xA0\x06\x6D\x57\xC8\x4B\xEC\x18\x27\x68"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #20 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\x2C\x6B\x75\x95\xEE\x62\xB1\x60\xB6\xA3\x1C\x1C", + 8, "\xD0\xC5\x4E\xCB\x84\x62\x7D\xC4", + 24, + "\xC8\xC0\x88\x0E\x6C\x63\x6E\x20\x09\x3D\xD6\x59\x42\x17\xD2\xE1\x88\x77\xDB\x26\x4E\x71\xA5\xCC", + 34, + "\x54\xCE\xB9\x68\xDE\xE2\x36\x11\x57\x5E\xC0\x03\xDF\xAA\x1C\xD4\x88\x49\xBD\xF5\xAE\x2E\xDB\x6B\x7F\xA7\x75\xB1\x50\xED\x43\x83\xC5\xA9"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #21 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\xC5\x3C\xD4\xC2\xAA\x24\xB1\x60\xB6\xA3\x1C\x1C", + 8, "\xE2\x85\xE0\xE4\x80\x8C\xDA\x3D", + 25, + "\xF7\x5D\xAA\x07\x10\xC4\xE6\x42\x97\x79\x4D\xC2\xB7\xD2\xA2\x07\x57\xB1\xAA\x4E\x44\x80\x02\xFF\xAB", + 35, + 
"\xB1\x40\x45\x46\xBF\x66\x72\x10\xCA\x28\xE3\x09\xB3\x9B\xD6\xCA\x7E\x9F\xC8\x28\x5F\xE6\x98\xD4\x3C\xD2\x0A\x02\xE0\xBD\xCA\xED\x20\x10\xD3"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #22 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\xBE\xE9\x26\x7F\xBA\xDC\xB1\x60\xB6\xA3\x1C\x1C", + 12, "\x6C\xAE\xF9\x94\x11\x41\x57\x0D\x7C\x81\x34\x05", + 19, + "\xC2\x38\x82\x2F\xAC\x5F\x98\xFF\x92\x94\x05\xB0\xAD\x12\x7A\x4E\x41\x85\x4E", + 29, + "\x94\xC8\x95\x9C\x11\x56\x9A\x29\x78\x31\xA7\x21\x00\x58\x57\xAB\x61\xB8\x7A\x2D\xEA\x09\x36\xB6\xEB\x5F\x62\x5F\x5D"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #23 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\xDF\xA8\xB1\x24\x50\x07\xB1\x60\xB6\xA3\x1C\x1C", + 12, "\x36\xA5\x2C\xF1\x6B\x19\xA2\x03\x7A\xB7\x01\x1E", + 20, + "\x4D\xBF\x3E\x77\x4A\xD2\x45\xE5\xD5\x89\x1F\x9D\x1C\x32\xA0\xAE\x02\x2C\x85\xD7", + 30, + "\x58\x69\xE3\xAA\xD2\x44\x7C\x74\xE0\xFC\x05\xF9\xA4\xEA\x74\x57\x7F\x4D\xE8\xCA\x89\x24\x76\x42\x96\xAD\x04\x11\x9C\xE7"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #24 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\x3B\x8F\xD8\xD3\xA9\x37\xB1\x60\xB6\xA3\x1C\x1C", + 12, "\xA4\xD4\x99\xF7\x84\x19\x72\x8C\x19\x17\x8B\x0C", + 21, + "\x9D\xC9\xED\xAE\x2F\xF5\xDF\x86\x36\xE8\xC6\xDE\x0E\xED\x55\xF7\x86\x7E\x33\x33\x7D", + 31, + "\x4B\x19\x81\x56\x39\x3B\x0F\x77\x96\x08\x6A\xAF\xB4\x54\xF8\xC3\xF0\x34\xCC\xA9\x66\x94\x5F\x1F\xCE\xA7\xE1\x1B\xEE\x6A\x2F"} + }; + static const int cut[] = { 0, 1, 8, 10, 16, 19, -1 }; + gcry_cipher_hd_t hde, hdd; + unsigned char out[MAX_DATA_LEN]; + size_t ctl_params[3]; + int split, aadsplit; + size_t j, i, keylen, blklen, authlen; + gcry_error_t err = 0; + + if (verbose) + fprintf (stderr, " Starting CCM checks.\n"); + + for (i = 0; i < sizeof (tv) / sizeof (tv[0]); i++) + { + if (verbose) + fprintf (stderr, " checking CCM mode for %s [%i]\n", + 
gcry_cipher_algo_name (tv[i].algo), + tv[i].algo); + + for (j = 0; j < sizeof (cut) / sizeof (cut[0]); j++) + { + split = cut[j] < 0 ? tv[i].plainlen : cut[j]; + if (tv[i].plainlen < split) + continue; + + err = gcry_cipher_open (&hde, tv[i].algo, GCRY_CIPHER_MODE_CCM, 0); + if (!err) + err = gcry_cipher_open (&hdd, tv[i].algo, GCRY_CIPHER_MODE_CCM, 0); + if (err) + { + fail ("cipher-ccm, gcry_cipher_open failed: %s\n", + gpg_strerror (err)); + return; + } + + keylen = gcry_cipher_get_algo_keylen(tv[i].algo); + if (!keylen) + { + fail ("cipher-ccm, gcry_cipher_get_algo_keylen failed\n"); + return; + } + + err = gcry_cipher_setkey (hde, tv[i].key, keylen); + if (!err) + err = gcry_cipher_setkey (hdd, tv[i].key, keylen); + if (err) + { + fail ("cipher-ccm, gcry_cipher_setkey failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + blklen = gcry_cipher_get_algo_blklen(tv[i].algo); + if (!blklen) + { + fail ("cipher-ccm, gcry_cipher_get_algo_blklen failed\n"); + return; + } + + err = gcry_cipher_setiv (hde, tv[i].nonce, tv[i].noncelen); + if (!err) + err = gcry_cipher_setiv (hdd, tv[i].nonce, tv[i].noncelen); + if (err) + { + fail ("cipher-ccm, gcry_cipher_setiv failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + authlen = tv[i].cipherlen - tv[i].plainlen; + ctl_params[0] = tv[i].plainlen; /* encryptedlen */ + ctl_params[1] = tv[i].aadlen; /* aadlen */ + ctl_params[2] = authlen; /* authtaglen */ + err = gcry_cipher_ctl (hde, GCRYCTL_SET_CCM_LENGTHS, ctl_params, + sizeof(ctl_params)); + if (!err) + err = gcry_cipher_ctl (hdd, GCRYCTL_SET_CCM_LENGTHS, ctl_params, + sizeof(ctl_params)); + if (err) + { + fail ("cipher-ccm, gcry_cipher_ctl GCRYCTL_SET_CCM_LENGTHS " + "failed: %s\n", gpg_strerror (err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + aadsplit = split > tv[i].aadlen ? 
0 : split; + + err = gcry_cipher_authenticate (hde, tv[i].aad, + tv[i].aadlen - aadsplit); + if (!err) + err = gcry_cipher_authenticate (hde, + &tv[i].aad[tv[i].aadlen - aadsplit], + aadsplit); + if (!err) + err = gcry_cipher_authenticate (hdd, tv[i].aad, + tv[i].aadlen - aadsplit); + if (!err) + err = gcry_cipher_authenticate (hdd, + &tv[i].aad[tv[i].aadlen - aadsplit], + aadsplit); + if (err) + { + fail ("cipher-ccm, gcry_cipher_authenticate failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + err = gcry_cipher_encrypt (hde, out, MAX_DATA_LEN, tv[i].plaintext, + tv[i].plainlen - split); + if (!err) + err = gcry_cipher_encrypt (hde, &out[tv[i].plainlen - split], + MAX_DATA_LEN - (tv[i].plainlen - split), + &tv[i].plaintext[tv[i].plainlen - split], + split); + if (err) + { + fail ("cipher-ccm, gcry_cipher_encrypt (%d:%d) failed: %s\n", + i, j, gpg_strerror (err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + err = gcry_cipher_gettag (hde, &out[tv[i].plainlen], authlen); + if (err) + { + fail ("cipher-ccm, gcry_cipher_gettag (%d:%d) failed: %s\n", + i, j, gpg_strerror (err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + if (memcmp (tv[i].ciphertext, out, tv[i].cipherlen)) + fail ("cipher-ccm, encrypt mismatch entry %d:%d\n", i, j); + + err = gcry_cipher_decrypt (hdd, out, tv[i].plainlen - split, NULL, 0); + if (!err) + err = gcry_cipher_decrypt (hdd, &out[tv[i].plainlen - split], split, + NULL, 0); + if (err) + { + fail ("cipher-ccm, gcry_cipher_decrypt (%d:%d) failed: %s\n", + i, j, gpg_strerror (err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + if (memcmp (tv[i].plaintext, out, tv[i].plainlen)) + fail ("cipher-ccm, decrypt mismatch entry %d:%d\n", i, j); + + err = gcry_cipher_checktag (hdd, &out[tv[i].plainlen], authlen); + if (err) + { + fail ("cipher-ccm, gcry_cipher_checktag (%d:%d) failed: %s\n", + i, j, gpg_strerror 
(err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + } + } + + /* Large buffer tests. */ + + /* Test encoding of aadlen > 0xfeff. */ + { + static const char key[]={0x40,0x41,0x42,0x43,0x44,0x45,0x46,0x47, + 0x48,0x49,0x4a,0x4b,0x4c,0x4d,0x4e,0x4f}; + static const char iv[]={0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,0x18,0x19}; + static const char tag[]={0x9C,0x76,0xE7,0x33,0xD5,0x15,0xB3,0x6C, + 0xBA,0x76,0x95,0xF7,0xFB,0x91}; + char buf[1024]; + size_t enclen = 0x20000; + size_t aadlen = 0x20000; + size_t taglen = sizeof(tag); + + err = gcry_cipher_open (&hde, GCRY_CIPHER_AES, GCRY_CIPHER_MODE_CCM, 0); + if (err) + { + fail ("cipher-ccm-large, gcry_cipher_open failed: %s\n", + gpg_strerror (err)); + return; + } + + err = gcry_cipher_setkey (hde, key, sizeof (key)); + if (err) + { + fail ("cipher-ccm-large, gcry_cipher_setkey failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + + err = gcry_cipher_setiv (hde, iv, sizeof (iv)); + if (err) + { + fail ("cipher-ccm-large, gcry_cipher_setiv failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + + ctl_params[0] = enclen; /* encryptedlen */ + ctl_params[1] = aadlen; /* aadlen */ + ctl_params[2] = taglen; /* authtaglen */ + err = gcry_cipher_ctl (hde, GCRYCTL_SET_CCM_LENGTHS, ctl_params, + sizeof(ctl_params)); + if (err) + { + fail ("cipher-ccm-large, gcry_cipher_ctl GCRYCTL_SET_CCM_LENGTHS " + "failed: %s\n", gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + + memset (buf, 0xaa, sizeof(buf)); + + for (i = 0; i < aadlen; i += sizeof(buf)) + { + err = gcry_cipher_authenticate (hde, buf, sizeof (buf)); + if (err) + { + fail ("cipher-ccm-large, gcry_cipher_authenticate failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + } + + for (i = 0; i < enclen; i += sizeof(buf)) + { + memset (buf, 0xee, sizeof(buf)); + err = gcry_cipher_encrypt (hde, buf, sizeof 
(buf), NULL, 0); + if (err) + { + fail ("cipher-ccm-large, gcry_cipher_encrypt failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + } + + err = gcry_cipher_gettag (hde, buf, taglen); + if (err) + { + fail ("cipher-ccm-large, gcry_cipher_gettag failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + + if (memcmp (buf, tag, taglen) != 0) + fail ("cipher-ccm-large, encrypt mismatch entry\n"); + } + +#if 0 + /* Test encoding of aadlen > 0xffffffff. */ + { + static const char key[]={0x40,0x41,0x42,0x43,0x44,0x45,0x46,0x47, + 0x48,0x49,0x4a,0x4b,0x4c,0x4d,0x4e,0x4f}; + static const char iv[]={0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,0x18,0x19}; + static const char tag[]={0x01,0xB2,0xC3,0x4A,0xA6,0x6A,0x07,0x6D, + 0xBC,0xBD,0xEA,0x17,0xD3,0x73,0xD7,0xD4}; + char buf[1024]; + size_t enclen = (size_t)0xffffffff + 1 + 1024; + size_t aadlen = (size_t)0xffffffff + 1 + 1024; + size_t taglen = sizeof(tag); + + err = gcry_cipher_open (&hde, GCRY_CIPHER_AES, GCRY_CIPHER_MODE_CCM, 0); + if (err) + { + fail ("cipher-ccm-huge, gcry_cipher_open failed: %s\n", + gpg_strerror (err)); + return; + } + + err = gcry_cipher_setkey (hde, key, sizeof (key)); + if (err) + { + fail ("cipher-ccm-huge, gcry_cipher_setkey failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + + err = gcry_cipher_setiv (hde, iv, sizeof (iv)); + if (err) + { + fail ("cipher-ccm-huge, gcry_cipher_setiv failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + + ctl_params[0] = enclen; /* encryptedlen */ + ctl_params[1] = aadlen; /* aadlen */ + ctl_params[2] = taglen; /* authtaglen */ + err = gcry_cipher_ctl (hde, GCRYCTL_SET_CCM_LENGTHS, ctl_params, + sizeof(ctl_params)); + if (err) + { + fail ("cipher-ccm-huge, gcry_cipher_ctl GCRYCTL_SET_CCM_LENGTHS failed:" + "%s\n", gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + + memset (buf, 0xaa, sizeof(buf)); + + for (i = 0; i < aadlen; i += sizeof(buf)) + { + 
err = gcry_cipher_authenticate (hde, buf, sizeof (buf)); + if (err) + { + fail ("cipher-ccm-huge, gcry_cipher_authenticate failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + } + + for (i = 0; i < enclen; i += sizeof(buf)) + { + memset (buf, 0xee, sizeof(buf)); + err = gcry_cipher_encrypt (hde, buf, sizeof (buf), NULL, 0); + if (err) + { + fail ("cipher-ccm-huge, gcry_cipher_encrypt failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + } + + err = gcry_cipher_gettag (hde, buf, taglen); + if (err) + { + fail ("cipher-ccm-huge, gcry_cipher_gettag failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + + if (memcmp (buf, tag, taglen) != 0) + fail ("cipher-ccm-huge, encrypt mismatch entry\n"); + } +#endif + + if (verbose) + fprintf (stderr, " Completed CCM checks.\n"); +} + + +static void check_stream_cipher (void) { struct tv @@ -2455,6 +3225,7 @@ check_cipher_modes(void) check_ctr_cipher (); check_cfb_cipher (); check_ofb_cipher (); + check_ccm_cipher (); check_stream_cipher (); check_stream_cipher_large_block (); diff --git a/tests/benchmark.c b/tests/benchmark.c index ecda0d3..d3ef1a2 100644 --- a/tests/benchmark.c +++ b/tests/benchmark.c @@ -435,6 +435,40 @@ md_bench ( const char *algoname ) fflush (stdout); } + +static void ccm_aead_init(gcry_cipher_hd_t hd, size_t buflen, int authlen) +{ + const int _L = 4; + const int noncelen = 15 - _L; + char nonce[noncelen]; + size_t params[3]; + gcry_error_t err = GPG_ERR_NO_ERROR; + + memset (nonce, 0x33, noncelen); + + err = gcry_cipher_setiv (hd, nonce, noncelen); + if (err) + { + fprintf (stderr, "gcry_cipher_setiv failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } + + params[0] = buflen; /* encryptedlen */ + params[1] = 0; /* aadlen */ + params[2] = authlen; /* authtaglen */ + err = gcry_cipher_ctl (hd, GCRYCTL_SET_CCM_LENGTHS, params, sizeof(params)); + if (err) + { + fprintf (stderr, "gcry_cipher_setiv 
failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } +} + + static void cipher_bench ( const char *algoname ) { @@ -448,12 +482,21 @@ cipher_bench ( const char *algoname ) char *raw_outbuf, *raw_buf; size_t allocated_buflen, buflen; int repetitions; - static struct { int mode; const char *name; int blocked; } modes[] = { + static const struct { + int mode; + const char *name; + int blocked; + void (* const aead_init)(gcry_cipher_hd_t hd, size_t buflen, int authlen); + int req_blocksize; + int authlen; + } modes[] = { { GCRY_CIPHER_MODE_ECB, " ECB/Stream", 1 }, { GCRY_CIPHER_MODE_CBC, " CBC", 1 }, { GCRY_CIPHER_MODE_CFB, " CFB", 0 }, { GCRY_CIPHER_MODE_OFB, " OFB", 0 }, { GCRY_CIPHER_MODE_CTR, " CTR", 0 }, + { GCRY_CIPHER_MODE_CCM, " CCM", 0, + ccm_aead_init, GCRY_CCM_BLOCK_LEN, 8 }, { GCRY_CIPHER_MODE_STREAM, "", 0 }, {0} }; @@ -542,9 +585,16 @@ cipher_bench ( const char *algoname ) for (modeidx=0; modes[modeidx].mode; modeidx++) { if ((blklen > 1 && modes[modeidx].mode == GCRY_CIPHER_MODE_STREAM) - | (blklen == 1 && modes[modeidx].mode != GCRY_CIPHER_MODE_STREAM)) + || (blklen == 1 && modes[modeidx].mode != GCRY_CIPHER_MODE_STREAM)) continue; + if (modes[modeidx].req_blocksize > 0 + && blklen != modes[modeidx].req_blocksize) + { + printf (" %7s %7s", "-", "-" ); + continue; + } + for (i=0; i < sizeof buf; i++) buf[i] = i; @@ -585,7 +635,18 @@ cipher_bench ( const char *algoname ) exit (1); } } - err = gcry_cipher_encrypt ( hd, outbuf, buflen, buf, buflen); + if (modes[modeidx].aead_init) + { + (*modes[modeidx].aead_init) (hd, buflen, modes[modeidx].authlen); + err = gcry_cipher_encrypt (hd, outbuf, buflen, buf, buflen); + if (err) + break; + err = gcry_cipher_gettag (hd, outbuf, modes[modeidx].authlen); + } + else + { + err = gcry_cipher_encrypt (hd, outbuf, buflen, buf, buflen); + } } stop_timer (); @@ -632,7 +693,18 @@ cipher_bench ( const char *algoname ) exit (1); } } - err = gcry_cipher_decrypt ( hd, outbuf, buflen, buf, buflen); + 
if (modes[modeidx].aead_init) + { + (*modes[modeidx].aead_init) (hd, buflen, modes[modeidx].authlen); + err = gcry_cipher_decrypt (hd, outbuf, buflen, buf, buflen); + if (err) + break; + err = gcry_cipher_checktag (hd, outbuf, modes[modeidx].authlen); + if (gpg_err_code (err) == GPG_ERR_CHECKSUM) + err = gpg_error (GPG_ERR_NO_ERROR); + } + else + err = gcry_cipher_decrypt (hd, outbuf, buflen, buf, buflen); } stop_timer (); printf (" %s", elapsed_time ()); From jussi.kivilinna at iki.fi Tue Oct 22 15:31:56 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Tue, 22 Oct 2013 16:31:56 +0300 Subject: [PATCH] bufhelp: enable unaligned memory accesses for AArch64 (64-bit ARM) Message-ID: <20131022133156.27715.19936.stgit@localhost6.localdomain6> * cipher/bufhelp.h [__aarch64__] (BUFHELP_FAST_UNALIGNED_ACCESS): Set macro on AArch64. -- Signed-off-by: Jussi Kivilinna --- cipher/bufhelp.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/cipher/bufhelp.h b/cipher/bufhelp.h index 21fac07..198d286 100644 --- a/cipher/bufhelp.h +++ b/cipher/bufhelp.h @@ -35,7 +35,8 @@ #if defined(__i386__) || defined(__x86_64__) || \ defined(__powerpc__) || defined(__powerpc64__) || \ - (defined(__arm__) && defined(__ARM_FEATURE_UNALIGNED)) + (defined(__arm__) && defined(__ARM_FEATURE_UNALIGNED)) || \ + defined(__aarch64__) /* These architectures are able of unaligned memory accesses and can handle those fast. */ From jussi.kivilinna at iki.fi Tue Oct 22 15:49:23 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Tue, 22 Oct 2013 16:49:23 +0300 Subject: [PATCH 1/2] [v3] Add API to support AEAD cipher modes In-Reply-To: <20131022123202.6563.15226.stgit@localhost6.localdomain6> References: <20131022123202.6563.15226.stgit@localhost6.localdomain6> Message-ID: <52668263.9000101@iki.fi> Do these look ok to push? -Jussi On 22.10.2013 15:32, Jussi Kivilinna wrote: > * cipher/cipher.c (_gcry_cipher_authenticate, _gcry_cipher_checktag) > (_gcry_cipher_gettag): New. 
> * doc/gcrypt.texi: Add documentation for the new API functions.
> * src/visibility.c (gcry_cipher_authenticate, gcry_cipher_checktag)
> (gcry_cipher_gettag): New.
> * src/gcrypt.h.in, src/visibility.h: Add declarations of these
> functions.
> * src/libgcrypt.def, src/libgcrypt.vers: Export the functions.
> --
>
> Authenticated Encryption with Associated Data (AEAD) cipher modes
> provide an authentication tag that can be used to authenticate the
> message. At the same time they allow one to specify additional
> (unencrypted) data that will be authenticated together with the
> message. This class of cipher modes requires the additional API
> added in this commit.
>
> This patch is based on an original patch by Dmitry Eremin-Solenikov.
>
> Changes in v2:
> - Change gcry_cipher_tag to gcry_cipher_checktag and gcry_cipher_gettag
> for giving the tag (checktag) for decryption and reading the tag
> (gettag) after encryption.
> - Change gcry_cipher_authenticate to gcry_cipher_setaad, since
> additional parameters are needed for some AEAD modes (in this case
> CCM, which needs the length of the encrypted data and tag for MAC
> initialization).
> - Add some documentation.
>
> Changes in v3:
> - Change gcry_cipher_setaad back to gcry_cipher_authenticate. Additional
> parameters (encrypt_len, tag_len, aad_len) for CCM will be given
> through GCRYCTL_SET_CCM_LENGTHS.
> > Signed-off-by: Jussi Kivilinna > --- > cipher/cipher.c | 34 ++++++++++++++++++++++++++++++++++ > doc/gcrypt.texi | 35 +++++++++++++++++++++++++++++++++++ > src/gcrypt.h.in | 11 +++++++++++ > src/libgcrypt.def | 3 +++ > src/libgcrypt.vers | 1 + > src/visibility.c | 27 +++++++++++++++++++++++++++ > src/visibility.h | 9 +++++++++ > 7 files changed, 120 insertions(+) > > diff --git a/cipher/cipher.c b/cipher/cipher.c > index 75d42d1..36c79db 100644 > --- a/cipher/cipher.c > +++ b/cipher/cipher.c > @@ -910,6 +910,40 @@ _gcry_cipher_setctr (gcry_cipher_hd_t hd, const void *ctr, size_t ctrlen) > return 0; > } > > +gcry_error_t > +_gcry_cipher_authenticate (gcry_cipher_hd_t hd, const void *abuf, > + size_t abuflen) > +{ > + log_fatal ("gcry_cipher_authenticate: invalid mode %d\n", hd->mode); > + > + (void)abuf; > + (void)abuflen; > + > + return gpg_error (GPG_ERR_INV_CIPHER_MODE); > +} > + > +gcry_error_t > +_gcry_cipher_gettag (gcry_cipher_hd_t hd, void *outtag, size_t taglen) > +{ > + log_fatal ("gcry_cipher_gettag: invalid mode %d\n", hd->mode); > + > + (void)outtag; > + (void)taglen; > + > + return gpg_error (GPG_ERR_INV_CIPHER_MODE); > +} > + > +gcry_error_t > +_gcry_cipher_checktag (gcry_cipher_hd_t hd, const void *intag, size_t taglen) > +{ > + log_fatal ("gcry_cipher_checktag: invalid mode %d\n", hd->mode); > + > + (void)intag; > + (void)taglen; > + > + return gpg_error (GPG_ERR_INV_CIPHER_MODE); > +} > + > > gcry_error_t > gcry_cipher_ctl( gcry_cipher_hd_t h, int cmd, void *buffer, size_t buflen) > diff --git a/doc/gcrypt.texi b/doc/gcrypt.texi > index 473c484..0049fa0 100644 > --- a/doc/gcrypt.texi > +++ b/doc/gcrypt.texi > @@ -1731,6 +1731,10 @@ matches the requirement of the selected algorithm and mode. > This function is also used with the Salsa20 stream cipher to set or > update the required nonce. In this case it needs to be called after > setting the key. 
> +
> +This function is also used with the AEAD cipher modes to set or
> +update the required nonce.
> +
> @end deftypefun
>
> @deftypefun gcry_error_t gcry_cipher_setctr (gcry_cipher_hd_t @var{h}, const void *@var{c}, size_t @var{l})
> @@ -1750,6 +1754,37 @@ call to gcry_cipher_setkey and clear the initialization vector.
> Note that gcry_cipher_reset is implemented as a macro.
> @end deftypefun
>
> +Authenticated Encryption with Associated Data (AEAD) block cipher
> +modes require the handling of the authentication tag and the
> +additional authenticated data, which can be done by using the
> +following functions:
> +
> +@deftypefun gcry_error_t gcry_cipher_authenticate (gcry_cipher_hd_t @var{h}, const void *@var{abuf}, size_t @var{abuflen})
> +
> +Process the buffer @var{abuf} of length @var{abuflen} as the additional
> +authenticated data (AAD) for AEAD cipher modes.
> +
> +@end deftypefun
> +
> +@deftypefun gcry_error_t gcry_cipher_gettag (gcry_cipher_hd_t @var{h}, void *@var{tag}, size_t @var{taglen})
> +
> +This function is used to read the authentication tag after encryption.
> +The function finalizes and outputs the authentication tag to the buffer
> +@var{tag} of length @var{taglen} bytes.
> +
> +@end deftypefun
> +
> +@deftypefun gcry_error_t gcry_cipher_checktag (gcry_cipher_hd_t @var{h}, const void *@var{tag}, size_t @var{taglen})
> +
> +Check the authentication tag after decryption.  The authentication
> +tag is passed as the buffer @var{tag} of length @var{taglen} bytes
> +and compared to the internal authentication tag computed during
> +decryption.  The error code @code{GPG_ERR_CHECKSUM} is returned if
> +the authentication tag in the buffer @var{tag} does not match
> +the authentication tag calculated during decryption.
> +
> +@end deftypefun
> +
> The actual encryption and decryption is done by using one of the
> following functions. They may be used as often as required to process
> all the data.
> diff --git a/src/gcrypt.h.in b/src/gcrypt.h.in > index 64cc0e4..f0ae927 100644 > --- a/src/gcrypt.h.in > +++ b/src/gcrypt.h.in > @@ -953,6 +953,17 @@ gcry_error_t gcry_cipher_setkey (gcry_cipher_hd_t hd, > gcry_error_t gcry_cipher_setiv (gcry_cipher_hd_t hd, > const void *iv, size_t ivlen); > > +/* Provide additional authentication data for AEAD modes/ciphers. */ > +gcry_error_t gcry_cipher_authenticate (gcry_cipher_hd_t hd, const void *abuf, > + size_t abuflen); > + > +/* Get authentication tag for AEAD modes/ciphers. */ > +gcry_error_t gcry_cipher_gettag (gcry_cipher_hd_t hd, void *outtag, > + size_t taglen); > + > +/* Check authentication tag for AEAD modes/ciphers. */ > +gcry_error_t gcry_cipher_checktag (gcry_cipher_hd_t hd, const void *intag, > + size_t taglen); > > /* Reset the handle to the state after open. */ > #define gcry_cipher_reset(h) gcry_cipher_ctl ((h), GCRYCTL_RESET, NULL, 0) > diff --git a/src/libgcrypt.def b/src/libgcrypt.def > index ec0c1e3..64ba370 100644 > --- a/src/libgcrypt.def > +++ b/src/libgcrypt.def > @@ -255,6 +255,9 @@ EXPORTS > > gcry_sexp_extract_param @225 > > + gcry_cipher_authenticate @226 > + gcry_cipher_gettag @227 > + gcry_cipher_checktag @228 > > > ;; end of file with public symbols for Windows. 
> diff --git a/src/libgcrypt.vers b/src/libgcrypt.vers > index be72aad..93eaa93 100644 > --- a/src/libgcrypt.vers > +++ b/src/libgcrypt.vers > @@ -51,6 +51,7 @@ GCRYPT_1.6 { > gcry_cipher_info; gcry_cipher_map_name; > gcry_cipher_mode_from_oid; gcry_cipher_open; > gcry_cipher_setkey; gcry_cipher_setiv; gcry_cipher_setctr; > + gcry_cipher_authenticate; gcry_cipher_gettag; gcry_cipher_checktag; > > gcry_pk_algo_info; gcry_pk_algo_name; gcry_pk_ctl; > gcry_pk_decrypt; gcry_pk_encrypt; gcry_pk_genkey; > diff --git a/src/visibility.c b/src/visibility.c > index 848925e..1f7bb3a 100644 > --- a/src/visibility.c > +++ b/src/visibility.c > @@ -713,6 +713,33 @@ gcry_cipher_setctr (gcry_cipher_hd_t hd, const void *ctr, size_t ctrlen) > return _gcry_cipher_setctr (hd, ctr, ctrlen); > } > > +gcry_error_t > +gcry_cipher_authenticate (gcry_cipher_hd_t hd, const void *abuf, size_t abuflen) > +{ > + if (!fips_is_operational ()) > + return gpg_error (fips_not_operational ()); > + > + return _gcry_cipher_authenticate (hd, abuf, abuflen); > +} > + > +gcry_error_t > +gcry_cipher_gettag (gcry_cipher_hd_t hd, void *outtag, size_t taglen) > +{ > + if (!fips_is_operational ()) > + return gpg_error (fips_not_operational ()); > + > + return _gcry_cipher_gettag (hd, outtag, taglen); > +} > + > +gcry_error_t > +gcry_cipher_checktag (gcry_cipher_hd_t hd, const void *intag, size_t taglen) > +{ > + if (!fips_is_operational ()) > + return gpg_error (fips_not_operational ()); > + > + return _gcry_cipher_checktag (hd, intag, taglen); > +} > + > > gcry_error_t > gcry_cipher_ctl (gcry_cipher_hd_t h, int cmd, void *buffer, size_t buflen) > diff --git a/src/visibility.h b/src/visibility.h > index 1c8f047..b2fa4c0 100644 > --- a/src/visibility.h > +++ b/src/visibility.h > @@ -81,6 +81,9 @@ > #define gcry_cipher_setkey _gcry_cipher_setkey > #define gcry_cipher_setiv _gcry_cipher_setiv > #define gcry_cipher_setctr _gcry_cipher_setctr > +#define gcry_cipher_authenticate _gcry_cipher_authenticate > +#define 
gcry_cipher_checktag _gcry_cipher_checktag > +#define gcry_cipher_gettag _gcry_cipher_gettag > #define gcry_cipher_ctl _gcry_cipher_ctl > #define gcry_cipher_decrypt _gcry_cipher_decrypt > #define gcry_cipher_encrypt _gcry_cipher_encrypt > @@ -297,6 +300,9 @@ gcry_err_code_t gcry_md_get (gcry_md_hd_t hd, int algo, > #undef gcry_cipher_setkey > #undef gcry_cipher_setiv > #undef gcry_cipher_setctr > +#undef gcry_cipher_authenticate > +#undef gcry_cipher_checktag > +#undef gcry_cipher_gettag > #undef gcry_cipher_ctl > #undef gcry_cipher_decrypt > #undef gcry_cipher_encrypt > @@ -474,6 +480,9 @@ MARK_VISIBLE (gcry_cipher_close) > MARK_VISIBLE (gcry_cipher_setkey) > MARK_VISIBLE (gcry_cipher_setiv) > MARK_VISIBLE (gcry_cipher_setctr) > +MARK_VISIBLE (gcry_cipher_authenticate) > +MARK_VISIBLE (gcry_cipher_checktag) > +MARK_VISIBLE (gcry_cipher_gettag) > MARK_VISIBLE (gcry_cipher_ctl) > MARK_VISIBLE (gcry_cipher_decrypt) > MARK_VISIBLE (gcry_cipher_encrypt) > > > _______________________________________________ > Gcrypt-devel mailing list > Gcrypt-devel at gnupg.org > http://lists.gnupg.org/mailman/listinfo/gcrypt-devel > From wk at gnupg.org Tue Oct 22 16:31:13 2013 From: wk at gnupg.org (Werner Koch) Date: Tue, 22 Oct 2013 16:31:13 +0200 Subject: [PATCH 1/2] [v3] Add API to support AEAD cipher modes In-Reply-To: <52668263.9000101@iki.fi> (Jussi Kivilinna's message of "Tue, 22 Oct 2013 16:49:23 +0300") References: <20131022123202.6563.15226.stgit@localhost6.localdomain6> <52668263.9000101@iki.fi> Message-ID: <87fvrt8jim.fsf@vigenere.g10code.de> On Tue, 22 Oct 2013 15:49, jussi.kivilinna at iki.fi said: > Do these look ok to push? Yes; but see below. >> +gcry_error_t >> +_gcry_cipher_authenticate (gcry_cipher_hd_t hd, const void *abuf, >> + size_t abuflen) >> +{ >> + log_fatal ("gcry_cipher_authenticate: invalid mode %d\n", hd->mode); >> + >> + (void)abuf; >> + (void)abuflen; >> + >> + return gpg_error (GPG_ERR_INV_CIPHER_MODE); Sure that you want log_fatal here? 
This calls abort and thus terminates the process.  I think it is better
to use log_error.  Or do you fear that callers do not check the return
code?  Maybe we can add an error state to gcry_cipher_hd_t to let all
further operations fail.

Salam-Shalom,

   Werner

--
Die Gedanken sind frei.  Ausnahmen regelt ein Bundesgesetz.

From jussi.kivilinna at iki.fi  Tue Oct 22 18:21:15 2013
From: jussi.kivilinna at iki.fi (Jussi Kivilinna)
Date: Tue, 22 Oct 2013 19:21:15 +0300
Subject: [PATCH 1/2] [v3] Add API to support AEAD cipher modes
In-Reply-To: <87fvrt8jim.fsf@vigenere.g10code.de>
References: <20131022123202.6563.15226.stgit@localhost6.localdomain6>
 <52668263.9000101@iki.fi> <87fvrt8jim.fsf@vigenere.g10code.de>
Message-ID: <5266A5FB.5000903@iki.fi>

On 22.10.2013 17:31, Werner Koch wrote:
> On Tue, 22 Oct 2013 15:49, jussi.kivilinna at iki.fi said:
>> Do these look ok to push?
>
> Yes; but see below.
>
>>> +gcry_error_t
>>> +_gcry_cipher_authenticate (gcry_cipher_hd_t hd, const void *abuf,
>>> +                           size_t abuflen)
>>> +{
>>> +  log_fatal ("gcry_cipher_authenticate: invalid mode %d\n", hd->mode);
>>> +
>>> +  (void)abuf;
>>> +  (void)abuflen;
>>> +
>>> +  return gpg_error (GPG_ERR_INV_CIPHER_MODE);
>
> Sure that you want log_fatal here?  This calls abort and thus
> terminates the process.  I think it is better to use log_error.

Ok, I'll change to that.

> Or do you fear that callers do not check the return code?  Maybe we
> can add an error state to gcry_cipher_hd_t to let all further
> operations fail.

I based the patch on Dmitry's earlier patch, so I don't know if the use
of log_fatal was intended.  Maybe logging isn't needed here at all.
Earlier use of log_fatal is in gcry_cipher_encrypt/decrypt, where its
use probably makes sense.  And log_error isn't used elsewhere in
cipher.c.
-Jussi

>
> Salam-Shalom,
>
>    Werner
>

From jussi.kivilinna at iki.fi  Tue Oct 22 18:24:38 2013
From: jussi.kivilinna at iki.fi (Jussi Kivilinna)
Date: Tue, 22 Oct 2013 19:24:38 +0300
Subject: [PATCH] serpent-amd64: do not use GAS macros
In-Reply-To: <877gd58ja9.fsf@vigenere.g10code.de>
References: <20131020180142.19195.21145.stgit@localhost6.localdomain6>
 <877gd58ja9.fsf@vigenere.g10code.de>
Message-ID: <5266A6C6.4070607@iki.fi>

On 22.10.2013 17:36, Werner Koch wrote:
> On Sun, 20 Oct 2013 20:01, jussi.kivilinna at iki.fi said:
>> * cipher/serpent-avx2-amd64.S: Remove use of GAS macros.
>> * cipher/serpent-sse2-amd64.S: Ditto.
>> * configure.ac [HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS]: Do not check
>> for GAS macros.
>
> Always feel free to push fixes for your own code.

Absolutely.  I just want to have some delay on largeish patches so I
have some time to catch my own bugs, such as the broken change for
configure.ac in this patch: assembly implementations end up disabled
with gcc, but enabled on clang.

-Jussi

>
> Salam-Shalom,
>
>    Werner
>

From cvs at cvs.gnupg.org  Tue Oct 22 18:52:31 2013
From: cvs at cvs.gnupg.org (by Jussi Kivilinna)
Date: Tue, 22 Oct 2013 18:52:31 +0200
Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-321-g335d9bf
Message-ID:

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "The GNU crypto library".

The branch, master has been updated
       via  335d9bf7b035815750b63a3a8334d6ce44dc4449 (commit)
       via  95654041f2aa62f71aac4d8614dafe8433d10f95 (commit)
      from  a5a277a9016ccb34f1858a65e0ed1791b2fc3db3 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.
- Log -----------------------------------------------------------------
commit 335d9bf7b035815750b63a3a8334d6ce44dc4449
Author: Jussi Kivilinna
Date:   Tue Oct 22 17:07:53 2013 +0300

    Add Counter with CBC-MAC mode (CCM)

    * cipher/Makefile.am: Add 'cipher-ccm.c'.
    * cipher/cipher-ccm.c: New.
    * cipher/cipher-internal.h (gcry_cipher_handle): Add 'u_mode'.
    (_gcry_cipher_ccm_encrypt, _gcry_cipher_ccm_decrypt)
    (_gcry_cipher_ccm_set_nonce, _gcry_cipher_ccm_authenticate)
    (_gcry_cipher_ccm_get_tag, _gcry_cipher_ccm_check_tag)
    (_gcry_cipher_ccm_set_lengths): New prototypes.
    * cipher/cipher.c (gcry_cipher_open, cipher_encrypt, cipher_decrypt)
    (_gcry_cipher_setiv, _gcry_cipher_authenticate, _gcry_cipher_gettag)
    (_gcry_cipher_checktag, gcry_cipher_ctl): Add handling for CCM mode.
    * doc/gcrypt.texi: Add documentation for GCRY_CIPHER_MODE_CCM.
    * src/gcrypt.h.in (gcry_cipher_modes): Add 'GCRY_CIPHER_MODE_CCM'.
    (gcry_ctl_cmds): Add 'GCRYCTL_SET_CCM_LENGTHS'.
    (GCRY_CCM_BLOCK_LEN): New.
    * tests/basic.c (check_ccm_cipher): New.
    (check_cipher_modes): Call 'check_ccm_cipher'.
    * tests/benchmark.c (ccm_aead_init): New.
    (cipher_bench): Add handling for AEAD modes and add CCM benchmarking.
    --

    Patch adds CCM (Counter with CBC-MAC) mode as defined in RFC 3610 and
    NIST Special Publication 800-38C.
    Example for encrypting a message (split in two buffers; buf1, buf2)
    and authenticating additional non-encrypted data (split in two
    buffers; aadbuf1, aadbuf2) with an authentication tag length of
    eight bytes:

      size_t params[3];
      taglen = 8;

      gcry_cipher_setkey(h, key, len(key));
      gcry_cipher_setiv(h, nonce, len(nonce));

      params[0] = len(buf1) + len(buf2);       /* 0: enclen */
      params[1] = len(aadbuf1) + len(aadbuf2); /* 1: aadlen */
      params[2] = taglen;                      /* 2: authtaglen */
      gcry_cipher_ctl(h, GCRYCTL_SET_CCM_LENGTHS, params, sizeof(size_t) * 3);

      gcry_cipher_authenticate(h, aadbuf1, len(aadbuf1));
      gcry_cipher_authenticate(h, aadbuf2, len(aadbuf2));

      gcry_cipher_encrypt(h, buf1, len(buf1), buf1, len(buf1));
      gcry_cipher_encrypt(h, buf2, len(buf2), buf2, len(buf2));

      gcry_cipher_gettag(h, tag, taglen);

    Example for decrypting the above message and checking the
    authentication tag:

      size_t params[3];
      taglen = 8;

      gcry_cipher_setkey(h, key, len(key));
      gcry_cipher_setiv(h, nonce, len(nonce));

      params[0] = len(buf1) + len(buf2);       /* 0: enclen */
      params[1] = len(aadbuf1) + len(aadbuf2); /* 1: aadlen */
      params[2] = taglen;                      /* 2: authtaglen */
      gcry_cipher_ctl(h, GCRYCTL_SET_CCM_LENGTHS, params, sizeof(size_t) * 3);

      gcry_cipher_authenticate(h, aadbuf1, len(aadbuf1));
      gcry_cipher_authenticate(h, aadbuf2, len(aadbuf2));

      gcry_cipher_decrypt(h, buf1, len(buf1), buf1, len(buf1));
      gcry_cipher_decrypt(h, buf2, len(buf2), buf2, len(buf2));

      err = gcry_cipher_checktag(h, tag, taglen);
      if (gpg_err_code (err) == GPG_ERR_CHECKSUM)
        {
          /* Authentication failed. */
        }
      else if (err == 0)
        {
          /* Authentication ok. */
        }

    Example for encrypting a message without additional authenticated
    data:

      size_t params[3];
      taglen = 10;

      gcry_cipher_setkey(h, key, len(key));
      gcry_cipher_setiv(h, nonce, len(nonce));

      params[0] = len(buf1); /* 0: enclen */
      params[1] = 0;         /* 1: aadlen */
      params[2] = taglen;    /* 2: authtaglen */
      gcry_cipher_ctl(h, GCRYCTL_SET_CCM_LENGTHS, params, sizeof(size_t) * 3);

      gcry_cipher_encrypt(h, buf1, len(buf1), buf1, len(buf1));

      gcry_cipher_gettag(h, tag, taglen);

    To reset the CCM state for a cipher handle, one can either set a new
    nonce or use 'gcry_cipher_reset'.

    This implementation reuses the existing CTR mode code for
    encryption/decryption and is therefore able to process multiple
    buffers that are not a multiple of the blocksize. AAD data may also
    be passed into gcry_cipher_authenticate in non-blocksize chunks.

    [v4]: GCRYCTL_SET_CCM_PARAMS => GCRYCTL_SET_CCM_LENGTHS

    Signed-off-by: Jussi Kivilinna

diff --git a/cipher/Makefile.am b/cipher/Makefile.am
index a2b2c8a..b0efd89 100644
--- a/cipher/Makefile.am
+++ b/cipher/Makefile.am
@@ -40,6 +40,7 @@ libcipher_la_LIBADD = $(GCRYPT_MODULES)
 libcipher_la_SOURCES = \
 cipher.c cipher-internal.h \
 cipher-cbc.c cipher-cfb.c cipher-ofb.c cipher-ctr.c cipher-aeswrap.c \
+cipher-ccm.c \
 cipher-selftest.c cipher-selftest.h \
 pubkey.c pubkey-internal.h pubkey-util.c \
 md.c \
diff --git a/cipher/cipher-ccm.c b/cipher/cipher-ccm.c
new file mode 100644
index 0000000..38752d5
--- /dev/null
+++ b/cipher/cipher-ccm.c
@@ -0,0 +1,370 @@
+/* cipher-ccm.c - CTR mode with CBC-MAC mode implementation
+ * Copyright © 2013 Jussi Kivilinna
+ *
+ * This file is part of Libgcrypt.
+ *
+ * Libgcrypt is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as
+ * published by the Free Software Foundation; either version 2.1 of
+ * the License, or (at your option) any later version.
+ * + * Libgcrypt is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this program; if not, see . + */ + +#include +#include +#include +#include +#include + +#include "g10lib.h" +#include "cipher.h" +#include "ath.h" +#include "bufhelp.h" +#include "./cipher-internal.h" + + +#define set_burn(burn, nburn) do { \ + unsigned int __nburn = (nburn); \ + (burn) = (burn) > __nburn ? (burn) : __nburn; } while (0) + + +static unsigned int +do_cbc_mac (gcry_cipher_hd_t c, const unsigned char *inbuf, size_t inlen, + int do_padding) +{ + const unsigned int blocksize = 16; + unsigned char tmp[blocksize]; + unsigned int burn = 0; + unsigned int unused = c->u_mode.ccm.mac_unused; + size_t nblocks; + + if (inlen == 0 && (unused == 0 || !do_padding)) + return 0; + + do + { + if (inlen + unused < blocksize || unused > 0) + { + for (; inlen && unused < blocksize; inlen--) + c->u_mode.ccm.macbuf[unused++] = *inbuf++; + } + if (!inlen) + { + if (!do_padding) + break; + + while (unused < blocksize) + c->u_mode.ccm.macbuf[unused++] = 0; + } + + if (unused > 0) + { + /* Process one block from macbuf. 
*/ + buf_xor(c->u_iv.iv, c->u_iv.iv, c->u_mode.ccm.macbuf, blocksize); + set_burn (burn, c->spec->encrypt ( &c->context.c, c->u_iv.iv, + c->u_iv.iv )); + + unused = 0; + } + + if (c->bulk.cbc_enc) + { + nblocks = inlen / blocksize; + c->bulk.cbc_enc (&c->context.c, c->u_iv.iv, tmp, inbuf, nblocks, 1); + inbuf += nblocks * blocksize; + inlen -= nblocks * blocksize; + + wipememory (tmp, sizeof(tmp)); + } + else + { + while (inlen >= blocksize) + { + buf_xor(c->u_iv.iv, c->u_iv.iv, inbuf, blocksize); + + set_burn (burn, c->spec->encrypt ( &c->context.c, c->u_iv.iv, + c->u_iv.iv )); + + inlen -= blocksize; + inbuf += blocksize; + } + } + } + while (inlen > 0); + + c->u_mode.ccm.mac_unused = unused; + + if (burn) + burn += 4 * sizeof(void *); + + return burn; +} + + +gcry_err_code_t +_gcry_cipher_ccm_set_nonce (gcry_cipher_hd_t c, const unsigned char *nonce, + size_t noncelen) +{ + size_t L = 15 - noncelen; + size_t L_; + + L_ = L - 1; + + if (!nonce) + return GPG_ERR_INV_ARG; + /* Length field must be 2, 3, ..., or 8. */ + if (L < 2 || L > 8) + return GPG_ERR_INV_LENGTH; + + /* Reset state */ + memset (&c->u_mode, 0, sizeof(c->u_mode)); + memset (&c->marks, 0, sizeof(c->marks)); + memset (&c->u_iv, 0, sizeof(c->u_iv)); + memset (&c->u_ctr, 0, sizeof(c->u_ctr)); + memset (c->lastiv, 0, sizeof(c->lastiv)); + c->unused = 0; + + /* Setup CTR */ + c->u_ctr.ctr[0] = L_; + memcpy (&c->u_ctr.ctr[1], nonce, noncelen); + memset (&c->u_ctr.ctr[1 + noncelen], 0, L); + + /* Setup IV */ + c->u_iv.iv[0] = L_; + memcpy (&c->u_iv.iv[1], nonce, noncelen); + /* Add (8 * M_ + 64 * flags) to iv[0] and set iv[noncelen + 1 ... 15] later + in set_aad. 
*/ + memset (&c->u_iv.iv[1 + noncelen], 0, L); + + c->u_mode.ccm.nonce = 1; + + return GPG_ERR_NO_ERROR; +} + + +gcry_err_code_t +_gcry_cipher_ccm_set_lengths (gcry_cipher_hd_t c, size_t encryptlen, + size_t aadlen, size_t taglen) +{ + unsigned int burn = 0; + unsigned char b0[16]; + size_t noncelen = 15 - (c->u_iv.iv[0] + 1); + size_t M = taglen; + size_t M_; + int i; + + M_ = (M - 2) / 2; + + /* Authentication field must be 4, 6, 8, 10, 12, 14 or 16. */ + if ((M_ * 2 + 2) != M || M < 4 || M > 16) + return GPG_ERR_INV_LENGTH; + if (!c->u_mode.ccm.nonce || c->u_mode.ccm.tag) + return GPG_ERR_INV_STATE; + if (c->u_mode.ccm.lengths) + return GPG_ERR_INV_STATE; + + c->u_mode.ccm.authlen = taglen; + c->u_mode.ccm.encryptlen = encryptlen; + c->u_mode.ccm.aadlen = aadlen; + + /* Complete IV setup. */ + c->u_iv.iv[0] += (aadlen > 0) * 64 + M_ * 8; + for (i = 16 - 1; i >= 1 + noncelen; i--) + { + c->u_iv.iv[i] = encryptlen & 0xff; + encryptlen >>= 8; + } + + memcpy (b0, c->u_iv.iv, 16); + memset (c->u_iv.iv, 0, 16); + + set_burn (burn, do_cbc_mac (c, b0, 16, 0)); + + if (aadlen == 0) + { + /* Do nothing. */ + } + else if (aadlen > 0 && aadlen <= (unsigned int)0xfeff) + { + b0[0] = (aadlen >> 8) & 0xff; + b0[1] = aadlen & 0xff; + set_burn (burn, do_cbc_mac (c, b0, 2, 0)); + } + else if (aadlen > 0xfeff && aadlen <= (unsigned int)0xffffffff) + { + b0[0] = 0xff; + b0[1] = 0xfe; + buf_put_be32(&b0[2], aadlen); + set_burn (burn, do_cbc_mac (c, b0, 6, 0)); + } +#ifdef HAVE_U64_TYPEDEF + else if (aadlen > (unsigned int)0xffffffff) + { + b0[0] = 0xff; + b0[1] = 0xff; + buf_put_be64(&b0[2], aadlen); + set_burn (burn, do_cbc_mac (c, b0, 10, 0)); + } +#endif + + /* Generate S_0 and increase counter. 
*/ + set_burn (burn, c->spec->encrypt ( &c->context.c, c->u_mode.ccm.s0, + c->u_ctr.ctr )); + c->u_ctr.ctr[15]++; + + if (burn) + _gcry_burn_stack (burn + sizeof(void *) * 5); + + c->u_mode.ccm.lengths = 1; + + return GPG_ERR_NO_ERROR; +} + + +gcry_err_code_t +_gcry_cipher_ccm_authenticate (gcry_cipher_hd_t c, const unsigned char *abuf, + size_t abuflen) +{ + unsigned int burn; + + if (abuflen > 0 && !abuf) + return GPG_ERR_INV_ARG; + if (!c->u_mode.ccm.nonce || !c->u_mode.ccm.lengths || c->u_mode.ccm.tag) + return GPG_ERR_INV_STATE; + if (abuflen > c->u_mode.ccm.aadlen) + return GPG_ERR_INV_LENGTH; + + c->u_mode.ccm.aadlen -= abuflen; + burn = do_cbc_mac (c, abuf, abuflen, c->u_mode.ccm.aadlen == 0); + + if (burn) + _gcry_burn_stack (burn + sizeof(void *) * 5); + + return GPG_ERR_NO_ERROR; +} + + +gcry_err_code_t +_gcry_cipher_ccm_tag (gcry_cipher_hd_t c, unsigned char *outbuf, + size_t outbuflen, int check) +{ + unsigned int burn; + + if (!outbuf || outbuflen == 0) + return GPG_ERR_INV_ARG; + /* Tag length must be same as initial authlen. */ + if (c->u_mode.ccm.authlen != outbuflen) + return GPG_ERR_INV_LENGTH; + if (!c->u_mode.ccm.nonce || !c->u_mode.ccm.lengths || c->u_mode.ccm.aadlen > 0) + return GPG_ERR_INV_STATE; + /* Initial encrypt length must match with length of actual data processed. */ + if (c->u_mode.ccm.encryptlen > 0) + return GPG_ERR_UNFINISHED; + + if (!c->u_mode.ccm.tag) + { + burn = do_cbc_mac (c, NULL, 0, 1); /* Perform final padding. */ + + /* Add S_0 */ + buf_xor (c->u_iv.iv, c->u_iv.iv, c->u_mode.ccm.s0, 16); + + wipememory (c->u_ctr.ctr, 16); + wipememory (c->u_mode.ccm.s0, 16); + wipememory (c->u_mode.ccm.macbuf, 16); + + if (burn) + _gcry_burn_stack (burn + sizeof(void *) * 5); + } + + if (!check) + { + memcpy (outbuf, c->u_iv.iv, outbuflen); + return GPG_ERR_NO_ERROR; + } + else + { + int diff, i; + + /* Constant-time compare. */ + for (i = 0, diff = 0; i < outbuflen; i++) + diff -= !!(outbuf[i] - c->u_iv.iv[i]); + + return !diff ? 
GPG_ERR_NO_ERROR : GPG_ERR_CHECKSUM; + } +} + + +gcry_err_code_t +_gcry_cipher_ccm_get_tag (gcry_cipher_hd_t c, unsigned char *outtag, + size_t taglen) +{ + return _gcry_cipher_ccm_tag (c, outtag, taglen, 0); +} + + +gcry_err_code_t +_gcry_cipher_ccm_check_tag (gcry_cipher_hd_t c, const unsigned char *intag, + size_t taglen) +{ + return _gcry_cipher_ccm_tag (c, (unsigned char *)intag, taglen, 1); +} + + +gcry_err_code_t +_gcry_cipher_ccm_encrypt (gcry_cipher_hd_t c, unsigned char *outbuf, + unsigned int outbuflen, const unsigned char *inbuf, + unsigned int inbuflen) +{ + unsigned int burn; + + if (outbuflen < inbuflen) + return GPG_ERR_BUFFER_TOO_SHORT; + if (!c->u_mode.ccm.nonce || c->u_mode.ccm.tag || !c->u_mode.ccm.lengths || + c->u_mode.ccm.aadlen > 0) + return GPG_ERR_INV_STATE; + if (inbuflen > c->u_mode.ccm.encryptlen) + return GPG_ERR_INV_LENGTH; + + c->u_mode.ccm.encryptlen -= inbuflen; + burn = do_cbc_mac (c, inbuf, inbuflen, 0); + if (burn) + _gcry_burn_stack (burn + sizeof(void *) * 5); + + return _gcry_cipher_ctr_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); +} + + +gcry_err_code_t +_gcry_cipher_ccm_decrypt (gcry_cipher_hd_t c, unsigned char *outbuf, + unsigned int outbuflen, const unsigned char *inbuf, + unsigned int inbuflen) +{ + gcry_err_code_t err; + unsigned int burn; + + if (outbuflen < inbuflen) + return GPG_ERR_BUFFER_TOO_SHORT; + if (!c->u_mode.ccm.nonce || c->u_mode.ccm.tag || !c->u_mode.ccm.lengths || + c->u_mode.ccm.aadlen > 0) + return GPG_ERR_INV_STATE; + if (inbuflen > c->u_mode.ccm.encryptlen) + return GPG_ERR_INV_LENGTH; + + err = _gcry_cipher_ctr_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); + if (err) + return err; + + c->u_mode.ccm.encryptlen -= inbuflen; + burn = do_cbc_mac (c, outbuf, inbuflen, 0); + if (burn) + _gcry_burn_stack (burn + sizeof(void *) * 5); + + return err; +} diff --git a/cipher/cipher-internal.h b/cipher/cipher-internal.h index b60ef38..981caa8 100644 --- a/cipher/cipher-internal.h +++ 
b/cipher/cipher-internal.h @@ -100,7 +100,8 @@ struct gcry_cipher_handle /* The initialization vector. For best performance we make sure that it is properly aligned. In particular some implementations - of bulk operations expect an 16 byte aligned IV. */ + of bulk operations expect an 16 byte aligned IV. IV is also used + to store CBC-MAC in CCM mode; counter IV is stored in U_CTR. */ union { cipher_context_alignment_t iv_align; unsigned char iv[MAX_BLOCKSIZE]; @@ -117,6 +118,26 @@ struct gcry_cipher_handle unsigned char lastiv[MAX_BLOCKSIZE]; int unused; /* Number of unused bytes in LASTIV. */ + union { + /* Mode specific storage for CCM mode. */ + struct { + size_t encryptlen; + size_t aadlen; + unsigned int authlen; + + /* Space to save partial input lengths for MAC. */ + unsigned char macbuf[GCRY_CCM_BLOCK_LEN]; + int mac_unused; /* Number of unprocessed bytes in MACBUF. */ + + unsigned char s0[GCRY_CCM_BLOCK_LEN]; + + unsigned int nonce:1;/* Set to 1 if nonce has been set. */ + unsigned int lengths:1; /* Set to 1 if CCM length parameters has been + processed. */ + unsigned int tag:1; /* Set to 1 if tag has been finalized. */ + } ccm; + } u_mode; + /* What follows are two contexts of the cipher in use. 
The first one needs to be aligned well enough for the cipher operation whereas the second one is a copy created by cipher_setkey and @@ -175,5 +196,30 @@ gcry_err_code_t _gcry_cipher_aeswrap_decrypt const byte *inbuf, unsigned int inbuflen); +/*-- cipher-ccm.c --*/ +gcry_err_code_t _gcry_cipher_ccm_encrypt +/* */ (gcry_cipher_hd_t c, + unsigned char *outbuf, unsigned int outbuflen, + const unsigned char *inbuf, unsigned int inbuflen); +gcry_err_code_t _gcry_cipher_ccm_decrypt +/* */ (gcry_cipher_hd_t c, + unsigned char *outbuf, unsigned int outbuflen, + const unsigned char *inbuf, unsigned int inbuflen); +gcry_err_code_t _gcry_cipher_ccm_set_nonce +/* */ (gcry_cipher_hd_t c, const unsigned char *nonce, + size_t noncelen); +gcry_err_code_t _gcry_cipher_ccm_authenticate +/* */ (gcry_cipher_hd_t c, const unsigned char *abuf, size_t abuflen); +gcry_err_code_t _gcry_cipher_ccm_set_lengths +/* */ (gcry_cipher_hd_t c, size_t encryptedlen, size_t aadlen, + size_t taglen); +gcry_err_code_t _gcry_cipher_ccm_get_tag +/* */ (gcry_cipher_hd_t c, + unsigned char *outtag, size_t taglen); +gcry_err_code_t _gcry_cipher_ccm_check_tag +/* */ (gcry_cipher_hd_t c, + const unsigned char *intag, size_t taglen); + + #endif /*G10_CIPHER_INTERNAL_H*/ diff --git a/cipher/cipher.c b/cipher/cipher.c index d6d3021..5214d26 100644 --- a/cipher/cipher.c +++ b/cipher/cipher.c @@ -375,6 +375,13 @@ gcry_cipher_open (gcry_cipher_hd_t *handle, if (! 
err) switch (mode) { + case GCRY_CIPHER_MODE_CCM: + if (spec->blocksize != GCRY_CCM_BLOCK_LEN) + err = GPG_ERR_INV_CIPHER_MODE; + if (!spec->encrypt || !spec->decrypt) + err = GPG_ERR_INV_CIPHER_MODE; + break; + case GCRY_CIPHER_MODE_ECB: case GCRY_CIPHER_MODE_CBC: case GCRY_CIPHER_MODE_CFB: @@ -613,6 +620,8 @@ cipher_reset (gcry_cipher_hd_t c) memset (c->u_iv.iv, 0, c->spec->blocksize); memset (c->lastiv, 0, c->spec->blocksize); memset (c->u_ctr.ctr, 0, c->spec->blocksize); + memset (&c->u_mode, 0, sizeof c->u_mode); + c->unused = 0; } @@ -718,6 +727,10 @@ cipher_encrypt (gcry_cipher_hd_t c, byte *outbuf, unsigned int outbuflen, inbuf, inbuflen); break; + case GCRY_CIPHER_MODE_CCM: + rc = _gcry_cipher_ccm_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); + break; + case GCRY_CIPHER_MODE_STREAM: c->spec->stencrypt (&c->context.c, outbuf, (byte*)/*arggg*/inbuf, inbuflen); @@ -811,6 +824,10 @@ cipher_decrypt (gcry_cipher_hd_t c, byte *outbuf, unsigned int outbuflen, inbuf, inbuflen); break; + case GCRY_CIPHER_MODE_CCM: + rc = _gcry_cipher_ccm_decrypt (c, outbuf, outbuflen, inbuf, inbuflen); + break; + case GCRY_CIPHER_MODE_STREAM: c->spec->stdecrypt (&c->context.c, outbuf, (byte*)/*arggg*/inbuf, inbuflen); @@ -885,8 +902,19 @@ _gcry_cipher_setkey (gcry_cipher_hd_t hd, const void *key, size_t keylen) gcry_error_t _gcry_cipher_setiv (gcry_cipher_hd_t hd, const void *iv, size_t ivlen) { - cipher_setiv (hd, iv, ivlen); - return 0; + gcry_err_code_t rc = GPG_ERR_NO_ERROR; + + switch (hd->mode) + { + case GCRY_CIPHER_MODE_CCM: + rc = _gcry_cipher_ccm_set_nonce (hd, iv, ivlen); + break; + + default: + cipher_setiv (hd, iv, ivlen); + break; + } + return gpg_error (rc); } /* Set counter for CTR mode. 
(CTR,CTRLEN) must denote a buffer of @@ -914,34 +942,61 @@ gcry_error_t _gcry_cipher_authenticate (gcry_cipher_hd_t hd, const void *abuf, size_t abuflen) { - log_error ("gcry_cipher_authenticate: invalid mode %d\n", hd->mode); + gcry_err_code_t rc; + + switch (hd->mode) + { + case GCRY_CIPHER_MODE_CCM: + rc = _gcry_cipher_ccm_authenticate (hd, abuf, abuflen); + break; - (void)abuf; - (void)abuflen; + default: + log_error ("gcry_cipher_authenticate: invalid mode %d\n", hd->mode); + rc = GPG_ERR_INV_CIPHER_MODE; + break; + } - return gpg_error (GPG_ERR_INV_CIPHER_MODE); + return gpg_error (rc); } gcry_error_t _gcry_cipher_gettag (gcry_cipher_hd_t hd, void *outtag, size_t taglen) { - log_error ("gcry_cipher_gettag: invalid mode %d\n", hd->mode); + gcry_err_code_t rc; + + switch (hd->mode) + { + case GCRY_CIPHER_MODE_CCM: + rc = _gcry_cipher_ccm_get_tag (hd, outtag, taglen); + break; - (void)outtag; - (void)taglen; + default: + log_error ("gcry_cipher_gettag: invalid mode %d\n", hd->mode); + rc = GPG_ERR_INV_CIPHER_MODE; + break; + } - return gpg_error (GPG_ERR_INV_CIPHER_MODE); + return gpg_error (rc); } gcry_error_t _gcry_cipher_checktag (gcry_cipher_hd_t hd, const void *intag, size_t taglen) { - log_error ("gcry_cipher_checktag: invalid mode %d\n", hd->mode); + gcry_err_code_t rc; + + switch (hd->mode) + { + case GCRY_CIPHER_MODE_CCM: + rc = _gcry_cipher_ccm_check_tag (hd, intag, taglen); + break; - (void)intag; - (void)taglen; + default: + log_error ("gcry_cipher_checktag: invalid mode %d\n", hd->mode); + rc = GPG_ERR_INV_CIPHER_MODE; + break; + } - return gpg_error (GPG_ERR_INV_CIPHER_MODE); + return gpg_error (rc); } @@ -980,6 +1035,30 @@ gcry_cipher_ctl( gcry_cipher_hd_t h, int cmd, void *buffer, size_t buflen) h->flags &= ~GCRY_CIPHER_CBC_MAC; break; + case GCRYCTL_SET_CCM_LENGTHS: + { + size_t params[3]; + size_t encryptedlen; + size_t aadlen; + size_t authtaglen; + + if (h->mode != GCRY_CIPHER_MODE_CCM) + return gcry_error (GPG_ERR_INV_CIPHER_MODE); + + if 
(!buffer || buflen != 3 * sizeof(size_t)) + return gcry_error (GPG_ERR_INV_ARG); + + /* This command is used to pass additional length parameters needed + by CCM mode to initialize CBC-MAC. */ + memcpy (params, buffer, sizeof(params)); + encryptedlen = params[0]; + aadlen = params[1]; + authtaglen = params[2]; + + rc = _gcry_cipher_ccm_set_lengths (h, encryptedlen, aadlen, authtaglen); + } + break; + case GCRYCTL_DISABLE_ALGO: /* This command expects NULL for H and BUFFER to point to an integer with the algo number. */ diff --git a/doc/gcrypt.texi b/doc/gcrypt.texi index 0049fa0..91fe399 100644 --- a/doc/gcrypt.texi +++ b/doc/gcrypt.texi @@ -1635,6 +1635,12 @@ may be specified 64 bit (8 byte) shorter than then input buffer. As per specs the input length must be at least 128 bits and the length must be a multiple of 64 bits. + at item GCRY_CIPHER_MODE_CCM + at cindex CCM, Counter with CBC-MAC mode +Counter with CBC-MAC mode is an Authenticated Encryption with +Associated Data (AEAD) block cipher mode, which is specified in +'NIST Special Publication 800-38C' and RFC 3610. + @end table @node Working with cipher handles @@ -1661,11 +1667,13 @@ The cipher mode to use must be specified via @var{mode}. See @xref{Available cipher modes}, for a list of supported cipher modes and the according constants. Note that some modes are incompatible with some algorithms - in particular, stream mode -(@code{GCRY_CIPHER_MODE_STREAM}) only works with stream ciphers. Any -block cipher mode (@code{GCRY_CIPHER_MODE_ECB}, +(@code{GCRY_CIPHER_MODE_STREAM}) only works with stream ciphers. The +block cipher modes (@code{GCRY_CIPHER_MODE_ECB}, @code{GCRY_CIPHER_MODE_CBC}, @code{GCRY_CIPHER_MODE_CFB}, - at code{GCRY_CIPHER_MODE_OFB} or @code{GCRY_CIPHER_MODE_CTR}) will work -with any block cipher algorithm. + at code{GCRY_CIPHER_MODE_OFB} and @code{GCRY_CIPHER_MODE_CTR}) will work +with any block cipher algorithm. 
The @code{GCRY_CIPHER_MODE_CCM} will +only work with block cipher algorithms which have the block size of +16 bytes. The third argument @var{flags} can either be passed as @code{0} or as the bit-wise OR of the following constants. diff --git a/src/gcrypt.h.in b/src/gcrypt.h.in index f0ae927..948202d 100644 --- a/src/gcrypt.h.in +++ b/src/gcrypt.h.in @@ -325,7 +325,8 @@ enum gcry_ctl_cmds GCRYCTL_SET_PREFERRED_RNG_TYPE = 65, GCRYCTL_GET_CURRENT_RNG_TYPE = 66, GCRYCTL_DISABLE_LOCKED_SECMEM = 67, - GCRYCTL_DISABLE_PRIV_DROP = 68 + GCRYCTL_DISABLE_PRIV_DROP = 68, + GCRYCTL_SET_CCM_LENGTHS = 69 }; /* Perform various operations defined by CMD. */ @@ -884,7 +885,8 @@ enum gcry_cipher_modes GCRY_CIPHER_MODE_STREAM = 4, /* Used with stream ciphers. */ GCRY_CIPHER_MODE_OFB = 5, /* Outer feedback. */ GCRY_CIPHER_MODE_CTR = 6, /* Counter. */ - GCRY_CIPHER_MODE_AESWRAP= 7 /* AES-WRAP algorithm. */ + GCRY_CIPHER_MODE_AESWRAP= 7, /* AES-WRAP algorithm. */ + GCRY_CIPHER_MODE_CCM = 8 /* Counter with CBC-MAC. */ }; /* Flags used with the open function. */ @@ -896,6 +898,8 @@ enum gcry_cipher_flags GCRY_CIPHER_CBC_MAC = 8 /* Enable CBC message auth. code (MAC). */ }; +/* CCM works only with blocks of 128 bits. */ +#define GCRY_CCM_BLOCK_LEN (128 / 8) /* Create a handle for algorithm ALGO to be used in MODE. FLAGS may be given as an bitwise OR of the gcry_cipher_flags values. 
*/ diff --git a/tests/basic.c b/tests/basic.c index 1d6e637..21af21d 100644 --- a/tests/basic.c +++ b/tests/basic.c @@ -1139,6 +1139,776 @@ check_ofb_cipher (void) static void +check_ccm_cipher (void) +{ + static const struct tv + { + int algo; + int keylen; + const char *key; + int noncelen; + const char *nonce; + int aadlen; + const char *aad; + int plainlen; + const char *plaintext; + int cipherlen; + const char *ciphertext; + } tv[] = + { + /* RFC 3610 */ + { GCRY_CIPHER_AES, /* Packet Vector #1 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x03\x02\x01\x00\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 23, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 31, + "\x58\x8C\x97\x9A\x61\xC6\x63\xD2\xF0\x66\xD0\xC2\xC0\xF9\x89\x80\x6D\x5F\x6B\x61\xDA\xC3\x84\x17\xE8\xD1\x2C\xFD\xF9\x26\xE0"}, + { GCRY_CIPHER_AES, /* Packet Vector #2 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x04\x03\x02\x01\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 24, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 32, + "\x72\xC9\x1A\x36\xE1\x35\xF8\xCF\x29\x1C\xA8\x94\x08\x5C\x87\xE3\xCC\x15\xC4\x39\xC9\xE4\x3A\x3B\xA0\x91\xD5\x6E\x10\x40\x09\x16"}, + { GCRY_CIPHER_AES, /* Packet Vector #3 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x05\x04\x03\x02\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 25, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 33, + "\x51\xB1\xE5\xF4\x4A\x19\x7D\x1D\xA4\x6B\x0F\x8E\x2D\x28\x2A\xE8\x71\xE8\x38\xBB\x64\xDA\x85\x96\x57\x4A\xDA\xA7\x6F\xBD\x9F\xB0\xC5"}, + { GCRY_CIPHER_AES, /* Packet Vector #4 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, 
"\x00\x00\x00\x06\x05\x04\x03\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 19, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 27, + "\xA2\x8C\x68\x65\x93\x9A\x9A\x79\xFA\xAA\x5C\x4C\x2A\x9D\x4A\x91\xCD\xAC\x8C\x96\xC8\x61\xB9\xC9\xE6\x1E\xF1"}, + { GCRY_CIPHER_AES, /* Packet Vector #5 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x07\x06\x05\x04\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 20, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 28, + "\xDC\xF1\xFB\x7B\x5D\x9E\x23\xFB\x9D\x4E\x13\x12\x53\x65\x8A\xD8\x6E\xBD\xCA\x3E\x51\xE8\x3F\x07\x7D\x9C\x2D\x93"}, + { GCRY_CIPHER_AES, /* Packet Vector #6 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x08\x07\x06\x05\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 21, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 29, + "\x6F\xC1\xB0\x11\xF0\x06\x56\x8B\x51\x71\xA4\x2D\x95\x3D\x46\x9B\x25\x70\xA4\xBD\x87\x40\x5A\x04\x43\xAC\x91\xCB\x94"}, + { GCRY_CIPHER_AES, /* Packet Vector #7 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x09\x08\x07\x06\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 23, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 33, + "\x01\x35\xD1\xB2\xC9\x5F\x41\xD5\xD1\xD4\xFE\xC1\x85\xD1\x66\xB8\x09\x4E\x99\x9D\xFE\xD9\x6C\x04\x8C\x56\x60\x2C\x97\xAC\xBB\x74\x90"}, + { GCRY_CIPHER_AES, /* Packet Vector #8 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0A\x09\x08\x07\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 24, + 
"\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 34, + "\x7B\x75\x39\x9A\xC0\x83\x1D\xD2\xF0\xBB\xD7\x58\x79\xA2\xFD\x8F\x6C\xAE\x6B\x6C\xD9\xB7\xDB\x24\xC1\x7B\x44\x33\xF4\x34\x96\x3F\x34\xB4"}, + { GCRY_CIPHER_AES, /* Packet Vector #9 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0B\x0A\x09\x08\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 25, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 35, + "\x82\x53\x1A\x60\xCC\x24\x94\x5A\x4B\x82\x79\x18\x1A\xB5\xC8\x4D\xF2\x1C\xE7\xF9\xB7\x3F\x42\xE1\x97\xEA\x9C\x07\xE5\x6B\x5E\xB1\x7E\x5F\x4E"}, + { GCRY_CIPHER_AES, /* Packet Vector #10 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0C\x0B\x0A\x09\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 19, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 29, + "\x07\x34\x25\x94\x15\x77\x85\x15\x2B\x07\x40\x98\x33\x0A\xBB\x14\x1B\x94\x7B\x56\x6A\xA9\x40\x6B\x4D\x99\x99\x88\xDD"}, + { GCRY_CIPHER_AES, /* Packet Vector #11 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0D\x0C\x0B\x0A\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 20, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 30, + "\x67\x6B\xB2\x03\x80\xB0\xE3\x01\xE8\xAB\x79\x59\x0A\x39\x6D\xA7\x8B\x83\x49\x34\xF5\x3A\xA2\xE9\x10\x7A\x8B\x6C\x02\x2C"}, + { GCRY_CIPHER_AES, /* Packet Vector #12 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0E\x0D\x0C\x0B\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 21, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 31, + 
"\xC0\xFF\xA0\xD6\xF0\x5B\xDB\x67\xF2\x4D\x43\xA4\x33\x8D\x2A\xA4\xBE\xD7\xB2\x0E\x43\xCD\x1A\xA3\x16\x62\xE7\xAD\x65\xD6\xDB"}, + { GCRY_CIPHER_AES, /* Packet Vector #13 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x41\x2B\x4E\xA9\xCD\xBE\x3C\x96\x96\x76\x6C\xFA", + 8, "\x0B\xE1\xA8\x8B\xAC\xE0\x18\xB1", + 23, + "\x08\xE8\xCF\x97\xD8\x20\xEA\x25\x84\x60\xE9\x6A\xD9\xCF\x52\x89\x05\x4D\x89\x5C\xEA\xC4\x7C", + 31, + "\x4C\xB9\x7F\x86\xA2\xA4\x68\x9A\x87\x79\x47\xAB\x80\x91\xEF\x53\x86\xA6\xFF\xBD\xD0\x80\xF8\xE7\x8C\xF7\xCB\x0C\xDD\xD7\xB3"}, + { GCRY_CIPHER_AES, /* Packet Vector #14 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x33\x56\x8E\xF7\xB2\x63\x3C\x96\x96\x76\x6C\xFA", + 8, "\x63\x01\x8F\x76\xDC\x8A\x1B\xCB", + 24, + "\x90\x20\xEA\x6F\x91\xBD\xD8\x5A\xFA\x00\x39\xBA\x4B\xAF\xF9\xBF\xB7\x9C\x70\x28\x94\x9C\xD0\xEC", + 32, + "\x4C\xCB\x1E\x7C\xA9\x81\xBE\xFA\xA0\x72\x6C\x55\xD3\x78\x06\x12\x98\xC8\x5C\x92\x81\x4A\xBC\x33\xC5\x2E\xE8\x1D\x7D\x77\xC0\x8A"}, + { GCRY_CIPHER_AES, /* Packet Vector #15 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x10\x3F\xE4\x13\x36\x71\x3C\x96\x96\x76\x6C\xFA", + 8, "\xAA\x6C\xFA\x36\xCA\xE8\x6B\x40", + 25, + "\xB9\x16\xE0\xEA\xCC\x1C\x00\xD7\xDC\xEC\x68\xEC\x0B\x3B\xBB\x1A\x02\xDE\x8A\x2D\x1A\xA3\x46\x13\x2E", + 33, + "\xB1\xD2\x3A\x22\x20\xDD\xC0\xAC\x90\x0D\x9A\xA0\x3C\x61\xFC\xF4\xA5\x59\xA4\x41\x77\x67\x08\x97\x08\xA7\x76\x79\x6E\xDB\x72\x35\x06"}, + { GCRY_CIPHER_AES, /* Packet Vector #16 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x76\x4C\x63\xB8\x05\x8E\x3C\x96\x96\x76\x6C\xFA", + 12, "\xD0\xD0\x73\x5C\x53\x1E\x1B\xEC\xF0\x49\xC2\x44", + 19, + "\x12\xDA\xAC\x56\x30\xEF\xA5\x39\x6F\x77\x0C\xE1\xA6\x6B\x21\xF7\xB2\x10\x1C", + 27, + "\x14\xD2\x53\xC3\x96\x7B\x70\x60\x9B\x7C\xBB\x7C\x49\x91\x60\x28\x32\x45\x26\x9A\x6F\x49\x97\x5B\xCA\xDE\xAF"}, + { 
GCRY_CIPHER_AES, /* Packet Vector #17 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\xF8\xB6\x78\x09\x4E\x3B\x3C\x96\x96\x76\x6C\xFA", + 12, "\x77\xB6\x0F\x01\x1C\x03\xE1\x52\x58\x99\xBC\xAE", + 20, + "\xE8\x8B\x6A\x46\xC7\x8D\x63\xE5\x2E\xB8\xC5\x46\xEF\xB5\xDE\x6F\x75\xE9\xCC\x0D", + 28, + "\x55\x45\xFF\x1A\x08\x5E\xE2\xEF\xBF\x52\xB2\xE0\x4B\xEE\x1E\x23\x36\xC7\x3E\x3F\x76\x2C\x0C\x77\x44\xFE\x7E\x3C"}, + { GCRY_CIPHER_AES, /* Packet Vector #18 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\xD5\x60\x91\x2D\x3F\x70\x3C\x96\x96\x76\x6C\xFA", + 12, "\xCD\x90\x44\xD2\xB7\x1F\xDB\x81\x20\xEA\x60\xC0", + 21, + "\x64\x35\xAC\xBA\xFB\x11\xA8\x2E\x2F\x07\x1D\x7C\xA4\xA5\xEB\xD9\x3A\x80\x3B\xA8\x7F", + 29, + "\x00\x97\x69\xEC\xAB\xDF\x48\x62\x55\x94\xC5\x92\x51\xE6\x03\x57\x22\x67\x5E\x04\xC8\x47\x09\x9E\x5A\xE0\x70\x45\x51"}, + { GCRY_CIPHER_AES, /* Packet Vector #19 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x42\xFF\xF8\xF1\x95\x1C\x3C\x96\x96\x76\x6C\xFA", + 8, "\xD8\x5B\xC7\xE6\x9F\x94\x4F\xB8", + 23, + "\x8A\x19\xB9\x50\xBC\xF7\x1A\x01\x8E\x5E\x67\x01\xC9\x17\x87\x65\x98\x09\xD6\x7D\xBE\xDD\x18", + 33, + "\xBC\x21\x8D\xAA\x94\x74\x27\xB6\xDB\x38\x6A\x99\xAC\x1A\xEF\x23\xAD\xE0\xB5\x29\x39\xCB\x6A\x63\x7C\xF9\xBE\xC2\x40\x88\x97\xC6\xBA"}, + { GCRY_CIPHER_AES, /* Packet Vector #20 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x92\x0F\x40\xE5\x6C\xDC\x3C\x96\x96\x76\x6C\xFA", + 8, "\x74\xA0\xEB\xC9\x06\x9F\x5B\x37", + 24, + "\x17\x61\x43\x3C\x37\xC5\xA3\x5F\xC1\xF3\x9F\x40\x63\x02\xEB\x90\x7C\x61\x63\xBE\x38\xC9\x84\x37", + 34, + "\x58\x10\xE6\xFD\x25\x87\x40\x22\xE8\x03\x61\xA4\x78\xE3\xE9\xCF\x48\x4A\xB0\x4F\x44\x7E\xFF\xF6\xF0\xA4\x77\xCC\x2F\xC9\xBF\x54\x89\x44"}, + { GCRY_CIPHER_AES, /* Packet Vector #21 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, 
"\x00\x27\xCA\x0C\x71\x20\xBC\x3C\x96\x96\x76\x6C\xFA", + 8, "\x44\xA3\xAA\x3A\xAE\x64\x75\xCA", + 25, + "\xA4\x34\xA8\xE5\x85\x00\xC6\xE4\x15\x30\x53\x88\x62\xD6\x86\xEA\x9E\x81\x30\x1B\x5A\xE4\x22\x6B\xFA", + 35, + "\xF2\xBE\xED\x7B\xC5\x09\x8E\x83\xFE\xB5\xB3\x16\x08\xF8\xE2\x9C\x38\x81\x9A\x89\xC8\xE7\x76\xF1\x54\x4D\x41\x51\xA4\xED\x3A\x8B\x87\xB9\xCE"}, + { GCRY_CIPHER_AES, /* Packet Vector #22 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x5B\x8C\xCB\xCD\x9A\xF8\x3C\x96\x96\x76\x6C\xFA", + 12, "\xEC\x46\xBB\x63\xB0\x25\x20\xC3\x3C\x49\xFD\x70", + 19, + "\xB9\x6B\x49\xE2\x1D\x62\x17\x41\x63\x28\x75\xDB\x7F\x6C\x92\x43\xD2\xD7\xC2", + 29, + "\x31\xD7\x50\xA0\x9D\xA3\xED\x7F\xDD\xD4\x9A\x20\x32\xAA\xBF\x17\xEC\x8E\xBF\x7D\x22\xC8\x08\x8C\x66\x6B\xE5\xC1\x97"}, + { GCRY_CIPHER_AES, /* Packet Vector #23 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x3E\xBE\x94\x04\x4B\x9A\x3C\x96\x96\x76\x6C\xFA", + 12, "\x47\xA6\x5A\xC7\x8B\x3D\x59\x42\x27\xE8\x5E\x71", + 20, + "\xE2\xFC\xFB\xB8\x80\x44\x2C\x73\x1B\xF9\x51\x67\xC8\xFF\xD7\x89\x5E\x33\x70\x76", + 30, + "\xE8\x82\xF1\xDB\xD3\x8C\xE3\xED\xA7\xC2\x3F\x04\xDD\x65\x07\x1E\xB4\x13\x42\xAC\xDF\x7E\x00\xDC\xCE\xC7\xAE\x52\x98\x7D"}, + { GCRY_CIPHER_AES, /* Packet Vector #24 */ + 16, "\xD7\x82\x8D\x13\xB2\xB0\xBD\xC3\x25\xA7\x62\x36\xDF\x93\xCC\x6B", + 13, "\x00\x8D\x49\x3B\x30\xAE\x8B\x3C\x96\x96\x76\x6C\xFA", + 12, "\x6E\x37\xA6\xEF\x54\x6D\x95\x5D\x34\xAB\x60\x59", + 21, + "\xAB\xF2\x1C\x0B\x02\xFE\xB8\x8F\x85\x6D\xF4\xA3\x73\x81\xBC\xE3\xCC\x12\x85\x17\xD4", + 31, + "\xF3\x29\x05\xB8\x8A\x64\x1B\x04\xB9\xC9\xFF\xB5\x8C\xC3\x90\x90\x0F\x3D\xA1\x2A\xB1\x6D\xCE\x9E\x82\xEF\xA1\x6D\xA6\x20\x59"}, + /* RFC 5528 */ + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #1 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x03\x02\x01\x00\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", 
+ 23, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 31, + "\xBA\x73\x71\x85\xE7\x19\x31\x04\x92\xF3\x8A\x5F\x12\x51\xDA\x55\xFA\xFB\xC9\x49\x84\x8A\x0D\xFC\xAE\xCE\x74\x6B\x3D\xB9\xAD"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #2 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x04\x03\x02\x01\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 24, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 32, + "\x5D\x25\x64\xBF\x8E\xAF\xE1\xD9\x95\x26\xEC\x01\x6D\x1B\xF0\x42\x4C\xFB\xD2\xCD\x62\x84\x8F\x33\x60\xB2\x29\x5D\xF2\x42\x83\xE8"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #3 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x05\x04\x03\x02\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 25, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 33, + "\x81\xF6\x63\xD6\xC7\x78\x78\x17\xF9\x20\x36\x08\xB9\x82\xAD\x15\xDC\x2B\xBD\x87\xD7\x56\xF7\x92\x04\xF5\x51\xD6\x68\x2F\x23\xAA\x46"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #4 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x06\x05\x04\x03\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 19, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 27, + "\xCA\xEF\x1E\x82\x72\x11\xB0\x8F\x7B\xD9\x0F\x08\xC7\x72\x88\xC0\x70\xA4\xA0\x8B\x3A\x93\x3A\x63\xE4\x97\xA0"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #5 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x07\x06\x05\x04\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 20, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 
28, + "\x2A\xD3\xBA\xD9\x4F\xC5\x2E\x92\xBE\x43\x8E\x82\x7C\x10\x23\xB9\x6A\x8A\x77\x25\x8F\xA1\x7B\xA7\xF3\x31\xDB\x09"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #6 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x08\x07\x06\x05\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 21, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 29, + "\xFE\xA5\x48\x0B\xA5\x3F\xA8\xD3\xC3\x44\x22\xAA\xCE\x4D\xE6\x7F\xFA\x3B\xB7\x3B\xAB\xAB\x36\xA1\xEE\x4F\xE0\xFE\x28"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #7 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x09\x08\x07\x06\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 23, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 33, + "\x54\x53\x20\x26\xE5\x4C\x11\x9A\x8D\x36\xD9\xEC\x6E\x1E\xD9\x74\x16\xC8\x70\x8C\x4B\x5C\x2C\xAC\xAF\xA3\xBC\xCF\x7A\x4E\xBF\x95\x73"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #8 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0A\x09\x08\x07\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 24, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 34, + "\x8A\xD1\x9B\x00\x1A\x87\xD1\x48\xF4\xD9\x2B\xEF\x34\x52\x5C\xCC\xE3\xA6\x3C\x65\x12\xA6\xF5\x75\x73\x88\xE4\x91\x3E\xF1\x47\x01\xF4\x41"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #9 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0B\x0A\x09\x08\xA0\xA1\xA2\xA3\xA4\xA5", + 8, "\x00\x01\x02\x03\x04\x05\x06\x07", + 25, + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 35, + 
"\x5D\xB0\x8D\x62\x40\x7E\x6E\x31\xD6\x0F\x9C\xA2\xC6\x04\x74\x21\x9A\xC0\xBE\x50\xC0\xD4\xA5\x77\x87\x94\xD6\xE2\x30\xCD\x25\xC9\xFE\xBF\x87"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #10 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0C\x0B\x0A\x09\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 19, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E", + 29, + "\xDB\x11\x8C\xCE\xC1\xB8\x76\x1C\x87\x7C\xD8\x96\x3A\x67\xD6\xF3\xBB\xBC\x5C\xD0\x92\x99\xEB\x11\xF3\x12\xF2\x32\x37"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #11 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0D\x0C\x0B\x0A\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 20, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", + 30, + "\x7C\xC8\x3D\x8D\xC4\x91\x03\x52\x5B\x48\x3D\xC5\xCA\x7E\xA9\xAB\x81\x2B\x70\x56\x07\x9D\xAF\xFA\xDA\x16\xCC\xCF\x2C\x4E"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #12 */ + 16, "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF", + 13, "\x00\x00\x00\x0E\x0D\x0C\x0B\xA0\xA1\xA2\xA3\xA4\xA5", + 12, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B", + 21, + "\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20", + 31, + "\x2C\xD3\x5B\x88\x20\xD2\x3E\x7A\xA3\x51\xB0\xE9\x2F\xC7\x93\x67\x23\x8B\x2C\xC7\x48\xCB\xB9\x4C\x29\x47\x79\x3D\x64\xAF\x75"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #13 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\xA9\x70\x11\x0E\x19\x27\xB1\x60\xB6\xA3\x1C\x1C", + 8, "\x6B\x7F\x46\x45\x07\xFA\xE4\x96", + 23, + "\xC6\xB5\xF3\xE6\xCA\x23\x11\xAE\xF7\x47\x2B\x20\x3E\x73\x5E\xA5\x61\xAD\xB1\x7D\x56\xC5\xA3", + 31, + 
"\xA4\x35\xD7\x27\x34\x8D\xDD\x22\x90\x7F\x7E\xB8\xF5\xFD\xBB\x4D\x93\x9D\xA6\x52\x4D\xB4\xF6\x45\x58\xC0\x2D\x25\xB1\x27\xEE"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #14 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\x83\xCD\x8C\xE0\xCB\x42\xB1\x60\xB6\xA3\x1C\x1C", + 8, "\x98\x66\x05\xB4\x3D\xF1\x5D\xE7", + 24, + "\x01\xF6\xCE\x67\x64\xC5\x74\x48\x3B\xB0\x2E\x6B\xBF\x1E\x0A\xBD\x26\xA2\x25\x72\xB4\xD8\x0E\xE7", + 32, + "\x8A\xE0\x52\x50\x8F\xBE\xCA\x93\x2E\x34\x6F\x05\xE0\xDC\x0D\xFB\xCF\x93\x9E\xAF\xFA\x3E\x58\x7C\x86\x7D\x6E\x1C\x48\x70\x38\x06"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #15 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\x5F\x54\x95\x0B\x18\xF2\xB1\x60\xB6\xA3\x1C\x1C", + 8, "\x48\xF2\xE7\xE1\xA7\x67\x1A\x51", + 25, + "\xCD\xF1\xD8\x40\x6F\xC2\xE9\x01\x49\x53\x89\x70\x05\xFB\xFB\x8B\xA5\x72\x76\xF9\x24\x04\x60\x8E\x08", + 33, + "\x08\xB6\x7E\xE2\x1C\x8B\xF2\x6E\x47\x3E\x40\x85\x99\xE9\xC0\x83\x6D\x6A\xF0\xBB\x18\xDF\x55\x46\x6C\xA8\x08\x78\xA7\x90\x47\x6D\xE5"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #16 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\xEC\x60\x08\x63\x31\x9A\xB1\x60\xB6\xA3\x1C\x1C", + 12, "\xDE\x97\xDF\x3B\x8C\xBD\x6D\x8E\x50\x30\xDA\x4C", + 19, + "\xB0\x05\xDC\xFA\x0B\x59\x18\x14\x26\xA9\x61\x68\x5A\x99\x3D\x8C\x43\x18\x5B", + 27, + "\x63\xB7\x8B\x49\x67\xB1\x9E\xDB\xB7\x33\xCD\x11\x14\xF6\x4E\xB2\x26\x08\x93\x68\xC3\x54\x82\x8D\x95\x0C\xC5"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #17 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\x60\xCF\xF1\xA3\x1E\xA1\xB1\x60\xB6\xA3\x1C\x1C", + 12, "\xA5\xEE\x93\xE4\x57\xDF\x05\x46\x6E\x78\x2D\xCF", + 20, + "\x2E\x20\x21\x12\x98\x10\x5F\x12\x9D\x5E\xD9\x5B\x93\xF7\x2D\x30\xB2\xFA\xCC\xD7", + 28, + 
"\x0B\xC6\xBB\xE2\xA8\xB9\x09\xF4\x62\x9E\xE6\xDC\x14\x8D\xA4\x44\x10\xE1\x8A\xF4\x31\x47\x38\x32\x76\xF6\x6A\x9F"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #18 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\x0F\x85\xCD\x99\x5C\x97\xB1\x60\xB6\xA3\x1C\x1C", + 12, "\x24\xAA\x1B\xF9\xA5\xCD\x87\x61\x82\xA2\x50\x74", + 21, + "\x26\x45\x94\x1E\x75\x63\x2D\x34\x91\xAF\x0F\xC0\xC9\x87\x6C\x3B\xE4\xAA\x74\x68\xC9", + 29, + "\x22\x2A\xD6\x32\xFA\x31\xD6\xAF\x97\x0C\x34\x5F\x7E\x77\xCA\x3B\xD0\xDC\x25\xB3\x40\xA1\xA3\xD3\x1F\x8D\x4B\x44\xB7"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #19 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\xC2\x9B\x2C\xAA\xC4\xCD\xB1\x60\xB6\xA3\x1C\x1C", + 8, "\x69\x19\x46\xB9\xCA\x07\xBE\x87", + 23, + "\x07\x01\x35\xA6\x43\x7C\x9D\xB1\x20\xCD\x61\xD8\xF6\xC3\x9C\x3E\xA1\x25\xFD\x95\xA0\xD2\x3D", + 33, + "\x05\xB8\xE1\xB9\xC4\x9C\xFD\x56\xCF\x13\x0A\xA6\x25\x1D\xC2\xEC\xC0\x6C\xCC\x50\x8F\xE6\x97\xA0\x06\x6D\x57\xC8\x4B\xEC\x18\x27\x68"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #20 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\x2C\x6B\x75\x95\xEE\x62\xB1\x60\xB6\xA3\x1C\x1C", + 8, "\xD0\xC5\x4E\xCB\x84\x62\x7D\xC4", + 24, + "\xC8\xC0\x88\x0E\x6C\x63\x6E\x20\x09\x3D\xD6\x59\x42\x17\xD2\xE1\x88\x77\xDB\x26\x4E\x71\xA5\xCC", + 34, + "\x54\xCE\xB9\x68\xDE\xE2\x36\x11\x57\x5E\xC0\x03\xDF\xAA\x1C\xD4\x88\x49\xBD\xF5\xAE\x2E\xDB\x6B\x7F\xA7\x75\xB1\x50\xED\x43\x83\xC5\xA9"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #21 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\xC5\x3C\xD4\xC2\xAA\x24\xB1\x60\xB6\xA3\x1C\x1C", + 8, "\xE2\x85\xE0\xE4\x80\x8C\xDA\x3D", + 25, + "\xF7\x5D\xAA\x07\x10\xC4\xE6\x42\x97\x79\x4D\xC2\xB7\xD2\xA2\x07\x57\xB1\xAA\x4E\x44\x80\x02\xFF\xAB", + 35, + 
"\xB1\x40\x45\x46\xBF\x66\x72\x10\xCA\x28\xE3\x09\xB3\x9B\xD6\xCA\x7E\x9F\xC8\x28\x5F\xE6\x98\xD4\x3C\xD2\x0A\x02\xE0\xBD\xCA\xED\x20\x10\xD3"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #22 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\xBE\xE9\x26\x7F\xBA\xDC\xB1\x60\xB6\xA3\x1C\x1C", + 12, "\x6C\xAE\xF9\x94\x11\x41\x57\x0D\x7C\x81\x34\x05", + 19, + "\xC2\x38\x82\x2F\xAC\x5F\x98\xFF\x92\x94\x05\xB0\xAD\x12\x7A\x4E\x41\x85\x4E", + 29, + "\x94\xC8\x95\x9C\x11\x56\x9A\x29\x78\x31\xA7\x21\x00\x58\x57\xAB\x61\xB8\x7A\x2D\xEA\x09\x36\xB6\xEB\x5F\x62\x5F\x5D"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #23 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\xDF\xA8\xB1\x24\x50\x07\xB1\x60\xB6\xA3\x1C\x1C", + 12, "\x36\xA5\x2C\xF1\x6B\x19\xA2\x03\x7A\xB7\x01\x1E", + 20, + "\x4D\xBF\x3E\x77\x4A\xD2\x45\xE5\xD5\x89\x1F\x9D\x1C\x32\xA0\xAE\x02\x2C\x85\xD7", + 30, + "\x58\x69\xE3\xAA\xD2\x44\x7C\x74\xE0\xFC\x05\xF9\xA4\xEA\x74\x57\x7F\x4D\xE8\xCA\x89\x24\x76\x42\x96\xAD\x04\x11\x9C\xE7"}, + { GCRY_CIPHER_CAMELLIA128, /* Packet Vector #24 */ + 16, "\xD7\x5C\x27\x78\x07\x8C\xA9\x3D\x97\x1F\x96\xFD\xE7\x20\xF4\xCD", + 13, "\x00\x3B\x8F\xD8\xD3\xA9\x37\xB1\x60\xB6\xA3\x1C\x1C", + 12, "\xA4\xD4\x99\xF7\x84\x19\x72\x8C\x19\x17\x8B\x0C", + 21, + "\x9D\xC9\xED\xAE\x2F\xF5\xDF\x86\x36\xE8\xC6\xDE\x0E\xED\x55\xF7\x86\x7E\x33\x33\x7D", + 31, + "\x4B\x19\x81\x56\x39\x3B\x0F\x77\x96\x08\x6A\xAF\xB4\x54\xF8\xC3\xF0\x34\xCC\xA9\x66\x94\x5F\x1F\xCE\xA7\xE1\x1B\xEE\x6A\x2F"} + }; + static const int cut[] = { 0, 1, 8, 10, 16, 19, -1 }; + gcry_cipher_hd_t hde, hdd; + unsigned char out[MAX_DATA_LEN]; + size_t ctl_params[3]; + int split, aadsplit; + size_t j, i, keylen, blklen, authlen; + gcry_error_t err = 0; + + if (verbose) + fprintf (stderr, " Starting CCM checks.\n"); + + for (i = 0; i < sizeof (tv) / sizeof (tv[0]); i++) + { + if (verbose) + fprintf (stderr, " checking CCM mode for %s [%i]\n", + 
gcry_cipher_algo_name (tv[i].algo), + tv[i].algo); + + for (j = 0; j < sizeof (cut) / sizeof (cut[0]); j++) + { + split = cut[j] < 0 ? tv[i].plainlen : cut[j]; + if (tv[i].plainlen < split) + continue; + + err = gcry_cipher_open (&hde, tv[i].algo, GCRY_CIPHER_MODE_CCM, 0); + if (!err) + err = gcry_cipher_open (&hdd, tv[i].algo, GCRY_CIPHER_MODE_CCM, 0); + if (err) + { + fail ("cipher-ccm, gcry_cipher_open failed: %s\n", + gpg_strerror (err)); + return; + } + + keylen = gcry_cipher_get_algo_keylen(tv[i].algo); + if (!keylen) + { + fail ("cipher-ccm, gcry_cipher_get_algo_keylen failed\n"); + return; + } + + err = gcry_cipher_setkey (hde, tv[i].key, keylen); + if (!err) + err = gcry_cipher_setkey (hdd, tv[i].key, keylen); + if (err) + { + fail ("cipher-ccm, gcry_cipher_setkey failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + blklen = gcry_cipher_get_algo_blklen(tv[i].algo); + if (!blklen) + { + fail ("cipher-ccm, gcry_cipher_get_algo_blklen failed\n"); + return; + } + + err = gcry_cipher_setiv (hde, tv[i].nonce, tv[i].noncelen); + if (!err) + err = gcry_cipher_setiv (hdd, tv[i].nonce, tv[i].noncelen); + if (err) + { + fail ("cipher-ccm, gcry_cipher_setiv failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + authlen = tv[i].cipherlen - tv[i].plainlen; + ctl_params[0] = tv[i].plainlen; /* encryptedlen */ + ctl_params[1] = tv[i].aadlen; /* aadlen */ + ctl_params[2] = authlen; /* authtaglen */ + err = gcry_cipher_ctl (hde, GCRYCTL_SET_CCM_LENGTHS, ctl_params, + sizeof(ctl_params)); + if (!err) + err = gcry_cipher_ctl (hdd, GCRYCTL_SET_CCM_LENGTHS, ctl_params, + sizeof(ctl_params)); + if (err) + { + fail ("cipher-ccm, gcry_cipher_ctl GCRYCTL_SET_CCM_LENGTHS " + "failed: %s\n", gpg_strerror (err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + aadsplit = split > tv[i].aadlen ? 
0 : split; + + err = gcry_cipher_authenticate (hde, tv[i].aad, + tv[i].aadlen - aadsplit); + if (!err) + err = gcry_cipher_authenticate (hde, + &tv[i].aad[tv[i].aadlen - aadsplit], + aadsplit); + if (!err) + err = gcry_cipher_authenticate (hdd, tv[i].aad, + tv[i].aadlen - aadsplit); + if (!err) + err = gcry_cipher_authenticate (hdd, + &tv[i].aad[tv[i].aadlen - aadsplit], + aadsplit); + if (err) + { + fail ("cipher-ccm, gcry_cipher_authenticate failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + err = gcry_cipher_encrypt (hde, out, MAX_DATA_LEN, tv[i].plaintext, + tv[i].plainlen - split); + if (!err) + err = gcry_cipher_encrypt (hde, &out[tv[i].plainlen - split], + MAX_DATA_LEN - (tv[i].plainlen - split), + &tv[i].plaintext[tv[i].plainlen - split], + split); + if (err) + { + fail ("cipher-ccm, gcry_cipher_encrypt (%d:%d) failed: %s\n", + i, j, gpg_strerror (err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + err = gcry_cipher_gettag (hde, &out[tv[i].plainlen], authlen); + if (err) + { + fail ("cipher-ccm, gcry_cipher_gettag (%d:%d) failed: %s\n", + i, j, gpg_strerror (err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + if (memcmp (tv[i].ciphertext, out, tv[i].cipherlen)) + fail ("cipher-ccm, encrypt mismatch entry %d:%d\n", i, j); + + err = gcry_cipher_decrypt (hdd, out, tv[i].plainlen - split, NULL, 0); + if (!err) + err = gcry_cipher_decrypt (hdd, &out[tv[i].plainlen - split], split, + NULL, 0); + if (err) + { + fail ("cipher-ccm, gcry_cipher_decrypt (%d:%d) failed: %s\n", + i, j, gpg_strerror (err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + if (memcmp (tv[i].plaintext, out, tv[i].plainlen)) + fail ("cipher-ccm, decrypt mismatch entry %d:%d\n", i, j); + + err = gcry_cipher_checktag (hdd, &out[tv[i].plainlen], authlen); + if (err) + { + fail ("cipher-ccm, gcry_cipher_checktag (%d:%d) failed: %s\n", + i, j, gpg_strerror 
(err)); + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + return; + } + + gcry_cipher_close (hde); + gcry_cipher_close (hdd); + } + } + + /* Large buffer tests. */ + + /* Test encoding of aadlen > 0xfeff. */ + { + static const char key[]={0x40,0x41,0x42,0x43,0x44,0x45,0x46,0x47, + 0x48,0x49,0x4a,0x4b,0x4c,0x4d,0x4e,0x4f}; + static const char iv[]={0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,0x18,0x19}; + static const char tag[]={0x9C,0x76,0xE7,0x33,0xD5,0x15,0xB3,0x6C, + 0xBA,0x76,0x95,0xF7,0xFB,0x91}; + char buf[1024]; + size_t enclen = 0x20000; + size_t aadlen = 0x20000; + size_t taglen = sizeof(tag); + + err = gcry_cipher_open (&hde, GCRY_CIPHER_AES, GCRY_CIPHER_MODE_CCM, 0); + if (err) + { + fail ("cipher-ccm-large, gcry_cipher_open failed: %s\n", + gpg_strerror (err)); + return; + } + + err = gcry_cipher_setkey (hde, key, sizeof (key)); + if (err) + { + fail ("cipher-ccm-large, gcry_cipher_setkey failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + + err = gcry_cipher_setiv (hde, iv, sizeof (iv)); + if (err) + { + fail ("cipher-ccm-large, gcry_cipher_setiv failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + + ctl_params[0] = enclen; /* encryptedlen */ + ctl_params[1] = aadlen; /* aadlen */ + ctl_params[2] = taglen; /* authtaglen */ + err = gcry_cipher_ctl (hde, GCRYCTL_SET_CCM_LENGTHS, ctl_params, + sizeof(ctl_params)); + if (err) + { + fail ("cipher-ccm-large, gcry_cipher_ctl GCRYCTL_SET_CCM_LENGTHS " + "failed: %s\n", gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + + memset (buf, 0xaa, sizeof(buf)); + + for (i = 0; i < aadlen; i += sizeof(buf)) + { + err = gcry_cipher_authenticate (hde, buf, sizeof (buf)); + if (err) + { + fail ("cipher-ccm-large, gcry_cipher_authenticate failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + } + + for (i = 0; i < enclen; i += sizeof(buf)) + { + memset (buf, 0xee, sizeof(buf)); + err = gcry_cipher_encrypt (hde, buf, sizeof 
(buf), NULL, 0); + if (err) + { + fail ("cipher-ccm-large, gcry_cipher_encrypt failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + } + + err = gcry_cipher_gettag (hde, buf, taglen); + if (err) + { + fail ("cipher-ccm-large, gcry_cipher_gettag failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + + if (memcmp (buf, tag, taglen) != 0) + fail ("cipher-ccm-large, encrypt mismatch entry\n"); + } + +#if 0 + /* Test encoding of aadlen > 0xffffffff. */ + { + static const char key[]={0x40,0x41,0x42,0x43,0x44,0x45,0x46,0x47, + 0x48,0x49,0x4a,0x4b,0x4c,0x4d,0x4e,0x4f}; + static const char iv[]={0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,0x18,0x19}; + static const char tag[]={0x01,0xB2,0xC3,0x4A,0xA6,0x6A,0x07,0x6D, + 0xBC,0xBD,0xEA,0x17,0xD3,0x73,0xD7,0xD4}; + char buf[1024]; + size_t enclen = (size_t)0xffffffff + 1 + 1024; + size_t aadlen = (size_t)0xffffffff + 1 + 1024; + size_t taglen = sizeof(tag); + + err = gcry_cipher_open (&hde, GCRY_CIPHER_AES, GCRY_CIPHER_MODE_CCM, 0); + if (err) + { + fail ("cipher-ccm-huge, gcry_cipher_open failed: %s\n", + gpg_strerror (err)); + return; + } + + err = gcry_cipher_setkey (hde, key, sizeof (key)); + if (err) + { + fail ("cipher-ccm-huge, gcry_cipher_setkey failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + + err = gcry_cipher_setiv (hde, iv, sizeof (iv)); + if (err) + { + fail ("cipher-ccm-huge, gcry_cipher_setiv failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + + ctl_params[0] = enclen; /* encryptedlen */ + ctl_params[1] = aadlen; /* aadlen */ + ctl_params[2] = taglen; /* authtaglen */ + err = gcry_cipher_ctl (hde, GCRYCTL_SET_CCM_LENGTHS, ctl_params, + sizeof(ctl_params)); + if (err) + { + fail ("cipher-ccm-huge, gcry_cipher_ctl GCRYCTL_SET_CCM_LENGTHS failed:" + "%s\n", gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + + memset (buf, 0xaa, sizeof(buf)); + + for (i = 0; i < aadlen; i += sizeof(buf)) + { + 
err = gcry_cipher_authenticate (hde, buf, sizeof (buf)); + if (err) + { + fail ("cipher-ccm-huge, gcry_cipher_authenticate failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + } + + for (i = 0; i < enclen; i += sizeof(buf)) + { + memset (buf, 0xee, sizeof(buf)); + err = gcry_cipher_encrypt (hde, buf, sizeof (buf), NULL, 0); + if (err) + { + fail ("cipher-ccm-huge, gcry_cipher_encrypt failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + } + + err = gcry_cipher_gettag (hde, buf, taglen); + if (err) + { + fail ("cipher-ccm-huge, gcry_cipher_gettag failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hde); + return; + } + + if (memcmp (buf, tag, taglen) != 0) + fail ("cipher-ccm-huge, encrypt mismatch entry\n"); + } +#endif + + if (verbose) + fprintf (stderr, " Completed CCM checks.\n"); +} + + +static void check_stream_cipher (void) { struct tv @@ -2455,6 +3225,7 @@ check_cipher_modes(void) check_ctr_cipher (); check_cfb_cipher (); check_ofb_cipher (); + check_ccm_cipher (); check_stream_cipher (); check_stream_cipher_large_block (); diff --git a/tests/benchmark.c b/tests/benchmark.c index ecda0d3..d3ef1a2 100644 --- a/tests/benchmark.c +++ b/tests/benchmark.c @@ -435,6 +435,40 @@ md_bench ( const char *algoname ) fflush (stdout); } + +static void ccm_aead_init(gcry_cipher_hd_t hd, size_t buflen, int authlen) +{ + const int _L = 4; + const int noncelen = 15 - _L; + char nonce[noncelen]; + size_t params[3]; + gcry_error_t err = GPG_ERR_NO_ERROR; + + memset (nonce, 0x33, noncelen); + + err = gcry_cipher_setiv (hd, nonce, noncelen); + if (err) + { + fprintf (stderr, "gcry_cipher_setiv failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } + + params[0] = buflen; /* encryptedlen */ + params[1] = 0; /* aadlen */ + params[2] = authlen; /* authtaglen */ + err = gcry_cipher_ctl (hd, GCRYCTL_SET_CCM_LENGTHS, params, sizeof(params)); + if (err) + { + fprintf (stderr, "gcry_cipher_setiv 
failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } +} + + static void cipher_bench ( const char *algoname ) { @@ -448,12 +482,21 @@ cipher_bench ( const char *algoname ) char *raw_outbuf, *raw_buf; size_t allocated_buflen, buflen; int repetitions; - static struct { int mode; const char *name; int blocked; } modes[] = { + static const struct { + int mode; + const char *name; + int blocked; + void (* const aead_init)(gcry_cipher_hd_t hd, size_t buflen, int authlen); + int req_blocksize; + int authlen; + } modes[] = { { GCRY_CIPHER_MODE_ECB, " ECB/Stream", 1 }, { GCRY_CIPHER_MODE_CBC, " CBC", 1 }, { GCRY_CIPHER_MODE_CFB, " CFB", 0 }, { GCRY_CIPHER_MODE_OFB, " OFB", 0 }, { GCRY_CIPHER_MODE_CTR, " CTR", 0 }, + { GCRY_CIPHER_MODE_CCM, " CCM", 0, + ccm_aead_init, GCRY_CCM_BLOCK_LEN, 8 }, { GCRY_CIPHER_MODE_STREAM, "", 0 }, {0} }; @@ -542,9 +585,16 @@ cipher_bench ( const char *algoname ) for (modeidx=0; modes[modeidx].mode; modeidx++) { if ((blklen > 1 && modes[modeidx].mode == GCRY_CIPHER_MODE_STREAM) - | (blklen == 1 && modes[modeidx].mode != GCRY_CIPHER_MODE_STREAM)) + || (blklen == 1 && modes[modeidx].mode != GCRY_CIPHER_MODE_STREAM)) continue; + if (modes[modeidx].req_blocksize > 0 + && blklen != modes[modeidx].req_blocksize) + { + printf (" %7s %7s", "-", "-" ); + continue; + } + for (i=0; i < sizeof buf; i++) buf[i] = i; @@ -585,7 +635,18 @@ cipher_bench ( const char *algoname ) exit (1); } } - err = gcry_cipher_encrypt ( hd, outbuf, buflen, buf, buflen); + if (modes[modeidx].aead_init) + { + (*modes[modeidx].aead_init) (hd, buflen, modes[modeidx].authlen); + err = gcry_cipher_encrypt (hd, outbuf, buflen, buf, buflen); + if (err) + break; + err = gcry_cipher_gettag (hd, outbuf, modes[modeidx].authlen); + } + else + { + err = gcry_cipher_encrypt (hd, outbuf, buflen, buf, buflen); + } } stop_timer (); @@ -632,7 +693,18 @@ cipher_bench ( const char *algoname ) exit (1); } } - err = gcry_cipher_decrypt ( hd, outbuf, buflen, buf, buflen); + 
if (modes[modeidx].aead_init) + { + (*modes[modeidx].aead_init) (hd, buflen, modes[modeidx].authlen); + err = gcry_cipher_decrypt (hd, outbuf, buflen, buf, buflen); + if (err) + break; + err = gcry_cipher_checktag (hd, outbuf, modes[modeidx].authlen); + if (gpg_err_code (err) == GPG_ERR_CHECKSUM) + err = gpg_error (GPG_ERR_NO_ERROR); + } + else + err = gcry_cipher_decrypt (hd, outbuf, buflen, buf, buflen); } stop_timer (); printf (" %s", elapsed_time ()); commit 95654041f2aa62f71aac4d8614dafe8433d10f95 Author: Jussi Kivilinna Date: Tue Oct 22 17:07:53 2013 +0300 Add API to support AEAD cipher modes * cipher/cipher.c (_gcry_cipher_authenticate, _gcry_cipher_checktag) (_gcry_cipher_gettag): New. * doc/gcrypt.texi: Add documentation for new API functions. * src/visibility.c (gcry_cipher_authenticate, gcry_cipher_checktag) (gcry_cipher_gettag): New. * src/gcrypt.h.in, src/visibility.h: Add declarations of these functions. * src/libgcrypt.def, src/libgcrypt.vers: Export functions. -- Authenticated Encryption with Associated Data (AEAD) cipher modes provide an authentication tag that can be used to authenticate the message. At the same time they allow one to specify additional (unencrypted) data that will be authenticated together with the message. This class of cipher modes requires the additional API introduced in this commit. This patch is based on an original patch by Dmitry Eremin-Solenikov. Changes in v2: - Change gcry_cipher_tag to gcry_cipher_checktag and gcry_cipher_gettag for supplying the tag (checktag) during decryption and reading the tag (gettag) after encryption. - Change gcry_cipher_authenticate to gcry_cipher_setaad, since additional parameters are needed for some AEAD modes (in this case CCM, which needs the length of encrypted data and tag for MAC initialization). - Add some documentation. Changes in v3: - Change gcry_cipher_setaad back to gcry_cipher_authenticate. Additional parameters (encrypt_len, tag_len, aad_len) for CCM will be given through GCRYCTL_SET_CCM_LENGTHS.
Changes in v4: - log_fatal => log_error Signed-off-by: Jussi Kivilinna diff --git a/cipher/cipher.c b/cipher/cipher.c index 75d42d1..d6d3021 100644 --- a/cipher/cipher.c +++ b/cipher/cipher.c @@ -910,6 +910,40 @@ _gcry_cipher_setctr (gcry_cipher_hd_t hd, const void *ctr, size_t ctrlen) return 0; } +gcry_error_t +_gcry_cipher_authenticate (gcry_cipher_hd_t hd, const void *abuf, + size_t abuflen) +{ + log_error ("gcry_cipher_authenticate: invalid mode %d\n", hd->mode); + + (void)abuf; + (void)abuflen; + + return gpg_error (GPG_ERR_INV_CIPHER_MODE); +} + +gcry_error_t +_gcry_cipher_gettag (gcry_cipher_hd_t hd, void *outtag, size_t taglen) +{ + log_error ("gcry_cipher_gettag: invalid mode %d\n", hd->mode); + + (void)outtag; + (void)taglen; + + return gpg_error (GPG_ERR_INV_CIPHER_MODE); +} + +gcry_error_t +_gcry_cipher_checktag (gcry_cipher_hd_t hd, const void *intag, size_t taglen) +{ + log_error ("gcry_cipher_checktag: invalid mode %d\n", hd->mode); + + (void)intag; + (void)taglen; + + return gpg_error (GPG_ERR_INV_CIPHER_MODE); +} + gcry_error_t gcry_cipher_ctl( gcry_cipher_hd_t h, int cmd, void *buffer, size_t buflen) diff --git a/doc/gcrypt.texi b/doc/gcrypt.texi index 473c484..0049fa0 100644 --- a/doc/gcrypt.texi +++ b/doc/gcrypt.texi @@ -1731,6 +1731,10 @@ matches the requirement of the selected algorithm and mode. This function is also used with the Salsa20 stream cipher to set or update the required nonce. In this case it needs to be called after setting the key. + +This function is also used with the AEAD cipher modes to set or +update the required nonce. + @end deftypefun @deftypefun gcry_error_t gcry_cipher_setctr (gcry_cipher_hd_t @var{h}, const void *@var{c}, size_t @var{l}) @@ -1750,6 +1754,37 @@ call to gcry_cipher_setkey and clear the initialization vector. Note that gcry_cipher_reset is implemented as a macro. 
@end deftypefun +Authenticated Encryption with Associated Data (AEAD) block cipher +modes require the handling of the authentication tag and the additional +authenticated data, which can be done by using the following +functions: + +@deftypefun gcry_error_t gcry_cipher_authenticate (gcry_cipher_hd_t @var{h}, const void *@var{abuf}, size_t @var{abuflen}) + +Process the buffer @var{abuf} of length @var{abuflen} as the additional +authenticated data (AAD) for AEAD cipher modes. + +@end deftypefun + +@deftypefun gcry_error_t gcry_cipher_gettag (gcry_cipher_hd_t @var{h}, void *@var{tag}, size_t @var{taglen}) + +This function is used to read the authentication tag after encryption. +The function finalizes and outputs the authentication tag to the buffer +@var{tag} of length @var{taglen} bytes. + +@end deftypefun + +@deftypefun gcry_error_t gcry_cipher_checktag (gcry_cipher_hd_t @var{h}, const void *@var{tag}, size_t @var{taglen}) + +Check the authentication tag after decryption. The authentication +tag is passed as the buffer @var{tag} of length @var{taglen} bytes +and compared to internal authentication tag computed during +decryption. Error code @code{GPG_ERR_CHECKSUM} is returned if +the authentication tag in the buffer @var{tag} does not match +the authentication tag calculated during decryption. + +@end deftypefun + The actual encryption and decryption is done by using one of the following functions. They may be used as often as required to process all the data. diff --git a/src/gcrypt.h.in b/src/gcrypt.h.in index 64cc0e4..f0ae927 100644 --- a/src/gcrypt.h.in +++ b/src/gcrypt.h.in @@ -953,6 +953,17 @@ gcry_error_t gcry_cipher_setkey (gcry_cipher_hd_t hd, gcry_error_t gcry_cipher_setiv (gcry_cipher_hd_t hd, const void *iv, size_t ivlen); +/* Provide additional authentication data for AEAD modes/ciphers.
*/ +gcry_error_t gcry_cipher_authenticate (gcry_cipher_hd_t hd, const void *abuf, + size_t abuflen); + +/* Get authentication tag for AEAD modes/ciphers. */ +gcry_error_t gcry_cipher_gettag (gcry_cipher_hd_t hd, void *outtag, + size_t taglen); + +/* Check authentication tag for AEAD modes/ciphers. */ +gcry_error_t gcry_cipher_checktag (gcry_cipher_hd_t hd, const void *intag, + size_t taglen); /* Reset the handle to the state after open. */ #define gcry_cipher_reset(h) gcry_cipher_ctl ((h), GCRYCTL_RESET, NULL, 0) diff --git a/src/libgcrypt.def b/src/libgcrypt.def index ec0c1e3..64ba370 100644 --- a/src/libgcrypt.def +++ b/src/libgcrypt.def @@ -255,6 +255,9 @@ EXPORTS gcry_sexp_extract_param @225 + gcry_cipher_authenticate @226 + gcry_cipher_gettag @227 + gcry_cipher_checktag @228 ;; end of file with public symbols for Windows. diff --git a/src/libgcrypt.vers b/src/libgcrypt.vers index be72aad..93eaa93 100644 --- a/src/libgcrypt.vers +++ b/src/libgcrypt.vers @@ -51,6 +51,7 @@ GCRYPT_1.6 { gcry_cipher_info; gcry_cipher_map_name; gcry_cipher_mode_from_oid; gcry_cipher_open; gcry_cipher_setkey; gcry_cipher_setiv; gcry_cipher_setctr; + gcry_cipher_authenticate; gcry_cipher_gettag; gcry_cipher_checktag; gcry_pk_algo_info; gcry_pk_algo_name; gcry_pk_ctl; gcry_pk_decrypt; gcry_pk_encrypt; gcry_pk_genkey; diff --git a/src/visibility.c b/src/visibility.c index 848925e..1f7bb3a 100644 --- a/src/visibility.c +++ b/src/visibility.c @@ -713,6 +713,33 @@ gcry_cipher_setctr (gcry_cipher_hd_t hd, const void *ctr, size_t ctrlen) return _gcry_cipher_setctr (hd, ctr, ctrlen); } +gcry_error_t +gcry_cipher_authenticate (gcry_cipher_hd_t hd, const void *abuf, size_t abuflen) +{ + if (!fips_is_operational ()) + return gpg_error (fips_not_operational ()); + + return _gcry_cipher_authenticate (hd, abuf, abuflen); +} + +gcry_error_t +gcry_cipher_gettag (gcry_cipher_hd_t hd, void *outtag, size_t taglen) +{ + if (!fips_is_operational ()) + return gpg_error (fips_not_operational ()); + + return 
_gcry_cipher_gettag (hd, outtag, taglen); +} + +gcry_error_t +gcry_cipher_checktag (gcry_cipher_hd_t hd, const void *intag, size_t taglen) +{ + if (!fips_is_operational ()) + return gpg_error (fips_not_operational ()); + + return _gcry_cipher_checktag (hd, intag, taglen); +} + gcry_error_t gcry_cipher_ctl (gcry_cipher_hd_t h, int cmd, void *buffer, size_t buflen) diff --git a/src/visibility.h b/src/visibility.h index 1c8f047..b2fa4c0 100644 --- a/src/visibility.h +++ b/src/visibility.h @@ -81,6 +81,9 @@ #define gcry_cipher_setkey _gcry_cipher_setkey #define gcry_cipher_setiv _gcry_cipher_setiv #define gcry_cipher_setctr _gcry_cipher_setctr +#define gcry_cipher_authenticate _gcry_cipher_authenticate +#define gcry_cipher_checktag _gcry_cipher_checktag +#define gcry_cipher_gettag _gcry_cipher_gettag #define gcry_cipher_ctl _gcry_cipher_ctl #define gcry_cipher_decrypt _gcry_cipher_decrypt #define gcry_cipher_encrypt _gcry_cipher_encrypt @@ -297,6 +300,9 @@ gcry_err_code_t gcry_md_get (gcry_md_hd_t hd, int algo, #undef gcry_cipher_setkey #undef gcry_cipher_setiv #undef gcry_cipher_setctr +#undef gcry_cipher_authenticate +#undef gcry_cipher_checktag +#undef gcry_cipher_gettag #undef gcry_cipher_ctl #undef gcry_cipher_decrypt #undef gcry_cipher_encrypt @@ -474,6 +480,9 @@ MARK_VISIBLE (gcry_cipher_close) MARK_VISIBLE (gcry_cipher_setkey) MARK_VISIBLE (gcry_cipher_setiv) MARK_VISIBLE (gcry_cipher_setctr) +MARK_VISIBLE (gcry_cipher_authenticate) +MARK_VISIBLE (gcry_cipher_checktag) +MARK_VISIBLE (gcry_cipher_gettag) MARK_VISIBLE (gcry_cipher_ctl) MARK_VISIBLE (gcry_cipher_decrypt) MARK_VISIBLE (gcry_cipher_encrypt) ----------------------------------------------------------------------- Summary of changes: cipher/Makefile.am | 1 + cipher/cipher-ccm.c | 370 ++++++++++++++++++++++ cipher/cipher-internal.h | 48 ++- cipher/cipher.c | 117 ++++++- doc/gcrypt.texi | 51 ++- src/gcrypt.h.in | 19 +- src/libgcrypt.def | 3 + src/libgcrypt.vers | 1 + src/visibility.c | 27 ++ 
src/visibility.h | 9 + tests/basic.c | 771 ++++++++++++++++++++++++++++++++++++++++++++++ tests/benchmark.c | 80 ++++- 12 files changed, 1484 insertions(+), 13 deletions(-) create mode 100644 cipher/cipher-ccm.c hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From cvs at cvs.gnupg.org Tue Oct 22 19:20:55 2013 From: cvs at cvs.gnupg.org (by Jussi Kivilinna) Date: Tue, 22 Oct 2013 19:20:55 +0200 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-324-g98674fd Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via 98674fdaa30ab22a3ac86ca05d688b5b6112895d (commit) via e67c67321ce240c93dd0fa2b21c649c0a8e233f7 (commit) via c7efaa5fe0ee92e321a7b49d56752cc12eb75fe0 (commit) from 335d9bf7b035815750b63a3a8334d6ce44dc4449 (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit 98674fdaa30ab22a3ac86ca05d688b5b6112895d Author: Jussi Kivilinna Date: Tue Oct 22 17:07:53 2013 +0300 twofish: add ARMv6 assembly implementation * cipher/Makefile.am: Add 'twofish-armv6.S'. * cipher/twofish-armv6.S: New. * cipher/twofish.c (USE_ARMV6_ASM): New macro. [USE_ARMV6_ASM] (_gcry_twofish_armv6_encrypt_block) (_gcry_twofish_armv6_decrypt_block): New prototypes. [USE_ARMV6_ASM] (twofish_encrypt, twofish_decrypt): Add. [USE_AMD64_ASM] (do_twofish_encrypt, do_twofish_decrypt): Remove. (_gcry_twofish_ctr_enc, _gcry_twofish_cfb_dec): Use 'twofish_encrypt' instead of 'do_twofish_encrypt'.
(_gcry_twofish_cbc_dec): Use 'twofish_decrypt' instead of 'do_twofish_decrypt'. * configure.ac [arm]: Add 'twofish-armv6.lo'. -- Add optimized ARMv6 assembly implementation for Twofish. The implementation is tuned for Cortex-A8. Unaligned access handling is done in the assembly part. For now, only enable this on little-endian systems as big-endian correctness has not been tested yet. Old (gcc-4.8) vs new (twofish-asm), Cortex-A8 (on armhf):

             ECB/Stream      CBC             CFB             OFB             CTR             CCM
             --------------- --------------- --------------- --------------- --------------- ---------------
TWOFISH      1.23x 1.25x     1.16x 1.26x     1.16x 1.30x     1.18x 1.17x     1.23x 1.23x     1.22x 1.22x

Signed-off-by: Jussi Kivilinna diff --git a/cipher/Makefile.am b/cipher/Makefile.am index b0efd89..3d8149a 100644 --- a/cipher/Makefile.am +++ b/cipher/Makefile.am @@ -80,7 +80,7 @@ sha512.c sha512-armv7-neon.S \ stribog.c \ tiger.c \ whirlpool.c \ -twofish.c twofish-amd64.S \ +twofish.c twofish-amd64.S twofish-armv6.S \ rfc2268.c \ camellia.c camellia.h camellia-glue.c camellia-aesni-avx-amd64.S \ camellia-aesni-avx2-amd64.S camellia-armv6.S diff --git a/cipher/twofish-armv6.S b/cipher/twofish-armv6.S new file mode 100644 index 0000000..b76ab37 --- /dev/null +++ b/cipher/twofish-armv6.S @@ -0,0 +1,365 @@ +/* twofish-armv6.S - ARM assembly implementation of Twofish cipher + * + * Copyright © 2013 Jussi Kivilinna + * + * This file is part of Libgcrypt. + * + * Libgcrypt is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as + * published by the Free Software Foundation; either version 2.1 of + * the License, or (at your option) any later version. + * + * Libgcrypt is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details.
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with this program; if not, see . + */ + +#include + +#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) +#ifdef HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS + +.text + +.syntax unified +.arm + +/* structure of TWOFISH_context: */ +#define s0 0 +#define s1 ((s0) + 4 * 256) +#define s2 ((s1) + 4 * 256) +#define s3 ((s2) + 4 * 256) +#define w ((s3) + 4 * 256) +#define k ((w) + 4 * 8) + +/* register macros */ +#define CTX %r0 +#define CTXs0 %r0 +#define CTXs1 %r1 +#define CTXs3 %r7 + +#define RA %r3 +#define RB %r4 +#define RC %r5 +#define RD %r6 + +#define RX %r2 +#define RY %ip + +#define RMASK %lr + +#define RT0 %r8 +#define RT1 %r9 +#define RT2 %r10 +#define RT3 %r11 + +/* helper macros */ +#define ldr_unaligned_le(rout, rsrc, offs, rtmp) \ + ldrb rout, [rsrc, #((offs) + 0)]; \ + ldrb rtmp, [rsrc, #((offs) + 1)]; \ + orr rout, rout, rtmp, lsl #8; \ + ldrb rtmp, [rsrc, #((offs) + 2)]; \ + orr rout, rout, rtmp, lsl #16; \ + ldrb rtmp, [rsrc, #((offs) + 3)]; \ + orr rout, rout, rtmp, lsl #24; + +#define str_unaligned_le(rin, rdst, offs, rtmp0, rtmp1) \ + mov rtmp0, rin, lsr #8; \ + strb rin, [rdst, #((offs) + 0)]; \ + mov rtmp1, rin, lsr #16; \ + strb rtmp0, [rdst, #((offs) + 1)]; \ + mov rtmp0, rin, lsr #24; \ + strb rtmp1, [rdst, #((offs) + 2)]; \ + strb rtmp0, [rdst, #((offs) + 3)]; + +#ifndef __ARMEL__ + /* bswap on big-endian */ + #define host_to_le(reg) \ + rev reg, reg; + #define le_to_host(reg) \ + rev reg, reg; +#else + /* nop on little-endian */ + #define host_to_le(reg) /*_*/ + #define le_to_host(reg) /*_*/ +#endif + +#define ldr_input_aligned_le(rin, a, b, c, d) \ + ldr a, [rin, #0]; \ + ldr b, [rin, #4]; \ + le_to_host(a); \ + ldr c, [rin, #8]; \ + le_to_host(b); \ + ldr d, [rin, #12]; \ + le_to_host(c); \ + le_to_host(d); + +#define str_output_aligned_le(rout, a, b, c, d) \ + le_to_host(a); \ + le_to_host(b); \ + str a, [rout, #0]; \ + le_to_host(c); \ + str b, [rout, #4]; 
\ + le_to_host(d); \ + str c, [rout, #8]; \ + str d, [rout, #12]; + +#ifdef __ARM_FEATURE_UNALIGNED + /* unaligned word reads/writes allowed */ + #define ldr_input_le(rin, ra, rb, rc, rd, rtmp) \ + ldr_input_aligned_le(rin, ra, rb, rc, rd) + + #define str_output_le(rout, ra, rb, rc, rd, rtmp0, rtmp1) \ + str_output_aligned_le(rout, ra, rb, rc, rd) +#else + /* need to handle unaligned reads/writes by byte reads */ + #define ldr_input_le(rin, ra, rb, rc, rd, rtmp0) \ + tst rin, #3; \ + beq 1f; \ + ldr_unaligned_le(ra, rin, 0, rtmp0); \ + ldr_unaligned_le(rb, rin, 4, rtmp0); \ + ldr_unaligned_le(rc, rin, 8, rtmp0); \ + ldr_unaligned_le(rd, rin, 12, rtmp0); \ + b 2f; \ + 1:;\ + ldr_input_aligned_le(rin, ra, rb, rc, rd); \ + 2:; + + #define str_output_le(rout, ra, rb, rc, rd, rtmp0, rtmp1) \ + tst rout, #3; \ + beq 1f; \ + str_unaligned_le(ra, rout, 0, rtmp0, rtmp1); \ + str_unaligned_le(rb, rout, 4, rtmp0, rtmp1); \ + str_unaligned_le(rc, rout, 8, rtmp0, rtmp1); \ + str_unaligned_le(rd, rout, 12, rtmp0, rtmp1); \ + b 2f; \ + 1:;\ + str_output_aligned_le(rout, ra, rb, rc, rd); \ + 2:; +#endif + +/********************************************************************** + 1-way twofish + **********************************************************************/ +#define encrypt_round(a, b, rc, rd, n, ror_a, adj_a) \ + and RT0, RMASK, b, lsr#(8 - 2); \ + and RY, RMASK, b, lsr#(16 - 2); \ + add RT0, RT0, #(s2 - s1); \ + and RT1, RMASK, b, lsr#(24 - 2); \ + ldr RY, [CTXs3, RY]; \ + and RT2, RMASK, b, lsl#(2); \ + ldr RT0, [CTXs1, RT0]; \ + and RT3, RMASK, a, lsr#(16 - 2 + (adj_a)); \ + ldr RT1, [CTXs0, RT1]; \ + and RX, RMASK, a, lsr#(8 - 2 + (adj_a)); \ + ldr RT2, [CTXs1, RT2]; \ + add RT3, RT3, #(s2 - s1); \ + ldr RX, [CTXs1, RX]; \ + ror_a(a); \ + \ + eor RY, RY, RT0; \ + ldr RT3, [CTXs1, RT3]; \ + and RT0, RMASK, a, lsl#(2); \ + eor RY, RY, RT1; \ + and RT1, RMASK, a, lsr#(24 - 2); \ + eor RY, RY, RT2; \ + ldr RT0, [CTXs0, RT0]; \ + eor RX, RX, RT3; \ + ldr RT1, [CTXs3, RT1]; 
\ + eor RX, RX, RT0; \ + \ + ldr RT3, [CTXs3, #(k - s3 + 8 * (n) + 4)]; \ + eor RX, RX, RT1; \ + ldr RT2, [CTXs3, #(k - s3 + 8 * (n))]; \ + \ + add RT0, RX, RY, lsl #1; \ + add RX, RX, RY; \ + add RT0, RT0, RT3; \ + add RX, RX, RT2; \ + eor rd, RT0, rd, ror #31; \ + eor rc, rc, RX; + +#define dummy(x) /*_*/ + +#define ror1(r) \ + ror r, r, #1; + +#define decrypt_round(a, b, rc, rd, n, ror_b, adj_b) \ + and RT3, RMASK, b, lsl#(2 - (adj_b)); \ + and RT1, RMASK, b, lsr#(8 - 2 + (adj_b)); \ + ror_b(b); \ + and RT2, RMASK, a, lsl#(2); \ + and RT0, RMASK, a, lsr#(8 - 2); \ + \ + ldr RY, [CTXs1, RT3]; \ + add RT1, RT1, #(s2 - s1); \ + ldr RX, [CTXs0, RT2]; \ + and RT3, RMASK, b, lsr#(16 - 2); \ + ldr RT1, [CTXs1, RT1]; \ + and RT2, RMASK, a, lsr#(16 - 2); \ + ldr RT0, [CTXs1, RT0]; \ + \ + add RT2, RT2, #(s2 - s1); \ + ldr RT3, [CTXs3, RT3]; \ + eor RY, RY, RT1; \ + \ + and RT1, RMASK, b, lsr#(24 - 2); \ + eor RX, RX, RT0; \ + ldr RT2, [CTXs1, RT2]; \ + and RT0, RMASK, a, lsr#(24 - 2); \ + \ + ldr RT1, [CTXs0, RT1]; \ + \ + eor RY, RY, RT3; \ + ldr RT0, [CTXs3, RT0]; \ + eor RX, RX, RT2; \ + eor RY, RY, RT1; \ + \ + ldr RT1, [CTXs3, #(k - s3 + 8 * (n) + 4)]; \ + eor RX, RX, RT0; \ + ldr RT2, [CTXs3, #(k - s3 + 8 * (n))]; \ + \ + add RT0, RX, RY, lsl #1; \ + add RX, RX, RY; \ + add RT0, RT0, RT1; \ + add RX, RX, RT2; \ + eor rd, rd, RT0; \ + eor rc, RX, rc, ror #31; + +#define first_encrypt_cycle(nc) \ + encrypt_round(RA, RB, RC, RD, (nc) * 2, dummy, 0); \ + encrypt_round(RC, RD, RA, RB, (nc) * 2 + 1, ror1, 1); + +#define encrypt_cycle(nc) \ + encrypt_round(RA, RB, RC, RD, (nc) * 2, ror1, 1); \ + encrypt_round(RC, RD, RA, RB, (nc) * 2 + 1, ror1, 1); + +#define last_encrypt_cycle(nc) \ + encrypt_round(RA, RB, RC, RD, (nc) * 2, ror1, 1); \ + encrypt_round(RC, RD, RA, RB, (nc) * 2 + 1, ror1, 1); \ + ror1(RA); + +#define first_decrypt_cycle(nc) \ + decrypt_round(RC, RD, RA, RB, (nc) * 2 + 1, dummy, 0); \ + decrypt_round(RA, RB, RC, RD, (nc) * 2, ror1, 1); + +#define 
decrypt_cycle(nc) \ + decrypt_round(RC, RD, RA, RB, (nc) * 2 + 1, ror1, 1); \ + decrypt_round(RA, RB, RC, RD, (nc) * 2, ror1, 1); + +#define last_decrypt_cycle(nc) \ + decrypt_round(RC, RD, RA, RB, (nc) * 2 + 1, ror1, 1); \ + decrypt_round(RA, RB, RC, RD, (nc) * 2, ror1, 1); \ + ror1(RD); + +.align 3 +.global _gcry_twofish_armv6_encrypt_block +.type _gcry_twofish_armv6_encrypt_block,%function; + +_gcry_twofish_armv6_encrypt_block: + /* input: + * %r0: ctx + * %r1: dst + * %r2: src + */ + push {%r1, %r4-%r11, %ip, %lr}; + + add RY, CTXs0, #w; + + ldr_input_le(%r2, RA, RB, RC, RD, RT0); + + /* Input whitening */ + ldm RY, {RT0, RT1, RT2, RT3}; + add CTXs3, CTXs0, #(s3 - s0); + add CTXs1, CTXs0, #(s1 - s0); + mov RMASK, #(0xff << 2); + eor RA, RA, RT0; + eor RB, RB, RT1; + eor RC, RC, RT2; + eor RD, RD, RT3; + + first_encrypt_cycle(0); + encrypt_cycle(1); + encrypt_cycle(2); + encrypt_cycle(3); + encrypt_cycle(4); + encrypt_cycle(5); + encrypt_cycle(6); + last_encrypt_cycle(7); + + add RY, CTXs3, #(w + 4*4 - s3); + pop {%r1}; /* dst */ + + /* Output whitening */ + ldm RY, {RT0, RT1, RT2, RT3}; + eor RC, RC, RT0; + eor RD, RD, RT1; + eor RA, RA, RT2; + eor RB, RB, RT3; + + str_output_le(%r1, RC, RD, RA, RB, RT0, RT1); + + pop {%r4-%r11, %ip, %lr}; + bx %lr; +.ltorg +.size _gcry_twofish_armv6_encrypt_block,.-_gcry_twofish_armv6_encrypt_block; + +.align 3 +.global _gcry_twofish_armv6_decrypt_block +.type _gcry_twofish_armv6_decrypt_block,%function; + +_gcry_twofish_armv6_decrypt_block: + /* input: + * %r0: ctx + * %r1: dst + * %r2: src + */ + push {%r1, %r4-%r11, %ip, %lr}; + + add CTXs3, CTXs0, #(s3 - s0); + + ldr_input_le(%r2, RC, RD, RA, RB, RT0); + + add RY, CTXs3, #(w + 4*4 - s3); + add CTXs3, CTXs0, #(s3 - s0); + + /* Input whitening */ + ldm RY, {RT0, RT1, RT2, RT3}; + add CTXs1, CTXs0, #(s1 - s0); + mov RMASK, #(0xff << 2); + eor RC, RC, RT0; + eor RD, RD, RT1; + eor RA, RA, RT2; + eor RB, RB, RT3; + + first_decrypt_cycle(7); + decrypt_cycle(6); + 
decrypt_cycle(5); + decrypt_cycle(4); + decrypt_cycle(3); + decrypt_cycle(2); + decrypt_cycle(1); + last_decrypt_cycle(0); + + add RY, CTXs0, #w; + pop {%r1}; /* dst */ + + /* Output whitening */ + ldm RY, {RT0, RT1, RT2, RT3}; + eor RA, RA, RT0; + eor RB, RB, RT1; + eor RC, RC, RT2; + eor RD, RD, RT3; + + str_output_le(%r1, RA, RB, RC, RD, RT0, RT1); + + pop {%r4-%r11, %ip, %lr}; + bx %lr; +.size _gcry_twofish_armv6_decrypt_block,.-_gcry_twofish_armv6_decrypt_block; + +#endif /*HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS*/ +#endif /*HAVE_ARM_ARCH_V6 && __ARMEL__*/ diff --git a/cipher/twofish.c b/cipher/twofish.c index 993ad0f..d2cabbe 100644 --- a/cipher/twofish.c +++ b/cipher/twofish.c @@ -57,6 +57,14 @@ # define USE_AMD64_ASM 1 #endif +/* USE_ARMV6_ASM indicates whether to use ARMv6 assembly code. */ +#undef USE_ARMV6_ASM +#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) +# if defined(HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS) +# define USE_ARMV6_ASM 1 +# endif +#endif + /* Prototype for the self-test function. */ static const char *selftest(void); @@ -746,7 +754,16 @@ extern void _gcry_twofish_amd64_cbc_dec(const TWOFISH_context *c, byte *out, extern void _gcry_twofish_amd64_cfb_dec(const TWOFISH_context *c, byte *out, const byte *in, byte *iv); -#else /*!USE_AMD64_ASM*/ +#elif defined(USE_ARMV6_ASM) + +/* Assembly implementations of Twofish. */ +extern void _gcry_twofish_armv6_encrypt_block(const TWOFISH_context *c, + byte *out, const byte *in); + +extern void _gcry_twofish_armv6_decrypt_block(const TWOFISH_context *c, + byte *out, const byte *in); + +#else /*!USE_AMD64_ASM && !USE_ARMV6_ASM*/ /* Macros to compute the g() function in the encryption and decryption * rounds.
G1 is the straight g() function; G2 includes the 8-bit @@ -812,21 +829,25 @@ extern void _gcry_twofish_amd64_cfb_dec(const TWOFISH_context *c, byte *out, #ifdef USE_AMD64_ASM -static void -do_twofish_encrypt (const TWOFISH_context *ctx, byte *out, const byte *in) +static unsigned int +twofish_encrypt (void *context, byte *out, const byte *in) { + TWOFISH_context *ctx = context; _gcry_twofish_amd64_encrypt_block(ctx, out, in); + return /*burn_stack*/ (4*sizeof (void*)); } +#elif defined(USE_ARMV6_ASM) + static unsigned int twofish_encrypt (void *context, byte *out, const byte *in) { TWOFISH_context *ctx = context; - _gcry_twofish_amd64_encrypt_block(ctx, out, in); + _gcry_twofish_armv6_encrypt_block(ctx, out, in); return /*burn_stack*/ (4*sizeof (void*)); } -#else /*!USE_AMD64_ASM*/ +#else /*!USE_AMD64_ASM && !USE_ARMV6_ASM*/ static void do_twofish_encrypt (const TWOFISH_context *ctx, byte *out, const byte *in) @@ -868,28 +889,32 @@ twofish_encrypt (void *context, byte *out, const byte *in) return /*burn_stack*/ (24+3*sizeof (void*)); } -#endif /*!USE_AMD64_ASM*/ +#endif /*!USE_AMD64_ASM && !USE_ARMV6_ASM*/ /* Decrypt one block. in and out may be the same. 
*/ #ifdef USE_AMD64_ASM -static void -do_twofish_decrypt (const TWOFISH_context *ctx, byte *out, const byte *in) +static unsigned int +twofish_decrypt (void *context, byte *out, const byte *in) { + TWOFISH_context *ctx = context; _gcry_twofish_amd64_decrypt_block(ctx, out, in); + return /*burn_stack*/ (4*sizeof (void*)); } +#elif defined(USE_ARMV6_ASM) + static unsigned int twofish_decrypt (void *context, byte *out, const byte *in) { TWOFISH_context *ctx = context; - _gcry_twofish_amd64_decrypt_block(ctx, out, in); + _gcry_twofish_armv6_decrypt_block(ctx, out, in); return /*burn_stack*/ (4*sizeof (void*)); } -#else /*!USE_AMD64_ASM*/ +#else /*!USE_AMD64_ASM && !USE_ARMV6_ASM*/ static void do_twofish_decrypt (const TWOFISH_context *ctx, byte *out, const byte *in) @@ -932,7 +957,7 @@ twofish_decrypt (void *context, byte *out, const byte *in) return /*burn_stack*/ (24+3*sizeof (void*)); } -#endif /*!USE_AMD64_ASM*/ +#endif /*!USE_AMD64_ASM && !USE_ARMV6_ASM*/ @@ -947,14 +972,11 @@ _gcry_twofish_ctr_enc(void *context, unsigned char *ctr, void *outbuf_arg, unsigned char *outbuf = outbuf_arg; const unsigned char *inbuf = inbuf_arg; unsigned char tmpbuf[TWOFISH_BLOCKSIZE]; - int burn_stack_depth = 24 + 3 * sizeof (void*); + unsigned int burn, burn_stack_depth = 0; int i; #ifdef USE_AMD64_ASM { - if (nblocks >= 3 && burn_stack_depth < 8 * sizeof(void*)) - burn_stack_depth = 8 * sizeof(void*); - /* Process data in 3 block chunks. */ while (nblocks >= 3) { @@ -963,6 +985,10 @@ _gcry_twofish_ctr_enc(void *context, unsigned char *ctr, void *outbuf_arg, nblocks -= 3; outbuf += 3 * TWOFISH_BLOCKSIZE; inbuf += 3 * TWOFISH_BLOCKSIZE; + + burn = 8 * sizeof(void*); + if (burn > burn_stack_depth) + burn_stack_depth = burn; } /* Use generic code to handle smaller chunks... */ @@ -973,7 +999,10 @@ _gcry_twofish_ctr_enc(void *context, unsigned char *ctr, void *outbuf_arg, for ( ;nblocks; nblocks-- ) { /* Encrypt the counter. 
*/ - do_twofish_encrypt(ctx, tmpbuf, ctr); + burn = twofish_encrypt(ctx, tmpbuf, ctr); + if (burn > burn_stack_depth) + burn_stack_depth = burn; + /* XOR the input with the encrypted counter and store in output. */ buf_xor(outbuf, tmpbuf, inbuf, TWOFISH_BLOCKSIZE); outbuf += TWOFISH_BLOCKSIZE; @@ -1002,13 +1031,10 @@ _gcry_twofish_cbc_dec(void *context, unsigned char *iv, void *outbuf_arg, unsigned char *outbuf = outbuf_arg; const unsigned char *inbuf = inbuf_arg; unsigned char savebuf[TWOFISH_BLOCKSIZE]; - int burn_stack_depth = 24 + 3 * sizeof (void*); + unsigned int burn, burn_stack_depth = 0; #ifdef USE_AMD64_ASM { - if (nblocks >= 3 && burn_stack_depth < 9 * sizeof(void*)) - burn_stack_depth = 9 * sizeof(void*); - /* Process data in 3 block chunks. */ while (nblocks >= 3) { @@ -1017,6 +1043,10 @@ _gcry_twofish_cbc_dec(void *context, unsigned char *iv, void *outbuf_arg, nblocks -= 3; outbuf += 3 * TWOFISH_BLOCKSIZE; inbuf += 3 * TWOFISH_BLOCKSIZE; + + burn = 9 * sizeof(void*); + if (burn > burn_stack_depth) + burn_stack_depth = burn; } /* Use generic code to handle smaller chunks... */ @@ -1029,7 +1059,9 @@ _gcry_twofish_cbc_dec(void *context, unsigned char *iv, void *outbuf_arg, OUTBUF. */ memcpy(savebuf, inbuf, TWOFISH_BLOCKSIZE); - do_twofish_decrypt (ctx, outbuf, inbuf); + burn = twofish_decrypt (ctx, outbuf, inbuf); + if (burn > burn_stack_depth) + burn_stack_depth = burn; buf_xor(outbuf, outbuf, iv, TWOFISH_BLOCKSIZE); memcpy(iv, savebuf, TWOFISH_BLOCKSIZE); @@ -1051,13 +1083,10 @@ _gcry_twofish_cfb_dec(void *context, unsigned char *iv, void *outbuf_arg, TWOFISH_context *ctx = context; unsigned char *outbuf = outbuf_arg; const unsigned char *inbuf = inbuf_arg; - int burn_stack_depth = 24 + 3 * sizeof (void*); + unsigned int burn, burn_stack_depth = 0; #ifdef USE_AMD64_ASM { - if (nblocks >= 3 && burn_stack_depth < 8 * sizeof(void*)) - burn_stack_depth = 8 * sizeof(void*); - /* Process data in 3 block chunks. 
*/ while (nblocks >= 3) { @@ -1066,6 +1095,10 @@ _gcry_twofish_cfb_dec(void *context, unsigned char *iv, void *outbuf_arg, nblocks -= 3; outbuf += 3 * TWOFISH_BLOCKSIZE; inbuf += 3 * TWOFISH_BLOCKSIZE; + + burn = 8 * sizeof(void*); + if (burn > burn_stack_depth) + burn_stack_depth = burn; } /* Use generic code to handle smaller chunks... */ @@ -1074,7 +1107,10 @@ for ( ;nblocks; nblocks-- ) { - do_twofish_encrypt(ctx, iv, iv); + burn = twofish_encrypt(ctx, iv, iv); + if (burn > burn_stack_depth) + burn_stack_depth = burn; + buf_xor_n_copy(outbuf, iv, inbuf, TWOFISH_BLOCKSIZE); outbuf += TWOFISH_BLOCKSIZE; inbuf += TWOFISH_BLOCKSIZE; diff --git a/configure.ac b/configure.ac index 8fb14e2..739a650 100644 --- a/configure.ac +++ b/configure.ac @@ -1480,6 +1480,10 @@ if test "$found" = "1" ; then # Build with the assembly implementation GCRYPT_CIPHERS="$GCRYPT_CIPHERS twofish-amd64.lo" ;; + arm*-*-*) + # Build with the assembly implementation + GCRYPT_CIPHERS="$GCRYPT_CIPHERS twofish-armv6.lo" + ;; esac fi commit e67c67321ce240c93dd0fa2b21c649c0a8e233f7 Author: Jussi Kivilinna Date: Tue Oct 22 17:07:53 2013 +0300 mpi: allow building with clang on ARM * mpi/longlong.h [__arm__] (add_ssaaaa, sub_ddmmss, umul_ppmm) (count_leading_zeros): Do not cast assembly output arguments. [__arm__] (umul_ppmm): Remove the extra '%' ahead of assembly comment. [__ARM_ARCH >= 4] (umul_ppmm): Use correct inputs and outputs instead of registers.
-- Signed-off-by: Jussi Kivilinna diff --git a/mpi/longlong.h b/mpi/longlong.h index c2ab9c5..8c8260e 100644 --- a/mpi/longlong.h +++ b/mpi/longlong.h @@ -213,8 +213,8 @@ extern UDItype __udiv_qrnnd (); #define add_ssaaaa(sh, sl, ah, al, bh, bl) \ __asm__ ("adds %1, %4, %5\n" \ "adc %0, %2, %3" \ - : "=r" ((USItype)(sh)), \ - "=&r" ((USItype)(sl)) \ + : "=r" ((sh)), \ + "=&r" ((sl)) \ : "%r" ((USItype)(ah)), \ "rI" ((USItype)(bh)), \ "%r" ((USItype)(al)), \ @@ -222,15 +222,15 @@ extern UDItype __udiv_qrnnd (); #define sub_ddmmss(sh, sl, ah, al, bh, bl) \ __asm__ ("subs %1, %4, %5\n" \ "sbc %0, %2, %3" \ - : "=r" ((USItype)(sh)), \ - "=&r" ((USItype)(sl)) \ + : "=r" ((sh)), \ + "=&r" ((sl)) \ : "r" ((USItype)(ah)), \ "rI" ((USItype)(bh)), \ "r" ((USItype)(al)), \ "rI" ((USItype)(bl)) __CLOBBER_CC) #if (defined __ARM_ARCH && __ARM_ARCH <= 3) #define umul_ppmm(xh, xl, a, b) \ - __asm__ ("%@ Inlined umul_ppmm\n" \ + __asm__ ("@ Inlined umul_ppmm\n" \ "mov %|r0, %2, lsr #16 @ AAAA\n" \ "mov %|r2, %3, lsr #16 @ BBBB\n" \ "bic %|r1, %2, %|r0, lsl #16 @ aaaa\n" \ @@ -243,27 +243,26 @@ extern UDItype __udiv_qrnnd (); "addcs %|r2, %|r2, #65536\n" \ "adds %1, %|r1, %|r0, lsl #16\n" \ "adc %0, %|r2, %|r0, lsr #16" \ - : "=&r" ((USItype)(xh)), \ - "=r" ((USItype)(xl)) \ + : "=&r" ((xh)), \ + "=r" ((xl)) \ : "r" ((USItype)(a)), \ "r" ((USItype)(b)) \ : "r0", "r1", "r2" __CLOBBER_CC) #else /* __ARM_ARCH >= 4 */ #define umul_ppmm(xh, xl, a, b) \ - __asm__ ("%@ Inlined umul_ppmm\n" \ - "umull %r1, %r0, %r2, %r3" \ - : "=&r" ((USItype)(xh)), \ - "=r" ((USItype)(xl)) \ + __asm__ ("@ Inlined umul_ppmm\n" \ + "umull %1, %0, %2, %3" \ + : "=&r" ((xh)), \ + "=r" ((xl)) \ : "r" ((USItype)(a)), \ - "r" ((USItype)(b)) \ - : "r0", "r1") + "r" ((USItype)(b))) #endif /* __ARM_ARCH >= 4 */ #define UMUL_TIME 20 #define UDIV_TIME 100 #if (defined __ARM_ARCH && __ARM_ARCH >= 5) #define count_leading_zeros(count, x) \ __asm__ ("clz %0, %1" \ - : "=r" ((USItype)(count)) \ + : "=r" ((count)) \ : "r" 
((USItype)(x))) #endif /* __ARM_ARCH >= 5 */ #endif /* __arm__ */ commit c7efaa5fe0ee92e321a7b49d56752cc12eb75fe0 Author: Jussi Kivilinna Date: Tue Oct 22 17:07:53 2013 +0300 serpent-amd64: do not use GAS macros * cipher/serpent-avx2-amd64.S: Remove use of GAS macros. * cipher/serpent-sse2-amd64.S: Ditto. * configure.ac [HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS]: Do not check for GAS macros. -- This way we have better portability; for example, when compiling with clang on x86-64, the assembly implementations are now enabled and working. Signed-off-by: Jussi Kivilinna diff --git a/cipher/serpent-avx2-amd64.S b/cipher/serpent-avx2-amd64.S index c726e7b..8a76ab1 100644 --- a/cipher/serpent-avx2-amd64.S +++ b/cipher/serpent-avx2-amd64.S @@ -36,51 +36,36 @@ #define CTX %rdi /* vector registers */ -.set RA0, %ymm0 -.set RA1, %ymm1 -.set RA2, %ymm2 -.set RA3, %ymm3 -.set RA4, %ymm4 - -.set RB0, %ymm5 -.set RB1, %ymm6 -.set RB2, %ymm7 -.set RB3, %ymm8 -.set RB4, %ymm9 - -.set RNOT, %ymm10 -.set RTMP0, %ymm11 -.set RTMP1, %ymm12 -.set RTMP2, %ymm13 -.set RTMP3, %ymm14 -.set RTMP4, %ymm15 - -.set RNOTx, %xmm10 -.set RTMP0x, %xmm11 -.set RTMP1x, %xmm12 -.set RTMP2x, %xmm13 -.set RTMP3x, %xmm14 -.set RTMP4x, %xmm15 +#define RA0 %ymm0 +#define RA1 %ymm1 +#define RA2 %ymm2 +#define RA3 %ymm3 +#define RA4 %ymm4 + +#define RB0 %ymm5 +#define RB1 %ymm6 +#define RB2 %ymm7 +#define RB3 %ymm8 +#define RB4 %ymm9 + +#define RNOT %ymm10 +#define RTMP0 %ymm11 +#define RTMP1 %ymm12 +#define RTMP2 %ymm13 +#define RTMP3 %ymm14 +#define RTMP4 %ymm15 + +#define RNOTx %xmm10 +#define RTMP0x %xmm11 +#define RTMP1x %xmm12 +#define RTMP2x %xmm13 +#define RTMP3x %xmm14 +#define RTMP4x %xmm15 /********************************************************************** helper macros **********************************************************************/ -/* preprocessor macro for renaming vector registers using GAS macros */ -#define sbox_reg_rename(r0, r1, r2, r3, r4, \ - new_r0, new_r1, new_r2, new_r3, 
new_r4) \ - .set rename_reg0, new_r0; \ - .set rename_reg1, new_r1; \ - .set rename_reg2, new_r2; \ - .set rename_reg3, new_r3; \ - .set rename_reg4, new_r4; \ - \ - .set r0, rename_reg0; \ - .set r1, rename_reg1; \ - .set r2, rename_reg2; \ - .set r3, rename_reg3; \ - .set r4, rename_reg4; - /* vector 32-bit rotation to left */ #define vec_rol(reg, nleft, tmp) \ vpslld $(nleft), reg, tmp; \ @@ -128,9 +113,7 @@ vpxor r4, r2, r2; vpxor RNOT, r4, r4; \ vpor r1, r4, r4; vpxor r3, r1, r1; \ vpxor r4, r1, r1; vpor r0, r3, r3; \ - vpxor r3, r1, r1; vpxor r3, r4, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r4,r2,r0,r3); + vpxor r3, r1, r1; vpxor r3, r4, r4; #define SBOX0_INVERSE(r0, r1, r2, r3, r4) \ vpxor RNOT, r2, r2; vmovdqa r1, r4; \ @@ -143,9 +126,7 @@ vpxor r1, r2, r2; vpxor r0, r3, r3; \ vpxor r1, r3, r3; \ vpand r3, r2, r2; \ - vpxor r2, r4, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r0,r4,r1,r3,r2); + vpxor r2, r4, r4; #define SBOX1(r0, r1, r2, r3, r4) \ vpxor RNOT, r0, r0; vpxor RNOT, r2, r2; \ @@ -157,9 +138,7 @@ vpand r4, r2, r2; vpxor r1, r0, r0; \ vpand r2, r1, r1; \ vpxor r0, r1, r1; vpand r2, r0, r0; \ - vpxor r4, r0, r0; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r2,r0,r3,r1,r4); + vpxor r4, r0, r0; #define SBOX1_INVERSE(r0, r1, r2, r3, r4) \ vmovdqa r1, r4; vpxor r3, r1, r1; \ @@ -172,9 +151,7 @@ vpxor r1, r4, r4; vpor r0, r1, r1; \ vpxor r0, r1, r1; \ vpor r4, r1, r1; \ - vpxor r1, r3, r3; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r4,r0,r3,r2,r1); + vpxor r1, r3, r3; #define SBOX2(r0, r1, r2, r3, r4) \ vmovdqa r0, r4; vpand r2, r0, r0; \ @@ -184,9 +161,7 @@ vmovdqa r3, r1; vpor r4, r3, r3; \ vpxor r0, r3, r3; vpand r1, r0, r0; \ vpxor r0, r4, r4; vpxor r3, r1, r1; \ - vpxor r4, r1, r1; vpxor RNOT, r4, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r2,r3,r1,r4,r0); + vpxor r4, r1, r1; vpxor RNOT, r4, r4; #define SBOX2_INVERSE(r0, r1, r2, r3, r4) \ vpxor r3, r2, r2; vpxor r0, r3, r3; \ @@ -198,9 +173,7 @@ vpor r0, r2, r2; vpxor RNOT, r3, r3; \ vpxor r3, r2, 
r2; vpxor r3, r0, r0; \ vpand r1, r0, r0; vpxor r4, r3, r3; \ - vpxor r0, r3, r3; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r4,r2,r3,r0); + vpxor r0, r3, r3; #define SBOX3(r0, r1, r2, r3, r4) \ vmovdqa r0, r4; vpor r3, r0, r0; \ @@ -212,9 +185,7 @@ vpxor r2, r4, r4; vpor r0, r1, r1; \ vpxor r2, r1, r1; vpxor r3, r0, r0; \ vmovdqa r1, r2; vpor r3, r1, r1; \ - vpxor r0, r1, r1; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r2,r3,r4,r0); + vpxor r0, r1, r1; #define SBOX3_INVERSE(r0, r1, r2, r3, r4) \ vmovdqa r2, r4; vpxor r1, r2, r2; \ @@ -226,9 +197,7 @@ vpxor r1, r3, r3; vpxor r0, r1, r1; \ vpor r2, r1, r1; vpxor r3, r0, r0; \ vpxor r4, r1, r1; \ - vpxor r1, r0, r0; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r2,r1,r3,r0,r4); + vpxor r1, r0, r0; #define SBOX4(r0, r1, r2, r3, r4) \ vpxor r3, r1, r1; vpxor RNOT, r3, r3; \ @@ -240,9 +209,7 @@ vpxor r0, r3, r3; vpor r1, r4, r4; \ vpxor r0, r4, r4; vpor r3, r0, r0; \ vpxor r2, r0, r0; vpand r3, r2, r2; \ - vpxor RNOT, r0, r0; vpxor r2, r4, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r4,r0,r3,r2); + vpxor RNOT, r0, r0; vpxor r2, r4, r4; #define SBOX4_INVERSE(r0, r1, r2, r3, r4) \ vmovdqa r2, r4; vpand r3, r2, r2; \ @@ -255,9 +222,7 @@ vpand r0, r2, r2; vpxor r0, r3, r3; \ vpxor r4, r2, r2; \ vpor r3, r2, r2; vpxor r0, r3, r3; \ - vpxor r1, r2, r2; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r0,r3,r2,r4,r1); + vpxor r1, r2, r2; #define SBOX5(r0, r1, r2, r3, r4) \ vpxor r1, r0, r0; vpxor r3, r1, r1; \ @@ -269,9 +234,7 @@ vpxor r2, r4, r4; vpxor r0, r2, r2; \ vpand r3, r0, r0; vpxor RNOT, r2, r2; \ vpxor r4, r0, r0; vpor r3, r4, r4; \ - vpxor r4, r2, r2; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r3,r0,r2,r4); + vpxor r4, r2, r2; #define SBOX5_INVERSE(r0, r1, r2, r3, r4) \ vpxor RNOT, r1, r1; vmovdqa r3, r4; \ @@ -283,9 +246,7 @@ vpxor r3, r1, r1; vpxor r2, r4, r4; \ vpand r4, r3, r3; vpxor r1, r4, r4; \ vpxor r4, r3, r3; vpxor RNOT, r4, r4; \ - vpxor r0, r3, r3; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r4,r3,r2,r0); + vpxor r0, 
r3, r3; #define SBOX6(r0, r1, r2, r3, r4) \ vpxor RNOT, r2, r2; vmovdqa r3, r4; \ @@ -297,9 +258,7 @@ vpxor r2, r0, r0; vpxor r3, r4, r4; \ vpxor r0, r4, r4; vpxor RNOT, r3, r3; \ vpand r4, r2, r2; \ - vpxor r3, r2, r2; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r0,r1,r4,r2,r3); + vpxor r3, r2, r2; #define SBOX6_INVERSE(r0, r1, r2, r3, r4) \ vpxor r2, r0, r0; vmovdqa r2, r4; \ @@ -310,9 +269,7 @@ vpxor r1, r4, r4; vpand r3, r1, r1; \ vpxor r0, r1, r1; vpxor r3, r0, r0; \ vpor r2, r0, r0; vpxor r1, r3, r3; \ - vpxor r0, r4, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r2,r4,r3,r0); + vpxor r0, r4, r4; #define SBOX7(r0, r1, r2, r3, r4) \ vmovdqa r1, r4; vpor r2, r1, r1; \ @@ -325,9 +282,7 @@ vpxor r1, r2, r2; vpand r0, r1, r1; \ vpxor r4, r1, r1; vpxor RNOT, r2, r2; \ vpor r0, r2, r2; \ - vpxor r2, r4, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r4,r3,r1,r0,r2); + vpxor r2, r4, r4; #define SBOX7_INVERSE(r0, r1, r2, r3, r4) \ vmovdqa r2, r4; vpxor r0, r2, r2; \ @@ -339,9 +294,7 @@ vpor r2, r0, r0; vpxor r1, r4, r4; \ vpxor r3, r0, r0; vpxor r4, r3, r3; \ vpor r0, r4, r4; vpxor r2, r3, r3; \ - vpxor r2, r4, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r3,r0,r1,r4,r2); + vpxor r2, r4, r4; /* Apply SBOX number WHICH to to the block. */ #define SBOX(which, r0, r1, r2, r3, r4) \ @@ -402,49 +355,51 @@ /* Apply a Serpent round to sixteen parallel blocks. This macro increments `round'. 
*/ -#define ROUND(which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ - BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ - SBOX (which, a0, a1, a2, a3, a4); \ - BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ - SBOX (which, b0, b1, b2, b3, b4); \ - LINEAR_TRANSFORMATION (a0, a1, a2, a3, a4); \ - LINEAR_TRANSFORMATION (b0, b1, b2, b3, b4); \ - .set round, (round + 1); +#define ROUND(round, which, a0, a1, a2, a3, a4, na0, na1, na2, na3, na4, \ + b0, b1, b2, b3, b4, nb0, nb1, nb2, nb3, nb4) \ + BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ + SBOX (which, a0, a1, a2, a3, a4); \ + BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ + SBOX (which, b0, b1, b2, b3, b4); \ + LINEAR_TRANSFORMATION (na0, na1, na2, na3, na4); \ + LINEAR_TRANSFORMATION (nb0, nb1, nb2, nb3, nb4); /* Apply the last Serpent round to sixteen parallel blocks. This macro increments `round'. */ -#define ROUND_LAST(which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ - BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ - SBOX (which, a0, a1, a2, a3, a4); \ - BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ - SBOX (which, b0, b1, b2, b3, b4); \ - .set round, (round + 1); \ - BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ - BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ - .set round, (round + 1); +#define ROUND_LAST(round, which, a0, a1, a2, a3, a4, na0, na1, na2, na3, na4, \ + b0, b1, b2, b3, b4, nb0, nb1, nb2, nb3, nb4) \ + BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ + SBOX (which, a0, a1, a2, a3, a4); \ + BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ + SBOX (which, b0, b1, b2, b3, b4); \ + BLOCK_XOR_KEY (na0, na1, na2, na3, na4, ((round) + 1)); \ + BLOCK_XOR_KEY (nb0, nb1, nb2, nb3, nb4, ((round) + 1)); /* Apply an inverse Serpent round to sixteen parallel blocks. This macro increments `round'. 
*/ -#define ROUND_INVERSE(which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ +#define ROUND_INVERSE(round, which, a0, a1, a2, a3, a4, \ + na0, na1, na2, na3, na4, \ + b0, b1, b2, b3, b4, \ + nb0, nb1, nb2, nb3, nb4) \ LINEAR_TRANSFORMATION_INVERSE (a0, a1, a2, a3, a4); \ LINEAR_TRANSFORMATION_INVERSE (b0, b1, b2, b3, b4); \ SBOX_INVERSE (which, a0, a1, a2, a3, a4); \ - BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ + BLOCK_XOR_KEY (na0, na1, na2, na3, na4, round); \ SBOX_INVERSE (which, b0, b1, b2, b3, b4); \ - BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ - .set round, (round - 1); + BLOCK_XOR_KEY (nb0, nb1, nb2, nb3, nb4, round); /* Apply the first inverse Serpent round to sixteen parallel blocks. This macro increments `round'. */ -#define ROUND_FIRST_INVERSE(which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ - BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ - BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ - .set round, (round - 1); \ +#define ROUND_FIRST_INVERSE(round, which, a0, a1, a2, a3, a4, \ + na0, na1, na2, na3, na4, \ + b0, b1, b2, b3, b4, \ + nb0, nb1, nb2, nb3, nb4) \ + BLOCK_XOR_KEY (a0, a1, a2, a3, a4, ((round) + 1)); \ + BLOCK_XOR_KEY (b0, b1, b2, b3, b4, ((round) + 1)); \ SBOX_INVERSE (which, a0, a1, a2, a3, a4); \ - BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ + BLOCK_XOR_KEY (na0, na1, na2, na3, na4, round); \ SBOX_INVERSE (which, b0, b1, b2, b3, b4); \ - BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ - .set round, (round - 1); + BLOCK_XOR_KEY (nb0, nb1, nb2, nb3, nb4, round); .text @@ -456,72 +411,82 @@ __serpent_enc_blk16: * RA0, RA1, RA2, RA3, RB0, RB1, RB2, RB3: sixteen parallel * plaintext blocks * output: - * RA0, RA1, RA2, RA3, RB0, RB1, RB2, RB3: sixteen parallel + * RA4, RA1, RA2, RA0, RB4, RB1, RB2, RB0: sixteen parallel * ciphertext blocks */ - /* record input vector names for __serpent_enc_blk16 */ - .set enc_in_a0, RA0 - .set enc_in_a1, RA1 - .set enc_in_a2, RA2 - .set enc_in_a3, RA3 - .set enc_in_b0, RB0 - .set enc_in_b1, RB1 - .set enc_in_b2, 
RB2 - .set enc_in_b3, RB3 - vpcmpeqd RNOT, RNOT, RNOT; transpose_4x4(RA0, RA1, RA2, RA3, RA4, RTMP0, RTMP1); transpose_4x4(RB0, RB1, RB2, RB3, RB4, RTMP0, RTMP1); - .set round, 0 - ROUND (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, 
RB4); - ROUND (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - - ROUND_LAST (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - - transpose_4x4(RA0, RA1, RA2, RA3, RA4, RTMP0, RTMP1); - transpose_4x4(RB0, RB1, RB2, RB3, RB4, RTMP0, RTMP1); - - /* record output vector names for __serpent_enc_blk16 */ - .set enc_out_a0, RA0 - .set enc_out_a1, RA1 - .set enc_out_a2, RA2 - .set enc_out_a3, RA3 - .set enc_out_b0, RB0 - .set enc_out_b1, RB1 - .set enc_out_b2, RB2 - .set enc_out_b3, RB3 + ROUND (0, 0, RA0, RA1, RA2, RA3, RA4, RA1, RA4, RA2, RA0, RA3, + RB0, RB1, RB2, RB3, RB4, RB1, RB4, RB2, RB0, RB3); + ROUND (1, 1, RA1, RA4, RA2, RA0, RA3, RA2, RA1, RA0, RA4, RA3, + RB1, RB4, RB2, RB0, RB3, RB2, RB1, RB0, RB4, RB3); + ROUND (2, 2, RA2, RA1, RA0, RA4, RA3, RA0, RA4, RA1, RA3, RA2, + RB2, RB1, RB0, RB4, RB3, RB0, RB4, RB1, RB3, RB2); + ROUND (3, 3, RA0, RA4, RA1, RA3, RA2, RA4, RA1, RA3, RA2, RA0, + RB0, RB4, RB1, RB3, RB2, RB4, RB1, RB3, RB2, RB0); + ROUND (4, 4, RA4, RA1, RA3, RA2, RA0, RA1, RA0, RA4, RA2, RA3, + RB4, RB1, RB3, RB2, RB0, RB1, RB0, RB4, RB2, RB3); + ROUND (5, 5, RA1, RA0, RA4, RA2, RA3, RA0, RA2, RA1, RA4, RA3, + RB1, RB0, RB4, RB2, RB3, RB0, RB2, RB1, RB4, RB3); + ROUND (6, 6, RA0, RA2, RA1, RA4, RA3, RA0, RA2, RA3, RA1, RA4, + RB0, RB2, RB1, RB4, RB3, RB0, RB2, RB3, RB1, RB4); + ROUND (7, 7, RA0, RA2, RA3, RA1, RA4, RA4, RA1, RA2, RA0, RA3, + RB0, RB2, RB3, RB1, RB4, RB4, RB1, RB2, RB0, RB3); + ROUND (8, 0, RA4, RA1, RA2, RA0, RA3, RA1, RA3, RA2, RA4, RA0, + RB4, RB1, RB2, RB0, RB3, RB1, RB3, RB2, RB4, RB0); + ROUND (9, 1, RA1, RA3, RA2, RA4, RA0, RA2, RA1, RA4, RA3, RA0, + RB1, RB3, RB2, RB4, RB0, RB2, RB1, RB4, RB3, RB0); + ROUND (10, 2, RA2, RA1, RA4, RA3, RA0, RA4, RA3, RA1, RA0, RA2, + RB2, RB1, RB4, RB3, RB0, RB4, RB3, RB1, RB0, RB2); + ROUND (11, 3, RA4, RA3, RA1, RA0, RA2, RA3, RA1, RA0, RA2, RA4, + RB4, RB3, RB1, RB0, RB2, RB3, RB1, RB0, RB2, RB4); + ROUND (12, 4, RA3, 
RA1, RA0, RA2, RA4, RA1, RA4, RA3, RA2, RA0, + RB3, RB1, RB0, RB2, RB4, RB1, RB4, RB3, RB2, RB0); + ROUND (13, 5, RA1, RA4, RA3, RA2, RA0, RA4, RA2, RA1, RA3, RA0, + RB1, RB4, RB3, RB2, RB0, RB4, RB2, RB1, RB3, RB0); + ROUND (14, 6, RA4, RA2, RA1, RA3, RA0, RA4, RA2, RA0, RA1, RA3, + RB4, RB2, RB1, RB3, RB0, RB4, RB2, RB0, RB1, RB3); + ROUND (15, 7, RA4, RA2, RA0, RA1, RA3, RA3, RA1, RA2, RA4, RA0, + RB4, RB2, RB0, RB1, RB3, RB3, RB1, RB2, RB4, RB0); + ROUND (16, 0, RA3, RA1, RA2, RA4, RA0, RA1, RA0, RA2, RA3, RA4, + RB3, RB1, RB2, RB4, RB0, RB1, RB0, RB2, RB3, RB4); + ROUND (17, 1, RA1, RA0, RA2, RA3, RA4, RA2, RA1, RA3, RA0, RA4, + RB1, RB0, RB2, RB3, RB4, RB2, RB1, RB3, RB0, RB4); + ROUND (18, 2, RA2, RA1, RA3, RA0, RA4, RA3, RA0, RA1, RA4, RA2, + RB2, RB1, RB3, RB0, RB4, RB3, RB0, RB1, RB4, RB2); + ROUND (19, 3, RA3, RA0, RA1, RA4, RA2, RA0, RA1, RA4, RA2, RA3, + RB3, RB0, RB1, RB4, RB2, RB0, RB1, RB4, RB2, RB3); + ROUND (20, 4, RA0, RA1, RA4, RA2, RA3, RA1, RA3, RA0, RA2, RA4, + RB0, RB1, RB4, RB2, RB3, RB1, RB3, RB0, RB2, RB4); + ROUND (21, 5, RA1, RA3, RA0, RA2, RA4, RA3, RA2, RA1, RA0, RA4, + RB1, RB3, RB0, RB2, RB4, RB3, RB2, RB1, RB0, RB4); + ROUND (22, 6, RA3, RA2, RA1, RA0, RA4, RA3, RA2, RA4, RA1, RA0, + RB3, RB2, RB1, RB0, RB4, RB3, RB2, RB4, RB1, RB0); + ROUND (23, 7, RA3, RA2, RA4, RA1, RA0, RA0, RA1, RA2, RA3, RA4, + RB3, RB2, RB4, RB1, RB0, RB0, RB1, RB2, RB3, RB4); + ROUND (24, 0, RA0, RA1, RA2, RA3, RA4, RA1, RA4, RA2, RA0, RA3, + RB0, RB1, RB2, RB3, RB4, RB1, RB4, RB2, RB0, RB3); + ROUND (25, 1, RA1, RA4, RA2, RA0, RA3, RA2, RA1, RA0, RA4, RA3, + RB1, RB4, RB2, RB0, RB3, RB2, RB1, RB0, RB4, RB3); + ROUND (26, 2, RA2, RA1, RA0, RA4, RA3, RA0, RA4, RA1, RA3, RA2, + RB2, RB1, RB0, RB4, RB3, RB0, RB4, RB1, RB3, RB2); + ROUND (27, 3, RA0, RA4, RA1, RA3, RA2, RA4, RA1, RA3, RA2, RA0, + RB0, RB4, RB1, RB3, RB2, RB4, RB1, RB3, RB2, RB0); + ROUND (28, 4, RA4, RA1, RA3, RA2, RA0, RA1, RA0, RA4, RA2, RA3, + RB4, RB1, RB3, RB2, RB0, RB1, RB0, RB4, RB2, 
RB3); + ROUND (29, 5, RA1, RA0, RA4, RA2, RA3, RA0, RA2, RA1, RA4, RA3, + RB1, RB0, RB4, RB2, RB3, RB0, RB2, RB1, RB4, RB3); + ROUND (30, 6, RA0, RA2, RA1, RA4, RA3, RA0, RA2, RA3, RA1, RA4, + RB0, RB2, RB1, RB4, RB3, RB0, RB2, RB3, RB1, RB4); + ROUND_LAST (31, 7, RA0, RA2, RA3, RA1, RA4, RA4, RA1, RA2, RA0, RA3, + RB0, RB2, RB3, RB1, RB4, RB4, RB1, RB2, RB0, RB3); + + transpose_4x4(RA4, RA1, RA2, RA0, RA3, RTMP0, RTMP1); + transpose_4x4(RB4, RB1, RB2, RB0, RB3, RTMP0, RTMP1); ret; .size __serpent_enc_blk16,.-__serpent_enc_blk16; @@ -538,69 +503,81 @@ __serpent_dec_blk16: * plaintext blocks */ - /* record input vector names for __serpent_dec_blk16 */ - .set dec_in_a0, RA0 - .set dec_in_a1, RA1 - .set dec_in_a2, RA2 - .set dec_in_a3, RA3 - .set dec_in_b0, RB0 - .set dec_in_b1, RB1 - .set dec_in_b2, RB2 - .set dec_in_b3, RB3 - vpcmpeqd RNOT, RNOT, RNOT; transpose_4x4(RA0, RA1, RA2, RA3, RA4, RTMP0, RTMP1); transpose_4x4(RB0, RB1, RB2, RB3, RB4, RTMP0, RTMP1); - .set round, 32 - ROUND_FIRST_INVERSE (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - - ROUND_INVERSE (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - 
ROUND_INVERSE (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); + ROUND_FIRST_INVERSE (31, 7, RA0, RA1, RA2, RA3, RA4, + RA3, RA0, RA1, RA4, RA2, + RB0, RB1, RB2, RB3, RB4, + RB3, RB0, RB1, RB4, RB2); + ROUND_INVERSE (30, 6, RA3, RA0, RA1, RA4, RA2, RA0, RA1, RA2, RA4, RA3, + RB3, RB0, RB1, RB4, RB2, RB0, RB1, RB2, RB4, RB3); + ROUND_INVERSE (29, 5, RA0, RA1, RA2, RA4, RA3, RA1, RA3, RA4, RA2, RA0, + RB0, RB1, RB2, RB4, RB3, RB1, RB3, RB4, RB2, RB0); + ROUND_INVERSE (28, 4, RA1, RA3, RA4, RA2, RA0, RA1, RA2, RA4, RA0, RA3, + RB1, RB3, RB4, RB2, RB0, RB1, RB2, RB4, RB0, RB3); + ROUND_INVERSE (27, 3, RA1, RA2, RA4, RA0, RA3, RA4, RA2, RA0, RA1, RA3, + RB1, RB2, RB4, RB0, RB3, RB4, RB2, RB0, RB1, RB3); + ROUND_INVERSE (26, 2, RA4, RA2, RA0, RA1, RA3, RA2, RA3, RA0, RA1, RA4, + 
RB4, RB2, RB0, RB1, RB3, RB2, RB3, RB0, RB1, RB4); + ROUND_INVERSE (25, 1, RA2, RA3, RA0, RA1, RA4, RA4, RA2, RA1, RA0, RA3, + RB2, RB3, RB0, RB1, RB4, RB4, RB2, RB1, RB0, RB3); + ROUND_INVERSE (24, 0, RA4, RA2, RA1, RA0, RA3, RA4, RA3, RA2, RA0, RA1, + RB4, RB2, RB1, RB0, RB3, RB4, RB3, RB2, RB0, RB1); + ROUND_INVERSE (23, 7, RA4, RA3, RA2, RA0, RA1, RA0, RA4, RA3, RA1, RA2, + RB4, RB3, RB2, RB0, RB1, RB0, RB4, RB3, RB1, RB2); + ROUND_INVERSE (22, 6, RA0, RA4, RA3, RA1, RA2, RA4, RA3, RA2, RA1, RA0, + RB0, RB4, RB3, RB1, RB2, RB4, RB3, RB2, RB1, RB0); + ROUND_INVERSE (21, 5, RA4, RA3, RA2, RA1, RA0, RA3, RA0, RA1, RA2, RA4, + RB4, RB3, RB2, RB1, RB0, RB3, RB0, RB1, RB2, RB4); + ROUND_INVERSE (20, 4, RA3, RA0, RA1, RA2, RA4, RA3, RA2, RA1, RA4, RA0, + RB3, RB0, RB1, RB2, RB4, RB3, RB2, RB1, RB4, RB0); + ROUND_INVERSE (19, 3, RA3, RA2, RA1, RA4, RA0, RA1, RA2, RA4, RA3, RA0, + RB3, RB2, RB1, RB4, RB0, RB1, RB2, RB4, RB3, RB0); + ROUND_INVERSE (18, 2, RA1, RA2, RA4, RA3, RA0, RA2, RA0, RA4, RA3, RA1, + RB1, RB2, RB4, RB3, RB0, RB2, RB0, RB4, RB3, RB1); + ROUND_INVERSE (17, 1, RA2, RA0, RA4, RA3, RA1, RA1, RA2, RA3, RA4, RA0, + RB2, RB0, RB4, RB3, RB1, RB1, RB2, RB3, RB4, RB0); + ROUND_INVERSE (16, 0, RA1, RA2, RA3, RA4, RA0, RA1, RA0, RA2, RA4, RA3, + RB1, RB2, RB3, RB4, RB0, RB1, RB0, RB2, RB4, RB3); + ROUND_INVERSE (15, 7, RA1, RA0, RA2, RA4, RA3, RA4, RA1, RA0, RA3, RA2, + RB1, RB0, RB2, RB4, RB3, RB4, RB1, RB0, RB3, RB2); + ROUND_INVERSE (14, 6, RA4, RA1, RA0, RA3, RA2, RA1, RA0, RA2, RA3, RA4, + RB4, RB1, RB0, RB3, RB2, RB1, RB0, RB2, RB3, RB4); + ROUND_INVERSE (13, 5, RA1, RA0, RA2, RA3, RA4, RA0, RA4, RA3, RA2, RA1, + RB1, RB0, RB2, RB3, RB4, RB0, RB4, RB3, RB2, RB1); + ROUND_INVERSE (12, 4, RA0, RA4, RA3, RA2, RA1, RA0, RA2, RA3, RA1, RA4, + RB0, RB4, RB3, RB2, RB1, RB0, RB2, RB3, RB1, RB4); + ROUND_INVERSE (11, 3, RA0, RA2, RA3, RA1, RA4, RA3, RA2, RA1, RA0, RA4, + RB0, RB2, RB3, RB1, RB4, RB3, RB2, RB1, RB0, RB4); + ROUND_INVERSE (10, 2, RA3, RA2, RA1, RA0, 
RA4, RA2, RA4, RA1, RA0, RA3, + RB3, RB2, RB1, RB0, RB4, RB2, RB4, RB1, RB0, RB3); + ROUND_INVERSE (9, 1, RA2, RA4, RA1, RA0, RA3, RA3, RA2, RA0, RA1, RA4, + RB2, RB4, RB1, RB0, RB3, RB3, RB2, RB0, RB1, RB4); + ROUND_INVERSE (8, 0, RA3, RA2, RA0, RA1, RA4, RA3, RA4, RA2, RA1, RA0, + RB3, RB2, RB0, RB1, RB4, RB3, RB4, RB2, RB1, RB0); + ROUND_INVERSE (7, 7, RA3, RA4, RA2, RA1, RA0, RA1, RA3, RA4, RA0, RA2, + RB3, RB4, RB2, RB1, RB0, RB1, RB3, RB4, RB0, RB2); + ROUND_INVERSE (6, 6, RA1, RA3, RA4, RA0, RA2, RA3, RA4, RA2, RA0, RA1, + RB1, RB3, RB4, RB0, RB2, RB3, RB4, RB2, RB0, RB1); + ROUND_INVERSE (5, 5, RA3, RA4, RA2, RA0, RA1, RA4, RA1, RA0, RA2, RA3, + RB3, RB4, RB2, RB0, RB1, RB4, RB1, RB0, RB2, RB3); + ROUND_INVERSE (4, 4, RA4, RA1, RA0, RA2, RA3, RA4, RA2, RA0, RA3, RA1, + RB4, RB1, RB0, RB2, RB3, RB4, RB2, RB0, RB3, RB1); + ROUND_INVERSE (3, 3, RA4, RA2, RA0, RA3, RA1, RA0, RA2, RA3, RA4, RA1, + RB4, RB2, RB0, RB3, RB1, RB0, RB2, RB3, RB4, RB1); + ROUND_INVERSE (2, 2, RA0, RA2, RA3, RA4, RA1, RA2, RA1, RA3, RA4, RA0, + RB0, RB2, RB3, RB4, RB1, RB2, RB1, RB3, RB4, RB0); + ROUND_INVERSE (1, 1, RA2, RA1, RA3, RA4, RA0, RA0, RA2, RA4, RA3, RA1, + RB2, RB1, RB3, RB4, RB0, RB0, RB2, RB4, RB3, RB1); + ROUND_INVERSE (0, 0, RA0, RA2, RA4, RA3, RA1, RA0, RA1, RA2, RA3, RA4, + RB0, RB2, RB4, RB3, RB1, RB0, RB1, RB2, RB3, RB4); transpose_4x4(RA0, RA1, RA2, RA3, RA4, RTMP0, RTMP1); transpose_4x4(RB0, RB1, RB2, RB3, RB4, RTMP0, RTMP1); - /* record output vector names for __serpent_dec_blk16 */ - .set dec_out_a0, RA0 - .set dec_out_a1, RA1 - .set dec_out_a2, RA2 - .set dec_out_a3, RA3 - .set dec_out_b0, RB0 - .set dec_out_b1, RB1 - .set dec_out_b2, RB2 - .set dec_out_b3, RB3 - ret; .size __serpent_dec_blk16,.-__serpent_dec_blk16; @@ -623,15 +600,6 @@ _gcry_serpent_avx2_ctr_enc: vzeroupper; - .set RA0, enc_in_a0 - .set RA1, enc_in_a1 - .set RA2, enc_in_a2 - .set RA3, enc_in_a3 - .set RB0, enc_in_b0 - .set RB1, enc_in_b1 - .set RB2, enc_in_b2 - .set RB3, enc_in_b3 - 
vbroadcasti128 .Lbswap128_mask RIP, RTMP3; vpcmpeqd RNOT, RNOT, RNOT; vpsrldq $8, RNOT, RNOT; /* ab: -1:0 ; cd: -1:0 */ @@ -703,32 +671,23 @@ _gcry_serpent_avx2_ctr_enc: call __serpent_enc_blk16; - .set RA0, enc_out_a0 - .set RA1, enc_out_a1 - .set RA2, enc_out_a2 - .set RA3, enc_out_a3 - .set RB0, enc_out_b0 - .set RB1, enc_out_b1 - .set RB2, enc_out_b2 - .set RB3, enc_out_b3 - - vpxor (0 * 32)(%rdx), RA0, RA0; + vpxor (0 * 32)(%rdx), RA4, RA4; vpxor (1 * 32)(%rdx), RA1, RA1; vpxor (2 * 32)(%rdx), RA2, RA2; - vpxor (3 * 32)(%rdx), RA3, RA3; - vpxor (4 * 32)(%rdx), RB0, RB0; + vpxor (3 * 32)(%rdx), RA0, RA0; + vpxor (4 * 32)(%rdx), RB4, RB4; vpxor (5 * 32)(%rdx), RB1, RB1; vpxor (6 * 32)(%rdx), RB2, RB2; - vpxor (7 * 32)(%rdx), RB3, RB3; + vpxor (7 * 32)(%rdx), RB0, RB0; - vmovdqu RA0, (0 * 32)(%rsi); + vmovdqu RA4, (0 * 32)(%rsi); vmovdqu RA1, (1 * 32)(%rsi); vmovdqu RA2, (2 * 32)(%rsi); - vmovdqu RA3, (3 * 32)(%rsi); - vmovdqu RB0, (4 * 32)(%rsi); + vmovdqu RA0, (3 * 32)(%rsi); + vmovdqu RB4, (4 * 32)(%rsi); vmovdqu RB1, (5 * 32)(%rsi); vmovdqu RB2, (6 * 32)(%rsi); - vmovdqu RB3, (7 * 32)(%rsi); + vmovdqu RB0, (7 * 32)(%rsi); vzeroall; @@ -748,15 +707,6 @@ _gcry_serpent_avx2_cbc_dec: vzeroupper; - .set RA0, dec_in_a0 - .set RA1, dec_in_a1 - .set RA2, dec_in_a2 - .set RA3, dec_in_a3 - .set RB0, dec_in_b0 - .set RB1, dec_in_b1 - .set RB2, dec_in_b2 - .set RB3, dec_in_b3 - vmovdqu (0 * 32)(%rdx), RA0; vmovdqu (1 * 32)(%rdx), RA1; vmovdqu (2 * 32)(%rdx), RA2; @@ -768,15 +718,6 @@ _gcry_serpent_avx2_cbc_dec: call __serpent_dec_blk16; - .set RA0, dec_out_a0 - .set RA1, dec_out_a1 - .set RA2, dec_out_a2 - .set RA3, dec_out_a3 - .set RB0, dec_out_b0 - .set RB1, dec_out_b1 - .set RB2, dec_out_b2 - .set RB3, dec_out_b3 - vmovdqu (%rcx), RNOTx; vinserti128 $1, (%rdx), RNOT, RNOT; vpxor RNOT, RA0, RA0; @@ -817,15 +758,6 @@ _gcry_serpent_avx2_cfb_dec: vzeroupper; - .set RA0, enc_in_a0 - .set RA1, enc_in_a1 - .set RA2, enc_in_a2 - .set RA3, enc_in_a3 - .set RB0, enc_in_b0 - 
.set RB1, enc_in_b1 - .set RB2, enc_in_b2 - .set RB3, enc_in_b3 - /* Load input */ vmovdqu (%rcx), RNOTx; vinserti128 $1, (%rdx), RNOT, RA0; @@ -843,32 +775,23 @@ _gcry_serpent_avx2_cfb_dec: call __serpent_enc_blk16; - .set RA0, enc_out_a0 - .set RA1, enc_out_a1 - .set RA2, enc_out_a2 - .set RA3, enc_out_a3 - .set RB0, enc_out_b0 - .set RB1, enc_out_b1 - .set RB2, enc_out_b2 - .set RB3, enc_out_b3 - - vpxor (0 * 32)(%rdx), RA0, RA0; + vpxor (0 * 32)(%rdx), RA4, RA4; vpxor (1 * 32)(%rdx), RA1, RA1; vpxor (2 * 32)(%rdx), RA2, RA2; - vpxor (3 * 32)(%rdx), RA3, RA3; - vpxor (4 * 32)(%rdx), RB0, RB0; + vpxor (3 * 32)(%rdx), RA0, RA0; + vpxor (4 * 32)(%rdx), RB4, RB4; vpxor (5 * 32)(%rdx), RB1, RB1; vpxor (6 * 32)(%rdx), RB2, RB2; - vpxor (7 * 32)(%rdx), RB3, RB3; + vpxor (7 * 32)(%rdx), RB0, RB0; - vmovdqu RA0, (0 * 32)(%rsi); + vmovdqu RA4, (0 * 32)(%rsi); vmovdqu RA1, (1 * 32)(%rsi); vmovdqu RA2, (2 * 32)(%rsi); - vmovdqu RA3, (3 * 32)(%rsi); - vmovdqu RB0, (4 * 32)(%rsi); + vmovdqu RA0, (3 * 32)(%rsi); + vmovdqu RB4, (4 * 32)(%rsi); vmovdqu RB1, (5 * 32)(%rsi); vmovdqu RB2, (6 * 32)(%rsi); - vmovdqu RB3, (7 * 32)(%rsi); + vmovdqu RB0, (7 * 32)(%rsi); vzeroall; diff --git a/cipher/serpent-sse2-amd64.S b/cipher/serpent-sse2-amd64.S index a5cf353..516126b 100644 --- a/cipher/serpent-sse2-amd64.S +++ b/cipher/serpent-sse2-amd64.S @@ -35,42 +35,27 @@ #define CTX %rdi /* vector registers */ -.set RA0, %xmm0 -.set RA1, %xmm1 -.set RA2, %xmm2 -.set RA3, %xmm3 -.set RA4, %xmm4 - -.set RB0, %xmm5 -.set RB1, %xmm6 -.set RB2, %xmm7 -.set RB3, %xmm8 -.set RB4, %xmm9 - -.set RNOT, %xmm10 -.set RTMP0, %xmm11 -.set RTMP1, %xmm12 -.set RTMP2, %xmm13 +#define RA0 %xmm0 +#define RA1 %xmm1 +#define RA2 %xmm2 +#define RA3 %xmm3 +#define RA4 %xmm4 + +#define RB0 %xmm5 +#define RB1 %xmm6 +#define RB2 %xmm7 +#define RB3 %xmm8 +#define RB4 %xmm9 + +#define RNOT %xmm10 +#define RTMP0 %xmm11 +#define RTMP1 %xmm12 +#define RTMP2 %xmm13 
/********************************************************************** helper macros **********************************************************************/ -/* preprocessor macro for renaming vector registers using GAS macros */ -#define sbox_reg_rename(r0, r1, r2, r3, r4, \ - new_r0, new_r1, new_r2, new_r3, new_r4) \ - .set rename_reg0, new_r0; \ - .set rename_reg1, new_r1; \ - .set rename_reg2, new_r2; \ - .set rename_reg3, new_r3; \ - .set rename_reg4, new_r4; \ - \ - .set r0, rename_reg0; \ - .set r1, rename_reg1; \ - .set r2, rename_reg2; \ - .set r3, rename_reg3; \ - .set r4, rename_reg4; - /* vector 32-bit rotation to left */ #define vec_rol(reg, nleft, tmp) \ movdqa reg, tmp; \ @@ -147,9 +132,7 @@ pxor r4, r2; pxor RNOT, r4; \ por r1, r4; pxor r3, r1; \ pxor r4, r1; por r0, r3; \ - pxor r3, r1; pxor r3, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r4,r2,r0,r3); + pxor r3, r1; pxor r3, r4; #define SBOX0_INVERSE(r0, r1, r2, r3, r4) \ pxor RNOT, r2; movdqa r1, r4; \ @@ -162,9 +145,7 @@ pxor r1, r2; pxor r0, r3; \ pxor r1, r3; \ pand r3, r2; \ - pxor r2, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r0,r4,r1,r3,r2); + pxor r2, r4; #define SBOX1(r0, r1, r2, r3, r4) \ pxor RNOT, r0; pxor RNOT, r2; \ @@ -176,9 +157,7 @@ pand r4, r2; pxor r1, r0; \ pand r2, r1; \ pxor r0, r1; pand r2, r0; \ - pxor r4, r0; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r2,r0,r3,r1,r4); + pxor r4, r0; #define SBOX1_INVERSE(r0, r1, r2, r3, r4) \ movdqa r1, r4; pxor r3, r1; \ @@ -191,9 +170,7 @@ pxor r1, r4; por r0, r1; \ pxor r0, r1; \ por r4, r1; \ - pxor r1, r3; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r4,r0,r3,r2,r1); + pxor r1, r3; #define SBOX2(r0, r1, r2, r3, r4) \ movdqa r0, r4; pand r2, r0; \ @@ -203,9 +180,7 @@ movdqa r3, r1; por r4, r3; \ pxor r0, r3; pand r1, r0; \ pxor r0, r4; pxor r3, r1; \ - pxor r4, r1; pxor RNOT, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r2,r3,r1,r4,r0); + pxor r4, r1; pxor RNOT, r4; #define SBOX2_INVERSE(r0, r1, r2, r3, r4) \ pxor r3, r2; pxor r0, r3; \ 
@@ -217,9 +192,7 @@ por r0, r2; pxor RNOT, r3; \ pxor r3, r2; pxor r3, r0; \ pand r1, r0; pxor r4, r3; \ - pxor r0, r3; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r4,r2,r3,r0); + pxor r0, r3; #define SBOX3(r0, r1, r2, r3, r4) \ movdqa r0, r4; por r3, r0; \ @@ -231,9 +204,7 @@ pxor r2, r4; por r0, r1; \ pxor r2, r1; pxor r3, r0; \ movdqa r1, r2; por r3, r1; \ - pxor r0, r1; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r2,r3,r4,r0); + pxor r0, r1; #define SBOX3_INVERSE(r0, r1, r2, r3, r4) \ movdqa r2, r4; pxor r1, r2; \ @@ -245,9 +216,7 @@ pxor r1, r3; pxor r0, r1; \ por r2, r1; pxor r3, r0; \ pxor r4, r1; \ - pxor r1, r0; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r2,r1,r3,r0,r4); + pxor r1, r0; #define SBOX4(r0, r1, r2, r3, r4) \ pxor r3, r1; pxor RNOT, r3; \ @@ -259,9 +228,7 @@ pxor r0, r3; por r1, r4; \ pxor r0, r4; por r3, r0; \ pxor r2, r0; pand r3, r2; \ - pxor RNOT, r0; pxor r2, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r4,r0,r3,r2); + pxor RNOT, r0; pxor r2, r4; #define SBOX4_INVERSE(r0, r1, r2, r3, r4) \ movdqa r2, r4; pand r3, r2; \ @@ -274,9 +241,7 @@ pand r0, r2; pxor r0, r3; \ pxor r4, r2; \ por r3, r2; pxor r0, r3; \ - pxor r1, r2; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r0,r3,r2,r4,r1); + pxor r1, r2; #define SBOX5(r0, r1, r2, r3, r4) \ pxor r1, r0; pxor r3, r1; \ @@ -288,9 +253,7 @@ pxor r2, r4; pxor r0, r2; \ pand r3, r0; pxor RNOT, r2; \ pxor r4, r0; por r3, r4; \ - pxor r4, r2; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r3,r0,r2,r4); + pxor r4, r2; #define SBOX5_INVERSE(r0, r1, r2, r3, r4) \ pxor RNOT, r1; movdqa r3, r4; \ @@ -302,9 +265,7 @@ pxor r3, r1; pxor r2, r4; \ pand r4, r3; pxor r1, r4; \ pxor r4, r3; pxor RNOT, r4; \ - pxor r0, r3; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r4,r3,r2,r0); + pxor r0, r3; #define SBOX6(r0, r1, r2, r3, r4) \ pxor RNOT, r2; movdqa r3, r4; \ @@ -316,9 +277,7 @@ pxor r2, r0; pxor r3, r4; \ pxor r0, r4; pxor RNOT, r3; \ pand r4, r2; \ - pxor r3, r2; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r0,r1,r4,r2,r3); + pxor 
r3, r2; #define SBOX6_INVERSE(r0, r1, r2, r3, r4) \ pxor r2, r0; movdqa r2, r4; \ @@ -329,9 +288,7 @@ pxor r1, r4; pand r3, r1; \ pxor r0, r1; pxor r3, r0; \ por r2, r0; pxor r1, r3; \ - pxor r0, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r1,r2,r4,r3,r0); + pxor r0, r4; #define SBOX7(r0, r1, r2, r3, r4) \ movdqa r1, r4; por r2, r1; \ @@ -344,9 +301,7 @@ pxor r1, r2; pand r0, r1; \ pxor r4, r1; pxor RNOT, r2; \ por r0, r2; \ - pxor r2, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r4,r3,r1,r0,r2); + pxor r2, r4; #define SBOX7_INVERSE(r0, r1, r2, r3, r4) \ movdqa r2, r4; pxor r0, r2; \ @@ -358,9 +313,7 @@ por r2, r0; pxor r1, r4; \ pxor r3, r0; pxor r4, r3; \ por r0, r4; pxor r2, r3; \ - pxor r2, r4; \ - \ - sbox_reg_rename(r0,r1,r2,r3,r4, r3,r0,r1,r4,r2); + pxor r2, r4; /* Apply SBOX number WHICH to to the block. */ #define SBOX(which, r0, r1, r2, r3, r4) \ @@ -425,49 +378,51 @@ /* Apply a Serpent round to eight parallel blocks. This macro increments `round'. */ -#define ROUND(which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ - BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ - SBOX (which, a0, a1, a2, a3, a4); \ - BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ - SBOX (which, b0, b1, b2, b3, b4); \ - LINEAR_TRANSFORMATION (a0, a1, a2, a3, a4); \ - LINEAR_TRANSFORMATION (b0, b1, b2, b3, b4); \ - .set round, (round + 1); +#define ROUND(round, which, a0, a1, a2, a3, a4, na0, na1, na2, na3, na4, \ + b0, b1, b2, b3, b4, nb0, nb1, nb2, nb3, nb4) \ + BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ + SBOX (which, a0, a1, a2, a3, a4); \ + BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ + SBOX (which, b0, b1, b2, b3, b4); \ + LINEAR_TRANSFORMATION (na0, na1, na2, na3, na4); \ + LINEAR_TRANSFORMATION (nb0, nb1, nb2, nb3, nb4); /* Apply the last Serpent round to eight parallel blocks. This macro increments `round'. 
*/ -#define ROUND_LAST(which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ - BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ - SBOX (which, a0, a1, a2, a3, a4); \ - BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ - SBOX (which, b0, b1, b2, b3, b4); \ - .set round, (round + 1); \ - BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ - BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ - .set round, (round + 1); +#define ROUND_LAST(round, which, a0, a1, a2, a3, a4, na0, na1, na2, na3, na4, \ + b0, b1, b2, b3, b4, nb0, nb1, nb2, nb3, nb4) \ + BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ + SBOX (which, a0, a1, a2, a3, a4); \ + BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ + SBOX (which, b0, b1, b2, b3, b4); \ + BLOCK_XOR_KEY (na0, na1, na2, na3, na4, ((round) + 1)); \ + BLOCK_XOR_KEY (nb0, nb1, nb2, nb3, nb4, ((round) + 1)); /* Apply an inverse Serpent round to eight parallel blocks. This macro increments `round'. */ -#define ROUND_INVERSE(which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ +#define ROUND_INVERSE(round, which, a0, a1, a2, a3, a4, \ + na0, na1, na2, na3, na4, \ + b0, b1, b2, b3, b4, \ + nb0, nb1, nb2, nb3, nb4) \ LINEAR_TRANSFORMATION_INVERSE (a0, a1, a2, a3, a4); \ LINEAR_TRANSFORMATION_INVERSE (b0, b1, b2, b3, b4); \ SBOX_INVERSE (which, a0, a1, a2, a3, a4); \ - BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ + BLOCK_XOR_KEY (na0, na1, na2, na3, na4, round); \ SBOX_INVERSE (which, b0, b1, b2, b3, b4); \ - BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ - .set round, (round - 1); + BLOCK_XOR_KEY (nb0, nb1, nb2, nb3, nb4, round); /* Apply the first inverse Serpent round to eight parallel blocks. This macro increments `round'. 
*/ -#define ROUND_FIRST_INVERSE(which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ - BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ - BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ - .set round, (round - 1); \ +#define ROUND_FIRST_INVERSE(round, which, a0, a1, a2, a3, a4, \ + na0, na1, na2, na3, na4, \ + b0, b1, b2, b3, b4, \ + nb0, nb1, nb2, nb3, nb4) \ + BLOCK_XOR_KEY (a0, a1, a2, a3, a4, ((round) + 1)); \ + BLOCK_XOR_KEY (b0, b1, b2, b3, b4, ((round) + 1)); \ SBOX_INVERSE (which, a0, a1, a2, a3, a4); \ - BLOCK_XOR_KEY (a0, a1, a2, a3, a4, round); \ + BLOCK_XOR_KEY (na0, na1, na2, na3, na4, round); \ SBOX_INVERSE (which, b0, b1, b2, b3, b4); \ - BLOCK_XOR_KEY (b0, b1, b2, b3, b4, round); \ - .set round, (round - 1); + BLOCK_XOR_KEY (nb0, nb1, nb2, nb3, nb4, round); .text @@ -479,72 +434,82 @@ __serpent_enc_blk8: * RA0, RA1, RA2, RA3, RB0, RB1, RB2, RB3: eight parallel plaintext * blocks * output: - * RA0, RA1, RA2, RA3, RB0, RB1, RB2, RB3: eight parallel + * RA4, RA1, RA2, RA0, RB4, RB1, RB2, RB0: eight parallel * ciphertext blocks */ - /* record input vector names for __serpent_enc_blk8 */ - .set enc_in_a0, RA0 - .set enc_in_a1, RA1 - .set enc_in_a2, RA2 - .set enc_in_a3, RA3 - .set enc_in_b0, RB0 - .set enc_in_b1, RB1 - .set enc_in_b2, RB2 - .set enc_in_b3, RB3 - pcmpeqd RNOT, RNOT; transpose_4x4(RA0, RA1, RA2, RA3, RA4, RTMP0, RTMP1); transpose_4x4(RB0, RB1, RB2, RB3, RB4, RTMP0, RTMP1); - .set round, 0 - ROUND (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - 
ROUND (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - - ROUND_LAST (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - - transpose_4x4(RA0, RA1, RA2, RA3, RA4, RTMP0, RTMP1); - transpose_4x4(RB0, RB1, RB2, RB3, RB4, RTMP0, RTMP1); - - /* record output vector names for __serpent_enc_blk8 */ - .set enc_out_a0, RA0 - .set enc_out_a1, RA1 - .set enc_out_a2, RA2 - .set enc_out_a3, RA3 - .set enc_out_b0, RB0 - .set enc_out_b1, RB1 - .set enc_out_b2, RB2 - .set enc_out_b3, RB3 + ROUND (0, 0, RA0, RA1, RA2, RA3, RA4, RA1, RA4, RA2, RA0, RA3, + RB0, RB1, RB2, RB3, RB4, RB1, RB4, RB2, RB0, RB3); + ROUND (1, 1, RA1, RA4, RA2, RA0, RA3, RA2, RA1, RA0, RA4, RA3, + 
RB1, RB4, RB2, RB0, RB3, RB2, RB1, RB0, RB4, RB3); + ROUND (2, 2, RA2, RA1, RA0, RA4, RA3, RA0, RA4, RA1, RA3, RA2, + RB2, RB1, RB0, RB4, RB3, RB0, RB4, RB1, RB3, RB2); + ROUND (3, 3, RA0, RA4, RA1, RA3, RA2, RA4, RA1, RA3, RA2, RA0, + RB0, RB4, RB1, RB3, RB2, RB4, RB1, RB3, RB2, RB0); + ROUND (4, 4, RA4, RA1, RA3, RA2, RA0, RA1, RA0, RA4, RA2, RA3, + RB4, RB1, RB3, RB2, RB0, RB1, RB0, RB4, RB2, RB3); + ROUND (5, 5, RA1, RA0, RA4, RA2, RA3, RA0, RA2, RA1, RA4, RA3, + RB1, RB0, RB4, RB2, RB3, RB0, RB2, RB1, RB4, RB3); + ROUND (6, 6, RA0, RA2, RA1, RA4, RA3, RA0, RA2, RA3, RA1, RA4, + RB0, RB2, RB1, RB4, RB3, RB0, RB2, RB3, RB1, RB4); + ROUND (7, 7, RA0, RA2, RA3, RA1, RA4, RA4, RA1, RA2, RA0, RA3, + RB0, RB2, RB3, RB1, RB4, RB4, RB1, RB2, RB0, RB3); + ROUND (8, 0, RA4, RA1, RA2, RA0, RA3, RA1, RA3, RA2, RA4, RA0, + RB4, RB1, RB2, RB0, RB3, RB1, RB3, RB2, RB4, RB0); + ROUND (9, 1, RA1, RA3, RA2, RA4, RA0, RA2, RA1, RA4, RA3, RA0, + RB1, RB3, RB2, RB4, RB0, RB2, RB1, RB4, RB3, RB0); + ROUND (10, 2, RA2, RA1, RA4, RA3, RA0, RA4, RA3, RA1, RA0, RA2, + RB2, RB1, RB4, RB3, RB0, RB4, RB3, RB1, RB0, RB2); + ROUND (11, 3, RA4, RA3, RA1, RA0, RA2, RA3, RA1, RA0, RA2, RA4, + RB4, RB3, RB1, RB0, RB2, RB3, RB1, RB0, RB2, RB4); + ROUND (12, 4, RA3, RA1, RA0, RA2, RA4, RA1, RA4, RA3, RA2, RA0, + RB3, RB1, RB0, RB2, RB4, RB1, RB4, RB3, RB2, RB0); + ROUND (13, 5, RA1, RA4, RA3, RA2, RA0, RA4, RA2, RA1, RA3, RA0, + RB1, RB4, RB3, RB2, RB0, RB4, RB2, RB1, RB3, RB0); + ROUND (14, 6, RA4, RA2, RA1, RA3, RA0, RA4, RA2, RA0, RA1, RA3, + RB4, RB2, RB1, RB3, RB0, RB4, RB2, RB0, RB1, RB3); + ROUND (15, 7, RA4, RA2, RA0, RA1, RA3, RA3, RA1, RA2, RA4, RA0, + RB4, RB2, RB0, RB1, RB3, RB3, RB1, RB2, RB4, RB0); + ROUND (16, 0, RA3, RA1, RA2, RA4, RA0, RA1, RA0, RA2, RA3, RA4, + RB3, RB1, RB2, RB4, RB0, RB1, RB0, RB2, RB3, RB4); + ROUND (17, 1, RA1, RA0, RA2, RA3, RA4, RA2, RA1, RA3, RA0, RA4, + RB1, RB0, RB2, RB3, RB4, RB2, RB1, RB3, RB0, RB4); + ROUND (18, 2, RA2, RA1, RA3, RA0, RA4, RA3, RA0, 
RA1, RA4, RA2, + RB2, RB1, RB3, RB0, RB4, RB3, RB0, RB1, RB4, RB2); + ROUND (19, 3, RA3, RA0, RA1, RA4, RA2, RA0, RA1, RA4, RA2, RA3, + RB3, RB0, RB1, RB4, RB2, RB0, RB1, RB4, RB2, RB3); + ROUND (20, 4, RA0, RA1, RA4, RA2, RA3, RA1, RA3, RA0, RA2, RA4, + RB0, RB1, RB4, RB2, RB3, RB1, RB3, RB0, RB2, RB4); + ROUND (21, 5, RA1, RA3, RA0, RA2, RA4, RA3, RA2, RA1, RA0, RA4, + RB1, RB3, RB0, RB2, RB4, RB3, RB2, RB1, RB0, RB4); + ROUND (22, 6, RA3, RA2, RA1, RA0, RA4, RA3, RA2, RA4, RA1, RA0, + RB3, RB2, RB1, RB0, RB4, RB3, RB2, RB4, RB1, RB0); + ROUND (23, 7, RA3, RA2, RA4, RA1, RA0, RA0, RA1, RA2, RA3, RA4, + RB3, RB2, RB4, RB1, RB0, RB0, RB1, RB2, RB3, RB4); + ROUND (24, 0, RA0, RA1, RA2, RA3, RA4, RA1, RA4, RA2, RA0, RA3, + RB0, RB1, RB2, RB3, RB4, RB1, RB4, RB2, RB0, RB3); + ROUND (25, 1, RA1, RA4, RA2, RA0, RA3, RA2, RA1, RA0, RA4, RA3, + RB1, RB4, RB2, RB0, RB3, RB2, RB1, RB0, RB4, RB3); + ROUND (26, 2, RA2, RA1, RA0, RA4, RA3, RA0, RA4, RA1, RA3, RA2, + RB2, RB1, RB0, RB4, RB3, RB0, RB4, RB1, RB3, RB2); + ROUND (27, 3, RA0, RA4, RA1, RA3, RA2, RA4, RA1, RA3, RA2, RA0, + RB0, RB4, RB1, RB3, RB2, RB4, RB1, RB3, RB2, RB0); + ROUND (28, 4, RA4, RA1, RA3, RA2, RA0, RA1, RA0, RA4, RA2, RA3, + RB4, RB1, RB3, RB2, RB0, RB1, RB0, RB4, RB2, RB3); + ROUND (29, 5, RA1, RA0, RA4, RA2, RA3, RA0, RA2, RA1, RA4, RA3, + RB1, RB0, RB4, RB2, RB3, RB0, RB2, RB1, RB4, RB3); + ROUND (30, 6, RA0, RA2, RA1, RA4, RA3, RA0, RA2, RA3, RA1, RA4, + RB0, RB2, RB1, RB4, RB3, RB0, RB2, RB3, RB1, RB4); + ROUND_LAST (31, 7, RA0, RA2, RA3, RA1, RA4, RA4, RA1, RA2, RA0, RA3, + RB0, RB2, RB3, RB1, RB4, RB4, RB1, RB2, RB0, RB3); + + transpose_4x4(RA4, RA1, RA2, RA0, RA3, RTMP0, RTMP1); + transpose_4x4(RB4, RB1, RB2, RB0, RB3, RTMP0, RTMP1); ret; .size __serpent_enc_blk8,.-__serpent_enc_blk8; @@ -561,69 +526,81 @@ __serpent_dec_blk8: * blocks */ - /* record input vector names for __serpent_dec_blk8 */ - .set dec_in_a0, RA0 - .set dec_in_a1, RA1 - .set dec_in_a2, RA2 - .set dec_in_a3, RA3 - .set 
dec_in_b0, RB0 - .set dec_in_b1, RB1 - .set dec_in_b2, RB2 - .set dec_in_b3, RB3 - pcmpeqd RNOT, RNOT; transpose_4x4(RA0, RA1, RA2, RA3, RA4, RTMP0, RTMP1); transpose_4x4(RB0, RB1, RB2, RB3, RB4, RTMP0, RTMP1); - .set round, 32 - ROUND_FIRST_INVERSE (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - - ROUND_INVERSE (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (7, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, 
RB3, RB4); - ROUND_INVERSE (6, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (5, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (4, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (3, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (2, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (1, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); - ROUND_INVERSE (0, RA0, RA1, RA2, RA3, RA4, RB0, RB1, RB2, RB3, RB4); + ROUND_FIRST_INVERSE (31, 7, RA0, RA1, RA2, RA3, RA4, + RA3, RA0, RA1, RA4, RA2, + RB0, RB1, RB2, RB3, RB4, + RB3, RB0, RB1, RB4, RB2); + ROUND_INVERSE (30, 6, RA3, RA0, RA1, RA4, RA2, RA0, RA1, RA2, RA4, RA3, + RB3, RB0, RB1, RB4, RB2, RB0, RB1, RB2, RB4, RB3); + ROUND_INVERSE (29, 5, RA0, RA1, RA2, RA4, RA3, RA1, RA3, RA4, RA2, RA0, + RB0, RB1, RB2, RB4, RB3, RB1, RB3, RB4, RB2, RB0); + ROUND_INVERSE (28, 4, RA1, RA3, RA4, RA2, RA0, RA1, RA2, RA4, RA0, RA3, + RB1, RB3, RB4, RB2, RB0, RB1, RB2, RB4, RB0, RB3); + ROUND_INVERSE (27, 3, RA1, RA2, RA4, RA0, RA3, RA4, RA2, RA0, RA1, RA3, + RB1, RB2, RB4, RB0, RB3, RB4, RB2, RB0, RB1, RB3); + ROUND_INVERSE (26, 2, RA4, RA2, RA0, RA1, RA3, RA2, RA3, RA0, RA1, RA4, + RB4, RB2, RB0, RB1, RB3, RB2, RB3, RB0, RB1, RB4); + ROUND_INVERSE (25, 1, RA2, RA3, RA0, RA1, RA4, RA4, RA2, RA1, RA0, RA3, + RB2, RB3, RB0, RB1, RB4, RB4, RB2, RB1, RB0, RB3); + ROUND_INVERSE (24, 0, RA4, RA2, RA1, RA0, RA3, RA4, RA3, RA2, RA0, RA1, + RB4, RB2, RB1, RB0, RB3, RB4, RB3, RB2, RB0, RB1); + ROUND_INVERSE (23, 7, RA4, RA3, RA2, RA0, RA1, RA0, RA4, RA3, RA1, RA2, + RB4, RB3, RB2, RB0, RB1, RB0, RB4, RB3, RB1, RB2); + ROUND_INVERSE (22, 6, RA0, RA4, RA3, RA1, RA2, RA4, RA3, RA2, RA1, RA0, + RB0, RB4, RB3, RB1, RB2, RB4, RB3, RB2, RB1, RB0); + ROUND_INVERSE (21, 5, RA4, RA3, RA2, RA1, RA0, RA3, RA0, RA1, RA2, RA4, + RB4, RB3, RB2, RB1, RB0, RB3, RB0, RB1, RB2, RB4); + ROUND_INVERSE (20, 4, RA3, RA0, RA1, RA2, RA4, RA3, RA2, RA1, RA4, RA0, + RB3, 
RB0, RB1, RB2, RB4, RB3, RB2, RB1, RB4, RB0); + ROUND_INVERSE (19, 3, RA3, RA2, RA1, RA4, RA0, RA1, RA2, RA4, RA3, RA0, + RB3, RB2, RB1, RB4, RB0, RB1, RB2, RB4, RB3, RB0); + ROUND_INVERSE (18, 2, RA1, RA2, RA4, RA3, RA0, RA2, RA0, RA4, RA3, RA1, + RB1, RB2, RB4, RB3, RB0, RB2, RB0, RB4, RB3, RB1); + ROUND_INVERSE (17, 1, RA2, RA0, RA4, RA3, RA1, RA1, RA2, RA3, RA4, RA0, + RB2, RB0, RB4, RB3, RB1, RB1, RB2, RB3, RB4, RB0); + ROUND_INVERSE (16, 0, RA1, RA2, RA3, RA4, RA0, RA1, RA0, RA2, RA4, RA3, + RB1, RB2, RB3, RB4, RB0, RB1, RB0, RB2, RB4, RB3); + ROUND_INVERSE (15, 7, RA1, RA0, RA2, RA4, RA3, RA4, RA1, RA0, RA3, RA2, + RB1, RB0, RB2, RB4, RB3, RB4, RB1, RB0, RB3, RB2); + ROUND_INVERSE (14, 6, RA4, RA1, RA0, RA3, RA2, RA1, RA0, RA2, RA3, RA4, + RB4, RB1, RB0, RB3, RB2, RB1, RB0, RB2, RB3, RB4); + ROUND_INVERSE (13, 5, RA1, RA0, RA2, RA3, RA4, RA0, RA4, RA3, RA2, RA1, + RB1, RB0, RB2, RB3, RB4, RB0, RB4, RB3, RB2, RB1); + ROUND_INVERSE (12, 4, RA0, RA4, RA3, RA2, RA1, RA0, RA2, RA3, RA1, RA4, + RB0, RB4, RB3, RB2, RB1, RB0, RB2, RB3, RB1, RB4); + ROUND_INVERSE (11, 3, RA0, RA2, RA3, RA1, RA4, RA3, RA2, RA1, RA0, RA4, + RB0, RB2, RB3, RB1, RB4, RB3, RB2, RB1, RB0, RB4); + ROUND_INVERSE (10, 2, RA3, RA2, RA1, RA0, RA4, RA2, RA4, RA1, RA0, RA3, + RB3, RB2, RB1, RB0, RB4, RB2, RB4, RB1, RB0, RB3); + ROUND_INVERSE (9, 1, RA2, RA4, RA1, RA0, RA3, RA3, RA2, RA0, RA1, RA4, + RB2, RB4, RB1, RB0, RB3, RB3, RB2, RB0, RB1, RB4); + ROUND_INVERSE (8, 0, RA3, RA2, RA0, RA1, RA4, RA3, RA4, RA2, RA1, RA0, + RB3, RB2, RB0, RB1, RB4, RB3, RB4, RB2, RB1, RB0); + ROUND_INVERSE (7, 7, RA3, RA4, RA2, RA1, RA0, RA1, RA3, RA4, RA0, RA2, + RB3, RB4, RB2, RB1, RB0, RB1, RB3, RB4, RB0, RB2); + ROUND_INVERSE (6, 6, RA1, RA3, RA4, RA0, RA2, RA3, RA4, RA2, RA0, RA1, + RB1, RB3, RB4, RB0, RB2, RB3, RB4, RB2, RB0, RB1); + ROUND_INVERSE (5, 5, RA3, RA4, RA2, RA0, RA1, RA4, RA1, RA0, RA2, RA3, + RB3, RB4, RB2, RB0, RB1, RB4, RB1, RB0, RB2, RB3); + ROUND_INVERSE (4, 4, RA4, RA1, RA0, RA2, RA3, RA4, 
RA2, RA0, RA3, RA1, + RB4, RB1, RB0, RB2, RB3, RB4, RB2, RB0, RB3, RB1); + ROUND_INVERSE (3, 3, RA4, RA2, RA0, RA3, RA1, RA0, RA2, RA3, RA4, RA1, + RB4, RB2, RB0, RB3, RB1, RB0, RB2, RB3, RB4, RB1); + ROUND_INVERSE (2, 2, RA0, RA2, RA3, RA4, RA1, RA2, RA1, RA3, RA4, RA0, + RB0, RB2, RB3, RB4, RB1, RB2, RB1, RB3, RB4, RB0); + ROUND_INVERSE (1, 1, RA2, RA1, RA3, RA4, RA0, RA0, RA2, RA4, RA3, RA1, + RB2, RB1, RB3, RB4, RB0, RB0, RB2, RB4, RB3, RB1); + ROUND_INVERSE (0, 0, RA0, RA2, RA4, RA3, RA1, RA0, RA1, RA2, RA3, RA4, + RB0, RB2, RB4, RB3, RB1, RB0, RB1, RB2, RB3, RB4); transpose_4x4(RA0, RA1, RA2, RA3, RA4, RTMP0, RTMP1); transpose_4x4(RB0, RB1, RB2, RB3, RB4, RTMP0, RTMP1); - /* record output vector names for __serpent_dec_blk8 */ - .set dec_out_a0, RA0 - .set dec_out_a1, RA1 - .set dec_out_a2, RA2 - .set dec_out_a3, RA3 - .set dec_out_b0, RB0 - .set dec_out_b1, RB1 - .set dec_out_b2, RB2 - .set dec_out_b3, RB3 - ret; .size __serpent_dec_blk8,.-__serpent_dec_blk8; @@ -638,15 +615,6 @@ _gcry_serpent_sse2_ctr_enc: * %rcx: iv (big endian, 128bit) */ - .set RA0, enc_in_a0 - .set RA1, enc_in_a1 - .set RA2, enc_in_a2 - .set RA3, enc_in_a3 - .set RB0, enc_in_b0 - .set RB1, enc_in_b1 - .set RB2, enc_in_b2 - .set RB3, enc_in_b3 - /* load IV and byteswap */ movdqu (%rcx), RA0; movdqa RA0, RTMP0; @@ -729,42 +697,35 @@ _gcry_serpent_sse2_ctr_enc: call __serpent_enc_blk8; - .set RA0, enc_out_a0 - .set RA1, enc_out_a1 - .set RA2, enc_out_a2 - .set RA3, enc_out_a3 - .set RB0, enc_out_b0 - .set RB1, enc_out_b1 - .set RB2, enc_out_b2 - .set RB3, enc_out_b3 - - pxor_u((0 * 16)(%rdx), RA0, RTMP0); + pxor_u((0 * 16)(%rdx), RA4, RTMP0); pxor_u((1 * 16)(%rdx), RA1, RTMP0); pxor_u((2 * 16)(%rdx), RA2, RTMP0); - pxor_u((3 * 16)(%rdx), RA3, RTMP0); - pxor_u((4 * 16)(%rdx), RB0, RTMP0); + pxor_u((3 * 16)(%rdx), RA0, RTMP0); + pxor_u((4 * 16)(%rdx), RB4, RTMP0); pxor_u((5 * 16)(%rdx), RB1, RTMP0); pxor_u((6 * 16)(%rdx), RB2, RTMP0); - pxor_u((7 * 16)(%rdx), RB3, RTMP0); + pxor_u((7 * 
16)(%rdx), RB0, RTMP0); - movdqu RA0, (0 * 16)(%rsi); + movdqu RA4, (0 * 16)(%rsi); movdqu RA1, (1 * 16)(%rsi); movdqu RA2, (2 * 16)(%rsi); - movdqu RA3, (3 * 16)(%rsi); - movdqu RB0, (4 * 16)(%rsi); + movdqu RA0, (3 * 16)(%rsi); + movdqu RB4, (4 * 16)(%rsi); movdqu RB1, (5 * 16)(%rsi); movdqu RB2, (6 * 16)(%rsi); - movdqu RB3, (7 * 16)(%rsi); + movdqu RB0, (7 * 16)(%rsi); /* clear the used registers */ pxor RA0, RA0; pxor RA1, RA1; pxor RA2, RA2; pxor RA3, RA3; + pxor RA4, RA4; pxor RB0, RB0; pxor RB1, RB1; pxor RB2, RB2; pxor RB3, RB3; + pxor RB4, RB4; pxor RTMP0, RTMP0; pxor RTMP1, RTMP1; pxor RTMP2, RTMP2; @@ -784,15 +745,6 @@ _gcry_serpent_sse2_cbc_dec: * %rcx: iv */ - .set RA0, dec_in_a0 - .set RA1, dec_in_a1 - .set RA2, dec_in_a2 - .set RA3, dec_in_a3 - .set RB0, dec_in_b0 - .set RB1, dec_in_b1 - .set RB2, dec_in_b2 - .set RB3, dec_in_b3 - movdqu (0 * 16)(%rdx), RA0; movdqu (1 * 16)(%rdx), RA1; movdqu (2 * 16)(%rdx), RA2; @@ -804,15 +756,6 @@ _gcry_serpent_sse2_cbc_dec: call __serpent_dec_blk8; - .set RA0, dec_out_a0 - .set RA1, dec_out_a1 - .set RA2, dec_out_a2 - .set RA3, dec_out_a3 - .set RB0, dec_out_b0 - .set RB1, dec_out_b1 - .set RB2, dec_out_b2 - .set RB3, dec_out_b3 - movdqu (7 * 16)(%rdx), RNOT; pxor_u((%rcx), RA0, RTMP0); pxor_u((0 * 16)(%rdx), RA1, RTMP0); @@ -838,10 +781,12 @@ _gcry_serpent_sse2_cbc_dec: pxor RA1, RA1; pxor RA2, RA2; pxor RA3, RA3; + pxor RA4, RA4; pxor RB0, RB0; pxor RB1, RB1; pxor RB2, RB2; pxor RB3, RB3; + pxor RB4, RB4; pxor RTMP0, RTMP0; pxor RTMP1, RTMP1; pxor RTMP2, RTMP2; @@ -861,15 +806,6 @@ _gcry_serpent_sse2_cfb_dec: * %rcx: iv */ - .set RA0, enc_in_a0 - .set RA1, enc_in_a1 - .set RA2, enc_in_a2 - .set RA3, enc_in_a3 - .set RB0, enc_in_b0 - .set RB1, enc_in_b1 - .set RB2, enc_in_b2 - .set RB3, enc_in_b3 - /* Load input */ movdqu (%rcx), RA0; movdqu 0 * 16(%rdx), RA1; @@ -886,42 +822,35 @@ _gcry_serpent_sse2_cfb_dec: call __serpent_enc_blk8; - .set RA0, enc_out_a0 - .set RA1, enc_out_a1 - .set RA2, enc_out_a2 - .set 
RA3, enc_out_a3 - .set RB0, enc_out_b0 - .set RB1, enc_out_b1 - .set RB2, enc_out_b2 - .set RB3, enc_out_b3 - - pxor_u((0 * 16)(%rdx), RA0, RTMP0); + pxor_u((0 * 16)(%rdx), RA4, RTMP0); pxor_u((1 * 16)(%rdx), RA1, RTMP0); pxor_u((2 * 16)(%rdx), RA2, RTMP0); - pxor_u((3 * 16)(%rdx), RA3, RTMP0); - pxor_u((4 * 16)(%rdx), RB0, RTMP0); + pxor_u((3 * 16)(%rdx), RA0, RTMP0); + pxor_u((4 * 16)(%rdx), RB4, RTMP0); pxor_u((5 * 16)(%rdx), RB1, RTMP0); pxor_u((6 * 16)(%rdx), RB2, RTMP0); - pxor_u((7 * 16)(%rdx), RB3, RTMP0); + pxor_u((7 * 16)(%rdx), RB0, RTMP0); - movdqu RA0, (0 * 16)(%rsi); + movdqu RA4, (0 * 16)(%rsi); movdqu RA1, (1 * 16)(%rsi); movdqu RA2, (2 * 16)(%rsi); - movdqu RA3, (3 * 16)(%rsi); - movdqu RB0, (4 * 16)(%rsi); + movdqu RA0, (3 * 16)(%rsi); + movdqu RB4, (4 * 16)(%rsi); movdqu RB1, (5 * 16)(%rsi); movdqu RB2, (6 * 16)(%rsi); - movdqu RB3, (7 * 16)(%rsi); + movdqu RB0, (7 * 16)(%rsi); /* clear the used registers */ pxor RA0, RA0; pxor RA1, RA1; pxor RA2, RA2; pxor RA3, RA3; + pxor RA4, RA4; pxor RB0, RB0; pxor RB1, RB1; pxor RB2, RB2; pxor RB3, RB3; + pxor RB4, RB4; pxor RTMP0, RTMP0; pxor RTMP1, RTMP1; pxor RTMP2, RTMP2; diff --git a/configure.ac b/configure.ac index 1460dfd..8fb14e2 100644 --- a/configure.ac +++ b/configure.ac @@ -1034,17 +1034,12 @@ if test $amd64_as_feature_detection = yes; then [gcry_cv_gcc_amd64_platform_as_ok=no AC_COMPILE_IFELSE([AC_LANG_SOURCE( [[__asm__( - /* Test if '.set' is supported by underlying assembler. */ - ".set a0, %rax\n\t" - ".set b0, %rdx\n\t" - "asmfunc:\n\t" - "movq a0, b0;\n\t" /* Fails here if .set ignored by as. */ - /* Test if '.type' and '.size' are supported. */ /* These work only on ELF targets. */ /* TODO: add COFF (mingw64, cygwin64) support to assembly * implementations. Mingw64/cygwin64 also require additional * work because they use different calling convention. 
*/ + "asmfunc:\n\t" ".size asmfunc,.-asmfunc;\n\t" ".type asmfunc,@function;\n\t" );]])], ----------------------------------------------------------------------- Summary of changes: cipher/Makefile.am | 2 +- cipher/serpent-avx2-amd64.S | 519 ++++++++++++++++++------------------- cipher/serpent-sse2-amd64.S | 507 ++++++++++++++++++------------------ cipher/twofish-armv6.S | 365 ++++++++++++++++++++++++++++++ cipher/twofish.c | 88 +++++--- configure.ac | 11 +- mpi/longlong.h | 27 ++- 7 files changed, 885 insertions(+), 634 deletions(-) create mode 100644 cipher/twofish-armv6.S hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From dbaryshkov at gmail.com Tue Oct 22 21:29:25 2013 From: dbaryshkov at gmail.com (Dmitry Eremin-Solenikov) Date: Tue, 22 Oct 2013 23:29:25 +0400 Subject: [PATCH 1/3] Correct ASM assembly test in configure.ac Message-ID: <1382470167-11975-1-git-send-email-dbaryshkov@gmail.com> * configure.ac: correct HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS test to require neither ARMv6 nor Thumb mode. Our assembly code works perfectly even on ARMv4 now. Signed-off-by: Dmitry Eremin-Solenikov --- configure.ac | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/configure.ac b/configure.ac index 739a650..58916e8 100644 --- a/configure.ac +++ b/configure.ac @@ -1109,11 +1109,10 @@ AC_CACHE_CHECK([whether GCC assembler is compatible for ARM assembly implementat [[__asm__( /* Test if assembler supports UAL syntax. */ ".syntax unified\n\t" - ".thumb\n\t" /* thumb-2 in UAL, thumb-1 otherwise. */ - ".code 16\n\t" + ".arm\n\t" /* our assembly code is in ARM mode */ /* Following causes error if assembler ignored '.syntax unified'. */ "asmfunc:\n\t" - "add.w %r0, %r4, %r8, ror #12;\n\t" + "add %r0, %r0, %r4, ror #12;\n\t" /* Test if '.type' and '.size' are supported. 
*/ ".size asmfunc,.-asmfunc;\n\t" -- 1.8.4.rc3 From dbaryshkov at gmail.com Tue Oct 22 21:29:26 2013 From: dbaryshkov at gmail.com (Dmitry Eremin-Solenikov) Date: Tue, 22 Oct 2013 23:29:26 +0400 Subject: [PATCH 2/3] mpi: enable assembler on all arm architectures In-Reply-To: <1382470167-11975-1-git-send-email-dbaryshkov@gmail.com> References: <1382470167-11975-1-git-send-email-dbaryshkov@gmail.com> Message-ID: <1382470167-11975-2-git-send-email-dbaryshkov@gmail.com> * mpi/config.links: remove check for arm >= v6 * mpi/armv6 => mpi/arm: rename directory to reflect that it is generic enough -- The MPI ARM assembly does not depend on the CPU being ARMv6. Verified on PXA255: Before: Algorithm generate 100*sign 100*verify ------------------------------------------------ RSA 1024 bit 3990ms 57980ms 1680ms RSA 2048 bit 59620ms 389430ms 5690ms RSA 3072 bit 535850ms 1223200ms 12000ms RSA 4096 bit 449350ms 2707370ms 20050ms After: Algorithm generate 100*sign 100*verify ------------------------------------------------ RSA 1024 bit 2190ms 13730ms 320ms RSA 2048 bit 12750ms 67640ms 810ms RSA 3072 bit 110520ms 166100ms 1350ms RSA 4096 bit 100870ms 357560ms 2170ms Signed-off-by: Dmitry Eremin-Solenikov --- mpi/arm/mpi-asm-defs.h | 4 ++ mpi/arm/mpih-add1.S | 76 +++++++++++++++++++++++++++++++++++ mpi/arm/mpih-mul1.S | 80 +++++++++++++++++++++++++++++++++++++ mpi/arm/mpih-mul2.S | 94 ++++++++++++++++++++++++++++++++++++++++++++ mpi/arm/mpih-mul3.S | 100 +++++++++++++++++++++++++++++++++++++++++++++++ mpi/arm/mpih-sub1.S | 77 ++++++++++++++++++++++++++++++++++++ mpi/armv6/mpi-asm-defs.h | 4 -- mpi/armv6/mpih-add1.S | 76 ----------------------------------- mpi/armv6/mpih-mul1.S | 80 ------------------------------------- mpi/armv6/mpih-mul2.S | 94 -------------------------------------------- mpi/armv6/mpih-mul3.S | 100 ----------------------------------------------- mpi/armv6/mpih-sub1.S | 77 ------------------------------------ mpi/config.links | 11 ++---- 13 files changed, 434 insertions(+), 
439 deletions(-) create mode 100644 mpi/arm/mpi-asm-defs.h create mode 100644 mpi/arm/mpih-add1.S create mode 100644 mpi/arm/mpih-mul1.S create mode 100644 mpi/arm/mpih-mul2.S create mode 100644 mpi/arm/mpih-mul3.S create mode 100644 mpi/arm/mpih-sub1.S delete mode 100644 mpi/armv6/mpi-asm-defs.h delete mode 100644 mpi/armv6/mpih-add1.S delete mode 100644 mpi/armv6/mpih-mul1.S delete mode 100644 mpi/armv6/mpih-mul2.S delete mode 100644 mpi/armv6/mpih-mul3.S delete mode 100644 mpi/armv6/mpih-sub1.S diff --git a/mpi/arm/mpi-asm-defs.h b/mpi/arm/mpi-asm-defs.h new file mode 100644 index 0000000..047d1f5 --- /dev/null +++ b/mpi/arm/mpi-asm-defs.h @@ -0,0 +1,4 @@ +/* This file defines some basic constants for the MPI machinery. We + * need to define the types on a per-CPU basis, so it is done with + * this file here. */ +#define BYTES_PER_MPI_LIMB (SIZEOF_UNSIGNED_LONG) diff --git a/mpi/arm/mpih-add1.S b/mpi/arm/mpih-add1.S new file mode 100644 index 0000000..60ea4c3 --- /dev/null +++ b/mpi/arm/mpih-add1.S @@ -0,0 +1,76 @@ +/* ARMv6 add_n -- Add two limb vectors of the same length > 0 and store + * sum in a third limb vector. + * + * Copyright © 2013 Jussi Kivilinna + * + * This file is part of Libgcrypt. + * + * Libgcrypt is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as + * published by the Free Software Foundation; either version 2.1 of + * the License, or (at your option) any later version. + * + * Libgcrypt is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this program; if not, see . + * + * Note: This code is heavily based on the GNU MP Library (version 4.2.1). 
+ */ + +#include "sysdep.h" +#include "asm-syntax.h" + +.syntax unified +.arm + +/******************* + * mpi_limb_t + * _gcry_mpih_add_n( mpi_ptr_t res_ptr, %r0 + * mpi_ptr_t s1_ptr, %r1 + * mpi_ptr_t s2_ptr, %r2 + * mpi_size_t size) %r3 + */ + +.text + +.globl _gcry_mpih_add_n +.type _gcry_mpih_add_n,%function +_gcry_mpih_add_n: + push {%r4, %r5, %r6, %r7, %r8, %r9, %r10, %lr}; + cmn %r0, #0; /* clear carry flag */ + + tst %r3, #3; + beq .Large_loop; + +.Loop: + ldr %r4, [%r1], #4; + sub %r3, #1; + ldr %lr, [%r2], #4; + adcs %r4, %lr; + tst %r3, #3; + str %r4, [%r0], #4; + bne .Loop; + + teq %r3, #0; + beq .Lend; + +.Large_loop: + ldm %r1!, {%r4, %r6, %r8, %r10}; + ldm %r2!, {%r5, %r7, %r9, %lr}; + sub %r3, #4; + adcs %r4, %r5; + adcs %r6, %r7; + adcs %r8, %r9; + adcs %r10, %lr; + teq %r3, #0; + stm %r0!, {%r4, %r6, %r8, %r10}; + bne .Large_loop; + +.Lend: + adc %r0, %r3, #0; + pop {%r4, %r5, %r6, %r7, %r8, %r9, %r10, %pc}; +.size _gcry_mpih_add_n,.-_gcry_mpih_add_n; diff --git a/mpi/arm/mpih-mul1.S b/mpi/arm/mpih-mul1.S new file mode 100644 index 0000000..0aa41ef --- /dev/null +++ b/mpi/arm/mpih-mul1.S @@ -0,0 +1,80 @@ +/* ARMv6 mul_1 -- Multiply a limb vector with a limb and store the result in + * a second limb vector. + * + * Copyright © 2013 Jussi Kivilinna + * + * This file is part of Libgcrypt. + * + * Libgcrypt is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as + * published by the Free Software Foundation; either version 2.1 of + * the License, or (at your option) any later version. + * + * Libgcrypt is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this program; if not, see . 
+ * + * Note: This code is heavily based on the GNU MP Library (version 4.2.1). + */ + +#include "sysdep.h" +#include "asm-syntax.h" + +.syntax unified +.arm + +/******************* + * mpi_limb_t + * _gcry_mpih_mul_1( mpi_ptr_t res_ptr, %r0 + * mpi_ptr_t s1_ptr, %r1 + * mpi_size_t s1_size, %r2 + * mpi_limb_t s2_limb) %r3 + */ + +.text + +.globl _gcry_mpih_mul_1 +.type _gcry_mpih_mul_1,%function +_gcry_mpih_mul_1: + push {%r4, %r5, %r6, %r7, %r8, %r9, %r10, %r11, %lr}; + mov %r4, #0; + + tst %r2, #3; + beq .Large_loop; + +.Loop: + ldr %r5, [%r1], #4; + mov %lr, #0; + umlal %r4, %lr, %r5, %r3; + sub %r2, #1; + str %r4, [%r0], #4; + tst %r2, #3; + mov %r4, %lr; + bne .Loop; + + teq %r2, #0; + beq .Lend; + +.Large_loop: + ldm %r1!, {%r5, %r6, %r7, %r8}; + mov %r9, #0; + mov %r10, #0; + umlal %r4, %r9, %r5, %r3; + mov %r11, #0; + umlal %r9, %r10, %r6, %r3; + str %r4, [%r0], #4; + mov %r4, #0; + umlal %r10, %r11, %r7, %r3; + subs %r2, #4; + umlal %r11, %r4, %r8, %r3; + stm %r0!, {%r9, %r10, %r11}; + bne .Large_loop; + +.Lend: + mov %r0, %r4; + pop {%r4, %r5, %r6, %r7, %r8, %r9, %r10, %r11, %pc}; +.size _gcry_mpih_mul_1,.-_gcry_mpih_mul_1; diff --git a/mpi/arm/mpih-mul2.S b/mpi/arm/mpih-mul2.S new file mode 100644 index 0000000..a7eb8a1 --- /dev/null +++ b/mpi/arm/mpih-mul2.S @@ -0,0 +1,94 @@ +/* ARMv6 mul_2 -- Multiply a limb vector with a limb and add the result to + * a second limb vector. + * + * Copyright © 2013 Jussi Kivilinna + * + * This file is part of Libgcrypt. + * + * Libgcrypt is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as + * published by the Free Software Foundation; either version 2.1 of + * the License, or (at your option) any later version. + * + * Libgcrypt is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this program; if not, see . + * + * Note: This code is heavily based on the GNU MP Library (version 4.2.1). + */ + +#include "sysdep.h" +#include "asm-syntax.h" + +.syntax unified +.arm + +/******************* + * mpi_limb_t + * _gcry_mpih_addmul_1( mpi_ptr_t res_ptr, %r0 + * mpi_ptr_t s1_ptr, %r1 + * mpi_size_t s1_size, %r2 + * mpi_limb_t s2_limb) %r3 + */ + +.text + +.globl _gcry_mpih_addmul_1 +.type _gcry_mpih_addmul_1,%function +_gcry_mpih_addmul_1: + push {%r4, %r5, %r6, %r8, %r10, %lr}; + mov %lr, #0; + cmn %r0, #0; /* clear carry flag */ + + tst %r2, #3; + beq .Large_loop; +.Loop: + ldr %r5, [%r1], #4; + ldr %r4, [%r0]; + sub %r2, #1; + adcs %r4, %lr; + mov %lr, #0; + umlal %r4, %lr, %r5, %r3; + tst %r2, #3; + str %r4, [%r0], #4; + bne .Loop; + + teq %r2, #0; + beq .Lend; + +.Large_loop: + ldr %r5, [%r1], #4; + ldm %r0, {%r4, %r6, %r8, %r10}; + + sub %r2, #4; + adcs %r4, %lr; + mov %lr, #0; + umlal %r4, %lr, %r5, %r3; + + ldr %r5, [%r1], #4; + adcs %r6, %lr; + mov %lr, #0; + umlal %r6, %lr, %r5, %r3; + + ldr %r5, [%r1], #4; + adcs %r8, %lr; + mov %lr, #0; + umlal %r8, %lr, %r5, %r3; + + ldr %r5, [%r1], #4; + adcs %r10, %lr; + mov %lr, #0; + umlal %r10, %lr, %r5, %r3; + + teq %r2, #0; + stm %r0!, {%r4, %r6, %r8, %r10}; + bne .Large_loop; + +.Lend: + adc %r0, %lr, #0; + pop {%r4, %r5, %r6, %r8, %r10, %pc}; +.size _gcry_mpih_addmul_1,.-_gcry_mpih_addmul_1; diff --git a/mpi/arm/mpih-mul3.S b/mpi/arm/mpih-mul3.S new file mode 100644 index 0000000..034929e --- /dev/null +++ b/mpi/arm/mpih-mul3.S @@ -0,0 +1,100 @@ +/* ARMv6 mul_3 -- Multiply a limb vector with a limb and subtract the result + * from a second limb vector. + * + * Copyright © 2013 Jussi Kivilinna + * + * This file is part of Libgcrypt. 
+ * + * Libgcrypt is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as + * published by the Free Software Foundation; either version 2.1 of + * the License, or (at your option) any later version. + * + * Libgcrypt is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this program; if not, see . + * + * Note: This code is heavily based on the GNU MP Library (version 4.2.1). + */ + +#include "sysdep.h" +#include "asm-syntax.h" + +.syntax unified +.arm + +/******************* + * mpi_limb_t + * _gcry_mpih_submul_1( mpi_ptr_t res_ptr, %r0 + * mpi_ptr_t s1_ptr, %r1 + * mpi_size_t s1_size, %r2 + * mpi_limb_t s2_limb) %r3 + */ + +.text + +.globl _gcry_mpih_submul_1 +.type _gcry_mpih_submul_1,%function +_gcry_mpih_submul_1: + push {%r4, %r5, %r6, %r8, %r9, %r10, %lr}; + mov %lr, #0; + cmp %r0, #0; /* prepare carry flag for sbc */ + + tst %r2, #3; + beq .Large_loop; +.Loop: + ldr %r5, [%r1], #4; + mov %r4, %lr; + mov %lr, #0; + ldr %r6, [%r0]; + umlal %r4, %lr, %r5, %r3; + sub %r2, #1; + sbcs %r4, %r6, %r4; + tst %r2, #3; + str %r4, [%r0], #4; + bne .Loop; + + teq %r2, #0; + beq .Lend; + +.Large_loop: + ldr %r5, [%r1], #4; + mov %r9, #0; + ldr %r4, [%r0, #0]; + + umlal %lr, %r9, %r5, %r3; + ldr %r6, [%r0, #4]; + ldr %r5, [%r1], #4; + sbcs %r4, %r4, %lr; + + mov %lr, #0; + umlal %r9, %lr, %r5, %r3; + ldr %r8, [%r0, #8]; + ldr %r5, [%r1], #4; + sbcs %r6, %r6, %r9; + + mov %r9, #0; + umlal %lr, %r9, %r5, %r3; + ldr %r10, [%r0, #12]; + ldr %r5, [%r1], #4; + sbcs %r8, %r8, %lr; + + mov %lr, #0; + umlal %r9, %lr, %r5, %r3; + sub %r2, #4; + sbcs %r10, %r10, %r9; + + teq %r2, #0; + stm %r0!, {%r4, %r6, %r8, %r10}; + bne .Large_loop; + 
+.Lend: + it cc + movcc %r2, #1; + add %r0, %lr, %r2; + pop {%r4, %r5, %r6, %r8, %r9, %r10, %pc}; +.size _gcry_mpih_submul_1,.-_gcry_mpih_submul_1; diff --git a/mpi/arm/mpih-sub1.S b/mpi/arm/mpih-sub1.S new file mode 100644 index 0000000..77d05eb --- /dev/null +++ b/mpi/arm/mpih-sub1.S @@ -0,0 +1,77 @@ +/* ARMv6 sub_n -- Subtract two limb vectors of the same length > 0 and store + * sum in a third limb vector. + * + * Copyright ? 2013 Jussi Kivilinna + * + * This file is part of Libgcrypt. + * + * Libgcrypt is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as + * published by the Free Software Foundation; either version 2.1 of + * the License, or (at your option) any later version. + * + * Libgcrypt is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this program; if not, see . + * + * Note: This code is heavily based on the GNU MP Library (version 4.2.1). 
+ */ + +#include "sysdep.h" +#include "asm-syntax.h" + +.syntax unified +.arm + +/******************* + * mpi_limb_t + * _gcry_mpih_sub_n( mpi_ptr_t res_ptr, %r0 + * mpi_ptr_t s1_ptr, %r1 + * mpi_ptr_t s2_ptr, %r2 + * mpi_size_t size) %r3 + */ + +.text + +.globl _gcry_mpih_sub_n +.type _gcry_mpih_sub_n,%function +_gcry_mpih_sub_n: + push {%r4, %r5, %r6, %r7, %r8, %r9, %r10, %lr}; + cmp %r0, #0; /* prepare carry flag for sub */ + + tst %r3, #3; + beq .Large_loop; + +.Loop: + ldr %r4, [%r1], #4; + sub %r3, #1; + ldr %lr, [%r2], #4; + sbcs %r4, %lr; + tst %r3, #3; + str %r4, [%r0], #4; + bne .Loop; + + teq %r3, #0; + beq .Lend; + +.Large_loop: + ldm %r1!, {%r4, %r6, %r8, %r10}; + sub %r3, #4; + ldm %r2!, {%r5, %r7, %r9, %lr}; + sbcs %r4, %r5; + sbcs %r6, %r7; + sbcs %r8, %r9; + sbcs %r10, %lr; + teq %r3, #0; + stm %r0!, {%r4, %r6, %r8, %r10}; + bne .Large_loop; + +.Lend: + sbc %r0, %r3, #0; + neg %r0, %r0; + pop {%r4, %r5, %r6, %r7, %r8, %r9, %r10, %pc}; +.size _gcry_mpih_sub_n,.-_gcry_mpih_sub_n; diff --git a/mpi/armv6/mpi-asm-defs.h b/mpi/armv6/mpi-asm-defs.h deleted file mode 100644 index 047d1f5..0000000 --- a/mpi/armv6/mpi-asm-defs.h +++ /dev/null @@ -1,4 +0,0 @@ -/* This file defines some basic constants for the MPI machinery. We - * need to define the types on a per-CPU basis, so it is done with - * this file here. */ -#define BYTES_PER_MPI_LIMB (SIZEOF_UNSIGNED_LONG) diff --git a/mpi/armv6/mpih-add1.S b/mpi/armv6/mpih-add1.S deleted file mode 100644 index 60ea4c3..0000000 --- a/mpi/armv6/mpih-add1.S +++ /dev/null @@ -1,76 +0,0 @@ -/* ARMv6 add_n -- Add two limb vectors of the same length > 0 and store - * sum in a third limb vector. - * - * Copyright ? 2013 Jussi Kivilinna - * - * This file is part of Libgcrypt. 
- * - * Libgcrypt is free software; you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation; either version 2.1 of - * the License, or (at your option) any later version. - * - * Libgcrypt is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public - * License along with this program; if not, see . - * - * Note: This code is heavily based on the GNU MP Library (version 4.2.1). - */ - -#include "sysdep.h" -#include "asm-syntax.h" - -.syntax unified -.arm - -/******************* - * mpi_limb_t - * _gcry_mpih_add_n( mpi_ptr_t res_ptr, %r0 - * mpi_ptr_t s1_ptr, %r1 - * mpi_ptr_t s2_ptr, %r2 - * mpi_size_t size) %r3 - */ - -.text - -.globl _gcry_mpih_add_n -.type _gcry_mpih_add_n,%function -_gcry_mpih_add_n: - push {%r4, %r5, %r6, %r7, %r8, %r9, %r10, %lr}; - cmn %r0, #0; /* clear carry flag */ - - tst %r3, #3; - beq .Large_loop; - -.Loop: - ldr %r4, [%r1], #4; - sub %r3, #1; - ldr %lr, [%r2], #4; - adcs %r4, %lr; - tst %r3, #3; - str %r4, [%r0], #4; - bne .Loop; - - teq %r3, #0; - beq .Lend; - -.Large_loop: - ldm %r1!, {%r4, %r6, %r8, %r10}; - ldm %r2!, {%r5, %r7, %r9, %lr}; - sub %r3, #4; - adcs %r4, %r5; - adcs %r6, %r7; - adcs %r8, %r9; - adcs %r10, %lr; - teq %r3, #0; - stm %r0!, {%r4, %r6, %r8, %r10}; - bne .Large_loop; - -.Lend: - adc %r0, %r3, #0; - pop {%r4, %r5, %r6, %r7, %r8, %r9, %r10, %pc}; -.size _gcry_mpih_add_n,.-_gcry_mpih_add_n; diff --git a/mpi/armv6/mpih-mul1.S b/mpi/armv6/mpih-mul1.S deleted file mode 100644 index 0aa41ef..0000000 --- a/mpi/armv6/mpih-mul1.S +++ /dev/null @@ -1,80 +0,0 @@ -/* ARMv6 mul_1 -- Multiply a limb vector with a limb and store the result in - * a second limb vector. 
- * - * Copyright ? 2013 Jussi Kivilinna - * - * This file is part of Libgcrypt. - * - * Libgcrypt is free software; you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation; either version 2.1 of - * the License, or (at your option) any later version. - * - * Libgcrypt is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public - * License along with this program; if not, see . - * - * Note: This code is heavily based on the GNU MP Library (version 4.2.1). - */ - -#include "sysdep.h" -#include "asm-syntax.h" - -.syntax unified -.arm - -/******************* - * mpi_limb_t - * _gcry_mpih_mul_1( mpi_ptr_t res_ptr, %r0 - * mpi_ptr_t s1_ptr, %r1 - * mpi_size_t s1_size, %r2 - * mpi_limb_t s2_limb) %r3 - */ - -.text - -.globl _gcry_mpih_mul_1 -.type _gcry_mpih_mul_1,%function -_gcry_mpih_mul_1: - push {%r4, %r5, %r6, %r7, %r8, %r9, %r10, %r11, %lr}; - mov %r4, #0; - - tst %r2, #3; - beq .Large_loop; - -.Loop: - ldr %r5, [%r1], #4; - mov %lr, #0; - umlal %r4, %lr, %r5, %r3; - sub %r2, #1; - str %r4, [%r0], #4; - tst %r2, #3; - mov %r4, %lr; - bne .Loop; - - teq %r2, #0; - beq .Lend; - -.Large_loop: - ldm %r1!, {%r5, %r6, %r7, %r8}; - mov %r9, #0; - mov %r10, #0; - umlal %r4, %r9, %r5, %r3; - mov %r11, #0; - umlal %r9, %r10, %r6, %r3; - str %r4, [%r0], #4; - mov %r4, #0; - umlal %r10, %r11, %r7, %r3; - subs %r2, #4; - umlal %r11, %r4, %r8, %r3; - stm %r0!, {%r9, %r10, %r11}; - bne .Large_loop; - -.Lend: - mov %r0, %r4; - pop {%r4, %r5, %r6, %r7, %r8, %r9, %r10, %r11, %pc}; -.size _gcry_mpih_mul_1,.-_gcry_mpih_mul_1; diff --git a/mpi/armv6/mpih-mul2.S b/mpi/armv6/mpih-mul2.S deleted file mode 100644 index a7eb8a1..0000000 --- 
a/mpi/armv6/mpih-mul2.S +++ /dev/null @@ -1,94 +0,0 @@ -/* ARMv6 mul_2 -- Multiply a limb vector with a limb and add the result to - * a second limb vector. - * - * Copyright ? 2013 Jussi Kivilinna - * - * This file is part of Libgcrypt. - * - * Libgcrypt is free software; you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation; either version 2.1 of - * the License, or (at your option) any later version. - * - * Libgcrypt is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public - * License along with this program; if not, see . - * - * Note: This code is heavily based on the GNU MP Library (version 4.2.1). - */ - -#include "sysdep.h" -#include "asm-syntax.h" - -.syntax unified -.arm - -/******************* - * mpi_limb_t - * _gcry_mpih_addmul_1( mpi_ptr_t res_ptr, %r0 - * mpi_ptr_t s1_ptr, %r1 - * mpi_size_t s1_size, %r2 - * mpi_limb_t s2_limb) %r3 - */ - -.text - -.globl _gcry_mpih_addmul_1 -.type _gcry_mpih_addmul_1,%function -_gcry_mpih_addmul_1: - push {%r4, %r5, %r6, %r8, %r10, %lr}; - mov %lr, #0; - cmn %r0, #0; /* clear carry flag */ - - tst %r2, #3; - beq .Large_loop; -.Loop: - ldr %r5, [%r1], #4; - ldr %r4, [%r0]; - sub %r2, #1; - adcs %r4, %lr; - mov %lr, #0; - umlal %r4, %lr, %r5, %r3; - tst %r2, #3; - str %r4, [%r0], #4; - bne .Loop; - - teq %r2, #0; - beq .Lend; - -.Large_loop: - ldr %r5, [%r1], #4; - ldm %r0, {%r4, %r6, %r8, %r10}; - - sub %r2, #4; - adcs %r4, %lr; - mov %lr, #0; - umlal %r4, %lr, %r5, %r3; - - ldr %r5, [%r1], #4; - adcs %r6, %lr; - mov %lr, #0; - umlal %r6, %lr, %r5, %r3; - - ldr %r5, [%r1], #4; - adcs %r8, %lr; - mov %lr, #0; - umlal %r8, %lr, %r5, %r3; - - ldr %r5, [%r1], #4; - adcs 
%r10, %lr; - mov %lr, #0; - umlal %r10, %lr, %r5, %r3; - - teq %r2, #0; - stm %r0!, {%r4, %r6, %r8, %r10}; - bne .Large_loop; - -.Lend: - adc %r0, %lr, #0; - pop {%r4, %r5, %r6, %r8, %r10, %pc}; -.size _gcry_mpih_addmul_1,.-_gcry_mpih_addmul_1; diff --git a/mpi/armv6/mpih-mul3.S b/mpi/armv6/mpih-mul3.S deleted file mode 100644 index 034929e..0000000 --- a/mpi/armv6/mpih-mul3.S +++ /dev/null @@ -1,100 +0,0 @@ -/* ARMv6 mul_3 -- Multiply a limb vector with a limb and subtract the result - * from a second limb vector. - * - * Copyright ? 2013 Jussi Kivilinna - * - * This file is part of Libgcrypt. - * - * Libgcrypt is free software; you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation; either version 2.1 of - * the License, or (at your option) any later version. - * - * Libgcrypt is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public - * License along with this program; if not, see . - * - * Note: This code is heavily based on the GNU MP Library (version 4.2.1). 
- */ - -#include "sysdep.h" -#include "asm-syntax.h" - -.syntax unified -.arm - -/******************* - * mpi_limb_t - * _gcry_mpih_submul_1( mpi_ptr_t res_ptr, %r0 - * mpi_ptr_t s1_ptr, %r1 - * mpi_size_t s1_size, %r2 - * mpi_limb_t s2_limb) %r3 - */ - -.text - -.globl _gcry_mpih_submul_1 -.type _gcry_mpih_submul_1,%function -_gcry_mpih_submul_1: - push {%r4, %r5, %r6, %r8, %r9, %r10, %lr}; - mov %lr, #0; - cmp %r0, #0; /* prepare carry flag for sbc */ - - tst %r2, #3; - beq .Large_loop; -.Loop: - ldr %r5, [%r1], #4; - mov %r4, %lr; - mov %lr, #0; - ldr %r6, [%r0]; - umlal %r4, %lr, %r5, %r3; - sub %r2, #1; - sbcs %r4, %r6, %r4; - tst %r2, #3; - str %r4, [%r0], #4; - bne .Loop; - - teq %r2, #0; - beq .Lend; - -.Large_loop: - ldr %r5, [%r1], #4; - mov %r9, #0; - ldr %r4, [%r0, #0]; - - umlal %lr, %r9, %r5, %r3; - ldr %r6, [%r0, #4]; - ldr %r5, [%r1], #4; - sbcs %r4, %r4, %lr; - - mov %lr, #0; - umlal %r9, %lr, %r5, %r3; - ldr %r8, [%r0, #8]; - ldr %r5, [%r1], #4; - sbcs %r6, %r6, %r9; - - mov %r9, #0; - umlal %lr, %r9, %r5, %r3; - ldr %r10, [%r0, #12]; - ldr %r5, [%r1], #4; - sbcs %r8, %r8, %lr; - - mov %lr, #0; - umlal %r9, %lr, %r5, %r3; - sub %r2, #4; - sbcs %r10, %r10, %r9; - - teq %r2, #0; - stm %r0!, {%r4, %r6, %r8, %r10}; - bne .Large_loop; - -.Lend: - it cc - movcc %r2, #1; - add %r0, %lr, %r2; - pop {%r4, %r5, %r6, %r8, %r9, %r10, %pc}; -.size _gcry_mpih_submul_1,.-_gcry_mpih_submul_1; diff --git a/mpi/armv6/mpih-sub1.S b/mpi/armv6/mpih-sub1.S deleted file mode 100644 index 77d05eb..0000000 --- a/mpi/armv6/mpih-sub1.S +++ /dev/null @@ -1,77 +0,0 @@ -/* ARMv6 sub_n -- Subtract two limb vectors of the same length > 0 and store - * sum in a third limb vector. - * - * Copyright ? 2013 Jussi Kivilinna - * - * This file is part of Libgcrypt. 
- * - * Libgcrypt is free software; you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation; either version 2.1 of - * the License, or (at your option) any later version. - * - * Libgcrypt is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public - * License along with this program; if not, see . - * - * Note: This code is heavily based on the GNU MP Library (version 4.2.1). - */ - -#include "sysdep.h" -#include "asm-syntax.h" - -.syntax unified -.arm - -/******************* - * mpi_limb_t - * _gcry_mpih_sub_n( mpi_ptr_t res_ptr, %r0 - * mpi_ptr_t s1_ptr, %r1 - * mpi_ptr_t s2_ptr, %r2 - * mpi_size_t size) %r3 - */ - -.text - -.globl _gcry_mpih_sub_n -.type _gcry_mpih_sub_n,%function -_gcry_mpih_sub_n: - push {%r4, %r5, %r6, %r7, %r8, %r9, %r10, %lr}; - cmp %r0, #0; /* prepare carry flag for sub */ - - tst %r3, #3; - beq .Large_loop; - -.Loop: - ldr %r4, [%r1], #4; - sub %r3, #1; - ldr %lr, [%r2], #4; - sbcs %r4, %lr; - tst %r3, #3; - str %r4, [%r0], #4; - bne .Loop; - - teq %r3, #0; - beq .Lend; - -.Large_loop: - ldm %r1!, {%r4, %r6, %r8, %r10}; - sub %r3, #4; - ldm %r2!, {%r5, %r7, %r9, %lr}; - sbcs %r4, %r5; - sbcs %r6, %r7; - sbcs %r8, %r9; - sbcs %r10, %lr; - teq %r3, #0; - stm %r0!, {%r4, %r6, %r8, %r10}; - bne .Large_loop; - -.Lend: - sbc %r0, %r3, #0; - neg %r0, %r0; - pop {%r4, %r5, %r6, %r7, %r8, %r9, %r10, %pc}; -.size _gcry_mpih_sub_n,.-_gcry_mpih_sub_n; diff --git a/mpi/config.links b/mpi/config.links index 9fb4f10..90d1077 100644 --- a/mpi/config.links +++ b/mpi/config.links @@ -138,14 +138,9 @@ case "${host}" in ;; arm*-*-*) if test "$gcry_cv_gcc_arm_platform_as_ok" = "yes" ; then - if test 
"$gcry_cv_cc_arm_arch_is_v6" = "yes" ; then - echo '/* configured for armv6 */' >>./mpi/asm-syntax.h - path="armv6" - mpi_cpu_arch="arm" - else - echo '/* No assembler modules configured */' >>./mpi/asm-syntax.h - path="" - fi + echo '/* configured for arm */' >>./mpi/asm-syntax.h + path="arm" + mpi_cpu_arch="arm" else echo '/* No assembler modules configured */' >>./mpi/asm-syntax.h path="" -- 1.8.4.rc3 From dbaryshkov at gmail.com Tue Oct 22 21:29:27 2013 From: dbaryshkov at gmail.com (Dmitry Eremin-Solenikov) Date: Tue, 22 Oct 2013 23:29:27 +0400 Subject: [PATCH 3/3] Enable assembler optimizations on earlier ARM cores In-Reply-To: <1382470167-11975-1-git-send-email-dbaryshkov@gmail.com> References: <1382470167-11975-1-git-send-email-dbaryshkov@gmail.com> Message-ID: <1382470167-11975-3-git-send-email-dbaryshkov@gmail.com> * cipher/blowfish-armv6.S => cipher/blowfish-arm.S: adapt to pre-armv6 CPUs. * cipher/blowfish.c: enable assembly on armv4/armv5 little-endian CPUs. * cipher/camellia-armv6.S => cipher/camellia-arm.S: adapt to pre-armv6 CPUs. * cipher/camellia.c, cipher-camellia-glue.c: enable assembly on armv4/armv5 little-endian CPUs. * cipher/cast5-armv6.S => cipher/cast5-arm.S: adapt to pre-armv6 CPUs. * cipher/cast5.c: enable assembly on armv4/armv5 little-endian CPUs. * cipher/rijndael-armv6.S => cipher/rijndael-arm.S: adapt to pre-armv6 CPUs. * cipher/rijndael.c: enable assembly on armv4/armv5 little-endian CPUs. * cipher/twofish-armv6.S => cipher/twofish-arm.S: adapt to pre-armv6 CPUs. * cipher/twofish.c: enable assembly on armv4/armv5 little-endian CPUs. -- Our ARMv6 assembly optimized code can be easily adapted to earlier CPUs. The only incompatible place is rev instruction used to do byte swapping. Replace it on <= ARMv6 with a series of 4 instructions. 
Compare:

              ECB/Stream        CBC              CFB              OFB              CTR
             ---------------  ---------------  ---------------  ---------------  ---------------
 AES          620ms  610ms     650ms  680ms     620ms  630ms     660ms  660ms     630ms  630ms
 CAMELLIA128  720ms  720ms     780ms  790ms     770ms  760ms     780ms  780ms     770ms  760ms
 CAMELLIA256  910ms  910ms     970ms  970ms     960ms  950ms     970ms  970ms     960ms  950ms
 CAST5        820ms  820ms     930ms  920ms     890ms  860ms     930ms  920ms     880ms  890ms
 BLOWFISH     550ms  560ms     650ms  660ms     630ms  600ms     660ms  650ms     610ms  620ms

              ECB/Stream        CBC              CFB              OFB              CTR
             ---------------  ---------------  ---------------  ---------------  ---------------
 AES          130ms  140ms     180ms  200ms     160ms  170ms     190ms  200ms     170ms  170ms
 CAMELLIA128  150ms  160ms     210ms  220ms     200ms  190ms     210ms  220ms     190ms  190ms
 CAMELLIA256  180ms  180ms     260ms  240ms     240ms  230ms     250ms  250ms     230ms  230ms
 CAST5        170ms  160ms     270ms  120ms     240ms  130ms     260ms  270ms     130ms  120ms
 BLOWFISH     160ms  150ms     260ms  110ms     230ms  120ms     250ms  260ms     110ms  120ms

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
---
 cipher/Makefile.am      |   8 +-
 cipher/blowfish-arm.S   | 743 +++++++++++++++++++++++++++++++++++++++++
 cipher/blowfish-armv6.S | 730 -----------------------------------------
 cipher/blowfish.c       |  44 +--
 cipher/camellia-arm.S   | 616 ++++++++++++++++++++++++++++++++++
 cipher/camellia-armv6.S | 604 ----------------------------------
 cipher/camellia-glue.c  |  14 +-
 cipher/camellia.c       |   8 +-
 cipher/camellia.h       |   2 +-
 cipher/cast5-arm.S      | 715 ++++++++++++++++++++++++++++++++++
 cipher/cast5-armv6.S    | 702 ---------------------------------------
 cipher/cast5.c          |  44 +--
 cipher/rijndael-arm.S   | 853 ++++++++++++++++++++++++++++++++++++++++++++++++
 cipher/rijndael-armv6.S | 853 ------------------------------------------------
 cipher/rijndael.c       |  36 +-
 cipher/twofish-arm.S    | 365 +++++++++++++++++++++
 cipher/twofish-armv6.S  | 365 ---------------------
 cipher/twofish.c        |   8 +-
 configure.ac            |  10 +-
 19 files changed, 3379 insertions(+), 3341 deletions(-)
 create mode 100644 cipher/blowfish-arm.S
 delete mode 100644 cipher/blowfish-armv6.S
 create mode 100644
cipher/camellia-arm.S delete mode 100644 cipher/camellia-armv6.S create mode 100644 cipher/cast5-arm.S delete mode 100644 cipher/cast5-armv6.S create mode 100644 cipher/rijndael-arm.S delete mode 100644 cipher/rijndael-armv6.S create mode 100644 cipher/twofish-arm.S delete mode 100644 cipher/twofish-armv6.S diff --git a/cipher/Makefile.am b/cipher/Makefile.am index 3d8149a..e3aed3b 100644 --- a/cipher/Makefile.am +++ b/cipher/Makefile.am @@ -56,7 +56,7 @@ rmd.h EXTRA_libcipher_la_SOURCES = \ arcfour.c \ blowfish.c blowfish-amd64.S \ -cast5.c cast5-amd64.S cast5-armv6.S \ +cast5.c cast5-amd64.S cast5-arm.S \ crc.c \ des.c \ dsa.c \ @@ -67,7 +67,7 @@ gost28147.c gost.h \ gostr3411-94.c \ md4.c \ md5.c \ -rijndael.c rijndael-tables.h rijndael-amd64.S rijndael-armv6.S \ +rijndael.c rijndael-tables.h rijndael-amd64.S rijndael-arm.S \ rmd160.c \ rsa.c \ salsa20.c \ @@ -80,10 +80,10 @@ sha512.c sha512-armv7-neon.S \ stribog.c \ tiger.c \ whirlpool.c \ -twofish.c twofish-amd64.S twofish-armv6.S \ +twofish.c twofish-amd64.S twofish-arm.S \ rfc2268.c \ camellia.c camellia.h camellia-glue.c camellia-aesni-avx-amd64.S \ - camellia-aesni-avx2-amd64.S camellia-armv6.S + camellia-aesni-avx2-amd64.S camellia-arm.S if ENABLE_O_FLAG_MUNGING o_flag_munging = sed -e 's/-O\([2-9s][2-9s]*\)/-O1/' -e 's/-Ofast/-O1/g' diff --git a/cipher/blowfish-arm.S b/cipher/blowfish-arm.S new file mode 100644 index 0000000..501b085 --- /dev/null +++ b/cipher/blowfish-arm.S @@ -0,0 +1,743 @@ +/* blowfish-arm.S - ARM assembly implementation of Blowfish cipher + * + * Copyright ? 2013 Jussi Kivilinna + * + * This file is part of Libgcrypt. + * + * Libgcrypt is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as + * published by the Free Software Foundation; either version 2.1 of + * the License, or (at your option) any later version. 
+ * + * Libgcrypt is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this program; if not, see . + */ + +#include + +#if defined(__ARMEL__) +#ifdef HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS + +.text + +.syntax unified +.arm + +/* structure of crypto context */ +#define s0 0 +#define s1 (s0 + (1 * 256) * 4) +#define s2 (s0 + (2 * 256) * 4) +#define s3 (s0 + (3 * 256) * 4) +#define p (s3 + (1 * 256) * 4) + +/* register macros */ +#define CTXs0 %r0 +#define CTXs1 %r9 +#define CTXs2 %r8 +#define CTXs3 %r10 +#define RMASK %lr +#define RKEYL %r2 +#define RKEYR %ip + +#define RL0 %r3 +#define RR0 %r4 + +#define RL1 %r9 +#define RR1 %r10 + +#define RT0 %r11 +#define RT1 %r7 +#define RT2 %r5 +#define RT3 %r6 + +/* helper macros */ +#define ldr_unaligned_le(rout, rsrc, offs, rtmp) \ + ldrb rout, [rsrc, #((offs) + 0)]; \ + ldrb rtmp, [rsrc, #((offs) + 1)]; \ + orr rout, rout, rtmp, lsl #8; \ + ldrb rtmp, [rsrc, #((offs) + 2)]; \ + orr rout, rout, rtmp, lsl #16; \ + ldrb rtmp, [rsrc, #((offs) + 3)]; \ + orr rout, rout, rtmp, lsl #24; + +#define str_unaligned_le(rin, rdst, offs, rtmp0, rtmp1) \ + mov rtmp0, rin, lsr #8; \ + strb rin, [rdst, #((offs) + 0)]; \ + mov rtmp1, rin, lsr #16; \ + strb rtmp0, [rdst, #((offs) + 1)]; \ + mov rtmp0, rin, lsr #24; \ + strb rtmp1, [rdst, #((offs) + 2)]; \ + strb rtmp0, [rdst, #((offs) + 3)]; + +#define ldr_unaligned_be(rout, rsrc, offs, rtmp) \ + ldrb rout, [rsrc, #((offs) + 3)]; \ + ldrb rtmp, [rsrc, #((offs) + 2)]; \ + orr rout, rout, rtmp, lsl #8; \ + ldrb rtmp, [rsrc, #((offs) + 1)]; \ + orr rout, rout, rtmp, lsl #16; \ + ldrb rtmp, [rsrc, #((offs) + 0)]; \ + orr rout, rout, rtmp, lsl #24; + +#define str_unaligned_be(rin, rdst, offs, rtmp0, rtmp1) \ + mov rtmp0, 
rin, lsr #8; \ + strb rin, [rdst, #((offs) + 3)]; \ + mov rtmp1, rin, lsr #16; \ + strb rtmp0, [rdst, #((offs) + 2)]; \ + mov rtmp0, rin, lsr #24; \ + strb rtmp1, [rdst, #((offs) + 1)]; \ + strb rtmp0, [rdst, #((offs) + 0)]; + +#ifdef __ARMEL__ + #define ldr_unaligned_host ldr_unaligned_le + #define str_unaligned_host str_unaligned_le + + /* bswap on little-endian */ +#ifdef HAVE_ARM_ARCH_V6 + #define host_to_be(reg, rtmp) \ + rev reg, reg; + #define be_to_host(reg, rtmp) \ + rev reg, reg; +#else + #define host_to_be(reg, rtmp) \ + eor rtmp, reg, reg, ror #16; \ + mov rtmp, rtmp, lsr #8; \ + bic rtmp, rtmp, #65280; \ + eor reg, rtmp, reg, ror #8; + #define be_to_host(reg, rtmp) \ + eor rtmp, reg, reg, ror #16; \ + mov rtmp, rtmp, lsr #8; \ + bic rtmp, rtmp, #65280; \ + eor reg, rtmp, reg, ror #8; +#endif +#else + #define ldr_unaligned_host ldr_unaligned_be + #define str_unaligned_host str_unaligned_be + + /* nop on big-endian */ + #define host_to_be(reg, rtmp) /*_*/ + #define be_to_host(reg, rtmp) /*_*/ +#endif + +#define host_to_host(x, y) /*_*/ + +/*********************************************************************** + * 1-way blowfish + ***********************************************************************/ +#define F(l, r) \ + and RT0, RMASK, l, lsr#(24 - 2); \ + and RT1, RMASK, l, lsr#(16 - 2); \ + ldr RT0, [CTXs0, RT0]; \ + and RT2, RMASK, l, lsr#(8 - 2); \ + ldr RT1, [CTXs1, RT1]; \ + and RT3, RMASK, l, lsl#2; \ + ldr RT2, [CTXs2, RT2]; \ + add RT0, RT1; \ + ldr RT3, [CTXs3, RT3]; \ + eor RT0, RT2; \ + add RT0, RT3; \ + eor r, RT0; + +#define load_roundkey_enc(n) \ + ldr RKEYL, [CTXs2, #((p - s2) + (4 * (n) + 0))]; \ + ldr RKEYR, [CTXs2, #((p - s2) + (4 * (n) + 4))]; + +#define add_roundkey_enc() \ + eor RL0, RKEYL; \ + eor RR0, RKEYR; + +#define round_enc(n) \ + add_roundkey_enc(); \ + load_roundkey_enc(n); \ + \ + F(RL0, RR0); \ + F(RR0, RL0); + +#define load_roundkey_dec(n) \ + ldr RKEYL, [CTXs2, #((p - s2) + (4 * ((n) - 1) + 4))]; \ + ldr RKEYR, 
[CTXs2, #((p - s2) + (4 * ((n) - 1) + 0))]; + +#define add_roundkey_dec() \ + eor RL0, RKEYL; \ + eor RR0, RKEYR; + +#define round_dec(n) \ + add_roundkey_dec(); \ + load_roundkey_dec(n); \ + \ + F(RL0, RR0); \ + F(RR0, RL0); + +#define read_block_aligned(rin, offs, l0, r0, convert, rtmp) \ + ldr l0, [rin, #((offs) + 0)]; \ + ldr r0, [rin, #((offs) + 4)]; \ + convert(l0, rtmp); \ + convert(r0, rtmp); + +#define write_block_aligned(rout, offs, l0, r0, convert, rtmp) \ + convert(l0, rtmp); \ + convert(r0, rtmp); \ + str l0, [rout, #((offs) + 0)]; \ + str r0, [rout, #((offs) + 4)]; + +#ifdef __ARM_FEATURE_UNALIGNED + /* unaligned word reads allowed */ + #define read_block(rin, offs, l0, r0, rtmp0) \ + read_block_aligned(rin, offs, l0, r0, host_to_be) + + #define write_block(rout, offs, r0, l0, rtmp0, rtmp1) \ + write_block_aligned(rout, offs, r0, l0, be_to_host) + + #define read_block_host(rin, offs, l0, r0, rtmp0) \ + read_block_aligned(rin, offs, l0, r0, host_to_host) + + #define write_block_host(rout, offs, r0, l0, rtmp0, rtmp1) \ + write_block_aligned(rout, offs, r0, l0, host_to_host) +#else + /* need to handle unaligned reads by byte reads */ + #define read_block(rin, offs, l0, r0, rtmp0) \ + tst rin, #3; \ + beq 1f; \ + ldr_unaligned_be(l0, rin, (offs) + 0, rtmp0); \ + ldr_unaligned_be(r0, rin, (offs) + 4, rtmp0); \ + b 2f; \ + 1:;\ + read_block_aligned(rin, offs, l0, r0, host_to_be, rtmp0); \ + 2:; + + #define write_block(rout, offs, l0, r0, rtmp0, rtmp1) \ + tst rout, #3; \ + beq 1f; \ + str_unaligned_be(l0, rout, (offs) + 0, rtmp0, rtmp1); \ + str_unaligned_be(r0, rout, (offs) + 4, rtmp0, rtmp1); \ + b 2f; \ + 1:;\ + write_block_aligned(rout, offs, l0, r0, be_to_host, rtmp0); \ + 2:; + + #define read_block_host(rin, offs, l0, r0, rtmp0) \ + tst rin, #3; \ + beq 1f; \ + ldr_unaligned_host(l0, rin, (offs) + 0, rtmp0); \ + ldr_unaligned_host(r0, rin, (offs) + 4, rtmp0); \ + b 2f; \ + 1:;\ + read_block_aligned(rin, offs, l0, r0, host_to_host, rtmp0); \ + 2:; + + 
#define write_block_host(rout, offs, l0, r0, rtmp0, rtmp1) \ + tst rout, #3; \ + beq 1f; \ + str_unaligned_host(l0, rout, (offs) + 0, rtmp0, rtmp1); \ + str_unaligned_host(r0, rout, (offs) + 4, rtmp0, rtmp1); \ + b 2f; \ + 1:;\ + write_block_aligned(rout, offs, l0, r0, host_to_host); \ + 2:; +#endif + +.align 3 +.type __blowfish_enc_blk1,%function; + +__blowfish_enc_blk1: + /* input: + * preloaded: CTX + * [RL0, RR0]: src + * output: + * [RR0, RL0]: dst + */ + push {%lr}; + + add CTXs1, CTXs0, #(s1 - s0); + add CTXs2, CTXs0, #(s2 - s0); + mov RMASK, #(0xff << 2); /* byte mask */ + add CTXs3, CTXs1, #(s3 - s1); + + load_roundkey_enc(0); + round_enc(2); + round_enc(4); + round_enc(6); + round_enc(8); + round_enc(10); + round_enc(12); + round_enc(14); + round_enc(16); + add_roundkey_enc(); + + pop {%pc}; +.size __blowfish_enc_blk1,.-__blowfish_enc_blk1; + +.align 8 +.globl _gcry_blowfish_arm_do_encrypt +.type _gcry_blowfish_arm_do_encrypt,%function; + +_gcry_blowfish_arm_do_encrypt: + /* input: + * %r0: ctx, CTX + * %r1: u32 *ret_xl + * %r2: u32 *ret_xr + */ + push {%r2, %r4-%r11, %ip, %lr}; + + ldr RL0, [%r1]; + ldr RR0, [%r2]; + + bl __blowfish_enc_blk1; + + pop {%r2}; + str RR0, [%r1]; + str RL0, [%r2]; + + pop {%r4-%r11, %ip, %pc}; +.size _gcry_blowfish_arm_do_encrypt,.-_gcry_blowfish_arm_do_encrypt; + +.align 3 +.global _gcry_blowfish_arm_encrypt_block +.type _gcry_blowfish_arm_encrypt_block,%function; + +_gcry_blowfish_arm_encrypt_block: + /* input: + * %r0: ctx, CTX + * %r1: dst + * %r2: src + */ + push {%r4-%r11, %ip, %lr}; + + read_block(%r2, 0, RL0, RR0, RT0); + + bl __blowfish_enc_blk1; + + write_block(%r1, 0, RR0, RL0, RT0, RT1); + + pop {%r4-%r11, %ip, %pc}; +.size _gcry_blowfish_arm_encrypt_block,.-_gcry_blowfish_arm_encrypt_block; + +.align 3 +.global _gcry_blowfish_arm_decrypt_block +.type _gcry_blowfish_arm_decrypt_block,%function; + +_gcry_blowfish_arm_decrypt_block: + /* input: + * %r0: ctx, CTX + * %r1: dst + * %r2: src + */ + push {%r4-%r11, %ip, 
%lr}; + + add CTXs1, CTXs0, #(s1 - s0); + add CTXs2, CTXs0, #(s2 - s0); + mov RMASK, #(0xff << 2); /* byte mask */ + add CTXs3, CTXs1, #(s3 - s1); + + read_block(%r2, 0, RL0, RR0, RT0); + + load_roundkey_dec(17); + round_dec(15); + round_dec(13); + round_dec(11); + round_dec(9); + round_dec(7); + round_dec(5); + round_dec(3); + round_dec(1); + add_roundkey_dec(); + + write_block(%r1, 0, RR0, RL0, RT0, RT1); + + pop {%r4-%r11, %ip, %pc}; +.size _gcry_blowfish_arm_decrypt_block,.-_gcry_blowfish_arm_decrypt_block; + +/*********************************************************************** + * 2-way blowfish + ***********************************************************************/ +#define F2(n, l0, r0, l1, r1, set_nextk, dec) \ + \ + and RT0, RMASK, l0, lsr#(24 - 2); \ + and RT1, RMASK, l0, lsr#(16 - 2); \ + and RT2, RMASK, l0, lsr#(8 - 2); \ + add RT1, #(s1 - s0); \ + \ + ldr RT0, [CTXs0, RT0]; \ + and RT3, RMASK, l0, lsl#2; \ + ldr RT1, [CTXs0, RT1]; \ + add RT3, #(s3 - s2); \ + ldr RT2, [CTXs2, RT2]; \ + add RT0, RT1; \ + ldr RT3, [CTXs2, RT3]; \ + \ + and RT1, RMASK, l1, lsr#(24 - 2); \ + eor RT0, RT2; \ + and RT2, RMASK, l1, lsr#(16 - 2); \ + add RT0, RT3; \ + add RT2, #(s1 - s0); \ + and RT3, RMASK, l1, lsr#(8 - 2); \ + eor r0, RT0; \ + \ + ldr RT1, [CTXs0, RT1]; \ + and RT0, RMASK, l1, lsl#2; \ + ldr RT2, [CTXs0, RT2]; \ + add RT0, #(s3 - s2); \ + ldr RT3, [CTXs2, RT3]; \ + add RT1, RT2; \ + ldr RT0, [CTXs2, RT0]; \ + \ + and RT2, RMASK, r0, lsr#(24 - 2); \ + eor RT1, RT3; \ + and RT3, RMASK, r0, lsr#(16 - 2); \ + add RT1, RT0; \ + add RT3, #(s1 - s0); \ + and RT0, RMASK, r0, lsr#(8 - 2); \ + eor r1, RT1; \ + \ + ldr RT2, [CTXs0, RT2]; \ + and RT1, RMASK, r0, lsl#2; \ + ldr RT3, [CTXs0, RT3]; \ + add RT1, #(s3 - s2); \ + ldr RT0, [CTXs2, RT0]; \ + add RT2, RT3; \ + ldr RT1, [CTXs2, RT1]; \ + \ + and RT3, RMASK, r1, lsr#(24 - 2); \ + eor RT2, RT0; \ + and RT0, RMASK, r1, lsr#(16 - 2); \ + add RT2, RT1; \ + add RT0, #(s1 - s0); \ + and RT1, RMASK, r1, lsr#(8 - 
2); \ + eor l0, RT2; \ + \ + ldr RT3, [CTXs0, RT3]; \ + and RT2, RMASK, r1, lsl#2; \ + ldr RT0, [CTXs0, RT0]; \ + add RT2, #(s3 - s2); \ + ldr RT1, [CTXs2, RT1]; \ + eor l1, RKEYL; \ + ldr RT2, [CTXs2, RT2]; \ + \ + eor r0, RKEYR; \ + add RT3, RT0; \ + eor r1, RKEYR; \ + eor RT3, RT1; \ + eor l0, RKEYL; \ + add RT3, RT2; \ + set_nextk(RKEYL, (p - s2) + (4 * (n) + ((dec) * 4))); \ + eor l1, RT3; \ + set_nextk(RKEYR, (p - s2) + (4 * (n) + (!(dec) * 4))); + +#define load_n_add_roundkey_enc2(n) \ + load_roundkey_enc(n); \ + eor RL0, RKEYL; \ + eor RR0, RKEYR; \ + eor RL1, RKEYL; \ + eor RR1, RKEYR; \ + load_roundkey_enc((n) + 2); + +#define next_key(reg, offs) \ + ldr reg, [CTXs2, #(offs)]; + +#define dummy(x, y) /* do nothing */ + +#define round_enc2(n, load_next_key) \ + F2((n) + 2, RL0, RR0, RL1, RR1, load_next_key, 0); + +#define load_n_add_roundkey_dec2(n) \ + load_roundkey_dec(n); \ + eor RL0, RKEYL; \ + eor RR0, RKEYR; \ + eor RL1, RKEYL; \ + eor RR1, RKEYR; \ + load_roundkey_dec((n) - 2); + +#define round_dec2(n, load_next_key) \ + F2((n) - 3, RL0, RR0, RL1, RR1, load_next_key, 1); + +#define read_block2_aligned(rin, l0, r0, l1, r1, convert, rtmp) \ + ldr l0, [rin, #(0)]; \ + ldr r0, [rin, #(4)]; \ + convert(l0, rtmp); \ + ldr l1, [rin, #(8)]; \ + convert(r0, rtmp); \ + ldr r1, [rin, #(12)]; \ + convert(l1, rtmp); \ + convert(r1, rtmp); + +#define write_block2_aligned(rout, l0, r0, l1, r1, convert, rtmp) \ + convert(l0, rtmp); \ + convert(r0, rtmp); \ + convert(l1, rtmp); \ + str l0, [rout, #(0)]; \ + convert(r1, rtmp); \ + str r0, [rout, #(4)]; \ + str l1, [rout, #(8)]; \ + str r1, [rout, #(12)]; + +#ifdef __ARM_FEATURE_UNALIGNED + /* unaligned word reads allowed */ + #define read_block2(rin, l0, r0, l1, r1, rtmp0) \ + read_block2_aligned(rin, l0, r0, l1, r1, host_to_be, rtmp0) + + #define write_block2(rout, l0, r0, l1, r1, rtmp0, rtmp1) \ + write_block2_aligned(rout, l0, r0, l1, r1, be_to_host, rtmp0) + + #define read_block2_host(rin, l0, r0, l1, r1, rtmp0) \ 
+ read_block2_aligned(rin, l0, r0, l1, r1, host_to_host, rtmp0) + + #define write_block2_host(rout, l0, r0, l1, r1, rtmp0, rtmp1) \ + write_block2_aligned(rout, l0, r0, l1, r1, host_to_host, rtmp0) +#else + /* need to handle unaligned reads by byte reads */ + #define read_block2(rin, l0, r0, l1, r1, rtmp0) \ + tst rin, #3; \ + beq 1f; \ + ldr_unaligned_be(l0, rin, 0, rtmp0); \ + ldr_unaligned_be(r0, rin, 4, rtmp0); \ + ldr_unaligned_be(l1, rin, 8, rtmp0); \ + ldr_unaligned_be(r1, rin, 12, rtmp0); \ + b 2f; \ + 1:;\ + read_block2_aligned(rin, l0, r0, l1, r1, host_to_be, rtmp0); \ + 2:; + + #define write_block2(rout, l0, r0, l1, r1, rtmp0, rtmp1) \ + tst rout, #3; \ + beq 1f; \ + str_unaligned_be(l0, rout, 0, rtmp0, rtmp1); \ + str_unaligned_be(r0, rout, 4, rtmp0, rtmp1); \ + str_unaligned_be(l1, rout, 8, rtmp0, rtmp1); \ + str_unaligned_be(r1, rout, 12, rtmp0, rtmp1); \ + b 2f; \ + 1:;\ + write_block2_aligned(rout, l0, r0, l1, r1, be_to_host, rtmp0); \ + 2:; + + #define read_block2_host(rin, l0, r0, l1, r1, rtmp0) \ + tst rin, #3; \ + beq 1f; \ + ldr_unaligned_host(l0, rin, 0, rtmp0); \ + ldr_unaligned_host(r0, rin, 4, rtmp0); \ + ldr_unaligned_host(l1, rin, 8, rtmp0); \ + ldr_unaligned_host(r1, rin, 12, rtmp0); \ + b 2f; \ + 1:;\ + read_block2_aligned(rin, l0, r0, l1, r1, host_to_host, rtmp0); \ + 2:; + + #define write_block2_host(rout, l0, r0, l1, r1, rtmp0, rtmp1) \ + tst rout, #3; \ + beq 1f; \ + str_unaligned_host(l0, rout, 0, rtmp0, rtmp1); \ + str_unaligned_host(r0, rout, 4, rtmp0, rtmp1); \ + str_unaligned_host(l1, rout, 8, rtmp0, rtmp1); \ + str_unaligned_host(r1, rout, 12, rtmp0, rtmp1); \ + b 2f; \ + 1:;\ + write_block2_aligned(rout, l0, r0, l1, r1, host_to_host, rtmp0); \ + 2:; +#endif + +.align 3 +.type _gcry_blowfish_arm_enc_blk2,%function; + +_gcry_blowfish_arm_enc_blk2: + /* input: + * preloaded: CTX + * [RL0, RR0], [RL1, RR1]: src + * output: + * [RR0, RL0], [RR1, RL1]: dst + */ + push {RT0,%lr}; + + add CTXs2, CTXs0, #(s2 - s0); + mov RMASK, #(0xff 
<< 2); /* byte mask */ + + load_n_add_roundkey_enc2(0); + round_enc2(2, next_key); + round_enc2(4, next_key); + round_enc2(6, next_key); + round_enc2(8, next_key); + round_enc2(10, next_key); + round_enc2(12, next_key); + round_enc2(14, next_key); + round_enc2(16, dummy); + + host_to_be(RR0, RT0); + host_to_be(RL0, RT0); + host_to_be(RR1, RT0); + host_to_be(RL1, RT0); + + pop {RT0,%pc}; +.size _gcry_blowfish_arm_enc_blk2,.-_gcry_blowfish_arm_enc_blk2; + +.align 3 +.globl _gcry_blowfish_arm_cfb_dec; +.type _gcry_blowfish_arm_cfb_dec,%function; + +_gcry_blowfish_arm_cfb_dec: + /* input: + * %r0: CTX + * %r1: dst (2 blocks) + * %r2: src (2 blocks) + * %r3: iv (64bit) + */ + push {%r2, %r4-%r11, %ip, %lr}; + + mov %lr, %r3; + + /* Load input (iv/%r3 is aligned, src/%r2 might not be) */ + ldm %r3, {RL0, RR0}; + host_to_be(RL0, RT0); + host_to_be(RR0, RT0); + read_block(%r2, 0, RL1, RR1, RT0); + + /* Update IV, load src[1] and save to iv[0] */ + read_block_host(%r2, 8, %r5, %r6, RT0); + stm %lr, {%r5, %r6}; + + bl _gcry_blowfish_arm_enc_blk2; + /* result in RR0:RL0, RR1:RL1 = %r4:%r3, %r10:%r9 */ + + /* %r1: dst, %r0: %src */ + pop {%r0}; + + /* dst = src ^ result */ + read_block2_host(%r0, %r5, %r6, %r7, %r8, %lr); + eor %r5, %r4; + eor %r6, %r3; + eor %r7, %r10; + eor %r8, %r9; + write_block2_host(%r1, %r5, %r6, %r7, %r8, %r9, %r10); + + pop {%r4-%r11, %ip, %pc}; +.ltorg +.size _gcry_blowfish_arm_cfb_dec,.-_gcry_blowfish_arm_cfb_dec; + +.align 3 +.globl _gcry_blowfish_arm_ctr_enc; +.type _gcry_blowfish_arm_ctr_enc,%function; + +_gcry_blowfish_arm_ctr_enc: + /* input: + * %r0: CTX + * %r1: dst (2 blocks) + * %r2: src (2 blocks) + * %r3: iv (64bit, big-endian) + */ + push {%r2, %r4-%r11, %ip, %lr}; + + mov %lr, %r3; + + /* Load IV (big => host endian) */ + read_block_aligned(%lr, 0, RL0, RR0, be_to_host, RT0); + + /* Construct IVs */ + adds RR1, RR0, #1; /* +1 */ + adc RL1, RL0, #0; + adds %r6, RR1, #1; /* +2 */ + adc %r5, RL1, #0; + + /* Store new IV (host => 
big-endian) */ + write_block_aligned(%lr, 0, %r5, %r6, host_to_be, RT0); + + bl _gcry_blowfish_arm_enc_blk2; + /* result in RR0:RL0, RR1:RL1 = %r4:%r3, %r10:%r9 */ + + /* %r1: dst, %r0: %src */ + pop {%r0}; + + /* XOR key-stream with plaintext */ + read_block2_host(%r0, %r5, %r6, %r7, %r8, %lr); + eor %r5, %r4; + eor %r6, %r3; + eor %r7, %r10; + eor %r8, %r9; + write_block2_host(%r1, %r5, %r6, %r7, %r8, %r9, %r10); + + pop {%r4-%r11, %ip, %pc}; +.ltorg +.size _gcry_blowfish_arm_ctr_enc,.-_gcry_blowfish_arm_ctr_enc; + +.align 3 +.type _gcry_blowfish_arm_dec_blk2,%function; + +_gcry_blowfish_arm_dec_blk2: + /* input: + * preloaded: CTX + * [RL0, RR0], [RL1, RR1]: src + * output: + * [RR0, RL0], [RR1, RL1]: dst + */ + add CTXs2, CTXs0, #(s2 - s0); + mov RMASK, #(0xff << 2); /* byte mask */ + + load_n_add_roundkey_dec2(17); + round_dec2(15, next_key); + round_dec2(13, next_key); + round_dec2(11, next_key); + round_dec2(9, next_key); + round_dec2(7, next_key); + round_dec2(5, next_key); + round_dec2(3, next_key); + round_dec2(1, dummy); + + host_to_be(RR0, RT0); + host_to_be(RL0, RT0); + host_to_be(RR1, RT0); + host_to_be(RL1, RT0); + + b .Ldec_cbc_tail; +.ltorg +.size _gcry_blowfish_arm_dec_blk2,.-_gcry_blowfish_arm_dec_blk2; + +.align 3 +.globl _gcry_blowfish_arm_cbc_dec; +.type _gcry_blowfish_arm_cbc_dec,%function; + +_gcry_blowfish_arm_cbc_dec: + /* input: + * %r0: CTX + * %r1: dst (2 blocks) + * %r2: src (2 blocks) + * %r3: iv (64bit) + */ + push {%r2-%r11, %ip, %lr}; + + read_block2(%r2, RL0, RR0, RL1, RR1, RT0); + + /* dec_blk2 is only used by cbc_dec, jump directly in/out instead + * of function call. */ + b _gcry_blowfish_arm_dec_blk2; +.Ldec_cbc_tail: + /* result in RR0:RL0, RR1:RL1 = %r4:%r3, %r10:%r9 */ + + /* %r0: %src, %r1: dst, %r2: iv */ + pop {%r0, %r2}; + + /* load IV+1 (src[0]) to %r7:%r8. Might be unaligned. */ + read_block_host(%r0, 0, %r7, %r8, %r5); + /* load IV (iv[0]) to %r5:%r6. 'iv' is aligned. 
*/ + ldm %r2, {%r5, %r6}; + + /* out[1] ^= IV+1 */ + eor %r10, %r7; + eor %r9, %r8; + /* out[0] ^= IV */ + eor %r4, %r5; + eor %r3, %r6; + + /* load IV+2 (src[1]) to %r7:%r8. Might be unaligned. */ + read_block_host(%r0, 8, %r7, %r8, %r5); + /* store IV+2 to iv[0] (aligned). */ + stm %r2, {%r7, %r8}; + + /* store result to dst[0-3]. Might be unaligned. */ + write_block2_host(%r1, %r4, %r3, %r10, %r9, %r5, %r6); + + pop {%r4-%r11, %ip, %pc}; +.ltorg +.size _gcry_blowfish_arm_cbc_dec,.-_gcry_blowfish_arm_cbc_dec; + +#endif /*HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS*/ +#endif /*__ARMEL__*/ diff --git a/cipher/blowfish-armv6.S b/cipher/blowfish-armv6.S deleted file mode 100644 index eea879f..0000000 --- a/cipher/blowfish-armv6.S +++ /dev/null @@ -1,730 +0,0 @@ -/* blowfish-armv6.S - ARM assembly implementation of Blowfish cipher - * - * Copyright © 2013 Jussi Kivilinna - * - * This file is part of Libgcrypt. - * - * Libgcrypt is free software; you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation; either version 2.1 of - * the License, or (at your option) any later version. - * - * Libgcrypt is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public - * License along with this program; if not, see <http://www.gnu.org/licenses/>.
- */ - -#include <config.h> - -#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) -#ifdef HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS - -.text - -.syntax unified -.arm - -/* structure of crypto context */ -#define s0 0 -#define s1 (s0 + (1 * 256) * 4) -#define s2 (s0 + (2 * 256) * 4) -#define s3 (s0 + (3 * 256) * 4) -#define p (s3 + (1 * 256) * 4) - -/* register macros */ -#define CTXs0 %r0 -#define CTXs1 %r9 -#define CTXs2 %r8 -#define CTXs3 %r10 -#define RMASK %lr -#define RKEYL %r2 -#define RKEYR %ip - -#define RL0 %r3 -#define RR0 %r4 - -#define RL1 %r9 -#define RR1 %r10 - -#define RT0 %r11 -#define RT1 %r7 -#define RT2 %r5 -#define RT3 %r6 - -/* helper macros */ -#define ldr_unaligned_le(rout, rsrc, offs, rtmp) \ - ldrb rout, [rsrc, #((offs) + 0)]; \ - ldrb rtmp, [rsrc, #((offs) + 1)]; \ - orr rout, rout, rtmp, lsl #8; \ - ldrb rtmp, [rsrc, #((offs) + 2)]; \ - orr rout, rout, rtmp, lsl #16; \ - ldrb rtmp, [rsrc, #((offs) + 3)]; \ - orr rout, rout, rtmp, lsl #24; - -#define str_unaligned_le(rin, rdst, offs, rtmp0, rtmp1) \ - mov rtmp0, rin, lsr #8; \ - strb rin, [rdst, #((offs) + 0)]; \ - mov rtmp1, rin, lsr #16; \ - strb rtmp0, [rdst, #((offs) + 1)]; \ - mov rtmp0, rin, lsr #24; \ - strb rtmp1, [rdst, #((offs) + 2)]; \ - strb rtmp0, [rdst, #((offs) + 3)]; - -#define ldr_unaligned_be(rout, rsrc, offs, rtmp) \ - ldrb rout, [rsrc, #((offs) + 3)]; \ - ldrb rtmp, [rsrc, #((offs) + 2)]; \ - orr rout, rout, rtmp, lsl #8; \ - ldrb rtmp, [rsrc, #((offs) + 1)]; \ - orr rout, rout, rtmp, lsl #16; \ - ldrb rtmp, [rsrc, #((offs) + 0)]; \ - orr rout, rout, rtmp, lsl #24; - -#define str_unaligned_be(rin, rdst, offs, rtmp0, rtmp1) \ - mov rtmp0, rin, lsr #8; \ - strb rin, [rdst, #((offs) + 3)]; \ - mov rtmp1, rin, lsr #16; \ - strb rtmp0, [rdst, #((offs) + 2)]; \ - mov rtmp0, rin, lsr #24; \ - strb rtmp1, [rdst, #((offs) + 1)]; \ - strb rtmp0, [rdst, #((offs) + 0)]; - -#ifdef __ARMEL__ - #define ldr_unaligned_host ldr_unaligned_le - #define str_unaligned_host str_unaligned_le - - /* bswap on
little-endian */ - #define host_to_be(reg) \ - rev reg, reg; - #define be_to_host(reg) \ - rev reg, reg; -#else - #define ldr_unaligned_host ldr_unaligned_be - #define str_unaligned_host str_unaligned_be - - /* nop on big-endian */ - #define host_to_be(reg) /*_*/ - #define be_to_host(reg) /*_*/ -#endif - -#define host_to_host(x) /*_*/ - -/*********************************************************************** - * 1-way blowfish - ***********************************************************************/ -#define F(l, r) \ - and RT0, RMASK, l, lsr#(24 - 2); \ - and RT1, RMASK, l, lsr#(16 - 2); \ - ldr RT0, [CTXs0, RT0]; \ - and RT2, RMASK, l, lsr#(8 - 2); \ - ldr RT1, [CTXs1, RT1]; \ - and RT3, RMASK, l, lsl#2; \ - ldr RT2, [CTXs2, RT2]; \ - add RT0, RT1; \ - ldr RT3, [CTXs3, RT3]; \ - eor RT0, RT2; \ - add RT0, RT3; \ - eor r, RT0; - -#define load_roundkey_enc(n) \ - ldr RKEYL, [CTXs2, #((p - s2) + (4 * (n) + 0))]; \ - ldr RKEYR, [CTXs2, #((p - s2) + (4 * (n) + 4))]; - -#define add_roundkey_enc() \ - eor RL0, RKEYL; \ - eor RR0, RKEYR; - -#define round_enc(n) \ - add_roundkey_enc(); \ - load_roundkey_enc(n); \ - \ - F(RL0, RR0); \ - F(RR0, RL0); - -#define load_roundkey_dec(n) \ - ldr RKEYL, [CTXs2, #((p - s2) + (4 * ((n) - 1) + 4))]; \ - ldr RKEYR, [CTXs2, #((p - s2) + (4 * ((n) - 1) + 0))]; - -#define add_roundkey_dec() \ - eor RL0, RKEYL; \ - eor RR0, RKEYR; - -#define round_dec(n) \ - add_roundkey_dec(); \ - load_roundkey_dec(n); \ - \ - F(RL0, RR0); \ - F(RR0, RL0); - -#define read_block_aligned(rin, offs, l0, r0, convert) \ - ldr l0, [rin, #((offs) + 0)]; \ - ldr r0, [rin, #((offs) + 4)]; \ - convert(l0); \ - convert(r0); - -#define write_block_aligned(rout, offs, l0, r0, convert) \ - convert(l0); \ - convert(r0); \ - str l0, [rout, #((offs) + 0)]; \ - str r0, [rout, #((offs) + 4)]; - -#ifdef __ARM_FEATURE_UNALIGNED - /* unaligned word reads allowed */ - #define read_block(rin, offs, l0, r0, rtmp0) \ - read_block_aligned(rin, offs, l0, r0, host_to_be) - - 
#define write_block(rout, offs, r0, l0, rtmp0, rtmp1) \ - write_block_aligned(rout, offs, r0, l0, be_to_host) - - #define read_block_host(rin, offs, l0, r0, rtmp0) \ - read_block_aligned(rin, offs, l0, r0, host_to_host) - - #define write_block_host(rout, offs, r0, l0, rtmp0, rtmp1) \ - write_block_aligned(rout, offs, r0, l0, host_to_host) -#else - /* need to handle unaligned reads by byte reads */ - #define read_block(rin, offs, l0, r0, rtmp0) \ - tst rin, #3; \ - beq 1f; \ - ldr_unaligned_be(l0, rin, (offs) + 0, rtmp0); \ - ldr_unaligned_be(r0, rin, (offs) + 4, rtmp0); \ - b 2f; \ - 1:;\ - read_block_aligned(rin, offs, l0, r0, host_to_be); \ - 2:; - - #define write_block(rout, offs, l0, r0, rtmp0, rtmp1) \ - tst rout, #3; \ - beq 1f; \ - str_unaligned_be(l0, rout, (offs) + 0, rtmp0, rtmp1); \ - str_unaligned_be(r0, rout, (offs) + 4, rtmp0, rtmp1); \ - b 2f; \ - 1:;\ - write_block_aligned(rout, offs, l0, r0, be_to_host); \ - 2:; - - #define read_block_host(rin, offs, l0, r0, rtmp0) \ - tst rin, #3; \ - beq 1f; \ - ldr_unaligned_host(l0, rin, (offs) + 0, rtmp0); \ - ldr_unaligned_host(r0, rin, (offs) + 4, rtmp0); \ - b 2f; \ - 1:;\ - read_block_aligned(rin, offs, l0, r0, host_to_host); \ - 2:; - - #define write_block_host(rout, offs, l0, r0, rtmp0, rtmp1) \ - tst rout, #3; \ - beq 1f; \ - str_unaligned_host(l0, rout, (offs) + 0, rtmp0, rtmp1); \ - str_unaligned_host(r0, rout, (offs) + 4, rtmp0, rtmp1); \ - b 2f; \ - 1:;\ - write_block_aligned(rout, offs, l0, r0, host_to_host); \ - 2:; -#endif - -.align 3 -.type __blowfish_enc_blk1,%function; - -__blowfish_enc_blk1: - /* input: - * preloaded: CTX - * [RL0, RR0]: src - * output: - * [RR0, RL0]: dst - */ - push {%lr}; - - add CTXs1, CTXs0, #(s1 - s0); - add CTXs2, CTXs0, #(s2 - s0); - mov RMASK, #(0xff << 2); /* byte mask */ - add CTXs3, CTXs1, #(s3 - s1); - - load_roundkey_enc(0); - round_enc(2); - round_enc(4); - round_enc(6); - round_enc(8); - round_enc(10); - round_enc(12); - round_enc(14); - round_enc(16); - 
add_roundkey_enc(); - - pop {%pc}; -.size __blowfish_enc_blk1,.-__blowfish_enc_blk1; - -.align 8 -.globl _gcry_blowfish_armv6_do_encrypt -.type _gcry_blowfish_armv6_do_encrypt,%function; - -_gcry_blowfish_armv6_do_encrypt: - /* input: - * %r0: ctx, CTX - * %r1: u32 *ret_xl - * %r2: u32 *ret_xr - */ - push {%r2, %r4-%r11, %ip, %lr}; - - ldr RL0, [%r1]; - ldr RR0, [%r2]; - - bl __blowfish_enc_blk1; - - pop {%r2}; - str RR0, [%r1]; - str RL0, [%r2]; - - pop {%r4-%r11, %ip, %pc}; -.size _gcry_blowfish_armv6_do_encrypt,.-_gcry_blowfish_armv6_do_encrypt; - -.align 3 -.global _gcry_blowfish_armv6_encrypt_block -.type _gcry_blowfish_armv6_encrypt_block,%function; - -_gcry_blowfish_armv6_encrypt_block: - /* input: - * %r0: ctx, CTX - * %r1: dst - * %r2: src - */ - push {%r4-%r11, %ip, %lr}; - - read_block(%r2, 0, RL0, RR0, RT0); - - bl __blowfish_enc_blk1; - - write_block(%r1, 0, RR0, RL0, RT0, RT1); - - pop {%r4-%r11, %ip, %pc}; -.size _gcry_blowfish_armv6_encrypt_block,.-_gcry_blowfish_armv6_encrypt_block; - -.align 3 -.global _gcry_blowfish_armv6_decrypt_block -.type _gcry_blowfish_armv6_decrypt_block,%function; - -_gcry_blowfish_armv6_decrypt_block: - /* input: - * %r0: ctx, CTX - * %r1: dst - * %r2: src - */ - push {%r4-%r11, %ip, %lr}; - - add CTXs1, CTXs0, #(s1 - s0); - add CTXs2, CTXs0, #(s2 - s0); - mov RMASK, #(0xff << 2); /* byte mask */ - add CTXs3, CTXs1, #(s3 - s1); - - read_block(%r2, 0, RL0, RR0, RT0); - - load_roundkey_dec(17); - round_dec(15); - round_dec(13); - round_dec(11); - round_dec(9); - round_dec(7); - round_dec(5); - round_dec(3); - round_dec(1); - add_roundkey_dec(); - - write_block(%r1, 0, RR0, RL0, RT0, RT1); - - pop {%r4-%r11, %ip, %pc}; -.size _gcry_blowfish_armv6_decrypt_block,.-_gcry_blowfish_armv6_decrypt_block; - -/*********************************************************************** - * 2-way blowfish - ***********************************************************************/ -#define F2(n, l0, r0, l1, r1, set_nextk, dec) \ - \ - and 
RT0, RMASK, l0, lsr#(24 - 2); \ - and RT1, RMASK, l0, lsr#(16 - 2); \ - and RT2, RMASK, l0, lsr#(8 - 2); \ - add RT1, #(s1 - s0); \ - \ - ldr RT0, [CTXs0, RT0]; \ - and RT3, RMASK, l0, lsl#2; \ - ldr RT1, [CTXs0, RT1]; \ - add RT3, #(s3 - s2); \ - ldr RT2, [CTXs2, RT2]; \ - add RT0, RT1; \ - ldr RT3, [CTXs2, RT3]; \ - \ - and RT1, RMASK, l1, lsr#(24 - 2); \ - eor RT0, RT2; \ - and RT2, RMASK, l1, lsr#(16 - 2); \ - add RT0, RT3; \ - add RT2, #(s1 - s0); \ - and RT3, RMASK, l1, lsr#(8 - 2); \ - eor r0, RT0; \ - \ - ldr RT1, [CTXs0, RT1]; \ - and RT0, RMASK, l1, lsl#2; \ - ldr RT2, [CTXs0, RT2]; \ - add RT0, #(s3 - s2); \ - ldr RT3, [CTXs2, RT3]; \ - add RT1, RT2; \ - ldr RT0, [CTXs2, RT0]; \ - \ - and RT2, RMASK, r0, lsr#(24 - 2); \ - eor RT1, RT3; \ - and RT3, RMASK, r0, lsr#(16 - 2); \ - add RT1, RT0; \ - add RT3, #(s1 - s0); \ - and RT0, RMASK, r0, lsr#(8 - 2); \ - eor r1, RT1; \ - \ - ldr RT2, [CTXs0, RT2]; \ - and RT1, RMASK, r0, lsl#2; \ - ldr RT3, [CTXs0, RT3]; \ - add RT1, #(s3 - s2); \ - ldr RT0, [CTXs2, RT0]; \ - add RT2, RT3; \ - ldr RT1, [CTXs2, RT1]; \ - \ - and RT3, RMASK, r1, lsr#(24 - 2); \ - eor RT2, RT0; \ - and RT0, RMASK, r1, lsr#(16 - 2); \ - add RT2, RT1; \ - add RT0, #(s1 - s0); \ - and RT1, RMASK, r1, lsr#(8 - 2); \ - eor l0, RT2; \ - \ - ldr RT3, [CTXs0, RT3]; \ - and RT2, RMASK, r1, lsl#2; \ - ldr RT0, [CTXs0, RT0]; \ - add RT2, #(s3 - s2); \ - ldr RT1, [CTXs2, RT1]; \ - eor l1, RKEYL; \ - ldr RT2, [CTXs2, RT2]; \ - \ - eor r0, RKEYR; \ - add RT3, RT0; \ - eor r1, RKEYR; \ - eor RT3, RT1; \ - eor l0, RKEYL; \ - add RT3, RT2; \ - set_nextk(RKEYL, (p - s2) + (4 * (n) + ((dec) * 4))); \ - eor l1, RT3; \ - set_nextk(RKEYR, (p - s2) + (4 * (n) + (!(dec) * 4))); - -#define load_n_add_roundkey_enc2(n) \ - load_roundkey_enc(n); \ - eor RL0, RKEYL; \ - eor RR0, RKEYR; \ - eor RL1, RKEYL; \ - eor RR1, RKEYR; \ - load_roundkey_enc((n) + 2); - -#define next_key(reg, offs) \ - ldr reg, [CTXs2, #(offs)]; - -#define dummy(x, y) /* do nothing */ - -#define 
round_enc2(n, load_next_key) \ - F2((n) + 2, RL0, RR0, RL1, RR1, load_next_key, 0); - -#define load_n_add_roundkey_dec2(n) \ - load_roundkey_dec(n); \ - eor RL0, RKEYL; \ - eor RR0, RKEYR; \ - eor RL1, RKEYL; \ - eor RR1, RKEYR; \ - load_roundkey_dec((n) - 2); - -#define round_dec2(n, load_next_key) \ - F2((n) - 3, RL0, RR0, RL1, RR1, load_next_key, 1); - -#define read_block2_aligned(rin, l0, r0, l1, r1, convert) \ - ldr l0, [rin, #(0)]; \ - ldr r0, [rin, #(4)]; \ - convert(l0); \ - ldr l1, [rin, #(8)]; \ - convert(r0); \ - ldr r1, [rin, #(12)]; \ - convert(l1); \ - convert(r1); - -#define write_block2_aligned(rout, l0, r0, l1, r1, convert) \ - convert(l0); \ - convert(r0); \ - convert(l1); \ - str l0, [rout, #(0)]; \ - convert(r1); \ - str r0, [rout, #(4)]; \ - str l1, [rout, #(8)]; \ - str r1, [rout, #(12)]; - -#ifdef __ARM_FEATURE_UNALIGNED - /* unaligned word reads allowed */ - #define read_block2(rin, l0, r0, l1, r1, rtmp0) \ - read_block2_aligned(rin, l0, r0, l1, r1, host_to_be) - - #define write_block2(rout, l0, r0, l1, r1, rtmp0, rtmp1) \ - write_block2_aligned(rout, l0, r0, l1, r1, be_to_host) - - #define read_block2_host(rin, l0, r0, l1, r1, rtmp0) \ - read_block2_aligned(rin, l0, r0, l1, r1, host_to_host) - - #define write_block2_host(rout, l0, r0, l1, r1, rtmp0, rtmp1) \ - write_block2_aligned(rout, l0, r0, l1, r1, host_to_host) -#else - /* need to handle unaligned reads by byte reads */ - #define read_block2(rin, l0, r0, l1, r1, rtmp0) \ - tst rin, #3; \ - beq 1f; \ - ldr_unaligned_be(l0, rin, 0, rtmp0); \ - ldr_unaligned_be(r0, rin, 4, rtmp0); \ - ldr_unaligned_be(l1, rin, 8, rtmp0); \ - ldr_unaligned_be(r1, rin, 12, rtmp0); \ - b 2f; \ - 1:;\ - read_block2_aligned(rin, l0, r0, l1, r1, host_to_be); \ - 2:; - - #define write_block2(rout, l0, r0, l1, r1, rtmp0, rtmp1) \ - tst rout, #3; \ - beq 1f; \ - str_unaligned_be(l0, rout, 0, rtmp0, rtmp1); \ - str_unaligned_be(r0, rout, 4, rtmp0, rtmp1); \ - str_unaligned_be(l1, rout, 8, rtmp0, rtmp1); \ - 
str_unaligned_be(r1, rout, 12, rtmp0, rtmp1); \ - b 2f; \ - 1:;\ - write_block2_aligned(rout, l0, r0, l1, r1, be_to_host); \ - 2:; - - #define read_block2_host(rin, l0, r0, l1, r1, rtmp0) \ - tst rin, #3; \ - beq 1f; \ - ldr_unaligned_host(l0, rin, 0, rtmp0); \ - ldr_unaligned_host(r0, rin, 4, rtmp0); \ - ldr_unaligned_host(l1, rin, 8, rtmp0); \ - ldr_unaligned_host(r1, rin, 12, rtmp0); \ - b 2f; \ - 1:;\ - read_block2_aligned(rin, l0, r0, l1, r1, host_to_host); \ - 2:; - - #define write_block2_host(rout, l0, r0, l1, r1, rtmp0, rtmp1) \ - tst rout, #3; \ - beq 1f; \ - str_unaligned_host(l0, rout, 0, rtmp0, rtmp1); \ - str_unaligned_host(r0, rout, 4, rtmp0, rtmp1); \ - str_unaligned_host(l1, rout, 8, rtmp0, rtmp1); \ - str_unaligned_host(r1, rout, 12, rtmp0, rtmp1); \ - b 2f; \ - 1:;\ - write_block2_aligned(rout, l0, r0, l1, r1, host_to_host); \ - 2:; -#endif - -.align 3 -.type _gcry_blowfish_armv6_enc_blk2,%function; - -_gcry_blowfish_armv6_enc_blk2: - /* input: - * preloaded: CTX - * [RL0, RR0], [RL1, RR1]: src - * output: - * [RR0, RL0], [RR1, RL1]: dst - */ - push {%lr}; - - add CTXs2, CTXs0, #(s2 - s0); - mov RMASK, #(0xff << 2); /* byte mask */ - - load_n_add_roundkey_enc2(0); - round_enc2(2, next_key); - round_enc2(4, next_key); - round_enc2(6, next_key); - round_enc2(8, next_key); - round_enc2(10, next_key); - round_enc2(12, next_key); - round_enc2(14, next_key); - round_enc2(16, dummy); - - host_to_be(RR0); - host_to_be(RL0); - host_to_be(RR1); - host_to_be(RL1); - - pop {%pc}; -.size _gcry_blowfish_armv6_enc_blk2,.-_gcry_blowfish_armv6_enc_blk2; - -.align 3 -.globl _gcry_blowfish_armv6_cfb_dec; -.type _gcry_blowfish_armv6_cfb_dec,%function; - -_gcry_blowfish_armv6_cfb_dec: - /* input: - * %r0: CTX - * %r1: dst (2 blocks) - * %r2: src (2 blocks) - * %r3: iv (64bit) - */ - push {%r2, %r4-%r11, %ip, %lr}; - - mov %lr, %r3; - - /* Load input (iv/%r3 is aligned, src/%r2 might not be) */ - ldm %r3, {RL0, RR0}; - host_to_be(RL0); - host_to_be(RR0); - 
read_block(%r2, 0, RL1, RR1, RT0); - - /* Update IV, load src[1] and save to iv[0] */ - read_block_host(%r2, 8, %r5, %r6, RT0); - stm %lr, {%r5, %r6}; - - bl _gcry_blowfish_armv6_enc_blk2; - /* result in RR0:RL0, RR1:RL1 = %r4:%r3, %r10:%r9 */ - - /* %r1: dst, %r0: %src */ - pop {%r0}; - - /* dst = src ^ result */ - read_block2_host(%r0, %r5, %r6, %r7, %r8, %lr); - eor %r5, %r4; - eor %r6, %r3; - eor %r7, %r10; - eor %r8, %r9; - write_block2_host(%r1, %r5, %r6, %r7, %r8, %r9, %r10); - - pop {%r4-%r11, %ip, %pc}; -.ltorg -.size _gcry_blowfish_armv6_cfb_dec,.-_gcry_blowfish_armv6_cfb_dec; - -.align 3 -.globl _gcry_blowfish_armv6_ctr_enc; -.type _gcry_blowfish_armv6_ctr_enc,%function; - -_gcry_blowfish_armv6_ctr_enc: - /* input: - * %r0: CTX - * %r1: dst (2 blocks) - * %r2: src (2 blocks) - * %r3: iv (64bit, big-endian) - */ - push {%r2, %r4-%r11, %ip, %lr}; - - mov %lr, %r3; - - /* Load IV (big => host endian) */ - read_block_aligned(%lr, 0, RL0, RR0, be_to_host); - - /* Construct IVs */ - adds RR1, RR0, #1; /* +1 */ - adc RL1, RL0, #0; - adds %r6, RR1, #1; /* +2 */ - adc %r5, RL1, #0; - - /* Store new IV (host => big-endian) */ - write_block_aligned(%lr, 0, %r5, %r6, host_to_be); - - bl _gcry_blowfish_armv6_enc_blk2; - /* result in RR0:RL0, RR1:RL1 = %r4:%r3, %r10:%r9 */ - - /* %r1: dst, %r0: %src */ - pop {%r0}; - - /* XOR key-stream with plaintext */ - read_block2_host(%r0, %r5, %r6, %r7, %r8, %lr); - eor %r5, %r4; - eor %r6, %r3; - eor %r7, %r10; - eor %r8, %r9; - write_block2_host(%r1, %r5, %r6, %r7, %r8, %r9, %r10); - - pop {%r4-%r11, %ip, %pc}; -.ltorg -.size _gcry_blowfish_armv6_ctr_enc,.-_gcry_blowfish_armv6_ctr_enc; - -.align 3 -.type _gcry_blowfish_armv6_dec_blk2,%function; - -_gcry_blowfish_armv6_dec_blk2: - /* input: - * preloaded: CTX - * [RL0, RR0], [RL1, RR1]: src - * output: - * [RR0, RL0], [RR1, RL1]: dst - */ - add CTXs2, CTXs0, #(s2 - s0); - mov RMASK, #(0xff << 2); /* byte mask */ - - load_n_add_roundkey_dec2(17); - round_dec2(15, next_key); - 
round_dec2(13, next_key); - round_dec2(11, next_key); - round_dec2(9, next_key); - round_dec2(7, next_key); - round_dec2(5, next_key); - round_dec2(3, next_key); - round_dec2(1, dummy); - - host_to_be(RR0); - host_to_be(RL0); - host_to_be(RR1); - host_to_be(RL1); - - b .Ldec_cbc_tail; -.ltorg -.size _gcry_blowfish_armv6_dec_blk2,.-_gcry_blowfish_armv6_dec_blk2; - -.align 3 -.globl _gcry_blowfish_armv6_cbc_dec; -.type _gcry_blowfish_armv6_cbc_dec,%function; - -_gcry_blowfish_armv6_cbc_dec: - /* input: - * %r0: CTX - * %r1: dst (2 blocks) - * %r2: src (2 blocks) - * %r3: iv (64bit) - */ - push {%r2-%r11, %ip, %lr}; - - read_block2(%r2, RL0, RR0, RL1, RR1, RT0); - - /* dec_blk2 is only used by cbc_dec, jump directly in/out instead - * of function call. */ - b _gcry_blowfish_armv6_dec_blk2; -.Ldec_cbc_tail: - /* result in RR0:RL0, RR1:RL1 = %r4:%r3, %r10:%r9 */ - - /* %r0: %src, %r1: dst, %r2: iv */ - pop {%r0, %r2}; - - /* load IV+1 (src[0]) to %r7:%r8. Might be unaligned. */ - read_block_host(%r0, 0, %r7, %r8, %r5); - /* load IV (iv[0]) to %r5:%r6. 'iv' is aligned. */ - ldm %r2, {%r5, %r6}; - - /* out[1] ^= IV+1 */ - eor %r10, %r7; - eor %r9, %r8; - /* out[0] ^= IV */ - eor %r4, %r5; - eor %r3, %r6; - - /* load IV+2 (src[1]) to %r7:%r8. Might be unaligned. */ - read_block_host(%r0, 8, %r7, %r8, %r5); - /* store IV+2 to iv[0] (aligned). */ - stm %r2, {%r7, %r8}; - - /* store result to dst[0-3]. Might be unaligned. */ - write_block2_host(%r1, %r4, %r3, %r10, %r9, %r5, %r6); - - pop {%r4-%r11, %ip, %pc}; -.ltorg -.size _gcry_blowfish_armv6_cbc_dec,.-_gcry_blowfish_armv6_cbc_dec; - -#endif /*HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS*/ -#endif /*__ARM_ARCH >= 6*/ diff --git a/cipher/blowfish.c b/cipher/blowfish.c index 2f739c8..2bedbea 100644 --- a/cipher/blowfish.c +++ b/cipher/blowfish.c @@ -50,11 +50,11 @@ # define USE_AMD64_ASM 1 #endif -/* USE_ARMV6_ASM indicates whether to use ARMv6 assembly code. 
*/ -#undef USE_ARMV6_ASM -#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) +/* USE_ARM_ASM indicates whether to use ARM assembly code. */ +#undef USE_ARM_ASM +#if defined(__ARMEL__) # if (BLOWFISH_ROUNDS == 16) && defined(HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS) -# define USE_ARMV6_ASM 1 +# define USE_ARM_ASM 1 # endif #endif @@ -314,44 +314,44 @@ decrypt_block (void *context, byte *outbuf, const byte *inbuf) return /*burn_stack*/ (2*8); } -#elif defined(USE_ARMV6_ASM) +#elif defined(USE_ARM_ASM) /* Assembly implementations of Blowfish. */ -extern void _gcry_blowfish_armv6_do_encrypt(BLOWFISH_context *c, u32 *ret_xl, +extern void _gcry_blowfish_arm_do_encrypt(BLOWFISH_context *c, u32 *ret_xl, u32 *ret_xr); -extern void _gcry_blowfish_armv6_encrypt_block(BLOWFISH_context *c, byte *out, +extern void _gcry_blowfish_arm_encrypt_block(BLOWFISH_context *c, byte *out, const byte *in); -extern void _gcry_blowfish_armv6_decrypt_block(BLOWFISH_context *c, byte *out, +extern void _gcry_blowfish_arm_decrypt_block(BLOWFISH_context *c, byte *out, const byte *in); /* These assembly implementations process two blocks in parallel.
*/ -extern void _gcry_blowfish_armv6_ctr_enc(BLOWFISH_context *ctx, byte *out, +extern void _gcry_blowfish_arm_ctr_enc(BLOWFISH_context *ctx, byte *out, const byte *in, byte *ctr); -extern void _gcry_blowfish_armv6_cbc_dec(BLOWFISH_context *ctx, byte *out, +extern void _gcry_blowfish_arm_cbc_dec(BLOWFISH_context *ctx, byte *out, const byte *in, byte *iv); -extern void _gcry_blowfish_armv6_cfb_dec(BLOWFISH_context *ctx, byte *out, +extern void _gcry_blowfish_arm_cfb_dec(BLOWFISH_context *ctx, byte *out, const byte *in, byte *iv); static void do_encrypt ( BLOWFISH_context *bc, u32 *ret_xl, u32 *ret_xr ) { - _gcry_blowfish_armv6_do_encrypt (bc, ret_xl, ret_xr); + _gcry_blowfish_arm_do_encrypt (bc, ret_xl, ret_xr); } static void do_encrypt_block (BLOWFISH_context *context, byte *outbuf, const byte *inbuf) { - _gcry_blowfish_armv6_encrypt_block (context, outbuf, inbuf); + _gcry_blowfish_arm_encrypt_block (context, outbuf, inbuf); } static void do_decrypt_block (BLOWFISH_context *context, byte *outbuf, const byte *inbuf) { - _gcry_blowfish_armv6_decrypt_block (context, outbuf, inbuf); + _gcry_blowfish_arm_decrypt_block (context, outbuf, inbuf); } static unsigned int @@ -370,7 +370,7 @@ decrypt_block (void *context, byte *outbuf, const byte *inbuf) return /*burn_stack*/ (10*4); } -#else /*USE_ARMV6_ASM*/ +#else /*USE_ARM_ASM*/ #if BLOWFISH_ROUNDS != 16 static inline u32 @@ -580,7 +580,7 @@ decrypt_block (void *context, byte *outbuf, const byte *inbuf) return /*burn_stack*/ (64); } -#endif /*!USE_AMD64_ASM&&!USE_ARMV6_ASM*/ +#endif /*!USE_AMD64_ASM&&!USE_ARM_ASM*/ /* Bulk encryption of complete blocks in CTR mode. This function is only @@ -615,12 +615,12 @@ _gcry_blowfish_ctr_enc(void *context, unsigned char *ctr, void *outbuf_arg, /* Use generic code to handle smaller chunks... */ /* TODO: use caching instead? */ } -#elif defined(USE_ARMV6_ASM) +#elif defined(USE_ARM_ASM) { /* Process data in 2 block chunks. 
*/ while (nblocks >= 2) { - _gcry_blowfish_armv6_ctr_enc(ctx, outbuf, inbuf, ctr); + _gcry_blowfish_arm_ctr_enc(ctx, outbuf, inbuf, ctr); nblocks -= 2; outbuf += 2 * BLOWFISH_BLOCKSIZE; @@ -683,12 +683,12 @@ _gcry_blowfish_cbc_dec(void *context, unsigned char *iv, void *outbuf_arg, /* Use generic code to handle smaller chunks... */ } -#elif defined(USE_ARMV6_ASM) +#elif defined(USE_ARM_ASM) { /* Process data in 2 block chunks. */ while (nblocks >= 2) { - _gcry_blowfish_armv6_cbc_dec(ctx, outbuf, inbuf, iv); + _gcry_blowfish_arm_cbc_dec(ctx, outbuf, inbuf, iv); nblocks -= 2; outbuf += 2 * BLOWFISH_BLOCKSIZE; @@ -746,12 +746,12 @@ _gcry_blowfish_cfb_dec(void *context, unsigned char *iv, void *outbuf_arg, /* Use generic code to handle smaller chunks... */ } -#elif defined(USE_ARMV6_ASM) +#elif defined(USE_ARM_ASM) { /* Process data in 2 block chunks. */ while (nblocks >= 2) { - _gcry_blowfish_armv6_cfb_dec(ctx, outbuf, inbuf, iv); + _gcry_blowfish_arm_cfb_dec(ctx, outbuf, inbuf, iv); nblocks -= 2; outbuf += 2 * BLOWFISH_BLOCKSIZE; diff --git a/cipher/camellia-arm.S b/cipher/camellia-arm.S new file mode 100644 index 0000000..820c46e --- /dev/null +++ b/cipher/camellia-arm.S @@ -0,0 +1,616 @@ +/* camellia-arm.S - ARM assembly implementation of Camellia cipher + * + * Copyright © 2013 Jussi Kivilinna + * + * This file is part of Libgcrypt. + * + * Libgcrypt is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as + * published by the Free Software Foundation; either version 2.1 of + * the License, or (at your option) any later version. + * + * Libgcrypt is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details.
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with this program; if not, see <http://www.gnu.org/licenses/>. + */ + +#include <config.h> + +#if defined(__ARMEL__) +#ifdef HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS + +.text + +.syntax unified +.arm + +#define CAMELLIA_TABLE_BYTE_LEN 272 + +/* struct camellia_ctx: */ +#define key_table 0 +#define key_length CAMELLIA_TABLE_BYTE_LEN + +/* register macros */ +#define CTX %r0 +#define RTAB1 %ip +#define RTAB3 %r1 +#define RMASK %lr + +#define IL %r2 +#define IR %r3 + +#define XL %r4 +#define XR %r5 +#define YL %r6 +#define YR %r7 + +#define RT0 %r8 +#define RT1 %r9 +#define RT2 %r10 +#define RT3 %r11 + +/* helper macros */ +#define ldr_unaligned_be(rout, rsrc, offs, rtmp) \ + ldrb rout, [rsrc, #((offs) + 3)]; \ + ldrb rtmp, [rsrc, #((offs) + 2)]; \ + orr rout, rout, rtmp, lsl #8; \ + ldrb rtmp, [rsrc, #((offs) + 1)]; \ + orr rout, rout, rtmp, lsl #16; \ + ldrb rtmp, [rsrc, #((offs) + 0)]; \ + orr rout, rout, rtmp, lsl #24; + +#define str_unaligned_be(rin, rdst, offs, rtmp0, rtmp1) \ + mov rtmp0, rin, lsr #8; \ + strb rin, [rdst, #((offs) + 3)]; \ + mov rtmp1, rin, lsr #16; \ + strb rtmp0, [rdst, #((offs) + 2)]; \ + mov rtmp0, rin, lsr #24; \ + strb rtmp1, [rdst, #((offs) + 1)]; \ + strb rtmp0, [rdst, #((offs) + 0)]; + +#ifdef __ARMEL__ +#ifdef HAVE_ARM_ARCH_V6 + #define host_to_be(reg, rtmp) \ + rev reg, reg; + #define be_to_host(reg, rtmp) \ + rev reg, reg; +#else + #define host_to_be(reg, rtmp) \ + eor rtmp, reg, reg, ror #16; \ + mov rtmp, rtmp, lsr #8; \ + bic rtmp, rtmp, #65280; \ + eor reg, rtmp, reg, ror #8; + #define be_to_host(reg, rtmp) \ + eor rtmp, reg, reg, ror #16; \ + mov rtmp, rtmp, lsr #8; \ + bic rtmp, rtmp, #65280; \ + eor reg, rtmp, reg, ror #8; +#endif +#else + /* nop on big-endian */ + #define host_to_be(reg, rtmp) /*_*/ + #define be_to_host(reg, rtmp) /*_*/ +#endif + +#define ldr_input_aligned_be(rin, a, b, c, d, rtmp) \ + ldr a, [rin, #0]; \ + ldr b, [rin, #4]; \ + be_to_host(a, rtmp); \ + ldr c,
[rin, #8]; \ + be_to_host(b, rtmp); \ + ldr d, [rin, #12]; \ + be_to_host(c, rtmp); \ + be_to_host(d, rtmp); + +#define str_output_aligned_be(rout, a, b, c, d, rtmp) \ + be_to_host(a, rtmp); \ + be_to_host(b, rtmp); \ + str a, [rout, #0]; \ + be_to_host(c, rtmp); \ + str b, [rout, #4]; \ + be_to_host(d, rtmp); \ + str c, [rout, #8]; \ + str d, [rout, #12]; + +#ifdef __ARM_FEATURE_UNALIGNED + /* unaligned word reads/writes allowed */ + #define ldr_input_be(rin, ra, rb, rc, rd, rtmp) \ + ldr_input_aligned_be(rin, ra, rb, rc, rd, rtmp) + + #define str_output_be(rout, ra, rb, rc, rd, rtmp0, rtmp1) \ + str_output_aligned_be(rout, ra, rb, rc, rd, rtmp0) +#else + /* need to handle unaligned reads/writes by byte reads */ + #define ldr_input_be(rin, ra, rb, rc, rd, rtmp0) \ + tst rin, #3; \ + beq 1f; \ + ldr_unaligned_be(ra, rin, 0, rtmp0); \ + ldr_unaligned_be(rb, rin, 4, rtmp0); \ + ldr_unaligned_be(rc, rin, 8, rtmp0); \ + ldr_unaligned_be(rd, rin, 12, rtmp0); \ + b 2f; \ + 1:;\ + ldr_input_aligned_be(rin, ra, rb, rc, rd, rtmp0); \ + 2:; + + #define str_output_be(rout, ra, rb, rc, rd, rtmp0, rtmp1) \ + tst rout, #3; \ + beq 1f; \ + str_unaligned_be(ra, rout, 0, rtmp0, rtmp1); \ + str_unaligned_be(rb, rout, 4, rtmp0, rtmp1); \ + str_unaligned_be(rc, rout, 8, rtmp0, rtmp1); \ + str_unaligned_be(rd, rout, 12, rtmp0, rtmp1); \ + b 2f; \ + 1:;\ + str_output_aligned_be(rout, ra, rb, rc, rd, rtmp0); \ + 2:; +#endif + +/********************************************************************** + 1-way camellia + **********************************************************************/ +#define roundsm(xl, xr, kl, kr, yl, yr) \ + ldr RT2, [CTX, #(key_table + ((kl) * 4))]; \ + and IR, RMASK, xr, lsl#(4); /*sp1110*/ \ + ldr RT3, [CTX, #(key_table + ((kr) * 4))]; \ + and IL, RMASK, xl, lsr#(24 - 4); /*sp1110*/ \ + and RT0, RMASK, xr, lsr#(16 - 4); /*sp3033*/ \ + ldr IR, [RTAB1, IR]; \ + and RT1, RMASK, xl, lsr#(8 - 4); /*sp3033*/ \ + eor yl, RT2; \ + ldr IL, [RTAB1, IL]; \ + eor yr, RT3; \ 
+ \ + ldr RT0, [RTAB3, RT0]; \ + add RTAB1, #4; \ + ldr RT1, [RTAB3, RT1]; \ + add RTAB3, #4; \ + \ + and RT2, RMASK, xr, lsr#(24 - 4); /*sp0222*/ \ + and RT3, RMASK, xl, lsr#(16 - 4); /*sp0222*/ \ + \ + eor IR, RT0; \ + eor IL, RT1; \ + \ + ldr RT2, [RTAB1, RT2]; \ + and RT0, RMASK, xr, lsr#(8 - 4); /*sp4404*/ \ + ldr RT3, [RTAB1, RT3]; \ + and RT1, RMASK, xl, lsl#(4); /*sp4404*/ \ + \ + ldr RT0, [RTAB3, RT0]; \ + sub RTAB1, #4; \ + ldr RT1, [RTAB3, RT1]; \ + sub RTAB3, #4; \ + \ + eor IR, RT2; \ + eor IL, RT3; \ + eor IR, RT0; \ + eor IL, RT1; \ + \ + eor IR, IL; \ + eor yr, yr, IL, ror#8; \ + eor yl, IR; \ + eor yr, IR; + +#define enc_rounds(n) \ + roundsm(XL, XR, ((n) + 2) * 2 + 0, ((n) + 2) * 2 + 1, YL, YR); \ + roundsm(YL, YR, ((n) + 3) * 2 + 0, ((n) + 3) * 2 + 1, XL, XR); \ + roundsm(XL, XR, ((n) + 4) * 2 + 0, ((n) + 4) * 2 + 1, YL, YR); \ + roundsm(YL, YR, ((n) + 5) * 2 + 0, ((n) + 5) * 2 + 1, XL, XR); \ + roundsm(XL, XR, ((n) + 6) * 2 + 0, ((n) + 6) * 2 + 1, YL, YR); \ + roundsm(YL, YR, ((n) + 7) * 2 + 0, ((n) + 7) * 2 + 1, XL, XR); + +#define dec_rounds(n) \ + roundsm(XL, XR, ((n) + 7) * 2 + 0, ((n) + 7) * 2 + 1, YL, YR); \ + roundsm(YL, YR, ((n) + 6) * 2 + 0, ((n) + 6) * 2 + 1, XL, XR); \ + roundsm(XL, XR, ((n) + 5) * 2 + 0, ((n) + 5) * 2 + 1, YL, YR); \ + roundsm(YL, YR, ((n) + 4) * 2 + 0, ((n) + 4) * 2 + 1, XL, XR); \ + roundsm(XL, XR, ((n) + 3) * 2 + 0, ((n) + 3) * 2 + 1, YL, YR); \ + roundsm(YL, YR, ((n) + 2) * 2 + 0, ((n) + 2) * 2 + 1, XL, XR); + +/* perform FL and FL⁻¹
*/ +#define fls(ll, lr, rl, rr, kll, klr, krl, krr) \ + ldr RT0, [CTX, #(key_table + ((kll) * 4))]; \ + ldr RT2, [CTX, #(key_table + ((krr) * 4))]; \ + and RT0, ll; \ + ldr RT3, [CTX, #(key_table + ((krl) * 4))]; \ + orr RT2, rr; \ + ldr RT1, [CTX, #(key_table + ((klr) * 4))]; \ + eor rl, RT2; \ + eor lr, lr, RT0, ror#31; \ + and RT3, rl; \ + orr RT1, lr; \ + eor ll, RT1; \ + eor rr, rr, RT3, ror#31; + +#define enc_fls(n) \ + fls(XL, XR, YL, YR, \ + (n) * 2 + 0, (n) * 2 + 1, \ + (n) * 2 + 2, (n) * 2 + 3); + +#define dec_fls(n) \ + fls(XL, XR, YL, YR, \ + (n) * 2 + 2, (n) * 2 + 3, \ + (n) * 2 + 0, (n) * 2 + 1); + +#define inpack(n) \ + ldr_input_be(%r2, XL, XR, YL, YR, RT0); \ + ldr RT0, [CTX, #(key_table + ((n) * 8) + 0)]; \ + ldr RT1, [CTX, #(key_table + ((n) * 8) + 4)]; \ + eor XL, RT0; \ + eor XR, RT1; + +#define outunpack(n) \ + ldr RT0, [CTX, #(key_table + ((n) * 8) + 0)]; \ + ldr RT1, [CTX, #(key_table + ((n) * 8) + 4)]; \ + eor YL, RT0; \ + eor YR, RT1; \ + str_output_be(%r1, YL, YR, XL, XR, RT0, RT1); + +.align 3 +.global _gcry_camellia_arm_encrypt_block +.type _gcry_camellia_arm_encrypt_block,%function; + +_gcry_camellia_arm_encrypt_block: + /* input: + * %r0: keytable + * %r1: dst + * %r2: src + * %r3: keybitlen + */ + push {%r1, %r4-%r11, %ip, %lr}; + + ldr RTAB1, =.Lcamellia_sp1110; + mov RMASK, #0xff; + add RTAB3, RTAB1, #(2 * 4); + push {%r3}; + mov RMASK, RMASK, lsl#4 /* byte mask */ + + inpack(0); + + enc_rounds(0); + enc_fls(8); + enc_rounds(8); + enc_fls(16); + enc_rounds(16); + + pop {RT0}; + cmp RT0, #(16 * 8); + bne .Lenc_256; + + pop {%r1}; + outunpack(24); + + pop {%r4-%r11, %ip, %pc}; +.ltorg + +.Lenc_256: + enc_fls(24); + enc_rounds(24); + + pop {%r1}; + outunpack(32); + + pop {%r4-%r11, %ip, %pc}; +.ltorg +.size _gcry_camellia_arm_encrypt_block,.-_gcry_camellia_arm_encrypt_block; + +.align 3 +.global _gcry_camellia_arm_decrypt_block +.type _gcry_camellia_arm_decrypt_block,%function; + +_gcry_camellia_arm_decrypt_block: + /* input: + * %r0: 
keytable + * %r1: dst + * %r2: src + * %r3: keybitlen + */ + push {%r1, %r4-%r11, %ip, %lr}; + + ldr RTAB1, =.Lcamellia_sp1110; + mov RMASK, #0xff; + add RTAB3, RTAB1, #(2 * 4); + mov RMASK, RMASK, lsl#4 /* byte mask */ + + cmp %r3, #(16 * 8); + bne .Ldec_256; + + inpack(24); + +.Ldec_128: + dec_rounds(16); + dec_fls(16); + dec_rounds(8); + dec_fls(8); + dec_rounds(0); + + pop {%r1}; + outunpack(0); + + pop {%r4-%r11, %ip, %pc}; +.ltorg + +.Ldec_256: + inpack(32); + dec_rounds(24); + dec_fls(24); + + b .Ldec_128; +.ltorg +.size _gcry_camellia_arm_decrypt_block,.-_gcry_camellia_arm_decrypt_block; + +.data + +/* Encryption/Decryption tables */ +.align 5 +.Lcamellia_sp1110: +.long 0x70707000 +.Lcamellia_sp0222: + .long 0x00e0e0e0 +.Lcamellia_sp3033: + .long 0x38003838 +.Lcamellia_sp4404: + .long 0x70700070 +.long 0x82828200, 0x00050505, 0x41004141, 0x2c2c002c +.long 0x2c2c2c00, 0x00585858, 0x16001616, 0xb3b300b3 +.long 0xececec00, 0x00d9d9d9, 0x76007676, 0xc0c000c0 +.long 0xb3b3b300, 0x00676767, 0xd900d9d9, 0xe4e400e4 +.long 0x27272700, 0x004e4e4e, 0x93009393, 0x57570057 +.long 0xc0c0c000, 0x00818181, 0x60006060, 0xeaea00ea +.long 0xe5e5e500, 0x00cbcbcb, 0xf200f2f2, 0xaeae00ae +.long 0xe4e4e400, 0x00c9c9c9, 0x72007272, 0x23230023 +.long 0x85858500, 0x000b0b0b, 0xc200c2c2, 0x6b6b006b +.long 0x57575700, 0x00aeaeae, 0xab00abab, 0x45450045 +.long 0x35353500, 0x006a6a6a, 0x9a009a9a, 0xa5a500a5 +.long 0xeaeaea00, 0x00d5d5d5, 0x75007575, 0xeded00ed +.long 0x0c0c0c00, 0x00181818, 0x06000606, 0x4f4f004f +.long 0xaeaeae00, 0x005d5d5d, 0x57005757, 0x1d1d001d +.long 0x41414100, 0x00828282, 0xa000a0a0, 0x92920092 +.long 0x23232300, 0x00464646, 0x91009191, 0x86860086 +.long 0xefefef00, 0x00dfdfdf, 0xf700f7f7, 0xafaf00af +.long 0x6b6b6b00, 0x00d6d6d6, 0xb500b5b5, 0x7c7c007c +.long 0x93939300, 0x00272727, 0xc900c9c9, 0x1f1f001f +.long 0x45454500, 0x008a8a8a, 0xa200a2a2, 0x3e3e003e +.long 0x19191900, 0x00323232, 0x8c008c8c, 0xdcdc00dc +.long 0xa5a5a500, 0x004b4b4b, 0xd200d2d2, 
0x5e5e005e +.long 0x21212100, 0x00424242, 0x90009090, 0x0b0b000b +.long 0xededed00, 0x00dbdbdb, 0xf600f6f6, 0xa6a600a6 +.long 0x0e0e0e00, 0x001c1c1c, 0x07000707, 0x39390039 +.long 0x4f4f4f00, 0x009e9e9e, 0xa700a7a7, 0xd5d500d5 +.long 0x4e4e4e00, 0x009c9c9c, 0x27002727, 0x5d5d005d +.long 0x1d1d1d00, 0x003a3a3a, 0x8e008e8e, 0xd9d900d9 +.long 0x65656500, 0x00cacaca, 0xb200b2b2, 0x5a5a005a +.long 0x92929200, 0x00252525, 0x49004949, 0x51510051 +.long 0xbdbdbd00, 0x007b7b7b, 0xde00dede, 0x6c6c006c +.long 0x86868600, 0x000d0d0d, 0x43004343, 0x8b8b008b +.long 0xb8b8b800, 0x00717171, 0x5c005c5c, 0x9a9a009a +.long 0xafafaf00, 0x005f5f5f, 0xd700d7d7, 0xfbfb00fb +.long 0x8f8f8f00, 0x001f1f1f, 0xc700c7c7, 0xb0b000b0 +.long 0x7c7c7c00, 0x00f8f8f8, 0x3e003e3e, 0x74740074 +.long 0xebebeb00, 0x00d7d7d7, 0xf500f5f5, 0x2b2b002b +.long 0x1f1f1f00, 0x003e3e3e, 0x8f008f8f, 0xf0f000f0 +.long 0xcecece00, 0x009d9d9d, 0x67006767, 0x84840084 +.long 0x3e3e3e00, 0x007c7c7c, 0x1f001f1f, 0xdfdf00df +.long 0x30303000, 0x00606060, 0x18001818, 0xcbcb00cb +.long 0xdcdcdc00, 0x00b9b9b9, 0x6e006e6e, 0x34340034 +.long 0x5f5f5f00, 0x00bebebe, 0xaf00afaf, 0x76760076 +.long 0x5e5e5e00, 0x00bcbcbc, 0x2f002f2f, 0x6d6d006d +.long 0xc5c5c500, 0x008b8b8b, 0xe200e2e2, 0xa9a900a9 +.long 0x0b0b0b00, 0x00161616, 0x85008585, 0xd1d100d1 +.long 0x1a1a1a00, 0x00343434, 0x0d000d0d, 0x04040004 +.long 0xa6a6a600, 0x004d4d4d, 0x53005353, 0x14140014 +.long 0xe1e1e100, 0x00c3c3c3, 0xf000f0f0, 0x3a3a003a +.long 0x39393900, 0x00727272, 0x9c009c9c, 0xdede00de +.long 0xcacaca00, 0x00959595, 0x65006565, 0x11110011 +.long 0xd5d5d500, 0x00ababab, 0xea00eaea, 0x32320032 +.long 0x47474700, 0x008e8e8e, 0xa300a3a3, 0x9c9c009c +.long 0x5d5d5d00, 0x00bababa, 0xae00aeae, 0x53530053 +.long 0x3d3d3d00, 0x007a7a7a, 0x9e009e9e, 0xf2f200f2 +.long 0xd9d9d900, 0x00b3b3b3, 0xec00ecec, 0xfefe00fe +.long 0x01010100, 0x00020202, 0x80008080, 0xcfcf00cf +.long 0x5a5a5a00, 0x00b4b4b4, 0x2d002d2d, 0xc3c300c3 +.long 0xd6d6d600, 0x00adadad, 0x6b006b6b, 
0x7a7a007a +.long 0x51515100, 0x00a2a2a2, 0xa800a8a8, 0x24240024 +.long 0x56565600, 0x00acacac, 0x2b002b2b, 0xe8e800e8 +.long 0x6c6c6c00, 0x00d8d8d8, 0x36003636, 0x60600060 +.long 0x4d4d4d00, 0x009a9a9a, 0xa600a6a6, 0x69690069 +.long 0x8b8b8b00, 0x00171717, 0xc500c5c5, 0xaaaa00aa +.long 0x0d0d0d00, 0x001a1a1a, 0x86008686, 0xa0a000a0 +.long 0x9a9a9a00, 0x00353535, 0x4d004d4d, 0xa1a100a1 +.long 0x66666600, 0x00cccccc, 0x33003333, 0x62620062 +.long 0xfbfbfb00, 0x00f7f7f7, 0xfd00fdfd, 0x54540054 +.long 0xcccccc00, 0x00999999, 0x66006666, 0x1e1e001e +.long 0xb0b0b000, 0x00616161, 0x58005858, 0xe0e000e0 +.long 0x2d2d2d00, 0x005a5a5a, 0x96009696, 0x64640064 +.long 0x74747400, 0x00e8e8e8, 0x3a003a3a, 0x10100010 +.long 0x12121200, 0x00242424, 0x09000909, 0x00000000 +.long 0x2b2b2b00, 0x00565656, 0x95009595, 0xa3a300a3 +.long 0x20202000, 0x00404040, 0x10001010, 0x75750075 +.long 0xf0f0f000, 0x00e1e1e1, 0x78007878, 0x8a8a008a +.long 0xb1b1b100, 0x00636363, 0xd800d8d8, 0xe6e600e6 +.long 0x84848400, 0x00090909, 0x42004242, 0x09090009 +.long 0x99999900, 0x00333333, 0xcc00cccc, 0xdddd00dd +.long 0xdfdfdf00, 0x00bfbfbf, 0xef00efef, 0x87870087 +.long 0x4c4c4c00, 0x00989898, 0x26002626, 0x83830083 +.long 0xcbcbcb00, 0x00979797, 0xe500e5e5, 0xcdcd00cd +.long 0xc2c2c200, 0x00858585, 0x61006161, 0x90900090 +.long 0x34343400, 0x00686868, 0x1a001a1a, 0x73730073 +.long 0x7e7e7e00, 0x00fcfcfc, 0x3f003f3f, 0xf6f600f6 +.long 0x76767600, 0x00ececec, 0x3b003b3b, 0x9d9d009d +.long 0x05050500, 0x000a0a0a, 0x82008282, 0xbfbf00bf +.long 0x6d6d6d00, 0x00dadada, 0xb600b6b6, 0x52520052 +.long 0xb7b7b700, 0x006f6f6f, 0xdb00dbdb, 0xd8d800d8 +.long 0xa9a9a900, 0x00535353, 0xd400d4d4, 0xc8c800c8 +.long 0x31313100, 0x00626262, 0x98009898, 0xc6c600c6 +.long 0xd1d1d100, 0x00a3a3a3, 0xe800e8e8, 0x81810081 +.long 0x17171700, 0x002e2e2e, 0x8b008b8b, 0x6f6f006f +.long 0x04040400, 0x00080808, 0x02000202, 0x13130013 +.long 0xd7d7d700, 0x00afafaf, 0xeb00ebeb, 0x63630063 +.long 0x14141400, 0x00282828, 0x0a000a0a, 
0xe9e900e9 +.long 0x58585800, 0x00b0b0b0, 0x2c002c2c, 0xa7a700a7 +.long 0x3a3a3a00, 0x00747474, 0x1d001d1d, 0x9f9f009f +.long 0x61616100, 0x00c2c2c2, 0xb000b0b0, 0xbcbc00bc +.long 0xdedede00, 0x00bdbdbd, 0x6f006f6f, 0x29290029 +.long 0x1b1b1b00, 0x00363636, 0x8d008d8d, 0xf9f900f9 +.long 0x11111100, 0x00222222, 0x88008888, 0x2f2f002f +.long 0x1c1c1c00, 0x00383838, 0x0e000e0e, 0xb4b400b4 +.long 0x32323200, 0x00646464, 0x19001919, 0x78780078 +.long 0x0f0f0f00, 0x001e1e1e, 0x87008787, 0x06060006 +.long 0x9c9c9c00, 0x00393939, 0x4e004e4e, 0xe7e700e7 +.long 0x16161600, 0x002c2c2c, 0x0b000b0b, 0x71710071 +.long 0x53535300, 0x00a6a6a6, 0xa900a9a9, 0xd4d400d4 +.long 0x18181800, 0x00303030, 0x0c000c0c, 0xabab00ab +.long 0xf2f2f200, 0x00e5e5e5, 0x79007979, 0x88880088 +.long 0x22222200, 0x00444444, 0x11001111, 0x8d8d008d +.long 0xfefefe00, 0x00fdfdfd, 0x7f007f7f, 0x72720072 +.long 0x44444400, 0x00888888, 0x22002222, 0xb9b900b9 +.long 0xcfcfcf00, 0x009f9f9f, 0xe700e7e7, 0xf8f800f8 +.long 0xb2b2b200, 0x00656565, 0x59005959, 0xacac00ac +.long 0xc3c3c300, 0x00878787, 0xe100e1e1, 0x36360036 +.long 0xb5b5b500, 0x006b6b6b, 0xda00dada, 0x2a2a002a +.long 0x7a7a7a00, 0x00f4f4f4, 0x3d003d3d, 0x3c3c003c +.long 0x91919100, 0x00232323, 0xc800c8c8, 0xf1f100f1 +.long 0x24242400, 0x00484848, 0x12001212, 0x40400040 +.long 0x08080800, 0x00101010, 0x04000404, 0xd3d300d3 +.long 0xe8e8e800, 0x00d1d1d1, 0x74007474, 0xbbbb00bb +.long 0xa8a8a800, 0x00515151, 0x54005454, 0x43430043 +.long 0x60606000, 0x00c0c0c0, 0x30003030, 0x15150015 +.long 0xfcfcfc00, 0x00f9f9f9, 0x7e007e7e, 0xadad00ad +.long 0x69696900, 0x00d2d2d2, 0xb400b4b4, 0x77770077 +.long 0x50505000, 0x00a0a0a0, 0x28002828, 0x80800080 +.long 0xaaaaaa00, 0x00555555, 0x55005555, 0x82820082 +.long 0xd0d0d000, 0x00a1a1a1, 0x68006868, 0xecec00ec +.long 0xa0a0a000, 0x00414141, 0x50005050, 0x27270027 +.long 0x7d7d7d00, 0x00fafafa, 0xbe00bebe, 0xe5e500e5 +.long 0xa1a1a100, 0x00434343, 0xd000d0d0, 0x85850085 +.long 0x89898900, 0x00131313, 0xc400c4c4, 
0x35350035 +.long 0x62626200, 0x00c4c4c4, 0x31003131, 0x0c0c000c +.long 0x97979700, 0x002f2f2f, 0xcb00cbcb, 0x41410041 +.long 0x54545400, 0x00a8a8a8, 0x2a002a2a, 0xefef00ef +.long 0x5b5b5b00, 0x00b6b6b6, 0xad00adad, 0x93930093 +.long 0x1e1e1e00, 0x003c3c3c, 0x0f000f0f, 0x19190019 +.long 0x95959500, 0x002b2b2b, 0xca00caca, 0x21210021 +.long 0xe0e0e000, 0x00c1c1c1, 0x70007070, 0x0e0e000e +.long 0xffffff00, 0x00ffffff, 0xff00ffff, 0x4e4e004e +.long 0x64646400, 0x00c8c8c8, 0x32003232, 0x65650065 +.long 0xd2d2d200, 0x00a5a5a5, 0x69006969, 0xbdbd00bd +.long 0x10101000, 0x00202020, 0x08000808, 0xb8b800b8 +.long 0xc4c4c400, 0x00898989, 0x62006262, 0x8f8f008f +.long 0x00000000, 0x00000000, 0x00000000, 0xebeb00eb +.long 0x48484800, 0x00909090, 0x24002424, 0xcece00ce +.long 0xa3a3a300, 0x00474747, 0xd100d1d1, 0x30300030 +.long 0xf7f7f700, 0x00efefef, 0xfb00fbfb, 0x5f5f005f +.long 0x75757500, 0x00eaeaea, 0xba00baba, 0xc5c500c5 +.long 0xdbdbdb00, 0x00b7b7b7, 0xed00eded, 0x1a1a001a +.long 0x8a8a8a00, 0x00151515, 0x45004545, 0xe1e100e1 +.long 0x03030300, 0x00060606, 0x81008181, 0xcaca00ca +.long 0xe6e6e600, 0x00cdcdcd, 0x73007373, 0x47470047 +.long 0xdadada00, 0x00b5b5b5, 0x6d006d6d, 0x3d3d003d +.long 0x09090900, 0x00121212, 0x84008484, 0x01010001 +.long 0x3f3f3f00, 0x007e7e7e, 0x9f009f9f, 0xd6d600d6 +.long 0xdddddd00, 0x00bbbbbb, 0xee00eeee, 0x56560056 +.long 0x94949400, 0x00292929, 0x4a004a4a, 0x4d4d004d +.long 0x87878700, 0x000f0f0f, 0xc300c3c3, 0x0d0d000d +.long 0x5c5c5c00, 0x00b8b8b8, 0x2e002e2e, 0x66660066 +.long 0x83838300, 0x00070707, 0xc100c1c1, 0xcccc00cc +.long 0x02020200, 0x00040404, 0x01000101, 0x2d2d002d +.long 0xcdcdcd00, 0x009b9b9b, 0xe600e6e6, 0x12120012 +.long 0x4a4a4a00, 0x00949494, 0x25002525, 0x20200020 +.long 0x90909000, 0x00212121, 0x48004848, 0xb1b100b1 +.long 0x33333300, 0x00666666, 0x99009999, 0x99990099 +.long 0x73737300, 0x00e6e6e6, 0xb900b9b9, 0x4c4c004c +.long 0x67676700, 0x00cecece, 0xb300b3b3, 0xc2c200c2 +.long 0xf6f6f600, 0x00ededed, 0x7b007b7b, 
0x7e7e007e +.long 0xf3f3f300, 0x00e7e7e7, 0xf900f9f9, 0x05050005 +.long 0x9d9d9d00, 0x003b3b3b, 0xce00cece, 0xb7b700b7 +.long 0x7f7f7f00, 0x00fefefe, 0xbf00bfbf, 0x31310031 +.long 0xbfbfbf00, 0x007f7f7f, 0xdf00dfdf, 0x17170017 +.long 0xe2e2e200, 0x00c5c5c5, 0x71007171, 0xd7d700d7 +.long 0x52525200, 0x00a4a4a4, 0x29002929, 0x58580058 +.long 0x9b9b9b00, 0x00373737, 0xcd00cdcd, 0x61610061 +.long 0xd8d8d800, 0x00b1b1b1, 0x6c006c6c, 0x1b1b001b +.long 0x26262600, 0x004c4c4c, 0x13001313, 0x1c1c001c +.long 0xc8c8c800, 0x00919191, 0x64006464, 0x0f0f000f +.long 0x37373700, 0x006e6e6e, 0x9b009b9b, 0x16160016 +.long 0xc6c6c600, 0x008d8d8d, 0x63006363, 0x18180018 +.long 0x3b3b3b00, 0x00767676, 0x9d009d9d, 0x22220022 +.long 0x81818100, 0x00030303, 0xc000c0c0, 0x44440044 +.long 0x96969600, 0x002d2d2d, 0x4b004b4b, 0xb2b200b2 +.long 0x6f6f6f00, 0x00dedede, 0xb700b7b7, 0xb5b500b5 +.long 0x4b4b4b00, 0x00969696, 0xa500a5a5, 0x91910091 +.long 0x13131300, 0x00262626, 0x89008989, 0x08080008 +.long 0xbebebe00, 0x007d7d7d, 0x5f005f5f, 0xa8a800a8 +.long 0x63636300, 0x00c6c6c6, 0xb100b1b1, 0xfcfc00fc +.long 0x2e2e2e00, 0x005c5c5c, 0x17001717, 0x50500050 +.long 0xe9e9e900, 0x00d3d3d3, 0xf400f4f4, 0xd0d000d0 +.long 0x79797900, 0x00f2f2f2, 0xbc00bcbc, 0x7d7d007d +.long 0xa7a7a700, 0x004f4f4f, 0xd300d3d3, 0x89890089 +.long 0x8c8c8c00, 0x00191919, 0x46004646, 0x97970097 +.long 0x9f9f9f00, 0x003f3f3f, 0xcf00cfcf, 0x5b5b005b +.long 0x6e6e6e00, 0x00dcdcdc, 0x37003737, 0x95950095 +.long 0xbcbcbc00, 0x00797979, 0x5e005e5e, 0xffff00ff +.long 0x8e8e8e00, 0x001d1d1d, 0x47004747, 0xd2d200d2 +.long 0x29292900, 0x00525252, 0x94009494, 0xc4c400c4 +.long 0xf5f5f500, 0x00ebebeb, 0xfa00fafa, 0x48480048 +.long 0xf9f9f900, 0x00f3f3f3, 0xfc00fcfc, 0xf7f700f7 +.long 0xb6b6b600, 0x006d6d6d, 0x5b005b5b, 0xdbdb00db +.long 0x2f2f2f00, 0x005e5e5e, 0x97009797, 0x03030003 +.long 0xfdfdfd00, 0x00fbfbfb, 0xfe00fefe, 0xdada00da +.long 0xb4b4b400, 0x00696969, 0x5a005a5a, 0x3f3f003f +.long 0x59595900, 0x00b2b2b2, 0xac00acac, 
0x94940094 +.long 0x78787800, 0x00f0f0f0, 0x3c003c3c, 0x5c5c005c +.long 0x98989800, 0x00313131, 0x4c004c4c, 0x02020002 +.long 0x06060600, 0x000c0c0c, 0x03000303, 0x4a4a004a +.long 0x6a6a6a00, 0x00d4d4d4, 0x35003535, 0x33330033 +.long 0xe7e7e700, 0x00cfcfcf, 0xf300f3f3, 0x67670067 +.long 0x46464600, 0x008c8c8c, 0x23002323, 0xf3f300f3 +.long 0x71717100, 0x00e2e2e2, 0xb800b8b8, 0x7f7f007f +.long 0xbababa00, 0x00757575, 0x5d005d5d, 0xe2e200e2 +.long 0xd4d4d400, 0x00a9a9a9, 0x6a006a6a, 0x9b9b009b +.long 0x25252500, 0x004a4a4a, 0x92009292, 0x26260026 +.long 0xababab00, 0x00575757, 0xd500d5d5, 0x37370037 +.long 0x42424200, 0x00848484, 0x21002121, 0x3b3b003b +.long 0x88888800, 0x00111111, 0x44004444, 0x96960096 +.long 0xa2a2a200, 0x00454545, 0x51005151, 0x4b4b004b +.long 0x8d8d8d00, 0x001b1b1b, 0xc600c6c6, 0xbebe00be +.long 0xfafafa00, 0x00f5f5f5, 0x7d007d7d, 0x2e2e002e +.long 0x72727200, 0x00e4e4e4, 0x39003939, 0x79790079 +.long 0x07070700, 0x000e0e0e, 0x83008383, 0x8c8c008c +.long 0xb9b9b900, 0x00737373, 0xdc00dcdc, 0x6e6e006e +.long 0x55555500, 0x00aaaaaa, 0xaa00aaaa, 0x8e8e008e +.long 0xf8f8f800, 0x00f1f1f1, 0x7c007c7c, 0xf5f500f5 +.long 0xeeeeee00, 0x00dddddd, 0x77007777, 0xb6b600b6 +.long 0xacacac00, 0x00595959, 0x56005656, 0xfdfd00fd +.long 0x0a0a0a00, 0x00141414, 0x05000505, 0x59590059 +.long 0x36363600, 0x006c6c6c, 0x1b001b1b, 0x98980098 +.long 0x49494900, 0x00929292, 0xa400a4a4, 0x6a6a006a +.long 0x2a2a2a00, 0x00545454, 0x15001515, 0x46460046 +.long 0x68686800, 0x00d0d0d0, 0x34003434, 0xbaba00ba +.long 0x3c3c3c00, 0x00787878, 0x1e001e1e, 0x25250025 +.long 0x38383800, 0x00707070, 0x1c001c1c, 0x42420042 +.long 0xf1f1f100, 0x00e3e3e3, 0xf800f8f8, 0xa2a200a2 +.long 0xa4a4a400, 0x00494949, 0x52005252, 0xfafa00fa +.long 0x40404000, 0x00808080, 0x20002020, 0x07070007 +.long 0x28282800, 0x00505050, 0x14001414, 0x55550055 +.long 0xd3d3d300, 0x00a7a7a7, 0xe900e9e9, 0xeeee00ee +.long 0x7b7b7b00, 0x00f6f6f6, 0xbd00bdbd, 0x0a0a000a +.long 0xbbbbbb00, 0x00777777, 0xdd00dddd, 
0x49490049 +.long 0xc9c9c900, 0x00939393, 0xe400e4e4, 0x68680068 +.long 0x43434300, 0x00868686, 0xa100a1a1, 0x38380038 +.long 0xc1c1c100, 0x00838383, 0xe000e0e0, 0xa4a400a4 +.long 0x15151500, 0x002a2a2a, 0x8a008a8a, 0x28280028 +.long 0xe3e3e300, 0x00c7c7c7, 0xf100f1f1, 0x7b7b007b +.long 0xadadad00, 0x005b5b5b, 0xd600d6d6, 0xc9c900c9 +.long 0xf4f4f400, 0x00e9e9e9, 0x7a007a7a, 0xc1c100c1 +.long 0x77777700, 0x00eeeeee, 0xbb00bbbb, 0xe3e300e3 +.long 0xc7c7c700, 0x008f8f8f, 0xe300e3e3, 0xf4f400f4 +.long 0x80808000, 0x00010101, 0x40004040, 0xc7c700c7 +.long 0x9e9e9e00, 0x003d3d3d, 0x4f004f4f, 0x9e9e009e + +#endif /*HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS*/ +#endif /*__ARMEL__*/ diff --git a/cipher/camellia-armv6.S b/cipher/camellia-armv6.S deleted file mode 100644 index 3544754..0000000 --- a/cipher/camellia-armv6.S +++ /dev/null @@ -1,604 +0,0 @@ -/* camellia-armv6.S - ARM assembly implementation of Camellia cipher - * - * Copyright © 2013 Jussi Kivilinna - * - * This file is part of Libgcrypt. - * - * Libgcrypt is free software; you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation; either version 2.1 of - * the License, or (at your option) any later version. - * - * Libgcrypt is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public - * License along with this program; if not, see <http://www.gnu.org/licenses/>.
- */ - -#include <config.h> - -#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) -#ifdef HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS - -.text - -.syntax unified -.arm - -#define CAMELLIA_TABLE_BYTE_LEN 272 - -/* struct camellia_ctx: */ -#define key_table 0 -#define key_length CAMELLIA_TABLE_BYTE_LEN - -/* register macros */ -#define CTX %r0 -#define RTAB1 %ip -#define RTAB3 %r1 -#define RMASK %lr - -#define IL %r2 -#define IR %r3 - -#define XL %r4 -#define XR %r5 -#define YL %r6 -#define YR %r7 - -#define RT0 %r8 -#define RT1 %r9 -#define RT2 %r10 -#define RT3 %r11 - -/* helper macros */ -#define ldr_unaligned_be(rout, rsrc, offs, rtmp) \ - ldrb rout, [rsrc, #((offs) + 3)]; \ - ldrb rtmp, [rsrc, #((offs) + 2)]; \ - orr rout, rout, rtmp, lsl #8; \ - ldrb rtmp, [rsrc, #((offs) + 1)]; \ - orr rout, rout, rtmp, lsl #16; \ - ldrb rtmp, [rsrc, #((offs) + 0)]; \ - orr rout, rout, rtmp, lsl #24; - -#define str_unaligned_be(rin, rdst, offs, rtmp0, rtmp1) \ - mov rtmp0, rin, lsr #8; \ - strb rin, [rdst, #((offs) + 3)]; \ - mov rtmp1, rin, lsr #16; \ - strb rtmp0, [rdst, #((offs) + 2)]; \ - mov rtmp0, rin, lsr #24; \ - strb rtmp1, [rdst, #((offs) + 1)]; \ - strb rtmp0, [rdst, #((offs) + 0)]; - -#ifdef __ARMEL__ - /* bswap on little-endian */ - #define host_to_be(reg) \ - rev reg, reg; - #define be_to_host(reg) \ - rev reg, reg; -#else - /* nop on big-endian */ - #define host_to_be(reg) /*_*/ - #define be_to_host(reg) /*_*/ -#endif - -#define ldr_input_aligned_be(rin, a, b, c, d) \ - ldr a, [rin, #0]; \ - ldr b, [rin, #4]; \ - be_to_host(a); \ - ldr c, [rin, #8]; \ - be_to_host(b); \ - ldr d, [rin, #12]; \ - be_to_host(c); \ - be_to_host(d); - -#define str_output_aligned_be(rout, a, b, c, d) \ - be_to_host(a); \ - be_to_host(b); \ - str a, [rout, #0]; \ - be_to_host(c); \ - str b, [rout, #4]; \ - be_to_host(d); \ - str c, [rout, #8]; \ - str d, [rout, #12]; - -#ifdef __ARM_FEATURE_UNALIGNED - /* unaligned word reads/writes allowed */ - #define ldr_input_be(rin, ra, rb, rc, rd, rtmp) \ - 
ldr_input_aligned_be(rin, ra, rb, rc, rd) - - #define str_output_be(rout, ra, rb, rc, rd, rtmp0, rtmp1) \ - str_output_aligned_be(rout, ra, rb, rc, rd) -#else - /* need to handle unaligned reads/writes by byte reads */ - #define ldr_input_be(rin, ra, rb, rc, rd, rtmp0) \ - tst rin, #3; \ - beq 1f; \ - ldr_unaligned_be(ra, rin, 0, rtmp0); \ - ldr_unaligned_be(rb, rin, 4, rtmp0); \ - ldr_unaligned_be(rc, rin, 8, rtmp0); \ - ldr_unaligned_be(rd, rin, 12, rtmp0); \ - b 2f; \ - 1:;\ - ldr_input_aligned_be(rin, ra, rb, rc, rd); \ - 2:; - - #define str_output_be(rout, ra, rb, rc, rd, rtmp0, rtmp1) \ - tst rout, #3; \ - beq 1f; \ - str_unaligned_be(ra, rout, 0, rtmp0, rtmp1); \ - str_unaligned_be(rb, rout, 4, rtmp0, rtmp1); \ - str_unaligned_be(rc, rout, 8, rtmp0, rtmp1); \ - str_unaligned_be(rd, rout, 12, rtmp0, rtmp1); \ - b 2f; \ - 1:;\ - str_output_aligned_be(rout, ra, rb, rc, rd); \ - 2:; -#endif - -/********************************************************************** - 1-way camellia - **********************************************************************/ -#define roundsm(xl, xr, kl, kr, yl, yr) \ - ldr RT2, [CTX, #(key_table + ((kl) * 4))]; \ - and IR, RMASK, xr, lsl#(4); /*sp1110*/ \ - ldr RT3, [CTX, #(key_table + ((kr) * 4))]; \ - and IL, RMASK, xl, lsr#(24 - 4); /*sp1110*/ \ - and RT0, RMASK, xr, lsr#(16 - 4); /*sp3033*/ \ - ldr IR, [RTAB1, IR]; \ - and RT1, RMASK, xl, lsr#(8 - 4); /*sp3033*/ \ - eor yl, RT2; \ - ldr IL, [RTAB1, IL]; \ - eor yr, RT3; \ - \ - ldr RT0, [RTAB3, RT0]; \ - add RTAB1, #4; \ - ldr RT1, [RTAB3, RT1]; \ - add RTAB3, #4; \ - \ - and RT2, RMASK, xr, lsr#(24 - 4); /*sp0222*/ \ - and RT3, RMASK, xl, lsr#(16 - 4); /*sp0222*/ \ - \ - eor IR, RT0; \ - eor IL, RT1; \ - \ - ldr RT2, [RTAB1, RT2]; \ - and RT0, RMASK, xr, lsr#(8 - 4); /*sp4404*/ \ - ldr RT3, [RTAB1, RT3]; \ - and RT1, RMASK, xl, lsl#(4); /*sp4404*/ \ - \ - ldr RT0, [RTAB3, RT0]; \ - sub RTAB1, #4; \ - ldr RT1, [RTAB3, RT1]; \ - sub RTAB3, #4; \ - \ - eor IR, RT2; \ - eor IL, RT3; 
\ - eor IR, RT0; \ - eor IL, RT1; \ - \ - eor IR, IL; \ - eor yr, yr, IL, ror#8; \ - eor yl, IR; \ - eor yr, IR; - -#define enc_rounds(n) \ - roundsm(XL, XR, ((n) + 2) * 2 + 0, ((n) + 2) * 2 + 1, YL, YR); \ - roundsm(YL, YR, ((n) + 3) * 2 + 0, ((n) + 3) * 2 + 1, XL, XR); \ - roundsm(XL, XR, ((n) + 4) * 2 + 0, ((n) + 4) * 2 + 1, YL, YR); \ - roundsm(YL, YR, ((n) + 5) * 2 + 0, ((n) + 5) * 2 + 1, XL, XR); \ - roundsm(XL, XR, ((n) + 6) * 2 + 0, ((n) + 6) * 2 + 1, YL, YR); \ - roundsm(YL, YR, ((n) + 7) * 2 + 0, ((n) + 7) * 2 + 1, XL, XR); - -#define dec_rounds(n) \ - roundsm(XL, XR, ((n) + 7) * 2 + 0, ((n) + 7) * 2 + 1, YL, YR); \ - roundsm(YL, YR, ((n) + 6) * 2 + 0, ((n) + 6) * 2 + 1, XL, XR); \ - roundsm(XL, XR, ((n) + 5) * 2 + 0, ((n) + 5) * 2 + 1, YL, YR); \ - roundsm(YL, YR, ((n) + 4) * 2 + 0, ((n) + 4) * 2 + 1, XL, XR); \ - roundsm(XL, XR, ((n) + 3) * 2 + 0, ((n) + 3) * 2 + 1, YL, YR); \ - roundsm(YL, YR, ((n) + 2) * 2 + 0, ((n) + 2) * 2 + 1, XL, XR); - -/* perform FL and FL⁻¹ */ -#define fls(ll, lr, rl, rr, kll, klr, krl, krr) \ - ldr RT0, [CTX, #(key_table + ((kll) * 4))]; \ - ldr RT2, [CTX, #(key_table + ((krr) * 4))]; \ - and RT0, ll; \ - ldr RT3, [CTX, #(key_table + ((krl) * 4))]; \ - orr RT2, rr; \ - ldr RT1, [CTX, #(key_table + ((klr) * 4))]; \ - eor rl, RT2; \ - eor lr, lr, RT0, ror#31; \ - and RT3, rl; \ - orr RT1, lr; \ - eor ll, RT1; \ - eor rr, rr, RT3, ror#31; - -#define enc_fls(n) \ - fls(XL, XR, YL, YR, \ - (n) * 2 + 0, (n) * 2 + 1, \ - (n) * 2 + 2, (n) * 2 + 3); - -#define dec_fls(n) \ - fls(XL, XR, YL, YR, \ - (n) * 2 + 2, (n) * 2 + 3, \ - (n) * 2 + 0, (n) * 2 + 1); - -#define inpack(n) \ - ldr_input_be(%r2, XL, XR, YL, YR, RT0); \ - ldr RT0, [CTX, #(key_table + ((n) * 8) + 0)]; \ - ldr RT1, [CTX, #(key_table + ((n) * 8) + 4)]; \ - eor XL, RT0; \ - eor XR, RT1; - -#define outunpack(n) \ - ldr RT0, [CTX, #(key_table + ((n) * 8) + 0)]; \ - ldr RT1, [CTX, #(key_table + ((n) * 8) + 4)]; \ - eor YL, RT0; \ - eor YR, RT1; \ - str_output_be(%r1, YL, YR, 
XL, XR, RT0, RT1); - -.align 3 -.global _gcry_camellia_armv6_encrypt_block -.type _gcry_camellia_armv6_encrypt_block,%function; - -_gcry_camellia_armv6_encrypt_block: - /* input: - * %r0: keytable - * %r1: dst - * %r2: src - * %r3: keybitlen - */ - push {%r1, %r4-%r11, %ip, %lr}; - - ldr RTAB1, =.Lcamellia_sp1110; - mov RMASK, #0xff; - add RTAB3, RTAB1, #(2 * 4); - push {%r3}; - mov RMASK, RMASK, lsl#4 /* byte mask */ - - inpack(0); - - enc_rounds(0); - enc_fls(8); - enc_rounds(8); - enc_fls(16); - enc_rounds(16); - - pop {RT0}; - cmp RT0, #(16 * 8); - bne .Lenc_256; - - pop {%r1}; - outunpack(24); - - pop {%r4-%r11, %ip, %pc}; -.ltorg - -.Lenc_256: - enc_fls(24); - enc_rounds(24); - - pop {%r1}; - outunpack(32); - - pop {%r4-%r11, %ip, %pc}; -.ltorg -.size _gcry_camellia_armv6_encrypt_block,.-_gcry_camellia_armv6_encrypt_block; - -.align 3 -.global _gcry_camellia_armv6_decrypt_block -.type _gcry_camellia_armv6_decrypt_block,%function; - -_gcry_camellia_armv6_decrypt_block: - /* input: - * %r0: keytable - * %r1: dst - * %r2: src - * %r3: keybitlen - */ - push {%r1, %r4-%r11, %ip, %lr}; - - ldr RTAB1, =.Lcamellia_sp1110; - mov RMASK, #0xff; - add RTAB3, RTAB1, #(2 * 4); - mov RMASK, RMASK, lsl#4 /* byte mask */ - - cmp %r3, #(16 * 8); - bne .Ldec_256; - - inpack(24); - -.Ldec_128: - dec_rounds(16); - dec_fls(16); - dec_rounds(8); - dec_fls(8); - dec_rounds(0); - - pop {%r1}; - outunpack(0); - - pop {%r4-%r11, %ip, %pc}; -.ltorg - -.Ldec_256: - inpack(32); - dec_rounds(24); - dec_fls(24); - - b .Ldec_128; -.ltorg -.size _gcry_camellia_armv6_decrypt_block,.-_gcry_camellia_armv6_decrypt_block; - -.data - -/* Encryption/Decryption tables */ -.align 5 -.Lcamellia_sp1110: -.long 0x70707000 -.Lcamellia_sp0222: - .long 0x00e0e0e0 -.Lcamellia_sp3033: - .long 0x38003838 -.Lcamellia_sp4404: - .long 0x70700070 -.long 0x82828200, 0x00050505, 0x41004141, 0x2c2c002c -.long 0x2c2c2c00, 0x00585858, 0x16001616, 0xb3b300b3 -.long 0xececec00, 0x00d9d9d9, 0x76007676, 0xc0c000c0 -.long 
0xb3b3b300, 0x00676767, 0xd900d9d9, 0xe4e400e4 -.long 0x27272700, 0x004e4e4e, 0x93009393, 0x57570057 -.long 0xc0c0c000, 0x00818181, 0x60006060, 0xeaea00ea -.long 0xe5e5e500, 0x00cbcbcb, 0xf200f2f2, 0xaeae00ae -.long 0xe4e4e400, 0x00c9c9c9, 0x72007272, 0x23230023 -.long 0x85858500, 0x000b0b0b, 0xc200c2c2, 0x6b6b006b -.long 0x57575700, 0x00aeaeae, 0xab00abab, 0x45450045 -.long 0x35353500, 0x006a6a6a, 0x9a009a9a, 0xa5a500a5 -.long 0xeaeaea00, 0x00d5d5d5, 0x75007575, 0xeded00ed -.long 0x0c0c0c00, 0x00181818, 0x06000606, 0x4f4f004f -.long 0xaeaeae00, 0x005d5d5d, 0x57005757, 0x1d1d001d -.long 0x41414100, 0x00828282, 0xa000a0a0, 0x92920092 -.long 0x23232300, 0x00464646, 0x91009191, 0x86860086 -.long 0xefefef00, 0x00dfdfdf, 0xf700f7f7, 0xafaf00af -.long 0x6b6b6b00, 0x00d6d6d6, 0xb500b5b5, 0x7c7c007c -.long 0x93939300, 0x00272727, 0xc900c9c9, 0x1f1f001f -.long 0x45454500, 0x008a8a8a, 0xa200a2a2, 0x3e3e003e -.long 0x19191900, 0x00323232, 0x8c008c8c, 0xdcdc00dc -.long 0xa5a5a500, 0x004b4b4b, 0xd200d2d2, 0x5e5e005e -.long 0x21212100, 0x00424242, 0x90009090, 0x0b0b000b -.long 0xededed00, 0x00dbdbdb, 0xf600f6f6, 0xa6a600a6 -.long 0x0e0e0e00, 0x001c1c1c, 0x07000707, 0x39390039 -.long 0x4f4f4f00, 0x009e9e9e, 0xa700a7a7, 0xd5d500d5 -.long 0x4e4e4e00, 0x009c9c9c, 0x27002727, 0x5d5d005d -.long 0x1d1d1d00, 0x003a3a3a, 0x8e008e8e, 0xd9d900d9 -.long 0x65656500, 0x00cacaca, 0xb200b2b2, 0x5a5a005a -.long 0x92929200, 0x00252525, 0x49004949, 0x51510051 -.long 0xbdbdbd00, 0x007b7b7b, 0xde00dede, 0x6c6c006c -.long 0x86868600, 0x000d0d0d, 0x43004343, 0x8b8b008b -.long 0xb8b8b800, 0x00717171, 0x5c005c5c, 0x9a9a009a -.long 0xafafaf00, 0x005f5f5f, 0xd700d7d7, 0xfbfb00fb -.long 0x8f8f8f00, 0x001f1f1f, 0xc700c7c7, 0xb0b000b0 -.long 0x7c7c7c00, 0x00f8f8f8, 0x3e003e3e, 0x74740074 -.long 0xebebeb00, 0x00d7d7d7, 0xf500f5f5, 0x2b2b002b -.long 0x1f1f1f00, 0x003e3e3e, 0x8f008f8f, 0xf0f000f0 -.long 0xcecece00, 0x009d9d9d, 0x67006767, 0x84840084 -.long 0x3e3e3e00, 0x007c7c7c, 0x1f001f1f, 0xdfdf00df -.long 
0x30303000, 0x00606060, 0x18001818, 0xcbcb00cb -.long 0xdcdcdc00, 0x00b9b9b9, 0x6e006e6e, 0x34340034 -.long 0x5f5f5f00, 0x00bebebe, 0xaf00afaf, 0x76760076 -.long 0x5e5e5e00, 0x00bcbcbc, 0x2f002f2f, 0x6d6d006d -.long 0xc5c5c500, 0x008b8b8b, 0xe200e2e2, 0xa9a900a9 -.long 0x0b0b0b00, 0x00161616, 0x85008585, 0xd1d100d1 -.long 0x1a1a1a00, 0x00343434, 0x0d000d0d, 0x04040004 -.long 0xa6a6a600, 0x004d4d4d, 0x53005353, 0x14140014 -.long 0xe1e1e100, 0x00c3c3c3, 0xf000f0f0, 0x3a3a003a -.long 0x39393900, 0x00727272, 0x9c009c9c, 0xdede00de -.long 0xcacaca00, 0x00959595, 0x65006565, 0x11110011 -.long 0xd5d5d500, 0x00ababab, 0xea00eaea, 0x32320032 -.long 0x47474700, 0x008e8e8e, 0xa300a3a3, 0x9c9c009c -.long 0x5d5d5d00, 0x00bababa, 0xae00aeae, 0x53530053 -.long 0x3d3d3d00, 0x007a7a7a, 0x9e009e9e, 0xf2f200f2 -.long 0xd9d9d900, 0x00b3b3b3, 0xec00ecec, 0xfefe00fe -.long 0x01010100, 0x00020202, 0x80008080, 0xcfcf00cf -.long 0x5a5a5a00, 0x00b4b4b4, 0x2d002d2d, 0xc3c300c3 -.long 0xd6d6d600, 0x00adadad, 0x6b006b6b, 0x7a7a007a -.long 0x51515100, 0x00a2a2a2, 0xa800a8a8, 0x24240024 -.long 0x56565600, 0x00acacac, 0x2b002b2b, 0xe8e800e8 -.long 0x6c6c6c00, 0x00d8d8d8, 0x36003636, 0x60600060 -.long 0x4d4d4d00, 0x009a9a9a, 0xa600a6a6, 0x69690069 -.long 0x8b8b8b00, 0x00171717, 0xc500c5c5, 0xaaaa00aa -.long 0x0d0d0d00, 0x001a1a1a, 0x86008686, 0xa0a000a0 -.long 0x9a9a9a00, 0x00353535, 0x4d004d4d, 0xa1a100a1 -.long 0x66666600, 0x00cccccc, 0x33003333, 0x62620062 -.long 0xfbfbfb00, 0x00f7f7f7, 0xfd00fdfd, 0x54540054 -.long 0xcccccc00, 0x00999999, 0x66006666, 0x1e1e001e -.long 0xb0b0b000, 0x00616161, 0x58005858, 0xe0e000e0 -.long 0x2d2d2d00, 0x005a5a5a, 0x96009696, 0x64640064 -.long 0x74747400, 0x00e8e8e8, 0x3a003a3a, 0x10100010 -.long 0x12121200, 0x00242424, 0x09000909, 0x00000000 -.long 0x2b2b2b00, 0x00565656, 0x95009595, 0xa3a300a3 -.long 0x20202000, 0x00404040, 0x10001010, 0x75750075 -.long 0xf0f0f000, 0x00e1e1e1, 0x78007878, 0x8a8a008a -.long 0xb1b1b100, 0x00636363, 0xd800d8d8, 0xe6e600e6 -.long 
0x84848400, 0x00090909, 0x42004242, 0x09090009 -.long 0x99999900, 0x00333333, 0xcc00cccc, 0xdddd00dd -.long 0xdfdfdf00, 0x00bfbfbf, 0xef00efef, 0x87870087 -.long 0x4c4c4c00, 0x00989898, 0x26002626, 0x83830083 -.long 0xcbcbcb00, 0x00979797, 0xe500e5e5, 0xcdcd00cd -.long 0xc2c2c200, 0x00858585, 0x61006161, 0x90900090 -.long 0x34343400, 0x00686868, 0x1a001a1a, 0x73730073 -.long 0x7e7e7e00, 0x00fcfcfc, 0x3f003f3f, 0xf6f600f6 -.long 0x76767600, 0x00ececec, 0x3b003b3b, 0x9d9d009d -.long 0x05050500, 0x000a0a0a, 0x82008282, 0xbfbf00bf -.long 0x6d6d6d00, 0x00dadada, 0xb600b6b6, 0x52520052 -.long 0xb7b7b700, 0x006f6f6f, 0xdb00dbdb, 0xd8d800d8 -.long 0xa9a9a900, 0x00535353, 0xd400d4d4, 0xc8c800c8 -.long 0x31313100, 0x00626262, 0x98009898, 0xc6c600c6 -.long 0xd1d1d100, 0x00a3a3a3, 0xe800e8e8, 0x81810081 -.long 0x17171700, 0x002e2e2e, 0x8b008b8b, 0x6f6f006f -.long 0x04040400, 0x00080808, 0x02000202, 0x13130013 -.long 0xd7d7d700, 0x00afafaf, 0xeb00ebeb, 0x63630063 -.long 0x14141400, 0x00282828, 0x0a000a0a, 0xe9e900e9 -.long 0x58585800, 0x00b0b0b0, 0x2c002c2c, 0xa7a700a7 -.long 0x3a3a3a00, 0x00747474, 0x1d001d1d, 0x9f9f009f -.long 0x61616100, 0x00c2c2c2, 0xb000b0b0, 0xbcbc00bc -.long 0xdedede00, 0x00bdbdbd, 0x6f006f6f, 0x29290029 -.long 0x1b1b1b00, 0x00363636, 0x8d008d8d, 0xf9f900f9 -.long 0x11111100, 0x00222222, 0x88008888, 0x2f2f002f -.long 0x1c1c1c00, 0x00383838, 0x0e000e0e, 0xb4b400b4 -.long 0x32323200, 0x00646464, 0x19001919, 0x78780078 -.long 0x0f0f0f00, 0x001e1e1e, 0x87008787, 0x06060006 -.long 0x9c9c9c00, 0x00393939, 0x4e004e4e, 0xe7e700e7 -.long 0x16161600, 0x002c2c2c, 0x0b000b0b, 0x71710071 -.long 0x53535300, 0x00a6a6a6, 0xa900a9a9, 0xd4d400d4 -.long 0x18181800, 0x00303030, 0x0c000c0c, 0xabab00ab -.long 0xf2f2f200, 0x00e5e5e5, 0x79007979, 0x88880088 -.long 0x22222200, 0x00444444, 0x11001111, 0x8d8d008d -.long 0xfefefe00, 0x00fdfdfd, 0x7f007f7f, 0x72720072 -.long 0x44444400, 0x00888888, 0x22002222, 0xb9b900b9 -.long 0xcfcfcf00, 0x009f9f9f, 0xe700e7e7, 0xf8f800f8 -.long 
0xb2b2b200, 0x00656565, 0x59005959, 0xacac00ac -.long 0xc3c3c300, 0x00878787, 0xe100e1e1, 0x36360036 -.long 0xb5b5b500, 0x006b6b6b, 0xda00dada, 0x2a2a002a -.long 0x7a7a7a00, 0x00f4f4f4, 0x3d003d3d, 0x3c3c003c -.long 0x91919100, 0x00232323, 0xc800c8c8, 0xf1f100f1 -.long 0x24242400, 0x00484848, 0x12001212, 0x40400040 -.long 0x08080800, 0x00101010, 0x04000404, 0xd3d300d3 -.long 0xe8e8e800, 0x00d1d1d1, 0x74007474, 0xbbbb00bb -.long 0xa8a8a800, 0x00515151, 0x54005454, 0x43430043 -.long 0x60606000, 0x00c0c0c0, 0x30003030, 0x15150015 -.long 0xfcfcfc00, 0x00f9f9f9, 0x7e007e7e, 0xadad00ad -.long 0x69696900, 0x00d2d2d2, 0xb400b4b4, 0x77770077 -.long 0x50505000, 0x00a0a0a0, 0x28002828, 0x80800080 -.long 0xaaaaaa00, 0x00555555, 0x55005555, 0x82820082 -.long 0xd0d0d000, 0x00a1a1a1, 0x68006868, 0xecec00ec -.long 0xa0a0a000, 0x00414141, 0x50005050, 0x27270027 -.long 0x7d7d7d00, 0x00fafafa, 0xbe00bebe, 0xe5e500e5 -.long 0xa1a1a100, 0x00434343, 0xd000d0d0, 0x85850085 -.long 0x89898900, 0x00131313, 0xc400c4c4, 0x35350035 -.long 0x62626200, 0x00c4c4c4, 0x31003131, 0x0c0c000c -.long 0x97979700, 0x002f2f2f, 0xcb00cbcb, 0x41410041 -.long 0x54545400, 0x00a8a8a8, 0x2a002a2a, 0xefef00ef -.long 0x5b5b5b00, 0x00b6b6b6, 0xad00adad, 0x93930093 -.long 0x1e1e1e00, 0x003c3c3c, 0x0f000f0f, 0x19190019 -.long 0x95959500, 0x002b2b2b, 0xca00caca, 0x21210021 -.long 0xe0e0e000, 0x00c1c1c1, 0x70007070, 0x0e0e000e -.long 0xffffff00, 0x00ffffff, 0xff00ffff, 0x4e4e004e -.long 0x64646400, 0x00c8c8c8, 0x32003232, 0x65650065 -.long 0xd2d2d200, 0x00a5a5a5, 0x69006969, 0xbdbd00bd -.long 0x10101000, 0x00202020, 0x08000808, 0xb8b800b8 -.long 0xc4c4c400, 0x00898989, 0x62006262, 0x8f8f008f -.long 0x00000000, 0x00000000, 0x00000000, 0xebeb00eb -.long 0x48484800, 0x00909090, 0x24002424, 0xcece00ce -.long 0xa3a3a300, 0x00474747, 0xd100d1d1, 0x30300030 -.long 0xf7f7f700, 0x00efefef, 0xfb00fbfb, 0x5f5f005f -.long 0x75757500, 0x00eaeaea, 0xba00baba, 0xc5c500c5 -.long 0xdbdbdb00, 0x00b7b7b7, 0xed00eded, 0x1a1a001a -.long 
0x8a8a8a00, 0x00151515, 0x45004545, 0xe1e100e1 -.long 0x03030300, 0x00060606, 0x81008181, 0xcaca00ca -.long 0xe6e6e600, 0x00cdcdcd, 0x73007373, 0x47470047 -.long 0xdadada00, 0x00b5b5b5, 0x6d006d6d, 0x3d3d003d -.long 0x09090900, 0x00121212, 0x84008484, 0x01010001 -.long 0x3f3f3f00, 0x007e7e7e, 0x9f009f9f, 0xd6d600d6 -.long 0xdddddd00, 0x00bbbbbb, 0xee00eeee, 0x56560056 -.long 0x94949400, 0x00292929, 0x4a004a4a, 0x4d4d004d -.long 0x87878700, 0x000f0f0f, 0xc300c3c3, 0x0d0d000d -.long 0x5c5c5c00, 0x00b8b8b8, 0x2e002e2e, 0x66660066 -.long 0x83838300, 0x00070707, 0xc100c1c1, 0xcccc00cc -.long 0x02020200, 0x00040404, 0x01000101, 0x2d2d002d -.long 0xcdcdcd00, 0x009b9b9b, 0xe600e6e6, 0x12120012 -.long 0x4a4a4a00, 0x00949494, 0x25002525, 0x20200020 -.long 0x90909000, 0x00212121, 0x48004848, 0xb1b100b1 -.long 0x33333300, 0x00666666, 0x99009999, 0x99990099 -.long 0x73737300, 0x00e6e6e6, 0xb900b9b9, 0x4c4c004c -.long 0x67676700, 0x00cecece, 0xb300b3b3, 0xc2c200c2 -.long 0xf6f6f600, 0x00ededed, 0x7b007b7b, 0x7e7e007e -.long 0xf3f3f300, 0x00e7e7e7, 0xf900f9f9, 0x05050005 -.long 0x9d9d9d00, 0x003b3b3b, 0xce00cece, 0xb7b700b7 -.long 0x7f7f7f00, 0x00fefefe, 0xbf00bfbf, 0x31310031 -.long 0xbfbfbf00, 0x007f7f7f, 0xdf00dfdf, 0x17170017 -.long 0xe2e2e200, 0x00c5c5c5, 0x71007171, 0xd7d700d7 -.long 0x52525200, 0x00a4a4a4, 0x29002929, 0x58580058 -.long 0x9b9b9b00, 0x00373737, 0xcd00cdcd, 0x61610061 -.long 0xd8d8d800, 0x00b1b1b1, 0x6c006c6c, 0x1b1b001b -.long 0x26262600, 0x004c4c4c, 0x13001313, 0x1c1c001c -.long 0xc8c8c800, 0x00919191, 0x64006464, 0x0f0f000f -.long 0x37373700, 0x006e6e6e, 0x9b009b9b, 0x16160016 -.long 0xc6c6c600, 0x008d8d8d, 0x63006363, 0x18180018 -.long 0x3b3b3b00, 0x00767676, 0x9d009d9d, 0x22220022 -.long 0x81818100, 0x00030303, 0xc000c0c0, 0x44440044 -.long 0x96969600, 0x002d2d2d, 0x4b004b4b, 0xb2b200b2 -.long 0x6f6f6f00, 0x00dedede, 0xb700b7b7, 0xb5b500b5 -.long 0x4b4b4b00, 0x00969696, 0xa500a5a5, 0x91910091 -.long 0x13131300, 0x00262626, 0x89008989, 0x08080008 -.long 
0xbebebe00, 0x007d7d7d, 0x5f005f5f, 0xa8a800a8 -.long 0x63636300, 0x00c6c6c6, 0xb100b1b1, 0xfcfc00fc -.long 0x2e2e2e00, 0x005c5c5c, 0x17001717, 0x50500050 -.long 0xe9e9e900, 0x00d3d3d3, 0xf400f4f4, 0xd0d000d0 -.long 0x79797900, 0x00f2f2f2, 0xbc00bcbc, 0x7d7d007d -.long 0xa7a7a700, 0x004f4f4f, 0xd300d3d3, 0x89890089 -.long 0x8c8c8c00, 0x00191919, 0x46004646, 0x97970097 -.long 0x9f9f9f00, 0x003f3f3f, 0xcf00cfcf, 0x5b5b005b -.long 0x6e6e6e00, 0x00dcdcdc, 0x37003737, 0x95950095 -.long 0xbcbcbc00, 0x00797979, 0x5e005e5e, 0xffff00ff -.long 0x8e8e8e00, 0x001d1d1d, 0x47004747, 0xd2d200d2 -.long 0x29292900, 0x00525252, 0x94009494, 0xc4c400c4 -.long 0xf5f5f500, 0x00ebebeb, 0xfa00fafa, 0x48480048 -.long 0xf9f9f900, 0x00f3f3f3, 0xfc00fcfc, 0xf7f700f7 -.long 0xb6b6b600, 0x006d6d6d, 0x5b005b5b, 0xdbdb00db -.long 0x2f2f2f00, 0x005e5e5e, 0x97009797, 0x03030003 -.long 0xfdfdfd00, 0x00fbfbfb, 0xfe00fefe, 0xdada00da -.long 0xb4b4b400, 0x00696969, 0x5a005a5a, 0x3f3f003f -.long 0x59595900, 0x00b2b2b2, 0xac00acac, 0x94940094 -.long 0x78787800, 0x00f0f0f0, 0x3c003c3c, 0x5c5c005c -.long 0x98989800, 0x00313131, 0x4c004c4c, 0x02020002 -.long 0x06060600, 0x000c0c0c, 0x03000303, 0x4a4a004a -.long 0x6a6a6a00, 0x00d4d4d4, 0x35003535, 0x33330033 -.long 0xe7e7e700, 0x00cfcfcf, 0xf300f3f3, 0x67670067 -.long 0x46464600, 0x008c8c8c, 0x23002323, 0xf3f300f3 -.long 0x71717100, 0x00e2e2e2, 0xb800b8b8, 0x7f7f007f -.long 0xbababa00, 0x00757575, 0x5d005d5d, 0xe2e200e2 -.long 0xd4d4d400, 0x00a9a9a9, 0x6a006a6a, 0x9b9b009b -.long 0x25252500, 0x004a4a4a, 0x92009292, 0x26260026 -.long 0xababab00, 0x00575757, 0xd500d5d5, 0x37370037 -.long 0x42424200, 0x00848484, 0x21002121, 0x3b3b003b -.long 0x88888800, 0x00111111, 0x44004444, 0x96960096 -.long 0xa2a2a200, 0x00454545, 0x51005151, 0x4b4b004b -.long 0x8d8d8d00, 0x001b1b1b, 0xc600c6c6, 0xbebe00be -.long 0xfafafa00, 0x00f5f5f5, 0x7d007d7d, 0x2e2e002e -.long 0x72727200, 0x00e4e4e4, 0x39003939, 0x79790079 -.long 0x07070700, 0x000e0e0e, 0x83008383, 0x8c8c008c -.long 
0xb9b9b900, 0x00737373, 0xdc00dcdc, 0x6e6e006e -.long 0x55555500, 0x00aaaaaa, 0xaa00aaaa, 0x8e8e008e -.long 0xf8f8f800, 0x00f1f1f1, 0x7c007c7c, 0xf5f500f5 -.long 0xeeeeee00, 0x00dddddd, 0x77007777, 0xb6b600b6 -.long 0xacacac00, 0x00595959, 0x56005656, 0xfdfd00fd -.long 0x0a0a0a00, 0x00141414, 0x05000505, 0x59590059 -.long 0x36363600, 0x006c6c6c, 0x1b001b1b, 0x98980098 -.long 0x49494900, 0x00929292, 0xa400a4a4, 0x6a6a006a -.long 0x2a2a2a00, 0x00545454, 0x15001515, 0x46460046 -.long 0x68686800, 0x00d0d0d0, 0x34003434, 0xbaba00ba -.long 0x3c3c3c00, 0x00787878, 0x1e001e1e, 0x25250025 -.long 0x38383800, 0x00707070, 0x1c001c1c, 0x42420042 -.long 0xf1f1f100, 0x00e3e3e3, 0xf800f8f8, 0xa2a200a2 -.long 0xa4a4a400, 0x00494949, 0x52005252, 0xfafa00fa -.long 0x40404000, 0x00808080, 0x20002020, 0x07070007 -.long 0x28282800, 0x00505050, 0x14001414, 0x55550055 -.long 0xd3d3d300, 0x00a7a7a7, 0xe900e9e9, 0xeeee00ee -.long 0x7b7b7b00, 0x00f6f6f6, 0xbd00bdbd, 0x0a0a000a -.long 0xbbbbbb00, 0x00777777, 0xdd00dddd, 0x49490049 -.long 0xc9c9c900, 0x00939393, 0xe400e4e4, 0x68680068 -.long 0x43434300, 0x00868686, 0xa100a1a1, 0x38380038 -.long 0xc1c1c100, 0x00838383, 0xe000e0e0, 0xa4a400a4 -.long 0x15151500, 0x002a2a2a, 0x8a008a8a, 0x28280028 -.long 0xe3e3e300, 0x00c7c7c7, 0xf100f1f1, 0x7b7b007b -.long 0xadadad00, 0x005b5b5b, 0xd600d6d6, 0xc9c900c9 -.long 0xf4f4f400, 0x00e9e9e9, 0x7a007a7a, 0xc1c100c1 -.long 0x77777700, 0x00eeeeee, 0xbb00bbbb, 0xe3e300e3 -.long 0xc7c7c700, 0x008f8f8f, 0xe300e3e3, 0xf4f400f4 -.long 0x80808000, 0x00010101, 0x40004040, 0xc7c700c7 -.long 0x9e9e9e00, 0x003d3d3d, 0x4f004f4f, 0x9e9e009e - -#endif /*HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS*/ -#endif /*__ARM_ARCH >= 6*/ diff --git a/cipher/camellia-glue.c b/cipher/camellia-glue.c index 29cb7a5..e6d4029 100644 --- a/cipher/camellia-glue.c +++ b/cipher/camellia-glue.c @@ -193,14 +193,14 @@ camellia_setkey(void *c, const byte *key, unsigned keylen) return 0; } -#ifdef USE_ARMV6_ASM +#ifdef USE_ARM_ASM /* Assembly 
implementations of Camellia. */ -extern void _gcry_camellia_armv6_encrypt_block(const KEY_TABLE_TYPE keyTable, +extern void _gcry_camellia_arm_encrypt_block(const KEY_TABLE_TYPE keyTable, byte *outbuf, const byte *inbuf, const int keybits); -extern void _gcry_camellia_armv6_decrypt_block(const KEY_TABLE_TYPE keyTable, +extern void _gcry_camellia_arm_decrypt_block(const KEY_TABLE_TYPE keyTable, byte *outbuf, const byte *inbuf, const int keybits); @@ -209,7 +209,7 @@ static void Camellia_EncryptBlock(const int keyBitLength, const KEY_TABLE_TYPE keyTable, unsigned char *cipherText) { - _gcry_camellia_armv6_encrypt_block(keyTable, cipherText, plaintext, + _gcry_camellia_arm_encrypt_block(keyTable, cipherText, plaintext, keyBitLength); } @@ -218,7 +218,7 @@ static void Camellia_DecryptBlock(const int keyBitLength, const KEY_TABLE_TYPE keyTable, unsigned char *plaintext) { - _gcry_camellia_armv6_decrypt_block(keyTable, plaintext, cipherText, + _gcry_camellia_arm_decrypt_block(keyTable, plaintext, cipherText, keyBitLength); } @@ -240,7 +240,7 @@ camellia_decrypt(void *c, byte *outbuf, const byte *inbuf) return /*burn_stack*/ (CAMELLIA_decrypt_stack_burn_size); } -#else /*USE_ARMV6_ASM*/ +#else /*USE_ARM_ASM*/ static unsigned int camellia_encrypt(void *c, byte *outbuf, const byte *inbuf) @@ -276,7 +276,7 @@ camellia_decrypt(void *c, byte *outbuf, const byte *inbuf) return /*burn_stack*/ (CAMELLIA_decrypt_stack_burn_size); } -#endif /*!USE_ARMV6_ASM*/ +#endif /*!USE_ARM_ASM*/ /* Bulk encryption of complete blocks in CTR mode. This function is only intended for the bulk encryption feature of cipher.c.
CTR is expected to be diff --git a/cipher/camellia.c b/cipher/camellia.c index 03510a3..9067246 100644 --- a/cipher/camellia.c +++ b/cipher/camellia.c @@ -861,7 +861,7 @@ void camellia_setup192(const unsigned char *key, u32 *subkey) } -#ifndef USE_ARMV6_ASM +#ifndef USE_ARM_ASM /** * Stuff related to camellia encryption/decryption * @@ -1321,7 +1321,7 @@ void camellia_decrypt256(const u32 *subkey, u32 *blocks) return; } -#endif /*!USE_ARMV6_ASM*/ +#endif /*!USE_ARM_ASM*/ /*** @@ -1349,7 +1349,7 @@ void Camellia_Ekeygen(const int keyBitLength, } -#ifndef USE_ARMV6_ASM +#ifndef USE_ARM_ASM void Camellia_EncryptBlock(const int keyBitLength, const unsigned char *plaintext, const KEY_TABLE_TYPE keyTable, @@ -1410,4 +1410,4 @@ void Camellia_DecryptBlock(const int keyBitLength, PUTU32(plaintext + 8, tmp[2]); PUTU32(plaintext + 12, tmp[3]); } -#endif /*!USE_ARMV6_ASM*/ +#endif /*!USE_ARM_ASM*/ diff --git a/cipher/camellia.h b/cipher/camellia.h index 72f2d1f..20faa2c 100644 --- a/cipher/camellia.h +++ b/cipher/camellia.h @@ -32,7 +32,7 @@ #include /* USE_ARMV6_ASM indicates whether to use ARMv6 assembly code. */ # undef USE_ARMV6_ASM -# if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) +# if defined(__ARMEL__) # ifdef HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS # define USE_ARMV6_ASM 1 # endif diff --git a/cipher/cast5-arm.S b/cipher/cast5-arm.S new file mode 100644 index 0000000..ce7fa93 --- /dev/null +++ b/cipher/cast5-arm.S @@ -0,0 +1,715 @@ +/* cast5-arm.S - ARM assembly implementation of CAST5 cipher + * + * Copyright © 2013 Jussi Kivilinna + * + * This file is part of Libgcrypt. + * + * Libgcrypt is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as + * published by the Free Software Foundation; either version 2.1 of + * the License, or (at your option) any later version.
+ * + * Libgcrypt is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this program; if not, see . + */ + +#include + +#if defined(__ARMEL__) +#ifdef HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS + +.text + +.syntax unified +.arm + +.extern _gcry_cast5_s1to4; + +/* structure of crypto context */ +#define Km 0 +#define Kr (Km + (16 * 4)) +#define Kr_arm_enc (Kr + (16)) +#define Kr_arm_dec (Kr_arm_enc + (16)) + +/* register macros */ +#define CTX %r0 +#define Rs1 %r7 +#define Rs2 %r8 +#define Rs3 %r9 +#define Rs4 %r10 +#define RMASK %r11 +#define RKM %r1 +#define RKR %r2 + +#define RL0 %r3 +#define RR0 %r4 + +#define RL1 %r9 +#define RR1 %r10 + +#define RT0 %lr +#define RT1 %ip +#define RT2 %r5 +#define RT3 %r6 + +/* helper macros */ +#define ldr_unaligned_le(rout, rsrc, offs, rtmp) \ + ldrb rout, [rsrc, #((offs) + 0)]; \ + ldrb rtmp, [rsrc, #((offs) + 1)]; \ + orr rout, rout, rtmp, lsl #8; \ + ldrb rtmp, [rsrc, #((offs) + 2)]; \ + orr rout, rout, rtmp, lsl #16; \ + ldrb rtmp, [rsrc, #((offs) + 3)]; \ + orr rout, rout, rtmp, lsl #24; + +#define str_unaligned_le(rin, rdst, offs, rtmp0, rtmp1) \ + mov rtmp0, rin, lsr #8; \ + strb rin, [rdst, #((offs) + 0)]; \ + mov rtmp1, rin, lsr #16; \ + strb rtmp0, [rdst, #((offs) + 1)]; \ + mov rtmp0, rin, lsr #24; \ + strb rtmp1, [rdst, #((offs) + 2)]; \ + strb rtmp0, [rdst, #((offs) + 3)]; + +#define ldr_unaligned_be(rout, rsrc, offs, rtmp) \ + ldrb rout, [rsrc, #((offs) + 3)]; \ + ldrb rtmp, [rsrc, #((offs) + 2)]; \ + orr rout, rout, rtmp, lsl #8; \ + ldrb rtmp, [rsrc, #((offs) + 1)]; \ + orr rout, rout, rtmp, lsl #16; \ + ldrb rtmp, [rsrc, #((offs) + 0)]; \ + orr rout, rout, rtmp, lsl #24; + +#define str_unaligned_be(rin, rdst, offs, rtmp0, rtmp1) \ + mov 
rtmp0, rin, lsr #8; \ + strb rin, [rdst, #((offs) + 3)]; \ + mov rtmp1, rin, lsr #16; \ + strb rtmp0, [rdst, #((offs) + 2)]; \ + mov rtmp0, rin, lsr #24; \ + strb rtmp1, [rdst, #((offs) + 1)]; \ + strb rtmp0, [rdst, #((offs) + 0)]; + +#ifdef __ARMEL__ + #define ldr_unaligned_host ldr_unaligned_le + #define str_unaligned_host str_unaligned_le + + /* bswap on little-endian */ +#ifdef HAVE_ARM_ARCH_V6 + #define host_to_be(reg, rtmp) \ + rev reg, reg; + #define be_to_host(reg, rtmp) \ + rev reg, reg; +#else + #define host_to_be(reg, rtmp) \ + eor rtmp, reg, reg, ror #16; \ + mov rtmp, rtmp, lsr #8; \ + bic rtmp, rtmp, #65280; \ + eor reg, rtmp, reg, ror #8; + #define be_to_host(reg, rtmp) \ + eor rtmp, reg, reg, ror #16; \ + mov rtmp, rtmp, lsr #8; \ + bic rtmp, rtmp, #65280; \ + eor reg, rtmp, reg, ror #8; +#endif +#else + #define ldr_unaligned_host ldr_unaligned_be + #define str_unaligned_host str_unaligned_be + + /* nop on big-endian */ + #define host_to_be(reg, rtmp) /*_*/ + #define be_to_host(reg, rtmp) /*_*/ +#endif + +#define host_to_host(x, y) /*_*/ + +/********************************************************************** + 1-way cast5 + **********************************************************************/ + +#define dummy(n) /*_*/ + +#define load_kr(n) \ + ldr RKR, [CTX, #(Kr_arm_enc + (n))]; /* Kr[n] */ + +#define load_dec_kr(n) \ + ldr RKR, [CTX, #(Kr_arm_dec + (n) - 3)]; /* Kr[n] */ + +#define load_km(n) \ + ldr RKM, [CTX, #(Km + (n) * 4)]; /* Km[n] */ + +#define shift_kr(dummy) \ + mov RKR, RKR, lsr #8; + +#define F(n, rl, rr, op1, op2, op3, op4, dec, loadkm, shiftkr, loadkr) \ + op1 RKM, rr; \ + mov RKM, RKM, ror RKR; \ + \ + and RT0, RMASK, RKM, ror #(24); \ + and RT1, RMASK, RKM, lsr #(16); \ + and RT2, RMASK, RKM, lsr #(8); \ + ldr RT0, [Rs1, RT0]; \ + and RT3, RMASK, RKM; \ + ldr RT1, [Rs2, RT1]; \ + shiftkr(RKR); \ + \ + ldr RT2, [Rs3, RT2]; \ + \ + op2 RT0, RT1; \ + ldr RT3, [Rs4, RT3]; \ + op3 RT0, RT2; \ + loadkm((n) + (1 - ((dec) * 2))); \ + 
op4 RT0, RT3; \ + loadkr((n) + (1 - ((dec) * 2))); \ + eor rl, RT0; + +#define F1(n, rl, rr, dec, loadkm, shiftkr, loadkr) \ + F(n, rl, rr, add, eor, sub, add, dec, loadkm, shiftkr, loadkr) +#define F2(n, rl, rr, dec, loadkm, shiftkr, loadkr) \ + F(n, rl, rr, eor, sub, add, eor, dec, loadkm, shiftkr, loadkr) +#define F3(n, rl, rr, dec, loadkm, shiftkr, loadkr) \ + F(n, rl, rr, sub, add, eor, sub, dec, loadkm, shiftkr, loadkr) + +#define enc_round(n, Fx, rl, rr, loadkm, shiftkr, loadkr) \ + Fx(n, rl, rr, 0, loadkm, shiftkr, loadkr) + +#define dec_round(n, Fx, rl, rr, loadkm, shiftkr, loadkr) \ + Fx(n, rl, rr, 1, loadkm, shiftkr, loadkr) + +#define read_block_aligned(rin, offs, l0, r0, convert, rtmp) \ + ldr l0, [rin, #((offs) + 0)]; \ + ldr r0, [rin, #((offs) + 4)]; \ + convert(l0, rtmp); \ + convert(r0, rtmp); + +#define write_block_aligned(rout, offs, l0, r0, convert, rtmp) \ + convert(l0, rtmp); \ + convert(r0, rtmp); \ + str l0, [rout, #((offs) + 0)]; \ + str r0, [rout, #((offs) + 4)]; + +#ifdef __ARM_FEATURE_UNALIGNED + /* unaligned word reads allowed */ + #define read_block(rin, offs, l0, r0, rtmp0) \ + read_block_aligned(rin, offs, l0, r0, host_to_be, rtmp0) + + #define write_block(rout, offs, r0, l0, rtmp0, rtmp1) \ + write_block_aligned(rout, offs, r0, l0, be_to_host, rtmp0) + + #define read_block_host(rin, offs, l0, r0, rtmp0) \ + read_block_aligned(rin, offs, l0, r0, host_to_host, rtmp0) + + #define write_block_host(rout, offs, r0, l0, rtmp0, rtmp1) \ + write_block_aligned(rout, offs, r0, l0, host_to_host, rtmp0) +#else + /* need to handle unaligned reads by byte reads */ + #define read_block(rin, offs, l0, r0, rtmp0) \ + tst rin, #3; \ + beq 1f; \ + ldr_unaligned_be(l0, rin, (offs) + 0, rtmp0); \ + ldr_unaligned_be(r0, rin, (offs) + 4, rtmp0); \ + b 2f; \ + 1:;\ + read_block_aligned(rin, offs, l0, r0, host_to_be, rtmp0); \ + 2:; + + #define write_block(rout, offs, l0, r0, rtmp0, rtmp1) \ + tst rout, #3; \ + beq 1f; \ + str_unaligned_be(l0, rout, (offs) + 
0, rtmp0, rtmp1); \ + str_unaligned_be(r0, rout, (offs) + 4, rtmp0, rtmp1); \ + b 2f; \ + 1:;\ + write_block_aligned(rout, offs, l0, r0, be_to_host, rtmp0); \ + 2:; + + #define read_block_host(rin, offs, l0, r0, rtmp0) \ + tst rin, #3; \ + beq 1f; \ + ldr_unaligned_host(l0, rin, (offs) + 0, rtmp0); \ + ldr_unaligned_host(r0, rin, (offs) + 4, rtmp0); \ + b 2f; \ + 1:;\ + read_block_aligned(rin, offs, l0, r0, host_to_host, rtmp0); \ + 2:; + + #define write_block_host(rout, offs, l0, r0, rtmp0, rtmp1) \ + tst rout, #3; \ + beq 1f; \ + str_unaligned_host(l0, rout, (offs) + 0, rtmp0, rtmp1); \ + str_unaligned_host(r0, rout, (offs) + 4, rtmp0, rtmp1); \ + b 2f; \ + 1:;\ + write_block_aligned(rout, offs, l0, r0, host_to_host, rtmp0); \ + 2:; +#endif + +.align 3 +.globl _gcry_cast5_arm_encrypt_block +.type _gcry_cast5_arm_encrypt_block,%function; + +_gcry_cast5_arm_encrypt_block: + /* input: + * %r0: CTX + * %r1: dst + * %r2: src + */ + push {%r1, %r4-%r11, %ip, %lr}; + + ldr Rs1, =_gcry_cast5_s1to4; + mov RMASK, #(0xff << 2); + add Rs2, Rs1, #(0x100*4); + add Rs3, Rs1, #(0x100*4*2); + add Rs4, Rs1, #(0x100*4*3); + + read_block(%r2, 0, RL0, RR0, RT0); + + load_km(0); + load_kr(0); + enc_round(0, F1, RL0, RR0, load_km, shift_kr, dummy); + enc_round(1, F2, RR0, RL0, load_km, shift_kr, dummy); + enc_round(2, F3, RL0, RR0, load_km, shift_kr, dummy); + enc_round(3, F1, RR0, RL0, load_km, dummy, load_kr); + enc_round(4, F2, RL0, RR0, load_km, shift_kr, dummy); + enc_round(5, F3, RR0, RL0, load_km, shift_kr, dummy); + enc_round(6, F1, RL0, RR0, load_km, shift_kr, dummy); + enc_round(7, F2, RR0, RL0, load_km, dummy, load_kr); + enc_round(8, F3, RL0, RR0, load_km, shift_kr, dummy); + enc_round(9, F1, RR0, RL0, load_km, shift_kr, dummy); + enc_round(10, F2, RL0, RR0, load_km, shift_kr, dummy); + enc_round(11, F3, RR0, RL0, load_km, dummy, load_kr); + enc_round(12, F1, RL0, RR0, load_km, shift_kr, dummy); + enc_round(13, F2, RR0, RL0, load_km, shift_kr, dummy); + enc_round(14, F3, 
RL0, RR0, load_km, shift_kr, dummy); + enc_round(15, F1, RR0, RL0, dummy, dummy, dummy); + + ldr %r1, [%sp], #4; + write_block(%r1, 0, RR0, RL0, RT0, RT1); + + pop {%r4-%r11, %ip, %pc}; +.ltorg +.size _gcry_cast5_arm_encrypt_block,.-_gcry_cast5_arm_encrypt_block; + +.align 3 +.globl _gcry_cast5_arm_decrypt_block +.type _gcry_cast5_arm_decrypt_block,%function; + +_gcry_cast5_arm_decrypt_block: + /* input: + * %r0: CTX + * %r1: dst + * %r2: src + */ + push {%r1, %r4-%r11, %ip, %lr}; + + ldr Rs1, =_gcry_cast5_s1to4; + mov RMASK, #(0xff << 2); + add Rs2, Rs1, #(0x100 * 4); + add Rs3, Rs1, #(0x100 * 4 * 2); + add Rs4, Rs1, #(0x100 * 4 * 3); + + read_block(%r2, 0, RL0, RR0, RT0); + + load_km(15); + load_dec_kr(15); + dec_round(15, F1, RL0, RR0, load_km, shift_kr, dummy); + dec_round(14, F3, RR0, RL0, load_km, shift_kr, dummy); + dec_round(13, F2, RL0, RR0, load_km, shift_kr, dummy); + dec_round(12, F1, RR0, RL0, load_km, dummy, load_dec_kr); + dec_round(11, F3, RL0, RR0, load_km, shift_kr, dummy); + dec_round(10, F2, RR0, RL0, load_km, shift_kr, dummy); + dec_round(9, F1, RL0, RR0, load_km, shift_kr, dummy); + dec_round(8, F3, RR0, RL0, load_km, dummy, load_dec_kr); + dec_round(7, F2, RL0, RR0, load_km, shift_kr, dummy); + dec_round(6, F1, RR0, RL0, load_km, shift_kr, dummy); + dec_round(5, F3, RL0, RR0, load_km, shift_kr, dummy); + dec_round(4, F2, RR0, RL0, load_km, dummy, load_dec_kr); + dec_round(3, F1, RL0, RR0, load_km, shift_kr, dummy); + dec_round(2, F3, RR0, RL0, load_km, shift_kr, dummy); + dec_round(1, F2, RL0, RR0, load_km, shift_kr, dummy); + dec_round(0, F1, RR0, RL0, dummy, dummy, dummy); + + ldr %r1, [%sp], #4; + write_block(%r1, 0, RR0, RL0, RT0, RT1); + + pop {%r4-%r11, %ip, %pc}; +.ltorg +.size _gcry_cast5_arm_decrypt_block,.-_gcry_cast5_arm_decrypt_block; + +/********************************************************************** + 2-way cast5 + **********************************************************************/ + +#define F_2w(n, rl0, rr0, rl1, 
rr1, op1, op2, op3, op4, dec, loadkm, shiftkr, \ + loadkr) \ + op1 RT3, RKM, rr0; \ + op1 RKM, RKM, rr1; \ + mov RT3, RT3, ror RKR; \ + mov RKM, RKM, ror RKR; \ + \ + and RT0, RMASK, RT3, ror #(24); \ + and RT1, RMASK, RT3, lsr #(16); \ + and RT2, RMASK, RT3, lsr #(8); \ + and RT3, RMASK, RT3; \ + \ + ldr RT0, [Rs1, RT0]; \ + add RT2, #(0x100 * 4); \ + ldr RT1, [Rs2, RT1]; \ + add RT3, #(0x100 * 4 * 2); \ + \ + ldr RT2, [Rs2, RT2]; \ + \ + op2 RT0, RT1; \ + ldr RT3, [Rs2, RT3]; \ + and RT1, RMASK, RKM, ror #(24); \ + op3 RT0, RT2; \ + and RT2, RMASK, RKM, lsr #(16); \ + op4 RT0, RT3; \ + and RT3, RMASK, RKM, lsr #(8); \ + eor rl0, RT0; \ + add RT3, #(0x100 * 4); \ + ldr RT1, [Rs1, RT1]; \ + and RT0, RMASK, RKM; \ + ldr RT2, [Rs2, RT2]; \ + add RT0, #(0x100 * 4 * 2); \ + \ + ldr RT3, [Rs2, RT3]; \ + \ + op2 RT1, RT2; \ + ldr RT0, [Rs2, RT0]; \ + op3 RT1, RT3; \ + loadkm((n) + (1 - ((dec) * 2))); \ + op4 RT1, RT0; \ + loadkr((n) + (1 - ((dec) * 2))); \ + shiftkr(RKR); \ + eor rl1, RT1; + +#define F1_2w(n, rl0, rr0, rl1, rr1, dec, loadkm, shiftkr, loadkr) \ + F_2w(n, rl0, rr0, rl1, rr1, add, eor, sub, add, dec, \ + loadkm, shiftkr, loadkr) +#define F2_2w(n, rl0, rr0, rl1, rr1, dec, loadkm, shiftkr, loadkr) \ + F_2w(n, rl0, rr0, rl1, rr1, eor, sub, add, eor, dec, \ + loadkm, shiftkr, loadkr) +#define F3_2w(n, rl0, rr0, rl1, rr1, dec, loadkm, shiftkr, loadkr) \ + F_2w(n, rl0, rr0, rl1, rr1, sub, add, eor, sub, dec, \ + loadkm, shiftkr, loadkr) + +#define enc_round2(n, Fx, rl, rr, loadkm, shiftkr, loadkr) \ + Fx##_2w(n, rl##0, rr##0, rl##1, rr##1, 0, loadkm, shiftkr, loadkr) + +#define dec_round2(n, Fx, rl, rr, loadkm, shiftkr, loadkr) \ + Fx##_2w(n, rl##0, rr##0, rl##1, rr##1, 1, loadkm, shiftkr, loadkr) + +#define read_block2_aligned(rin, l0, r0, l1, r1, convert, rtmp) \ + ldr l0, [rin, #(0)]; \ + ldr r0, [rin, #(4)]; \ + convert(l0, rtmp); \ + ldr l1, [rin, #(8)]; \ + convert(r0, rtmp); \ + ldr r1, [rin, #(12)]; \ + convert(l1, rtmp); \ + convert(r1, rtmp); + +#define 
write_block2_aligned(rout, l0, r0, l1, r1, convert, rtmp) \ + convert(l0, rtmp); \ + convert(r0, rtmp); \ + convert(l1, rtmp); \ + str l0, [rout, #(0)]; \ + convert(r1, rtmp); \ + str r0, [rout, #(4)]; \ + str l1, [rout, #(8)]; \ + str r1, [rout, #(12)]; + +#ifdef __ARM_FEATURE_UNALIGNED + /* unaligned word reads allowed */ + #define read_block2(rin, l0, r0, l1, r1, rtmp0) \ + read_block2_aligned(rin, l0, r0, l1, r1, host_to_be, rtmp0) + + #define write_block2(rout, l0, r0, l1, r1, rtmp0, rtmp1) \ + write_block2_aligned(rout, l0, r0, l1, r1, be_to_host, rtmp0) + + #define read_block2_host(rin, l0, r0, l1, r1, rtmp0) \ + read_block2_aligned(rin, l0, r0, l1, r1, host_to_host, rtmp0) + + #define write_block2_host(rout, l0, r0, l1, r1, rtmp0, rtmp1) \ + write_block2_aligned(rout, l0, r0, l1, r1, host_to_host, rtmp0) +#else + /* need to handle unaligned reads by byte reads */ + #define read_block2(rin, l0, r0, l1, r1, rtmp0) \ + tst rin, #3; \ + beq 1f; \ + ldr_unaligned_be(l0, rin, 0, rtmp0); \ + ldr_unaligned_be(r0, rin, 4, rtmp0); \ + ldr_unaligned_be(l1, rin, 8, rtmp0); \ + ldr_unaligned_be(r1, rin, 12, rtmp0); \ + b 2f; \ + 1:;\ + read_block2_aligned(rin, l0, r0, l1, r1, host_to_be, rtmp0); \ + 2:; + + #define write_block2(rout, l0, r0, l1, r1, rtmp0, rtmp1) \ + tst rout, #3; \ + beq 1f; \ + str_unaligned_be(l0, rout, 0, rtmp0, rtmp1); \ + str_unaligned_be(r0, rout, 4, rtmp0, rtmp1); \ + str_unaligned_be(l1, rout, 8, rtmp0, rtmp1); \ + str_unaligned_be(r1, rout, 12, rtmp0, rtmp1); \ + b 2f; \ + 1:;\ + write_block2_aligned(rout, l0, r0, l1, r1, be_to_host, rtmp0); \ + 2:; + + #define read_block2_host(rin, l0, r0, l1, r1, rtmp0) \ + tst rin, #3; \ + beq 1f; \ + ldr_unaligned_host(l0, rin, 0, rtmp0); \ + ldr_unaligned_host(r0, rin, 4, rtmp0); \ + ldr_unaligned_host(l1, rin, 8, rtmp0); \ + ldr_unaligned_host(r1, rin, 12, rtmp0); \ + b 2f; \ + 1:;\ + read_block2_aligned(rin, l0, r0, l1, r1, host_to_host, rtmp0); \ + 2:; + + #define write_block2_host(rout, l0, r0, l1, 
r1, rtmp0, rtmp1) \ + tst rout, #3; \ + beq 1f; \ + str_unaligned_host(l0, rout, 0, rtmp0, rtmp1); \ + str_unaligned_host(r0, rout, 4, rtmp0, rtmp1); \ + str_unaligned_host(l1, rout, 8, rtmp0, rtmp1); \ + str_unaligned_host(r1, rout, 12, rtmp0, rtmp1); \ + b 2f; \ + 1:;\ + write_block2_aligned(rout, l0, r0, l1, r1, host_to_host, rtmp0); \ + 2:; +#endif + +.align 3 +.type _gcry_cast5_arm_enc_blk2,%function; + +_gcry_cast5_arm_enc_blk2: + /* input: + * preloaded: CTX + * [RL0, RR0], [RL1, RR1]: src + * output: + * [RR0, RL0], [RR1, RL1]: dst + */ + push {%lr}; + + ldr Rs1, =_gcry_cast5_s1to4; + mov RMASK, #(0xff << 2); + add Rs2, Rs1, #(0x100 * 4); + + load_km(0); + load_kr(0); + enc_round2(0, F1, RL, RR, load_km, shift_kr, dummy); + enc_round2(1, F2, RR, RL, load_km, shift_kr, dummy); + enc_round2(2, F3, RL, RR, load_km, shift_kr, dummy); + enc_round2(3, F1, RR, RL, load_km, dummy, load_kr); + enc_round2(4, F2, RL, RR, load_km, shift_kr, dummy); + enc_round2(5, F3, RR, RL, load_km, shift_kr, dummy); + enc_round2(6, F1, RL, RR, load_km, shift_kr, dummy); + enc_round2(7, F2, RR, RL, load_km, dummy, load_kr); + enc_round2(8, F3, RL, RR, load_km, shift_kr, dummy); + enc_round2(9, F1, RR, RL, load_km, shift_kr, dummy); + enc_round2(10, F2, RL, RR, load_km, shift_kr, dummy); + enc_round2(11, F3, RR, RL, load_km, dummy, load_kr); + enc_round2(12, F1, RL, RR, load_km, shift_kr, dummy); + enc_round2(13, F2, RR, RL, load_km, shift_kr, dummy); + enc_round2(14, F3, RL, RR, load_km, shift_kr, dummy); + enc_round2(15, F1, RR, RL, dummy, dummy, dummy); + + host_to_be(RR0, RT0); + host_to_be(RL0, RT0); + host_to_be(RR1, RT0); + host_to_be(RL1, RT0); + + pop {%pc}; +.ltorg +.size _gcry_cast5_arm_enc_blk2,.-_gcry_cast5_arm_enc_blk2; + +.align 3 +.globl _gcry_cast5_arm_cfb_dec; +.type _gcry_cast5_arm_cfb_dec,%function; + +_gcry_cast5_arm_cfb_dec: + /* input: + * %r0: CTX + * %r1: dst (2 blocks) + * %r2: src (2 blocks) + * %r3: iv (64bit) + */ + push {%r1, %r2, %r4-%r11, %ip, %lr}; + + 
mov %lr, %r3; + + /* Load input (iv/%r3 is aligned, src/%r2 might not be) */ + ldm %r3, {RL0, RR0}; + host_to_be(RL0, RT1); + host_to_be(RR0, RT1); + read_block(%r2, 0, RL1, RR1, %ip); + + /* Update IV, load src[1] and save to iv[0] */ + read_block_host(%r2, 8, %r5, %r6, %r7); + stm %lr, {%r5, %r6}; + + bl _gcry_cast5_arm_enc_blk2; + /* result in RR0:RL0, RR1:RL1 = %r4:%r3, %r10:%r9 */ + + /* %r0: dst, %r1: %src */ + pop {%r0, %r1}; + + /* dst = src ^ result */ + read_block2_host(%r1, %r5, %r6, %r7, %r8, %lr); + eor %r5, %r4; + eor %r6, %r3; + eor %r7, %r10; + eor %r8, %r9; + write_block2_host(%r0, %r5, %r6, %r7, %r8, %r1, %r2); + + pop {%r4-%r11, %ip, %pc}; +.ltorg +.size _gcry_cast5_arm_cfb_dec,.-_gcry_cast5_arm_cfb_dec; + +.align 3 +.globl _gcry_cast5_arm_ctr_enc; +.type _gcry_cast5_arm_ctr_enc,%function; + +_gcry_cast5_arm_ctr_enc: + /* input: + * %r0: CTX + * %r1: dst (2 blocks) + * %r2: src (2 blocks) + * %r3: iv (64bit, big-endian) + */ + push {%r1, %r2, %r4-%r11, %ip, %lr}; + + mov %lr, %r3; + + /* Load IV (big => host endian) */ + read_block_aligned(%lr, 0, RL0, RR0, be_to_host, RT1); + + /* Construct IVs */ + adds RR1, RR0, #1; /* +1 */ + adc RL1, RL0, #0; + adds %r6, RR1, #1; /* +2 */ + adc %r5, RL1, #0; + + /* Store new IV (host => big-endian) */ + write_block_aligned(%lr, 0, %r5, %r6, host_to_be, RT1); + + bl _gcry_cast5_arm_enc_blk2; + /* result in RR0:RL0, RR1:RL1 = %r4:%r3, %r10:%r9 */ + + /* %r0: dst, %r1: %src */ + pop {%r0, %r1}; + + /* XOR key-stream with plaintext */ + read_block2_host(%r1, %r5, %r6, %r7, %r8, %lr); + eor %r5, %r4; + eor %r6, %r3; + eor %r7, %r10; + eor %r8, %r9; + write_block2_host(%r0, %r5, %r6, %r7, %r8, %r1, %r2); + + pop {%r4-%r11, %ip, %pc}; +.ltorg +.size _gcry_cast5_arm_ctr_enc,.-_gcry_cast5_arm_ctr_enc; + +.align 3 +.type _gcry_cast5_arm_dec_blk2,%function; + +_gcry_cast5_arm_dec_blk2: + /* input: + * preloaded: CTX + * [RL0, RR0], [RL1, RR1]: src + * output: + * [RR0, RL0], [RR1, RL1]: dst + */ + + ldr Rs1, 
=_gcry_cast5_s1to4; + mov RMASK, #(0xff << 2); + add Rs2, Rs1, #(0x100 * 4); + + load_km(15); + load_dec_kr(15); + dec_round2(15, F1, RL, RR, load_km, shift_kr, dummy); + dec_round2(14, F3, RR, RL, load_km, shift_kr, dummy); + dec_round2(13, F2, RL, RR, load_km, shift_kr, dummy); + dec_round2(12, F1, RR, RL, load_km, dummy, load_dec_kr); + dec_round2(11, F3, RL, RR, load_km, shift_kr, dummy); + dec_round2(10, F2, RR, RL, load_km, shift_kr, dummy); + dec_round2(9, F1, RL, RR, load_km, shift_kr, dummy); + dec_round2(8, F3, RR, RL, load_km, dummy, load_dec_kr); + dec_round2(7, F2, RL, RR, load_km, shift_kr, dummy); + dec_round2(6, F1, RR, RL, load_km, shift_kr, dummy); + dec_round2(5, F3, RL, RR, load_km, shift_kr, dummy); + dec_round2(4, F2, RR, RL, load_km, dummy, load_dec_kr); + dec_round2(3, F1, RL, RR, load_km, shift_kr, dummy); + dec_round2(2, F3, RR, RL, load_km, shift_kr, dummy); + dec_round2(1, F2, RL, RR, load_km, shift_kr, dummy); + dec_round2(0, F1, RR, RL, dummy, dummy, dummy); + + host_to_be(RR0, RT0); + host_to_be(RL0, RT0); + host_to_be(RR1, RT0); + host_to_be(RL1, RT0); + + b .Ldec_cbc_tail; +.ltorg +.size _gcry_cast5_arm_dec_blk2,.-_gcry_cast5_arm_dec_blk2; + +.align 3 +.globl _gcry_cast5_arm_cbc_dec; +.type _gcry_cast5_arm_cbc_dec,%function; + +_gcry_cast5_arm_cbc_dec: + /* input: + * %r0: CTX + * %r1: dst (2 blocks) + * %r2: src (2 blocks) + * %r3: iv (64bit) + */ + push {%r1-%r11, %ip, %lr}; + + read_block2(%r2, RL0, RR0, RL1, RR1, RT0); + + /* dec_blk2 is only used by cbc_dec, jump directly in/out instead + * of function call. */ + b _gcry_cast5_arm_dec_blk2; +.Ldec_cbc_tail: + /* result in RR0:RL0, RR1:RL1 = %r4:%r3, %r10:%r9 */ + + /* %r0: dst, %r1: %src, %r2: iv */ + pop {%r0-%r2}; + + /* load IV+1 (src[0]) to %r7:%r8. Might be unaligned. */ + read_block_host(%r1, 0, %r7, %r8, %r5); + /* load IV (iv[0]) to %r5:%r6. 'iv' is aligned. 
*/ + ldm %r2, {%r5, %r6}; + + /* out[1] ^= IV+1 */ + eor %r10, %r7; + eor %r9, %r8; + /* out[0] ^= IV */ + eor %r4, %r5; + eor %r3, %r6; + + /* load IV+2 (src[1]) to %r7:%r8. Might be unaligned. */ + read_block_host(%r1, 8, %r7, %r8, %r5); + /* store IV+2 to iv[0] (aligned). */ + stm %r2, {%r7, %r8}; + + /* store result to dst[0-3]. Might be unaligned. */ + write_block2_host(%r0, %r4, %r3, %r10, %r9, %r5, %r6); + + pop {%r4-%r11, %ip, %pc}; +.ltorg +.size _gcry_cast5_arm_cbc_dec,.-_gcry_cast5_arm_cbc_dec; + +#endif /*HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS*/ +#endif /*__ARM_ARCH >= 6*/ diff --git a/cipher/cast5-armv6.S b/cipher/cast5-armv6.S deleted file mode 100644 index 038fc4f..0000000 --- a/cipher/cast5-armv6.S +++ /dev/null @@ -1,702 +0,0 @@ -/* cast5-armv6.S - ARM assembly implementation of CAST5 cipher - * - * Copyright © 2013 Jussi Kivilinna - * - * This file is part of Libgcrypt. - * - * Libgcrypt is free software; you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation; either version 2.1 of - * the License, or (at your option) any later version. - * - * Libgcrypt is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public - * License along with this program; if not, see <http://www.gnu.org/licenses/>. 
- */ - -#include - -#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) -#ifdef HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS - -.text - -.syntax unified -.arm - -.extern _gcry_cast5_s1to4; - -/* structure of crypto context */ -#define Km 0 -#define Kr (Km + (16 * 4)) -#define Kr_arm_enc (Kr + (16)) -#define Kr_arm_dec (Kr_arm_enc + (16)) - -/* register macros */ -#define CTX %r0 -#define Rs1 %r7 -#define Rs2 %r8 -#define Rs3 %r9 -#define Rs4 %r10 -#define RMASK %r11 -#define RKM %r1 -#define RKR %r2 - -#define RL0 %r3 -#define RR0 %r4 - -#define RL1 %r9 -#define RR1 %r10 - -#define RT0 %lr -#define RT1 %ip -#define RT2 %r5 -#define RT3 %r6 - -/* helper macros */ -#define ldr_unaligned_le(rout, rsrc, offs, rtmp) \ - ldrb rout, [rsrc, #((offs) + 0)]; \ - ldrb rtmp, [rsrc, #((offs) + 1)]; \ - orr rout, rout, rtmp, lsl #8; \ - ldrb rtmp, [rsrc, #((offs) + 2)]; \ - orr rout, rout, rtmp, lsl #16; \ - ldrb rtmp, [rsrc, #((offs) + 3)]; \ - orr rout, rout, rtmp, lsl #24; - -#define str_unaligned_le(rin, rdst, offs, rtmp0, rtmp1) \ - mov rtmp0, rin, lsr #8; \ - strb rin, [rdst, #((offs) + 0)]; \ - mov rtmp1, rin, lsr #16; \ - strb rtmp0, [rdst, #((offs) + 1)]; \ - mov rtmp0, rin, lsr #24; \ - strb rtmp1, [rdst, #((offs) + 2)]; \ - strb rtmp0, [rdst, #((offs) + 3)]; - -#define ldr_unaligned_be(rout, rsrc, offs, rtmp) \ - ldrb rout, [rsrc, #((offs) + 3)]; \ - ldrb rtmp, [rsrc, #((offs) + 2)]; \ - orr rout, rout, rtmp, lsl #8; \ - ldrb rtmp, [rsrc, #((offs) + 1)]; \ - orr rout, rout, rtmp, lsl #16; \ - ldrb rtmp, [rsrc, #((offs) + 0)]; \ - orr rout, rout, rtmp, lsl #24; - -#define str_unaligned_be(rin, rdst, offs, rtmp0, rtmp1) \ - mov rtmp0, rin, lsr #8; \ - strb rin, [rdst, #((offs) + 3)]; \ - mov rtmp1, rin, lsr #16; \ - strb rtmp0, [rdst, #((offs) + 2)]; \ - mov rtmp0, rin, lsr #24; \ - strb rtmp1, [rdst, #((offs) + 1)]; \ - strb rtmp0, [rdst, #((offs) + 0)]; - -#ifdef __ARMEL__ - #define ldr_unaligned_host ldr_unaligned_le - #define str_unaligned_host str_unaligned_le - - /* bswap 
on little-endian */ - #define host_to_be(reg) \ - rev reg, reg; - #define be_to_host(reg) \ - rev reg, reg; -#else - #define ldr_unaligned_host ldr_unaligned_be - #define str_unaligned_host str_unaligned_be - - /* nop on big-endian */ - #define host_to_be(reg) /*_*/ - #define be_to_host(reg) /*_*/ -#endif - -#define host_to_host(x) /*_*/ - -/********************************************************************** - 1-way cast5 - **********************************************************************/ - -#define dummy(n) /*_*/ - -#define load_kr(n) \ - ldr RKR, [CTX, #(Kr_arm_enc + (n))]; /* Kr[n] */ - -#define load_dec_kr(n) \ - ldr RKR, [CTX, #(Kr_arm_dec + (n) - 3)]; /* Kr[n] */ - -#define load_km(n) \ - ldr RKM, [CTX, #(Km + (n) * 4)]; /* Km[n] */ - -#define shift_kr(dummy) \ - mov RKR, RKR, lsr #8; - -#define F(n, rl, rr, op1, op2, op3, op4, dec, loadkm, shiftkr, loadkr) \ - op1 RKM, rr; \ - mov RKM, RKM, ror RKR; \ - \ - and RT0, RMASK, RKM, ror #(24); \ - and RT1, RMASK, RKM, lsr #(16); \ - and RT2, RMASK, RKM, lsr #(8); \ - ldr RT0, [Rs1, RT0]; \ - and RT3, RMASK, RKM; \ - ldr RT1, [Rs2, RT1]; \ - shiftkr(RKR); \ - \ - ldr RT2, [Rs3, RT2]; \ - \ - op2 RT0, RT1; \ - ldr RT3, [Rs4, RT3]; \ - op3 RT0, RT2; \ - loadkm((n) + (1 - ((dec) * 2))); \ - op4 RT0, RT3; \ - loadkr((n) + (1 - ((dec) * 2))); \ - eor rl, RT0; - -#define F1(n, rl, rr, dec, loadkm, shiftkr, loadkr) \ - F(n, rl, rr, add, eor, sub, add, dec, loadkm, shiftkr, loadkr) -#define F2(n, rl, rr, dec, loadkm, shiftkr, loadkr) \ - F(n, rl, rr, eor, sub, add, eor, dec, loadkm, shiftkr, loadkr) -#define F3(n, rl, rr, dec, loadkm, shiftkr, loadkr) \ - F(n, rl, rr, sub, add, eor, sub, dec, loadkm, shiftkr, loadkr) - -#define enc_round(n, Fx, rl, rr, loadkm, shiftkr, loadkr) \ - Fx(n, rl, rr, 0, loadkm, shiftkr, loadkr) - -#define dec_round(n, Fx, rl, rr, loadkm, shiftkr, loadkr) \ - Fx(n, rl, rr, 1, loadkm, shiftkr, loadkr) - -#define read_block_aligned(rin, offs, l0, r0, convert) \ - ldr l0, [rin, #((offs) + 
0)]; \ - ldr r0, [rin, #((offs) + 4)]; \ - convert(l0); \ - convert(r0); - -#define write_block_aligned(rout, offs, l0, r0, convert) \ - convert(l0); \ - convert(r0); \ - str l0, [rout, #((offs) + 0)]; \ - str r0, [rout, #((offs) + 4)]; - -#ifdef __ARM_FEATURE_UNALIGNED - /* unaligned word reads allowed */ - #define read_block(rin, offs, l0, r0, rtmp0) \ - read_block_aligned(rin, offs, l0, r0, host_to_be) - - #define write_block(rout, offs, r0, l0, rtmp0, rtmp1) \ - write_block_aligned(rout, offs, r0, l0, be_to_host) - - #define read_block_host(rin, offs, l0, r0, rtmp0) \ - read_block_aligned(rin, offs, l0, r0, host_to_host) - - #define write_block_host(rout, offs, r0, l0, rtmp0, rtmp1) \ - write_block_aligned(rout, offs, r0, l0, host_to_host) -#else - /* need to handle unaligned reads by byte reads */ - #define read_block(rin, offs, l0, r0, rtmp0) \ - tst rin, #3; \ - beq 1f; \ - ldr_unaligned_be(l0, rin, (offs) + 0, rtmp0); \ - ldr_unaligned_be(r0, rin, (offs) + 4, rtmp0); \ - b 2f; \ - 1:;\ - read_block_aligned(rin, offs, l0, r0, host_to_be); \ - 2:; - - #define write_block(rout, offs, l0, r0, rtmp0, rtmp1) \ - tst rout, #3; \ - beq 1f; \ - str_unaligned_be(l0, rout, (offs) + 0, rtmp0, rtmp1); \ - str_unaligned_be(r0, rout, (offs) + 4, rtmp0, rtmp1); \ - b 2f; \ - 1:;\ - write_block_aligned(rout, offs, l0, r0, be_to_host); \ - 2:; - - #define read_block_host(rin, offs, l0, r0, rtmp0) \ - tst rin, #3; \ - beq 1f; \ - ldr_unaligned_host(l0, rin, (offs) + 0, rtmp0); \ - ldr_unaligned_host(r0, rin, (offs) + 4, rtmp0); \ - b 2f; \ - 1:;\ - read_block_aligned(rin, offs, l0, r0, host_to_host); \ - 2:; - - #define write_block_host(rout, offs, l0, r0, rtmp0, rtmp1) \ - tst rout, #3; \ - beq 1f; \ - str_unaligned_host(l0, rout, (offs) + 0, rtmp0, rtmp1); \ - str_unaligned_host(r0, rout, (offs) + 4, rtmp0, rtmp1); \ - b 2f; \ - 1:;\ - write_block_aligned(rout, offs, l0, r0, host_to_host); \ - 2:; -#endif - -.align 3 -.globl _gcry_cast5_armv6_encrypt_block -.type 
_gcry_cast5_armv6_encrypt_block,%function; - -_gcry_cast5_armv6_encrypt_block: - /* input: - * %r0: CTX - * %r1: dst - * %r2: src - */ - push {%r1, %r4-%r11, %ip, %lr}; - - ldr Rs1, =_gcry_cast5_s1to4; - mov RMASK, #(0xff << 2); - add Rs2, Rs1, #(0x100*4); - add Rs3, Rs1, #(0x100*4*2); - add Rs4, Rs1, #(0x100*4*3); - - read_block(%r2, 0, RL0, RR0, RT0); - - load_km(0); - load_kr(0); - enc_round(0, F1, RL0, RR0, load_km, shift_kr, dummy); - enc_round(1, F2, RR0, RL0, load_km, shift_kr, dummy); - enc_round(2, F3, RL0, RR0, load_km, shift_kr, dummy); - enc_round(3, F1, RR0, RL0, load_km, dummy, load_kr); - enc_round(4, F2, RL0, RR0, load_km, shift_kr, dummy); - enc_round(5, F3, RR0, RL0, load_km, shift_kr, dummy); - enc_round(6, F1, RL0, RR0, load_km, shift_kr, dummy); - enc_round(7, F2, RR0, RL0, load_km, dummy, load_kr); - enc_round(8, F3, RL0, RR0, load_km, shift_kr, dummy); - enc_round(9, F1, RR0, RL0, load_km, shift_kr, dummy); - enc_round(10, F2, RL0, RR0, load_km, shift_kr, dummy); - enc_round(11, F3, RR0, RL0, load_km, dummy, load_kr); - enc_round(12, F1, RL0, RR0, load_km, shift_kr, dummy); - enc_round(13, F2, RR0, RL0, load_km, shift_kr, dummy); - enc_round(14, F3, RL0, RR0, load_km, shift_kr, dummy); - enc_round(15, F1, RR0, RL0, dummy, dummy, dummy); - - ldr %r1, [%sp], #4; - write_block(%r1, 0, RR0, RL0, RT0, RT1); - - pop {%r4-%r11, %ip, %pc}; -.ltorg -.size _gcry_cast5_armv6_encrypt_block,.-_gcry_cast5_armv6_encrypt_block; - -.align 3 -.globl _gcry_cast5_armv6_decrypt_block -.type _gcry_cast5_armv6_decrypt_block,%function; - -_gcry_cast5_armv6_decrypt_block: - /* input: - * %r0: CTX - * %r1: dst - * %r2: src - */ - push {%r1, %r4-%r11, %ip, %lr}; - - ldr Rs1, =_gcry_cast5_s1to4; - mov RMASK, #(0xff << 2); - add Rs2, Rs1, #(0x100 * 4); - add Rs3, Rs1, #(0x100 * 4 * 2); - add Rs4, Rs1, #(0x100 * 4 * 3); - - read_block(%r2, 0, RL0, RR0, RT0); - - load_km(15); - load_dec_kr(15); - dec_round(15, F1, RL0, RR0, load_km, shift_kr, dummy); - dec_round(14, F3, 
RR0, RL0, load_km, shift_kr, dummy); - dec_round(13, F2, RL0, RR0, load_km, shift_kr, dummy); - dec_round(12, F1, RR0, RL0, load_km, dummy, load_dec_kr); - dec_round(11, F3, RL0, RR0, load_km, shift_kr, dummy); - dec_round(10, F2, RR0, RL0, load_km, shift_kr, dummy); - dec_round(9, F1, RL0, RR0, load_km, shift_kr, dummy); - dec_round(8, F3, RR0, RL0, load_km, dummy, load_dec_kr); - dec_round(7, F2, RL0, RR0, load_km, shift_kr, dummy); - dec_round(6, F1, RR0, RL0, load_km, shift_kr, dummy); - dec_round(5, F3, RL0, RR0, load_km, shift_kr, dummy); - dec_round(4, F2, RR0, RL0, load_km, dummy, load_dec_kr); - dec_round(3, F1, RL0, RR0, load_km, shift_kr, dummy); - dec_round(2, F3, RR0, RL0, load_km, shift_kr, dummy); - dec_round(1, F2, RL0, RR0, load_km, shift_kr, dummy); - dec_round(0, F1, RR0, RL0, dummy, dummy, dummy); - - ldr %r1, [%sp], #4; - write_block(%r1, 0, RR0, RL0, RT0, RT1); - - pop {%r4-%r11, %ip, %pc}; -.ltorg -.size _gcry_cast5_armv6_decrypt_block,.-_gcry_cast5_armv6_decrypt_block; - -/********************************************************************** - 2-way cast5 - **********************************************************************/ - -#define F_2w(n, rl0, rr0, rl1, rr1, op1, op2, op3, op4, dec, loadkm, shiftkr, \ - loadkr) \ - op1 RT3, RKM, rr0; \ - op1 RKM, RKM, rr1; \ - mov RT3, RT3, ror RKR; \ - mov RKM, RKM, ror RKR; \ - \ - and RT0, RMASK, RT3, ror #(24); \ - and RT1, RMASK, RT3, lsr #(16); \ - and RT2, RMASK, RT3, lsr #(8); \ - and RT3, RMASK, RT3; \ - \ - ldr RT0, [Rs1, RT0]; \ - add RT2, #(0x100 * 4); \ - ldr RT1, [Rs2, RT1]; \ - add RT3, #(0x100 * 4 * 2); \ - \ - ldr RT2, [Rs2, RT2]; \ - \ - op2 RT0, RT1; \ - ldr RT3, [Rs2, RT3]; \ - and RT1, RMASK, RKM, ror #(24); \ - op3 RT0, RT2; \ - and RT2, RMASK, RKM, lsr #(16); \ - op4 RT0, RT3; \ - and RT3, RMASK, RKM, lsr #(8); \ - eor rl0, RT0; \ - add RT3, #(0x100 * 4); \ - ldr RT1, [Rs1, RT1]; \ - and RT0, RMASK, RKM; \ - ldr RT2, [Rs2, RT2]; \ - add RT0, #(0x100 * 4 * 2); \ - \ - ldr RT3, 
[Rs2, RT3]; \ - \ - op2 RT1, RT2; \ - ldr RT0, [Rs2, RT0]; \ - op3 RT1, RT3; \ - loadkm((n) + (1 - ((dec) * 2))); \ - op4 RT1, RT0; \ - loadkr((n) + (1 - ((dec) * 2))); \ - shiftkr(RKR); \ - eor rl1, RT1; - -#define F1_2w(n, rl0, rr0, rl1, rr1, dec, loadkm, shiftkr, loadkr) \ - F_2w(n, rl0, rr0, rl1, rr1, add, eor, sub, add, dec, \ - loadkm, shiftkr, loadkr) -#define F2_2w(n, rl0, rr0, rl1, rr1, dec, loadkm, shiftkr, loadkr) \ - F_2w(n, rl0, rr0, rl1, rr1, eor, sub, add, eor, dec, \ - loadkm, shiftkr, loadkr) -#define F3_2w(n, rl0, rr0, rl1, rr1, dec, loadkm, shiftkr, loadkr) \ - F_2w(n, rl0, rr0, rl1, rr1, sub, add, eor, sub, dec, \ - loadkm, shiftkr, loadkr) - -#define enc_round2(n, Fx, rl, rr, loadkm, shiftkr, loadkr) \ - Fx##_2w(n, rl##0, rr##0, rl##1, rr##1, 0, loadkm, shiftkr, loadkr) - -#define dec_round2(n, Fx, rl, rr, loadkm, shiftkr, loadkr) \ - Fx##_2w(n, rl##0, rr##0, rl##1, rr##1, 1, loadkm, shiftkr, loadkr) - -#define read_block2_aligned(rin, l0, r0, l1, r1, convert) \ - ldr l0, [rin, #(0)]; \ - ldr r0, [rin, #(4)]; \ - convert(l0); \ - ldr l1, [rin, #(8)]; \ - convert(r0); \ - ldr r1, [rin, #(12)]; \ - convert(l1); \ - convert(r1); - -#define write_block2_aligned(rout, l0, r0, l1, r1, convert) \ - convert(l0); \ - convert(r0); \ - convert(l1); \ - str l0, [rout, #(0)]; \ - convert(r1); \ - str r0, [rout, #(4)]; \ - str l1, [rout, #(8)]; \ - str r1, [rout, #(12)]; - -#ifdef __ARM_FEATURE_UNALIGNED - /* unaligned word reads allowed */ - #define read_block2(rin, l0, r0, l1, r1, rtmp0) \ - read_block2_aligned(rin, l0, r0, l1, r1, host_to_be) - - #define write_block2(rout, l0, r0, l1, r1, rtmp0, rtmp1) \ - write_block2_aligned(rout, l0, r0, l1, r1, be_to_host) - - #define read_block2_host(rin, l0, r0, l1, r1, rtmp0) \ - read_block2_aligned(rin, l0, r0, l1, r1, host_to_host) - - #define write_block2_host(rout, l0, r0, l1, r1, rtmp0, rtmp1) \ - write_block2_aligned(rout, l0, r0, l1, r1, host_to_host) -#else - /* need to handle unaligned reads by byte reads 
*/ - #define read_block2(rin, l0, r0, l1, r1, rtmp0) \ - tst rin, #3; \ - beq 1f; \ - ldr_unaligned_be(l0, rin, 0, rtmp0); \ - ldr_unaligned_be(r0, rin, 4, rtmp0); \ - ldr_unaligned_be(l1, rin, 8, rtmp0); \ - ldr_unaligned_be(r1, rin, 12, rtmp0); \ - b 2f; \ - 1:;\ - read_block2_aligned(rin, l0, r0, l1, r1, host_to_be); \ - 2:; - - #define write_block2(rout, l0, r0, l1, r1, rtmp0, rtmp1) \ - tst rout, #3; \ - beq 1f; \ - str_unaligned_be(l0, rout, 0, rtmp0, rtmp1); \ - str_unaligned_be(r0, rout, 4, rtmp0, rtmp1); \ - str_unaligned_be(l1, rout, 8, rtmp0, rtmp1); \ - str_unaligned_be(r1, rout, 12, rtmp0, rtmp1); \ - b 2f; \ - 1:;\ - write_block2_aligned(rout, l0, r0, l1, r1, be_to_host); \ - 2:; - - #define read_block2_host(rin, l0, r0, l1, r1, rtmp0) \ - tst rin, #3; \ - beq 1f; \ - ldr_unaligned_host(l0, rin, 0, rtmp0); \ - ldr_unaligned_host(r0, rin, 4, rtmp0); \ - ldr_unaligned_host(l1, rin, 8, rtmp0); \ - ldr_unaligned_host(r1, rin, 12, rtmp0); \ - b 2f; \ - 1:;\ - read_block2_aligned(rin, l0, r0, l1, r1, host_to_host); \ - 2:; - - #define write_block2_host(rout, l0, r0, l1, r1, rtmp0, rtmp1) \ - tst rout, #3; \ - beq 1f; \ - str_unaligned_host(l0, rout, 0, rtmp0, rtmp1); \ - str_unaligned_host(r0, rout, 4, rtmp0, rtmp1); \ - str_unaligned_host(l1, rout, 8, rtmp0, rtmp1); \ - str_unaligned_host(r1, rout, 12, rtmp0, rtmp1); \ - b 2f; \ - 1:;\ - write_block2_aligned(rout, l0, r0, l1, r1, host_to_host); \ - 2:; -#endif - -.align 3 -.type _gcry_cast5_armv6_enc_blk2,%function; - -_gcry_cast5_armv6_enc_blk2: - /* input: - * preloaded: CTX - * [RL0, RR0], [RL1, RR1]: src - * output: - * [RR0, RL0], [RR1, RL1]: dst - */ - push {%lr}; - - ldr Rs1, =_gcry_cast5_s1to4; - mov RMASK, #(0xff << 2); - add Rs2, Rs1, #(0x100 * 4); - - load_km(0); - load_kr(0); - enc_round2(0, F1, RL, RR, load_km, shift_kr, dummy); - enc_round2(1, F2, RR, RL, load_km, shift_kr, dummy); - enc_round2(2, F3, RL, RR, load_km, shift_kr, dummy); - enc_round2(3, F1, RR, RL, load_km, dummy, load_kr); - 
enc_round2(4, F2, RL, RR, load_km, shift_kr, dummy); - enc_round2(5, F3, RR, RL, load_km, shift_kr, dummy); - enc_round2(6, F1, RL, RR, load_km, shift_kr, dummy); - enc_round2(7, F2, RR, RL, load_km, dummy, load_kr); - enc_round2(8, F3, RL, RR, load_km, shift_kr, dummy); - enc_round2(9, F1, RR, RL, load_km, shift_kr, dummy); - enc_round2(10, F2, RL, RR, load_km, shift_kr, dummy); - enc_round2(11, F3, RR, RL, load_km, dummy, load_kr); - enc_round2(12, F1, RL, RR, load_km, shift_kr, dummy); - enc_round2(13, F2, RR, RL, load_km, shift_kr, dummy); - enc_round2(14, F3, RL, RR, load_km, shift_kr, dummy); - enc_round2(15, F1, RR, RL, dummy, dummy, dummy); - - host_to_be(RR0); - host_to_be(RL0); - host_to_be(RR1); - host_to_be(RL1); - - pop {%pc}; -.ltorg -.size _gcry_cast5_armv6_enc_blk2,.-_gcry_cast5_armv6_enc_blk2; - -.align 3 -.globl _gcry_cast5_armv6_cfb_dec; -.type _gcry_cast5_armv6_cfb_dec,%function; - -_gcry_cast5_armv6_cfb_dec: - /* input: - * %r0: CTX - * %r1: dst (2 blocks) - * %r2: src (2 blocks) - * %r3: iv (64bit) - */ - push {%r1, %r2, %r4-%r11, %ip, %lr}; - - mov %lr, %r3; - - /* Load input (iv/%r3 is aligned, src/%r2 might not be) */ - ldm %r3, {RL0, RR0}; - host_to_be(RL0); - host_to_be(RR0); - read_block(%r2, 0, RL1, RR1, %ip); - - /* Update IV, load src[1] and save to iv[0] */ - read_block_host(%r2, 8, %r5, %r6, %r7); - stm %lr, {%r5, %r6}; - - bl _gcry_cast5_armv6_enc_blk2; - /* result in RR0:RL0, RR1:RL1 = %r4:%r3, %r10:%r9 */ - - /* %r0: dst, %r1: %src */ - pop {%r0, %r1}; - - /* dst = src ^ result */ - read_block2_host(%r1, %r5, %r6, %r7, %r8, %lr); - eor %r5, %r4; - eor %r6, %r3; - eor %r7, %r10; - eor %r8, %r9; - write_block2_host(%r0, %r5, %r6, %r7, %r8, %r1, %r2); - - pop {%r4-%r11, %ip, %pc}; -.ltorg -.size _gcry_cast5_armv6_cfb_dec,.-_gcry_cast5_armv6_cfb_dec; - -.align 3 -.globl _gcry_cast5_armv6_ctr_enc; -.type _gcry_cast5_armv6_ctr_enc,%function; - -_gcry_cast5_armv6_ctr_enc: - /* input: - * %r0: CTX - * %r1: dst (2 blocks) - * %r2: src (2 
blocks) - * %r3: iv (64bit, big-endian) - */ - push {%r1, %r2, %r4-%r11, %ip, %lr}; - - mov %lr, %r3; - - /* Load IV (big => host endian) */ - read_block_aligned(%lr, 0, RL0, RR0, be_to_host); - - /* Construct IVs */ - adds RR1, RR0, #1; /* +1 */ - adc RL1, RL0, #0; - adds %r6, RR1, #1; /* +2 */ - adc %r5, RL1, #0; - - /* Store new IV (host => big-endian) */ - write_block_aligned(%lr, 0, %r5, %r6, host_to_be); - - bl _gcry_cast5_armv6_enc_blk2; - /* result in RR0:RL0, RR1:RL1 = %r4:%r3, %r10:%r9 */ - - /* %r0: dst, %r1: %src */ - pop {%r0, %r1}; - - /* XOR key-stream with plaintext */ - read_block2_host(%r1, %r5, %r6, %r7, %r8, %lr); - eor %r5, %r4; - eor %r6, %r3; - eor %r7, %r10; - eor %r8, %r9; - write_block2_host(%r0, %r5, %r6, %r7, %r8, %r1, %r2); - - pop {%r4-%r11, %ip, %pc}; -.ltorg -.size _gcry_cast5_armv6_ctr_enc,.-_gcry_cast5_armv6_ctr_enc; - -.align 3 -.type _gcry_cast5_armv6_dec_blk2,%function; - -_gcry_cast5_armv6_dec_blk2: - /* input: - * preloaded: CTX - * [RL0, RR0], [RL1, RR1]: src - * output: - * [RR0, RL0], [RR1, RL1]: dst - */ - - ldr Rs1, =_gcry_cast5_s1to4; - mov RMASK, #(0xff << 2); - add Rs2, Rs1, #(0x100 * 4); - - load_km(15); - load_dec_kr(15); - dec_round2(15, F1, RL, RR, load_km, shift_kr, dummy); - dec_round2(14, F3, RR, RL, load_km, shift_kr, dummy); - dec_round2(13, F2, RL, RR, load_km, shift_kr, dummy); - dec_round2(12, F1, RR, RL, load_km, dummy, load_dec_kr); - dec_round2(11, F3, RL, RR, load_km, shift_kr, dummy); - dec_round2(10, F2, RR, RL, load_km, shift_kr, dummy); - dec_round2(9, F1, RL, RR, load_km, shift_kr, dummy); - dec_round2(8, F3, RR, RL, load_km, dummy, load_dec_kr); - dec_round2(7, F2, RL, RR, load_km, shift_kr, dummy); - dec_round2(6, F1, RR, RL, load_km, shift_kr, dummy); - dec_round2(5, F3, RL, RR, load_km, shift_kr, dummy); - dec_round2(4, F2, RR, RL, load_km, dummy, load_dec_kr); - dec_round2(3, F1, RL, RR, load_km, shift_kr, dummy); - dec_round2(2, F3, RR, RL, load_km, shift_kr, dummy); - dec_round2(1, F2, RL, 
RR, load_km, shift_kr, dummy); - dec_round2(0, F1, RR, RL, dummy, dummy, dummy); - - host_to_be(RR0); - host_to_be(RL0); - host_to_be(RR1); - host_to_be(RL1); - - b .Ldec_cbc_tail; -.ltorg -.size _gcry_cast5_armv6_dec_blk2,.-_gcry_cast5_armv6_dec_blk2; - -.align 3 -.globl _gcry_cast5_armv6_cbc_dec; -.type _gcry_cast5_armv6_cbc_dec,%function; - -_gcry_cast5_armv6_cbc_dec: - /* input: - * %r0: CTX - * %r1: dst (2 blocks) - * %r2: src (2 blocks) - * %r3: iv (64bit) - */ - push {%r1-%r11, %ip, %lr}; - - read_block2(%r2, RL0, RR0, RL1, RR1, RT0); - - /* dec_blk2 is only used by cbc_dec, jump directly in/out instead - * of function call. */ - b _gcry_cast5_armv6_dec_blk2; -.Ldec_cbc_tail: - /* result in RR0:RL0, RR1:RL1 = %r4:%r3, %r10:%r9 */ - - /* %r0: dst, %r1: %src, %r2: iv */ - pop {%r0-%r2}; - - /* load IV+1 (src[0]) to %r7:%r8. Might be unaligned. */ - read_block_host(%r1, 0, %r7, %r8, %r5); - /* load IV (iv[0]) to %r5:%r6. 'iv' is aligned. */ - ldm %r2, {%r5, %r6}; - - /* out[1] ^= IV+1 */ - eor %r10, %r7; - eor %r9, %r8; - /* out[0] ^= IV */ - eor %r4, %r5; - eor %r3, %r6; - - /* load IV+2 (src[1]) to %r7:%r8. Might be unaligned. */ - read_block_host(%r1, 8, %r7, %r8, %r5); - /* store IV+2 to iv[0] (aligned). */ - stm %r2, {%r7, %r8}; - - /* store result to dst[0-3]. Might be unaligned. */ - write_block2_host(%r0, %r4, %r3, %r10, %r9, %r5, %r6); - - pop {%r4-%r11, %ip, %pc}; -.ltorg -.size _gcry_cast5_armv6_cbc_dec,.-_gcry_cast5_armv6_cbc_dec; - -#endif /*HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS*/ -#endif /*__ARM_ARCH >= 6*/ diff --git a/cipher/cast5.c b/cipher/cast5.c index 92d9af8..a954657 100644 --- a/cipher/cast5.c +++ b/cipher/cast5.c @@ -52,11 +52,11 @@ # define USE_AMD64_ASM 1 #endif -/* USE_ARMV6_ASM indicates whether to use ARMv6 assembly code. */ -#undef USE_ARMV6_ASM -#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) +/* USE_ARM_ASM indicates whether to use ARMv6 assembly code. 
*/ +#undef USE_ARM_ASM +#if defined(__ARMEL__) # ifdef HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS -# define USE_ARMV6_ASM 1 +# define USE_ARM_ASM 1 # endif #endif @@ -65,7 +65,7 @@ typedef struct { u32 Km[16]; byte Kr[16]; -#ifdef USE_ARMV6_ASM +#ifdef USE_ARM_ASM u32 Kr_arm_enc[16 / sizeof(u32)]; u32 Kr_arm_dec[16 / sizeof(u32)]; #endif @@ -400,35 +400,35 @@ decrypt_block (void *context, byte *outbuf, const byte *inbuf) return /*burn_stack*/ (2*8); } -#elif defined(USE_ARMV6_ASM) +#elif defined(USE_ARM_ASM) /* ARMv6 assembly implementations of CAST5. */ -extern void _gcry_cast5_armv6_encrypt_block(CAST5_context *c, byte *outbuf, +extern void _gcry_cast5_arm_encrypt_block(CAST5_context *c, byte *outbuf, const byte *inbuf); -extern void _gcry_cast5_armv6_decrypt_block(CAST5_context *c, byte *outbuf, +extern void _gcry_cast5_arm_decrypt_block(CAST5_context *c, byte *outbuf, const byte *inbuf); /* These assembly implementations process two blocks in parallel. */ -extern void _gcry_cast5_armv6_ctr_enc(CAST5_context *ctx, byte *out, +extern void _gcry_cast5_arm_ctr_enc(CAST5_context *ctx, byte *out, const byte *in, byte *ctr); -extern void _gcry_cast5_armv6_cbc_dec(CAST5_context *ctx, byte *out, +extern void _gcry_cast5_arm_cbc_dec(CAST5_context *ctx, byte *out, const byte *in, byte *iv); -extern void _gcry_cast5_armv6_cfb_dec(CAST5_context *ctx, byte *out, +extern void _gcry_cast5_arm_cfb_dec(CAST5_context *ctx, byte *out, const byte *in, byte *iv); static void do_encrypt_block (CAST5_context *context, byte *outbuf, const byte *inbuf) { - _gcry_cast5_armv6_encrypt_block (context, outbuf, inbuf); + _gcry_cast5_arm_encrypt_block (context, outbuf, inbuf); } static void do_decrypt_block (CAST5_context *context, byte *outbuf, const byte *inbuf) { - _gcry_cast5_armv6_decrypt_block (context, outbuf, inbuf); + _gcry_cast5_arm_decrypt_block (context, outbuf, inbuf); } static unsigned int @@ -447,7 +447,7 @@ decrypt_block (void *context, byte *outbuf, const byte *inbuf) return 
/*burn_stack*/ (10*4); } -#else /*USE_ARMV6_ASM*/ +#else /*USE_ARM_ASM*/ #define F1(D,m,r) ( (I = ((m) + (D))), (I=rol(I,(r))), \ (((s1[I >> 24] ^ s2[(I>>16)&0xff]) - s3[(I>>8)&0xff]) + s4[I&0xff]) ) @@ -556,7 +556,7 @@ decrypt_block (void *context, byte *outbuf, const byte *inbuf) return /*burn_stack*/ (20+4*sizeof(void*)); } -#endif /*!USE_ARMV6_ASM*/ +#endif /*!USE_ARM_ASM*/ /* Bulk encryption of complete blocks in CTR mode. This function is only @@ -592,12 +592,12 @@ _gcry_cast5_ctr_enc(void *context, unsigned char *ctr, void *outbuf_arg, /* Use generic code to handle smaller chunks... */ /* TODO: use caching instead? */ } -#elif defined(USE_ARMV6_ASM) +#elif defined(USE_ARM_ASM) { /* Process data in 2 block chunks. */ while (nblocks >= 2) { - _gcry_cast5_armv6_ctr_enc(ctx, outbuf, inbuf, ctr); + _gcry_cast5_arm_ctr_enc(ctx, outbuf, inbuf, ctr); nblocks -= 2; outbuf += 2 * CAST5_BLOCKSIZE; @@ -660,12 +660,12 @@ _gcry_cast5_cbc_dec(void *context, unsigned char *iv, void *outbuf_arg, /* Use generic code to handle smaller chunks... */ } -#elif defined(USE_ARMV6_ASM) +#elif defined(USE_ARM_ASM) { /* Process data in 2 block chunks. */ while (nblocks >= 2) { - _gcry_cast5_armv6_cbc_dec(ctx, outbuf, inbuf, iv); + _gcry_cast5_arm_cbc_dec(ctx, outbuf, inbuf, iv); nblocks -= 2; outbuf += 2 * CAST5_BLOCKSIZE; @@ -722,12 +722,12 @@ _gcry_cast5_cfb_dec(void *context, unsigned char *iv, void *outbuf_arg, /* Use generic code to handle smaller chunks... */ } -#elif defined(USE_ARMV6_ASM) +#elif defined(USE_ARM_ASM) { /* Process data in 2 block chunks. 
*/ while (nblocks >= 2) { - _gcry_cast5_armv6_cfb_dec(ctx, outbuf, inbuf, iv); + _gcry_cast5_arm_cfb_dec(ctx, outbuf, inbuf, iv); nblocks -= 2; outbuf += 2 * CAST5_BLOCKSIZE; @@ -936,7 +936,7 @@ do_cast_setkey( CAST5_context *c, const byte *key, unsigned keylen ) for(i=0; i < 16; i++ ) c->Kr[i] = k[i] & 0x1f; -#ifdef USE_ARMV6_ASM +#ifdef USE_ARM_ASM for (i = 0; i < 4; i++) { byte Kr_arm[4]; diff --git a/cipher/rijndael-arm.S b/cipher/rijndael-arm.S new file mode 100644 index 0000000..2a747bf --- /dev/null +++ b/cipher/rijndael-arm.S @@ -0,0 +1,853 @@ +/* rijndael-arm.S - ARM assembly implementation of AES cipher + * + * Copyright © 2013 Jussi Kivilinna + * + * This file is part of Libgcrypt. + * + * Libgcrypt is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as + * published by the Free Software Foundation; either version 2.1 of + * the License, or (at your option) any later version. + * + * Libgcrypt is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this program; if not, see <http://www.gnu.org/licenses/>. 
+ */ + +#include + +#if defined(__ARMEL__) +#ifdef HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS + +.text + +.syntax unified +.arm + +/* register macros */ +#define CTX %r0 +#define RTAB %lr +#define RMASK %ip + +#define RA %r4 +#define RB %r5 +#define RC %r6 +#define RD %r7 + +#define RNA %r8 +#define RNB %r9 +#define RNC %r10 +#define RND %r11 + +#define RT0 %r1 +#define RT1 %r2 +#define RT2 %r3 + +/* helper macros */ +#define ldr_unaligned_le(rout, rsrc, offs, rtmp) \ + ldrb rout, [rsrc, #((offs) + 0)]; \ + ldrb rtmp, [rsrc, #((offs) + 1)]; \ + orr rout, rout, rtmp, lsl #8; \ + ldrb rtmp, [rsrc, #((offs) + 2)]; \ + orr rout, rout, rtmp, lsl #16; \ + ldrb rtmp, [rsrc, #((offs) + 3)]; \ + orr rout, rout, rtmp, lsl #24; + +#define str_unaligned_le(rin, rdst, offs, rtmp0, rtmp1) \ + mov rtmp0, rin, lsr #8; \ + strb rin, [rdst, #((offs) + 0)]; \ + mov rtmp1, rin, lsr #16; \ + strb rtmp0, [rdst, #((offs) + 1)]; \ + mov rtmp0, rin, lsr #24; \ + strb rtmp1, [rdst, #((offs) + 2)]; \ + strb rtmp0, [rdst, #((offs) + 3)]; + +/*********************************************************************** + * ARM assembly implementation of the AES cipher + ***********************************************************************/ +#define preload_first_key(round, ra) \ + ldr ra, [CTX, #(((round) * 16) + 0 * 4)]; + +#define dummy(round, ra) /* nothing */ + +#define addroundkey(ra, rb, rc, rd, rna, rnb, rnc, rnd, preload_key) \ + ldm CTX, {rna, rnb, rnc, rnd}; \ + eor ra, rna; \ + eor rb, rnb; \ + eor rc, rnc; \ + preload_key(1, rna); \ + eor rd, rnd; + +#define do_encround(next_r, ra, rb, rc, rd, rna, rnb, rnc, rnd, preload_key) \ + ldr rnb, [CTX, #(((next_r) * 16) + 1 * 4)]; \ + \ + and RT0, RMASK, ra, lsl#3; \ + ldr rnc, [CTX, #(((next_r) * 16) + 2 * 4)]; \ + and RT1, RMASK, ra, lsr#(8 - 3); \ + ldr rnd, [CTX, #(((next_r) * 16) + 3 * 4)]; \ + and RT2, RMASK, ra, lsr#(16 - 3); \ + ldr RT0, [RTAB, RT0]; \ + and ra, RMASK, ra, lsr#(24 - 3); \ + \ + ldr RT1, [RTAB, RT1]; \ + eor rna, rna, RT0; \ + 
ldr RT2, [RTAB, RT2]; \ + and RT0, RMASK, rd, lsl#3; \ + ldr ra, [RTAB, ra]; \ + \ + eor rnd, rnd, RT1, ror #24; \ + and RT1, RMASK, rd, lsr#(8 - 3); \ + eor rnc, rnc, RT2, ror #16; \ + and RT2, RMASK, rd, lsr#(16 - 3); \ + eor rnb, rnb, ra, ror #8; \ + ldr RT0, [RTAB, RT0]; \ + and rd, RMASK, rd, lsr#(24 - 3); \ + \ + ldr RT1, [RTAB, RT1]; \ + eor rnd, rnd, RT0; \ + ldr RT2, [RTAB, RT2]; \ + and RT0, RMASK, rc, lsl#3; \ + ldr rd, [RTAB, rd]; \ + \ + eor rnc, rnc, RT1, ror #24; \ + and RT1, RMASK, rc, lsr#(8 - 3); \ + eor rnb, rnb, RT2, ror #16; \ + and RT2, RMASK, rc, lsr#(16 - 3); \ + eor rna, rna, rd, ror #8; \ + ldr RT0, [RTAB, RT0]; \ + and rc, RMASK, rc, lsr#(24 - 3); \ + \ + ldr RT1, [RTAB, RT1]; \ + eor rnc, rnc, RT0; \ + ldr RT2, [RTAB, RT2]; \ + and RT0, RMASK, rb, lsl#3; \ + ldr rc, [RTAB, rc]; \ + \ + eor rnb, rnb, RT1, ror #24; \ + and RT1, RMASK, rb, lsr#(8 - 3); \ + eor rna, rna, RT2, ror #16; \ + and RT2, RMASK, rb, lsr#(16 - 3); \ + eor rnd, rnd, rc, ror #8; \ + ldr RT0, [RTAB, RT0]; \ + and rb, RMASK, rb, lsr#(24 - 3); \ + \ + ldr RT1, [RTAB, RT1]; \ + eor rnb, rnb, RT0; \ + ldr RT2, [RTAB, RT2]; \ + eor rna, rna, RT1, ror #24; \ + ldr rb, [RTAB, rb]; \ + \ + eor rnd, rnd, RT2, ror #16; \ + preload_key((next_r) + 1, ra); \ + eor rnc, rnc, rb, ror #8; + +#define do_lastencround(ra, rb, rc, rd, rna, rnb, rnc, rnd) \ + and RT0, RMASK, ra, lsl#3; \ + and RT1, RMASK, ra, lsr#(8 - 3); \ + and RT2, RMASK, ra, lsr#(16 - 3); \ + ldr rna, [RTAB, RT0]; \ + and ra, RMASK, ra, lsr#(24 - 3); \ + ldr rnd, [RTAB, RT1]; \ + and RT0, RMASK, rd, lsl#3; \ + ldr rnc, [RTAB, RT2]; \ + mov rnd, rnd, ror #24; \ + ldr rnb, [RTAB, ra]; \ + and RT1, RMASK, rd, lsr#(8 - 3); \ + mov rnc, rnc, ror #16; \ + and RT2, RMASK, rd, lsr#(16 - 3); \ + mov rnb, rnb, ror #8; \ + ldr RT0, [RTAB, RT0]; \ + and rd, RMASK, rd, lsr#(24 - 3); \ + ldr RT1, [RTAB, RT1]; \ + \ + orr rnd, rnd, RT0; \ + ldr RT2, [RTAB, RT2]; \ + and RT0, RMASK, rc, lsl#3; \ + ldr rd, [RTAB, rd]; \ + orr rnc, rnc, 
RT1, ror #24; \ + and RT1, RMASK, rc, lsr#(8 - 3); \ + orr rnb, rnb, RT2, ror #16; \ + and RT2, RMASK, rc, lsr#(16 - 3); \ + orr rna, rna, rd, ror #8; \ + ldr RT0, [RTAB, RT0]; \ + and rc, RMASK, rc, lsr#(24 - 3); \ + ldr RT1, [RTAB, RT1]; \ + \ + orr rnc, rnc, RT0; \ + ldr RT2, [RTAB, RT2]; \ + and RT0, RMASK, rb, lsl#3; \ + ldr rc, [RTAB, rc]; \ + orr rnb, rnb, RT1, ror #24; \ + and RT1, RMASK, rb, lsr#(8 - 3); \ + orr rna, rna, RT2, ror #16; \ + ldr RT0, [RTAB, RT0]; \ + and RT2, RMASK, rb, lsr#(16 - 3); \ + ldr RT1, [RTAB, RT1]; \ + orr rnd, rnd, rc, ror #8; \ + ldr RT2, [RTAB, RT2]; \ + and rb, RMASK, rb, lsr#(24 - 3); \ + ldr rb, [RTAB, rb]; \ + \ + orr rnb, rnb, RT0; \ + orr rna, rna, RT1, ror #24; \ + orr rnd, rnd, RT2, ror #16; \ + orr rnc, rnc, rb, ror #8; + +#define firstencround(round, ra, rb, rc, rd, rna, rnb, rnc, rnd) \ + addroundkey(ra, rb, rc, rd, rna, rnb, rnc, rnd, preload_first_key); \ + do_encround((round) + 1, ra, rb, rc, rd, rna, rnb, rnc, rnd, preload_first_key); + +#define encround(round, ra, rb, rc, rd, rna, rnb, rnc, rnd, preload_key) \ + do_encround((round) + 1, ra, rb, rc, rd, rna, rnb, rnc, rnd, preload_key); + +#define lastencround(round, ra, rb, rc, rd, rna, rnb, rnc, rnd) \ + add CTX, #(((round) + 1) * 16); \ + add RTAB, #4; \ + do_lastencround(ra, rb, rc, rd, rna, rnb, rnc, rnd); \ + addroundkey(rna, rnb, rnc, rnd, ra, rb, rc, rd, dummy); + +.align 3 +.global _gcry_aes_arm_encrypt_block +.type _gcry_aes_arm_encrypt_block,%function; + +_gcry_aes_arm_encrypt_block: + /* input: + * %r0: keysched, CTX + * %r1: dst + * %r2: src + * %r3: number of rounds.. 
10, 12 or 14 + */ + push {%r4-%r11, %ip, %lr}; + + /* read input block */ +#ifndef __ARM_FEATURE_UNALIGNED + /* test if src is unaligned */ + tst %r2, #3; + beq 1f; + + /* unaligned load */ + ldr_unaligned_le(RA, %r2, 0, RNA); + ldr_unaligned_le(RB, %r2, 4, RNB); + ldr_unaligned_le(RC, %r2, 8, RNA); + ldr_unaligned_le(RD, %r2, 12, RNB); + b 2f; +.ltorg +1: +#endif + /* aligned load */ + ldm %r2, {RA, RB, RC, RD}; +#ifndef __ARMEL__ + rev RA, RA; + rev RB, RB; + rev RC, RC; + rev RD, RD; +#endif +2: + sub %sp, #16; + + ldr RTAB, =.LtableE0; + + str %r1, [%sp, #4]; /* dst */ + mov RMASK, #0xff; + str %r3, [%sp, #8]; /* nrounds */ + mov RMASK, RMASK, lsl#3; /* byte mask */ + + firstencround(0, RA, RB, RC, RD, RNA, RNB, RNC, RND); + encround(1, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); + encround(2, RA, RB, RC, RD, RNA, RNB, RNC, RND, preload_first_key); + encround(3, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); + encround(4, RA, RB, RC, RD, RNA, RNB, RNC, RND, preload_first_key); + encround(5, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); + encround(6, RA, RB, RC, RD, RNA, RNB, RNC, RND, preload_first_key); + encround(7, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); + + ldr RT0, [%sp, #8]; /* nrounds */ + cmp RT0, #12; + bge .Lenc_not_128; + + encround(8, RA, RB, RC, RD, RNA, RNB, RNC, RND, dummy); + lastencround(9, RNA, RNB, RNC, RND, RA, RB, RC, RD); + +.Lenc_done: + ldr RT0, [%sp, #4]; /* dst */ + add %sp, #16; + + /* store output block */ +#ifndef __ARM_FEATURE_UNALIGNED + /* test if dst is unaligned */ + tst RT0, #3; + beq 1f; + + /* unaligned store */ + str_unaligned_le(RA, RT0, 0, RNA, RNB); + str_unaligned_le(RB, RT0, 4, RNA, RNB); + str_unaligned_le(RC, RT0, 8, RNA, RNB); + str_unaligned_le(RD, RT0, 12, RNA, RNB); + b 2f; +.ltorg +1: +#endif + /* aligned store */ +#ifndef __ARMEL__ + rev RA, RA; + rev RB, RB; + rev RC, RC; + rev RD, RD; +#endif + /* write output block */ + stm RT0, {RA, RB, RC, RD}; +2: + pop 
{%r4-%r11, %ip, %pc}; + +.ltorg +.Lenc_not_128: + beq .Lenc_192 + + encround(8, RA, RB, RC, RD, RNA, RNB, RNC, RND, preload_first_key); + encround(9, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); + encround(10, RA, RB, RC, RD, RNA, RNB, RNC, RND, preload_first_key); + encround(11, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); + encround(12, RA, RB, RC, RD, RNA, RNB, RNC, RND, dummy); + lastencround(13, RNA, RNB, RNC, RND, RA, RB, RC, RD); + + b .Lenc_done; + +.ltorg +.Lenc_192: + encround(8, RA, RB, RC, RD, RNA, RNB, RNC, RND, preload_first_key); + encround(9, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); + encround(10, RA, RB, RC, RD, RNA, RNB, RNC, RND, dummy); + lastencround(11, RNA, RNB, RNC, RND, RA, RB, RC, RD); + + b .Lenc_done; +.size _gcry_aes_arm_encrypt_block,.-_gcry_aes_arm_encrypt_block; + +#define addroundkey_dec(round, ra, rb, rc, rd, rna, rnb, rnc, rnd) \ + ldr rna, [CTX, #(((round) * 16) + 0 * 4)]; \ + ldr rnb, [CTX, #(((round) * 16) + 1 * 4)]; \ + eor ra, rna; \ + ldr rnc, [CTX, #(((round) * 16) + 2 * 4)]; \ + eor rb, rnb; \ + ldr rnd, [CTX, #(((round) * 16) + 3 * 4)]; \ + eor rc, rnc; \ + preload_first_key((round) - 1, rna); \ + eor rd, rnd; + +#define do_decround(next_r, ra, rb, rc, rd, rna, rnb, rnc, rnd, preload_key) \ + ldr rnb, [CTX, #(((next_r) * 16) + 1 * 4)]; \ + \ + and RT0, RMASK, ra, lsl#3; \ + ldr rnc, [CTX, #(((next_r) * 16) + 2 * 4)]; \ + and RT1, RMASK, ra, lsr#(8 - 3); \ + ldr rnd, [CTX, #(((next_r) * 16) + 3 * 4)]; \ + and RT2, RMASK, ra, lsr#(16 - 3); \ + ldr RT0, [RTAB, RT0]; \ + and ra, RMASK, ra, lsr#(24 - 3); \ + \ + ldr RT1, [RTAB, RT1]; \ + eor rna, rna, RT0; \ + ldr RT2, [RTAB, RT2]; \ + and RT0, RMASK, rb, lsl#3; \ + ldr ra, [RTAB, ra]; \ + \ + eor rnb, rnb, RT1, ror #24; \ + and RT1, RMASK, rb, lsr#(8 - 3); \ + eor rnc, rnc, RT2, ror #16; \ + and RT2, RMASK, rb, lsr#(16 - 3); \ + eor rnd, rnd, ra, ror #8; \ + ldr RT0, [RTAB, RT0]; \ + and rb, RMASK, rb, lsr#(24 - 3); \ + \ + ldr RT1, [RTAB, 
RT1]; \ + eor rnb, rnb, RT0; \ + ldr RT2, [RTAB, RT2]; \ + and RT0, RMASK, rc, lsl#3; \ + ldr rb, [RTAB, rb]; \ + \ + eor rnc, rnc, RT1, ror #24; \ + and RT1, RMASK, rc, lsr#(8 - 3); \ + eor rnd, rnd, RT2, ror #16; \ + and RT2, RMASK, rc, lsr#(16 - 3); \ + eor rna, rna, rb, ror #8; \ + ldr RT0, [RTAB, RT0]; \ + and rc, RMASK, rc, lsr#(24 - 3); \ + \ + ldr RT1, [RTAB, RT1]; \ + eor rnc, rnc, RT0; \ + ldr RT2, [RTAB, RT2]; \ + and RT0, RMASK, rd, lsl#3; \ + ldr rc, [RTAB, rc]; \ + \ + eor rnd, rnd, RT1, ror #24; \ + and RT1, RMASK, rd, lsr#(8 - 3); \ + eor rna, rna, RT2, ror #16; \ + and RT2, RMASK, rd, lsr#(16 - 3); \ + eor rnb, rnb, rc, ror #8; \ + ldr RT0, [RTAB, RT0]; \ + and rd, RMASK, rd, lsr#(24 - 3); \ + \ + ldr RT1, [RTAB, RT1]; \ + eor rnd, rnd, RT0; \ + ldr RT2, [RTAB, RT2]; \ + eor rna, rna, RT1, ror #24; \ + ldr rd, [RTAB, rd]; \ + \ + eor rnb, rnb, RT2, ror #16; \ + preload_key((next_r) - 1, ra); \ + eor rnc, rnc, rd, ror #8; + +#define do_lastdecround(ra, rb, rc, rd, rna, rnb, rnc, rnd) \ + and RT0, RMASK, ra, lsl#3; \ + and RT1, RMASK, ra, lsr#(8 - 3); \ + and RT2, RMASK, ra, lsr#(16 - 3); \ + ldr rna, [RTAB, RT0]; \ + and ra, RMASK, ra, lsr#(24 - 3); \ + ldr rnb, [RTAB, RT1]; \ + and RT0, RMASK, rb, lsl#3; \ + ldr rnc, [RTAB, RT2]; \ + mov rnb, rnb, ror #24; \ + ldr rnd, [RTAB, ra]; \ + and RT1, RMASK, rb, lsr#(8 - 3); \ + mov rnc, rnc, ror #16; \ + and RT2, RMASK, rb, lsr#(16 - 3); \ + mov rnd, rnd, ror #8; \ + ldr RT0, [RTAB, RT0]; \ + and rb, RMASK, rb, lsr#(24 - 3); \ + ldr RT1, [RTAB, RT1]; \ + \ + orr rnb, rnb, RT0; \ + ldr RT2, [RTAB, RT2]; \ + and RT0, RMASK, rc, lsl#3; \ + ldr rb, [RTAB, rb]; \ + orr rnc, rnc, RT1, ror #24; \ + and RT1, RMASK, rc, lsr#(8 - 3); \ + orr rnd, rnd, RT2, ror #16; \ + and RT2, RMASK, rc, lsr#(16 - 3); \ + orr rna, rna, rb, ror #8; \ + ldr RT0, [RTAB, RT0]; \ + and rc, RMASK, rc, lsr#(24 - 3); \ + ldr RT1, [RTAB, RT1]; \ + \ + orr rnc, rnc, RT0; \ + ldr RT2, [RTAB, RT2]; \ + and RT0, RMASK, rd, lsl#3; \ + ldr rc, 
[RTAB, rc]; \ + orr rnd, rnd, RT1, ror #24; \ + and RT1, RMASK, rd, lsr#(8 - 3); \ + orr rna, rna, RT2, ror #16; \ + ldr RT0, [RTAB, RT0]; \ + and RT2, RMASK, rd, lsr#(16 - 3); \ + ldr RT1, [RTAB, RT1]; \ + orr rnb, rnb, rc, ror #8; \ + ldr RT2, [RTAB, RT2]; \ + and rd, RMASK, rd, lsr#(24 - 3); \ + ldr rd, [RTAB, rd]; \ + \ + orr rnd, rnd, RT0; \ + orr rna, rna, RT1, ror #24; \ + orr rnb, rnb, RT2, ror #16; \ + orr rnc, rnc, rd, ror #8; + +#define firstdecround(round, ra, rb, rc, rd, rna, rnb, rnc, rnd) \ + addroundkey_dec(((round) + 1), ra, rb, rc, rd, rna, rnb, rnc, rnd); \ + do_decround(round, ra, rb, rc, rd, rna, rnb, rnc, rnd, preload_first_key); + +#define decround(round, ra, rb, rc, rd, rna, rnb, rnc, rnd, preload_key) \ + do_decround(round, ra, rb, rc, rd, rna, rnb, rnc, rnd, preload_key); + +#define lastdecround(round, ra, rb, rc, rd, rna, rnb, rnc, rnd) \ + add RTAB, #4; \ + do_lastdecround(ra, rb, rc, rd, rna, rnb, rnc, rnd); \ + addroundkey(rna, rnb, rnc, rnd, ra, rb, rc, rd, dummy); + +.align 3 +.global _gcry_aes_arm_decrypt_block +.type _gcry_aes_arm_decrypt_block,%function; + +_gcry_aes_arm_decrypt_block: + /* input: + * %r0: keysched, CTX + * %r1: dst + * %r2: src + * %r3: number of rounds.. 
10, 12 or 14 + */ + push {%r4-%r11, %ip, %lr}; + + /* read input block */ +#ifndef __ARM_FEATURE_UNALIGNED + /* test if src is unaligned */ + tst %r2, #3; + beq 1f; + + /* unaligned load */ + ldr_unaligned_le(RA, %r2, 0, RNA); + ldr_unaligned_le(RB, %r2, 4, RNB); + ldr_unaligned_le(RC, %r2, 8, RNA); + ldr_unaligned_le(RD, %r2, 12, RNB); + b 2f; +.ltorg +1: +#endif + /* aligned load */ + ldm %r2, {RA, RB, RC, RD}; +#ifndef __ARMEL__ + rev RA, RA; + rev RB, RB; + rev RC, RC; + rev RD, RD; +#endif +2: + sub %sp, #16; + + ldr RTAB, =.LtableD0; + + mov RMASK, #0xff; + str %r1, [%sp, #4]; /* dst */ + mov RMASK, RMASK, lsl#3; /* byte mask */ + + cmp %r3, #12; + bge .Ldec_256; + + firstdecround(9, RA, RB, RC, RD, RNA, RNB, RNC, RND); +.Ldec_tail: + decround(8, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); + decround(7, RA, RB, RC, RD, RNA, RNB, RNC, RND, preload_first_key); + decround(6, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); + decround(5, RA, RB, RC, RD, RNA, RNB, RNC, RND, preload_first_key); + decround(4, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); + decround(3, RA, RB, RC, RD, RNA, RNB, RNC, RND, preload_first_key); + decround(2, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); + decround(1, RA, RB, RC, RD, RNA, RNB, RNC, RND, dummy); + lastdecround(0, RNA, RNB, RNC, RND, RA, RB, RC, RD); + + ldr RT0, [%sp, #4]; /* dst */ + add %sp, #16; + + /* store output block */ +#ifndef __ARM_FEATURE_UNALIGNED + /* test if dst is unaligned */ + tst RT0, #3; + beq 1f; + + /* unaligned store */ + str_unaligned_le(RA, RT0, 0, RNA, RNB); + str_unaligned_le(RB, RT0, 4, RNA, RNB); + str_unaligned_le(RC, RT0, 8, RNA, RNB); + str_unaligned_le(RD, RT0, 12, RNA, RNB); + b 2f; +.ltorg +1: +#endif + /* aligned store */ +#ifndef __ARMEL__ + rev RA, RA; + rev RB, RB; + rev RC, RC; + rev RD, RD; +#endif + /* write output block */ + stm RT0, {RA, RB, RC, RD}; +2: + pop {%r4-%r11, %ip, %pc}; + +.ltorg +.Ldec_256: + beq .Ldec_192; + + firstdecround(13, 
RA, RB, RC, RD, RNA, RNB, RNC, RND); + decround(12, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); + decround(11, RA, RB, RC, RD, RNA, RNB, RNC, RND, preload_first_key); + decround(10, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); + decround(9, RA, RB, RC, RD, RNA, RNB, RNC, RND, preload_first_key); + + b .Ldec_tail; + +.ltorg +.Ldec_192: + firstdecround(11, RA, RB, RC, RD, RNA, RNB, RNC, RND); + decround(10, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); + decround(9, RA, RB, RC, RD, RNA, RNB, RNC, RND, preload_first_key); + + b .Ldec_tail; +.size _gcry_aes_arm_encrypt_block,.-_gcry_aes_arm_encrypt_block; + +.data + +/* Encryption tables */ +.align 5 +.type .LtableE0, %object +.type .LtableEs0, %object +.LtableE0: +.long 0xa56363c6 +.LtableEs0: +.long 0x00000063, 0x847c7cf8, 0x0000007c +.long 0x997777ee, 0x00000077, 0x8d7b7bf6, 0x0000007b +.long 0x0df2f2ff, 0x000000f2, 0xbd6b6bd6, 0x0000006b +.long 0xb16f6fde, 0x0000006f, 0x54c5c591, 0x000000c5 +.long 0x50303060, 0x00000030, 0x03010102, 0x00000001 +.long 0xa96767ce, 0x00000067, 0x7d2b2b56, 0x0000002b +.long 0x19fefee7, 0x000000fe, 0x62d7d7b5, 0x000000d7 +.long 0xe6abab4d, 0x000000ab, 0x9a7676ec, 0x00000076 +.long 0x45caca8f, 0x000000ca, 0x9d82821f, 0x00000082 +.long 0x40c9c989, 0x000000c9, 0x877d7dfa, 0x0000007d +.long 0x15fafaef, 0x000000fa, 0xeb5959b2, 0x00000059 +.long 0xc947478e, 0x00000047, 0x0bf0f0fb, 0x000000f0 +.long 0xecadad41, 0x000000ad, 0x67d4d4b3, 0x000000d4 +.long 0xfda2a25f, 0x000000a2, 0xeaafaf45, 0x000000af +.long 0xbf9c9c23, 0x0000009c, 0xf7a4a453, 0x000000a4 +.long 0x967272e4, 0x00000072, 0x5bc0c09b, 0x000000c0 +.long 0xc2b7b775, 0x000000b7, 0x1cfdfde1, 0x000000fd +.long 0xae93933d, 0x00000093, 0x6a26264c, 0x00000026 +.long 0x5a36366c, 0x00000036, 0x413f3f7e, 0x0000003f +.long 0x02f7f7f5, 0x000000f7, 0x4fcccc83, 0x000000cc +.long 0x5c343468, 0x00000034, 0xf4a5a551, 0x000000a5 +.long 0x34e5e5d1, 0x000000e5, 0x08f1f1f9, 0x000000f1 +.long 0x937171e2, 0x00000071, 
0x73d8d8ab, 0x000000d8 +.long 0x53313162, 0x00000031, 0x3f15152a, 0x00000015 +.long 0x0c040408, 0x00000004, 0x52c7c795, 0x000000c7 +.long 0x65232346, 0x00000023, 0x5ec3c39d, 0x000000c3 +.long 0x28181830, 0x00000018, 0xa1969637, 0x00000096 +.long 0x0f05050a, 0x00000005, 0xb59a9a2f, 0x0000009a +.long 0x0907070e, 0x00000007, 0x36121224, 0x00000012 +.long 0x9b80801b, 0x00000080, 0x3de2e2df, 0x000000e2 +.long 0x26ebebcd, 0x000000eb, 0x6927274e, 0x00000027 +.long 0xcdb2b27f, 0x000000b2, 0x9f7575ea, 0x00000075 +.long 0x1b090912, 0x00000009, 0x9e83831d, 0x00000083 +.long 0x742c2c58, 0x0000002c, 0x2e1a1a34, 0x0000001a +.long 0x2d1b1b36, 0x0000001b, 0xb26e6edc, 0x0000006e +.long 0xee5a5ab4, 0x0000005a, 0xfba0a05b, 0x000000a0 +.long 0xf65252a4, 0x00000052, 0x4d3b3b76, 0x0000003b +.long 0x61d6d6b7, 0x000000d6, 0xceb3b37d, 0x000000b3 +.long 0x7b292952, 0x00000029, 0x3ee3e3dd, 0x000000e3 +.long 0x712f2f5e, 0x0000002f, 0x97848413, 0x00000084 +.long 0xf55353a6, 0x00000053, 0x68d1d1b9, 0x000000d1 +.long 0x00000000, 0x00000000, 0x2cededc1, 0x000000ed +.long 0x60202040, 0x00000020, 0x1ffcfce3, 0x000000fc +.long 0xc8b1b179, 0x000000b1, 0xed5b5bb6, 0x0000005b +.long 0xbe6a6ad4, 0x0000006a, 0x46cbcb8d, 0x000000cb +.long 0xd9bebe67, 0x000000be, 0x4b393972, 0x00000039 +.long 0xde4a4a94, 0x0000004a, 0xd44c4c98, 0x0000004c +.long 0xe85858b0, 0x00000058, 0x4acfcf85, 0x000000cf +.long 0x6bd0d0bb, 0x000000d0, 0x2aefefc5, 0x000000ef +.long 0xe5aaaa4f, 0x000000aa, 0x16fbfbed, 0x000000fb +.long 0xc5434386, 0x00000043, 0xd74d4d9a, 0x0000004d +.long 0x55333366, 0x00000033, 0x94858511, 0x00000085 +.long 0xcf45458a, 0x00000045, 0x10f9f9e9, 0x000000f9 +.long 0x06020204, 0x00000002, 0x817f7ffe, 0x0000007f +.long 0xf05050a0, 0x00000050, 0x443c3c78, 0x0000003c +.long 0xba9f9f25, 0x0000009f, 0xe3a8a84b, 0x000000a8 +.long 0xf35151a2, 0x00000051, 0xfea3a35d, 0x000000a3 +.long 0xc0404080, 0x00000040, 0x8a8f8f05, 0x0000008f +.long 0xad92923f, 0x00000092, 0xbc9d9d21, 0x0000009d +.long 0x48383870, 0x00000038, 
0x04f5f5f1, 0x000000f5 +.long 0xdfbcbc63, 0x000000bc, 0xc1b6b677, 0x000000b6 +.long 0x75dadaaf, 0x000000da, 0x63212142, 0x00000021 +.long 0x30101020, 0x00000010, 0x1affffe5, 0x000000ff +.long 0x0ef3f3fd, 0x000000f3, 0x6dd2d2bf, 0x000000d2 +.long 0x4ccdcd81, 0x000000cd, 0x140c0c18, 0x0000000c +.long 0x35131326, 0x00000013, 0x2fececc3, 0x000000ec +.long 0xe15f5fbe, 0x0000005f, 0xa2979735, 0x00000097 +.long 0xcc444488, 0x00000044, 0x3917172e, 0x00000017 +.long 0x57c4c493, 0x000000c4, 0xf2a7a755, 0x000000a7 +.long 0x827e7efc, 0x0000007e, 0x473d3d7a, 0x0000003d +.long 0xac6464c8, 0x00000064, 0xe75d5dba, 0x0000005d +.long 0x2b191932, 0x00000019, 0x957373e6, 0x00000073 +.long 0xa06060c0, 0x00000060, 0x98818119, 0x00000081 +.long 0xd14f4f9e, 0x0000004f, 0x7fdcdca3, 0x000000dc +.long 0x66222244, 0x00000022, 0x7e2a2a54, 0x0000002a +.long 0xab90903b, 0x00000090, 0x8388880b, 0x00000088 +.long 0xca46468c, 0x00000046, 0x29eeeec7, 0x000000ee +.long 0xd3b8b86b, 0x000000b8, 0x3c141428, 0x00000014 +.long 0x79dedea7, 0x000000de, 0xe25e5ebc, 0x0000005e +.long 0x1d0b0b16, 0x0000000b, 0x76dbdbad, 0x000000db +.long 0x3be0e0db, 0x000000e0, 0x56323264, 0x00000032 +.long 0x4e3a3a74, 0x0000003a, 0x1e0a0a14, 0x0000000a +.long 0xdb494992, 0x00000049, 0x0a06060c, 0x00000006 +.long 0x6c242448, 0x00000024, 0xe45c5cb8, 0x0000005c +.long 0x5dc2c29f, 0x000000c2, 0x6ed3d3bd, 0x000000d3 +.long 0xefacac43, 0x000000ac, 0xa66262c4, 0x00000062 +.long 0xa8919139, 0x00000091, 0xa4959531, 0x00000095 +.long 0x37e4e4d3, 0x000000e4, 0x8b7979f2, 0x00000079 +.long 0x32e7e7d5, 0x000000e7, 0x43c8c88b, 0x000000c8 +.long 0x5937376e, 0x00000037, 0xb76d6dda, 0x0000006d +.long 0x8c8d8d01, 0x0000008d, 0x64d5d5b1, 0x000000d5 +.long 0xd24e4e9c, 0x0000004e, 0xe0a9a949, 0x000000a9 +.long 0xb46c6cd8, 0x0000006c, 0xfa5656ac, 0x00000056 +.long 0x07f4f4f3, 0x000000f4, 0x25eaeacf, 0x000000ea +.long 0xaf6565ca, 0x00000065, 0x8e7a7af4, 0x0000007a +.long 0xe9aeae47, 0x000000ae, 0x18080810, 0x00000008 +.long 0xd5baba6f, 0x000000ba, 
0x887878f0, 0x00000078 +.long 0x6f25254a, 0x00000025, 0x722e2e5c, 0x0000002e +.long 0x241c1c38, 0x0000001c, 0xf1a6a657, 0x000000a6 +.long 0xc7b4b473, 0x000000b4, 0x51c6c697, 0x000000c6 +.long 0x23e8e8cb, 0x000000e8, 0x7cdddda1, 0x000000dd +.long 0x9c7474e8, 0x00000074, 0x211f1f3e, 0x0000001f +.long 0xdd4b4b96, 0x0000004b, 0xdcbdbd61, 0x000000bd +.long 0x868b8b0d, 0x0000008b, 0x858a8a0f, 0x0000008a +.long 0x907070e0, 0x00000070, 0x423e3e7c, 0x0000003e +.long 0xc4b5b571, 0x000000b5, 0xaa6666cc, 0x00000066 +.long 0xd8484890, 0x00000048, 0x05030306, 0x00000003 +.long 0x01f6f6f7, 0x000000f6, 0x120e0e1c, 0x0000000e +.long 0xa36161c2, 0x00000061, 0x5f35356a, 0x00000035 +.long 0xf95757ae, 0x00000057, 0xd0b9b969, 0x000000b9 +.long 0x91868617, 0x00000086, 0x58c1c199, 0x000000c1 +.long 0x271d1d3a, 0x0000001d, 0xb99e9e27, 0x0000009e +.long 0x38e1e1d9, 0x000000e1, 0x13f8f8eb, 0x000000f8 +.long 0xb398982b, 0x00000098, 0x33111122, 0x00000011 +.long 0xbb6969d2, 0x00000069, 0x70d9d9a9, 0x000000d9 +.long 0x898e8e07, 0x0000008e, 0xa7949433, 0x00000094 +.long 0xb69b9b2d, 0x0000009b, 0x221e1e3c, 0x0000001e +.long 0x92878715, 0x00000087, 0x20e9e9c9, 0x000000e9 +.long 0x49cece87, 0x000000ce, 0xff5555aa, 0x00000055 +.long 0x78282850, 0x00000028, 0x7adfdfa5, 0x000000df +.long 0x8f8c8c03, 0x0000008c, 0xf8a1a159, 0x000000a1 +.long 0x80898909, 0x00000089, 0x170d0d1a, 0x0000000d +.long 0xdabfbf65, 0x000000bf, 0x31e6e6d7, 0x000000e6 +.long 0xc6424284, 0x00000042, 0xb86868d0, 0x00000068 +.long 0xc3414182, 0x00000041, 0xb0999929, 0x00000099 +.long 0x772d2d5a, 0x0000002d, 0x110f0f1e, 0x0000000f +.long 0xcbb0b07b, 0x000000b0, 0xfc5454a8, 0x00000054 +.long 0xd6bbbb6d, 0x000000bb, 0x3a16162c, 0x00000016 + +/* Decryption tables */ +.align 5 +.type .LtableD0, %object +.type .LtableDs0, %object +.LtableD0: +.long 0x50a7f451 +.LtableDs0: +.long 0x00000052, 0x5365417e, 0x00000009 +.long 0xc3a4171a, 0x0000006a, 0x965e273a, 0x000000d5 +.long 0xcb6bab3b, 0x00000030, 0xf1459d1f, 0x00000036 +.long 0xab58faac, 
0x000000a5, 0x9303e34b, 0x00000038 +.long 0x55fa3020, 0x000000bf, 0xf66d76ad, 0x00000040 +.long 0x9176cc88, 0x000000a3, 0x254c02f5, 0x0000009e +.long 0xfcd7e54f, 0x00000081, 0xd7cb2ac5, 0x000000f3 +.long 0x80443526, 0x000000d7, 0x8fa362b5, 0x000000fb +.long 0x495ab1de, 0x0000007c, 0x671bba25, 0x000000e3 +.long 0x980eea45, 0x00000039, 0xe1c0fe5d, 0x00000082 +.long 0x02752fc3, 0x0000009b, 0x12f04c81, 0x0000002f +.long 0xa397468d, 0x000000ff, 0xc6f9d36b, 0x00000087 +.long 0xe75f8f03, 0x00000034, 0x959c9215, 0x0000008e +.long 0xeb7a6dbf, 0x00000043, 0xda595295, 0x00000044 +.long 0x2d83bed4, 0x000000c4, 0xd3217458, 0x000000de +.long 0x2969e049, 0x000000e9, 0x44c8c98e, 0x000000cb +.long 0x6a89c275, 0x00000054, 0x78798ef4, 0x0000007b +.long 0x6b3e5899, 0x00000094, 0xdd71b927, 0x00000032 +.long 0xb64fe1be, 0x000000a6, 0x17ad88f0, 0x000000c2 +.long 0x66ac20c9, 0x00000023, 0xb43ace7d, 0x0000003d +.long 0x184adf63, 0x000000ee, 0x82311ae5, 0x0000004c +.long 0x60335197, 0x00000095, 0x457f5362, 0x0000000b +.long 0xe07764b1, 0x00000042, 0x84ae6bbb, 0x000000fa +.long 0x1ca081fe, 0x000000c3, 0x942b08f9, 0x0000004e +.long 0x58684870, 0x00000008, 0x19fd458f, 0x0000002e +.long 0x876cde94, 0x000000a1, 0xb7f87b52, 0x00000066 +.long 0x23d373ab, 0x00000028, 0xe2024b72, 0x000000d9 +.long 0x578f1fe3, 0x00000024, 0x2aab5566, 0x000000b2 +.long 0x0728ebb2, 0x00000076, 0x03c2b52f, 0x0000005b +.long 0x9a7bc586, 0x000000a2, 0xa50837d3, 0x00000049 +.long 0xf2872830, 0x0000006d, 0xb2a5bf23, 0x0000008b +.long 0xba6a0302, 0x000000d1, 0x5c8216ed, 0x00000025 +.long 0x2b1ccf8a, 0x00000072, 0x92b479a7, 0x000000f8 +.long 0xf0f207f3, 0x000000f6, 0xa1e2694e, 0x00000064 +.long 0xcdf4da65, 0x00000086, 0xd5be0506, 0x00000068 +.long 0x1f6234d1, 0x00000098, 0x8afea6c4, 0x00000016 +.long 0x9d532e34, 0x000000d4, 0xa055f3a2, 0x000000a4 +.long 0x32e18a05, 0x0000005c, 0x75ebf6a4, 0x000000cc +.long 0x39ec830b, 0x0000005d, 0xaaef6040, 0x00000065 +.long 0x069f715e, 0x000000b6, 0x51106ebd, 0x00000092 +.long 0xf98a213e, 
0x0000006c, 0x3d06dd96, 0x00000070 +.long 0xae053edd, 0x00000048, 0x46bde64d, 0x00000050 +.long 0xb58d5491, 0x000000fd, 0x055dc471, 0x000000ed +.long 0x6fd40604, 0x000000b9, 0xff155060, 0x000000da +.long 0x24fb9819, 0x0000005e, 0x97e9bdd6, 0x00000015 +.long 0xcc434089, 0x00000046, 0x779ed967, 0x00000057 +.long 0xbd42e8b0, 0x000000a7, 0x888b8907, 0x0000008d +.long 0x385b19e7, 0x0000009d, 0xdbeec879, 0x00000084 +.long 0x470a7ca1, 0x00000090, 0xe90f427c, 0x000000d8 +.long 0xc91e84f8, 0x000000ab, 0x00000000, 0x00000000 +.long 0x83868009, 0x0000008c, 0x48ed2b32, 0x000000bc +.long 0xac70111e, 0x000000d3, 0x4e725a6c, 0x0000000a +.long 0xfbff0efd, 0x000000f7, 0x5638850f, 0x000000e4 +.long 0x1ed5ae3d, 0x00000058, 0x27392d36, 0x00000005 +.long 0x64d90f0a, 0x000000b8, 0x21a65c68, 0x000000b3 +.long 0xd1545b9b, 0x00000045, 0x3a2e3624, 0x00000006 +.long 0xb1670a0c, 0x000000d0, 0x0fe75793, 0x0000002c +.long 0xd296eeb4, 0x0000001e, 0x9e919b1b, 0x0000008f +.long 0x4fc5c080, 0x000000ca, 0xa220dc61, 0x0000003f +.long 0x694b775a, 0x0000000f, 0x161a121c, 0x00000002 +.long 0x0aba93e2, 0x000000c1, 0xe52aa0c0, 0x000000af +.long 0x43e0223c, 0x000000bd, 0x1d171b12, 0x00000003 +.long 0x0b0d090e, 0x00000001, 0xadc78bf2, 0x00000013 +.long 0xb9a8b62d, 0x0000008a, 0xc8a91e14, 0x0000006b +.long 0x8519f157, 0x0000003a, 0x4c0775af, 0x00000091 +.long 0xbbdd99ee, 0x00000011, 0xfd607fa3, 0x00000041 +.long 0x9f2601f7, 0x0000004f, 0xbcf5725c, 0x00000067 +.long 0xc53b6644, 0x000000dc, 0x347efb5b, 0x000000ea +.long 0x7629438b, 0x00000097, 0xdcc623cb, 0x000000f2 +.long 0x68fcedb6, 0x000000cf, 0x63f1e4b8, 0x000000ce +.long 0xcadc31d7, 0x000000f0, 0x10856342, 0x000000b4 +.long 0x40229713, 0x000000e6, 0x2011c684, 0x00000073 +.long 0x7d244a85, 0x00000096, 0xf83dbbd2, 0x000000ac +.long 0x1132f9ae, 0x00000074, 0x6da129c7, 0x00000022 +.long 0x4b2f9e1d, 0x000000e7, 0xf330b2dc, 0x000000ad +.long 0xec52860d, 0x00000035, 0xd0e3c177, 0x00000085 +.long 0x6c16b32b, 0x000000e2, 0x99b970a9, 0x000000f9 +.long 0xfa489411, 
0x00000037, 0x2264e947, 0x000000e8 +.long 0xc48cfca8, 0x0000001c, 0x1a3ff0a0, 0x00000075 +.long 0xd82c7d56, 0x000000df, 0xef903322, 0x0000006e +.long 0xc74e4987, 0x00000047, 0xc1d138d9, 0x000000f1 +.long 0xfea2ca8c, 0x0000001a, 0x360bd498, 0x00000071 +.long 0xcf81f5a6, 0x0000001d, 0x28de7aa5, 0x00000029 +.long 0x268eb7da, 0x000000c5, 0xa4bfad3f, 0x00000089 +.long 0xe49d3a2c, 0x0000006f, 0x0d927850, 0x000000b7 +.long 0x9bcc5f6a, 0x00000062, 0x62467e54, 0x0000000e +.long 0xc2138df6, 0x000000aa, 0xe8b8d890, 0x00000018 +.long 0x5ef7392e, 0x000000be, 0xf5afc382, 0x0000001b +.long 0xbe805d9f, 0x000000fc, 0x7c93d069, 0x00000056 +.long 0xa92dd56f, 0x0000003e, 0xb31225cf, 0x0000004b +.long 0x3b99acc8, 0x000000c6, 0xa77d1810, 0x000000d2 +.long 0x6e639ce8, 0x00000079, 0x7bbb3bdb, 0x00000020 +.long 0x097826cd, 0x0000009a, 0xf418596e, 0x000000db +.long 0x01b79aec, 0x000000c0, 0xa89a4f83, 0x000000fe +.long 0x656e95e6, 0x00000078, 0x7ee6ffaa, 0x000000cd +.long 0x08cfbc21, 0x0000005a, 0xe6e815ef, 0x000000f4 +.long 0xd99be7ba, 0x0000001f, 0xce366f4a, 0x000000dd +.long 0xd4099fea, 0x000000a8, 0xd67cb029, 0x00000033 +.long 0xafb2a431, 0x00000088, 0x31233f2a, 0x00000007 +.long 0x3094a5c6, 0x000000c7, 0xc066a235, 0x00000031 +.long 0x37bc4e74, 0x000000b1, 0xa6ca82fc, 0x00000012 +.long 0xb0d090e0, 0x00000010, 0x15d8a733, 0x00000059 +.long 0x4a9804f1, 0x00000027, 0xf7daec41, 0x00000080 +.long 0x0e50cd7f, 0x000000ec, 0x2ff69117, 0x0000005f +.long 0x8dd64d76, 0x00000060, 0x4db0ef43, 0x00000051 +.long 0x544daacc, 0x0000007f, 0xdf0496e4, 0x000000a9 +.long 0xe3b5d19e, 0x00000019, 0x1b886a4c, 0x000000b5 +.long 0xb81f2cc1, 0x0000004a, 0x7f516546, 0x0000000d +.long 0x04ea5e9d, 0x0000002d, 0x5d358c01, 0x000000e5 +.long 0x737487fa, 0x0000007a, 0x2e410bfb, 0x0000009f +.long 0x5a1d67b3, 0x00000093, 0x52d2db92, 0x000000c9 +.long 0x335610e9, 0x0000009c, 0x1347d66d, 0x000000ef +.long 0x8c61d79a, 0x000000a0, 0x7a0ca137, 0x000000e0 +.long 0x8e14f859, 0x0000003b, 0x893c13eb, 0x0000004d +.long 0xee27a9ce, 
0x000000ae, 0x35c961b7, 0x0000002a
+.long 0xede51ce1, 0x000000f5, 0x3cb1477a, 0x000000b0
+.long 0x59dfd29c, 0x000000c8, 0x3f73f255, 0x000000eb
+.long 0x79ce1418, 0x000000bb, 0xbf37c773, 0x0000003c
+.long 0xeacdf753, 0x00000083, 0x5baafd5f, 0x00000053
+.long 0x146f3ddf, 0x00000099, 0x86db4478, 0x00000061
+.long 0x81f3afca, 0x00000017, 0x3ec468b9, 0x0000002b
+.long 0x2c342438, 0x00000004, 0x5f40a3c2, 0x0000007e
+.long 0x72c31d16, 0x000000ba, 0x0c25e2bc, 0x00000077
+.long 0x8b493c28, 0x000000d6, 0x41950dff, 0x00000026
+.long 0x7101a839, 0x000000e1, 0xdeb30c08, 0x00000069
+.long 0x9ce4b4d8, 0x00000014, 0x90c15664, 0x00000063
+.long 0x6184cb7b, 0x00000055, 0x70b632d5, 0x00000021
+.long 0x745c6c48, 0x0000000c, 0x4257b8d0, 0x0000007d
+
+#endif /*HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS*/
+#endif /*__ARMEL__ */
diff --git a/cipher/rijndael-armv6.S b/cipher/rijndael-armv6.S
deleted file mode 100644
index bbbfb0e..0000000
--- a/cipher/rijndael-armv6.S
+++ /dev/null
@@ -1,853 +0,0 @@
-/* rijndael-armv6.S - ARM assembly implementation of AES cipher
- *
- * Copyright © 2013 Jussi Kivilinna
- *
- * This file is part of Libgcrypt.
- *
- * Libgcrypt is free software; you can redistribute it and/or modify
- * it under the terms of the GNU Lesser General Public License as
- * published by the Free Software Foundation; either version 2.1 of
- * the License, or (at your option) any later version.
- *
- * Libgcrypt is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public
- * License along with this program; if not, see <http://www.gnu.org/licenses/>.
- */
-
-#include <config.h>
-
-#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__)
-#ifdef HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS
-
-.text
-
-.syntax unified
-.arm
-
-/* register macros */
-#define CTX %r0
-#define RTAB %lr
-#define RMASK %ip
-
-#define RA %r4
-#define RB %r5
-#define RC %r6
-#define RD %r7
-
-#define RNA %r8
-#define RNB %r9
-#define RNC %r10
-#define RND %r11
-
-#define RT0 %r1
-#define RT1 %r2
-#define RT2 %r3
-
-/* helper macros */
-#define ldr_unaligned_le(rout, rsrc, offs, rtmp) \
- ldrb rout, [rsrc, #((offs) + 0)]; \
- ldrb rtmp, [rsrc, #((offs) + 1)]; \
- orr rout, rout, rtmp, lsl #8; \
- ldrb rtmp, [rsrc, #((offs) + 2)]; \
- orr rout, rout, rtmp, lsl #16; \
- ldrb rtmp, [rsrc, #((offs) + 3)]; \
- orr rout, rout, rtmp, lsl #24;
-
-#define str_unaligned_le(rin, rdst, offs, rtmp0, rtmp1) \
- mov rtmp0, rin, lsr #8; \
- strb rin, [rdst, #((offs) + 0)]; \
- mov rtmp1, rin, lsr #16; \
- strb rtmp0, [rdst, #((offs) + 1)]; \
- mov rtmp0, rin, lsr #24; \
- strb rtmp1, [rdst, #((offs) + 2)]; \
- strb rtmp0, [rdst, #((offs) + 3)];
-
-/***********************************************************************
- * ARM assembly implementation of the AES cipher
- ***********************************************************************/
-#define preload_first_key(round, ra) \
- ldr ra, [CTX, #(((round) * 16) + 0 * 4)];
-
-#define dummy(round, ra) /* nothing */
-
-#define addroundkey(ra, rb, rc, rd, rna, rnb, rnc, rnd, preload_key) \
- ldm CTX, {rna, rnb, rnc, rnd}; \
- eor ra, rna; \
- eor rb, rnb; \
- eor rc, rnc; \
- preload_key(1, rna); \
- eor rd, rnd;
-
-#define do_encround(next_r, ra, rb, rc, rd, rna, rnb, rnc, rnd, preload_key) \
- ldr rnb, [CTX, #(((next_r) * 16) + 1 * 4)]; \
- \
- and RT0, RMASK, ra, lsl#3; \
- ldr rnc, [CTX, #(((next_r) * 16) + 2 * 4)]; \
- and RT1, RMASK, ra, lsr#(8 - 3); \
- ldr rnd, [CTX, #(((next_r) * 16) + 3 * 4)]; \
- and RT2, RMASK, ra, lsr#(16 - 3); \
- ldr RT0, [RTAB, RT0]; \
- and ra, RMASK, ra, lsr#(24 - 3); \
- \
- ldr RT1, [RTAB,
RT1]; \ - eor rna, rna, RT0; \ - ldr RT2, [RTAB, RT2]; \ - and RT0, RMASK, rd, lsl#3; \ - ldr ra, [RTAB, ra]; \ - \ - eor rnd, rnd, RT1, ror #24; \ - and RT1, RMASK, rd, lsr#(8 - 3); \ - eor rnc, rnc, RT2, ror #16; \ - and RT2, RMASK, rd, lsr#(16 - 3); \ - eor rnb, rnb, ra, ror #8; \ - ldr RT0, [RTAB, RT0]; \ - and rd, RMASK, rd, lsr#(24 - 3); \ - \ - ldr RT1, [RTAB, RT1]; \ - eor rnd, rnd, RT0; \ - ldr RT2, [RTAB, RT2]; \ - and RT0, RMASK, rc, lsl#3; \ - ldr rd, [RTAB, rd]; \ - \ - eor rnc, rnc, RT1, ror #24; \ - and RT1, RMASK, rc, lsr#(8 - 3); \ - eor rnb, rnb, RT2, ror #16; \ - and RT2, RMASK, rc, lsr#(16 - 3); \ - eor rna, rna, rd, ror #8; \ - ldr RT0, [RTAB, RT0]; \ - and rc, RMASK, rc, lsr#(24 - 3); \ - \ - ldr RT1, [RTAB, RT1]; \ - eor rnc, rnc, RT0; \ - ldr RT2, [RTAB, RT2]; \ - and RT0, RMASK, rb, lsl#3; \ - ldr rc, [RTAB, rc]; \ - \ - eor rnb, rnb, RT1, ror #24; \ - and RT1, RMASK, rb, lsr#(8 - 3); \ - eor rna, rna, RT2, ror #16; \ - and RT2, RMASK, rb, lsr#(16 - 3); \ - eor rnd, rnd, rc, ror #8; \ - ldr RT0, [RTAB, RT0]; \ - and rb, RMASK, rb, lsr#(24 - 3); \ - \ - ldr RT1, [RTAB, RT1]; \ - eor rnb, rnb, RT0; \ - ldr RT2, [RTAB, RT2]; \ - eor rna, rna, RT1, ror #24; \ - ldr rb, [RTAB, rb]; \ - \ - eor rnd, rnd, RT2, ror #16; \ - preload_key((next_r) + 1, ra); \ - eor rnc, rnc, rb, ror #8; - -#define do_lastencround(ra, rb, rc, rd, rna, rnb, rnc, rnd) \ - and RT0, RMASK, ra, lsl#3; \ - and RT1, RMASK, ra, lsr#(8 - 3); \ - and RT2, RMASK, ra, lsr#(16 - 3); \ - ldr rna, [RTAB, RT0]; \ - and ra, RMASK, ra, lsr#(24 - 3); \ - ldr rnd, [RTAB, RT1]; \ - and RT0, RMASK, rd, lsl#3; \ - ldr rnc, [RTAB, RT2]; \ - mov rnd, rnd, ror #24; \ - ldr rnb, [RTAB, ra]; \ - and RT1, RMASK, rd, lsr#(8 - 3); \ - mov rnc, rnc, ror #16; \ - and RT2, RMASK, rd, lsr#(16 - 3); \ - mov rnb, rnb, ror #8; \ - ldr RT0, [RTAB, RT0]; \ - and rd, RMASK, rd, lsr#(24 - 3); \ - ldr RT1, [RTAB, RT1]; \ - \ - orr rnd, rnd, RT0; \ - ldr RT2, [RTAB, RT2]; \ - and RT0, RMASK, rc, lsl#3; \ - ldr 
rd, [RTAB, rd]; \ - orr rnc, rnc, RT1, ror #24; \ - and RT1, RMASK, rc, lsr#(8 - 3); \ - orr rnb, rnb, RT2, ror #16; \ - and RT2, RMASK, rc, lsr#(16 - 3); \ - orr rna, rna, rd, ror #8; \ - ldr RT0, [RTAB, RT0]; \ - and rc, RMASK, rc, lsr#(24 - 3); \ - ldr RT1, [RTAB, RT1]; \ - \ - orr rnc, rnc, RT0; \ - ldr RT2, [RTAB, RT2]; \ - and RT0, RMASK, rb, lsl#3; \ - ldr rc, [RTAB, rc]; \ - orr rnb, rnb, RT1, ror #24; \ - and RT1, RMASK, rb, lsr#(8 - 3); \ - orr rna, rna, RT2, ror #16; \ - ldr RT0, [RTAB, RT0]; \ - and RT2, RMASK, rb, lsr#(16 - 3); \ - ldr RT1, [RTAB, RT1]; \ - orr rnd, rnd, rc, ror #8; \ - ldr RT2, [RTAB, RT2]; \ - and rb, RMASK, rb, lsr#(24 - 3); \ - ldr rb, [RTAB, rb]; \ - \ - orr rnb, rnb, RT0; \ - orr rna, rna, RT1, ror #24; \ - orr rnd, rnd, RT2, ror #16; \ - orr rnc, rnc, rb, ror #8; - -#define firstencround(round, ra, rb, rc, rd, rna, rnb, rnc, rnd) \ - addroundkey(ra, rb, rc, rd, rna, rnb, rnc, rnd, preload_first_key); \ - do_encround((round) + 1, ra, rb, rc, rd, rna, rnb, rnc, rnd, preload_first_key); - -#define encround(round, ra, rb, rc, rd, rna, rnb, rnc, rnd, preload_key) \ - do_encround((round) + 1, ra, rb, rc, rd, rna, rnb, rnc, rnd, preload_key); - -#define lastencround(round, ra, rb, rc, rd, rna, rnb, rnc, rnd) \ - add CTX, #(((round) + 1) * 16); \ - add RTAB, #4; \ - do_lastencround(ra, rb, rc, rd, rna, rnb, rnc, rnd); \ - addroundkey(rna, rnb, rnc, rnd, ra, rb, rc, rd, dummy); - -.align 3 -.global _gcry_aes_armv6_encrypt_block -.type _gcry_aes_armv6_encrypt_block,%function; - -_gcry_aes_armv6_encrypt_block: - /* input: - * %r0: keysched, CTX - * %r1: dst - * %r2: src - * %r3: number of rounds.. 
10, 12 or 14 - */ - push {%r4-%r11, %ip, %lr}; - - /* read input block */ -#ifndef __ARM_FEATURE_UNALIGNED - /* test if src is unaligned */ - tst %r2, #3; - beq 1f; - - /* unaligned load */ - ldr_unaligned_le(RA, %r2, 0, RNA); - ldr_unaligned_le(RB, %r2, 4, RNB); - ldr_unaligned_le(RC, %r2, 8, RNA); - ldr_unaligned_le(RD, %r2, 12, RNB); - b 2f; -.ltorg -1: -#endif - /* aligned load */ - ldm %r2, {RA, RB, RC, RD}; -#ifndef __ARMEL__ - rev RA, RA; - rev RB, RB; - rev RC, RC; - rev RD, RD; -#endif -2: - sub %sp, #16; - - ldr RTAB, =.LtableE0; - - str %r1, [%sp, #4]; /* dst */ - mov RMASK, #0xff; - str %r3, [%sp, #8]; /* nrounds */ - mov RMASK, RMASK, lsl#3; /* byte mask */ - - firstencround(0, RA, RB, RC, RD, RNA, RNB, RNC, RND); - encround(1, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); - encround(2, RA, RB, RC, RD, RNA, RNB, RNC, RND, preload_first_key); - encround(3, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); - encround(4, RA, RB, RC, RD, RNA, RNB, RNC, RND, preload_first_key); - encround(5, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); - encround(6, RA, RB, RC, RD, RNA, RNB, RNC, RND, preload_first_key); - encround(7, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); - - ldr RT0, [%sp, #8]; /* nrounds */ - cmp RT0, #12; - bge .Lenc_not_128; - - encround(8, RA, RB, RC, RD, RNA, RNB, RNC, RND, dummy); - lastencround(9, RNA, RNB, RNC, RND, RA, RB, RC, RD); - -.Lenc_done: - ldr RT0, [%sp, #4]; /* dst */ - add %sp, #16; - - /* store output block */ -#ifndef __ARM_FEATURE_UNALIGNED - /* test if dst is unaligned */ - tst RT0, #3; - beq 1f; - - /* unaligned store */ - str_unaligned_le(RA, RT0, 0, RNA, RNB); - str_unaligned_le(RB, RT0, 4, RNA, RNB); - str_unaligned_le(RC, RT0, 8, RNA, RNB); - str_unaligned_le(RD, RT0, 12, RNA, RNB); - b 2f; -.ltorg -1: -#endif - /* aligned store */ -#ifndef __ARMEL__ - rev RA, RA; - rev RB, RB; - rev RC, RC; - rev RD, RD; -#endif - /* write output block */ - stm RT0, {RA, RB, RC, RD}; -2: - pop 
{%r4-%r11, %ip, %pc}; - -.ltorg -.Lenc_not_128: - beq .Lenc_192 - - encround(8, RA, RB, RC, RD, RNA, RNB, RNC, RND, preload_first_key); - encround(9, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); - encround(10, RA, RB, RC, RD, RNA, RNB, RNC, RND, preload_first_key); - encround(11, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); - encround(12, RA, RB, RC, RD, RNA, RNB, RNC, RND, dummy); - lastencround(13, RNA, RNB, RNC, RND, RA, RB, RC, RD); - - b .Lenc_done; - -.ltorg -.Lenc_192: - encround(8, RA, RB, RC, RD, RNA, RNB, RNC, RND, preload_first_key); - encround(9, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); - encround(10, RA, RB, RC, RD, RNA, RNB, RNC, RND, dummy); - lastencround(11, RNA, RNB, RNC, RND, RA, RB, RC, RD); - - b .Lenc_done; -.size _gcry_aes_armv6_encrypt_block,.-_gcry_aes_armv6_encrypt_block; - -#define addroundkey_dec(round, ra, rb, rc, rd, rna, rnb, rnc, rnd) \ - ldr rna, [CTX, #(((round) * 16) + 0 * 4)]; \ - ldr rnb, [CTX, #(((round) * 16) + 1 * 4)]; \ - eor ra, rna; \ - ldr rnc, [CTX, #(((round) * 16) + 2 * 4)]; \ - eor rb, rnb; \ - ldr rnd, [CTX, #(((round) * 16) + 3 * 4)]; \ - eor rc, rnc; \ - preload_first_key((round) - 1, rna); \ - eor rd, rnd; - -#define do_decround(next_r, ra, rb, rc, rd, rna, rnb, rnc, rnd, preload_key) \ - ldr rnb, [CTX, #(((next_r) * 16) + 1 * 4)]; \ - \ - and RT0, RMASK, ra, lsl#3; \ - ldr rnc, [CTX, #(((next_r) * 16) + 2 * 4)]; \ - and RT1, RMASK, ra, lsr#(8 - 3); \ - ldr rnd, [CTX, #(((next_r) * 16) + 3 * 4)]; \ - and RT2, RMASK, ra, lsr#(16 - 3); \ - ldr RT0, [RTAB, RT0]; \ - and ra, RMASK, ra, lsr#(24 - 3); \ - \ - ldr RT1, [RTAB, RT1]; \ - eor rna, rna, RT0; \ - ldr RT2, [RTAB, RT2]; \ - and RT0, RMASK, rb, lsl#3; \ - ldr ra, [RTAB, ra]; \ - \ - eor rnb, rnb, RT1, ror #24; \ - and RT1, RMASK, rb, lsr#(8 - 3); \ - eor rnc, rnc, RT2, ror #16; \ - and RT2, RMASK, rb, lsr#(16 - 3); \ - eor rnd, rnd, ra, ror #8; \ - ldr RT0, [RTAB, RT0]; \ - and rb, RMASK, rb, lsr#(24 - 3); \ - \ - ldr RT1, 
[RTAB, RT1]; \ - eor rnb, rnb, RT0; \ - ldr RT2, [RTAB, RT2]; \ - and RT0, RMASK, rc, lsl#3; \ - ldr rb, [RTAB, rb]; \ - \ - eor rnc, rnc, RT1, ror #24; \ - and RT1, RMASK, rc, lsr#(8 - 3); \ - eor rnd, rnd, RT2, ror #16; \ - and RT2, RMASK, rc, lsr#(16 - 3); \ - eor rna, rna, rb, ror #8; \ - ldr RT0, [RTAB, RT0]; \ - and rc, RMASK, rc, lsr#(24 - 3); \ - \ - ldr RT1, [RTAB, RT1]; \ - eor rnc, rnc, RT0; \ - ldr RT2, [RTAB, RT2]; \ - and RT0, RMASK, rd, lsl#3; \ - ldr rc, [RTAB, rc]; \ - \ - eor rnd, rnd, RT1, ror #24; \ - and RT1, RMASK, rd, lsr#(8 - 3); \ - eor rna, rna, RT2, ror #16; \ - and RT2, RMASK, rd, lsr#(16 - 3); \ - eor rnb, rnb, rc, ror #8; \ - ldr RT0, [RTAB, RT0]; \ - and rd, RMASK, rd, lsr#(24 - 3); \ - \ - ldr RT1, [RTAB, RT1]; \ - eor rnd, rnd, RT0; \ - ldr RT2, [RTAB, RT2]; \ - eor rna, rna, RT1, ror #24; \ - ldr rd, [RTAB, rd]; \ - \ - eor rnb, rnb, RT2, ror #16; \ - preload_key((next_r) - 1, ra); \ - eor rnc, rnc, rd, ror #8; - -#define do_lastdecround(ra, rb, rc, rd, rna, rnb, rnc, rnd) \ - and RT0, RMASK, ra, lsl#3; \ - and RT1, RMASK, ra, lsr#(8 - 3); \ - and RT2, RMASK, ra, lsr#(16 - 3); \ - ldr rna, [RTAB, RT0]; \ - and ra, RMASK, ra, lsr#(24 - 3); \ - ldr rnb, [RTAB, RT1]; \ - and RT0, RMASK, rb, lsl#3; \ - ldr rnc, [RTAB, RT2]; \ - mov rnb, rnb, ror #24; \ - ldr rnd, [RTAB, ra]; \ - and RT1, RMASK, rb, lsr#(8 - 3); \ - mov rnc, rnc, ror #16; \ - and RT2, RMASK, rb, lsr#(16 - 3); \ - mov rnd, rnd, ror #8; \ - ldr RT0, [RTAB, RT0]; \ - and rb, RMASK, rb, lsr#(24 - 3); \ - ldr RT1, [RTAB, RT1]; \ - \ - orr rnb, rnb, RT0; \ - ldr RT2, [RTAB, RT2]; \ - and RT0, RMASK, rc, lsl#3; \ - ldr rb, [RTAB, rb]; \ - orr rnc, rnc, RT1, ror #24; \ - and RT1, RMASK, rc, lsr#(8 - 3); \ - orr rnd, rnd, RT2, ror #16; \ - and RT2, RMASK, rc, lsr#(16 - 3); \ - orr rna, rna, rb, ror #8; \ - ldr RT0, [RTAB, RT0]; \ - and rc, RMASK, rc, lsr#(24 - 3); \ - ldr RT1, [RTAB, RT1]; \ - \ - orr rnc, rnc, RT0; \ - ldr RT2, [RTAB, RT2]; \ - and RT0, RMASK, rd, lsl#3; \ - 
ldr rc, [RTAB, rc]; \ - orr rnd, rnd, RT1, ror #24; \ - and RT1, RMASK, rd, lsr#(8 - 3); \ - orr rna, rna, RT2, ror #16; \ - ldr RT0, [RTAB, RT0]; \ - and RT2, RMASK, rd, lsr#(16 - 3); \ - ldr RT1, [RTAB, RT1]; \ - orr rnb, rnb, rc, ror #8; \ - ldr RT2, [RTAB, RT2]; \ - and rd, RMASK, rd, lsr#(24 - 3); \ - ldr rd, [RTAB, rd]; \ - \ - orr rnd, rnd, RT0; \ - orr rna, rna, RT1, ror #24; \ - orr rnb, rnb, RT2, ror #16; \ - orr rnc, rnc, rd, ror #8; - -#define firstdecround(round, ra, rb, rc, rd, rna, rnb, rnc, rnd) \ - addroundkey_dec(((round) + 1), ra, rb, rc, rd, rna, rnb, rnc, rnd); \ - do_decround(round, ra, rb, rc, rd, rna, rnb, rnc, rnd, preload_first_key); - -#define decround(round, ra, rb, rc, rd, rna, rnb, rnc, rnd, preload_key) \ - do_decround(round, ra, rb, rc, rd, rna, rnb, rnc, rnd, preload_key); - -#define lastdecround(round, ra, rb, rc, rd, rna, rnb, rnc, rnd) \ - add RTAB, #4; \ - do_lastdecround(ra, rb, rc, rd, rna, rnb, rnc, rnd); \ - addroundkey(rna, rnb, rnc, rnd, ra, rb, rc, rd, dummy); - -.align 3 -.global _gcry_aes_armv6_decrypt_block -.type _gcry_aes_armv6_decrypt_block,%function; - -_gcry_aes_armv6_decrypt_block: - /* input: - * %r0: keysched, CTX - * %r1: dst - * %r2: src - * %r3: number of rounds.. 
10, 12 or 14 - */ - push {%r4-%r11, %ip, %lr}; - - /* read input block */ -#ifndef __ARM_FEATURE_UNALIGNED - /* test if src is unaligned */ - tst %r2, #3; - beq 1f; - - /* unaligned load */ - ldr_unaligned_le(RA, %r2, 0, RNA); - ldr_unaligned_le(RB, %r2, 4, RNB); - ldr_unaligned_le(RC, %r2, 8, RNA); - ldr_unaligned_le(RD, %r2, 12, RNB); - b 2f; -.ltorg -1: -#endif - /* aligned load */ - ldm %r2, {RA, RB, RC, RD}; -#ifndef __ARMEL__ - rev RA, RA; - rev RB, RB; - rev RC, RC; - rev RD, RD; -#endif -2: - sub %sp, #16; - - ldr RTAB, =.LtableD0; - - mov RMASK, #0xff; - str %r1, [%sp, #4]; /* dst */ - mov RMASK, RMASK, lsl#3; /* byte mask */ - - cmp %r3, #12; - bge .Ldec_256; - - firstdecround(9, RA, RB, RC, RD, RNA, RNB, RNC, RND); -.Ldec_tail: - decround(8, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); - decround(7, RA, RB, RC, RD, RNA, RNB, RNC, RND, preload_first_key); - decround(6, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); - decround(5, RA, RB, RC, RD, RNA, RNB, RNC, RND, preload_first_key); - decround(4, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); - decround(3, RA, RB, RC, RD, RNA, RNB, RNC, RND, preload_first_key); - decround(2, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); - decround(1, RA, RB, RC, RD, RNA, RNB, RNC, RND, dummy); - lastdecround(0, RNA, RNB, RNC, RND, RA, RB, RC, RD); - - ldr RT0, [%sp, #4]; /* dst */ - add %sp, #16; - - /* store output block */ -#ifndef __ARM_FEATURE_UNALIGNED - /* test if dst is unaligned */ - tst RT0, #3; - beq 1f; - - /* unaligned store */ - str_unaligned_le(RA, RT0, 0, RNA, RNB); - str_unaligned_le(RB, RT0, 4, RNA, RNB); - str_unaligned_le(RC, RT0, 8, RNA, RNB); - str_unaligned_le(RD, RT0, 12, RNA, RNB); - b 2f; -.ltorg -1: -#endif - /* aligned store */ -#ifndef __ARMEL__ - rev RA, RA; - rev RB, RB; - rev RC, RC; - rev RD, RD; -#endif - /* write output block */ - stm RT0, {RA, RB, RC, RD}; -2: - pop {%r4-%r11, %ip, %pc}; - -.ltorg -.Ldec_256: - beq .Ldec_192; - - firstdecround(13, 
RA, RB, RC, RD, RNA, RNB, RNC, RND); - decround(12, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); - decround(11, RA, RB, RC, RD, RNA, RNB, RNC, RND, preload_first_key); - decround(10, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); - decround(9, RA, RB, RC, RD, RNA, RNB, RNC, RND, preload_first_key); - - b .Ldec_tail; - -.ltorg -.Ldec_192: - firstdecround(11, RA, RB, RC, RD, RNA, RNB, RNC, RND); - decround(10, RNA, RNB, RNC, RND, RA, RB, RC, RD, preload_first_key); - decround(9, RA, RB, RC, RD, RNA, RNB, RNC, RND, preload_first_key); - - b .Ldec_tail; -.size _gcry_aes_armv6_encrypt_block,.-_gcry_aes_armv6_encrypt_block; - -.data - -/* Encryption tables */ -.align 5 -.type .LtableE0, %object -.type .LtableEs0, %object -.LtableE0: -.long 0xa56363c6 -.LtableEs0: -.long 0x00000063, 0x847c7cf8, 0x0000007c -.long 0x997777ee, 0x00000077, 0x8d7b7bf6, 0x0000007b -.long 0x0df2f2ff, 0x000000f2, 0xbd6b6bd6, 0x0000006b -.long 0xb16f6fde, 0x0000006f, 0x54c5c591, 0x000000c5 -.long 0x50303060, 0x00000030, 0x03010102, 0x00000001 -.long 0xa96767ce, 0x00000067, 0x7d2b2b56, 0x0000002b -.long 0x19fefee7, 0x000000fe, 0x62d7d7b5, 0x000000d7 -.long 0xe6abab4d, 0x000000ab, 0x9a7676ec, 0x00000076 -.long 0x45caca8f, 0x000000ca, 0x9d82821f, 0x00000082 -.long 0x40c9c989, 0x000000c9, 0x877d7dfa, 0x0000007d -.long 0x15fafaef, 0x000000fa, 0xeb5959b2, 0x00000059 -.long 0xc947478e, 0x00000047, 0x0bf0f0fb, 0x000000f0 -.long 0xecadad41, 0x000000ad, 0x67d4d4b3, 0x000000d4 -.long 0xfda2a25f, 0x000000a2, 0xeaafaf45, 0x000000af -.long 0xbf9c9c23, 0x0000009c, 0xf7a4a453, 0x000000a4 -.long 0x967272e4, 0x00000072, 0x5bc0c09b, 0x000000c0 -.long 0xc2b7b775, 0x000000b7, 0x1cfdfde1, 0x000000fd -.long 0xae93933d, 0x00000093, 0x6a26264c, 0x00000026 -.long 0x5a36366c, 0x00000036, 0x413f3f7e, 0x0000003f -.long 0x02f7f7f5, 0x000000f7, 0x4fcccc83, 0x000000cc -.long 0x5c343468, 0x00000034, 0xf4a5a551, 0x000000a5 -.long 0x34e5e5d1, 0x000000e5, 0x08f1f1f9, 0x000000f1 -.long 0x937171e2, 0x00000071, 
0x73d8d8ab, 0x000000d8 -.long 0x53313162, 0x00000031, 0x3f15152a, 0x00000015 -.long 0x0c040408, 0x00000004, 0x52c7c795, 0x000000c7 -.long 0x65232346, 0x00000023, 0x5ec3c39d, 0x000000c3 -.long 0x28181830, 0x00000018, 0xa1969637, 0x00000096 -.long 0x0f05050a, 0x00000005, 0xb59a9a2f, 0x0000009a -.long 0x0907070e, 0x00000007, 0x36121224, 0x00000012 -.long 0x9b80801b, 0x00000080, 0x3de2e2df, 0x000000e2 -.long 0x26ebebcd, 0x000000eb, 0x6927274e, 0x00000027 -.long 0xcdb2b27f, 0x000000b2, 0x9f7575ea, 0x00000075 -.long 0x1b090912, 0x00000009, 0x9e83831d, 0x00000083 -.long 0x742c2c58, 0x0000002c, 0x2e1a1a34, 0x0000001a -.long 0x2d1b1b36, 0x0000001b, 0xb26e6edc, 0x0000006e -.long 0xee5a5ab4, 0x0000005a, 0xfba0a05b, 0x000000a0 -.long 0xf65252a4, 0x00000052, 0x4d3b3b76, 0x0000003b -.long 0x61d6d6b7, 0x000000d6, 0xceb3b37d, 0x000000b3 -.long 0x7b292952, 0x00000029, 0x3ee3e3dd, 0x000000e3 -.long 0x712f2f5e, 0x0000002f, 0x97848413, 0x00000084 -.long 0xf55353a6, 0x00000053, 0x68d1d1b9, 0x000000d1 -.long 0x00000000, 0x00000000, 0x2cededc1, 0x000000ed -.long 0x60202040, 0x00000020, 0x1ffcfce3, 0x000000fc -.long 0xc8b1b179, 0x000000b1, 0xed5b5bb6, 0x0000005b -.long 0xbe6a6ad4, 0x0000006a, 0x46cbcb8d, 0x000000cb -.long 0xd9bebe67, 0x000000be, 0x4b393972, 0x00000039 -.long 0xde4a4a94, 0x0000004a, 0xd44c4c98, 0x0000004c -.long 0xe85858b0, 0x00000058, 0x4acfcf85, 0x000000cf -.long 0x6bd0d0bb, 0x000000d0, 0x2aefefc5, 0x000000ef -.long 0xe5aaaa4f, 0x000000aa, 0x16fbfbed, 0x000000fb -.long 0xc5434386, 0x00000043, 0xd74d4d9a, 0x0000004d -.long 0x55333366, 0x00000033, 0x94858511, 0x00000085 -.long 0xcf45458a, 0x00000045, 0x10f9f9e9, 0x000000f9 -.long 0x06020204, 0x00000002, 0x817f7ffe, 0x0000007f -.long 0xf05050a0, 0x00000050, 0x443c3c78, 0x0000003c -.long 0xba9f9f25, 0x0000009f, 0xe3a8a84b, 0x000000a8 -.long 0xf35151a2, 0x00000051, 0xfea3a35d, 0x000000a3 -.long 0xc0404080, 0x00000040, 0x8a8f8f05, 0x0000008f -.long 0xad92923f, 0x00000092, 0xbc9d9d21, 0x0000009d -.long 0x48383870, 0x00000038, 
0x04f5f5f1, 0x000000f5 -.long 0xdfbcbc63, 0x000000bc, 0xc1b6b677, 0x000000b6 -.long 0x75dadaaf, 0x000000da, 0x63212142, 0x00000021 -.long 0x30101020, 0x00000010, 0x1affffe5, 0x000000ff -.long 0x0ef3f3fd, 0x000000f3, 0x6dd2d2bf, 0x000000d2 -.long 0x4ccdcd81, 0x000000cd, 0x140c0c18, 0x0000000c -.long 0x35131326, 0x00000013, 0x2fececc3, 0x000000ec -.long 0xe15f5fbe, 0x0000005f, 0xa2979735, 0x00000097 -.long 0xcc444488, 0x00000044, 0x3917172e, 0x00000017 -.long 0x57c4c493, 0x000000c4, 0xf2a7a755, 0x000000a7 -.long 0x827e7efc, 0x0000007e, 0x473d3d7a, 0x0000003d -.long 0xac6464c8, 0x00000064, 0xe75d5dba, 0x0000005d -.long 0x2b191932, 0x00000019, 0x957373e6, 0x00000073 -.long 0xa06060c0, 0x00000060, 0x98818119, 0x00000081 -.long 0xd14f4f9e, 0x0000004f, 0x7fdcdca3, 0x000000dc -.long 0x66222244, 0x00000022, 0x7e2a2a54, 0x0000002a -.long 0xab90903b, 0x00000090, 0x8388880b, 0x00000088 -.long 0xca46468c, 0x00000046, 0x29eeeec7, 0x000000ee -.long 0xd3b8b86b, 0x000000b8, 0x3c141428, 0x00000014 -.long 0x79dedea7, 0x000000de, 0xe25e5ebc, 0x0000005e -.long 0x1d0b0b16, 0x0000000b, 0x76dbdbad, 0x000000db -.long 0x3be0e0db, 0x000000e0, 0x56323264, 0x00000032 -.long 0x4e3a3a74, 0x0000003a, 0x1e0a0a14, 0x0000000a -.long 0xdb494992, 0x00000049, 0x0a06060c, 0x00000006 -.long 0x6c242448, 0x00000024, 0xe45c5cb8, 0x0000005c -.long 0x5dc2c29f, 0x000000c2, 0x6ed3d3bd, 0x000000d3 -.long 0xefacac43, 0x000000ac, 0xa66262c4, 0x00000062 -.long 0xa8919139, 0x00000091, 0xa4959531, 0x00000095 -.long 0x37e4e4d3, 0x000000e4, 0x8b7979f2, 0x00000079 -.long 0x32e7e7d5, 0x000000e7, 0x43c8c88b, 0x000000c8 -.long 0x5937376e, 0x00000037, 0xb76d6dda, 0x0000006d -.long 0x8c8d8d01, 0x0000008d, 0x64d5d5b1, 0x000000d5 -.long 0xd24e4e9c, 0x0000004e, 0xe0a9a949, 0x000000a9 -.long 0xb46c6cd8, 0x0000006c, 0xfa5656ac, 0x00000056 -.long 0x07f4f4f3, 0x000000f4, 0x25eaeacf, 0x000000ea -.long 0xaf6565ca, 0x00000065, 0x8e7a7af4, 0x0000007a -.long 0xe9aeae47, 0x000000ae, 0x18080810, 0x00000008 -.long 0xd5baba6f, 0x000000ba, 
0x887878f0, 0x00000078 -.long 0x6f25254a, 0x00000025, 0x722e2e5c, 0x0000002e -.long 0x241c1c38, 0x0000001c, 0xf1a6a657, 0x000000a6 -.long 0xc7b4b473, 0x000000b4, 0x51c6c697, 0x000000c6 -.long 0x23e8e8cb, 0x000000e8, 0x7cdddda1, 0x000000dd -.long 0x9c7474e8, 0x00000074, 0x211f1f3e, 0x0000001f -.long 0xdd4b4b96, 0x0000004b, 0xdcbdbd61, 0x000000bd -.long 0x868b8b0d, 0x0000008b, 0x858a8a0f, 0x0000008a -.long 0x907070e0, 0x00000070, 0x423e3e7c, 0x0000003e -.long 0xc4b5b571, 0x000000b5, 0xaa6666cc, 0x00000066 -.long 0xd8484890, 0x00000048, 0x05030306, 0x00000003 -.long 0x01f6f6f7, 0x000000f6, 0x120e0e1c, 0x0000000e -.long 0xa36161c2, 0x00000061, 0x5f35356a, 0x00000035 -.long 0xf95757ae, 0x00000057, 0xd0b9b969, 0x000000b9 -.long 0x91868617, 0x00000086, 0x58c1c199, 0x000000c1 -.long 0x271d1d3a, 0x0000001d, 0xb99e9e27, 0x0000009e -.long 0x38e1e1d9, 0x000000e1, 0x13f8f8eb, 0x000000f8 -.long 0xb398982b, 0x00000098, 0x33111122, 0x00000011 -.long 0xbb6969d2, 0x00000069, 0x70d9d9a9, 0x000000d9 -.long 0x898e8e07, 0x0000008e, 0xa7949433, 0x00000094 -.long 0xb69b9b2d, 0x0000009b, 0x221e1e3c, 0x0000001e -.long 0x92878715, 0x00000087, 0x20e9e9c9, 0x000000e9 -.long 0x49cece87, 0x000000ce, 0xff5555aa, 0x00000055 -.long 0x78282850, 0x00000028, 0x7adfdfa5, 0x000000df -.long 0x8f8c8c03, 0x0000008c, 0xf8a1a159, 0x000000a1 -.long 0x80898909, 0x00000089, 0x170d0d1a, 0x0000000d -.long 0xdabfbf65, 0x000000bf, 0x31e6e6d7, 0x000000e6 -.long 0xc6424284, 0x00000042, 0xb86868d0, 0x00000068 -.long 0xc3414182, 0x00000041, 0xb0999929, 0x00000099 -.long 0x772d2d5a, 0x0000002d, 0x110f0f1e, 0x0000000f -.long 0xcbb0b07b, 0x000000b0, 0xfc5454a8, 0x00000054 -.long 0xd6bbbb6d, 0x000000bb, 0x3a16162c, 0x00000016 - -/* Decryption tables */ -.align 5 -.type .LtableD0, %object -.type .LtableDs0, %object -.LtableD0: -.long 0x50a7f451 -.LtableDs0: -.long 0x00000052, 0x5365417e, 0x00000009 -.long 0xc3a4171a, 0x0000006a, 0x965e273a, 0x000000d5 -.long 0xcb6bab3b, 0x00000030, 0xf1459d1f, 0x00000036 -.long 0xab58faac, 
0x000000a5, 0x9303e34b, 0x00000038 -.long 0x55fa3020, 0x000000bf, 0xf66d76ad, 0x00000040 -.long 0x9176cc88, 0x000000a3, 0x254c02f5, 0x0000009e -.long 0xfcd7e54f, 0x00000081, 0xd7cb2ac5, 0x000000f3 -.long 0x80443526, 0x000000d7, 0x8fa362b5, 0x000000fb -.long 0x495ab1de, 0x0000007c, 0x671bba25, 0x000000e3 -.long 0x980eea45, 0x00000039, 0xe1c0fe5d, 0x00000082 -.long 0x02752fc3, 0x0000009b, 0x12f04c81, 0x0000002f -.long 0xa397468d, 0x000000ff, 0xc6f9d36b, 0x00000087 -.long 0xe75f8f03, 0x00000034, 0x959c9215, 0x0000008e -.long 0xeb7a6dbf, 0x00000043, 0xda595295, 0x00000044 -.long 0x2d83bed4, 0x000000c4, 0xd3217458, 0x000000de -.long 0x2969e049, 0x000000e9, 0x44c8c98e, 0x000000cb -.long 0x6a89c275, 0x00000054, 0x78798ef4, 0x0000007b -.long 0x6b3e5899, 0x00000094, 0xdd71b927, 0x00000032 -.long 0xb64fe1be, 0x000000a6, 0x17ad88f0, 0x000000c2 -.long 0x66ac20c9, 0x00000023, 0xb43ace7d, 0x0000003d -.long 0x184adf63, 0x000000ee, 0x82311ae5, 0x0000004c -.long 0x60335197, 0x00000095, 0x457f5362, 0x0000000b -.long 0xe07764b1, 0x00000042, 0x84ae6bbb, 0x000000fa -.long 0x1ca081fe, 0x000000c3, 0x942b08f9, 0x0000004e -.long 0x58684870, 0x00000008, 0x19fd458f, 0x0000002e -.long 0x876cde94, 0x000000a1, 0xb7f87b52, 0x00000066 -.long 0x23d373ab, 0x00000028, 0xe2024b72, 0x000000d9 -.long 0x578f1fe3, 0x00000024, 0x2aab5566, 0x000000b2 -.long 0x0728ebb2, 0x00000076, 0x03c2b52f, 0x0000005b -.long 0x9a7bc586, 0x000000a2, 0xa50837d3, 0x00000049 -.long 0xf2872830, 0x0000006d, 0xb2a5bf23, 0x0000008b -.long 0xba6a0302, 0x000000d1, 0x5c8216ed, 0x00000025 -.long 0x2b1ccf8a, 0x00000072, 0x92b479a7, 0x000000f8 -.long 0xf0f207f3, 0x000000f6, 0xa1e2694e, 0x00000064 -.long 0xcdf4da65, 0x00000086, 0xd5be0506, 0x00000068 -.long 0x1f6234d1, 0x00000098, 0x8afea6c4, 0x00000016 -.long 0x9d532e34, 0x000000d4, 0xa055f3a2, 0x000000a4 -.long 0x32e18a05, 0x0000005c, 0x75ebf6a4, 0x000000cc -.long 0x39ec830b, 0x0000005d, 0xaaef6040, 0x00000065 -.long 0x069f715e, 0x000000b6, 0x51106ebd, 0x00000092 -.long 0xf98a213e, 
0x0000006c, 0x3d06dd96, 0x00000070 -.long 0xae053edd, 0x00000048, 0x46bde64d, 0x00000050 -.long 0xb58d5491, 0x000000fd, 0x055dc471, 0x000000ed -.long 0x6fd40604, 0x000000b9, 0xff155060, 0x000000da -.long 0x24fb9819, 0x0000005e, 0x97e9bdd6, 0x00000015 -.long 0xcc434089, 0x00000046, 0x779ed967, 0x00000057 -.long 0xbd42e8b0, 0x000000a7, 0x888b8907, 0x0000008d -.long 0x385b19e7, 0x0000009d, 0xdbeec879, 0x00000084 -.long 0x470a7ca1, 0x00000090, 0xe90f427c, 0x000000d8 -.long 0xc91e84f8, 0x000000ab, 0x00000000, 0x00000000 -.long 0x83868009, 0x0000008c, 0x48ed2b32, 0x000000bc -.long 0xac70111e, 0x000000d3, 0x4e725a6c, 0x0000000a -.long 0xfbff0efd, 0x000000f7, 0x5638850f, 0x000000e4 -.long 0x1ed5ae3d, 0x00000058, 0x27392d36, 0x00000005 -.long 0x64d90f0a, 0x000000b8, 0x21a65c68, 0x000000b3 -.long 0xd1545b9b, 0x00000045, 0x3a2e3624, 0x00000006 -.long 0xb1670a0c, 0x000000d0, 0x0fe75793, 0x0000002c -.long 0xd296eeb4, 0x0000001e, 0x9e919b1b, 0x0000008f -.long 0x4fc5c080, 0x000000ca, 0xa220dc61, 0x0000003f -.long 0x694b775a, 0x0000000f, 0x161a121c, 0x00000002 -.long 0x0aba93e2, 0x000000c1, 0xe52aa0c0, 0x000000af -.long 0x43e0223c, 0x000000bd, 0x1d171b12, 0x00000003 -.long 0x0b0d090e, 0x00000001, 0xadc78bf2, 0x00000013 -.long 0xb9a8b62d, 0x0000008a, 0xc8a91e14, 0x0000006b -.long 0x8519f157, 0x0000003a, 0x4c0775af, 0x00000091 -.long 0xbbdd99ee, 0x00000011, 0xfd607fa3, 0x00000041 -.long 0x9f2601f7, 0x0000004f, 0xbcf5725c, 0x00000067 -.long 0xc53b6644, 0x000000dc, 0x347efb5b, 0x000000ea -.long 0x7629438b, 0x00000097, 0xdcc623cb, 0x000000f2 -.long 0x68fcedb6, 0x000000cf, 0x63f1e4b8, 0x000000ce -.long 0xcadc31d7, 0x000000f0, 0x10856342, 0x000000b4 -.long 0x40229713, 0x000000e6, 0x2011c684, 0x00000073 -.long 0x7d244a85, 0x00000096, 0xf83dbbd2, 0x000000ac -.long 0x1132f9ae, 0x00000074, 0x6da129c7, 0x00000022 -.long 0x4b2f9e1d, 0x000000e7, 0xf330b2dc, 0x000000ad -.long 0xec52860d, 0x00000035, 0xd0e3c177, 0x00000085 -.long 0x6c16b32b, 0x000000e2, 0x99b970a9, 0x000000f9 -.long 0xfa489411, 
0x00000037, 0x2264e947, 0x000000e8 -.long 0xc48cfca8, 0x0000001c, 0x1a3ff0a0, 0x00000075 -.long 0xd82c7d56, 0x000000df, 0xef903322, 0x0000006e -.long 0xc74e4987, 0x00000047, 0xc1d138d9, 0x000000f1 -.long 0xfea2ca8c, 0x0000001a, 0x360bd498, 0x00000071 -.long 0xcf81f5a6, 0x0000001d, 0x28de7aa5, 0x00000029 -.long 0x268eb7da, 0x000000c5, 0xa4bfad3f, 0x00000089 -.long 0xe49d3a2c, 0x0000006f, 0x0d927850, 0x000000b7 -.long 0x9bcc5f6a, 0x00000062, 0x62467e54, 0x0000000e -.long 0xc2138df6, 0x000000aa, 0xe8b8d890, 0x00000018 -.long 0x5ef7392e, 0x000000be, 0xf5afc382, 0x0000001b -.long 0xbe805d9f, 0x000000fc, 0x7c93d069, 0x00000056 -.long 0xa92dd56f, 0x0000003e, 0xb31225cf, 0x0000004b -.long 0x3b99acc8, 0x000000c6, 0xa77d1810, 0x000000d2 -.long 0x6e639ce8, 0x00000079, 0x7bbb3bdb, 0x00000020 -.long 0x097826cd, 0x0000009a, 0xf418596e, 0x000000db -.long 0x01b79aec, 0x000000c0, 0xa89a4f83, 0x000000fe -.long 0x656e95e6, 0x00000078, 0x7ee6ffaa, 0x000000cd -.long 0x08cfbc21, 0x0000005a, 0xe6e815ef, 0x000000f4 -.long 0xd99be7ba, 0x0000001f, 0xce366f4a, 0x000000dd -.long 0xd4099fea, 0x000000a8, 0xd67cb029, 0x00000033 -.long 0xafb2a431, 0x00000088, 0x31233f2a, 0x00000007 -.long 0x3094a5c6, 0x000000c7, 0xc066a235, 0x00000031 -.long 0x37bc4e74, 0x000000b1, 0xa6ca82fc, 0x00000012 -.long 0xb0d090e0, 0x00000010, 0x15d8a733, 0x00000059 -.long 0x4a9804f1, 0x00000027, 0xf7daec41, 0x00000080 -.long 0x0e50cd7f, 0x000000ec, 0x2ff69117, 0x0000005f -.long 0x8dd64d76, 0x00000060, 0x4db0ef43, 0x00000051 -.long 0x544daacc, 0x0000007f, 0xdf0496e4, 0x000000a9 -.long 0xe3b5d19e, 0x00000019, 0x1b886a4c, 0x000000b5 -.long 0xb81f2cc1, 0x0000004a, 0x7f516546, 0x0000000d -.long 0x04ea5e9d, 0x0000002d, 0x5d358c01, 0x000000e5 -.long 0x737487fa, 0x0000007a, 0x2e410bfb, 0x0000009f -.long 0x5a1d67b3, 0x00000093, 0x52d2db92, 0x000000c9 -.long 0x335610e9, 0x0000009c, 0x1347d66d, 0x000000ef -.long 0x8c61d79a, 0x000000a0, 0x7a0ca137, 0x000000e0 -.long 0x8e14f859, 0x0000003b, 0x893c13eb, 0x0000004d -.long 0xee27a9ce, 
0x000000ae, 0x35c961b7, 0x0000002a -.long 0xede51ce1, 0x000000f5, 0x3cb1477a, 0x000000b0 -.long 0x59dfd29c, 0x000000c8, 0x3f73f255, 0x000000eb -.long 0x79ce1418, 0x000000bb, 0xbf37c773, 0x0000003c -.long 0xeacdf753, 0x00000083, 0x5baafd5f, 0x00000053 -.long 0x146f3ddf, 0x00000099, 0x86db4478, 0x00000061 -.long 0x81f3afca, 0x00000017, 0x3ec468b9, 0x0000002b -.long 0x2c342438, 0x00000004, 0x5f40a3c2, 0x0000007e -.long 0x72c31d16, 0x000000ba, 0x0c25e2bc, 0x00000077 -.long 0x8b493c28, 0x000000d6, 0x41950dff, 0x00000026 -.long 0x7101a839, 0x000000e1, 0xdeb30c08, 0x00000069 -.long 0x9ce4b4d8, 0x00000014, 0x90c15664, 0x00000063 -.long 0x6184cb7b, 0x00000055, 0x70b632d5, 0x00000021 -.long 0x745c6c48, 0x0000000c, 0x4257b8d0, 0x0000007d - -#endif /*HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS*/ -#endif /*__ARM_ARCH >= 6*/ diff --git a/cipher/rijndael.c b/cipher/rijndael.c index 85c1a41..68ab5ea 100644 --- a/cipher/rijndael.c +++ b/cipher/rijndael.c @@ -67,11 +67,11 @@ # define USE_AMD64_ASM 1 #endif -/* USE_ARMV6_ASM indicates whether to use ARMv6 assembly code. */ -#undef USE_ARMV6_ASM -#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) +/* USE_ARM_ASM indicates whether to use ARMv6 assembly code. 
*/ +#undef USE_ARM_ASM +#if defined(__ARMEL__) # ifdef HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS -# define USE_ARMV6_ASM 1 +# define USE_ARM_ASM 1 # endif #endif @@ -123,18 +123,18 @@ extern void _gcry_aes_amd64_decrypt_block(const void *keysched_dec, int rounds); #endif /*USE_AMD64_ASM*/ -#ifdef USE_ARMV6_ASM +#ifdef USE_ARM_ASM /* ARMv6 assembly implementations of AES */ -extern void _gcry_aes_armv6_encrypt_block(const void *keysched_enc, +extern void _gcry_aes_arm_encrypt_block(const void *keysched_enc, unsigned char *out, const unsigned char *in, int rounds); -extern void _gcry_aes_armv6_decrypt_block(const void *keysched_dec, +extern void _gcry_aes_arm_decrypt_block(const void *keysched_dec, unsigned char *out, const unsigned char *in, int rounds); -#endif /*USE_ARMV6_ASM*/ +#endif /*USE_ARM_ASM*/ @@ -567,8 +567,8 @@ do_encrypt_aligned (const RIJNDAEL_context *ctx, { #ifdef USE_AMD64_ASM _gcry_aes_amd64_encrypt_block(ctx->keyschenc, b, a, ctx->rounds); -#elif defined(USE_ARMV6_ASM) - _gcry_aes_armv6_encrypt_block(ctx->keyschenc, b, a, ctx->rounds); +#elif defined(USE_ARM_ASM) + _gcry_aes_arm_encrypt_block(ctx->keyschenc, b, a, ctx->rounds); #else #define rk (ctx->keyschenc) int rounds = ctx->rounds; @@ -651,7 +651,7 @@ do_encrypt_aligned (const RIJNDAEL_context *ctx, *((u32_a_t*)(b+ 8)) ^= *((u32_a_t*)rk[rounds][2]); *((u32_a_t*)(b+12)) ^= *((u32_a_t*)rk[rounds][3]); #undef rk -#endif /*!USE_AMD64_ASM && !USE_ARMV6_ASM*/ +#endif /*!USE_AMD64_ASM && !USE_ARM_ASM*/ } @@ -659,7 +659,7 @@ static void do_encrypt (const RIJNDAEL_context *ctx, unsigned char *bx, const unsigned char *ax) { -#if !defined(USE_AMD64_ASM) && !defined(USE_ARMV6_ASM) +#if !defined(USE_AMD64_ASM) && !defined(USE_ARM_ASM) /* BX and AX are not necessary correctly aligned. Thus we might need to copy them here. We try to align to a 16 bytes. 
 */
   if (((size_t)ax & 0x0f) || ((size_t)bx & 0x0f))
@@ -680,7 +680,7 @@ do_encrypt (const RIJNDAEL_context *ctx,
       memcpy (bx, b.b, 16);
     }
   else
-#endif /*!USE_AMD64_ASM && !USE_ARMV6_ASM*/
+#endif /*!USE_AMD64_ASM && !USE_ARM_ASM*/
     {
       do_encrypt_aligned (ctx, bx, ax);
     }
@@ -1694,8 +1694,8 @@ do_decrypt_aligned (RIJNDAEL_context *ctx,
 {
 #ifdef USE_AMD64_ASM
   _gcry_aes_amd64_decrypt_block(ctx->keyschdec, b, a, ctx->rounds);
-#elif defined(USE_ARMV6_ASM)
-  _gcry_aes_armv6_decrypt_block(ctx->keyschdec, b, a, ctx->rounds);
+#elif defined(USE_ARM_ASM)
+  _gcry_aes_arm_decrypt_block(ctx->keyschdec, b, a, ctx->rounds);
 #else
 #define rk (ctx->keyschdec)
   int rounds = ctx->rounds;
@@ -1779,7 +1779,7 @@ do_decrypt_aligned (RIJNDAEL_context *ctx,
   *((u32_a_t*)(b+ 8)) ^= *((u32_a_t*)rk[0][2]);
   *((u32_a_t*)(b+12)) ^= *((u32_a_t*)rk[0][3]);
 #undef rk
-#endif /*!USE_AMD64_ASM && !USE_ARMV6_ASM*/
+#endif /*!USE_AMD64_ASM && !USE_ARM_ASM*/
 }
@@ -1794,7 +1794,7 @@ do_decrypt (RIJNDAEL_context *ctx, byte *bx, const byte *ax)
       ctx->decryption_prepared = 1;
     }
-#if !defined(USE_AMD64_ASM) && !defined(USE_ARMV6_ASM)
+#if !defined(USE_AMD64_ASM) && !defined(USE_ARM_ASM)
   /* BX and AX are not necessary correctly aligned.  Thus we might need
      to copy them here.  We try to align to a 16 bytes. */
   if (((size_t)ax & 0x0f) || ((size_t)bx & 0x0f))
@@ -1815,7 +1815,7 @@ do_decrypt (RIJNDAEL_context *ctx, byte *bx, const byte *ax)
       memcpy (bx, b.b, 16);
     }
   else
-#endif /*!USE_AMD64_ASM && !USE_ARMV6_ASM*/
+#endif /*!USE_AMD64_ASM && !USE_ARM_ASM*/
     {
       do_decrypt_aligned (ctx, bx, ax);
     }
diff --git a/cipher/twofish-arm.S b/cipher/twofish-arm.S
new file mode 100644
index 0000000..9565c4a
--- /dev/null
+++ b/cipher/twofish-arm.S
@@ -0,0 +1,365 @@
+/* twofish-arm.S - ARM assembly implementation of Twofish cipher
+ *
+ * Copyright © 2013 Jussi Kivilinna
+ *
+ * This file is part of Libgcrypt.
+ *
+ * Libgcrypt is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as
+ * published by the Free Software Foundation; either version 2.1 of
+ * the License, or (at your option) any later version.
+ *
+ * Libgcrypt is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <config.h>
+
+#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__)
+#ifdef HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS
+
+.text
+
+.syntax unified
+.arm
+
+/* structure of TWOFISH_context: */
+#define s0 0
+#define s1 ((s0) + 4 * 256)
+#define s2 ((s1) + 4 * 256)
+#define s3 ((s2) + 4 * 256)
+#define w ((s3) + 4 * 256)
+#define k ((w) + 4 * 8)
+
+/* register macros */
+#define CTX %r0
+#define CTXs0 %r0
+#define CTXs1 %r1
+#define CTXs3 %r7
+
+#define RA %r3
+#define RB %r4
+#define RC %r5
+#define RD %r6
+
+#define RX %r2
+#define RY %ip
+
+#define RMASK %lr
+
+#define RT0 %r8
+#define RT1 %r9
+#define RT2 %r10
+#define RT3 %r11
+
+/* helper macros */
+#define ldr_unaligned_le(rout, rsrc, offs, rtmp) \
+    ldrb rout, [rsrc, #((offs) + 0)]; \
+    ldrb rtmp, [rsrc, #((offs) + 1)]; \
+    orr rout, rout, rtmp, lsl #8; \
+    ldrb rtmp, [rsrc, #((offs) + 2)]; \
+    orr rout, rout, rtmp, lsl #16; \
+    ldrb rtmp, [rsrc, #((offs) + 3)]; \
+    orr rout, rout, rtmp, lsl #24;
+
+#define str_unaligned_le(rin, rdst, offs, rtmp0, rtmp1) \
+    mov rtmp0, rin, lsr #8; \
+    strb rin, [rdst, #((offs) + 0)]; \
+    mov rtmp1, rin, lsr #16; \
+    strb rtmp0, [rdst, #((offs) + 1)]; \
+    mov rtmp0, rin, lsr #24; \
+    strb rtmp1, [rdst, #((offs) + 2)]; \
+    strb rtmp0, [rdst, #((offs) + 3)];
+
+#ifndef __ARMEL__
+    /* bswap on big-endian */
+    #define host_to_le(reg) \
+        rev reg,
reg; + #define le_to_host(reg) \ + rev reg, reg; +#else + /* nop on little-endian */ + #define host_to_le(reg) /*_*/ + #define le_to_host(reg) /*_*/ +#endif + +#define ldr_input_aligned_le(rin, a, b, c, d) \ + ldr a, [rin, #0]; \ + ldr b, [rin, #4]; \ + le_to_host(a); \ + ldr c, [rin, #8]; \ + le_to_host(b); \ + ldr d, [rin, #12]; \ + le_to_host(c); \ + le_to_host(d); + +#define str_output_aligned_le(rout, a, b, c, d) \ + le_to_host(a); \ + le_to_host(b); \ + str a, [rout, #0]; \ + le_to_host(c); \ + str b, [rout, #4]; \ + le_to_host(d); \ + str c, [rout, #8]; \ + str d, [rout, #12]; + +#ifdef __ARM_FEATURE_UNALIGNED + /* unaligned word reads/writes allowed */ + #define ldr_input_le(rin, ra, rb, rc, rd, rtmp) \ + ldr_input_aligned_le(rin, ra, rb, rc, rd) + + #define str_output_le(rout, ra, rb, rc, rd, rtmp0, rtmp1) \ + str_output_aligned_le(rout, ra, rb, rc, rd) +#else + /* need to handle unaligned reads/writes by byte reads */ + #define ldr_input_le(rin, ra, rb, rc, rd, rtmp0) \ + tst rin, #3; \ + beq 1f; \ + ldr_unaligned_le(ra, rin, 0, rtmp0); \ + ldr_unaligned_le(rb, rin, 4, rtmp0); \ + ldr_unaligned_le(rc, rin, 8, rtmp0); \ + ldr_unaligned_le(rd, rin, 12, rtmp0); \ + b 2f; \ + 1:;\ + ldr_input_aligned_le(rin, ra, rb, rc, rd); \ + 2:; + + #define str_output_le(rout, ra, rb, rc, rd, rtmp0, rtmp1) \ + tst rout, #3; \ + beq 1f; \ + str_unaligned_le(ra, rout, 0, rtmp0, rtmp1); \ + str_unaligned_le(rb, rout, 4, rtmp0, rtmp1); \ + str_unaligned_le(rc, rout, 8, rtmp0, rtmp1); \ + str_unaligned_le(rd, rout, 12, rtmp0, rtmp1); \ + b 2f; \ + 1:;\ + str_output_aligned_le(rout, ra, rb, rc, rd); \ + 2:; +#endif + +/********************************************************************** + 1-way twofish + **********************************************************************/ +#define encrypt_round(a, b, rc, rd, n, ror_a, adj_a) \ + and RT0, RMASK, b, lsr#(8 - 2); \ + and RY, RMASK, b, lsr#(16 - 2); \ + add RT0, RT0, #(s2 - s1); \ + and RT1, RMASK, b, lsr#(24 - 2); \ + ldr RY, 
[CTXs3, RY]; \ + and RT2, RMASK, b, lsl#(2); \ + ldr RT0, [CTXs1, RT0]; \ + and RT3, RMASK, a, lsr#(16 - 2 + (adj_a)); \ + ldr RT1, [CTXs0, RT1]; \ + and RX, RMASK, a, lsr#(8 - 2 + (adj_a)); \ + ldr RT2, [CTXs1, RT2]; \ + add RT3, RT3, #(s2 - s1); \ + ldr RX, [CTXs1, RX]; \ + ror_a(a); \ + \ + eor RY, RY, RT0; \ + ldr RT3, [CTXs1, RT3]; \ + and RT0, RMASK, a, lsl#(2); \ + eor RY, RY, RT1; \ + and RT1, RMASK, a, lsr#(24 - 2); \ + eor RY, RY, RT2; \ + ldr RT0, [CTXs0, RT0]; \ + eor RX, RX, RT3; \ + ldr RT1, [CTXs3, RT1]; \ + eor RX, RX, RT0; \ + \ + ldr RT3, [CTXs3, #(k - s3 + 8 * (n) + 4)]; \ + eor RX, RX, RT1; \ + ldr RT2, [CTXs3, #(k - s3 + 8 * (n))]; \ + \ + add RT0, RX, RY, lsl #1; \ + add RX, RX, RY; \ + add RT0, RT0, RT3; \ + add RX, RX, RT2; \ + eor rd, RT0, rd, ror #31; \ + eor rc, rc, RX; + +#define dummy(x) /*_*/ + +#define ror1(r) \ + ror r, r, #1; + +#define decrypt_round(a, b, rc, rd, n, ror_b, adj_b) \ + and RT3, RMASK, b, lsl#(2 - (adj_b)); \ + and RT1, RMASK, b, lsr#(8 - 2 + (adj_b)); \ + ror_b(b); \ + and RT2, RMASK, a, lsl#(2); \ + and RT0, RMASK, a, lsr#(8 - 2); \ + \ + ldr RY, [CTXs1, RT3]; \ + add RT1, RT1, #(s2 - s1); \ + ldr RX, [CTXs0, RT2]; \ + and RT3, RMASK, b, lsr#(16 - 2); \ + ldr RT1, [CTXs1, RT1]; \ + and RT2, RMASK, a, lsr#(16 - 2); \ + ldr RT0, [CTXs1, RT0]; \ + \ + add RT2, RT2, #(s2 - s1); \ + ldr RT3, [CTXs3, RT3]; \ + eor RY, RY, RT1; \ + \ + and RT1, RMASK, b, lsr#(24 - 2); \ + eor RX, RX, RT0; \ + ldr RT2, [CTXs1, RT2]; \ + and RT0, RMASK, a, lsr#(24 - 2); \ + \ + ldr RT1, [CTXs0, RT1]; \ + \ + eor RY, RY, RT3; \ + ldr RT0, [CTXs3, RT0]; \ + eor RX, RX, RT2; \ + eor RY, RY, RT1; \ + \ + ldr RT1, [CTXs3, #(k - s3 + 8 * (n) + 4)]; \ + eor RX, RX, RT0; \ + ldr RT2, [CTXs3, #(k - s3 + 8 * (n))]; \ + \ + add RT0, RX, RY, lsl #1; \ + add RX, RX, RY; \ + add RT0, RT0, RT1; \ + add RX, RX, RT2; \ + eor rd, rd, RT0; \ + eor rc, RX, rc, ror #31; + +#define first_encrypt_cycle(nc) \ + encrypt_round(RA, RB, RC, RD, (nc) * 2, dummy, 0); \ + 
encrypt_round(RC, RD, RA, RB, (nc) * 2 + 1, ror1, 1); + +#define encrypt_cycle(nc) \ + encrypt_round(RA, RB, RC, RD, (nc) * 2, ror1, 1); \ + encrypt_round(RC, RD, RA, RB, (nc) * 2 + 1, ror1, 1); + +#define last_encrypt_cycle(nc) \ + encrypt_round(RA, RB, RC, RD, (nc) * 2, ror1, 1); \ + encrypt_round(RC, RD, RA, RB, (nc) * 2 + 1, ror1, 1); \ + ror1(RA); + +#define first_decrypt_cycle(nc) \ + decrypt_round(RC, RD, RA, RB, (nc) * 2 + 1, dummy, 0); \ + decrypt_round(RA, RB, RC, RD, (nc) * 2, ror1, 1); + +#define decrypt_cycle(nc) \ + decrypt_round(RC, RD, RA, RB, (nc) * 2 + 1, ror1, 1); \ + decrypt_round(RA, RB, RC, RD, (nc) * 2, ror1, 1); + +#define last_decrypt_cycle(nc) \ + decrypt_round(RC, RD, RA, RB, (nc) * 2 + 1, ror1, 1); \ + decrypt_round(RA, RB, RC, RD, (nc) * 2, ror1, 1); \ + ror1(RD); + +.align 3 +.global _gcry_twofish_arm_encrypt_block +.type _gcry_twofish_arm_encrypt_block,%function; + +_gcry_twofish_arm_encrypt_block: + /* input: + * %r0: ctx + * %r1: dst + * %r2: src + */ + push {%r1, %r4-%r11, %ip, %lr}; + + add RY, CTXs0, #w; + + ldr_input_le(%r2, RA, RB, RC, RD, RT0); + + /* Input whitening */ + ldm RY, {RT0, RT1, RT2, RT3}; + add CTXs3, CTXs0, #(s3 - s0); + add CTXs1, CTXs0, #(s1 - s0); + mov RMASK, #(0xff << 2); + eor RA, RA, RT0; + eor RB, RB, RT1; + eor RC, RC, RT2; + eor RD, RD, RT3; + + first_encrypt_cycle(0); + encrypt_cycle(1); + encrypt_cycle(2); + encrypt_cycle(3); + encrypt_cycle(4); + encrypt_cycle(5); + encrypt_cycle(6); + last_encrypt_cycle(7); + + add RY, CTXs3, #(w + 4*4 - s3); + pop {%r1}; /* dst */ + + /* Output whitening */ + ldm RY, {RT0, RT1, RT2, RT3}; + eor RC, RC, RT0; + eor RD, RD, RT1; + eor RA, RA, RT2; + eor RB, RB, RT3; + + str_output_le(%r1, RC, RD, RA, RB, RT0, RT1); + + pop {%r4-%r11, %ip, %lr}; + bx %lr; +.ltorg +.size _gcry_twofish_arm_encrypt_block,.-_gcry_twofish_arm_encrypt_block; + +.align 3 +.global _gcry_twofish_arm_decrypt_block +.type _gcry_twofish_arm_decrypt_block,%function; + 
+_gcry_twofish_arm_decrypt_block: + /* input: + * %r0: ctx + * %r1: dst + * %r2: src + */ + push {%r1, %r4-%r11, %ip, %lr}; + + add CTXs3, CTXs0, #(s3 - s0); + + ldr_input_le(%r2, RC, RD, RA, RB, RT0); + + add RY, CTXs3, #(w + 4*4 - s3); + add CTXs3, CTXs0, #(s3 - s0); + + /* Input whitening */ + ldm RY, {RT0, RT1, RT2, RT3}; + add CTXs1, CTXs0, #(s1 - s0); + mov RMASK, #(0xff << 2); + eor RC, RC, RT0; + eor RD, RD, RT1; + eor RA, RA, RT2; + eor RB, RB, RT3; + + first_decrypt_cycle(7); + decrypt_cycle(6); + decrypt_cycle(5); + decrypt_cycle(4); + decrypt_cycle(3); + decrypt_cycle(2); + decrypt_cycle(1); + last_decrypt_cycle(0); + + add RY, CTXs0, #w; + pop {%r1}; /* dst */ + + /* Output whitening */ + ldm RY, {RT0, RT1, RT2, RT3}; + eor RA, RA, RT0; + eor RB, RB, RT1; + eor RC, RC, RT2; + eor RD, RD, RT3; + + str_output_le(%r1, RA, RB, RC, RD, RT0, RT1); + + pop {%r4-%r11, %ip, %lr}; + bx %lr; +.size _gcry_twofish_arm_decrypt_block,.-_gcry_twofish_arm_decrypt_block; + +#endif /*HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS*/ +#endif /*__ARM_ARCH >= 6*/ diff --git a/cipher/twofish-armv6.S b/cipher/twofish-armv6.S deleted file mode 100644 index b76ab37..0000000 --- a/cipher/twofish-armv6.S +++ /dev/null @@ -1,365 +0,0 @@ -/* twofish-armv6.S - ARM assembly implementation of Twofish cipher - * - * Copyright © 2013 Jussi Kivilinna - * - * This file is part of Libgcrypt. - * - * Libgcrypt is free software; you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation; either version 2.1 of - * the License, or (at your option) any later version. - * - * Libgcrypt is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. 
- * - * You should have received a copy of the GNU Lesser General Public - * License along with this program; if not, see <http://www.gnu.org/licenses/>. - */ - -#include <config.h> - -#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) -#ifdef HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS - -.text - -.syntax unified -.arm - -/* structure of TWOFISH_context: */ -#define s0 0 -#define s1 ((s0) + 4 * 256) -#define s2 ((s1) + 4 * 256) -#define s3 ((s2) + 4 * 256) -#define w ((s3) + 4 * 256) -#define k ((w) + 4 * 8) - -/* register macros */ -#define CTX %r0 -#define CTXs0 %r0 -#define CTXs1 %r1 -#define CTXs3 %r7 - -#define RA %r3 -#define RB %r4 -#define RC %r5 -#define RD %r6 - -#define RX %r2 -#define RY %ip - -#define RMASK %lr - -#define RT0 %r8 -#define RT1 %r9 -#define RT2 %r10 -#define RT3 %r11 - -/* helper macros */ -#define ldr_unaligned_le(rout, rsrc, offs, rtmp) \ - ldrb rout, [rsrc, #((offs) + 0)]; \ - ldrb rtmp, [rsrc, #((offs) + 1)]; \ - orr rout, rout, rtmp, lsl #8; \ - ldrb rtmp, [rsrc, #((offs) + 2)]; \ - orr rout, rout, rtmp, lsl #16; \ - ldrb rtmp, [rsrc, #((offs) + 3)]; \ - orr rout, rout, rtmp, lsl #24; - -#define str_unaligned_le(rin, rdst, offs, rtmp0, rtmp1) \ - mov rtmp0, rin, lsr #8; \ - strb rin, [rdst, #((offs) + 0)]; \ - mov rtmp1, rin, lsr #16; \ - strb rtmp0, [rdst, #((offs) + 1)]; \ - mov rtmp0, rin, lsr #24; \ - strb rtmp1, [rdst, #((offs) + 2)]; \ - strb rtmp0, [rdst, #((offs) + 3)]; - -#ifndef __ARMEL__ - /* bswap on big-endian */ - #define host_to_le(reg) \ - rev reg, reg; - #define le_to_host(reg) \ - rev reg, reg; -#else - /* nop on little-endian */ - #define host_to_le(reg) /*_*/ - #define le_to_host(reg) /*_*/ -#endif - -#define ldr_input_aligned_le(rin, a, b, c, d) \ - ldr a, [rin, #0]; \ - ldr b, [rin, #4]; \ - le_to_host(a); \ - ldr c, [rin, #8]; \ - le_to_host(b); \ - ldr d, [rin, #12]; \ - le_to_host(c); \ - le_to_host(d); - -#define str_output_aligned_le(rout, a, b, c, d) \ - le_to_host(a); \ - le_to_host(b); \ - str a, [rout, #0]; \ - le_to_host(c); \ - str b, [rout, #4]; 
\ - le_to_host(d); \ - str c, [rout, #8]; \ - str d, [rout, #12]; - -#ifdef __ARM_FEATURE_UNALIGNED - /* unaligned word reads/writes allowed */ - #define ldr_input_le(rin, ra, rb, rc, rd, rtmp) \ - ldr_input_aligned_le(rin, ra, rb, rc, rd) - - #define str_output_le(rout, ra, rb, rc, rd, rtmp0, rtmp1) \ - str_output_aligned_le(rout, ra, rb, rc, rd) -#else - /* need to handle unaligned reads/writes by byte reads */ - #define ldr_input_le(rin, ra, rb, rc, rd, rtmp0) \ - tst rin, #3; \ - beq 1f; \ - ldr_unaligned_le(ra, rin, 0, rtmp0); \ - ldr_unaligned_le(rb, rin, 4, rtmp0); \ - ldr_unaligned_le(rc, rin, 8, rtmp0); \ - ldr_unaligned_le(rd, rin, 12, rtmp0); \ - b 2f; \ - 1:;\ - ldr_input_aligned_le(rin, ra, rb, rc, rd); \ - 2:; - - #define str_output_le(rout, ra, rb, rc, rd, rtmp0, rtmp1) \ - tst rout, #3; \ - beq 1f; \ - str_unaligned_le(ra, rout, 0, rtmp0, rtmp1); \ - str_unaligned_le(rb, rout, 4, rtmp0, rtmp1); \ - str_unaligned_le(rc, rout, 8, rtmp0, rtmp1); \ - str_unaligned_le(rd, rout, 12, rtmp0, rtmp1); \ - b 2f; \ - 1:;\ - str_output_aligned_le(rout, ra, rb, rc, rd); \ - 2:; -#endif - -/********************************************************************** - 1-way twofish - **********************************************************************/ -#define encrypt_round(a, b, rc, rd, n, ror_a, adj_a) \ - and RT0, RMASK, b, lsr#(8 - 2); \ - and RY, RMASK, b, lsr#(16 - 2); \ - add RT0, RT0, #(s2 - s1); \ - and RT1, RMASK, b, lsr#(24 - 2); \ - ldr RY, [CTXs3, RY]; \ - and RT2, RMASK, b, lsl#(2); \ - ldr RT0, [CTXs1, RT0]; \ - and RT3, RMASK, a, lsr#(16 - 2 + (adj_a)); \ - ldr RT1, [CTXs0, RT1]; \ - and RX, RMASK, a, lsr#(8 - 2 + (adj_a)); \ - ldr RT2, [CTXs1, RT2]; \ - add RT3, RT3, #(s2 - s1); \ - ldr RX, [CTXs1, RX]; \ - ror_a(a); \ - \ - eor RY, RY, RT0; \ - ldr RT3, [CTXs1, RT3]; \ - and RT0, RMASK, a, lsl#(2); \ - eor RY, RY, RT1; \ - and RT1, RMASK, a, lsr#(24 - 2); \ - eor RY, RY, RT2; \ - ldr RT0, [CTXs0, RT0]; \ - eor RX, RX, RT3; \ - ldr RT1, [CTXs3, RT1]; 
\ - eor RX, RX, RT0; \ - \ - ldr RT3, [CTXs3, #(k - s3 + 8 * (n) + 4)]; \ - eor RX, RX, RT1; \ - ldr RT2, [CTXs3, #(k - s3 + 8 * (n))]; \ - \ - add RT0, RX, RY, lsl #1; \ - add RX, RX, RY; \ - add RT0, RT0, RT3; \ - add RX, RX, RT2; \ - eor rd, RT0, rd, ror #31; \ - eor rc, rc, RX; - -#define dummy(x) /*_*/ - -#define ror1(r) \ - ror r, r, #1; - -#define decrypt_round(a, b, rc, rd, n, ror_b, adj_b) \ - and RT3, RMASK, b, lsl#(2 - (adj_b)); \ - and RT1, RMASK, b, lsr#(8 - 2 + (adj_b)); \ - ror_b(b); \ - and RT2, RMASK, a, lsl#(2); \ - and RT0, RMASK, a, lsr#(8 - 2); \ - \ - ldr RY, [CTXs1, RT3]; \ - add RT1, RT1, #(s2 - s1); \ - ldr RX, [CTXs0, RT2]; \ - and RT3, RMASK, b, lsr#(16 - 2); \ - ldr RT1, [CTXs1, RT1]; \ - and RT2, RMASK, a, lsr#(16 - 2); \ - ldr RT0, [CTXs1, RT0]; \ - \ - add RT2, RT2, #(s2 - s1); \ - ldr RT3, [CTXs3, RT3]; \ - eor RY, RY, RT1; \ - \ - and RT1, RMASK, b, lsr#(24 - 2); \ - eor RX, RX, RT0; \ - ldr RT2, [CTXs1, RT2]; \ - and RT0, RMASK, a, lsr#(24 - 2); \ - \ - ldr RT1, [CTXs0, RT1]; \ - \ - eor RY, RY, RT3; \ - ldr RT0, [CTXs3, RT0]; \ - eor RX, RX, RT2; \ - eor RY, RY, RT1; \ - \ - ldr RT1, [CTXs3, #(k - s3 + 8 * (n) + 4)]; \ - eor RX, RX, RT0; \ - ldr RT2, [CTXs3, #(k - s3 + 8 * (n))]; \ - \ - add RT0, RX, RY, lsl #1; \ - add RX, RX, RY; \ - add RT0, RT0, RT1; \ - add RX, RX, RT2; \ - eor rd, rd, RT0; \ - eor rc, RX, rc, ror #31; - -#define first_encrypt_cycle(nc) \ - encrypt_round(RA, RB, RC, RD, (nc) * 2, dummy, 0); \ - encrypt_round(RC, RD, RA, RB, (nc) * 2 + 1, ror1, 1); - -#define encrypt_cycle(nc) \ - encrypt_round(RA, RB, RC, RD, (nc) * 2, ror1, 1); \ - encrypt_round(RC, RD, RA, RB, (nc) * 2 + 1, ror1, 1); - -#define last_encrypt_cycle(nc) \ - encrypt_round(RA, RB, RC, RD, (nc) * 2, ror1, 1); \ - encrypt_round(RC, RD, RA, RB, (nc) * 2 + 1, ror1, 1); \ - ror1(RA); - -#define first_decrypt_cycle(nc) \ - decrypt_round(RC, RD, RA, RB, (nc) * 2 + 1, dummy, 0); \ - decrypt_round(RA, RB, RC, RD, (nc) * 2, ror1, 1); - -#define 
decrypt_cycle(nc) \ - decrypt_round(RC, RD, RA, RB, (nc) * 2 + 1, ror1, 1); \ - decrypt_round(RA, RB, RC, RD, (nc) * 2, ror1, 1); - -#define last_decrypt_cycle(nc) \ - decrypt_round(RC, RD, RA, RB, (nc) * 2 + 1, ror1, 1); \ - decrypt_round(RA, RB, RC, RD, (nc) * 2, ror1, 1); \ - ror1(RD); - -.align 3 -.global _gcry_twofish_armv6_encrypt_block -.type _gcry_twofish_armv6_encrypt_block,%function; - -_gcry_twofish_armv6_encrypt_block: - /* input: - * %r0: ctx - * %r1: dst - * %r2: src - */ - push {%r1, %r4-%r11, %ip, %lr}; - - add RY, CTXs0, #w; - - ldr_input_le(%r2, RA, RB, RC, RD, RT0); - - /* Input whitening */ - ldm RY, {RT0, RT1, RT2, RT3}; - add CTXs3, CTXs0, #(s3 - s0); - add CTXs1, CTXs0, #(s1 - s0); - mov RMASK, #(0xff << 2); - eor RA, RA, RT0; - eor RB, RB, RT1; - eor RC, RC, RT2; - eor RD, RD, RT3; - - first_encrypt_cycle(0); - encrypt_cycle(1); - encrypt_cycle(2); - encrypt_cycle(3); - encrypt_cycle(4); - encrypt_cycle(5); - encrypt_cycle(6); - last_encrypt_cycle(7); - - add RY, CTXs3, #(w + 4*4 - s3); - pop {%r1}; /* dst */ - - /* Output whitening */ - ldm RY, {RT0, RT1, RT2, RT3}; - eor RC, RC, RT0; - eor RD, RD, RT1; - eor RA, RA, RT2; - eor RB, RB, RT3; - - str_output_le(%r1, RC, RD, RA, RB, RT0, RT1); - - pop {%r4-%r11, %ip, %lr}; - bx %lr; -.ltorg -.size _gcry_twofish_armv6_encrypt_block,.-_gcry_twofish_armv6_encrypt_block; - -.align 3 -.global _gcry_twofish_armv6_decrypt_block -.type _gcry_twofish_armv6_decrypt_block,%function; - -_gcry_twofish_armv6_decrypt_block: - /* input: - * %r0: ctx - * %r1: dst - * %r2: src - */ - push {%r1, %r4-%r11, %ip, %lr}; - - add CTXs3, CTXs0, #(s3 - s0); - - ldr_input_le(%r2, RC, RD, RA, RB, RT0); - - add RY, CTXs3, #(w + 4*4 - s3); - add CTXs3, CTXs0, #(s3 - s0); - - /* Input whitening */ - ldm RY, {RT0, RT1, RT2, RT3}; - add CTXs1, CTXs0, #(s1 - s0); - mov RMASK, #(0xff << 2); - eor RC, RC, RT0; - eor RD, RD, RT1; - eor RA, RA, RT2; - eor RB, RB, RT3; - - first_decrypt_cycle(7); - decrypt_cycle(6); - 
decrypt_cycle(5); - decrypt_cycle(4); - decrypt_cycle(3); - decrypt_cycle(2); - decrypt_cycle(1); - last_decrypt_cycle(0); - - add RY, CTXs0, #w; - pop {%r1}; /* dst */ - - /* Output whitening */ - ldm RY, {RT0, RT1, RT2, RT3}; - eor RA, RA, RT0; - eor RB, RB, RT1; - eor RC, RC, RT2; - eor RD, RD, RT3; - - str_output_le(%r1, RA, RB, RC, RD, RT0, RT1); - - pop {%r4-%r11, %ip, %lr}; - bx %lr; -.size _gcry_twofish_armv6_decrypt_block,.-_gcry_twofish_armv6_decrypt_block; - -#endif /*HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS*/ -#endif /*__ARM_ARCH >= 6*/ diff --git a/cipher/twofish.c b/cipher/twofish.c index d2cabbe..8f9f3fc 100644 --- a/cipher/twofish.c +++ b/cipher/twofish.c @@ -757,10 +757,10 @@ extern void _gcry_twofish_amd64_cfb_dec(const TWOFISH_context *c, byte *out, #elif defined(USE_ARMV6_ASM) /* Assembly implementations of Twofish. */ -extern void _gcry_twofish_armv6_encrypt_block(const TWOFISH_context *c, +extern void _gcry_twofish_arm_encrypt_block(const TWOFISH_context *c, byte *out, const byte *in); -extern void _gcry_twofish_armv6_decrypt_block(const TWOFISH_context *c, +extern void _gcry_twofish_arm_decrypt_block(const TWOFISH_context *c, byte *out, const byte *in); #else /*!USE_AMD64_ASM && !USE_ARMV6_ASM*/ @@ -843,7 +843,7 @@ static unsigned int twofish_encrypt (void *context, byte *out, const byte *in) { TWOFISH_context *ctx = context; - _gcry_twofish_armv6_encrypt_block(ctx, out, in); + _gcry_twofish_arm_encrypt_block(ctx, out, in); return /*burn_stack*/ (4*sizeof (void*)); } @@ -910,7 +910,7 @@ static unsigned int twofish_decrypt (void *context, byte *out, const byte *in) { TWOFISH_context *ctx = context; - _gcry_twofish_armv6_decrypt_block(ctx, out, in); + _gcry_twofish_arm_decrypt_block(ctx, out, in); return /*burn_stack*/ (4*sizeof (void*)); } diff --git a/configure.ac b/configure.ac index 58916e8..9b8ce33 100644 --- a/configure.ac +++ b/configure.ac @@ -1424,7 +1424,7 @@ if test "$found" = "1" ; then ;; arm*-*-*) # Build with the assembly 
implementation - GCRYPT_CIPHERS="$GCRYPT_CIPHERS blowfish-armv6.lo" + GCRYPT_CIPHERS="$GCRYPT_CIPHERS blowfish-arm.lo" ;; esac fi @@ -1441,7 +1441,7 @@ if test "$found" = "1" ; then ;; arm*-*-*) # Build with the assembly implementation - GCRYPT_CIPHERS="$GCRYPT_CIPHERS cast5-armv6.lo" + GCRYPT_CIPHERS="$GCRYPT_CIPHERS cast5-arm.lo" ;; esac fi @@ -1464,7 +1464,7 @@ if test "$found" = "1" ; then ;; arm*-*-*) # Build with the assembly implementation - GCRYPT_CIPHERS="$GCRYPT_CIPHERS rijndael-armv6.lo" + GCRYPT_CIPHERS="$GCRYPT_CIPHERS rijndael-arm.lo" ;; esac fi @@ -1481,7 +1481,7 @@ if test "$found" = "1" ; then ;; arm*-*-*) # Build with the assembly implementation - GCRYPT_CIPHERS="$GCRYPT_CIPHERS twofish-armv6.lo" + GCRYPT_CIPHERS="$GCRYPT_CIPHERS twofish-arm.lo" ;; esac fi @@ -1524,7 +1524,7 @@ if test "$found" = "1" ; then case "${host}" in arm*-*-*) # Build with the assembly implementation - GCRYPT_CIPHERS="$GCRYPT_CIPHERS camellia-armv6.lo" + GCRYPT_CIPHERS="$GCRYPT_CIPHERS camellia-arm.lo" ;; esac -- 1.8.4.rc3 From jussi.kivilinna at iki.fi Wed Oct 23 13:11:45 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Wed, 23 Oct 2013 14:11:45 +0300 Subject: [PATCH 3/3] Enable assembler optimizations on earlier ARM cores In-Reply-To: <1382470167-11975-3-git-send-email-dbaryshkov@gmail.com> References: <1382470167-11975-1-git-send-email-dbaryshkov@gmail.com> <1382470167-11975-3-git-send-email-dbaryshkov@gmail.com> Message-ID: <5267AEF1.5010504@iki.fi> On 22.10.2013 22:29, Dmitry Eremin-Solenikov wrote: > * cipher/blowfish-armv6.S => cipher/blowfish-arm.S: adapt to pre-armv6 CPUs. > * cipher/blowfish.c: enable assembly on armv4/armv5 little-endian CPUs. > * cipher/camellia-armv6.S => cipher/camellia-arm.S: adapt to pre-armv6 CPUs. > * cipher/camellia.c, cipher-camellia-glue.c: enable assembly on armv4/armv5 > little-endian CPUs. > * cipher/cast5-armv6.S => cipher/cast5-arm.S: adapt to pre-armv6 CPUs. 
> * cipher/cast5.c: enable assembly on armv4/armv5 little-endian CPUs. > * cipher/rijndael-armv6.S => cipher/rijndael-arm.S: adapt to pre-armv6 CPUs. > * cipher/rijndael.c: enable assembly on armv4/armv5 little-endian CPUs. > * cipher/twofish-armv6.S => cipher/twofish-arm.S: adapt to pre-armv6 CPUs. > * cipher/twofish.c: enable assembly on armv4/armv5 little-endian CPUs. > > -- > Our ARMv6 assembly-optimized code can easily be adapted to earlier CPUs. > The only incompatible place is the rev instruction used for byte swapping. > Replace it on pre-ARMv6 CPUs with a series of 4 instructions. > Thanks. I've added a few fixes and, if no one objects, I'll push these later today or tomorrow. -Jussi From jussi.kivilinna at iki.fi Wed Oct 23 14:26:37 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Wed, 23 Oct 2013 15:26:37 +0300 Subject: [PATCH 1/2] Improve the speed of the cipher mode code Message-ID: <20131023122637.19376.55401.stgit@localhost6.localdomain6> * cipher/bufhelp.h (buf_cpy): New. (buf_xor, buf_xor_2dst): If buffers unaligned, always jump to per-byte processing. (buf_xor_n_copy_2): New. (buf_xor_n_copy): Use 'buf_xor_n_copy_2'. * cipher/blowfish.c (_gcry_blowfish_cbc_dec): Avoid extra memory copy and use new 'buf_xor_n_copy_2'. * cipher/camellia-glue.c (_gcry_camellia_cbc_dec): Ditto. * cipher/cast5.c (_gcry_cast_cbc_dec): Ditto. * cipher/serpent.c (_gcry_serpent_cbc_dec): Ditto. * cipher/twofish.c (_gcry_twofish_cbc_dec): Ditto. * cipher/rijndael.c (_gcry_aes_cbc_dec): Ditto. (do_encrypt, do_decrypt): Use 'buf_cpy' instead of 'memcpy'. (_gcry_aes_cbc_enc): Avoid copying IV, use 'last_iv' pointer instead. * cipher/cipher-cbc.c (_gcry_cipher_cbc_encrypt): Avoid copying IV, update pointer to IV instead. (_gcry_cipher_cbc_decrypt): Avoid extra memory copy and use new 'buf_xor_n_copy_2'. (_gcry_cipher_cbc_encrypt, _gcry_cipher_cbc_decrypt): Avoid extra accesses to c->spec, use 'buf_cpy' instead of memcpy. * cipher/cipher-ccm.c (do_cbc_mac): Ditto. 
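The PATCH 3/3 message above notes that the only ARMv6-only instruction in these implementations is rev, which pre-ARMv6 cores must replace with a four-instruction byte swap. As a sketch only, the classic eor/lsr/bic/eor sequence can be modeled in C like this (the register scheduling in the posted .S files may differ; `ror32` and `bswap32_armv4` are illustrative helper names, not libgcrypt API):

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit rotate right, as in an ARM "ror #n" shifted operand. */
static inline uint32_t ror32(uint32_t x, unsigned n)
{
  return (x >> n) | (x << (32 - n));
}

/* Byte swap without the ARMv6 "rev" instruction, mirroring the
 * four-instruction replacement sequence:
 *   eor rtmp, reg, reg, ror #16
 *   mov rtmp, rtmp, lsr #8
 *   bic rtmp, rtmp, #255 << 8
 *   eor reg, rtmp, reg, ror #8
 */
static uint32_t bswap32_armv4(uint32_t x)
{
  uint32_t t = x ^ ror32(x, 16);  /* bytes: b3^b1, b2^b0, b1^b3, b0^b2 */
  t >>= 8;                        /* shift mixed bytes down one lane   */
  t &= ~(uint32_t)(255u << 8);    /* clear the lane that must not mix  */
  return t ^ ror32(x, 8);         /* combine with rotated original     */
}
```

Working through the byte algebra shows why only four instructions suffice: after the first eor, each lane holds the XOR of the two bytes that need to trade places, so one rotate of the original cancels the unwanted halves.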
* cipher/cipher-cfb.c (_gcry_cipher_cfb_encrypt)
(_gcry_cipher_cfb_decrypt): Ditto.
* cipher/cipher-ctr.c (_gcry_cipher_ctr_encrypt): Ditto.
* cipher/cipher-ofb.c (_gcry_cipher_ofb_encrypt)
(_gcry_cipher_ofb_decrypt): Ditto.
* cipher/cipher.c (do_ecb_encrypt, do_ecb_decrypt): Ditto.
--
This patch improves the speed of the generic block cipher mode code. In particular, on targets without fast unaligned memory accesses, the generic code was slower than the algorithm-specific bulk versions. With this patch, this issue should be solved.

Tests on Cortex-A8; compiled for ARMv4, without unaligned-accesses:

Before:
          ECB/Stream       CBC              CFB              OFB              CTR              CCM
         --------------- --------------- --------------- --------------- --------------- ---------------
 SEED     490ms  500ms    560ms  580ms    530ms  540ms    560ms  560ms    550ms  540ms   1080ms 1080ms
 TWOFISH  230ms  230ms    290ms  300ms    260ms  240ms    290ms  290ms    240ms  240ms    520ms  510ms
 DES      720ms  720ms    800ms  860ms    770ms  770ms    810ms  820ms    770ms  780ms      -      -
 CAST5    340ms  340ms    440ms  250ms    390ms  250ms    440ms  430ms    260ms  250ms      -      -

After:
          ECB/Stream       CBC              CFB              OFB              CTR              CCM
         --------------- --------------- --------------- --------------- --------------- ---------------
 SEED     500ms  490ms    520ms  520ms    530ms  520ms    530ms  540ms    500ms  520ms   1060ms 1070ms
 TWOFISH  230ms  220ms    250ms  230ms    260ms  230ms    260ms  260ms    230ms  230ms    500ms  490ms
 DES      720ms  720ms    750ms  760ms    740ms  750ms    770ms  770ms    760ms  760ms      -      -
 CAST5    340ms  340ms    370ms  250ms    370ms  250ms    380ms  390ms    250ms  250ms      -      -

Tests on Cortex-A8; compiled for ARMv7-A, with unaligned-accesses:

Before:
          ECB/Stream       CBC              CFB              OFB              CTR              CCM
         --------------- --------------- --------------- --------------- --------------- ---------------
 SEED     430ms  440ms    480ms  530ms    470ms  460ms    490ms  480ms    470ms  460ms    930ms  940ms
 TWOFISH  220ms  220ms    250ms  230ms    240ms  230ms    270ms  250ms    230ms  240ms    480ms  470ms
 DES      550ms  540ms    620ms  690ms    570ms  540ms    630ms  650ms    590ms  580ms      -      -
 CAST5    300ms  300ms    380ms  230ms    330ms  230ms    380ms  370ms    230ms  230ms      -      -

After:
          ECB/Stream       CBC              CFB              OFB              CTR              CCM
         --------------- --------------- --------------- --------------- --------------- ---------------
 SEED     430ms  430ms    460ms  450ms    460ms  450ms    470ms  470ms    460ms  470ms    900ms  930ms
 TWOFISH  220ms  210ms    240ms  230ms    230ms  230ms    250ms  250ms    230ms  230ms    470ms  470ms
 DES      540ms  540ms    580ms  570ms    570ms  570ms    560ms  620ms    580ms  570ms      -      -
 CAST5    300ms  290ms    310ms  230ms    320ms  230ms    350ms  350ms    230ms  230ms      -      -

Tests on Intel Atom N160 (i386):

Before:
          ECB/Stream       CBC              CFB              OFB              CTR              CCM
         --------------- --------------- --------------- --------------- --------------- ---------------
 SEED     380ms  380ms    410ms  420ms    400ms  400ms    410ms  410ms    390ms  400ms    820ms  800ms
 TWOFISH  340ms  340ms    370ms  350ms    360ms  340ms    370ms  370ms    330ms  340ms    710ms  700ms
 DES      660ms  650ms    710ms  740ms    680ms  700ms    700ms  710ms    680ms  680ms      -      -
 CAST5    340ms  340ms    380ms  330ms    360ms  330ms    390ms  390ms    320ms  330ms      -      -

After:
          ECB/Stream       CBC              CFB              OFB              CTR              CCM
         --------------- --------------- --------------- --------------- --------------- ---------------
 SEED     380ms  380ms    390ms  410ms    400ms  390ms    410ms  400ms    400ms  390ms    810ms  800ms
 TWOFISH  330ms  340ms    350ms  360ms    350ms  340ms    380ms  370ms    340ms  360ms    700ms  710ms
 DES      630ms  640ms    660ms  690ms    680ms  680ms    700ms  690ms    680ms  680ms      -      -
 CAST5    340ms  330ms    350ms  330ms    370ms  340ms    380ms  390ms    330ms  330ms      -      -

Tests on Intel i5-4570 (x86-64):

Before:
          ECB/Stream       CBC              CFB              OFB              CTR              CCM
         --------------- --------------- --------------- --------------- --------------- ---------------
 SEED     560ms  560ms    600ms  590ms    600ms  570ms    570ms  570ms    580ms  590ms   1200ms 1180ms
 TWOFISH  240ms  240ms    270ms  160ms    260ms  160ms    250ms  250ms    160ms  160ms    430ms  430ms
 DES      570ms  570ms    640ms  590ms    630ms  580ms    600ms  600ms    610ms  620ms      -      -
 CAST5    410ms  410ms    470ms  150ms    470ms  150ms    450ms  450ms    150ms  160ms      -      -

After:
          ECB/Stream       CBC              CFB              OFB              CTR              CCM
         --------------- --------------- --------------- --------------- --------------- ---------------
 SEED     560ms  560ms    590ms  570ms    580ms  570ms    570ms  570ms    590ms  590ms   1200ms 1200ms
 TWOFISH  240ms  240ms    260ms  160ms    250ms
170ms 250ms 250ms 160ms 160ms 430ms 430ms DES 570ms 570ms 620ms 580ms 630ms 570ms 600ms 590ms 620ms 620ms - - CAST5 410ms 410ms 460ms 150ms 460ms 160ms 450ms 450ms 150ms 150ms - - Signed-off-by: Jussi Kivilinna --- cipher/blowfish.c | 11 ++--- cipher/bufhelp.h | 104 ++++++++++++++++++++++++++++++++---------------- cipher/camellia-glue.c | 11 ++--- cipher/cast5.c | 11 ++--- cipher/cipher-cbc.c | 53 +++++++++++++----------- cipher/cipher-ccm.c | 7 +-- cipher/cipher-cfb.c | 24 ++++++----- cipher/cipher-ctr.c | 5 +- cipher/cipher-ofb.c | 20 +++++---- cipher/cipher.c | 10 +++-- cipher/rijndael.c | 58 ++++++++++++++++----------- cipher/serpent.c | 11 ++--- cipher/twofish.c | 11 ++--- 13 files changed, 188 insertions(+), 148 deletions(-) diff --git a/cipher/blowfish.c b/cipher/blowfish.c index ed4e901..3b6bf6b 100644 --- a/cipher/blowfish.c +++ b/cipher/blowfish.c @@ -701,14 +701,11 @@ _gcry_blowfish_cbc_dec(void *context, unsigned char *iv, void *outbuf_arg, for ( ;nblocks; nblocks-- ) { - /* We need to save INBUF away because it may be identical to - OUTBUF. */ - memcpy(savebuf, inbuf, BLOWFISH_BLOCKSIZE); + /* INBUF is needed later and it may be identical to OUTBUF, so store + the intermediate result to SAVEBUF. */ + do_decrypt_block (ctx, savebuf, inbuf); - do_decrypt_block (ctx, outbuf, inbuf); - - buf_xor(outbuf, outbuf, iv, BLOWFISH_BLOCKSIZE); - memcpy(iv, savebuf, BLOWFISH_BLOCKSIZE); + buf_xor_n_copy_2(outbuf, savebuf, iv, inbuf, BLOWFISH_BLOCKSIZE); inbuf += BLOWFISH_BLOCKSIZE; outbuf += BLOWFISH_BLOCKSIZE; } diff --git a/cipher/bufhelp.h b/cipher/bufhelp.h index 198d286..dc39b46 100644 --- a/cipher/bufhelp.h +++ b/cipher/bufhelp.h @@ -44,6 +44,45 @@ #endif +/* Optimized function for small buffer copying */ +static inline void +buf_cpy(void *_dst, const void *_src, size_t len) +{ +#if __GNUC__ >= 4 && (defined(__x86_64__) || defined(__i386__)) + /* For AMD64 and i386, memcpy is faster. 
*/ + memcpy(_dst, _src, len); +#else + byte *dst = _dst; + const byte *src = _src; + uintptr_t *ldst; + const uintptr_t *lsrc; +#ifndef BUFHELP_FAST_UNALIGNED_ACCESS + const unsigned int longmask = sizeof(uintptr_t) - 1; + + /* Skip fast processing if buffers are unaligned. */ + if (((uintptr_t)dst | (uintptr_t)src) & longmask) + goto do_bytes; +#endif + + ldst = (uintptr_t *)(void *)dst; + lsrc = (const uintptr_t *)(const void *)src; + + for (; len >= sizeof(uintptr_t); len -= sizeof(uintptr_t)) + *ldst++ = *lsrc++; + + dst = (byte *)ldst; + src = (const byte *)lsrc; + +#ifndef BUFHELP_FAST_UNALIGNED_ACCESS +do_bytes: +#endif + /* Handle tail. */ + for (; len; len--) + *dst++ = *src++; +#endif /*__GNUC__ >= 4 && (__x86_64__ || __i386__)*/ +} + + /* Optimized function for buffer xoring */ static inline void buf_xor(void *_dst, const void *_src1, const void *_src2, size_t len) @@ -56,14 +95,9 @@ buf_xor(void *_dst, const void *_src1, const void *_src2, size_t len) #ifndef BUFHELP_FAST_UNALIGNED_ACCESS const unsigned int longmask = sizeof(uintptr_t) - 1; - /* Skip fast processing if alignment of buffers do not match. */ - if ((((uintptr_t)dst ^ (uintptr_t)src1) | - ((uintptr_t)dst ^ (uintptr_t)src2)) & longmask) + /* Skip fast processing if buffers are unaligned. */ + if (((uintptr_t)dst | (uintptr_t)src1 | (uintptr_t)src2) & longmask) goto do_bytes; - - /* Handle unaligned head. */ - for (; len && ((uintptr_t)dst & longmask); len--) - *dst++ = *src1++ ^ *src2++; #endif ldst = (uintptr_t *)(void *)dst; @@ -99,14 +133,9 @@ buf_xor_2dst(void *_dst1, void *_dst2, const void *_src, size_t len) #ifndef BUFHELP_FAST_UNALIGNED_ACCESS const unsigned int longmask = sizeof(uintptr_t) - 1; - /* Skip fast processing if alignment of buffers do not match. */ - if ((((uintptr_t)src ^ (uintptr_t)dst1) | - ((uintptr_t)src ^ (uintptr_t)dst2)) & longmask) + /* Skip fast processing if buffers are unaligned. 
*/ + if (((uintptr_t)src | (uintptr_t)dst1 | (uintptr_t)dst2) & longmask) goto do_bytes; - - /* Handle unaligned head. */ - for (; len && ((uintptr_t)src & longmask); len--) - *dst1++ = (*dst2++ ^= *src++); #endif ldst1 = (uintptr_t *)(void *)dst1; @@ -130,48 +159,44 @@ do_bytes: /* Optimized function for combined buffer xoring and copying. Used by mainly - CFB mode decryption. */ + CBC mode decryption. */ static inline void -buf_xor_n_copy(void *_dst_xor, void *_srcdst_cpy, const void *_src, size_t len) +buf_xor_n_copy_2(void *_dst_xor, const void *_src_xor, void *_srcdst_cpy, + const void *_src_cpy, size_t len) { byte *dst_xor = _dst_xor; byte *srcdst_cpy = _srcdst_cpy; + const byte *src_xor = _src_xor; + const byte *src_cpy = _src_cpy; byte temp; - const byte *src = _src; uintptr_t *ldst_xor, *lsrcdst_cpy; - const uintptr_t *lsrc; + const uintptr_t *lsrc_cpy, *lsrc_xor; uintptr_t ltemp; #ifndef BUFHELP_FAST_UNALIGNED_ACCESS const unsigned int longmask = sizeof(uintptr_t) - 1; - /* Skip fast processing if alignment of buffers do not match. */ - if ((((uintptr_t)src ^ (uintptr_t)dst_xor) | - ((uintptr_t)src ^ (uintptr_t)srcdst_cpy)) & longmask) + /* Skip fast processing if buffers are unaligned. */ + if (((uintptr_t)src_cpy | (uintptr_t)src_xor | (uintptr_t)dst_xor | + (uintptr_t)srcdst_cpy) & longmask) goto do_bytes; - - /* Handle unaligned head. 
*/ - for (; len && ((uintptr_t)src & longmask); len--) - { - temp = *src++; - *dst_xor++ = *srcdst_cpy ^ temp; - *srcdst_cpy++ = temp; - } #endif ldst_xor = (uintptr_t *)(void *)dst_xor; + lsrc_xor = (const uintptr_t *)(void *)src_xor; lsrcdst_cpy = (uintptr_t *)(void *)srcdst_cpy; - lsrc = (const uintptr_t *)(const void *)src; + lsrc_cpy = (const uintptr_t *)(const void *)src_cpy; for (; len >= sizeof(uintptr_t); len -= sizeof(uintptr_t)) { - ltemp = *lsrc++; - *ldst_xor++ = *lsrcdst_cpy ^ ltemp; + ltemp = *lsrc_cpy++; + *ldst_xor++ = *lsrcdst_cpy ^ *lsrc_xor++; *lsrcdst_cpy++ = ltemp; } dst_xor = (byte *)ldst_xor; + src_xor = (const byte *)lsrc_xor; srcdst_cpy = (byte *)lsrcdst_cpy; - src = (const byte *)lsrc; + src_cpy = (const byte *)lsrc_cpy; #ifndef BUFHELP_FAST_UNALIGNED_ACCESS do_bytes: @@ -179,13 +204,22 @@ do_bytes: /* Handle tail. */ for (; len; len--) { - temp = *src++; - *dst_xor++ = *srcdst_cpy ^ temp; + temp = *src_cpy++; + *dst_xor++ = *srcdst_cpy ^ *src_xor++; *srcdst_cpy++ = temp; } } +/* Optimized function for combined buffer xoring and copying. Used by mainly + CFB mode decryption. */ +static inline void +buf_xor_n_copy(void *_dst_xor, void *_srcdst_cpy, const void *_src, size_t len) +{ + buf_xor_n_copy_2(_dst_xor, _src, _srcdst_cpy, _src, len); +} + + #ifndef BUFHELP_FAST_UNALIGNED_ACCESS /* Functions for loading and storing unaligned u32 values of different diff --git a/cipher/camellia-glue.c b/cipher/camellia-glue.c index e6d4029..8c217a7 100644 --- a/cipher/camellia-glue.c +++ b/cipher/camellia-glue.c @@ -441,14 +441,11 @@ _gcry_camellia_cbc_dec(void *context, unsigned char *iv, for ( ;nblocks; nblocks-- ) { - /* We need to save INBUF away because it may be identical to - OUTBUF. */ - memcpy(savebuf, inbuf, CAMELLIA_BLOCK_SIZE); + /* INBUF is needed later and it may be identical to OUTBUF, so store + the intermediate result to SAVEBUF. 
*/ + Camellia_DecryptBlock(ctx->keybitlength, inbuf, ctx->keytable, savebuf); - Camellia_DecryptBlock(ctx->keybitlength, inbuf, ctx->keytable, outbuf); - - buf_xor(outbuf, outbuf, iv, CAMELLIA_BLOCK_SIZE); - memcpy(iv, savebuf, CAMELLIA_BLOCK_SIZE); + buf_xor_n_copy_2(outbuf, savebuf, iv, inbuf, CAMELLIA_BLOCK_SIZE); inbuf += CAMELLIA_BLOCK_SIZE; outbuf += CAMELLIA_BLOCK_SIZE; } diff --git a/cipher/cast5.c b/cipher/cast5.c index 8c016d7..0df7886 100644 --- a/cipher/cast5.c +++ b/cipher/cast5.c @@ -678,14 +678,11 @@ _gcry_cast5_cbc_dec(void *context, unsigned char *iv, void *outbuf_arg, for ( ;nblocks; nblocks-- ) { - /* We need to save INBUF away because it may be identical to - OUTBUF. */ - memcpy(savebuf, inbuf, CAST5_BLOCKSIZE); + /* INBUF is needed later and it may be identical to OUTBUF, so store + the intermediate result to SAVEBUF. */ + do_decrypt_block (ctx, savebuf, inbuf); - do_decrypt_block (ctx, outbuf, inbuf); - - buf_xor(outbuf, outbuf, iv, CAST5_BLOCKSIZE); - memcpy(iv, savebuf, CAST5_BLOCKSIZE); + buf_xor_n_copy_2(outbuf, savebuf, iv, inbuf, CAST5_BLOCKSIZE); inbuf += CAST5_BLOCKSIZE; outbuf += CAST5_BLOCKSIZE; } diff --git a/cipher/cipher-cbc.c b/cipher/cipher-cbc.c index 523f5a6..4ad2ebd 100644 --- a/cipher/cipher-cbc.c +++ b/cipher/cipher-cbc.c @@ -41,14 +41,15 @@ _gcry_cipher_cbc_encrypt (gcry_cipher_hd_t c, unsigned char *ivp; int i; size_t blocksize = c->spec->blocksize; + gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; unsigned nblocks = inbuflen / blocksize; unsigned int burn, nburn; if (outbuflen < ((c->flags & GCRY_CIPHER_CBC_MAC)? 
blocksize : inbuflen)) return GPG_ERR_BUFFER_TOO_SHORT; - if ((inbuflen % c->spec->blocksize) - && !(inbuflen > c->spec->blocksize + if ((inbuflen % blocksize) + && !(inbuflen > blocksize && (c->flags & GCRY_CIPHER_CBC_CTS))) return GPG_ERR_INV_LENGTH; @@ -70,16 +71,21 @@ _gcry_cipher_cbc_encrypt (gcry_cipher_hd_t c, } else { + ivp = c->u_iv.iv; + for (n=0; n < nblocks; n++ ) { - buf_xor(outbuf, inbuf, c->u_iv.iv, blocksize); - nburn = c->spec->encrypt ( &c->context.c, outbuf, outbuf ); + buf_xor (outbuf, inbuf, ivp, blocksize); + nburn = enc_fn ( &c->context.c, outbuf, outbuf ); burn = nburn > burn ? nburn : burn; - memcpy (c->u_iv.iv, outbuf, blocksize ); + ivp = outbuf; inbuf += blocksize; if (!(c->flags & GCRY_CIPHER_CBC_MAC)) outbuf += blocksize; } + + if (ivp != c->u_iv.iv) + buf_cpy (c->u_iv.iv, ivp, blocksize ); } if ((c->flags & GCRY_CIPHER_CBC_CTS) && inbuflen > blocksize) @@ -104,9 +110,9 @@ _gcry_cipher_cbc_encrypt (gcry_cipher_hd_t c, for (; i < blocksize; i++) outbuf[i] = 0 ^ *ivp++; - nburn = c->spec->encrypt (&c->context.c, outbuf, outbuf); + nburn = enc_fn (&c->context.c, outbuf, outbuf); burn = nburn > burn ? 
nburn : burn; - memcpy (c->u_iv.iv, outbuf, blocksize); + buf_cpy (c->u_iv.iv, outbuf, blocksize); } if (burn > 0) @@ -124,14 +130,15 @@ _gcry_cipher_cbc_decrypt (gcry_cipher_hd_t c, unsigned int n; int i; size_t blocksize = c->spec->blocksize; + gcry_cipher_decrypt_t dec_fn = c->spec->decrypt; unsigned int nblocks = inbuflen / blocksize; unsigned int burn, nburn; if (outbuflen < inbuflen) return GPG_ERR_BUFFER_TOO_SHORT; - if ((inbuflen % c->spec->blocksize) - && !(inbuflen > c->spec->blocksize + if ((inbuflen % blocksize) + && !(inbuflen > blocksize && (c->flags & GCRY_CIPHER_CBC_CTS))) return GPG_ERR_INV_LENGTH; @@ -142,7 +149,7 @@ _gcry_cipher_cbc_decrypt (gcry_cipher_hd_t c, nblocks--; if ((inbuflen % blocksize) == 0) nblocks--; - memcpy (c->lastiv, c->u_iv.iv, blocksize); + buf_cpy (c->lastiv, c->u_iv.iv, blocksize); } if (c->bulk.cbc_dec) @@ -155,16 +162,14 @@ _gcry_cipher_cbc_decrypt (gcry_cipher_hd_t c, { for (n=0; n < nblocks; n++ ) { - /* Because outbuf and inbuf might be the same, we have to - * save the original ciphertext block. We use LASTIV for - * this here because it is not used otherwise. */ - memcpy (c->lastiv, inbuf, blocksize); - nburn = c->spec->decrypt ( &c->context.c, outbuf, inbuf ); + /* Because outbuf and inbuf might be the same, we must not overwrite + the original ciphertext block. We use LASTIV as intermediate + storage here because it is not used otherwise. */ + nburn = dec_fn ( &c->context.c, c->lastiv, inbuf ); burn = nburn > burn ? nburn : burn; - buf_xor(outbuf, outbuf, c->u_iv.iv, blocksize); - memcpy(c->u_iv.iv, c->lastiv, blocksize ); - inbuf += c->spec->blocksize; - outbuf += c->spec->blocksize; + buf_xor_n_copy_2(outbuf, c->lastiv, c->u_iv.iv, inbuf, blocksize); + inbuf += blocksize; + outbuf += blocksize; } } @@ -177,17 +182,17 @@ _gcry_cipher_cbc_decrypt (gcry_cipher_hd_t c, else restbytes = inbuflen % blocksize; - memcpy (c->lastiv, c->u_iv.iv, blocksize ); /* Save Cn-2. 
*/ - memcpy (c->u_iv.iv, inbuf + blocksize, restbytes ); /* Save Cn. */ + buf_cpy (c->lastiv, c->u_iv.iv, blocksize ); /* Save Cn-2. */ + buf_cpy (c->u_iv.iv, inbuf + blocksize, restbytes ); /* Save Cn. */ - nburn = c->spec->decrypt ( &c->context.c, outbuf, inbuf ); + nburn = dec_fn ( &c->context.c, outbuf, inbuf ); burn = nburn > burn ? nburn : burn; buf_xor(outbuf, outbuf, c->u_iv.iv, restbytes); - memcpy(outbuf + blocksize, outbuf, restbytes); + buf_cpy (outbuf + blocksize, outbuf, restbytes); for(i=restbytes; i < blocksize; i++) c->u_iv.iv[i] = outbuf[i]; - nburn = c->spec->decrypt (&c->context.c, outbuf, c->u_iv.iv); + nburn = dec_fn (&c->context.c, outbuf, c->u_iv.iv); burn = nburn > burn ? nburn : burn; buf_xor(outbuf, outbuf, c->lastiv, blocksize); /* c->lastiv is now really lastlastiv, does this matter? */ diff --git a/cipher/cipher-ccm.c b/cipher/cipher-ccm.c index 38752d5..ebcbf1e 100644 --- a/cipher/cipher-ccm.c +++ b/cipher/cipher-ccm.c @@ -40,6 +40,7 @@ do_cbc_mac (gcry_cipher_hd_t c, const unsigned char *inbuf, size_t inlen, int do_padding) { const unsigned int blocksize = 16; + gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; unsigned char tmp[blocksize]; unsigned int burn = 0; unsigned int unused = c->u_mode.ccm.mac_unused; @@ -68,8 +69,7 @@ do_cbc_mac (gcry_cipher_hd_t c, const unsigned char *inbuf, size_t inlen, { /* Process one block from macbuf. 
*/ buf_xor(c->u_iv.iv, c->u_iv.iv, c->u_mode.ccm.macbuf, blocksize); - set_burn (burn, c->spec->encrypt ( &c->context.c, c->u_iv.iv, - c->u_iv.iv )); + set_burn (burn, enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv )); unused = 0; } @@ -89,8 +89,7 @@ do_cbc_mac (gcry_cipher_hd_t c, const unsigned char *inbuf, size_t inlen, { buf_xor(c->u_iv.iv, c->u_iv.iv, inbuf, blocksize); - set_burn (burn, c->spec->encrypt ( &c->context.c, c->u_iv.iv, - c->u_iv.iv )); + set_burn (burn, enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv )); inlen -= blocksize; inbuf += blocksize; diff --git a/cipher/cipher-cfb.c b/cipher/cipher-cfb.c index 244f5fd..610d006 100644 --- a/cipher/cipher-cfb.c +++ b/cipher/cipher-cfb.c @@ -37,6 +37,7 @@ _gcry_cipher_cfb_encrypt (gcry_cipher_hd_t c, const unsigned char *inbuf, unsigned int inbuflen) { unsigned char *ivp; + gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; size_t blocksize = c->spec->blocksize; size_t blocksize_x_2 = blocksize + blocksize; unsigned int burn, nburn; @@ -48,7 +49,7 @@ _gcry_cipher_cfb_encrypt (gcry_cipher_hd_t c, { /* Short enough to be encoded by the remaining XOR mask. */ /* XOR the input with the IV and store input into IV. */ - ivp = c->u_iv.iv + c->spec->blocksize - c->unused; + ivp = c->u_iv.iv + blocksize - c->unused; buf_xor_2dst(outbuf, ivp, inbuf, inbuflen); c->unused -= inbuflen; return 0; @@ -83,7 +84,7 @@ _gcry_cipher_cfb_encrypt (gcry_cipher_hd_t c, while ( inbuflen >= blocksize_x_2 ) { /* Encrypt the IV. */ - nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; /* XOR the input with the IV and store input into IV. */ buf_xor_2dst(outbuf, c->u_iv.iv, inbuf, blocksize); @@ -96,8 +97,8 @@ _gcry_cipher_cfb_encrypt (gcry_cipher_hd_t c, if ( inbuflen >= blocksize ) { /* Save the current IV and then encrypt the IV. 
*/ - memcpy( c->lastiv, c->u_iv.iv, blocksize ); - nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + buf_cpy( c->lastiv, c->u_iv.iv, blocksize ); + nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; /* XOR the input with the IV and store input into IV */ buf_xor_2dst(outbuf, c->u_iv.iv, inbuf, blocksize); @@ -108,8 +109,8 @@ _gcry_cipher_cfb_encrypt (gcry_cipher_hd_t c, if ( inbuflen ) { /* Save the current IV and then encrypt the IV. */ - memcpy( c->lastiv, c->u_iv.iv, blocksize ); - nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + buf_cpy( c->lastiv, c->u_iv.iv, blocksize ); + nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; c->unused = blocksize; /* Apply the XOR. */ @@ -133,6 +134,7 @@ _gcry_cipher_cfb_decrypt (gcry_cipher_hd_t c, const unsigned char *inbuf, unsigned int inbuflen) { unsigned char *ivp; + gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; size_t blocksize = c->spec->blocksize; size_t blocksize_x_2 = blocksize + blocksize; unsigned int burn, nburn; @@ -179,7 +181,7 @@ _gcry_cipher_cfb_decrypt (gcry_cipher_hd_t c, while (inbuflen >= blocksize_x_2 ) { /* Encrypt the IV. */ - nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; /* XOR the input with the IV and store input into IV. */ buf_xor_n_copy(outbuf, c->u_iv.iv, inbuf, blocksize); @@ -192,8 +194,8 @@ _gcry_cipher_cfb_decrypt (gcry_cipher_hd_t c, if (inbuflen >= blocksize ) { /* Save the current IV and then encrypt the IV. */ - memcpy ( c->lastiv, c->u_iv.iv, blocksize); - nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + buf_cpy ( c->lastiv, c->u_iv.iv, blocksize); + nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? 
nburn : burn; /* XOR the input with the IV and store input into IV */ buf_xor_n_copy(outbuf, c->u_iv.iv, inbuf, blocksize); @@ -205,8 +207,8 @@ _gcry_cipher_cfb_decrypt (gcry_cipher_hd_t c, if (inbuflen) { /* Save the current IV and then encrypt the IV. */ - memcpy ( c->lastiv, c->u_iv.iv, blocksize ); - nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + buf_cpy ( c->lastiv, c->u_iv.iv, blocksize ); + nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; c->unused = blocksize; /* Apply the XOR. */ diff --git a/cipher/cipher-ctr.c b/cipher/cipher-ctr.c index fbc898f..37a6a79 100644 --- a/cipher/cipher-ctr.c +++ b/cipher/cipher-ctr.c @@ -38,6 +38,7 @@ _gcry_cipher_ctr_encrypt (gcry_cipher_hd_t c, { unsigned int n; int i; + gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; unsigned int blocksize = c->spec->blocksize; unsigned int nblocks; unsigned int burn, nburn; @@ -77,7 +78,7 @@ _gcry_cipher_ctr_encrypt (gcry_cipher_hd_t c, unsigned char tmp[MAX_BLOCKSIZE]; do { - nburn = c->spec->encrypt (&c->context.c, tmp, c->u_ctr.ctr); + nburn = enc_fn (&c->context.c, tmp, c->u_ctr.ctr); burn = nburn > burn ? nburn : burn; for (i = blocksize; i > 0; i--) @@ -98,7 +99,7 @@ _gcry_cipher_ctr_encrypt (gcry_cipher_hd_t c, /* Save the unused bytes of the counter. */ c->unused = blocksize - n; if (c->unused) - memcpy (c->lastiv+n, tmp+n, c->unused); + buf_cpy (c->lastiv+n, tmp+n, c->unused); wipememory (tmp, sizeof tmp); } diff --git a/cipher/cipher-ofb.c b/cipher/cipher-ofb.c index 3d9d54c..333a748 100644 --- a/cipher/cipher-ofb.c +++ b/cipher/cipher-ofb.c @@ -37,6 +37,7 @@ _gcry_cipher_ofb_encrypt (gcry_cipher_hd_t c, const unsigned char *inbuf, unsigned int inbuflen) { unsigned char *ivp; + gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; size_t blocksize = c->spec->blocksize; unsigned int burn, nburn; @@ -47,7 +48,7 @@ _gcry_cipher_ofb_encrypt (gcry_cipher_hd_t c, { /* Short enough to be encoded by the remaining XOR mask. 
*/ /* XOR the input with the IV */ - ivp = c->u_iv.iv + c->spec->blocksize - c->unused; + ivp = c->u_iv.iv + blocksize - c->unused; buf_xor(outbuf, ivp, inbuf, inbuflen); c->unused -= inbuflen; return 0; @@ -69,8 +70,8 @@ _gcry_cipher_ofb_encrypt (gcry_cipher_hd_t c, while ( inbuflen >= blocksize ) { /* Encrypt the IV (and save the current one). */ - memcpy( c->lastiv, c->u_iv.iv, blocksize ); - nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + buf_cpy( c->lastiv, c->u_iv.iv, blocksize ); + nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; buf_xor(outbuf, c->u_iv.iv, inbuf, blocksize); outbuf += blocksize; @@ -79,8 +80,8 @@ _gcry_cipher_ofb_encrypt (gcry_cipher_hd_t c, } if ( inbuflen ) { /* process the remaining bytes */ - memcpy( c->lastiv, c->u_iv.iv, blocksize ); - nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + buf_cpy( c->lastiv, c->u_iv.iv, blocksize ); + nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; c->unused = blocksize; c->unused -= inbuflen; @@ -103,6 +104,7 @@ _gcry_cipher_ofb_decrypt (gcry_cipher_hd_t c, const unsigned char *inbuf, unsigned int inbuflen) { unsigned char *ivp; + gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; size_t blocksize = c->spec->blocksize; unsigned int burn, nburn; @@ -134,8 +136,8 @@ _gcry_cipher_ofb_decrypt (gcry_cipher_hd_t c, while ( inbuflen >= blocksize ) { /* Encrypt the IV (and save the current one). */ - memcpy( c->lastiv, c->u_iv.iv, blocksize ); - nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + buf_cpy( c->lastiv, c->u_iv.iv, blocksize ); + nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; buf_xor(outbuf, c->u_iv.iv, inbuf, blocksize); outbuf += blocksize; @@ -145,8 +147,8 @@ _gcry_cipher_ofb_decrypt (gcry_cipher_hd_t c, if ( inbuflen ) { /* Process the remaining bytes. */ /* Encrypt the IV (and save the current one). 
*/ - memcpy( c->lastiv, c->u_iv.iv, blocksize ); - nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + buf_cpy( c->lastiv, c->u_iv.iv, blocksize ); + nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; c->unused = blocksize; c->unused -= inbuflen; diff --git a/cipher/cipher.c b/cipher/cipher.c index 5214d26..c0d1d0b 100644 --- a/cipher/cipher.c +++ b/cipher/cipher.c @@ -631,6 +631,7 @@ do_ecb_encrypt (gcry_cipher_hd_t c, unsigned char *outbuf, unsigned int outbuflen, const unsigned char *inbuf, unsigned int inbuflen) { + gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; unsigned int blocksize = c->spec->blocksize; unsigned int n, nblocks; unsigned int burn, nburn; @@ -640,12 +641,12 @@ do_ecb_encrypt (gcry_cipher_hd_t c, if ((inbuflen % blocksize)) return GPG_ERR_INV_LENGTH; - nblocks = inbuflen / c->spec->blocksize; + nblocks = inbuflen / blocksize; burn = 0; for (n=0; n < nblocks; n++ ) { - nburn = c->spec->encrypt (&c->context.c, outbuf, (byte*)/*arggg*/inbuf); + nburn = enc_fn (&c->context.c, outbuf, (byte*)/*arggg*/inbuf); burn = nburn > burn ? nburn : burn; inbuf += blocksize; outbuf += blocksize; @@ -662,6 +663,7 @@ do_ecb_decrypt (gcry_cipher_hd_t c, unsigned char *outbuf, unsigned int outbuflen, const unsigned char *inbuf, unsigned int inbuflen) { + gcry_cipher_decrypt_t dec_fn = c->spec->decrypt; unsigned int blocksize = c->spec->blocksize; unsigned int n, nblocks; unsigned int burn, nburn; @@ -671,12 +673,12 @@ do_ecb_decrypt (gcry_cipher_hd_t c, if ((inbuflen % blocksize)) return GPG_ERR_INV_LENGTH; - nblocks = inbuflen / c->spec->blocksize; + nblocks = inbuflen / blocksize; burn = 0; for (n=0; n < nblocks; n++ ) { - nburn = c->spec->decrypt (&c->context.c, outbuf, (byte*)/*arggg*/inbuf); + nburn = dec_fn (&c->context.c, outbuf, (byte*)/*arggg*/inbuf); burn = nburn > burn ? 
nburn : burn; inbuf += blocksize; outbuf += blocksize; diff --git a/cipher/rijndael.c b/cipher/rijndael.c index e9bb4f6..e8733c9 100644 --- a/cipher/rijndael.c +++ b/cipher/rijndael.c @@ -675,9 +675,9 @@ do_encrypt (const RIJNDAEL_context *ctx, byte b[16] ATTR_ALIGNED_16; } b; - memcpy (a.a, ax, 16); + buf_cpy (a.a, ax, 16); do_encrypt_aligned (ctx, b.b, a.a); - memcpy (bx, b.b, 16); + buf_cpy (bx, b.b, 16); } else #endif /*!USE_AMD64_ASM && !USE_ARM_ASM*/ @@ -1556,12 +1556,15 @@ _gcry_aes_cbc_enc (void *context, unsigned char *iv, RIJNDAEL_context *ctx = context; unsigned char *outbuf = outbuf_arg; const unsigned char *inbuf = inbuf_arg; + unsigned char *last_iv; #ifdef USE_AESNI if (ctx->use_aesni) aesni_prepare (); #endif /*USE_AESNI*/ + last_iv = iv; + for ( ;nblocks; nblocks-- ) { if (0) @@ -1576,24 +1579,17 @@ _gcry_aes_cbc_enc (void *context, unsigned char *iv, "pxor %%xmm0, %%xmm1\n\t" "movdqu %%xmm1, %[outbuf]\n\t" : /* No output */ - : [iv] "m" (*iv), + : [iv] "m" (*last_iv), [inbuf] "m" (*inbuf), [outbuf] "m" (*outbuf) : "memory" ); do_aesni (ctx, 0, outbuf, outbuf); - - asm volatile ("movdqu %[outbuf], %%xmm0\n\t" - "movdqu %%xmm0, %[iv]\n\t" - : /* No output */ - : [outbuf] "m" (*outbuf), - [iv] "m" (*iv) - : "memory" ); } #endif /*USE_AESNI*/ else { - buf_xor(outbuf, inbuf, iv, BLOCKSIZE); + buf_xor(outbuf, inbuf, last_iv, BLOCKSIZE); if (0) ; @@ -1603,18 +1599,34 @@ _gcry_aes_cbc_enc (void *context, unsigned char *iv, #endif /*USE_PADLOCK*/ else do_encrypt (ctx, outbuf, outbuf ); - - memcpy (iv, outbuf, BLOCKSIZE); } + last_iv = outbuf; inbuf += BLOCKSIZE; if (!cbc_mac) outbuf += BLOCKSIZE; } + if (last_iv != iv) + { + if (0) + ; +#ifdef USE_AESNI + else if (ctx->use_aesni) + asm volatile ("movdqu %[last], %%xmm0\n\t" + "movdqu %%xmm0, %[iv]\n\t" + : /* No output */ + : [last] "m" (*last_iv), + [iv] "m" (*iv) + : "memory" ); +#endif /*USE_AESNI*/ + else + buf_cpy (iv, last_iv, BLOCKSIZE); + } + #ifdef USE_AESNI - if (ctx->use_aesni) - aesni_cleanup 
(); + if (ctx->use_aesni) + aesni_cleanup (); #endif /*USE_AESNI*/ _gcry_burn_stack (48 + 2*sizeof(int)); @@ -1810,9 +1822,9 @@ do_decrypt (RIJNDAEL_context *ctx, byte *bx, const byte *ax) byte b[16] ATTR_ALIGNED_16; } b; - memcpy (a.a, ax, 16); + buf_cpy (a.a, ax, 16); do_decrypt_aligned (ctx, b.b, a.a); - memcpy (bx, b.b, 16); + buf_cpy (bx, b.b, 16); } else #endif /*!USE_AMD64_ASM && !USE_ARM_ASM*/ @@ -2068,21 +2080,19 @@ _gcry_aes_cbc_dec (void *context, unsigned char *iv, else for ( ;nblocks; nblocks-- ) { - /* We need to save INBUF away because it may be identical to - OUTBUF. */ - memcpy (savebuf, inbuf, BLOCKSIZE); + /* INBUF is needed later and it may be identical to OUTBUF, so store + the intermediate result to SAVEBUF. */ if (0) ; #ifdef USE_PADLOCK else if (ctx->use_padlock) - do_padlock (ctx, 1, outbuf, inbuf); + do_padlock (ctx, 1, savebuf, inbuf); #endif /*USE_PADLOCK*/ else - do_decrypt (ctx, outbuf, inbuf); + do_decrypt (ctx, savebuf, inbuf); - buf_xor(outbuf, outbuf, iv, BLOCKSIZE); - memcpy (iv, savebuf, BLOCKSIZE); + buf_xor_n_copy_2(outbuf, savebuf, iv, inbuf, BLOCKSIZE); inbuf += BLOCKSIZE; outbuf += BLOCKSIZE; } diff --git a/cipher/serpent.c b/cipher/serpent.c index c0898dc..a8ee15f 100644 --- a/cipher/serpent.c +++ b/cipher/serpent.c @@ -950,14 +950,11 @@ _gcry_serpent_cbc_dec(void *context, unsigned char *iv, for ( ;nblocks; nblocks-- ) { - /* We need to save INBUF away because it may be identical to - OUTBUF. */ - memcpy(savebuf, inbuf, sizeof(serpent_block_t)); + /* INBUF is needed later and it may be identical to OUTBUF, so store + the intermediate result to SAVEBUF. 
*/ + serpent_decrypt_internal (ctx, inbuf, savebuf); - serpent_decrypt_internal (ctx, inbuf, outbuf); - - buf_xor(outbuf, outbuf, iv, sizeof(serpent_block_t)); - memcpy(iv, savebuf, sizeof(serpent_block_t)); + buf_xor_n_copy_2(outbuf, savebuf, iv, inbuf, sizeof(serpent_block_t)); inbuf += sizeof(serpent_block_t); outbuf += sizeof(serpent_block_t); } diff --git a/cipher/twofish.c b/cipher/twofish.c index 086df76..04081db 100644 --- a/cipher/twofish.c +++ b/cipher/twofish.c @@ -1055,16 +1055,13 @@ _gcry_twofish_cbc_dec(void *context, unsigned char *iv, void *outbuf_arg, for ( ;nblocks; nblocks-- ) { - /* We need to save INBUF away because it may be identical to - OUTBUF. */ - memcpy(savebuf, inbuf, TWOFISH_BLOCKSIZE); - - burn = twofish_decrypt (ctx, outbuf, inbuf); + /* INBUF is needed later and it may be identical to OUTBUF, so store + the intermediate result to SAVEBUF. */ + burn = twofish_decrypt (ctx, savebuf, inbuf); if (burn > burn_stack_depth) burn_stack_depth = burn; - buf_xor(outbuf, outbuf, iv, TWOFISH_BLOCKSIZE); - memcpy(iv, savebuf, TWOFISH_BLOCKSIZE); + buf_xor_n_copy_2(outbuf, savebuf, iv, inbuf, TWOFISH_BLOCKSIZE); inbuf += TWOFISH_BLOCKSIZE; outbuf += TWOFISH_BLOCKSIZE; } From jussi.kivilinna at iki.fi Wed Oct 23 14:26:42 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Wed, 23 Oct 2013 15:26:42 +0300 Subject: [PATCH 2/2] Replace architecture specific fast_wipememory2 with generic In-Reply-To: <20131023122637.19376.55401.stgit@localhost6.localdomain6> References: <20131023122637.19376.55401.stgit@localhost6.localdomain6> Message-ID: <20131023122642.19376.37577.stgit@localhost6.localdomain6> * src/g10lib.h (fast_wipememory2): Remove architecture specific implementations and add generic implementation. -- Reduces code size, adds support for other architectures, and gcc appears to generate better code without the assembly parts.
Signed-off-by: Jussi Kivilinna --- src/g10lib.h | 85 +++++++++++++++++----------------------------------------- 1 file changed, 25 insertions(+), 60 deletions(-) diff --git a/src/g10lib.h b/src/g10lib.h index 3b09448..80c73ee 100644 --- a/src/g10lib.h +++ b/src/g10lib.h @@ -275,77 +275,42 @@ void __gcry_burn_stack (unsigned int bytes); } while(0) #define wipememory(_ptr,_len) wipememory2(_ptr,0,_len) +#ifdef HAVE_U64_TYPEDEF + #define FASTWIPE_T u64 + #define FASTWIPE_MULT (U64_C(0x0101010101010101)) +#else + #define FASTWIPE_T u32 + #define FASTWIPE_MULT (0x01010101U) +#endif -/* Optimized fast_wipememory2 for i386, x86-64 and arm architectures. May leave - tail bytes unhandled, in which case tail bytes are handled by wipememory2. - */ -#if defined(__x86_64__) && __GNUC__ >= 4 -#define fast_wipememory2(_vptr,_vset,_vlen) do { \ - unsigned long long int _vset8 = _vset; \ - if (_vlen < 8) \ - break; \ - _vset8 *= 0x0101010101010101ULL; \ - do { \ - asm volatile("movq %[set], %[ptr]\n\t" \ - : /**/ \ - : [set] "Cr" (_vset8), \ - [ptr] "m" (*_vptr) \ - : "memory"); \ - _vlen -= 8; \ - _vptr += 8; \ - } while (_vlen >= 8); \ - } while (0) -#elif defined (__i386__) && SIZEOF_UNSIGNED_LONG == 4 && __GNUC__ >= 4 -#define fast_wipememory2(_ptr,_set,_len) do { \ - unsigned long _vset4 = _vset; \ - if (_vlen < 4) \ - break; \ - _vset4 *= 0x01010101; \ - do { \ - asm volatile("movl %[set], %[ptr]\n\t" \ - : /**/ \ - : [set] "Cr" (_vset4), \ - [ptr] "m" (*_vptr) \ - : "memory"); \ - _vlen -= 4; \ - _vptr += 4; \ - } while (_vlen >= 4); \ - } while (0) -#elif defined (__arm__) && (defined (__thumb2__) || !defined (__thumb__)) && \ - __GNUC__ >= 4 - -#ifdef __ARM_FEATURE_UNALIGNED +/* Following architectures can handle unaligned accesses fast. 
*/ +#if defined(__i386__) || defined(__x86_64__) || \ + defined(__powerpc__) || defined(__powerpc64__) || \ + (defined(__arm__) && defined(__ARM_FEATURE_UNALIGNED)) || \ + defined(__aarch64__) #define fast_wipememory2_unaligned_head(_ptr,_set,_len) /*do nothing*/ #else #define fast_wipememory2_unaligned_head(_vptr,_vset,_vlen) do { \ - while((size_t)(_vptr)&3 && _vlen) \ - { *_vptr=(_vset); _vptr++; _vlen--; } \ + while((size_t)(_vptr)&(sizeof(FASTWIPE_T)-1) && _vlen) \ + { *_vptr=(_vset); _vptr++; _vlen--; } \ } while(0) #endif +/* fast_wipememory2 may leave tail bytes unhandled, in which case tail bytes + are handled by wipememory2. */ #define fast_wipememory2(_vptr,_vset,_vlen) do { \ - unsigned long _vset4 = _vset; \ + FASTWIPE_T _vset_long = _vset; \ fast_wipememory2_unaligned_head(_vptr,_vset,_vlen); \ - if (_vlen < 8) \ + if (_vlen < sizeof(FASTWIPE_T)) \ break; \ - _vset4 *= 0x01010101; \ - asm volatile( \ - "mov %%r4, %[set];\n\t" \ - "mov %%r5, %[set];\n\t" \ - "1:;\n\t" \ - "stm %[ptr]!, {%%r4, %%r5};\n\t" \ - "cmp %[end], %[ptr];\n\t" \ - "bne 1b;\n\t" \ - : [ptr] "=r" (_vptr) \ - : [set] "r" (_vset4), \ - [end] "r" (_vptr+(_vlen&(~0x7))), \ - "0" (_vptr) \ - : "memory", "r4", "r5", "cc"); \ - _vlen &= 0x7; \ + _vset_long *= FASTWIPE_MULT; \ + do { \ + volatile FASTWIPE_T *_vptr_long = (volatile void *)_vptr; \ + *_vptr_long = _vset_long; \ + _vlen -= sizeof(FASTWIPE_T); \ + _vptr += sizeof(FASTWIPE_T); \ + } while (_vlen >= sizeof(FASTWIPE_T)); \ } while (0) -#else -#define fast_wipememory2(_ptr,_set,_len) -#endif /* Digit predicates. */ From jussi.kivilinna at iki.fi Wed Oct 23 18:12:01 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Wed, 23 Oct 2013 19:12:01 +0300 Subject: Libgcrypt performance, 1.6 vs 1.5 (on i386, arm & amd64) Message-ID: <5267F551.2@iki.fi> Hello, I wanted to check how much has the performance of libgcrypt improved in 1.6 (for block ciphers and hashes). 
And since the results weren't all bad, I'll wanted to share them here :) So, here's benchmark results on few architectures comparing performance of 1.5.3 and 1.6-git. First is results on i386, which shows 'baseline' of improvement; that is, performance improvements without new assembly implementations. After i386, there's results for AMD64 and ARM which show the additional speed-up with the help of assembly implementations. -Jussi libgcrypt 1.6-git vs 1.5.3: Intel Atom N270 (i386, Debian wheezy): Hash: buf buf/10 1 putc large --------------------------------------- MD5 1.18x 1.72x 1.09x 1.11x 1.12x SHA1 1.11x 1.39x 1.07x 1.11x 1.15x RIPEMD160 1.35x 1.61x 1.12x 1.24x 1.21x TIGER192 1.04x 1.34x 1.27x 1.04x 1.05x SHA256 1.27x 1.74x 1.26x 1.19x 1.19x SHA384 1.11x 1.54x 1.19x 1.11x 1.06x SHA512 1.11x 1.56x 1.19x 1.09x 1.03x SHA224 1.26x 1.77x 1.26x 1.22x 1.22x MD4 1.17x 1.96x 1.07x 1.23x 1.09x CRC32 1.09x 1.00x 1.00x 1.13x 1.09x CRC32RFC1510 0.92x 1.00x 1.00x 1.00x 1.00x CRC24RFC2440 1.01x 1.00x 1.02x 1.00x 1.00x WHIRLPOOL 1.00x 0.99x 0.88x 1.02x 1.01x TIGER 1.03x 1.31x 1.25x 1.03x 1.00x TIGER2 1.03x 1.33x 1.26x 1.02x 1.01x Cipher: ECB/Stream CBC CFB OFB CTR --------------- --------------- --------------- --------------- --------------- IDEA 1.37x 1.38x 1.44x 1.51x 1.39x 1.40x 1.36x 1.37x 1.89x 1.88x 3DES 1.16x 1.16x 1.21x 1.22x 1.18x 1.17x 1.17x 1.17x 1.36x 1.38x CAST5 1.80x 1.83x 1.92x 2.22x 1.78x 1.95x 1.71x 1.76x 3.13x 3.16x BLOWFISH 1.95x 2.11x 2.05x 2.62x 2.00x 2.15x 1.81x 1.88x 3.40x 3.34x AES 1.34x 1.29x 1.09x 1.06x 1.10x 1.10x 1.35x 1.37x 1.09x 1.09x AES192 1.26x 1.26x 1.07x 1.05x 1.08x 1.09x 1.31x 1.31x 1.08x 1.07x AES256 1.21x 1.22x 1.06x 1.06x 1.08x 1.07x 1.26x 1.27x 1.08x 1.07x TWOFISH 1.52x 1.45x 1.62x 1.69x 1.59x 1.67x 1.53x 1.57x 2.77x 2.74x ARCFOUR 1.00x 1.02x DES 1.52x 1.49x 1.56x 1.62x 1.53x 1.53x 1.53x 1.53x 2.05x 2.06x TWOFISH128 1.45x 1.47x 1.64x 1.71x 1.58x 1.66x 1.53x 1.52x 2.75x 2.71x SERPENT128 1.60x 1.62x 1.74x 1.80x 1.63x 1.77x 1.64x 1.64x 
2.55x 2.55x SERPENT192 1.62x 1.63x 1.73x 1.79x 1.66x 1.74x 1.65x 1.61x 2.53x 2.58x SERPENT256 1.61x 1.60x 1.69x 1.77x 1.69x 1.73x 1.64x 1.68x 2.56x 2.56x RFC2268_40 0.97x 1.03x 1.06x 1.23x 0.99x 1.03x 1.01x 1.03x 1.51x 1.51x SEED 1.47x 1.46x 1.58x 1.58x 1.55x 1.53x 1.51x 1.52x 2.38x 2.43x CAMELLIA128 3.12x 3.13x 3.17x 3.30x 3.08x 3.27x 3.05x 3.06x 4.12x 4.14x CAMELLIA192 2.65x 2.63x 2.69x 2.79x 2.63x 2.76x 2.62x 2.64x 3.42x 3.37x CAMELLIA256 2.65x 2.65x 2.67x 2.80x 2.65x 2.74x 2.62x 2.61x 3.42x 3.43x ARM Cortex-A8 (armhf, Debian jessie): Hash: buf buf/10 1 putc large --------------------------------------- MD5 1.09x 1.40x 1.04x 0.98x 1.05x SHA1 1.33x 1.57x 1.05x 1.18x 1.27x RIPEMD160 1.02x 1.24x 0.99x 1.01x 0.96x TIGER192 1.02x 1.22x 1.31x 0.91x 1.00x SHA256 1.16x 1.69x 1.22x 1.11x 1.11x SHA384 4.25x 4.50x 1.88x 3.56x 4.17x SHA512 4.26x 4.54x 1.88x 3.59x 4.17x SHA224 1.16x 1.68x 1.20x 1.10x 1.11x MD4 0.95x 1.51x 1.02x 1.00x 0.94x CRC32 1.00x 0.96x 1.01x 1.00x 1.04x CRC32RFC1510 1.00x 0.96x 1.00x 1.00x 1.04x CRC24RFC2440 1.00x 1.00x 1.01x 1.00x 1.01x WHIRLPOOL 1.27x 1.24x 1.12x 1.26x 1.26x TIGER 1.01x 1.21x 1.31x 0.95x 0.99x TIGER2 1.02x 1.21x 1.31x 0.92x 1.00x Cipher: ECB/Stream CBC CFB OFB CTR --------------- --------------- --------------- --------------- --------------- IDEA 1.30x 1.30x 1.38x 1.43x 1.32x 1.36x 1.39x 1.35x 1.64x 1.63x 3DES 1.21x 1.21x 1.27x 1.30x 1.23x 1.22x 1.26x 1.25x 1.42x 1.42x CAST5 2.02x 2.07x 2.18x 3.36x 1.83x 3.01x 2.17x 2.14x 4.13x 4.22x BLOWFISH 2.07x 2.04x 2.24x 3.42x 2.10x 3.08x 2.19x 2.19x 4.48x 4.48x AES 2.79x 2.90x 2.28x 2.52x 2.32x 2.30x 2.83x 2.84x 2.29x 2.29x AES192 2.73x 2.88x 2.32x 2.45x 2.34x 2.34x 2.78x 2.74x 2.25x 2.25x AES256 2.75x 2.79x 2.35x 2.56x 2.42x 2.38x 2.75x 2.70x 2.36x 2.34x TWOFISH 1.92x 1.91x 2.17x 2.33x 2.03x 2.18x 2.07x 2.08x 3.21x 3.24x ARCFOUR 1.17x 1.19x DES 1.51x 1.51x 1.62x 1.72x 1.56x 1.54x 1.62x 1.60x 1.99x 2.02x TWOFISH128 1.92x 1.91x 2.15x 2.35x 2.04x 2.16x 2.08x 2.07x 3.23x 3.23x SERPENT128 1.33x 
1.28x 1.46x 1.53x 1.41x 1.42x 1.42x 1.45x 2.04x 2.02x SERPENT192 1.32x 1.29x 1.46x 1.53x 1.40x 1.46x 1.45x 1.45x 2.03x 2.03x SERPENT256 1.33x 1.28x 1.43x 1.53x 1.41x 1.46x 1.44x 1.44x 2.03x 2.03x RFC2268_40 0.97x 1.00x 1.07x 1.17x 1.02x 0.99x 1.09x 1.11x 1.26x 1.27x SEED 1.35x 1.35x 1.50x 1.55x 1.44x 1.43x 1.48x 1.47x 2.02x 2.01x CAMELLIA128 4.52x 4.52x 4.53x 4.80x 4.42x 4.69x 4.35x 4.36x 5.50x 5.40x CAMELLIA192 3.83x 3.82x 3.89x 4.08x 3.80x 3.97x 3.76x 3.79x 4.58x 4.58x CAMELLIA256 3.80x 3.80x 3.88x 4.08x 3.80x 3.97x 3.77x 3.80x 4.58x 4.59x Intel Core i5-4570 (amd64, Ubuntu saucy): Hash: buf buf/10 1 putc large --------------------------------------- MD5 1.03x 1.40x 1.14x 1.02x 0.99x SHA1 1.08x 1.34x 1.14x 1.04x 1.03x RIPEMD160 1.02x 1.32x 1.20x 1.02x 0.99x TIGER192 1.07x 1.71x 1.34x 1.06x 1.01x SHA256 1.15x 1.64x 1.31x 1.09x 1.08x SHA384 1.18x 2.23x 1.33x 1.12x 1.02x SHA512 1.18x 2.22x 1.33x 1.12x 1.02x SHA224 1.15x 1.64x 1.31x 1.09x 1.08x MD4 1.05x 1.52x 1.09x 1.02x 1.00x CRC32 1.00x 1.01x 1.09x 1.00x 1.03x CRC32RFC1510 1.00x 0.99x 1.09x 1.01x 1.02x CRC24RFC2440 0.99x 0.99x 0.96x 0.99x 0.99x WHIRLPOOL 1.02x 0.98x 0.94x 1.01x 1.03x TIGER 1.08x 1.70x 1.35x 1.05x 1.01x TIGER2 1.09x 1.71x 1.34x 1.05x 1.02x Cipher: ECB/Stream CBC CFB OFB CTR --------------- --------------- --------------- --------------- --------------- IDEA 1.37x 1.37x 1.33x 1.49x 1.31x 1.46x 1.31x 1.32x 1.63x 1.63x 3DES 1.13x 1.12x 1.12x 1.15x 1.10x 1.14x 1.13x 1.13x 1.21x 1.21x CAST5 1.36x 1.38x 1.33x 4.15x 1.30x 3.95x 1.34x 1.34x 4.99x 5.02x BLOWFISH 1.69x 1.62x 1.47x 5.22x 1.42x 5.05x 1.46x 1.47x 6.36x 6.37x AES 14.11x 13.55x 3.46x 22.20x 3.61x 22.90x 4.65x 4.62x 17.92x 19.33x AES192 13.26x 12.20x 3.41x 21.33x 3.58x 22.08x 4.41x 4.45x 17.80x 17.80x AES256 12.52x 12.14x 3.38x 20.79x 3.49x 21.29x 4.26x 4.28x 17.65x 17.65x TWOFISH 1.45x 1.45x 1.47x 2.41x 1.46x 2.35x 1.48x 1.50x 3.23x 3.23x ARCFOUR 1.28x 1.27x DES 1.35x 1.32x 1.28x 1.41x 1.27x 1.35x 1.32x 1.33x 1.53x 1.53x TWOFISH128 1.44x 1.45x 
1.47x 2.44x 1.46x 2.33x 1.49x 1.50x 3.23x 3.23x SERPENT128 1.23x 1.15x 1.24x 8.02x 1.23x 9.00x 1.26x 1.24x 10.81x 10.65x SERPENT192 1.23x 1.15x 1.24x 8.00x 1.24x 8.87x 1.26x 1.25x 10.65x 10.65x SERPENT256 1.23x 1.15x 1.24x 8.02x 1.24x 8.85x 1.26x 1.25x 10.65x 10.65x RFC2268_40 1.07x 0.99x 1.03x 1.08x 1.00x 1.02x 1.05x 1.04x 1.21x 1.21x SEED 1.20x 1.22x 1.21x 1.27x 1.19x 1.25x 1.24x 1.24x 1.47x 1.46x CAMELLIA128 3.12x 3.11x 3.00x 14.72x 2.97x 15.02x 3.07x 3.10x 16.82x 16.82x CAMELLIA192 2.57x 2.59x 2.54x 12.34x 2.51x 12.42x 2.59x 2.59x 14.15x 13.93x CAMELLIA256 2.56x 2.55x 2.51x 12.28x 2.50x 12.62x 2.56x 2.58x 13.96x 14.15x From cvs at cvs.gnupg.org Wed Oct 23 14:10:01 2013 From: cvs at cvs.gnupg.org (by Werner Koch) Date: Wed, 23 Oct 2013 14:10:01 +0200 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-326-g164eb8c Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via 164eb8c85d773ef4f0939115ec45f5e4b47c1700 (commit) via 45f6e6268bfdc4b608beaba6b7086b2286e33c71 (commit) from 98674fdaa30ab22a3ac86ca05d688b5b6112895d (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit 164eb8c85d773ef4f0939115ec45f5e4b47c1700 Author: Werner Koch Date: Wed Oct 23 14:08:29 2013 +0200 ecc: Refactor ecc.c * cipher/ecc-ecdsa.c, cipher/ecc-eddsa.c, cipher/ecc-gost.c: New. * cipher/Makefile.am (EXTRA_libcipher_la_SOURCES): Add new files. * configure.ac (GCRYPT_PUBKEY_CIPHERS): Add new files. * cipher/ecc.c (point_init, point_free): Move to ecc-common.h. (sign_ecdsa): Move to ecc-ecdsa.c as _gcry_ecc_ecdsa_sign. (verify_ecdsa): Move to ecc-ecdsa.c as _gcry_ecc_ecdsa_verify. 
(sign_gost): Move to ecc-gost.c as _gcry_ecc_gost_sign. (verify_gost): Move to ecc-gost.c as _gcry_ecc_gost_verify. (sign_eddsa): Move to ecc-eddsa.c as _gcry_ecc_eddsa_sign. (verify_eddsa): Move to ecc-eddsa.c as _gcry_ecc_eddsa_verify. (eddsa_generate_key): Move to ecc-eddsa.c as _gcry_ecc_eddsa_genkey. (reverse_buffer): Move to ecc-eddsa.c. (eddsa_encodempi, eddsa_encode_x_y): Ditto. (_gcry_ecc_eddsa_encodepoint, _gcry_ecc_eddsa_decodepoint): Ditto. -- This change should make it easier to add new ECC algorithms. Signed-off-by: Werner Koch diff --git a/cipher/Makefile.am b/cipher/Makefile.am index 3d8149a..e6b1745 100644 --- a/cipher/Makefile.am +++ b/cipher/Makefile.am @@ -62,6 +62,7 @@ des.c \ dsa.c \ elgamal.c \ ecc.c ecc-curves.c ecc-misc.c ecc-common.h \ +ecc-ecdsa.c ecc-eddsa.c ecc-gost.c \ idea.c \ gost28147.c gost.h \ gostr3411-94.c \ diff --git a/cipher/ecc-common.h b/cipher/ecc-common.h index 0be1f2c..0a95b95 100644 --- a/cipher/ecc-common.h +++ b/cipher/ecc-common.h @@ -61,6 +61,9 @@ point_set (mpi_point_t d, mpi_point_t s) mpi_set (d->z, s->z); } +#define point_init(a) _gcry_mpi_point_init ((a)) +#define point_free(a) _gcry_mpi_point_free_parts ((a)) + /*-- ecc-curves.c --*/ gpg_err_code_t _gcry_ecc_fill_in_curve (unsigned int nbits, @@ -85,6 +88,15 @@ gcry_error_t _gcry_ecc_os2ec (mpi_point_t result, gcry_mpi_t value); mpi_point_t _gcry_ecc_compute_public (mpi_point_t Q, mpi_ec_t ec); /*-- ecc.c --*/ + +/*-- ecc-ecdsa.c --*/ +gpg_err_code_t _gcry_ecc_ecdsa_sign (gcry_mpi_t input, ECC_secret_key *skey, + gcry_mpi_t r, gcry_mpi_t s, + int flags, int hashalgo); +gpg_err_code_t _gcry_ecc_ecdsa_verify (gcry_mpi_t input, ECC_public_key *pkey, + gcry_mpi_t r, gcry_mpi_t s); + +/*-- ecc-eddsa.c --*/ gpg_err_code_t _gcry_ecc_eddsa_encodepoint (mpi_point_t point, mpi_ec_t ctx, gcry_mpi_t x, gcry_mpi_t y, unsigned char **r_buffer, @@ -94,5 +106,24 @@ gpg_err_code_t _gcry_ecc_eddsa_decodepoint (gcry_mpi_t pk, mpi_ec_t ctx, unsigned char **r_encpk, unsigned int
*r_encpklen); +gpg_err_code_t _gcry_ecc_eddsa_genkey (ECC_secret_key *sk, + elliptic_curve_t *E, + mpi_ec_t ctx, + gcry_random_level_t random_level); +gpg_err_code_t _gcry_ecc_eddsa_sign (gcry_mpi_t input, + ECC_secret_key *sk, + gcry_mpi_t r_r, gcry_mpi_t s, + int hashalgo, gcry_mpi_t pk); +gpg_err_code_t _gcry_ecc_eddsa_verify (gcry_mpi_t input, + ECC_public_key *pk, + gcry_mpi_t r, gcry_mpi_t s, + int hashalgo, gcry_mpi_t pkmpi); + +/*-- ecc-gost.c --*/ +gpg_err_code_t _gcry_ecc_gost_sign (gcry_mpi_t input, ECC_secret_key *skey, + gcry_mpi_t r, gcry_mpi_t s); +gpg_err_code_t _gcry_ecc_gost_verify (gcry_mpi_t input, ECC_public_key *pkey, + gcry_mpi_t r, gcry_mpi_t s); + #endif /*GCRY_ECC_COMMON_H*/ diff --git a/cipher/ecc-ecdsa.c b/cipher/ecc-ecdsa.c new file mode 100644 index 0000000..70dfe38 --- /dev/null +++ b/cipher/ecc-ecdsa.c @@ -0,0 +1,235 @@ +/* ecc-ecdsa.c - Elliptic Curve ECDSA signatures + * Copyright (C) 2007, 2008, 2010, 2011 Free Software Foundation, Inc. + * Copyright (C) 2013 g10 Code GmbH + * + * This file is part of Libgcrypt. + * + * Libgcrypt is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as + * published by the Free Software Foundation; either version 2.1 of + * the License, or (at your option) any later version. + * + * Libgcrypt is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this program; if not, see <http://www.gnu.org/licenses/>. + */ + +#include <config.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <errno.h> + +#include "g10lib.h" +#include "mpi.h" +#include "cipher.h" +#include "context.h" +#include "ec-context.h" +#include "pubkey-internal.h" +#include "ecc-common.h" + + +/* Compute an ECDSA signature.
+ * Return the signature struct (r,s) from the message hash. The caller + * must have allocated R and S. + */ +gpg_err_code_t +_gcry_ecc_ecdsa_sign (gcry_mpi_t input, ECC_secret_key *skey, + gcry_mpi_t r, gcry_mpi_t s, + int flags, int hashalgo) +{ + gpg_err_code_t err = 0; + int extraloops = 0; + gcry_mpi_t k, dr, sum, k_1, x; + mpi_point_struct I; + gcry_mpi_t hash; + const void *abuf; + unsigned int abits, qbits; + mpi_ec_t ctx; + + if (DBG_CIPHER) + log_mpidump ("ecdsa sign hash ", input ); + + qbits = mpi_get_nbits (skey->E.n); + + /* Convert the INPUT into an MPI if needed. */ + if (mpi_is_opaque (input)) + { + abuf = gcry_mpi_get_opaque (input, &abits); + err = gpg_err_code (gcry_mpi_scan (&hash, GCRYMPI_FMT_USG, + abuf, (abits+7)/8, NULL)); + if (err) + return err; + if (abits > qbits) + gcry_mpi_rshift (hash, hash, abits - qbits); + } + else + hash = input; + + + k = NULL; + dr = mpi_alloc (0); + sum = mpi_alloc (0); + k_1 = mpi_alloc (0); + x = mpi_alloc (0); + point_init (&I); + + ctx = _gcry_mpi_ec_p_internal_new (skey->E.model, skey->E.dialect, + skey->E.p, skey->E.a, skey->E.b); + + /* Two loops to avoid R or S are zero. This is more of a joke than + a real demand because the probability of them being zero is less + than any hardware failure. Some specs however require it. */ + do + { + do + { + mpi_free (k); + k = NULL; + if ((flags & PUBKEY_FLAG_RFC6979) && hashalgo) + { + /* Use Pornin's method for deterministic DSA. If this + flag is set, it is expected that HASH is an opaque + MPI with the to be signed hash. That hash is also + used as h1 from 3.2.a. 
*/ + if (!mpi_is_opaque (input)) + { + err = GPG_ERR_CONFLICT; + goto leave; + } + + abuf = gcry_mpi_get_opaque (input, &abits); + err = _gcry_dsa_gen_rfc6979_k (&k, skey->E.n, skey->d, + abuf, (abits+7)/8, + hashalgo, extraloops); + if (err) + goto leave; + extraloops++; + } + else + k = _gcry_dsa_gen_k (skey->E.n, GCRY_STRONG_RANDOM); + + _gcry_mpi_ec_mul_point (&I, k, &skey->E.G, ctx); + if (_gcry_mpi_ec_get_affine (x, NULL, &I, ctx)) + { + if (DBG_CIPHER) + log_debug ("ecc sign: Failed to get affine coordinates\n"); + err = GPG_ERR_BAD_SIGNATURE; + goto leave; + } + mpi_mod (r, x, skey->E.n); /* r = x mod n */ + } + while (!mpi_cmp_ui (r, 0)); + + mpi_mulm (dr, skey->d, r, skey->E.n); /* dr = d*r mod n */ + mpi_addm (sum, hash, dr, skey->E.n); /* sum = hash + (d*r) mod n */ + mpi_invm (k_1, k, skey->E.n); /* k_1 = k^(-1) mod n */ + mpi_mulm (s, k_1, sum, skey->E.n); /* s = k^(-1)*(hash+(d*r)) mod n */ + } + while (!mpi_cmp_ui (s, 0)); + + if (DBG_CIPHER) + { + log_mpidump ("ecdsa sign result r ", r); + log_mpidump ("ecdsa sign result s ", s); + } + + leave: + _gcry_mpi_ec_free (ctx); + point_free (&I); + mpi_free (x); + mpi_free (k_1); + mpi_free (sum); + mpi_free (dr); + mpi_free (k); + + if (hash != input) + mpi_free (hash); + + return err; +} + + +/* Verify an ECDSA signature. + * Check if R and S verifies INPUT. + */ +gpg_err_code_t +_gcry_ecc_ecdsa_verify (gcry_mpi_t input, ECC_public_key *pkey, + gcry_mpi_t r, gcry_mpi_t s) +{ + gpg_err_code_t err = 0; + gcry_mpi_t h, h1, h2, x; + mpi_point_struct Q, Q1, Q2; + mpi_ec_t ctx; + + if( !(mpi_cmp_ui (r, 0) > 0 && mpi_cmp (r, pkey->E.n) < 0) ) + return GPG_ERR_BAD_SIGNATURE; /* Assertion 0 < r < n failed. */ + if( !(mpi_cmp_ui (s, 0) > 0 && mpi_cmp (s, pkey->E.n) < 0) ) + return GPG_ERR_BAD_SIGNATURE; /* Assertion 0 < s < n failed. 
*/ + + h = mpi_alloc (0); + h1 = mpi_alloc (0); + h2 = mpi_alloc (0); + x = mpi_alloc (0); + point_init (&Q); + point_init (&Q1); + point_init (&Q2); + + ctx = _gcry_mpi_ec_p_internal_new (pkey->E.model, pkey->E.dialect, + pkey->E.p, pkey->E.a, pkey->E.b); + + /* h = s^(-1) (mod n) */ + mpi_invm (h, s, pkey->E.n); + /* h1 = hash * s^(-1) (mod n) */ + mpi_mulm (h1, input, h, pkey->E.n); + /* Q1 = [ hash * s^(-1) ]G */ + _gcry_mpi_ec_mul_point (&Q1, h1, &pkey->E.G, ctx); + /* h2 = r * s^(-1) (mod n) */ + mpi_mulm (h2, r, h, pkey->E.n); + /* Q2 = [ r * s^(-1) ]Q */ + _gcry_mpi_ec_mul_point (&Q2, h2, &pkey->Q, ctx); + /* Q = ([hash * s^(-1)]G) + ([r * s^(-1)]Q) */ + _gcry_mpi_ec_add_points (&Q, &Q1, &Q2, ctx); + + if (!mpi_cmp_ui (Q.z, 0)) + { + if (DBG_CIPHER) + log_debug ("ecc verify: Rejected\n"); + err = GPG_ERR_BAD_SIGNATURE; + goto leave; + } + if (_gcry_mpi_ec_get_affine (x, NULL, &Q, ctx)) + { + if (DBG_CIPHER) + log_debug ("ecc verify: Failed to get affine coordinates\n"); + err = GPG_ERR_BAD_SIGNATURE; + goto leave; + } + mpi_mod (x, x, pkey->E.n); /* x = x mod E_n */ + if (mpi_cmp (x, r)) /* x != r */ + { + if (DBG_CIPHER) + { + log_mpidump (" x", x); + log_mpidump (" r", r); + log_mpidump (" s", s); + } + err = GPG_ERR_BAD_SIGNATURE; + goto leave; + } + + leave: + _gcry_mpi_ec_free (ctx); + point_free (&Q2); + point_free (&Q1); + point_free (&Q); + mpi_free (x); + mpi_free (h2); + mpi_free (h1); + mpi_free (h); + return err; +} diff --git a/cipher/ecc-eddsa.c b/cipher/ecc-eddsa.c new file mode 100644 index 0000000..72103e9 --- /dev/null +++ b/cipher/ecc-eddsa.c @@ -0,0 +1,681 @@ +/* ecc-eddsa.c - Elliptic Curve EdDSA signatures + * Copyright (C) 2013 g10 Code GmbH + * + * This file is part of Libgcrypt. 
+ * + * Libgcrypt is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as + * published by the Free Software Foundation; either version 2.1 of + * the License, or (at your option) any later version. + * + * Libgcrypt is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this program; if not, see . + */ + +#include +#include +#include +#include +#include + +#include "g10lib.h" +#include "mpi.h" +#include "cipher.h" +#include "context.h" +#include "ec-context.h" +#include "ecc-common.h" + + + +static void +reverse_buffer (unsigned char *buffer, unsigned int length) +{ + unsigned int tmp, i; + + for (i=0; i < length/2; i++) + { + tmp = buffer[i]; + buffer[i] = buffer[length-1-i]; + buffer[length-1-i] = tmp; + } +} + + + +/* Encode MPI using the EdDSA scheme.  MINLEN specifies the required + length of the buffer in bytes.  On success 0 is returned and a + malloced buffer with the encoded point is stored at R_BUFFER; the + length of this buffer is stored at R_BUFLEN. */ +static gpg_err_code_t +eddsa_encodempi (gcry_mpi_t mpi, unsigned int minlen, + unsigned char **r_buffer, unsigned int *r_buflen) +{ + unsigned char *rawmpi; + unsigned int rawmpilen; + + rawmpi = _gcry_mpi_get_buffer (mpi, minlen, &rawmpilen, NULL); + if (!rawmpi) + return gpg_err_code_from_syserror (); + + *r_buffer = rawmpi; + *r_buflen = rawmpilen; + return 0; +} + + +/* Encode (X,Y) using the EdDSA scheme.  MINLEN is the required length + in bytes for the result.  On success 0 is returned and a malloced + buffer with the encoded point is stored at R_BUFFER; the length of + this buffer is stored at R_BUFLEN. 
*/ +static gpg_err_code_t +eddsa_encode_x_y (gcry_mpi_t x, gcry_mpi_t y, unsigned int minlen, + unsigned char **r_buffer, unsigned int *r_buflen) +{ + unsigned char *rawmpi; + unsigned int rawmpilen; + + rawmpi = _gcry_mpi_get_buffer (y, minlen, &rawmpilen, NULL); + if (!rawmpi) + return gpg_err_code_from_syserror (); + if (mpi_test_bit (x, 0) && rawmpilen) + rawmpi[rawmpilen - 1] |= 0x80; /* Set sign bit. */ + + *r_buffer = rawmpi; + *r_buflen = rawmpilen; + return 0; +} + +/* Encode POINT using the EdDSA scheme.  X and Y are either scratch + variables supplied by the caller or NULL.  CTX is the usual + context.  On success 0 is returned and a malloced buffer with the + encoded point is stored at R_BUFFER; the length of this buffer is + stored at R_BUFLEN. */ +gpg_err_code_t +_gcry_ecc_eddsa_encodepoint (mpi_point_t point, mpi_ec_t ec, + gcry_mpi_t x_in, gcry_mpi_t y_in, + unsigned char **r_buffer, unsigned int *r_buflen) +{ + gpg_err_code_t rc; + gcry_mpi_t x, y; + + x = x_in? x_in : mpi_new (0); + y = y_in? y_in : mpi_new (0); + + if (_gcry_mpi_ec_get_affine (x, y, point, ec)) + { + log_error ("eddsa_encodepoint: Failed to get affine coordinates\n"); + rc = GPG_ERR_INTERNAL; + } + else + rc = eddsa_encode_x_y (x, y, ec->nbits/8, r_buffer, r_buflen); + + if (!x_in) + mpi_free (x); + if (!y_in) + mpi_free (y); + return rc; +} + + +/* Decode the EdDSA style encoded PK and set it into RESULT.  CTX is + the usual curve context.  If R_ENCPK is not NULL, the encoded PK is + stored at that address; this is a new copy to be released by the + caller.  In contrast to the supplied PK, this is not an MPI and + thus guaranteed to be properly padded.  R_ENCPKLEN receives the + length of that encoded key. 
*/ +gpg_err_code_t +_gcry_ecc_eddsa_decodepoint (gcry_mpi_t pk, mpi_ec_t ctx, mpi_point_t result, + unsigned char **r_encpk, unsigned int *r_encpklen) +{ + gpg_err_code_t rc; + unsigned char *rawmpi; + unsigned int rawmpilen; + gcry_mpi_t yy, t, x, p1, p2, p3; + int sign; + + if (mpi_is_opaque (pk)) + { + const unsigned char *buf; + + buf = gcry_mpi_get_opaque (pk, &rawmpilen); + if (!buf) + return GPG_ERR_INV_OBJ; + rawmpilen = (rawmpilen + 7)/8; + + /* First check whether the public key has been given in standard + uncompressed format.  No need to recover x in this case. + Detection is easy: The size of the buffer will be odd and the + first byte be 0x04. */ + if (rawmpilen > 1 && buf[0] == 0x04 && (rawmpilen%2)) + { + gcry_mpi_t y; + + rc = gcry_mpi_scan (&x, GCRYMPI_FMT_STD, + buf+1, (rawmpilen-1)/2, NULL); + if (rc) + return rc; + rc = gcry_mpi_scan (&y, GCRYMPI_FMT_STD, + buf+1+(rawmpilen-1)/2, (rawmpilen-1)/2, NULL); + if (rc) + { + mpi_free (x); + return rc; + } + + if (r_encpk) + { + rc = eddsa_encode_x_y (x, y, ctx->nbits/8, r_encpk, r_encpklen); + if (rc) + { + mpi_free (x); + mpi_free (y); + return rc; + } + } + mpi_snatch (result->x, x); + mpi_snatch (result->y, y); + mpi_set_ui (result->z, 1); + return 0; + } + + /* EdDSA compressed point. */ + rawmpi = gcry_malloc (rawmpilen? rawmpilen:1); + if (!rawmpi) + return gpg_err_code_from_syserror (); + memcpy (rawmpi, buf, rawmpilen); + reverse_buffer (rawmpi, rawmpilen); + } + else + { + /* Note: Without using an opaque MPI it is not reliably possible + to find out whether the public key has been given in + uncompressed format.  Thus we expect EdDSA format here. */ + rawmpi = _gcry_mpi_get_buffer (pk, ctx->nbits/8, &rawmpilen, NULL); + if (!rawmpi) + return gpg_err_code_from_syserror (); + } + + if (rawmpilen) + { + sign = !!(rawmpi[0] & 0x80); + rawmpi[0] &= 0x7f; + } + else + sign = 0; + _gcry_mpi_set_buffer (result->y, rawmpi, rawmpilen, 0); + if (r_encpk) + { + /* Revert to little endian. 
*/ + if (sign && rawmpilen) + rawmpi[0] |= 0x80; + reverse_buffer (rawmpi, rawmpilen); + *r_encpk = rawmpi; + if (r_encpklen) + *r_encpklen = rawmpilen; + } + else + gcry_free (rawmpi); + + /* Now recover X. */ + /* t = (y^2-1) ? ((b*y^2+1)^{p-2} mod p) */ + x = mpi_new (0); + yy = mpi_new (0); + mpi_mul (yy, result->y, result->y); + t = mpi_copy (yy); + mpi_mul (t, t, ctx->b); + mpi_add_ui (t, t, 1); + p2 = mpi_copy (ctx->p); + mpi_sub_ui (p2, p2, 2); + mpi_powm (t, t, p2, ctx->p); + + mpi_sub_ui (yy, yy, 1); + mpi_mul (t, yy, t); + + /* x = t^{(p+3)/8} mod p */ + p3 = mpi_copy (ctx->p); + mpi_add_ui (p3, p3, 3); + mpi_fdiv_q (p3, p3, mpi_const (MPI_C_EIGHT)); + mpi_powm (x, t, p3, ctx->p); + + /* (x^2 - t) % p != 0 ? x = (x*(2^{(p-1)/4} mod p)) % p */ + mpi_mul (yy, x, x); + mpi_subm (yy, yy, t, ctx->p); + if (mpi_cmp_ui (yy, 0)) + { + p1 = mpi_copy (ctx->p); + mpi_sub_ui (p1, p1, 1); + mpi_fdiv_q (p1, p1, mpi_const (MPI_C_FOUR)); + mpi_powm (yy, mpi_const (MPI_C_TWO), p1, ctx->p); + mpi_mulm (x, x, yy, ctx->p); + } + else + p1 = NULL; + + /* is_odd(x) ? x = p-x */ + if (mpi_test_bit (x, 0)) + mpi_sub (x, ctx->p, x); + + /* lowbit(x) != highbit(input) ? x = p-x */ + if (mpi_test_bit (x, 0) != sign) + mpi_sub (x, ctx->p, x); + + mpi_set (result->x, x); + mpi_set_ui (result->z, 1); + + gcry_mpi_release (x); + gcry_mpi_release (yy); + gcry_mpi_release (t); + gcry_mpi_release (p3); + gcry_mpi_release (p2); + gcry_mpi_release (p1); + + return 0; +} + + +/* Ed25519 version of the key generation. */ +gpg_err_code_t +_gcry_ecc_eddsa_genkey (ECC_secret_key *sk, elliptic_curve_t *E, mpi_ec_t ctx, + gcry_random_level_t random_level) +{ + gpg_err_code_t rc; + int b = 256/8; /* The only size we currently support. 
*/ + gcry_mpi_t a, x, y; + mpi_point_struct Q; + char *dbuf; + size_t dlen; + gcry_buffer_t hvec[1]; + unsigned char *hash_d = NULL; + + point_init (&Q); + memset (hvec, 0, sizeof hvec); + + a = mpi_snew (0); + x = mpi_new (0); + y = mpi_new (0); + + /* Generate a secret. */ + hash_d = gcry_malloc_secure (2*b); + if (!hash_d) + { + rc = gpg_error_from_syserror (); + goto leave; + } + dlen = b; + dbuf = gcry_random_bytes_secure (dlen, random_level); + + /* Compute the A value. */ + hvec[0].data = dbuf; + hvec[0].len = dlen; + rc = _gcry_md_hash_buffers (GCRY_MD_SHA512, 0, hash_d, hvec, 1); + if (rc) + goto leave; + sk->d = _gcry_mpi_set_opaque (NULL, dbuf, dlen*8); + dbuf = NULL; + reverse_buffer (hash_d, 32); /* Only the first half of the hash. */ + hash_d[0] = (hash_d[0] & 0x7f) | 0x40; + hash_d[31] &= 0xf8; + _gcry_mpi_set_buffer (a, hash_d, 32, 0); + gcry_free (hash_d); hash_d = NULL; + /* log_printmpi ("ecgen a", a); */ + + /* Compute Q. */ + _gcry_mpi_ec_mul_point (&Q, a, &E->G, ctx); + if (DBG_CIPHER) + log_printpnt ("ecgen pk", &Q, ctx); + + /* Copy the stuff to the key structures. */ + sk->E.model = E->model; + sk->E.dialect = E->dialect; + sk->E.p = mpi_copy (E->p); + sk->E.a = mpi_copy (E->a); + sk->E.b = mpi_copy (E->b); + point_init (&sk->E.G); + point_set (&sk->E.G, &E->G); + sk->E.n = mpi_copy (E->n); + point_init (&sk->Q); + point_set (&sk->Q, &Q); + + leave: + gcry_mpi_release (a); + gcry_mpi_release (x); + gcry_mpi_release (y); + gcry_free (hash_d); + return rc; +} + + +/* Compute an EdDSA signature. See: + * [ed25519] 23pp. (PDF) Daniel J. Bernstein, Niels Duif, Tanja + * Lange, Peter Schwabe, Bo-Yin Yang. High-speed high-security + * signatures. Journal of Cryptographic Engineering 2 (2012), 77-89. + * Document ID: a1a62a2f76d23f65d622484ddd09caf8. + * URL: http://cr.yp.to/papers.html#ed25519. Date: 2011.09.26. 
+ * + * Despite that this function requires the specification of a hash + * algorithm, we only support what has been specified by the paper. + * This may change in the future. Note that we don't check the used + * curve; the user is responsible to use Ed25519. + * + * Return the signature struct (r,s) from the message hash. The caller + * must have allocated R_R and S. + */ +gpg_err_code_t +_gcry_ecc_eddsa_sign (gcry_mpi_t input, ECC_secret_key *skey, + gcry_mpi_t r_r, gcry_mpi_t s, int hashalgo, gcry_mpi_t pk) +{ + int rc; + mpi_ec_t ctx = NULL; + int b; + unsigned int tmp; + unsigned char *digest; + gcry_buffer_t hvec[3]; + const void *mbuf; + size_t mlen; + unsigned char *rawmpi = NULL; + unsigned int rawmpilen; + unsigned char *encpk = NULL; /* Encoded public key. */ + unsigned int encpklen; + mpi_point_struct I; /* Intermediate value. */ + mpi_point_struct Q; /* Public key. */ + gcry_mpi_t a, x, y, r; + + memset (hvec, 0, sizeof hvec); + + if (!mpi_is_opaque (input)) + return GPG_ERR_INV_DATA; + if (hashalgo != GCRY_MD_SHA512) + return GPG_ERR_DIGEST_ALGO; + + /* Initialize some helpers. */ + point_init (&I); + point_init (&Q); + a = mpi_snew (0); + x = mpi_new (0); + y = mpi_new (0); + r = mpi_new (0); + ctx = _gcry_mpi_ec_p_internal_new (skey->E.model, skey->E.dialect, + skey->E.p, skey->E.a, skey->E.b); + b = (ctx->nbits+7)/8; + if (b != 256/8) + return GPG_ERR_INTERNAL; /* We only support 256 bit. */ + + digest = gcry_calloc_secure (2, b); + if (!digest) + { + rc = gpg_err_code_from_syserror (); + goto leave; + } + + /* Hash the secret key. We clear DIGEST so we can use it as input + to left pad the key with zeroes for hashing. */ + rawmpi = _gcry_mpi_get_buffer (skey->d, 0, &rawmpilen, NULL); + if (!rawmpi) + { + rc = gpg_err_code_from_syserror (); + goto leave; + } + hvec[0].data = digest; + hvec[0].off = 0; + hvec[0].len = b > rawmpilen? 
b - rawmpilen : 0; + hvec[1].data = rawmpi; + hvec[1].off = 0; + hvec[1].len = rawmpilen; + rc = _gcry_md_hash_buffers (hashalgo, 0, digest, hvec, 2); + gcry_free (rawmpi); rawmpi = NULL; + if (rc) + goto leave; + + /* Compute the A value (this modifies DIGEST). */ + reverse_buffer (digest, 32); /* Only the first half of the hash. */ + digest[0] = (digest[0] & 0x7f) | 0x40; + digest[31] &= 0xf8; + _gcry_mpi_set_buffer (a, digest, 32, 0); + + /* Compute the public key if it has not been supplied as optional + parameter. */ + if (pk) + { + rc = _gcry_ecc_eddsa_decodepoint (pk, ctx, &Q, &encpk, &encpklen); + if (rc) + goto leave; + if (DBG_CIPHER) + log_printhex ("* e_pk", encpk, encpklen); + if (!_gcry_mpi_ec_curve_point (&Q, ctx)) + { + rc = GPG_ERR_BROKEN_PUBKEY; + goto leave; + } + } + else + { + _gcry_mpi_ec_mul_point (&Q, a, &skey->E.G, ctx); + rc = _gcry_ecc_eddsa_encodepoint (&Q, ctx, x, y, &encpk, &encpklen); + if (rc) + goto leave; + if (DBG_CIPHER) + log_printhex (" e_pk", encpk, encpklen); + } + + /* Compute R. */ + mbuf = gcry_mpi_get_opaque (input, &tmp); + mlen = (tmp +7)/8; + if (DBG_CIPHER) + log_printhex (" m", mbuf, mlen); + + hvec[0].data = digest; + hvec[0].off = 32; + hvec[0].len = 32; + hvec[1].data = (char*)mbuf; + hvec[1].len = mlen; + rc = _gcry_md_hash_buffers (hashalgo, 0, digest, hvec, 2); + if (rc) + goto leave; + reverse_buffer (digest, 64); + if (DBG_CIPHER) + log_printhex (" r", digest, 64); + _gcry_mpi_set_buffer (r, digest, 64, 0); + _gcry_mpi_ec_mul_point (&I, r, &skey->E.G, ctx); + if (DBG_CIPHER) + log_printpnt (" r", &I, ctx); + + /* Convert R into affine coordinates and apply encoding. 
*/ + rc = _gcry_ecc_eddsa_encodepoint (&I, ctx, x, y, &rawmpi, &rawmpilen); + if (rc) + goto leave; + if (DBG_CIPHER) + log_printhex (" e_r", rawmpi, rawmpilen); + + /* S = r + a * H(encodepoint(R) + encodepoint(pk) + m) mod n */ + hvec[0].data = rawmpi; /* (this is R) */ + hvec[0].off = 0; + hvec[0].len = rawmpilen; + hvec[1].data = encpk; + hvec[1].off = 0; + hvec[1].len = encpklen; + hvec[2].data = (char*)mbuf; + hvec[2].off = 0; + hvec[2].len = mlen; + rc = _gcry_md_hash_buffers (hashalgo, 0, digest, hvec, 3); + if (rc) + goto leave; + + /* No more need for RAWMPI thus we now transfer it to R_R. */ + gcry_mpi_set_opaque (r_r, rawmpi, rawmpilen*8); + rawmpi = NULL; + + reverse_buffer (digest, 64); + if (DBG_CIPHER) + log_printhex (" H(R+)", digest, 64); + _gcry_mpi_set_buffer (s, digest, 64, 0); + mpi_mulm (s, s, a, skey->E.n); + mpi_addm (s, s, r, skey->E.n); + rc = eddsa_encodempi (s, b, &rawmpi, &rawmpilen); + if (rc) + goto leave; + if (DBG_CIPHER) + log_printhex (" e_s", rawmpi, rawmpilen); + gcry_mpi_set_opaque (s, rawmpi, rawmpilen*8); + rawmpi = NULL; + + rc = 0; + + leave: + gcry_mpi_release (a); + gcry_mpi_release (x); + gcry_mpi_release (y); + gcry_mpi_release (r); + gcry_free (digest); + _gcry_mpi_ec_free (ctx); + point_free (&I); + point_free (&Q); + gcry_free (encpk); + gcry_free (rawmpi); + return rc; +} + + +/* Verify an EdDSA signature. See sign_eddsa for the reference. + * Check if R_IN and S_IN verifies INPUT. PKEY has the curve + * parameters and PK is the EdDSA style encoded public key. + */ +gpg_err_code_t +_gcry_ecc_eddsa_verify (gcry_mpi_t input, ECC_public_key *pkey, + gcry_mpi_t r_in, gcry_mpi_t s_in, int hashalgo, + gcry_mpi_t pk) +{ + int rc; + mpi_ec_t ctx = NULL; + int b; + unsigned int tmp; + mpi_point_struct Q; /* Public key. */ + unsigned char *encpk = NULL; /* Encoded public key. 
*/ + unsigned int encpklen; + const void *mbuf, *rbuf; + unsigned char *tbuf = NULL; + size_t mlen, rlen; + unsigned int tlen; + unsigned char digest[64]; + gcry_buffer_t hvec[3]; + gcry_mpi_t h, s; + mpi_point_struct Ia, Ib; + + if (!mpi_is_opaque (input) || !mpi_is_opaque (r_in) || !mpi_is_opaque (s_in)) + return GPG_ERR_INV_DATA; + if (hashalgo != GCRY_MD_SHA512) + return GPG_ERR_DIGEST_ALGO; + + point_init (&Q); + point_init (&Ia); + point_init (&Ib); + h = mpi_new (0); + s = mpi_new (0); + + ctx = _gcry_mpi_ec_p_internal_new (pkey->E.model, pkey->E.dialect, + pkey->E.p, pkey->E.a, pkey->E.b); + b = ctx->nbits/8; + if (b != 256/8) + return GPG_ERR_INTERNAL; /* We only support 256 bit. */ + + /* Decode and check the public key. */ + rc = _gcry_ecc_eddsa_decodepoint (pk, ctx, &Q, &encpk, &encpklen); + if (rc) + goto leave; + if (!_gcry_mpi_ec_curve_point (&Q, ctx)) + { + rc = GPG_ERR_BROKEN_PUBKEY; + goto leave; + } + if (DBG_CIPHER) + log_printhex (" e_pk", encpk, encpklen); + if (encpklen != b) + { + rc = GPG_ERR_INV_LENGTH; + goto leave; + } + + /* Convert the other input parameters. 
*/ + mbuf = gcry_mpi_get_opaque (input, &tmp); + mlen = (tmp +7)/8; + if (DBG_CIPHER) + log_printhex ("     m", mbuf, mlen); + rbuf = gcry_mpi_get_opaque (r_in, &tmp); + rlen = (tmp +7)/8; + if (DBG_CIPHER) + log_printhex ("     r", rbuf, rlen); + if (rlen != b) + { + rc = GPG_ERR_INV_LENGTH; + goto leave; + } + + /* h = H(encodepoint(R) + encodepoint(pk) + m) */ + hvec[0].data = (char*)rbuf; + hvec[0].off = 0; + hvec[0].len = rlen; + hvec[1].data = encpk; + hvec[1].off = 0; + hvec[1].len = encpklen; + hvec[2].data = (char*)mbuf; + hvec[2].off = 0; + hvec[2].len = mlen; + rc = _gcry_md_hash_buffers (hashalgo, 0, digest, hvec, 3); + if (rc) + goto leave; + reverse_buffer (digest, 64); + if (DBG_CIPHER) + log_printhex (" H(R+)", digest, 64); + _gcry_mpi_set_buffer (h, digest, 64, 0); + + /* According to the paper the best way for verification is: + encodepoint(sG - h·Q) = encodepoint(r) + because we don't need to decode R. */ + { + void *sbuf; + unsigned int slen; + + sbuf = _gcry_mpi_get_opaque_copy (s_in, &tmp); + slen = (tmp +7)/8; + reverse_buffer (sbuf, slen); + if (DBG_CIPHER) + log_printhex ("     s", sbuf, slen); + _gcry_mpi_set_buffer (s, sbuf, slen, 0); + gcry_free (sbuf); + if (slen != b) + { + rc = GPG_ERR_INV_LENGTH; + goto leave; + } + } + + _gcry_mpi_ec_mul_point (&Ia, s, &pkey->E.G, ctx); + _gcry_mpi_ec_mul_point (&Ib, h, &Q, ctx); + _gcry_mpi_neg (Ib.x, Ib.x); + _gcry_mpi_ec_add_points (&Ia, &Ia, &Ib, ctx); + rc = _gcry_ecc_eddsa_encodepoint (&Ia, ctx, s, h, &tbuf, &tlen); + if (rc) + goto leave; + if (tlen != rlen || memcmp (tbuf, rbuf, tlen)) + { + rc = GPG_ERR_BAD_SIGNATURE; + goto leave; + } + + rc = 0; + + leave: + gcry_free (encpk); + gcry_free (tbuf); + _gcry_mpi_ec_free (ctx); + gcry_mpi_release (s); + gcry_mpi_release (h); + point_free (&Ia); + point_free (&Ib); + point_free (&Q); + return rc; +} diff --git a/cipher/ecc-gost.c b/cipher/ecc-gost.c new file mode 100644 index 0000000..1ebfd39 --- /dev/null +++ b/cipher/ecc-gost.c @@ -0,0 +1,233 @@ +/* 
ecc-gost.c - Elliptic Curve GOST signatures + * Copyright (C) 2007, 2008, 2010, 2011 Free Software Foundation, Inc. + * Copyright (C) 2013 Dmitry Eremin-Solenikov + * + * This file is part of Libgcrypt. + * + * Libgcrypt is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as + * published by the Free Software Foundation; either version 2.1 of + * the License, or (at your option) any later version. + * + * Libgcrypt is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this program; if not, see . + */ + +#include +#include +#include +#include +#include + +#include "g10lib.h" +#include "mpi.h" +#include "cipher.h" +#include "context.h" +#include "ec-context.h" +#include "ecc-common.h" + + +/* Compute a GOST R 34.10-01/-12 signature. + * Return the signature struct (r,s) from the message hash.  The caller + * must have allocated R and S. + */ +gpg_err_code_t +_gcry_ecc_gost_sign (gcry_mpi_t input, ECC_secret_key *skey, + gcry_mpi_t r, gcry_mpi_t s) +{ + gpg_err_code_t err = 0; + gcry_mpi_t k, dr, sum, ke, x, e; + mpi_point_struct I; + gcry_mpi_t hash; + const void *abuf; + unsigned int abits, qbits; + mpi_ec_t ctx; + + if (DBG_CIPHER) + log_mpidump ("gost sign hash  ", input ); + + qbits = mpi_get_nbits (skey->E.n); + + /* Convert the INPUT into an MPI if needed. 
*/ + if (mpi_is_opaque (input)) + { + abuf = gcry_mpi_get_opaque (input, &abits); + err = gpg_err_code (gcry_mpi_scan (&hash, GCRYMPI_FMT_USG, + abuf, (abits+7)/8, NULL)); + if (err) + return err; + if (abits > qbits) + gcry_mpi_rshift (hash, hash, abits - qbits); + } + else + hash = input; + + + k = NULL; + dr = mpi_alloc (0); + sum = mpi_alloc (0); + ke = mpi_alloc (0); + e = mpi_alloc (0); + x = mpi_alloc (0); + point_init (&I); + + ctx = _gcry_mpi_ec_p_internal_new (skey->E.model, skey->E.dialect, + skey->E.p, skey->E.a, skey->E.b); + + mpi_mod (e, input, skey->E.n); /* e = hash mod n */ + + if (!mpi_cmp_ui (e, 0)) + mpi_set_ui (e, 1); + + /* Two loops to avoid R or S are zero. This is more of a joke than + a real demand because the probability of them being zero is less + than any hardware failure. Some specs however require it. */ + do + { + do + { + mpi_free (k); + k = _gcry_dsa_gen_k (skey->E.n, GCRY_STRONG_RANDOM); + + _gcry_mpi_ec_mul_point (&I, k, &skey->E.G, ctx); + if (_gcry_mpi_ec_get_affine (x, NULL, &I, ctx)) + { + if (DBG_CIPHER) + log_debug ("ecc sign: Failed to get affine coordinates\n"); + err = GPG_ERR_BAD_SIGNATURE; + goto leave; + } + mpi_mod (r, x, skey->E.n); /* r = x mod n */ + } + while (!mpi_cmp_ui (r, 0)); + mpi_mulm (dr, skey->d, r, skey->E.n); /* dr = d*r mod n */ + mpi_mulm (ke, k, e, skey->E.n); /* ke = k*e mod n */ + mpi_addm (s, ke, dr, skey->E.n); /* sum = (k*e+ d*r) mod n */ + } + while (!mpi_cmp_ui (s, 0)); + + if (DBG_CIPHER) + { + log_mpidump ("gost sign result r ", r); + log_mpidump ("gost sign result s ", s); + } + + leave: + _gcry_mpi_ec_free (ctx); + point_free (&I); + mpi_free (x); + mpi_free (e); + mpi_free (ke); + mpi_free (sum); + mpi_free (dr); + mpi_free (k); + + if (hash != input) + mpi_free (hash); + + return err; +} + + +/* Verify a GOST R 34.10-01/-12 signature. + * Check if R and S verifies INPUT. 
+ */ +gpg_err_code_t +_gcry_ecc_gost_verify (gcry_mpi_t input, ECC_public_key *pkey, + gcry_mpi_t r, gcry_mpi_t s) +{ + gpg_err_code_t err = 0; + gcry_mpi_t e, x, z1, z2, v, rv, zero; + mpi_point_struct Q, Q1, Q2; + mpi_ec_t ctx; + + if( !(mpi_cmp_ui (r, 0) > 0 && mpi_cmp (r, pkey->E.n) < 0) ) + return GPG_ERR_BAD_SIGNATURE; /* Assertion 0 < r < n failed. */ + if( !(mpi_cmp_ui (s, 0) > 0 && mpi_cmp (s, pkey->E.n) < 0) ) + return GPG_ERR_BAD_SIGNATURE; /* Assertion 0 < s < n failed. */ + + x = mpi_alloc (0); + e = mpi_alloc (0); + z1 = mpi_alloc (0); + z2 = mpi_alloc (0); + v = mpi_alloc (0); + rv = mpi_alloc (0); + zero = mpi_alloc (0); + + point_init (&Q); + point_init (&Q1); + point_init (&Q2); + + ctx = _gcry_mpi_ec_p_internal_new (pkey->E.model, pkey->E.dialect, + pkey->E.p, pkey->E.a, pkey->E.b); + + mpi_mod (e, input, pkey->E.n); /* e = hash mod n */ + if (!mpi_cmp_ui (e, 0)) + mpi_set_ui (e, 1); + mpi_invm (v, e, pkey->E.n); /* v = e^(-1) (mod n) */ + mpi_mulm (z1, s, v, pkey->E.n); /* z1 = s*v (mod n) */ + mpi_mulm (rv, r, v, pkey->E.n); /* rv = r*v (mod n) */ + mpi_subm (z2, zero, rv, pkey->E.n); /* z2 = -r*v (mod n) */ + + _gcry_mpi_ec_mul_point (&Q1, z1, &pkey->E.G, ctx); +/* log_mpidump ("Q1.x", Q1.x); */ +/* log_mpidump ("Q1.y", Q1.y); */ +/* log_mpidump ("Q1.z", Q1.z); */ + _gcry_mpi_ec_mul_point (&Q2, z2, &pkey->Q, ctx); +/* log_mpidump ("Q2.x", Q2.x); */ +/* log_mpidump ("Q2.y", Q2.y); */ +/* log_mpidump ("Q2.z", Q2.z); */ + _gcry_mpi_ec_add_points (&Q, &Q1, &Q2, ctx); +/* log_mpidump (" Q.x", Q.x); */ +/* log_mpidump (" Q.y", Q.y); */ +/* log_mpidump (" Q.z", Q.z); */ + + if (!mpi_cmp_ui (Q.z, 0)) + { + if (DBG_CIPHER) + log_debug ("ecc verify: Rejected\n"); + err = GPG_ERR_BAD_SIGNATURE; + goto leave; + } + if (_gcry_mpi_ec_get_affine (x, NULL, &Q, ctx)) + { + if (DBG_CIPHER) + log_debug ("ecc verify: Failed to get affine coordinates\n"); + err = GPG_ERR_BAD_SIGNATURE; + goto leave; + } + mpi_mod (x, x, pkey->E.n); /* x = x mod E_n */ + if
(mpi_cmp (x, r)) /* x != r */ + { + if (DBG_CIPHER) + { + log_mpidump (" x", x); + log_mpidump (" r", r); + log_mpidump (" s", s); + log_debug ("ecc verify: Not verified\n"); + } + err = GPG_ERR_BAD_SIGNATURE; + goto leave; + } + if (DBG_CIPHER) + log_debug ("ecc verify: Accepted\n"); + + leave: + _gcry_mpi_ec_free (ctx); + point_free (&Q2); + point_free (&Q1); + point_free (&Q); + mpi_free (zero); + mpi_free (rv); + mpi_free (v); + mpi_free (z2); + mpi_free (z1); + mpi_free (x); + mpi_free (e); + return err; +} diff --git a/cipher/ecc.c b/cipher/ecc.c index 2774718..dca0423 100644 --- a/cipher/ecc.c +++ b/cipher/ecc.c @@ -81,19 +81,10 @@ static void (*progress_cb) (void *, const char*, int, int, int); static void *progress_cb_data; -#define point_init(a) _gcry_mpi_point_init ((a)) -#define point_free(a) _gcry_mpi_point_free_parts ((a)) - /* Local prototypes. */ static void test_keys (ECC_secret_key * sk, unsigned int nbits); static int check_secret_key (ECC_secret_key * sk); -static gpg_err_code_t sign_ecdsa (gcry_mpi_t input, ECC_secret_key *skey, - gcry_mpi_t r, gcry_mpi_t s, - int flags, int hashalgo); -static gpg_err_code_t verify_ecdsa (gcry_mpi_t input, ECC_public_key *pkey, - gcry_mpi_t r, gcry_mpi_t s); - static gcry_mpi_t gen_y_2 (gcry_mpi_t x, elliptic_curve_t * base); static unsigned int ecc_get_nbits (gcry_sexp_t parms); @@ -261,10 +252,10 @@ test_keys (ECC_secret_key *sk, unsigned int nbits) gcry_mpi_randomize (test, nbits, GCRY_WEAK_RANDOM); - if (sign_ecdsa (test, sk, r, s, 0, 0) ) + if (_gcry_ecc_ecdsa_sign (test, sk, r, s, 0, 0) ) log_fatal ("ECDSA operation: sign failed\n"); - if (verify_ecdsa (test, &pk, r, s)) + if (_gcry_ecc_ecdsa_verify (test, &pk, r, s)) { log_fatal ("ECDSA operation: sign, verify failed\n"); } @@ -389,1052 +380,6 @@ check_secret_key (ECC_secret_key * sk) } -/* Compute an ECDSA signature. - * Return the signature struct (r,s) from the message hash. The caller - * must have allocated R and S. 
- */ -static gpg_err_code_t -sign_ecdsa (gcry_mpi_t input, ECC_secret_key *skey, gcry_mpi_t r, gcry_mpi_t s, - int flags, int hashalgo) -{ - gpg_err_code_t err = 0; - int extraloops = 0; - gcry_mpi_t k, dr, sum, k_1, x; - mpi_point_struct I; - gcry_mpi_t hash; - const void *abuf; - unsigned int abits, qbits; - mpi_ec_t ctx; - - if (DBG_CIPHER) - log_mpidump ("ecdsa sign hash ", input ); - - qbits = mpi_get_nbits (skey->E.n); - - /* Convert the INPUT into an MPI if needed. */ - if (mpi_is_opaque (input)) - { - abuf = gcry_mpi_get_opaque (input, &abits); - err = gpg_err_code (gcry_mpi_scan (&hash, GCRYMPI_FMT_USG, - abuf, (abits+7)/8, NULL)); - if (err) - return err; - if (abits > qbits) - gcry_mpi_rshift (hash, hash, abits - qbits); - } - else - hash = input; - - - k = NULL; - dr = mpi_alloc (0); - sum = mpi_alloc (0); - k_1 = mpi_alloc (0); - x = mpi_alloc (0); - point_init (&I); - - ctx = _gcry_mpi_ec_p_internal_new (skey->E.model, skey->E.dialect, - skey->E.p, skey->E.a, skey->E.b); - - /* Two loops to avoid R or S are zero. This is more of a joke than - a real demand because the probability of them being zero is less - than any hardware failure. Some specs however require it. */ - do - { - do - { - mpi_free (k); - k = NULL; - if ((flags & PUBKEY_FLAG_RFC6979) && hashalgo) - { - /* Use Pornin's method for deterministic DSA. If this - flag is set, it is expected that HASH is an opaque - MPI with the to be signed hash. That hash is also - used as h1 from 3.2.a. 
*/ - if (!mpi_is_opaque (input)) - { - err = GPG_ERR_CONFLICT; - goto leave; - } - - abuf = gcry_mpi_get_opaque (input, &abits); - err = _gcry_dsa_gen_rfc6979_k (&k, skey->E.n, skey->d, - abuf, (abits+7)/8, - hashalgo, extraloops); - if (err) - goto leave; - extraloops++; - } - else - k = _gcry_dsa_gen_k (skey->E.n, GCRY_STRONG_RANDOM); - - _gcry_mpi_ec_mul_point (&I, k, &skey->E.G, ctx); - if (_gcry_mpi_ec_get_affine (x, NULL, &I, ctx)) - { - if (DBG_CIPHER) - log_debug ("ecc sign: Failed to get affine coordinates\n"); - err = GPG_ERR_BAD_SIGNATURE; - goto leave; - } - mpi_mod (r, x, skey->E.n); /* r = x mod n */ - } - while (!mpi_cmp_ui (r, 0)); - - mpi_mulm (dr, skey->d, r, skey->E.n); /* dr = d*r mod n */ - mpi_addm (sum, hash, dr, skey->E.n); /* sum = hash + (d*r) mod n */ - mpi_invm (k_1, k, skey->E.n); /* k_1 = k^(-1) mod n */ - mpi_mulm (s, k_1, sum, skey->E.n); /* s = k^(-1)*(hash+(d*r)) mod n */ - } - while (!mpi_cmp_ui (s, 0)); - - if (DBG_CIPHER) - { - log_mpidump ("ecdsa sign result r ", r); - log_mpidump ("ecdsa sign result s ", s); - } - - leave: - _gcry_mpi_ec_free (ctx); - point_free (&I); - mpi_free (x); - mpi_free (k_1); - mpi_free (sum); - mpi_free (dr); - mpi_free (k); - - if (hash != input) - mpi_free (hash); - - return err; -} - - -/* Verify an ECDSA signature. - * Check if R and S verifies INPUT. - */ -static gpg_err_code_t -verify_ecdsa (gcry_mpi_t input, ECC_public_key *pkey, - gcry_mpi_t r, gcry_mpi_t s) -{ - gpg_err_code_t err = 0; - gcry_mpi_t h, h1, h2, x; - mpi_point_struct Q, Q1, Q2; - mpi_ec_t ctx; - - if( !(mpi_cmp_ui (r, 0) > 0 && mpi_cmp (r, pkey->E.n) < 0) ) - return GPG_ERR_BAD_SIGNATURE; /* Assertion 0 < r < n failed. */ - if( !(mpi_cmp_ui (s, 0) > 0 && mpi_cmp (s, pkey->E.n) < 0) ) - return GPG_ERR_BAD_SIGNATURE; /* Assertion 0 < s < n failed. 
*/ - - h = mpi_alloc (0); - h1 = mpi_alloc (0); - h2 = mpi_alloc (0); - x = mpi_alloc (0); - point_init (&Q); - point_init (&Q1); - point_init (&Q2); - - ctx = _gcry_mpi_ec_p_internal_new (pkey->E.model, pkey->E.dialect, - pkey->E.p, pkey->E.a, pkey->E.b); - - /* h = s^(-1) (mod n) */ - mpi_invm (h, s, pkey->E.n); - /* h1 = hash * s^(-1) (mod n) */ - mpi_mulm (h1, input, h, pkey->E.n); - /* Q1 = [ hash * s^(-1) ]G */ - _gcry_mpi_ec_mul_point (&Q1, h1, &pkey->E.G, ctx); - /* h2 = r * s^(-1) (mod n) */ - mpi_mulm (h2, r, h, pkey->E.n); - /* Q2 = [ r * s^(-1) ]Q */ - _gcry_mpi_ec_mul_point (&Q2, h2, &pkey->Q, ctx); - /* Q = ([hash * s^(-1)]G) + ([r * s^(-1)]Q) */ - _gcry_mpi_ec_add_points (&Q, &Q1, &Q2, ctx); - - if (!mpi_cmp_ui (Q.z, 0)) - { - if (DBG_CIPHER) - log_debug ("ecc verify: Rejected\n"); - err = GPG_ERR_BAD_SIGNATURE; - goto leave; - } - if (_gcry_mpi_ec_get_affine (x, NULL, &Q, ctx)) - { - if (DBG_CIPHER) - log_debug ("ecc verify: Failed to get affine coordinates\n"); - err = GPG_ERR_BAD_SIGNATURE; - goto leave; - } - mpi_mod (x, x, pkey->E.n); /* x = x mod E_n */ - if (mpi_cmp (x, r)) /* x != r */ - { - if (DBG_CIPHER) - { - log_mpidump (" x", x); - log_mpidump (" r", r); - log_mpidump (" s", s); - } - err = GPG_ERR_BAD_SIGNATURE; - goto leave; - } - - leave: - _gcry_mpi_ec_free (ctx); - point_free (&Q2); - point_free (&Q1); - point_free (&Q); - mpi_free (x); - mpi_free (h2); - mpi_free (h1); - mpi_free (h); - return err; -} - -/* Compute an GOST R 34.10-01/-12 signature. - * Return the signature struct (r,s) from the message hash. The caller - * must have allocated R and S. 
- */ -static gpg_err_code_t -sign_gost (gcry_mpi_t input, ECC_secret_key *skey, gcry_mpi_t r, gcry_mpi_t s) -{ - gpg_err_code_t err = 0; - gcry_mpi_t k, dr, sum, ke, x, e; - mpi_point_struct I; - gcry_mpi_t hash; - const void *abuf; - unsigned int abits, qbits; - mpi_ec_t ctx; - - if (DBG_CIPHER) - log_mpidump ("gost sign hash ", input ); - - qbits = mpi_get_nbits (skey->E.n); - - /* Convert the INPUT into an MPI if needed. */ - if (mpi_is_opaque (input)) - { - abuf = gcry_mpi_get_opaque (input, &abits); - err = gpg_err_code (gcry_mpi_scan (&hash, GCRYMPI_FMT_USG, - abuf, (abits+7)/8, NULL)); - if (err) - return err; - if (abits > qbits) - gcry_mpi_rshift (hash, hash, abits - qbits); - } - else - hash = input; - - - k = NULL; - dr = mpi_alloc (0); - sum = mpi_alloc (0); - ke = mpi_alloc (0); - e = mpi_alloc (0); - x = mpi_alloc (0); - point_init (&I); - - ctx = _gcry_mpi_ec_p_internal_new (skey->E.model, skey->E.dialect, - skey->E.p, skey->E.a, skey->E.b); - - mpi_mod (e, input, skey->E.n); /* e = hash mod n */ - - if (!mpi_cmp_ui (e, 0)) - mpi_set_ui (e, 1); - - /* Two loops to avoid R or S are zero. This is more of a joke than - a real demand because the probability of them being zero is less - than any hardware failure. Some specs however require it. 
*/ - do - { - do - { - mpi_free (k); - k = _gcry_dsa_gen_k (skey->E.n, GCRY_STRONG_RANDOM); - - _gcry_mpi_ec_mul_point (&I, k, &skey->E.G, ctx); - if (_gcry_mpi_ec_get_affine (x, NULL, &I, ctx)) - { - if (DBG_CIPHER) - log_debug ("ecc sign: Failed to get affine coordinates\n"); - err = GPG_ERR_BAD_SIGNATURE; - goto leave; - } - mpi_mod (r, x, skey->E.n); /* r = x mod n */ - } - while (!mpi_cmp_ui (r, 0)); - mpi_mulm (dr, skey->d, r, skey->E.n); /* dr = d*r mod n */ - mpi_mulm (ke, k, e, skey->E.n); /* ke = k*e mod n */ - mpi_addm (s, ke, dr, skey->E.n); /* sum = (k*e+ d*r) mod n */ - } - while (!mpi_cmp_ui (s, 0)); - - if (DBG_CIPHER) - { - log_mpidump ("gost sign result r ", r); - log_mpidump ("gost sign result s ", s); - } - - leave: - _gcry_mpi_ec_free (ctx); - point_free (&I); - mpi_free (x); - mpi_free (e); - mpi_free (ke); - mpi_free (sum); - mpi_free (dr); - mpi_free (k); - - if (hash != input) - mpi_free (hash); - - return err; -} - -/* Verify a GOST R 34.10-01/-12 signature. - * Check if R and S verifies INPUT. - */ -static gpg_err_code_t -verify_gost (gcry_mpi_t input, ECC_public_key *pkey, - gcry_mpi_t r, gcry_mpi_t s) -{ - gpg_err_code_t err = 0; - gcry_mpi_t e, x, z1, z2, v, rv, zero; - mpi_point_struct Q, Q1, Q2; - mpi_ec_t ctx; - - if( !(mpi_cmp_ui (r, 0) > 0 && mpi_cmp (r, pkey->E.n) < 0) ) - return GPG_ERR_BAD_SIGNATURE; /* Assertion 0 < r < n failed. */ - if( !(mpi_cmp_ui (s, 0) > 0 && mpi_cmp (s, pkey->E.n) < 0) ) - return GPG_ERR_BAD_SIGNATURE; /* Assertion 0 < s < n failed. 
*/ - - x = mpi_alloc (0); - e = mpi_alloc (0); - z1 = mpi_alloc (0); - z2 = mpi_alloc (0); - v = mpi_alloc (0); - rv = mpi_alloc (0); - zero = mpi_alloc (0); - - point_init (&Q); - point_init (&Q1); - point_init (&Q2); - - ctx = _gcry_mpi_ec_p_internal_new (pkey->E.model, pkey->E.dialect, - pkey->E.p, pkey->E.a, pkey->E.b); - - mpi_mod (e, input, pkey->E.n); /* e = hash mod n */ - if (!mpi_cmp_ui (e, 0)) - mpi_set_ui (e, 1); - mpi_invm (v, e, pkey->E.n); /* v = e^(-1) (mod n) */ - mpi_mulm (z1, s, v, pkey->E.n); /* z1 = s*v (mod n) */ - mpi_mulm (rv, r, v, pkey->E.n); /* rv = s*v (mod n) */ - mpi_subm (z2, zero, rv, pkey->E.n); /* z2 = -r*v (mod n) */ - - _gcry_mpi_ec_mul_point (&Q1, z1, &pkey->E.G, ctx); -/* log_mpidump ("Q1.x", Q1.x); */ -/* log_mpidump ("Q1.y", Q1.y); */ -/* log_mpidump ("Q1.z", Q1.z); */ - _gcry_mpi_ec_mul_point (&Q2, z2, &pkey->Q, ctx); -/* log_mpidump ("Q2.x", Q2.x); */ -/* log_mpidump ("Q2.y", Q2.y); */ -/* log_mpidump ("Q2.z", Q2.z); */ - _gcry_mpi_ec_add_points (&Q, &Q1, &Q2, ctx); -/* log_mpidump (" Q.x", Q.x); */ -/* log_mpidump (" Q.y", Q.y); */ -/* log_mpidump (" Q.z", Q.z); */ - - if (!mpi_cmp_ui (Q.z, 0)) - { - if (DBG_CIPHER) - log_debug ("ecc verify: Rejected\n"); - err = GPG_ERR_BAD_SIGNATURE; - goto leave; - } - if (_gcry_mpi_ec_get_affine (x, NULL, &Q, ctx)) - { - if (DBG_CIPHER) - log_debug ("ecc verify: Failed to get affine coordinates\n"); - err = GPG_ERR_BAD_SIGNATURE; - goto leave; - } - mpi_mod (x, x, pkey->E.n); /* x = x mod E_n */ - if (mpi_cmp (x, r)) /* x != r */ - { - if (DBG_CIPHER) - { - log_mpidump (" x", x); - log_mpidump (" r", r); - log_mpidump (" s", s); - log_debug ("ecc verify: Not verified\n"); - } - err = GPG_ERR_BAD_SIGNATURE; - goto leave; - } - if (DBG_CIPHER) - log_debug ("ecc verify: Accepted\n"); - - leave: - _gcry_mpi_ec_free (ctx); - point_free (&Q2); - point_free (&Q1); - point_free (&Q); - mpi_free (zero); - mpi_free (rv); - mpi_free (v); - mpi_free (z2); - mpi_free (z1); - mpi_free (x); - 
mpi_free (e); - return err; -} - - -static void -reverse_buffer (unsigned char *buffer, unsigned int length) -{ - unsigned int tmp, i; - - for (i=0; i < length/2; i++) - { - tmp = buffer[i]; - buffer[i] = buffer[length-1-i]; - buffer[length-1-i] = tmp; - } -} - - -/* Encode MPI using the EdDSA scheme. MINLEN specifies the required - length of the buffer in bytes. On success 0 is returned an a - malloced buffer with the encoded point is stored at R_BUFFER; the - length of this buffer is stored at R_BUFLEN. */ -static gpg_err_code_t -eddsa_encodempi (gcry_mpi_t mpi, unsigned int minlen, - unsigned char **r_buffer, unsigned int *r_buflen) -{ - unsigned char *rawmpi; - unsigned int rawmpilen; - - rawmpi = _gcry_mpi_get_buffer (mpi, minlen, &rawmpilen, NULL); - if (!rawmpi) - return gpg_err_code_from_syserror (); - - *r_buffer = rawmpi; - *r_buflen = rawmpilen; - return 0; -} - - -/* Encode (X,Y) using the EdDSA scheme. MINLEN is the required length - in bytes for the result. On success 0 is returned and a malloced - buffer with the encoded point is stored at R_BUFFER; the length of - this buffer is stored at R_BUFLEN. */ -static gpg_err_code_t -eddsa_encode_x_y (gcry_mpi_t x, gcry_mpi_t y, unsigned int minlen, - unsigned char **r_buffer, unsigned int *r_buflen) -{ - unsigned char *rawmpi; - unsigned int rawmpilen; - - rawmpi = _gcry_mpi_get_buffer (y, minlen, &rawmpilen, NULL); - if (!rawmpi) - return gpg_err_code_from_syserror (); - if (mpi_test_bit (x, 0) && rawmpilen) - rawmpi[rawmpilen - 1] |= 0x80; /* Set sign bit. */ - - *r_buffer = rawmpi; - *r_buflen = rawmpilen; - return 0; -} - -/* Encode POINT using the EdDSA scheme. X and Y are either scratch - variables supplied by the caller or NULL. CTX is the usual - context. On success 0 is returned and a malloced buffer with the - encoded point is stored at R_BUFFER; the length of this buffer is - stored at R_BUFLEN. 
*/ -gpg_err_code_t -_gcry_ecc_eddsa_encodepoint (mpi_point_t point, mpi_ec_t ec, - gcry_mpi_t x_in, gcry_mpi_t y_in, - unsigned char **r_buffer, unsigned int *r_buflen) -{ - gpg_err_code_t rc; - gcry_mpi_t x, y; - - x = x_in? x_in : mpi_new (0); - y = y_in? y_in : mpi_new (0); - - if (_gcry_mpi_ec_get_affine (x, y, point, ec)) - { - log_error ("eddsa_encodepoint: Failed to get affine coordinates\n"); - rc = GPG_ERR_INTERNAL; - } - else - rc = eddsa_encode_x_y (x, y, ec->nbits/8, r_buffer, r_buflen); - - if (!x_in) - mpi_free (x); - if (!y_in) - mpi_free (y); - return rc; -} - - -/* Decode the EdDSA style encoded PK and set it into RESULT. CTX is - the usual curve context. If R_ENCPK is not NULL, the encoded PK is - stored at that address; this is a new copy to be released by the - caller. In contrast to the supplied PK, this is not an MPI and - thus guarnateed to be properly padded. R_ENCPKLEN received the - length of that encoded key. */ -gpg_err_code_t -_gcry_ecc_eddsa_decodepoint (gcry_mpi_t pk, mpi_ec_t ctx, mpi_point_t result, - unsigned char **r_encpk, unsigned int *r_encpklen) -{ - gpg_err_code_t rc; - unsigned char *rawmpi; - unsigned int rawmpilen; - gcry_mpi_t yy, t, x, p1, p2, p3; - int sign; - - if (mpi_is_opaque (pk)) - { - const unsigned char *buf; - - buf = gcry_mpi_get_opaque (pk, &rawmpilen); - if (!buf) - return GPG_ERR_INV_OBJ; - rawmpilen = (rawmpilen + 7)/8; - - /* First check whether the public key has been given in standard - uncompressed format. No need to recover x in this case. - Detection is easy: The size of the buffer will be odd and the - first byte be 0x04. 
*/ - if (rawmpilen > 1 && buf[0] == 0x04 && (rawmpilen%2)) - { - gcry_mpi_t y; - - rc = gcry_mpi_scan (&x, GCRYMPI_FMT_STD, - buf+1, (rawmpilen-1)/2, NULL); - if (rc) - return rc; - rc = gcry_mpi_scan (&y, GCRYMPI_FMT_STD, - buf+1+(rawmpilen-1)/2, (rawmpilen-1)/2, NULL); - if (rc) - { - mpi_free (x); - return rc; - } - - if (r_encpk) - { - rc = eddsa_encode_x_y (x, y, ctx->nbits/8, r_encpk, r_encpklen); - if (rc) - { - mpi_free (x); - mpi_free (y); - return rc; - } - } - mpi_snatch (result->x, x); - mpi_snatch (result->y, y); - mpi_set_ui (result->z, 1); - return 0; - } - - /* EdDSA compressed point. */ - rawmpi = gcry_malloc (rawmpilen? rawmpilen:1); - if (!rawmpi) - return gpg_err_code_from_syserror (); - memcpy (rawmpi, buf, rawmpilen); - reverse_buffer (rawmpi, rawmpilen); - } - else - { - /* Note: Without using an opaque MPI it is not reliably possible - to find out whether the public key has been given in - uncompressed format. Thus we expect EdDSA format here. */ - rawmpi = _gcry_mpi_get_buffer (pk, ctx->nbits/8, &rawmpilen, NULL); - if (!rawmpi) - return gpg_err_code_from_syserror (); - } - - if (rawmpilen) - { - sign = !!(rawmpi[0] & 0x80); - rawmpi[0] &= 0x7f; - } - else - sign = 0; - _gcry_mpi_set_buffer (result->y, rawmpi, rawmpilen, 0); - if (r_encpk) - { - /* Revert to little endian. */ - if (sign && rawmpilen) - rawmpi[0] |= 0x80; - reverse_buffer (rawmpi, rawmpilen); - *r_encpk = rawmpi; - if (r_encpklen) - *r_encpklen = rawmpilen; - } - else - gcry_free (rawmpi); - - /* Now recover X. */ - /* t = (y^2-1) ·
((b*y^2+1)^{p-2} mod p) */ - x = mpi_new (0); - yy = mpi_new (0); - mpi_mul (yy, result->y, result->y); - t = mpi_copy (yy); - mpi_mul (t, t, ctx->b); - mpi_add_ui (t, t, 1); - p2 = mpi_copy (ctx->p); - mpi_sub_ui (p2, p2, 2); - mpi_powm (t, t, p2, ctx->p); - - mpi_sub_ui (yy, yy, 1); - mpi_mul (t, yy, t); - - /* x = t^{(p+3)/8} mod p */ - p3 = mpi_copy (ctx->p); - mpi_add_ui (p3, p3, 3); - mpi_fdiv_q (p3, p3, mpi_const (MPI_C_EIGHT)); - mpi_powm (x, t, p3, ctx->p); - - /* (x^2 - t) % p != 0 ? x = (x*(2^{(p-1)/4} mod p)) % p */ - mpi_mul (yy, x, x); - mpi_subm (yy, yy, t, ctx->p); - if (mpi_cmp_ui (yy, 0)) - { - p1 = mpi_copy (ctx->p); - mpi_sub_ui (p1, p1, 1); - mpi_fdiv_q (p1, p1, mpi_const (MPI_C_FOUR)); - mpi_powm (yy, mpi_const (MPI_C_TWO), p1, ctx->p); - mpi_mulm (x, x, yy, ctx->p); - } - else - p1 = NULL; - - /* is_odd(x) ? x = p-x */ - if (mpi_test_bit (x, 0)) - mpi_sub (x, ctx->p, x); - - /* lowbit(x) != highbit(input) ? x = p-x */ - if (mpi_test_bit (x, 0) != sign) - mpi_sub (x, ctx->p, x); - - mpi_set (result->x, x); - mpi_set_ui (result->z, 1); - - gcry_mpi_release (x); - gcry_mpi_release (yy); - gcry_mpi_release (t); - gcry_mpi_release (p3); - gcry_mpi_release (p2); - gcry_mpi_release (p1); - - return 0; -} - - -/* Ed25519 version of the key generation. */ -static gpg_err_code_t -eddsa_generate_key (ECC_secret_key *sk, elliptic_curve_t *E, mpi_ec_t ctx, - gcry_random_level_t random_level) -{ - gpg_err_code_t rc; - int b = 256/8; /* The only size we currently support. */ - gcry_mpi_t a, x, y; - mpi_point_struct Q; - char *dbuf; - size_t dlen; - gcry_buffer_t hvec[1]; - unsigned char *hash_d = NULL; - - point_init (&Q); - memset (hvec, 0, sizeof hvec); - - a = mpi_snew (0); - x = mpi_new (0); - y = mpi_new (0); - - /* Generate a secret. */ - hash_d = gcry_malloc_secure (2*b); - if (!hash_d) - { - rc = gpg_error_from_syserror (); - goto leave; - } - dlen = b; - dbuf = gcry_random_bytes_secure (dlen, random_level); - - /* Compute the A value. 
*/ - hvec[0].data = dbuf; - hvec[0].len = dlen; - rc = _gcry_md_hash_buffers (GCRY_MD_SHA512, 0, hash_d, hvec, 1); - if (rc) - goto leave; - sk->d = _gcry_mpi_set_opaque (NULL, dbuf, dlen*8); - dbuf = NULL; - reverse_buffer (hash_d, 32); /* Only the first half of the hash. */ - hash_d[0] = (hash_d[0] & 0x7f) | 0x40; - hash_d[31] &= 0xf8; - _gcry_mpi_set_buffer (a, hash_d, 32, 0); - gcry_free (hash_d); hash_d = NULL; - /* log_printmpi ("ecgen a", a); */ - - /* Compute Q. */ - _gcry_mpi_ec_mul_point (&Q, a, &E->G, ctx); - if (DBG_CIPHER) - log_printpnt ("ecgen pk", &Q, ctx); - - /* Copy the stuff to the key structures. */ - sk->E.model = E->model; - sk->E.dialect = E->dialect; - sk->E.p = mpi_copy (E->p); - sk->E.a = mpi_copy (E->a); - sk->E.b = mpi_copy (E->b); - point_init (&sk->E.G); - point_set (&sk->E.G, &E->G); - sk->E.n = mpi_copy (E->n); - point_init (&sk->Q); - point_set (&sk->Q, &Q); - - leave: - gcry_mpi_release (a); - gcry_mpi_release (x); - gcry_mpi_release (y); - gcry_free (hash_d); - return rc; -} - - -/* Compute an EdDSA signature. See: - * [ed25519] 23pp. (PDF) Daniel J. Bernstein, Niels Duif, Tanja - * Lange, Peter Schwabe, Bo-Yin Yang. High-speed high-security - * signatures. Journal of Cryptographic Engineering 2 (2012), 77-89. - * Document ID: a1a62a2f76d23f65d622484ddd09caf8. - * URL: http://cr.yp.to/papers.html#ed25519. Date: 2011.09.26. - * - * Despite that this function requires the specification of a hash - * algorithm, we only support what has been specified by the paper. - * This may change in the future. Note that we don't check the used - * curve; the user is responsible to use Ed25519. - * - * Return the signature struct (r,s) from the message hash. The caller - * must have allocated R_R and S. 
- */ -static gpg_err_code_t -sign_eddsa (gcry_mpi_t input, ECC_secret_key *skey, - gcry_mpi_t r_r, gcry_mpi_t s, int hashalgo, gcry_mpi_t pk) -{ - int rc; - mpi_ec_t ctx = NULL; - int b; - unsigned int tmp; - unsigned char *digest; - gcry_buffer_t hvec[3]; - const void *mbuf; - size_t mlen; - unsigned char *rawmpi = NULL; - unsigned int rawmpilen; - unsigned char *encpk = NULL; /* Encoded public key. */ - unsigned int encpklen; - mpi_point_struct I; /* Intermediate value. */ - mpi_point_struct Q; /* Public key. */ - gcry_mpi_t a, x, y, r; - - memset (hvec, 0, sizeof hvec); - - if (!mpi_is_opaque (input)) - return GPG_ERR_INV_DATA; - if (hashalgo != GCRY_MD_SHA512) - return GPG_ERR_DIGEST_ALGO; - - /* Initialize some helpers. */ - point_init (&I); - point_init (&Q); - a = mpi_snew (0); - x = mpi_new (0); - y = mpi_new (0); - r = mpi_new (0); - ctx = _gcry_mpi_ec_p_internal_new (skey->E.model, skey->E.dialect, - skey->E.p, skey->E.a, skey->E.b); - b = (ctx->nbits+7)/8; - if (b != 256/8) - return GPG_ERR_INTERNAL; /* We only support 256 bit. */ - - digest = gcry_calloc_secure (2, b); - if (!digest) - { - rc = gpg_err_code_from_syserror (); - goto leave; - } - - /* Hash the secret key. We clear DIGEST so we can use it as input - to left pad the key with zeroes for hashing. */ - rawmpi = _gcry_mpi_get_buffer (skey->d, 0, &rawmpilen, NULL); - if (!rawmpi) - { - rc = gpg_err_code_from_syserror (); - goto leave; - } - hvec[0].data = digest; - hvec[0].off = 0; - hvec[0].len = b > rawmpilen? b - rawmpilen : 0; - hvec[1].data = rawmpi; - hvec[1].off = 0; - hvec[1].len = rawmpilen; - rc = _gcry_md_hash_buffers (hashalgo, 0, digest, hvec, 2); - gcry_free (rawmpi); rawmpi = NULL; - if (rc) - goto leave; - - /* Compute the A value (this modifies DIGEST). */ - reverse_buffer (digest, 32); /* Only the first half of the hash. 
*/ - digest[0] = (digest[0] & 0x7f) | 0x40; - digest[31] &= 0xf8; - _gcry_mpi_set_buffer (a, digest, 32, 0); - - /* Compute the public key if it has not been supplied as optional - parameter. */ - if (pk) - { - rc = _gcry_ecc_eddsa_decodepoint (pk, ctx, &Q, &encpk, &encpklen); - if (rc) - goto leave; - if (DBG_CIPHER) - log_printhex ("* e_pk", encpk, encpklen); - if (!_gcry_mpi_ec_curve_point (&Q, ctx)) - { - rc = GPG_ERR_BROKEN_PUBKEY; - goto leave; - } - } - else - { - _gcry_mpi_ec_mul_point (&Q, a, &skey->E.G, ctx); - rc = _gcry_ecc_eddsa_encodepoint (&Q, ctx, x, y, &encpk, &encpklen); - if (rc) - goto leave; - if (DBG_CIPHER) - log_printhex (" e_pk", encpk, encpklen); - } - - /* Compute R. */ - mbuf = gcry_mpi_get_opaque (input, &tmp); - mlen = (tmp +7)/8; - if (DBG_CIPHER) - log_printhex (" m", mbuf, mlen); - - hvec[0].data = digest; - hvec[0].off = 32; - hvec[0].len = 32; - hvec[1].data = (char*)mbuf; - hvec[1].len = mlen; - rc = _gcry_md_hash_buffers (hashalgo, 0, digest, hvec, 2); - if (rc) - goto leave; - reverse_buffer (digest, 64); - if (DBG_CIPHER) - log_printhex (" r", digest, 64); - _gcry_mpi_set_buffer (r, digest, 64, 0); - _gcry_mpi_ec_mul_point (&I, r, &skey->E.G, ctx); - if (DBG_CIPHER) - log_printpnt (" r", &I, ctx); - - /* Convert R into affine coordinates and apply encoding. */ - rc = _gcry_ecc_eddsa_encodepoint (&I, ctx, x, y, &rawmpi, &rawmpilen); - if (rc) - goto leave; - if (DBG_CIPHER) - log_printhex (" e_r", rawmpi, rawmpilen); - - /* S = r + a * H(encodepoint(R) + encodepoint(pk) + m) mod n */ - hvec[0].data = rawmpi; /* (this is R) */ - hvec[0].off = 0; - hvec[0].len = rawmpilen; - hvec[1].data = encpk; - hvec[1].off = 0; - hvec[1].len = encpklen; - hvec[2].data = (char*)mbuf; - hvec[2].off = 0; - hvec[2].len = mlen; - rc = _gcry_md_hash_buffers (hashalgo, 0, digest, hvec, 3); - if (rc) - goto leave; - - /* No more need for RAWMPI thus we now transfer it to R_R. 
*/ - gcry_mpi_set_opaque (r_r, rawmpi, rawmpilen*8); - rawmpi = NULL; - - reverse_buffer (digest, 64); - if (DBG_CIPHER) - log_printhex (" H(R+)", digest, 64); - _gcry_mpi_set_buffer (s, digest, 64, 0); - mpi_mulm (s, s, a, skey->E.n); - mpi_addm (s, s, r, skey->E.n); - rc = eddsa_encodempi (s, b, &rawmpi, &rawmpilen); - if (rc) - goto leave; - if (DBG_CIPHER) - log_printhex (" e_s", rawmpi, rawmpilen); - gcry_mpi_set_opaque (s, rawmpi, rawmpilen*8); - rawmpi = NULL; - - rc = 0; - - leave: - gcry_mpi_release (a); - gcry_mpi_release (x); - gcry_mpi_release (y); - gcry_mpi_release (r); - gcry_free (digest); - _gcry_mpi_ec_free (ctx); - point_free (&I); - point_free (&Q); - gcry_free (encpk); - gcry_free (rawmpi); - return rc; -} - - -/* Verify an EdDSA signature. See sign_eddsa for the reference. - * Check if R_IN and S_IN verifies INPUT. PKEY has the curve - * parameters and PK is the EdDSA style encoded public key. - */ -static gpg_err_code_t -verify_eddsa (gcry_mpi_t input, ECC_public_key *pkey, - gcry_mpi_t r_in, gcry_mpi_t s_in, int hashalgo, gcry_mpi_t pk) -{ - int rc; - mpi_ec_t ctx = NULL; - int b; - unsigned int tmp; - mpi_point_struct Q; /* Public key. */ - unsigned char *encpk = NULL; /* Encoded public key. */ - unsigned int encpklen; - const void *mbuf, *rbuf; - unsigned char *tbuf = NULL; - size_t mlen, rlen; - unsigned int tlen; - unsigned char digest[64]; - gcry_buffer_t hvec[3]; - gcry_mpi_t h, s; - mpi_point_struct Ia, Ib; - - if (!mpi_is_opaque (input) || !mpi_is_opaque (r_in) || !mpi_is_opaque (s_in)) - return GPG_ERR_INV_DATA; - if (hashalgo != GCRY_MD_SHA512) - return GPG_ERR_DIGEST_ALGO; - - point_init (&Q); - point_init (&Ia); - point_init (&Ib); - h = mpi_new (0); - s = mpi_new (0); - - ctx = _gcry_mpi_ec_p_internal_new (pkey->E.model, pkey->E.dialect, - pkey->E.p, pkey->E.a, pkey->E.b); - b = ctx->nbits/8; - if (b != 256/8) - return GPG_ERR_INTERNAL; /* We only support 256 bit. */ - - /* Decode and check the public key. 
*/ - rc = _gcry_ecc_eddsa_decodepoint (pk, ctx, &Q, &encpk, &encpklen); - if (rc) - goto leave; - if (!_gcry_mpi_ec_curve_point (&Q, ctx)) - { - rc = GPG_ERR_BROKEN_PUBKEY; - goto leave; - } - if (DBG_CIPHER) - log_printhex (" e_pk", encpk, encpklen); - if (encpklen != b) - { - rc = GPG_ERR_INV_LENGTH; - goto leave; - } - - /* Convert the other input parameters. */ - mbuf = gcry_mpi_get_opaque (input, &tmp); - mlen = (tmp +7)/8; - if (DBG_CIPHER) - log_printhex (" m", mbuf, mlen); - rbuf = gcry_mpi_get_opaque (r_in, &tmp); - rlen = (tmp +7)/8; - if (DBG_CIPHER) - log_printhex (" r", rbuf, rlen); - if (rlen != b) - { - rc = GPG_ERR_INV_LENGTH; - goto leave; - } - - /* h = H(encodepoint(R) + encodepoint(pk) + m) */ - hvec[0].data = (char*)rbuf; - hvec[0].off = 0; - hvec[0].len = rlen; - hvec[1].data = encpk; - hvec[1].off = 0; - hvec[1].len = encpklen; - hvec[2].data = (char*)mbuf; - hvec[2].off = 0; - hvec[2].len = mlen; - rc = _gcry_md_hash_buffers (hashalgo, 0, digest, hvec, 3); - if (rc) - goto leave; - reverse_buffer (digest, 64); - if (DBG_CIPHER) - log_printhex (" H(R+)", digest, 64); - _gcry_mpi_set_buffer (h, digest, 64, 0); - - /* According to the paper the best way for verification is: - encodepoint(sG - h·Q) = encodepoint(r) - because we don't need to decode R. 
*/ - { - void *sbuf; - unsigned int slen; - - sbuf = _gcry_mpi_get_opaque_copy (s_in, &tmp); - slen = (tmp +7)/8; - reverse_buffer (sbuf, slen); - if (DBG_CIPHER) - log_printhex (" s", sbuf, slen); - _gcry_mpi_set_buffer (s, sbuf, slen, 0); - gcry_free (sbuf); - if (slen != b) - { - rc = GPG_ERR_INV_LENGTH; - goto leave; - } - } - - _gcry_mpi_ec_mul_point (&Ia, s, &pkey->E.G, ctx); - _gcry_mpi_ec_mul_point (&Ib, h, &Q, ctx); - _gcry_mpi_neg (Ib.x, Ib.x); - _gcry_mpi_ec_add_points (&Ia, &Ia, &Ib, ctx); - rc = _gcry_ecc_eddsa_encodepoint (&Ia, ctx, s, h, &tbuf, &tlen); - if (rc) - goto leave; - if (tlen != rlen || memcmp (tbuf, rbuf, tlen)) - { - rc = GPG_ERR_BAD_SIGNATURE; - goto leave; - } - - rc = 0; - - leave: - gcry_free (encpk); - gcry_free (tbuf); - _gcry_mpi_ec_free (ctx); - gcry_mpi_release (s); - gcry_mpi_release (h); - point_free (&Ia); - point_free (&Ib); - point_free (&Q); - return rc; -} - - /********************************************* ************** interface ****************** @@ -1540,7 +485,7 @@ ecc_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey) rc = nist_generate_key (&sk, &E, ctx, random_level, nbits); } else - rc = eddsa_generate_key (&sk, &E, ctx, random_level); + rc = _gcry_ecc_eddsa_genkey (&sk, &E, ctx, random_level); break; default: rc = GPG_ERR_INTERNAL; @@ -1830,21 +775,22 @@ ecc_sign (gcry_sexp_t *r_sig, gcry_sexp_t s_data, gcry_sexp_t keyparms) if ((ctx.flags & PUBKEY_FLAG_EDDSA)) { /* EdDSA requires the public key. 
*/ - rc = sign_eddsa (data, &sk, sig_r, sig_s, ctx.hash_algo, mpi_q); + rc = _gcry_ecc_eddsa_sign (data, &sk, sig_r, sig_s, ctx.hash_algo, mpi_q); if (!rc) rc = gcry_sexp_build (r_sig, NULL, "(sig-val(eddsa(r%M)(s%M)))", sig_r, sig_s); } else if ((ctx.flags & PUBKEY_FLAG_GOST)) { - rc = sign_gost (data, &sk, sig_r, sig_s); + rc = _gcry_ecc_gost_sign (data, &sk, sig_r, sig_s); if (!rc) rc = gcry_sexp_build (r_sig, NULL, "(sig-val(gost(r%M)(s%M)))", sig_r, sig_s); } else { - rc = sign_ecdsa (data, &sk, sig_r, sig_s, ctx.flags, ctx.hash_algo); + rc = _gcry_ecc_ecdsa_sign (data, &sk, sig_r, sig_s, + ctx.flags, ctx.hash_algo); if (!rc) rc = gcry_sexp_build (r_sig, NULL, "(sig-val(ecdsa(r%M)(s%M)))", sig_r, sig_s); @@ -1990,7 +936,8 @@ ecc_verify (gcry_sexp_t s_sig, gcry_sexp_t s_data, gcry_sexp_t s_keyparms) */ if ((sigflags & PUBKEY_FLAG_EDDSA)) { - rc = verify_eddsa (data, &pk, sig_r, sig_s, ctx.hash_algo, mpi_q); + rc = _gcry_ecc_eddsa_verify (data, &pk, sig_r, sig_s, + ctx.hash_algo, mpi_q); } else if ((sigflags & PUBKEY_FLAG_GOST)) { @@ -1999,7 +946,7 @@ ecc_verify (gcry_sexp_t s_sig, gcry_sexp_t s_data, gcry_sexp_t s_keyparms) if (rc) goto leave; - rc = verify_gost (data, &pk, sig_r, sig_s); + rc = _gcry_ecc_gost_verify (data, &pk, sig_r, sig_s); } else { @@ -2024,12 +971,12 @@ ecc_verify (gcry_sexp_t s_sig, gcry_sexp_t s_data, gcry_sexp_t s_keyparms) if (abits > qbits) gcry_mpi_rshift (a, a, abits - qbits); - rc = verify_ecdsa (a, &pk, sig_r, sig_s); + rc = _gcry_ecc_ecdsa_verify (a, &pk, sig_r, sig_s); gcry_mpi_release (a); } } else - rc = verify_ecdsa (data, &pk, sig_r, sig_s); + rc = _gcry_ecc_ecdsa_verify (data, &pk, sig_r, sig_s); } leave: diff --git a/configure.ac b/configure.ac index 739a650..69cfbd2 100644 --- a/configure.ac +++ b/configure.ac @@ -1583,7 +1583,8 @@ fi LIST_MEMBER(ecc, $enabled_pubkey_ciphers) if test "$found" = "1" ; then GCRYPT_PUBKEY_CIPHERS="$GCRYPT_PUBKEY_CIPHERS \ - ecc.lo ecc-curves.lo ecc-misc.lo" + ecc.lo ecc-curves.lo ecc-misc.lo 
\ + ecc-ecdsa.lo ecc-eddsa.lo ecc-gost.lo" AC_DEFINE(USE_ECC, 1, [Defined if this module should be included]) fi commit 45f6e6268bfdc4b608beaba6b7086b2286e33c71 Author: Werner Koch Date: Wed Oct 23 11:41:37 2013 +0200 mpi: Fix scanning of negative SSH formats and add more tests. * mpi/mpicoder.c (gcry_mpi_scan): Fix sign setting for SSH format. * tests/t-convert.c (negative_zero): Test all formats. (check_formats): Add tests for PGP and scan tests for SSH and USG. * src/gcrypt.h.in (mpi_is_neg): Fix macro. * mpi/mpi-scan.c (_gcry_mpi_getbyte, _gcry_mpi_putbyte): Comment out these unused functions. Signed-off-by: Werner Koch diff --git a/mpi/mpi-scan.c b/mpi/mpi-scan.c index 2473cd9..e27f7fa 100644 --- a/mpi/mpi-scan.c +++ b/mpi/mpi-scan.c @@ -31,79 +31,79 @@ * * FIXME: This code is VERY ugly! */ -int -_gcry_mpi_getbyte( gcry_mpi_t a, unsigned idx ) -{ - int i, j; - unsigned n; - mpi_ptr_t ap; - mpi_limb_t limb; +/* int */ +/* _gcry_mpi_getbyte( gcry_mpi_t a, unsigned idx ) */ +/* { */ +/* int i, j; */ +/* unsigned n; */ +/* mpi_ptr_t ap; */ +/* mpi_limb_t limb; */ - ap = a->d; - for(n=0,i=0; i < a->nlimbs; i++ ) { - limb = ap[i]; - for( j=0; j < BYTES_PER_MPI_LIMB; j++, n++ ) - if( n == idx ) - return (limb >> j*8) & 0xff; - } - return -1; -} +/* ap = a->d; */ +/* for(n=0,i=0; i < a->nlimbs; i++ ) { */ +/* limb = ap[i]; */ +/* for( j=0; j < BYTES_PER_MPI_LIMB; j++, n++ ) */ +/* if( n == idx ) */ +/* return (limb >> j*8) & 0xff; */ +/* } */ +/* return -1; */ +/* } */ /**************** * Put a value at position IDX into A. 
idx counts from lsb to msb */ -void -_gcry_mpi_putbyte( gcry_mpi_t a, unsigned idx, int xc ) -{ - int i, j; - unsigned n; - mpi_ptr_t ap; - mpi_limb_t limb, c; +/* void */ +/* _gcry_mpi_putbyte( gcry_mpi_t a, unsigned idx, int xc ) */ +/* { */ +/* int i, j; */ +/* unsigned n; */ +/* mpi_ptr_t ap; */ +/* mpi_limb_t limb, c; */ - c = xc & 0xff; - ap = a->d; - for(n=0,i=0; i < a->alloced; i++ ) { - limb = ap[i]; - for( j=0; j < BYTES_PER_MPI_LIMB; j++, n++ ) - if( n == idx ) { - #if BYTES_PER_MPI_LIMB == 4 - if( j == 0 ) - limb = (limb & 0xffffff00) | c; - else if( j == 1 ) - limb = (limb & 0xffff00ff) | (c<<8); - else if( j == 2 ) - limb = (limb & 0xff00ffff) | (c<<16); - else - limb = (limb & 0x00ffffff) | (c<<24); - #elif BYTES_PER_MPI_LIMB == 8 - if( j == 0 ) - limb = (limb & 0xffffffffffffff00) | c; - else if( j == 1 ) - limb = (limb & 0xffffffffffff00ff) | (c<<8); - else if( j == 2 ) - limb = (limb & 0xffffffffff00ffff) | (c<<16); - else if( j == 3 ) - limb = (limb & 0xffffffff00ffffff) | (c<<24); - else if( j == 4 ) - limb = (limb & 0xffffff00ffffffff) | (c<<32); - else if( j == 5 ) - limb = (limb & 0xffff00ffffffffff) | (c<<40); - else if( j == 6 ) - limb = (limb & 0xff00ffffffffffff) | (c<<48); - else - limb = (limb & 0x00ffffffffffffff) | (c<<56); - #else - #error please enhance this function, its ugly - i know. 
- #endif - if( a->nlimbs <= i ) - a->nlimbs = i+1; - ap[i] = limb; - return; - } - } - abort(); /* index out of range */ -} +/* c = xc & 0xff; */ +/* ap = a->d; */ +/* for(n=0,i=0; i < a->alloced; i++ ) { */ +/* limb = ap[i]; */ +/* for( j=0; j < BYTES_PER_MPI_LIMB; j++, n++ ) */ +/* if( n == idx ) { */ +/* #if BYTES_PER_MPI_LIMB == 4 */ +/* if( j == 0 ) */ +/* limb = (limb & 0xffffff00) | c; */ +/* else if( j == 1 ) */ +/* limb = (limb & 0xffff00ff) | (c<<8); */ +/* else if( j == 2 ) */ +/* limb = (limb & 0xff00ffff) | (c<<16); */ +/* else */ +/* limb = (limb & 0x00ffffff) | (c<<24); */ +/* #elif BYTES_PER_MPI_LIMB == 8 */ +/* if( j == 0 ) */ +/* limb = (limb & 0xffffffffffffff00) | c; */ +/* else if( j == 1 ) */ +/* limb = (limb & 0xffffffffffff00ff) | (c<<8); */ +/* else if( j == 2 ) */ +/* limb = (limb & 0xffffffffff00ffff) | (c<<16); */ +/* else if( j == 3 ) */ +/* limb = (limb & 0xffffffff00ffffff) | (c<<24); */ +/* else if( j == 4 ) */ +/* limb = (limb & 0xffffff00ffffffff) | (c<<32); */ +/* else if( j == 5 ) */ +/* limb = (limb & 0xffff00ffffffffff) | (c<<40); */ +/* else if( j == 6 ) */ +/* limb = (limb & 0xff00ffffffffffff) | (c<<48); */ +/* else */ +/* limb = (limb & 0x00ffffffffffffff) | (c<<56); */ +/* #else */ +/* #error please enhance this function, its ugly - i know. 
*/ +/* #endif */ +/* if( a->nlimbs <= i ) */ +/* a->nlimbs = i+1; */ +/* ap[i] = limb; */ +/* return; */ +/* } */ +/* } */ +/* abort(); /\* index out of range *\/ */ +/* } */ /**************** diff --git a/mpi/mpicoder.c b/mpi/mpicoder.c index 1d2c87e..b598521 100644 --- a/mpi/mpicoder.c +++ b/mpi/mpicoder.c @@ -519,8 +519,8 @@ gcry_mpi_scan (struct gcry_mpi **ret_mpi, enum gcry_mpi_format format, : mpi_alloc ((n+BYTES_PER_MPI_LIMB-1)/BYTES_PER_MPI_LIMB); if (n) { - a->sign = !!(*s & 0x80); _gcry_mpi_set_buffer( a, s, n, 0 ); + a->sign = !!(*s & 0x80); if (a->sign) { onecompl (a); diff --git a/src/gcrypt.h.in b/src/gcrypt.h.in index 948202d..2742556 100644 --- a/src/gcrypt.h.in +++ b/src/gcrypt.h.in @@ -771,7 +771,7 @@ gcry_mpi_t _gcry_mpi_get_const (int no); #define mpi_neg( w, u) gcry_mpi_neg( (w), (u) ) #define mpi_cmp( u, v ) gcry_mpi_cmp( (u), (v) ) #define mpi_cmp_ui( u, v ) gcry_mpi_cmp_ui( (u), (v) ) -#define mpi_is_neg( a ) gcry_mpi_is_new ((a)) +#define mpi_is_neg( a ) gcry_mpi_is_neg ((a)) #define mpi_add_ui(w,u,v) gcry_mpi_add_ui((w),(u),(v)) #define mpi_add(w,u,v) gcry_mpi_add ((w),(u),(v)) diff --git a/tests/t-convert.c b/tests/t-convert.c index d44c439..072bf32 100644 --- a/tests/t-convert.c +++ b/tests/t-convert.c @@ -153,15 +153,18 @@ negative_zero (void) void *bufaddr = &buf; struct { const char *name; enum gcry_mpi_format format; } fmts[] = { - /* { "STD", GCRYMPI_FMT_STD }, */ - /* { "PGP", GCRYMPI_FMT_PGP }, */ - /* { "SSH", GCRYMPI_FMT_SSH }, */ - /* { "HEX", GCRYMPI_FMT_HEX }, */ + { "STD", GCRYMPI_FMT_STD }, + { "PGP", GCRYMPI_FMT_PGP }, + { "SSH", GCRYMPI_FMT_SSH }, + { "HEX", GCRYMPI_FMT_HEX }, { "USG", GCRYMPI_FMT_USG }, { NULL, 0 } }; int i; + if (debug) + show ("negative zero printing\n"); + a = gcry_mpi_new (0); for (i=0; fmts[i].name; i++) { @@ -205,52 +208,63 @@ check_formats (void) const char *ssh; size_t usglen; const char *usg; + size_t pgplen; + const char *pgp; } a; } data[] = { { 0, { "00", 0, "", 4, "\x00\x00\x00\x00", - 0, "" 
} + 0, "", + 2, "\x00\x00"} }, { 1, { "01", 1, "\x01", 5, "\x00\x00\x00\x01\x01", - 1, "\x01" } + 1, "\x01", + 3, "\x00\x01\x01" } }, { 2, { "02", 1, "\x02", 5, "\x00\x00\x00\x01\x02", - 1, "\x02", } + 1, "\x02", + 3, "\x00\x02\x02" } }, { 127, { "7F", 1, "\x7f", 5, "\x00\x00\x00\x01\x7f", - 1, "\x7f" } + 1, "\x7f", + 3, "\x00\x07\x7f" } }, { 128, { "0080", 2, "\x00\x80", 6, "\x00\x00\x00\x02\x00\x80", - 1, "\x80" } + 1, "\x80", + 3, "\x00\x08\x80" } }, { 129, { "0081", 2, "\x00\x81", 6, "\x00\x00\x00\x02\x00\x81", - 1, "\x81" } + 1, "\x81", + 3, "\x00\x08\x81" } }, { 255, { "00FF", 2, "\x00\xff", 6, "\x00\x00\x00\x02\x00\xff", - 1, "\xff" } + 1, "\xff", + 3, "\x00\x08\xff" } }, { 256, { "0100", 2, "\x01\x00", 6, "\x00\x00\x00\x02\x01\x00", - 2, "\x01\x00" } + 2, "\x01\x00", + 4, "\x00\x09\x01\x00" } }, { 257, { "0101", 2, "\x01\x01", 6, "\x00\x00\x00\x02\x01\x01", - 2, "\x01\x01" } + 2, "\x01\x01", + 4, "\x00\x09\x01\x01" } }, { -1, { "-01", 1, "\xff", @@ -295,17 +309,20 @@ check_formats (void) { 65535, { "00FFFF", 3, "\x00\xff\xff", 7, "\x00\x00\x00\x03\x00\xff\xff", - 2, "\xff\xff" } + 2, "\xff\xff", + 4, "\x00\x10\xff\xff" } }, { 65536, { "010000", 3, "\x01\00\x00", 7, "\x00\x00\x00\x03\x01\x00\x00", - 3, "\x01\x00\x00" } + 3, "\x01\x00\x00", + 5, "\x00\x11\x01\x00\x00 "} }, { 65537, { "010001", 3, "\x01\00\x01", 7, "\x00\x00\x00\x03\x01\x00\x01", - 3, "\x01\x00\x01" } + 3, "\x01\x00\x01", + 5, "\x00\x11\x01\x00\x01" } }, { -65537, { "-010001", 3, "\xfe\xff\xff", @@ -410,6 +427,29 @@ check_formats (void) } gcry_free (buf); } + + err = gcry_mpi_aprint (GCRYMPI_FMT_PGP, bufaddr, &buflen, a); + if (gcry_mpi_is_neg (a)) + { + if (gpg_err_code (err) != GPG_ERR_INV_ARG) + fail ("error printing value %d as %s: %s\n", + data[idx].value, "PGP", "Expected error not returned"); + } + else if (err) + fail ("error printing value %d as %s: %s\n", + data[idx].value, "PGP", gpg_strerror (err)); + else + { + if (buflen != data[idx].a.pgplen + || memcmp (buf, data[idx].a.pgp, 
data[idx].a.pgplen)) + { + fail ("error printing value %d as %s: %s\n", + data[idx].value, "PGP", "wrong result"); + showhex ("expected:", data[idx].a.pgp, data[idx].a.pgplen); + showhex (" got:", buf, buflen); + } + gcry_free (buf); + } } @@ -460,38 +500,62 @@ check_formats (void) gcry_mpi_release (b); } - err = gcry_mpi_aprint (GCRYMPI_FMT_SSH, bufaddr, &buflen, a); + err = gcry_mpi_scan (&b, GCRYMPI_FMT_SSH, + data[idx].a.ssh, data[idx].a.sshlen, &buflen); if (err) - fail ("error printing value %d as %s: %s\n", + fail ("error scanning value %d as %s: %s\n", data[idx].value, "SSH", gpg_strerror (err)); else { - if (buflen != data[idx].a.sshlen - || memcmp (buf, data[idx].a.ssh, data[idx].a.sshlen)) + if (gcry_mpi_cmp (a, b) || data[idx].a.sshlen != buflen) { - fail ("error printing value %d as %s: %s\n", - data[idx].value, "SSH", "wrong result"); - showhex ("expected:", data[idx].a.ssh, data[idx].a.sshlen); - showhex (" got:", buf, buflen); + fail ("error scanning value %d from %s: %s (%u)\n", + data[idx].value, "SSH", "wrong result", buflen); + showmpi ("expected:", a); + showmpi (" got:", b); } - gcry_free (buf); + gcry_mpi_release (b); } - err = gcry_mpi_aprint (GCRYMPI_FMT_USG, bufaddr, &buflen, a); + err = gcry_mpi_scan (&b, GCRYMPI_FMT_USG, + data[idx].a.usg, data[idx].a.usglen, &buflen); if (err) - fail ("error printing value %d as %s: %s\n", + fail ("error scanning value %d as %s: %s\n", data[idx].value, "USG", gpg_strerror (err)); else { - if (buflen != data[idx].a.usglen - || memcmp (buf, data[idx].a.usg, data[idx].a.usglen)) + if (gcry_mpi_is_neg (a)) + gcry_mpi_neg (b, b); + if (gcry_mpi_cmp (a, b) || data[idx].a.usglen != buflen) { - fail ("error printing value %d as %s: %s\n", - data[idx].value, "USG", "wrong result"); - showhex ("expected:", data[idx].a.usg, data[idx].a.usglen); - showhex (" got:", buf, buflen); + fail ("error scanning value %d from %s: %s (%u)\n", + data[idx].value, "USG", "wrong result", buflen); + showmpi ("expected:", a); + 
showmpi (" got:", b); + } + gcry_mpi_release (b); + } + + /* Negative values are not supported by PGP, thus we don't have + any samples. */ + if (!gcry_mpi_is_neg (a)) + { + err = gcry_mpi_scan (&b, GCRYMPI_FMT_PGP, + data[idx].a.pgp, data[idx].a.pgplen, &buflen); + if (err) + fail ("error scanning value %d as %s: %s\n", + data[idx].value, "PGP", gpg_strerror (err)); + else + { + if (gcry_mpi_cmp (a, b) || data[idx].a.pgplen != buflen) + { + fail ("error scanning value %d from %s: %s (%u)\n", + data[idx].value, "PGP", "wrong result", buflen); + showmpi ("expected:", a); + showmpi (" got:", b); + } + gcry_mpi_release (b); } - gcry_free (buf); } }

-----------------------------------------------------------------------

Summary of changes:
 cipher/Makefile.am  |    1 +
 cipher/ecc-common.h |   31 ++
 cipher/ecc-ecdsa.c  |  235 +++++++++++
 cipher/ecc-eddsa.c  |  681 ++++++++++++++++++++++++++++++++
 cipher/ecc-gost.c   |  233 +++++++++++
 cipher/ecc.c        | 1077 +--------------------------------------------------
 configure.ac        |    3 +-
 mpi/mpi-scan.c      |  132 +++----
 mpi/mpicoder.c      |    2 +-
 src/gcrypt.h.in     |    2 +-
 tests/t-convert.c   |  132 +++++--
 11 files changed, 1361 insertions(+), 1168 deletions(-)
 create mode 100644 cipher/ecc-ecdsa.c
 create mode 100644 cipher/ecc-eddsa.c
 create mode 100644 cipher/ecc-gost.c

hooks/post-receive
--
The GNU crypto library
http://git.gnupg.org

_______________________________________________
Gnupg-commits mailing list
Gnupg-commits at gnupg.org
http://lists.gnupg.org/mailman/listinfo/gnupg-commits

From cvs at cvs.gnupg.org Wed Oct 23 17:51:07 2013
From: cvs at cvs.gnupg.org (by Dmitry Eremin-Solenikov)
Date: Wed, 23 Oct 2013 17:51:07 +0200
Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-329-g2fd83fa
Message-ID:

This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library".
The branch, master has been updated
       via  2fd83faa876d0be91ab7884b1a9eaa7793559eb9 (commit)
       via  0b39fce7e3ce6761d6bd5195d093ec6857edb7c2 (commit)
       via  10bf6a7e16ed193f90d2749970a420f00d1d3320 (commit)
      from  164eb8c85d773ef4f0939115ec45f5e4b47c1700 (commit)

Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below.

- Log -----------------------------------------------------------------
commit 2fd83faa876d0be91ab7884b1a9eaa7793559eb9
Author: Dmitry Eremin-Solenikov
Date:   Wed Oct 23 18:36:10 2013 +0300

    Enable assembler optimizations on earlier ARM cores

    * cipher/blowfish-armv6.S => cipher/blowfish-arm.S: adapt to pre-armv6 CPUs.
    * cipher/blowfish.c: enable assembly on armv4/armv5 little-endian CPUs.
    * cipher/camellia-armv6.S => cipher/camellia-arm.S: adapt to pre-armv6 CPUs.
    * cipher/camellia.c, cipher/camellia-glue.c: enable assembly on armv4/armv5 little-endian CPUs.
    * cipher/cast5-armv6.S => cipher/cast5-arm.S: adapt to pre-armv6 CPUs.
    * cipher/cast5.c: enable assembly on armv4/armv5 little-endian CPUs.
    * cipher/rijndael-armv6.S => cipher/rijndael-arm.S: adapt to pre-armv6 CPUs.
    * cipher/rijndael.c: enable assembly on armv4/armv5 little-endian CPUs.
    * cipher/twofish-armv6.S => cipher/twofish-arm.S: adapt to pre-armv6 CPUs.
    * cipher/twofish.c: enable assembly on armv4/armv5 little-endian CPUs.
    --
    Our ARMv6 assembly optimized code can be easily adapted to earlier CPUs.
    The only incompatible place is the rev instruction used to do byte
    swapping, which is only available from ARMv6 on. Replace it on
    pre-ARMv6 CPUs with a sequence of 4 instructions.
Compare:

              ECB/Stream       CBC              CFB              OFB              CTR
             ---------------  ---------------  ---------------  ---------------  ---------------
 AES           620ms  610ms     650ms  680ms     620ms  630ms     660ms  660ms     630ms  630ms
 CAMELLIA128   720ms  720ms     780ms  790ms     770ms  760ms     780ms  780ms     770ms  760ms
 CAMELLIA256   910ms  910ms     970ms  970ms     960ms  950ms     970ms  970ms     960ms  950ms
 CAST5         820ms  820ms     930ms  920ms     890ms  860ms     930ms  920ms     880ms  890ms
 BLOWFISH      550ms  560ms     650ms  660ms     630ms  600ms     660ms  650ms     610ms  620ms

              ECB/Stream       CBC              CFB              OFB              CTR
             ---------------  ---------------  ---------------  ---------------  ---------------
 AES           130ms  140ms     180ms  200ms     160ms  170ms     190ms  200ms     170ms  170ms
 CAMELLIA128   150ms  160ms     210ms  220ms     200ms  190ms     210ms  220ms     190ms  190ms
 CAMELLIA256   180ms  180ms     260ms  240ms     240ms  230ms     250ms  250ms     230ms  230ms
 CAST5         170ms  160ms     270ms  120ms     240ms  130ms     260ms  270ms     130ms  120ms
 BLOWFISH      160ms  150ms     260ms  110ms     230ms  120ms     250ms  260ms     110ms  120ms

    Signed-off-by: Dmitry Eremin-Solenikov
    [ jk: in camellia.h and twofish.c, USE_ARMV6_ASM => USE_ARM_ASM ]
    [ jk: fix blowfish-arm.S when __ARM_FEATURE_UNALIGNED defined ]
    [ jk: in twofish.S remove defined(HAVE_ARM_ARCH_V6)?]
    [ jk: ARMv6 => ARM in comments ]

diff --git a/cipher/Makefile.am b/cipher/Makefile.am index e6b1745..d7db933 100644 --- a/cipher/Makefile.am +++ b/cipher/Makefile.am @@ -56,7 +56,7 @@ rmd.h EXTRA_libcipher_la_SOURCES = \ arcfour.c \ blowfish.c blowfish-amd64.S \ -cast5.c cast5-amd64.S cast5-armv6.S \ +cast5.c cast5-amd64.S cast5-arm.S \ crc.c \ des.c \ dsa.c \ @@ -68,7 +68,7 @@ gost28147.c gost.h \ gostr3411-94.c \ md4.c \ md5.c \ -rijndael.c rijndael-tables.h rijndael-amd64.S rijndael-armv6.S \ +rijndael.c rijndael-tables.h rijndael-amd64.S rijndael-arm.S \ rmd160.c \ rsa.c \ salsa20.c \ @@ -81,10 +81,10 @@ sha512.c sha512-armv7-neon.S \ stribog.c \ tiger.c \ whirlpool.c \ -twofish.c twofish-amd64.S twofish-armv6.S \ +twofish.c twofish-amd64.S twofish-arm.S \ rfc2268.c \ camellia.c camellia.h camellia-glue.c camellia-aesni-avx-amd64.S \ - camellia-aesni-avx2-amd64.S camellia-armv6.S + camellia-aesni-avx2-amd64.S camellia-arm.S if ENABLE_O_FLAG_MUNGING o_flag_munging = sed -e 's/-O\([2-9s][2-9s]*\)/-O1/' -e 's/-Ofast/-O1/g' diff --git a/cipher/blowfish-armv6.S b/cipher/blowfish-arm.S similarity index 78% rename from cipher/blowfish-armv6.S rename to cipher/blowfish-arm.S index eea879f..43090d7 100644 --- a/cipher/blowfish-armv6.S +++ b/cipher/blowfish-arm.S @@ -1,4 +1,4 @@ -/* blowfish-armv6.S - ARM assembly implementation of Blowfish cipher +/* blowfish-arm.S - ARM assembly implementation of Blowfish cipher * * Copyright ©
2013 Jussi Kivilinna * @@ -20,7 +20,7 @@ #include -#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) +#if defined(__ARMEL__) #ifdef HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS .text @@ -97,20 +97,33 @@ #define str_unaligned_host str_unaligned_le /* bswap on little-endian */ - #define host_to_be(reg) \ +#ifdef HAVE_ARM_ARCH_V6 + #define host_to_be(reg, rtmp) \ rev reg, reg; - #define be_to_host(reg) \ + #define be_to_host(reg, rtmp) \ rev reg, reg; #else + #define host_to_be(reg, rtmp) \ + eor rtmp, reg, reg, ror #16; \ + mov rtmp, rtmp, lsr #8; \ + bic rtmp, rtmp, #65280; \ + eor reg, rtmp, reg, ror #8; + #define be_to_host(reg, rtmp) \ + eor rtmp, reg, reg, ror #16; \ + mov rtmp, rtmp, lsr #8; \ + bic rtmp, rtmp, #65280; \ + eor reg, rtmp, reg, ror #8; +#endif +#else #define ldr_unaligned_host ldr_unaligned_be #define str_unaligned_host str_unaligned_be /* nop on big-endian */ - #define host_to_be(reg) /*_*/ - #define be_to_host(reg) /*_*/ + #define host_to_be(reg, rtmp) /*_*/ + #define be_to_host(reg, rtmp) /*_*/ #endif -#define host_to_host(x) /*_*/ +#define host_to_host(x, y) /*_*/ /*********************************************************************** * 1-way blowfish @@ -159,31 +172,31 @@ F(RL0, RR0); \ F(RR0, RL0); -#define read_block_aligned(rin, offs, l0, r0, convert) \ +#define read_block_aligned(rin, offs, l0, r0, convert, rtmp) \ ldr l0, [rin, #((offs) + 0)]; \ ldr r0, [rin, #((offs) + 4)]; \ - convert(l0); \ - convert(r0); + convert(l0, rtmp); \ + convert(r0, rtmp); -#define write_block_aligned(rout, offs, l0, r0, convert) \ - convert(l0); \ - convert(r0); \ +#define write_block_aligned(rout, offs, l0, r0, convert, rtmp) \ + convert(l0, rtmp); \ + convert(r0, rtmp); \ str l0, [rout, #((offs) + 0)]; \ str r0, [rout, #((offs) + 4)]; #ifdef __ARM_FEATURE_UNALIGNED /* unaligned word reads allowed */ #define read_block(rin, offs, l0, r0, rtmp0) \ - read_block_aligned(rin, offs, l0, r0, host_to_be) + read_block_aligned(rin, offs, l0, r0, host_to_be, rtmp0) 
#define write_block(rout, offs, r0, l0, rtmp0, rtmp1) \ - write_block_aligned(rout, offs, r0, l0, be_to_host) + write_block_aligned(rout, offs, r0, l0, be_to_host, rtmp0) #define read_block_host(rin, offs, l0, r0, rtmp0) \ - read_block_aligned(rin, offs, l0, r0, host_to_host) + read_block_aligned(rin, offs, l0, r0, host_to_host, rtmp0) #define write_block_host(rout, offs, r0, l0, rtmp0, rtmp1) \ - write_block_aligned(rout, offs, r0, l0, host_to_host) + write_block_aligned(rout, offs, r0, l0, host_to_host, rtmp0) #else /* need to handle unaligned reads by byte reads */ #define read_block(rin, offs, l0, r0, rtmp0) \ @@ -193,7 +206,7 @@ ldr_unaligned_be(r0, rin, (offs) + 4, rtmp0); \ b 2f; \ 1:;\ - read_block_aligned(rin, offs, l0, r0, host_to_be); \ + read_block_aligned(rin, offs, l0, r0, host_to_be, rtmp0); \ 2:; #define write_block(rout, offs, l0, r0, rtmp0, rtmp1) \ @@ -203,7 +216,7 @@ str_unaligned_be(r0, rout, (offs) + 4, rtmp0, rtmp1); \ b 2f; \ 1:;\ - write_block_aligned(rout, offs, l0, r0, be_to_host); \ + write_block_aligned(rout, offs, l0, r0, be_to_host, rtmp0); \ 2:; #define read_block_host(rin, offs, l0, r0, rtmp0) \ @@ -213,7 +226,7 @@ ldr_unaligned_host(r0, rin, (offs) + 4, rtmp0); \ b 2f; \ 1:;\ - read_block_aligned(rin, offs, l0, r0, host_to_host); \ + read_block_aligned(rin, offs, l0, r0, host_to_host, rtmp0); \ 2:; #define write_block_host(rout, offs, l0, r0, rtmp0, rtmp1) \ @@ -259,10 +272,10 @@ __blowfish_enc_blk1: .size __blowfish_enc_blk1,.-__blowfish_enc_blk1; .align 8 -.globl _gcry_blowfish_armv6_do_encrypt -.type _gcry_blowfish_armv6_do_encrypt,%function; +.globl _gcry_blowfish_arm_do_encrypt +.type _gcry_blowfish_arm_do_encrypt,%function; -_gcry_blowfish_armv6_do_encrypt: +_gcry_blowfish_arm_do_encrypt: /* input: * %r0: ctx, CTX * %r1: u32 *ret_xl @@ -280,13 +293,13 @@ _gcry_blowfish_armv6_do_encrypt: str RL0, [%r2]; pop {%r4-%r11, %ip, %pc}; -.size _gcry_blowfish_armv6_do_encrypt,.-_gcry_blowfish_armv6_do_encrypt; +.size 
_gcry_blowfish_arm_do_encrypt,.-_gcry_blowfish_arm_do_encrypt; .align 3 -.global _gcry_blowfish_armv6_encrypt_block -.type _gcry_blowfish_armv6_encrypt_block,%function; +.global _gcry_blowfish_arm_encrypt_block +.type _gcry_blowfish_arm_encrypt_block,%function; -_gcry_blowfish_armv6_encrypt_block: +_gcry_blowfish_arm_encrypt_block: /* input: * %r0: ctx, CTX * %r1: dst @@ -301,13 +314,13 @@ _gcry_blowfish_armv6_encrypt_block: write_block(%r1, 0, RR0, RL0, RT0, RT1); pop {%r4-%r11, %ip, %pc}; -.size _gcry_blowfish_armv6_encrypt_block,.-_gcry_blowfish_armv6_encrypt_block; +.size _gcry_blowfish_arm_encrypt_block,.-_gcry_blowfish_arm_encrypt_block; .align 3 -.global _gcry_blowfish_armv6_decrypt_block -.type _gcry_blowfish_armv6_decrypt_block,%function; +.global _gcry_blowfish_arm_decrypt_block +.type _gcry_blowfish_arm_decrypt_block,%function; -_gcry_blowfish_armv6_decrypt_block: +_gcry_blowfish_arm_decrypt_block: /* input: * %r0: ctx, CTX * %r1: dst @@ -336,7 +349,7 @@ _gcry_blowfish_armv6_decrypt_block: write_block(%r1, 0, RR0, RL0, RT0, RT1); pop {%r4-%r11, %ip, %pc}; -.size _gcry_blowfish_armv6_decrypt_block,.-_gcry_blowfish_armv6_decrypt_block; +.size _gcry_blowfish_arm_decrypt_block,.-_gcry_blowfish_arm_decrypt_block; /*********************************************************************** * 2-way blowfish @@ -441,22 +454,22 @@ _gcry_blowfish_armv6_decrypt_block: #define round_dec2(n, load_next_key) \ F2((n) - 3, RL0, RR0, RL1, RR1, load_next_key, 1); -#define read_block2_aligned(rin, l0, r0, l1, r1, convert) \ +#define read_block2_aligned(rin, l0, r0, l1, r1, convert, rtmp) \ ldr l0, [rin, #(0)]; \ ldr r0, [rin, #(4)]; \ - convert(l0); \ + convert(l0, rtmp); \ ldr l1, [rin, #(8)]; \ - convert(r0); \ + convert(r0, rtmp); \ ldr r1, [rin, #(12)]; \ - convert(l1); \ - convert(r1); + convert(l1, rtmp); \ + convert(r1, rtmp); -#define write_block2_aligned(rout, l0, r0, l1, r1, convert) \ - convert(l0); \ - convert(r0); \ - convert(l1); \ +#define 
write_block2_aligned(rout, l0, r0, l1, r1, convert, rtmp) \ + convert(l0, rtmp); \ + convert(r0, rtmp); \ + convert(l1, rtmp); \ str l0, [rout, #(0)]; \ - convert(r1); \ + convert(r1, rtmp); \ str r0, [rout, #(4)]; \ str l1, [rout, #(8)]; \ str r1, [rout, #(12)]; @@ -464,16 +477,16 @@ _gcry_blowfish_armv6_decrypt_block: #ifdef __ARM_FEATURE_UNALIGNED /* unaligned word reads allowed */ #define read_block2(rin, l0, r0, l1, r1, rtmp0) \ - read_block2_aligned(rin, l0, r0, l1, r1, host_to_be) + read_block2_aligned(rin, l0, r0, l1, r1, host_to_be, rtmp0) #define write_block2(rout, l0, r0, l1, r1, rtmp0, rtmp1) \ - write_block2_aligned(rout, l0, r0, l1, r1, be_to_host) + write_block2_aligned(rout, l0, r0, l1, r1, be_to_host, rtmp0) #define read_block2_host(rin, l0, r0, l1, r1, rtmp0) \ - read_block2_aligned(rin, l0, r0, l1, r1, host_to_host) + read_block2_aligned(rin, l0, r0, l1, r1, host_to_host, rtmp0) #define write_block2_host(rout, l0, r0, l1, r1, rtmp0, rtmp1) \ - write_block2_aligned(rout, l0, r0, l1, r1, host_to_host) + write_block2_aligned(rout, l0, r0, l1, r1, host_to_host, rtmp0) #else /* need to handle unaligned reads by byte reads */ #define read_block2(rin, l0, r0, l1, r1, rtmp0) \ @@ -485,7 +498,7 @@ _gcry_blowfish_armv6_decrypt_block: ldr_unaligned_be(r1, rin, 12, rtmp0); \ b 2f; \ 1:;\ - read_block2_aligned(rin, l0, r0, l1, r1, host_to_be); \ + read_block2_aligned(rin, l0, r0, l1, r1, host_to_be, rtmp0); \ 2:; #define write_block2(rout, l0, r0, l1, r1, rtmp0, rtmp1) \ @@ -497,7 +510,7 @@ _gcry_blowfish_armv6_decrypt_block: str_unaligned_be(r1, rout, 12, rtmp0, rtmp1); \ b 2f; \ 1:;\ - write_block2_aligned(rout, l0, r0, l1, r1, be_to_host); \ + write_block2_aligned(rout, l0, r0, l1, r1, be_to_host, rtmp0); \ 2:; #define read_block2_host(rin, l0, r0, l1, r1, rtmp0) \ @@ -509,7 +522,7 @@ _gcry_blowfish_armv6_decrypt_block: ldr_unaligned_host(r1, rin, 12, rtmp0); \ b 2f; \ 1:;\ - read_block2_aligned(rin, l0, r0, l1, r1, host_to_host); \ + 
read_block2_aligned(rin, l0, r0, l1, r1, host_to_host, rtmp0); \ 2:; #define write_block2_host(rout, l0, r0, l1, r1, rtmp0, rtmp1) \ @@ -521,21 +534,21 @@ _gcry_blowfish_armv6_decrypt_block: str_unaligned_host(r1, rout, 12, rtmp0, rtmp1); \ b 2f; \ 1:;\ - write_block2_aligned(rout, l0, r0, l1, r1, host_to_host); \ + write_block2_aligned(rout, l0, r0, l1, r1, host_to_host, rtmp0); \ 2:; #endif .align 3 -.type _gcry_blowfish_armv6_enc_blk2,%function; +.type _gcry_blowfish_arm_enc_blk2,%function; -_gcry_blowfish_armv6_enc_blk2: +_gcry_blowfish_arm_enc_blk2: /* input: * preloaded: CTX * [RL0, RR0], [RL1, RR1]: src * output: * [RR0, RL0], [RR1, RL1]: dst */ - push {%lr}; + push {RT0,%lr}; add CTXs2, CTXs0, #(s2 - s0); mov RMASK, #(0xff << 2); /* byte mask */ @@ -550,19 +563,19 @@ _gcry_blowfish_armv6_enc_blk2: round_enc2(14, next_key); round_enc2(16, dummy); - host_to_be(RR0); - host_to_be(RL0); - host_to_be(RR1); - host_to_be(RL1); + host_to_be(RR0, RT0); + host_to_be(RL0, RT0); + host_to_be(RR1, RT0); + host_to_be(RL1, RT0); - pop {%pc}; -.size _gcry_blowfish_armv6_enc_blk2,.-_gcry_blowfish_armv6_enc_blk2; + pop {RT0,%pc}; +.size _gcry_blowfish_arm_enc_blk2,.-_gcry_blowfish_arm_enc_blk2; .align 3 -.globl _gcry_blowfish_armv6_cfb_dec; -.type _gcry_blowfish_armv6_cfb_dec,%function; +.globl _gcry_blowfish_arm_cfb_dec; +.type _gcry_blowfish_arm_cfb_dec,%function; -_gcry_blowfish_armv6_cfb_dec: +_gcry_blowfish_arm_cfb_dec: /* input: * %r0: CTX * %r1: dst (2 blocks) @@ -575,15 +588,15 @@ _gcry_blowfish_armv6_cfb_dec: /* Load input (iv/%r3 is aligned, src/%r2 might not be) */ ldm %r3, {RL0, RR0}; - host_to_be(RL0); - host_to_be(RR0); + host_to_be(RL0, RT0); + host_to_be(RR0, RT0); read_block(%r2, 0, RL1, RR1, RT0); /* Update IV, load src[1] and save to iv[0] */ read_block_host(%r2, 8, %r5, %r6, RT0); stm %lr, {%r5, %r6}; - bl _gcry_blowfish_armv6_enc_blk2; + bl _gcry_blowfish_arm_enc_blk2; /* result in RR0:RL0, RR1:RL1 = %r4:%r3, %r10:%r9 */ /* %r1: dst, %r0: %src */ @@ 
-599,13 +612,13 @@ _gcry_blowfish_armv6_cfb_dec: pop {%r4-%r11, %ip, %pc}; .ltorg -.size _gcry_blowfish_armv6_cfb_dec,.-_gcry_blowfish_armv6_cfb_dec; +.size _gcry_blowfish_arm_cfb_dec,.-_gcry_blowfish_arm_cfb_dec; .align 3 -.globl _gcry_blowfish_armv6_ctr_enc; -.type _gcry_blowfish_armv6_ctr_enc,%function; +.globl _gcry_blowfish_arm_ctr_enc; +.type _gcry_blowfish_arm_ctr_enc,%function; -_gcry_blowfish_armv6_ctr_enc: +_gcry_blowfish_arm_ctr_enc: /* input: * %r0: CTX * %r1: dst (2 blocks) @@ -617,7 +630,7 @@ _gcry_blowfish_armv6_ctr_enc: mov %lr, %r3; /* Load IV (big => host endian) */ - read_block_aligned(%lr, 0, RL0, RR0, be_to_host); + read_block_aligned(%lr, 0, RL0, RR0, be_to_host, RT0); /* Construct IVs */ adds RR1, RR0, #1; /* +1 */ @@ -626,9 +639,9 @@ _gcry_blowfish_armv6_ctr_enc: adc %r5, RL1, #0; /* Store new IV (host => big-endian) */ - write_block_aligned(%lr, 0, %r5, %r6, host_to_be); + write_block_aligned(%lr, 0, %r5, %r6, host_to_be, RT0); - bl _gcry_blowfish_armv6_enc_blk2; + bl _gcry_blowfish_arm_enc_blk2; /* result in RR0:RL0, RR1:RL1 = %r4:%r3, %r10:%r9 */ /* %r1: dst, %r0: %src */ @@ -644,12 +657,12 @@ _gcry_blowfish_armv6_ctr_enc: pop {%r4-%r11, %ip, %pc}; .ltorg -.size _gcry_blowfish_armv6_ctr_enc,.-_gcry_blowfish_armv6_ctr_enc; +.size _gcry_blowfish_arm_ctr_enc,.-_gcry_blowfish_arm_ctr_enc; .align 3 -.type _gcry_blowfish_armv6_dec_blk2,%function; +.type _gcry_blowfish_arm_dec_blk2,%function; -_gcry_blowfish_armv6_dec_blk2: +_gcry_blowfish_arm_dec_blk2: /* input: * preloaded: CTX * [RL0, RR0], [RL1, RR1]: src @@ -669,20 +682,20 @@ _gcry_blowfish_armv6_dec_blk2: round_dec2(3, next_key); round_dec2(1, dummy); - host_to_be(RR0); - host_to_be(RL0); - host_to_be(RR1); - host_to_be(RL1); + host_to_be(RR0, RT0); + host_to_be(RL0, RT0); + host_to_be(RR1, RT0); + host_to_be(RL1, RT0); b .Ldec_cbc_tail; .ltorg -.size _gcry_blowfish_armv6_dec_blk2,.-_gcry_blowfish_armv6_dec_blk2; +.size _gcry_blowfish_arm_dec_blk2,.-_gcry_blowfish_arm_dec_blk2; .align 3 
-.globl _gcry_blowfish_armv6_cbc_dec; -.type _gcry_blowfish_armv6_cbc_dec,%function; +.globl _gcry_blowfish_arm_cbc_dec; +.type _gcry_blowfish_arm_cbc_dec,%function; -_gcry_blowfish_armv6_cbc_dec: +_gcry_blowfish_arm_cbc_dec: /* input: * %r0: CTX * %r1: dst (2 blocks) @@ -695,7 +708,7 @@ _gcry_blowfish_armv6_cbc_dec: /* dec_blk2 is only used by cbc_dec, jump directly in/out instead * of function call. */ - b _gcry_blowfish_armv6_dec_blk2; + b _gcry_blowfish_arm_dec_blk2; .Ldec_cbc_tail: /* result in RR0:RL0, RR1:RL1 = %r4:%r3, %r10:%r9 */ @@ -724,7 +737,7 @@ _gcry_blowfish_armv6_cbc_dec: pop {%r4-%r11, %ip, %pc}; .ltorg -.size _gcry_blowfish_armv6_cbc_dec,.-_gcry_blowfish_armv6_cbc_dec; +.size _gcry_blowfish_arm_cbc_dec,.-_gcry_blowfish_arm_cbc_dec; #endif /*HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS*/ #endif /*__ARM_ARCH >= 6*/ diff --git a/cipher/blowfish.c b/cipher/blowfish.c index 2f739c8..ed4e901 100644 --- a/cipher/blowfish.c +++ b/cipher/blowfish.c @@ -50,11 +50,11 @@ # define USE_AMD64_ASM 1 #endif -/* USE_ARMV6_ASM indicates whether to use ARMv6 assembly code. */ -#undef USE_ARMV6_ASM -#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) +/* USE_ARM_ASM indicates whether to use ARM assembly code. */ +#undef USE_ARM_ASM +#if defined(__ARMEL__) # if (BLOWFISH_ROUNDS == 16) && defined(HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS) -# define USE_ARMV6_ASM 1 +# define USE_ARM_ASM 1 # endif #endif @@ -314,44 +314,44 @@ decrypt_block (void *context, byte *outbuf, const byte *inbuf) return /*burn_stack*/ (2*8); } -#elif defined(USE_ARMV6_ASM) +#elif defined(USE_ARM_ASM) /* Assembly implementations of Blowfish. 
*/ -extern void _gcry_blowfish_armv6_do_encrypt(BLOWFISH_context *c, u32 *ret_xl, +extern void _gcry_blowfish_arm_do_encrypt(BLOWFISH_context *c, u32 *ret_xl, u32 *ret_xr); -extern void _gcry_blowfish_armv6_encrypt_block(BLOWFISH_context *c, byte *out, +extern void _gcry_blowfish_arm_encrypt_block(BLOWFISH_context *c, byte *out, const byte *in); -extern void _gcry_blowfish_armv6_decrypt_block(BLOWFISH_context *c, byte *out, +extern void _gcry_blowfish_arm_decrypt_block(BLOWFISH_context *c, byte *out, const byte *in); /* These assembly implementations process two blocks in parallel. */ -extern void _gcry_blowfish_armv6_ctr_enc(BLOWFISH_context *ctx, byte *out, +extern void _gcry_blowfish_arm_ctr_enc(BLOWFISH_context *ctx, byte *out, const byte *in, byte *ctr); -extern void _gcry_blowfish_armv6_cbc_dec(BLOWFISH_context *ctx, byte *out, +extern void _gcry_blowfish_arm_cbc_dec(BLOWFISH_context *ctx, byte *out, const byte *in, byte *iv); -extern void _gcry_blowfish_armv6_cfb_dec(BLOWFISH_context *ctx, byte *out, +extern void _gcry_blowfish_arm_cfb_dec(BLOWFISH_context *ctx, byte *out, const byte *in, byte *iv); static void do_encrypt ( BLOWFISH_context *bc, u32 *ret_xl, u32 *ret_xr ) { - _gcry_blowfish_armv6_do_encrypt (bc, ret_xl, ret_xr); + _gcry_blowfish_arm_do_encrypt (bc, ret_xl, ret_xr); } static void do_encrypt_block (BLOWFISH_context *context, byte *outbuf, const byte *inbuf) { - _gcry_blowfish_armv6_encrypt_block (context, outbuf, inbuf); + _gcry_blowfish_arm_encrypt_block (context, outbuf, inbuf); } static void do_decrypt_block (BLOWFISH_context *context, byte *outbuf, const byte *inbuf) { - _gcry_blowfish_armv6_decrypt_block (context, outbuf, inbuf); + _gcry_blowfish_arm_decrypt_block (context, outbuf, inbuf); } static unsigned int @@ -370,7 +370,7 @@ decrypt_block (void *context, byte *outbuf, const byte *inbuf) return /*burn_stack*/ (10*4); } -#else /*USE_ARMV6_ASM*/ +#else /*USE_ARM_ASM*/ #if BLOWFISH_ROUNDS != 16 static inline u32 @@ -580,7 +580,7 @@ 
decrypt_block (void *context, byte *outbuf, const byte *inbuf) return /*burn_stack*/ (64); } -#endif /*!USE_AMD64_ASM&&!USE_ARMV6_ASM*/ +#endif /*!USE_AMD64_ASM&&!USE_ARM_ASM*/ /* Bulk encryption of complete blocks in CTR mode. This function is only @@ -615,12 +615,12 @@ _gcry_blowfish_ctr_enc(void *context, unsigned char *ctr, void *outbuf_arg, /* Use generic code to handle smaller chunks... */ /* TODO: use caching instead? */ } -#elif defined(USE_ARMV6_ASM) +#elif defined(USE_ARM_ASM) { /* Process data in 2 block chunks. */ while (nblocks >= 2) { - _gcry_blowfish_armv6_ctr_enc(ctx, outbuf, inbuf, ctr); + _gcry_blowfish_arm_ctr_enc(ctx, outbuf, inbuf, ctr); nblocks -= 2; outbuf += 2 * BLOWFISH_BLOCKSIZE; @@ -683,12 +683,12 @@ _gcry_blowfish_cbc_dec(void *context, unsigned char *iv, void *outbuf_arg, /* Use generic code to handle smaller chunks... */ } -#elif defined(USE_ARMV6_ASM) +#elif defined(USE_ARM_ASM) { /* Process data in 2 block chunks. */ while (nblocks >= 2) { - _gcry_blowfish_armv6_cbc_dec(ctx, outbuf, inbuf, iv); + _gcry_blowfish_arm_cbc_dec(ctx, outbuf, inbuf, iv); nblocks -= 2; outbuf += 2 * BLOWFISH_BLOCKSIZE; @@ -746,12 +746,12 @@ _gcry_blowfish_cfb_dec(void *context, unsigned char *iv, void *outbuf_arg, /* Use generic code to handle smaller chunks... */ } -#elif defined(USE_ARMV6_ASM) +#elif defined(USE_ARM_ASM) { /* Process data in 2 block chunks. */ while (nblocks >= 2) { - _gcry_blowfish_armv6_cfb_dec(ctx, outbuf, inbuf, iv); + _gcry_blowfish_arm_cfb_dec(ctx, outbuf, inbuf, iv); nblocks -= 2; outbuf += 2 * BLOWFISH_BLOCKSIZE; diff --git a/cipher/camellia-armv6.S b/cipher/camellia-arm.S similarity index 93% rename from cipher/camellia-armv6.S rename to cipher/camellia-arm.S index 3544754..820c46e 100644 --- a/cipher/camellia-armv6.S +++ b/cipher/camellia-arm.S @@ -1,4 +1,4 @@ -/* camellia-armv6.S - ARM assembly implementation of Camellia cipher +/* camellia-arm.S - ARM assembly implementation of Camellia cipher * * Copyright ? 
2013 Jussi Kivilinna * @@ -20,7 +20,7 @@ #include -#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) +#if defined(__ARMEL__) #ifdef HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS .text @@ -73,44 +73,56 @@ strb rtmp0, [rdst, #((offs) + 0)]; #ifdef __ARMEL__ - /* bswap on little-endian */ - #define host_to_be(reg) \ +#ifdef HAVE_ARM_ARCH_V6 + #define host_to_be(reg, rtmp) \ rev reg, reg; - #define be_to_host(reg) \ + #define be_to_host(reg, rtmp) \ rev reg, reg; #else + #define host_to_be(reg, rtmp) \ + eor rtmp, reg, reg, ror #16; \ + mov rtmp, rtmp, lsr #8; \ + bic rtmp, rtmp, #65280; \ + eor reg, rtmp, reg, ror #8; + #define be_to_host(reg, rtmp) \ + eor rtmp, reg, reg, ror #16; \ + mov rtmp, rtmp, lsr #8; \ + bic rtmp, rtmp, #65280; \ + eor reg, rtmp, reg, ror #8; +#endif +#else /* nop on big-endian */ - #define host_to_be(reg) /*_*/ - #define be_to_host(reg) /*_*/ + #define host_to_be(reg, rtmp) /*_*/ + #define be_to_host(reg, rtmp) /*_*/ #endif -#define ldr_input_aligned_be(rin, a, b, c, d) \ +#define ldr_input_aligned_be(rin, a, b, c, d, rtmp) \ ldr a, [rin, #0]; \ ldr b, [rin, #4]; \ - be_to_host(a); \ + be_to_host(a, rtmp); \ ldr c, [rin, #8]; \ - be_to_host(b); \ + be_to_host(b, rtmp); \ ldr d, [rin, #12]; \ - be_to_host(c); \ - be_to_host(d); + be_to_host(c, rtmp); \ + be_to_host(d, rtmp); -#define str_output_aligned_be(rout, a, b, c, d) \ - be_to_host(a); \ - be_to_host(b); \ +#define str_output_aligned_be(rout, a, b, c, d, rtmp) \ + be_to_host(a, rtmp); \ + be_to_host(b, rtmp); \ str a, [rout, #0]; \ - be_to_host(c); \ + be_to_host(c, rtmp); \ str b, [rout, #4]; \ - be_to_host(d); \ + be_to_host(d, rtmp); \ str c, [rout, #8]; \ str d, [rout, #12]; #ifdef __ARM_FEATURE_UNALIGNED /* unaligned word reads/writes allowed */ #define ldr_input_be(rin, ra, rb, rc, rd, rtmp) \ - ldr_input_aligned_be(rin, ra, rb, rc, rd) + ldr_input_aligned_be(rin, ra, rb, rc, rd, rtmp) #define str_output_be(rout, ra, rb, rc, rd, rtmp0, rtmp1) \ - str_output_aligned_be(rout, ra, rb, rc, 
rd) + str_output_aligned_be(rout, ra, rb, rc, rd, rtmp0) #else /* need to handle unaligned reads/writes by byte reads */ #define ldr_input_be(rin, ra, rb, rc, rd, rtmp0) \ @@ -122,7 +134,7 @@ ldr_unaligned_be(rd, rin, 12, rtmp0); \ b 2f; \ 1:;\ - ldr_input_aligned_be(rin, ra, rb, rc, rd); \ + ldr_input_aligned_be(rin, ra, rb, rc, rd, rtmp0); \ 2:; #define str_output_be(rout, ra, rb, rc, rd, rtmp0, rtmp1) \ @@ -134,7 +146,7 @@ str_unaligned_be(rd, rout, 12, rtmp0, rtmp1); \ b 2f; \ 1:;\ - str_output_aligned_be(rout, ra, rb, rc, rd); \ + str_output_aligned_be(rout, ra, rb, rc, rd, rtmp0); \ 2:; #endif @@ -240,10 +252,10 @@ str_output_be(%r1, YL, YR, XL, XR, RT0, RT1); .align 3 -.global _gcry_camellia_armv6_encrypt_block -.type _gcry_camellia_armv6_encrypt_block,%function; +.global _gcry_camellia_arm_encrypt_block +.type _gcry_camellia_arm_encrypt_block,%function; -_gcry_camellia_armv6_encrypt_block: +_gcry_camellia_arm_encrypt_block: /* input: * %r0: keytable * %r1: dst @@ -285,13 +297,13 @@ _gcry_camellia_armv6_encrypt_block: pop {%r4-%r11, %ip, %pc}; .ltorg -.size _gcry_camellia_armv6_encrypt_block,.-_gcry_camellia_armv6_encrypt_block; +.size _gcry_camellia_arm_encrypt_block,.-_gcry_camellia_arm_encrypt_block; .align 3 -.global _gcry_camellia_armv6_decrypt_block -.type _gcry_camellia_armv6_decrypt_block,%function; +.global _gcry_camellia_arm_decrypt_block +.type _gcry_camellia_arm_decrypt_block,%function; -_gcry_camellia_armv6_decrypt_block: +_gcry_camellia_arm_decrypt_block: /* input: * %r0: keytable * %r1: dst @@ -330,7 +342,7 @@ _gcry_camellia_armv6_decrypt_block: b .Ldec_128; .ltorg -.size _gcry_camellia_armv6_decrypt_block,.-_gcry_camellia_armv6_decrypt_block; +.size _gcry_camellia_arm_decrypt_block,.-_gcry_camellia_arm_decrypt_block; .data diff --git a/cipher/camellia-glue.c b/cipher/camellia-glue.c index 29cb7a5..e6d4029 100644 --- a/cipher/camellia-glue.c +++ b/cipher/camellia-glue.c @@ -193,14 +193,14 @@ camellia_setkey(void *c, const byte *key, unsigned 
keylen) return 0; } -#ifdef USE_ARMV6_ASM +#ifdef USE_ARM_ASM /* Assembly implementations of CAST5. */ -extern void _gcry_camellia_armv6_encrypt_block(const KEY_TABLE_TYPE keyTable, +extern void _gcry_camellia_arm_encrypt_block(const KEY_TABLE_TYPE keyTable, byte *outbuf, const byte *inbuf, const int keybits); -extern void _gcry_camellia_armv6_decrypt_block(const KEY_TABLE_TYPE keyTable, +extern void _gcry_camellia_arm_decrypt_block(const KEY_TABLE_TYPE keyTable, byte *outbuf, const byte *inbuf, const int keybits); @@ -209,7 +209,7 @@ static void Camellia_EncryptBlock(const int keyBitLength, const KEY_TABLE_TYPE keyTable, unsigned char *cipherText) { - _gcry_camellia_armv6_encrypt_block(keyTable, cipherText, plaintext, + _gcry_camellia_arm_encrypt_block(keyTable, cipherText, plaintext, keyBitLength); } @@ -218,7 +218,7 @@ static void Camellia_DecryptBlock(const int keyBitLength, const KEY_TABLE_TYPE keyTable, unsigned char *plaintext) { - _gcry_camellia_armv6_decrypt_block(keyTable, plaintext, cipherText, + _gcry_camellia_arm_decrypt_block(keyTable, plaintext, cipherText, keyBitLength); } @@ -240,7 +240,7 @@ camellia_decrypt(void *c, byte *outbuf, const byte *inbuf) return /*burn_stack*/ (CAMELLIA_decrypt_stack_burn_size); } -#else /*USE_ARMV6_ASM*/ +#else /*USE_ARM_ASM*/ static unsigned int camellia_encrypt(void *c, byte *outbuf, const byte *inbuf) @@ -276,7 +276,7 @@ camellia_decrypt(void *c, byte *outbuf, const byte *inbuf) return /*burn_stack*/ (CAMELLIA_decrypt_stack_burn_size); } -#endif /*!USE_ARMV6_ASM*/ +#endif /*!USE_ARM_ASM*/ /* Bulk encryption of complete blocks in CTR mode. This function is only intended for the bulk encryption feature of cipher.c. 
CTR is expected to be diff --git a/cipher/camellia.c b/cipher/camellia.c index 03510a3..9067246 100644 --- a/cipher/camellia.c +++ b/cipher/camellia.c @@ -861,7 +861,7 @@ void camellia_setup192(const unsigned char *key, u32 *subkey) } -#ifndef USE_ARMV6_ASM +#ifndef USE_ARM_ASM /** * Stuff related to camellia encryption/decryption * @@ -1321,7 +1321,7 @@ void camellia_decrypt256(const u32 *subkey, u32 *blocks) return; } -#endif /*!USE_ARMV6_ASM*/ +#endif /*!USE_ARM_ASM*/ /*** @@ -1349,7 +1349,7 @@ void Camellia_Ekeygen(const int keyBitLength, } -#ifndef USE_ARMV6_ASM +#ifndef USE_ARM_ASM void Camellia_EncryptBlock(const int keyBitLength, const unsigned char *plaintext, const KEY_TABLE_TYPE keyTable, @@ -1410,4 +1410,4 @@ void Camellia_DecryptBlock(const int keyBitLength, PUTU32(plaintext + 8, tmp[2]); PUTU32(plaintext + 12, tmp[3]); } -#endif /*!USE_ARMV6_ASM*/ +#endif /*!USE_ARM_ASM*/ diff --git a/cipher/camellia.h b/cipher/camellia.h index 72f2d1f..d0e3c18 100644 --- a/cipher/camellia.h +++ b/cipher/camellia.h @@ -30,11 +30,11 @@ */ #ifdef HAVE_CONFIG_H #include -/* USE_ARMV6_ASM indicates whether to use ARMv6 assembly code. */ -# undef USE_ARMV6_ASM -# if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) +/* USE_ARM_ASM indicates whether to use ARM assembly code. 
*/ +# undef USE_ARM_ASM +# if defined(__ARMEL__) # ifdef HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS -# define USE_ARMV6_ASM 1 +# define USE_ARM_ASM 1 # endif # endif #endif @@ -70,7 +70,7 @@ void Camellia_Ekeygen(const int keyBitLength, const unsigned char *rawKey, KEY_TABLE_TYPE keyTable); -#ifndef USE_ARMV6_ASM +#ifndef USE_ARM_ASM void Camellia_EncryptBlock(const int keyBitLength, const unsigned char *plaintext, const KEY_TABLE_TYPE keyTable, diff --git a/cipher/cast5-armv6.S b/cipher/cast5-arm.S similarity index 81% rename from cipher/cast5-armv6.S rename to cipher/cast5-arm.S index 038fc4f..ce7fa93 100644 --- a/cipher/cast5-armv6.S +++ b/cipher/cast5-arm.S @@ -1,4 +1,4 @@ -/* cast5-armv6.S - ARM assembly implementation of CAST5 cipher +/* cast5-arm.S - ARM assembly implementation of CAST5 cipher * * Copyright © 2013 Jussi Kivilinna * @@ -20,7 +20,7 @@ #include -#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) +#if defined(__ARMEL__) #ifdef HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS .text @@ -99,20 +99,33 @@ #define str_unaligned_host str_unaligned_le /* bswap on little-endian */ - #define host_to_be(reg) \ +#ifdef HAVE_ARM_ARCH_V6 + #define host_to_be(reg, rtmp) \ rev reg, reg; - #define be_to_host(reg) \ + #define be_to_host(reg, rtmp) \ rev reg, reg; #else + #define host_to_be(reg, rtmp) \ + eor rtmp, reg, reg, ror #16; \ + mov rtmp, rtmp, lsr #8; \ + bic rtmp, rtmp, #65280; \ + eor reg, rtmp, reg, ror #8; + #define be_to_host(reg, rtmp) \ + eor rtmp, reg, reg, ror #16; \ + mov rtmp, rtmp, lsr #8; \ + bic rtmp, rtmp, #65280; \ + eor reg, rtmp, reg, ror #8; +#endif +#else #define ldr_unaligned_host ldr_unaligned_be #define str_unaligned_host str_unaligned_be /* nop on big-endian */ - #define host_to_be(reg) /*_*/ - #define be_to_host(reg) /*_*/ + #define host_to_be(reg, rtmp) /*_*/ + #define be_to_host(reg, rtmp) /*_*/ #endif -#define host_to_host(x) /*_*/ +#define host_to_host(x, y) /*_*/ /********************************************************************** 1-way
cast5 @@ -167,31 +180,31 @@ #define dec_round(n, Fx, rl, rr, loadkm, shiftkr, loadkr) \ Fx(n, rl, rr, 1, loadkm, shiftkr, loadkr) -#define read_block_aligned(rin, offs, l0, r0, convert) \ +#define read_block_aligned(rin, offs, l0, r0, convert, rtmp) \ ldr l0, [rin, #((offs) + 0)]; \ ldr r0, [rin, #((offs) + 4)]; \ - convert(l0); \ - convert(r0); + convert(l0, rtmp); \ + convert(r0, rtmp); -#define write_block_aligned(rout, offs, l0, r0, convert) \ - convert(l0); \ - convert(r0); \ +#define write_block_aligned(rout, offs, l0, r0, convert, rtmp) \ + convert(l0, rtmp); \ + convert(r0, rtmp); \ str l0, [rout, #((offs) + 0)]; \ str r0, [rout, #((offs) + 4)]; #ifdef __ARM_FEATURE_UNALIGNED /* unaligned word reads allowed */ #define read_block(rin, offs, l0, r0, rtmp0) \ - read_block_aligned(rin, offs, l0, r0, host_to_be) + read_block_aligned(rin, offs, l0, r0, host_to_be, rtmp0) #define write_block(rout, offs, r0, l0, rtmp0, rtmp1) \ - write_block_aligned(rout, offs, r0, l0, be_to_host) + write_block_aligned(rout, offs, r0, l0, be_to_host, rtmp0) #define read_block_host(rin, offs, l0, r0, rtmp0) \ - read_block_aligned(rin, offs, l0, r0, host_to_host) + read_block_aligned(rin, offs, l0, r0, host_to_host, rtmp0) #define write_block_host(rout, offs, r0, l0, rtmp0, rtmp1) \ - write_block_aligned(rout, offs, r0, l0, host_to_host) + write_block_aligned(rout, offs, r0, l0, host_to_host, rtmp0) #else /* need to handle unaligned reads by byte reads */ #define read_block(rin, offs, l0, r0, rtmp0) \ @@ -201,7 +214,7 @@ ldr_unaligned_be(r0, rin, (offs) + 4, rtmp0); \ b 2f; \ 1:;\ - read_block_aligned(rin, offs, l0, r0, host_to_be); \ + read_block_aligned(rin, offs, l0, r0, host_to_be, rtmp0); \ 2:; #define write_block(rout, offs, l0, r0, rtmp0, rtmp1) \ @@ -211,7 +224,7 @@ str_unaligned_be(r0, rout, (offs) + 4, rtmp0, rtmp1); \ b 2f; \ 1:;\ - write_block_aligned(rout, offs, l0, r0, be_to_host); \ + write_block_aligned(rout, offs, l0, r0, be_to_host, rtmp0); \ 2:; #define 
read_block_host(rin, offs, l0, r0, rtmp0) \ @@ -221,7 +234,7 @@ ldr_unaligned_host(r0, rin, (offs) + 4, rtmp0); \ b 2f; \ 1:;\ - read_block_aligned(rin, offs, l0, r0, host_to_host); \ + read_block_aligned(rin, offs, l0, r0, host_to_host, rtmp0); \ 2:; #define write_block_host(rout, offs, l0, r0, rtmp0, rtmp1) \ @@ -231,15 +244,15 @@ str_unaligned_host(r0, rout, (offs) + 4, rtmp0, rtmp1); \ b 2f; \ 1:;\ - write_block_aligned(rout, offs, l0, r0, host_to_host); \ + write_block_aligned(rout, offs, l0, r0, host_to_host, rtmp0); \ 2:; #endif .align 3 -.globl _gcry_cast5_armv6_encrypt_block -.type _gcry_cast5_armv6_encrypt_block,%function; +.globl _gcry_cast5_arm_encrypt_block +.type _gcry_cast5_arm_encrypt_block,%function; -_gcry_cast5_armv6_encrypt_block: +_gcry_cast5_arm_encrypt_block: /* input: * %r0: CTX * %r1: dst @@ -279,13 +292,13 @@ _gcry_cast5_armv6_encrypt_block: pop {%r4-%r11, %ip, %pc}; .ltorg -.size _gcry_cast5_armv6_encrypt_block,.-_gcry_cast5_armv6_encrypt_block; +.size _gcry_cast5_arm_encrypt_block,.-_gcry_cast5_arm_encrypt_block; .align 3 -.globl _gcry_cast5_armv6_decrypt_block -.type _gcry_cast5_armv6_decrypt_block,%function; +.globl _gcry_cast5_arm_decrypt_block +.type _gcry_cast5_arm_decrypt_block,%function; -_gcry_cast5_armv6_decrypt_block: +_gcry_cast5_arm_decrypt_block: /* input: * %r0: CTX * %r1: dst @@ -325,7 +338,7 @@ _gcry_cast5_armv6_decrypt_block: pop {%r4-%r11, %ip, %pc}; .ltorg -.size _gcry_cast5_armv6_decrypt_block,.-_gcry_cast5_armv6_decrypt_block; +.size _gcry_cast5_arm_decrypt_block,.-_gcry_cast5_arm_decrypt_block; /********************************************************************** 2-way cast5 @@ -391,22 +404,22 @@ _gcry_cast5_armv6_decrypt_block: #define dec_round2(n, Fx, rl, rr, loadkm, shiftkr, loadkr) \ Fx##_2w(n, rl##0, rr##0, rl##1, rr##1, 1, loadkm, shiftkr, loadkr) -#define read_block2_aligned(rin, l0, r0, l1, r1, convert) \ +#define read_block2_aligned(rin, l0, r0, l1, r1, convert, rtmp) \ ldr l0, [rin, #(0)]; \ ldr r0, 
[rin, #(4)]; \ - convert(l0); \ + convert(l0, rtmp); \ ldr l1, [rin, #(8)]; \ - convert(r0); \ + convert(r0, rtmp); \ ldr r1, [rin, #(12)]; \ - convert(l1); \ - convert(r1); + convert(l1, rtmp); \ + convert(r1, rtmp); -#define write_block2_aligned(rout, l0, r0, l1, r1, convert) \ - convert(l0); \ - convert(r0); \ - convert(l1); \ +#define write_block2_aligned(rout, l0, r0, l1, r1, convert, rtmp) \ + convert(l0, rtmp); \ + convert(r0, rtmp); \ + convert(l1, rtmp); \ str l0, [rout, #(0)]; \ - convert(r1); \ + convert(r1, rtmp); \ str r0, [rout, #(4)]; \ str l1, [rout, #(8)]; \ str r1, [rout, #(12)]; @@ -414,16 +427,16 @@ _gcry_cast5_armv6_decrypt_block: #ifdef __ARM_FEATURE_UNALIGNED /* unaligned word reads allowed */ #define read_block2(rin, l0, r0, l1, r1, rtmp0) \ - read_block2_aligned(rin, l0, r0, l1, r1, host_to_be) + read_block2_aligned(rin, l0, r0, l1, r1, host_to_be, rtmp0) #define write_block2(rout, l0, r0, l1, r1, rtmp0, rtmp1) \ - write_block2_aligned(rout, l0, r0, l1, r1, be_to_host) + write_block2_aligned(rout, l0, r0, l1, r1, be_to_host, rtmp0) #define read_block2_host(rin, l0, r0, l1, r1, rtmp0) \ - read_block2_aligned(rin, l0, r0, l1, r1, host_to_host) + read_block2_aligned(rin, l0, r0, l1, r1, host_to_host, rtmp0) #define write_block2_host(rout, l0, r0, l1, r1, rtmp0, rtmp1) \ - write_block2_aligned(rout, l0, r0, l1, r1, host_to_host) + write_block2_aligned(rout, l0, r0, l1, r1, host_to_host, rtmp0) #else /* need to handle unaligned reads by byte reads */ #define read_block2(rin, l0, r0, l1, r1, rtmp0) \ @@ -435,7 +448,7 @@ _gcry_cast5_armv6_decrypt_block: ldr_unaligned_be(r1, rin, 12, rtmp0); \ b 2f; \ 1:;\ - read_block2_aligned(rin, l0, r0, l1, r1, host_to_be); \ + read_block2_aligned(rin, l0, r0, l1, r1, host_to_be, rtmp0); \ 2:; #define write_block2(rout, l0, r0, l1, r1, rtmp0, rtmp1) \ @@ -447,7 +460,7 @@ _gcry_cast5_armv6_decrypt_block: str_unaligned_be(r1, rout, 12, rtmp0, rtmp1); \ b 2f; \ 1:;\ - write_block2_aligned(rout, l0, r0, l1, r1, 
be_to_host); \ + write_block2_aligned(rout, l0, r0, l1, r1, be_to_host, rtmp0); \ 2:; #define read_block2_host(rin, l0, r0, l1, r1, rtmp0) \ @@ -459,7 +472,7 @@ _gcry_cast5_armv6_decrypt_block: ldr_unaligned_host(r1, rin, 12, rtmp0); \ b 2f; \ 1:;\ - read_block2_aligned(rin, l0, r0, l1, r1, host_to_host); \ + read_block2_aligned(rin, l0, r0, l1, r1, host_to_host, rtmp0); \ 2:; #define write_block2_host(rout, l0, r0, l1, r1, rtmp0, rtmp1) \ @@ -471,14 +484,14 @@ _gcry_cast5_armv6_decrypt_block: str_unaligned_host(r1, rout, 12, rtmp0, rtmp1); \ b 2f; \ 1:;\ - write_block2_aligned(rout, l0, r0, l1, r1, host_to_host); \ + write_block2_aligned(rout, l0, r0, l1, r1, host_to_host, rtmp0); \ 2:; #endif .align 3 -.type _gcry_cast5_armv6_enc_blk2,%function; +.type _gcry_cast5_arm_enc_blk2,%function; -_gcry_cast5_armv6_enc_blk2: +_gcry_cast5_arm_enc_blk2: /* input: * preloaded: CTX * [RL0, RR0], [RL1, RR1]: src @@ -510,20 +523,20 @@ _gcry_cast5_armv6_enc_blk2: enc_round2(14, F3, RL, RR, load_km, shift_kr, dummy); enc_round2(15, F1, RR, RL, dummy, dummy, dummy); - host_to_be(RR0); - host_to_be(RL0); - host_to_be(RR1); - host_to_be(RL1); + host_to_be(RR0, RT0); + host_to_be(RL0, RT0); + host_to_be(RR1, RT0); + host_to_be(RL1, RT0); pop {%pc}; .ltorg -.size _gcry_cast5_armv6_enc_blk2,.-_gcry_cast5_armv6_enc_blk2; +.size _gcry_cast5_arm_enc_blk2,.-_gcry_cast5_arm_enc_blk2; .align 3 -.globl _gcry_cast5_armv6_cfb_dec; -.type _gcry_cast5_armv6_cfb_dec,%function; +.globl _gcry_cast5_arm_cfb_dec; +.type _gcry_cast5_arm_cfb_dec,%function; -_gcry_cast5_armv6_cfb_dec: +_gcry_cast5_arm_cfb_dec: /* input: * %r0: CTX * %r1: dst (2 blocks) @@ -536,15 +549,15 @@ _gcry_cast5_armv6_cfb_dec: /* Load input (iv/%r3 is aligned, src/%r2 might not be) */ ldm %r3, {RL0, RR0}; - host_to_be(RL0); - host_to_be(RR0); + host_to_be(RL0, RT1); + host_to_be(RR0, RT1); read_block(%r2, 0, RL1, RR1, %ip); /* Update IV, load src[1] and save to iv[0] */ read_block_host(%r2, 8, %r5, %r6, %r7); stm %lr, {%r5, %r6}; 
- bl _gcry_cast5_armv6_enc_blk2; + bl _gcry_cast5_arm_enc_blk2; /* result in RR0:RL0, RR1:RL1 = %r4:%r3, %r10:%r9 */ /* %r0: dst, %r1: %src */ @@ -560,13 +573,13 @@ _gcry_cast5_armv6_cfb_dec: pop {%r4-%r11, %ip, %pc}; .ltorg -.size _gcry_cast5_armv6_cfb_dec,.-_gcry_cast5_armv6_cfb_dec; +.size _gcry_cast5_arm_cfb_dec,.-_gcry_cast5_arm_cfb_dec; .align 3 -.globl _gcry_cast5_armv6_ctr_enc; -.type _gcry_cast5_armv6_ctr_enc,%function; +.globl _gcry_cast5_arm_ctr_enc; +.type _gcry_cast5_arm_ctr_enc,%function; -_gcry_cast5_armv6_ctr_enc: +_gcry_cast5_arm_ctr_enc: /* input: * %r0: CTX * %r1: dst (2 blocks) @@ -578,7 +591,7 @@ _gcry_cast5_armv6_ctr_enc: mov %lr, %r3; /* Load IV (big => host endian) */ - read_block_aligned(%lr, 0, RL0, RR0, be_to_host); + read_block_aligned(%lr, 0, RL0, RR0, be_to_host, RT1); /* Construct IVs */ adds RR1, RR0, #1; /* +1 */ @@ -587,9 +600,9 @@ _gcry_cast5_armv6_ctr_enc: adc %r5, RL1, #0; /* Store new IV (host => big-endian) */ - write_block_aligned(%lr, 0, %r5, %r6, host_to_be); + write_block_aligned(%lr, 0, %r5, %r6, host_to_be, RT1); - bl _gcry_cast5_armv6_enc_blk2; + bl _gcry_cast5_arm_enc_blk2; /* result in RR0:RL0, RR1:RL1 = %r4:%r3, %r10:%r9 */ /* %r0: dst, %r1: %src */ @@ -605,12 +618,12 @@ _gcry_cast5_armv6_ctr_enc: pop {%r4-%r11, %ip, %pc}; .ltorg -.size _gcry_cast5_armv6_ctr_enc,.-_gcry_cast5_armv6_ctr_enc; +.size _gcry_cast5_arm_ctr_enc,.-_gcry_cast5_arm_ctr_enc; .align 3 -.type _gcry_cast5_armv6_dec_blk2,%function; +.type _gcry_cast5_arm_dec_blk2,%function; -_gcry_cast5_armv6_dec_blk2: +_gcry_cast5_arm_dec_blk2: /* input: * preloaded: CTX * [RL0, RR0], [RL1, RR1]: src @@ -641,20 +654,20 @@ _gcry_cast5_armv6_dec_blk2: dec_round2(1, F2, RL, RR, load_km, shift_kr, dummy); dec_round2(0, F1, RR, RL, dummy, dummy, dummy); - host_to_be(RR0); - host_to_be(RL0); - host_to_be(RR1); - host_to_be(RL1); + host_to_be(RR0, RT0); + host_to_be(RL0, RT0); + host_to_be(RR1, RT0); + host_to_be(RL1, RT0); b .Ldec_cbc_tail; .ltorg -.size 
_gcry_cast5_armv6_dec_blk2,.-_gcry_cast5_armv6_dec_blk2; +.size _gcry_cast5_arm_dec_blk2,.-_gcry_cast5_arm_dec_blk2; .align 3 -.globl _gcry_cast5_armv6_cbc_dec; -.type _gcry_cast5_armv6_cbc_dec,%function; +.globl _gcry_cast5_arm_cbc_dec; +.type _gcry_cast5_arm_cbc_dec,%function; -_gcry_cast5_armv6_cbc_dec: +_gcry_cast5_arm_cbc_dec: /* input: * %r0: CTX * %r1: dst (2 blocks) @@ -667,7 +680,7 @@ _gcry_cast5_armv6_cbc_dec: /* dec_blk2 is only used by cbc_dec, jump directly in/out instead * of function call. */ - b _gcry_cast5_armv6_dec_blk2; + b _gcry_cast5_arm_dec_blk2; .Ldec_cbc_tail: /* result in RR0:RL0, RR1:RL1 = %r4:%r3, %r10:%r9 */ @@ -696,7 +709,7 @@ _gcry_cast5_armv6_cbc_dec: pop {%r4-%r11, %ip, %pc}; .ltorg -.size _gcry_cast5_armv6_cbc_dec,.-_gcry_cast5_armv6_cbc_dec; +.size _gcry_cast5_arm_cbc_dec,.-_gcry_cast5_arm_cbc_dec; #endif /*HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS*/ #endif /*__ARM_ARCH >= 6*/ diff --git a/cipher/cast5.c b/cipher/cast5.c index 92d9af8..8c016d7 100644 --- a/cipher/cast5.c +++ b/cipher/cast5.c @@ -52,11 +52,11 @@ # define USE_AMD64_ASM 1 #endif -/* USE_ARMV6_ASM indicates whether to use ARMv6 assembly code. */ -#undef USE_ARMV6_ASM -#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) +/* USE_ARM_ASM indicates whether to use ARM assembly code. */ +#undef USE_ARM_ASM +#if defined(__ARMEL__) # ifdef HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS -# define USE_ARMV6_ASM 1 +# define USE_ARM_ASM 1 # endif #endif @@ -65,7 +65,7 @@ typedef struct { u32 Km[16]; byte Kr[16]; -#ifdef USE_ARMV6_ASM +#ifdef USE_ARM_ASM u32 Kr_arm_enc[16 / sizeof(u32)]; u32 Kr_arm_dec[16 / sizeof(u32)]; #endif @@ -400,35 +400,35 @@ decrypt_block (void *context, byte *outbuf, const byte *inbuf) return /*burn_stack*/ (2*8); } -#elif defined(USE_ARMV6_ASM) +#elif defined(USE_ARM_ASM) -/* ARMv6 assembly implementations of CAST5. */ -extern void _gcry_cast5_armv6_encrypt_block(CAST5_context *c, byte *outbuf, +/* ARM assembly implementations of CAST5. 
*/ +extern void _gcry_cast5_arm_encrypt_block(CAST5_context *c, byte *outbuf, const byte *inbuf); -extern void _gcry_cast5_armv6_decrypt_block(CAST5_context *c, byte *outbuf, +extern void _gcry_cast5_arm_decrypt_block(CAST5_context *c, byte *outbuf, const byte *inbuf); /* These assembly implementations process two blocks in parallel. */ -extern void _gcry_cast5_armv6_ctr_enc(CAST5_context *ctx, byte *out, +extern void _gcry_cast5_arm_ctr_enc(CAST5_context *ctx, byte *out, const byte *in, byte *ctr); -extern void _gcry_cast5_armv6_cbc_dec(CAST5_context *ctx, byte *out, +extern void _gcry_cast5_arm_cbc_dec(CAST5_context *ctx, byte *out, const byte *in, byte *iv); -extern void _gcry_cast5_armv6_cfb_dec(CAST5_context *ctx, byte *out, +extern void _gcry_cast5_arm_cfb_dec(CAST5_context *ctx, byte *out, const byte *in, byte *iv); static void do_encrypt_block (CAST5_context *context, byte *outbuf, const byte *inbuf) { - _gcry_cast5_armv6_encrypt_block (context, outbuf, inbuf); + _gcry_cast5_arm_encrypt_block (context, outbuf, inbuf); } static void do_decrypt_block (CAST5_context *context, byte *outbuf, const byte *inbuf) { - _gcry_cast5_armv6_decrypt_block (context, outbuf, inbuf); + _gcry_cast5_arm_decrypt_block (context, outbuf, inbuf); } static unsigned int @@ -447,7 +447,7 @@ decrypt_block (void *context, byte *outbuf, const byte *inbuf) return /*burn_stack*/ (10*4); } -#else /*USE_ARMV6_ASM*/ +#else /*USE_ARM_ASM*/ #define F1(D,m,r) ( (I = ((m) + (D))), (I=rol(I,(r))), \ (((s1[I >> 24] ^ s2[(I>>16)&0xff]) - s3[(I>>8)&0xff]) + s4[I&0xff]) ) @@ -556,7 +556,7 @@ decrypt_block (void *context, byte *outbuf, const byte *inbuf) return /*burn_stack*/ (20+4*sizeof(void*)); } -#endif /*!USE_ARMV6_ASM*/ +#endif /*!USE_ARM_ASM*/ /* Bulk encryption of complete blocks in CTR mode. This function is only @@ -592,12 +592,12 @@ _gcry_cast5_ctr_enc(void *context, unsigned char *ctr, void *outbuf_arg, /* Use generic code to handle smaller chunks... */ /* TODO: use caching instead? 
*/ } -#elif defined(USE_ARMV6_ASM) +#elif defined(USE_ARM_ASM) { /* Process data in 2 block chunks. */ while (nblocks >= 2) { - _gcry_cast5_armv6_ctr_enc(ctx, outbuf, inbuf, ctr); + _gcry_cast5_arm_ctr_enc(ctx, outbuf, inbuf, ctr); nblocks -= 2; outbuf += 2 * CAST5_BLOCKSIZE; @@ -660,12 +660,12 @@ _gcry_cast5_cbc_dec(void *context, unsigned char *iv, void *outbuf_arg, /* Use generic code to handle smaller chunks... */ } -#elif defined(USE_ARMV6_ASM) +#elif defined(USE_ARM_ASM) { /* Process data in 2 block chunks. */ while (nblocks >= 2) { - _gcry_cast5_armv6_cbc_dec(ctx, outbuf, inbuf, iv); + _gcry_cast5_arm_cbc_dec(ctx, outbuf, inbuf, iv); nblocks -= 2; outbuf += 2 * CAST5_BLOCKSIZE; @@ -722,12 +722,12 @@ _gcry_cast5_cfb_dec(void *context, unsigned char *iv, void *outbuf_arg, /* Use generic code to handle smaller chunks... */ } -#elif defined(USE_ARMV6_ASM) +#elif defined(USE_ARM_ASM) { /* Process data in 2 block chunks. */ while (nblocks >= 2) { - _gcry_cast5_armv6_cfb_dec(ctx, outbuf, inbuf, iv); + _gcry_cast5_arm_cfb_dec(ctx, outbuf, inbuf, iv); nblocks -= 2; outbuf += 2 * CAST5_BLOCKSIZE; @@ -936,7 +936,7 @@ do_cast_setkey( CAST5_context *c, const byte *key, unsigned keylen ) for(i=0; i < 16; i++ ) c->Kr[i] = k[i] & 0x1f; -#ifdef USE_ARMV6_ASM +#ifdef USE_ARM_ASM for (i = 0; i < 4; i++) { byte Kr_arm[4]; diff --git a/cipher/rijndael-armv6.S b/cipher/rijndael-arm.S similarity index 98% rename from cipher/rijndael-armv6.S rename to cipher/rijndael-arm.S index bbbfb0e..2a747bf 100644 --- a/cipher/rijndael-armv6.S +++ b/cipher/rijndael-arm.S @@ -1,4 +1,4 @@ -/* rijndael-armv6.S - ARM assembly implementation of AES cipher +/* rijndael-arm.S - ARM assembly implementation of AES cipher * * Copyright ©
2013 Jussi Kivilinna * @@ -20,7 +20,7 @@ #include -#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) +#if defined(__ARMEL__) #ifdef HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS .text @@ -211,10 +211,10 @@ addroundkey(rna, rnb, rnc, rnd, ra, rb, rc, rd, dummy); .align 3 -.global _gcry_aes_armv6_encrypt_block -.type _gcry_aes_armv6_encrypt_block,%function; +.global _gcry_aes_arm_encrypt_block +.type _gcry_aes_arm_encrypt_block,%function; -_gcry_aes_armv6_encrypt_block: +_gcry_aes_arm_encrypt_block: /* input: * %r0: keysched, CTX * %r1: dst @@ -324,7 +324,7 @@ _gcry_aes_armv6_encrypt_block: lastencround(11, RNA, RNB, RNC, RND, RA, RB, RC, RD); b .Lenc_done; -.size _gcry_aes_armv6_encrypt_block,.-_gcry_aes_armv6_encrypt_block; +.size _gcry_aes_arm_encrypt_block,.-_gcry_aes_arm_encrypt_block; #define addroundkey_dec(round, ra, rb, rc, rd, rna, rnb, rnc, rnd) \ ldr rna, [CTX, #(((round) * 16) + 0 * 4)]; \ @@ -465,10 +465,10 @@ _gcry_aes_armv6_encrypt_block: addroundkey(rna, rnb, rnc, rnd, ra, rb, rc, rd, dummy); .align 3 -.global _gcry_aes_armv6_decrypt_block -.type _gcry_aes_armv6_decrypt_block,%function; +.global _gcry_aes_arm_decrypt_block +.type _gcry_aes_arm_decrypt_block,%function; -_gcry_aes_armv6_decrypt_block: +_gcry_aes_arm_decrypt_block: /* input: * %r0: keysched, CTX * %r1: dst @@ -573,7 +573,7 @@ _gcry_aes_armv6_decrypt_block: decround(9, RA, RB, RC, RD, RNA, RNB, RNC, RND, preload_first_key); b .Ldec_tail; -.size _gcry_aes_armv6_encrypt_block,.-_gcry_aes_armv6_encrypt_block; +.size _gcry_aes_arm_encrypt_block,.-_gcry_aes_arm_encrypt_block; .data @@ -850,4 +850,4 @@ _gcry_aes_armv6_decrypt_block: .long 0x745c6c48, 0x0000000c, 0x4257b8d0, 0x0000007d #endif /*HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS*/ -#endif /*__ARM_ARCH >= 6*/ +#endif /*__ARMEL__ */ diff --git a/cipher/rijndael.c b/cipher/rijndael.c index 85c1a41..e9bb4f6 100644 --- a/cipher/rijndael.c +++ b/cipher/rijndael.c @@ -67,11 +67,11 @@ # define USE_AMD64_ASM 1 #endif -/* USE_ARMV6_ASM indicates whether to 
use ARMv6 assembly code. */ -#undef USE_ARMV6_ASM -#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) +/* USE_ARM_ASM indicates whether to use ARM assembly code. */ +#undef USE_ARM_ASM +#if defined(__ARMEL__) # ifdef HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS -# define USE_ARMV6_ASM 1 +# define USE_ARM_ASM 1 # endif #endif @@ -123,18 +123,18 @@ extern void _gcry_aes_amd64_decrypt_block(const void *keysched_dec, int rounds); #endif /*USE_AMD64_ASM*/ -#ifdef USE_ARMV6_ASM -/* ARMv6 assembly implementations of AES */ -extern void _gcry_aes_armv6_encrypt_block(const void *keysched_enc, +#ifdef USE_ARM_ASM +/* ARM assembly implementations of AES */ +extern void _gcry_aes_arm_encrypt_block(const void *keysched_enc, unsigned char *out, const unsigned char *in, int rounds); -extern void _gcry_aes_armv6_decrypt_block(const void *keysched_dec, +extern void _gcry_aes_arm_decrypt_block(const void *keysched_dec, unsigned char *out, const unsigned char *in, int rounds); -#endif /*USE_ARMV6_ASM*/ +#endif /*USE_ARM_ASM*/ @@ -567,8 +567,8 @@ do_encrypt_aligned (const RIJNDAEL_context *ctx, { #ifdef USE_AMD64_ASM _gcry_aes_amd64_encrypt_block(ctx->keyschenc, b, a, ctx->rounds); -#elif defined(USE_ARMV6_ASM) - _gcry_aes_armv6_encrypt_block(ctx->keyschenc, b, a, ctx->rounds); +#elif defined(USE_ARM_ASM) + _gcry_aes_arm_encrypt_block(ctx->keyschenc, b, a, ctx->rounds); #else #define rk (ctx->keyschenc) int rounds = ctx->rounds; @@ -651,7 +651,7 @@ do_encrypt_aligned (const RIJNDAEL_context *ctx, *((u32_a_t*)(b+ 8)) ^= *((u32_a_t*)rk[rounds][2]); *((u32_a_t*)(b+12)) ^= *((u32_a_t*)rk[rounds][3]); #undef rk -#endif /*!USE_AMD64_ASM && !USE_ARMV6_ASM*/ +#endif /*!USE_AMD64_ASM && !USE_ARM_ASM*/ } @@ -659,7 +659,7 @@ static void do_encrypt (const RIJNDAEL_context *ctx, unsigned char *bx, const unsigned char *ax) { -#if !defined(USE_AMD64_ASM) && !defined(USE_ARMV6_ASM) +#if !defined(USE_AMD64_ASM) && !defined(USE_ARM_ASM) /* BX and AX are not necessary correctly aligned. 
Thus we might need to copy them here. We try to align to a 16 bytes. */ if (((size_t)ax & 0x0f) || ((size_t)bx & 0x0f)) @@ -680,7 +680,7 @@ do_encrypt (const RIJNDAEL_context *ctx, memcpy (bx, b.b, 16); } else -#endif /*!USE_AMD64_ASM && !USE_ARMV6_ASM*/ +#endif /*!USE_AMD64_ASM && !USE_ARM_ASM*/ { do_encrypt_aligned (ctx, bx, ax); } @@ -1694,8 +1694,8 @@ do_decrypt_aligned (RIJNDAEL_context *ctx, { #ifdef USE_AMD64_ASM _gcry_aes_amd64_decrypt_block(ctx->keyschdec, b, a, ctx->rounds); -#elif defined(USE_ARMV6_ASM) - _gcry_aes_armv6_decrypt_block(ctx->keyschdec, b, a, ctx->rounds); +#elif defined(USE_ARM_ASM) + _gcry_aes_arm_decrypt_block(ctx->keyschdec, b, a, ctx->rounds); #else #define rk (ctx->keyschdec) int rounds = ctx->rounds; @@ -1779,7 +1779,7 @@ do_decrypt_aligned (RIJNDAEL_context *ctx, *((u32_a_t*)(b+ 8)) ^= *((u32_a_t*)rk[0][2]); *((u32_a_t*)(b+12)) ^= *((u32_a_t*)rk[0][3]); #undef rk -#endif /*!USE_AMD64_ASM && !USE_ARMV6_ASM*/ +#endif /*!USE_AMD64_ASM && !USE_ARM_ASM*/ } @@ -1794,7 +1794,7 @@ do_decrypt (RIJNDAEL_context *ctx, byte *bx, const byte *ax) ctx->decryption_prepared = 1; } -#if !defined(USE_AMD64_ASM) && !defined(USE_ARMV6_ASM) +#if !defined(USE_AMD64_ASM) && !defined(USE_ARM_ASM) /* BX and AX are not necessary correctly aligned. Thus we might need to copy them here. We try to align to a 16 bytes. 
*/ if (((size_t)ax & 0x0f) || ((size_t)bx & 0x0f)) @@ -1815,7 +1815,7 @@ do_decrypt (RIJNDAEL_context *ctx, byte *bx, const byte *ax) memcpy (bx, b.b, 16); } else -#endif /*!USE_AMD64_ASM && !USE_ARMV6_ASM*/ +#endif /*!USE_AMD64_ASM && !USE_ARM_ASM*/ { do_decrypt_aligned (ctx, bx, ax); } diff --git a/cipher/twofish-armv6.S b/cipher/twofish-arm.S similarity index 92% rename from cipher/twofish-armv6.S rename to cipher/twofish-arm.S index b76ab37..ee22f56 100644 --- a/cipher/twofish-armv6.S +++ b/cipher/twofish-arm.S @@ -1,4 +1,4 @@ -/* twofish-armv6.S - ARM assembly implementation of Twofish cipher +/* twofish-arm.S - ARM assembly implementation of Twofish cipher * * Copyright © 2013 Jussi Kivilinna * @@ -20,7 +20,7 @@ #include -#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) +#if defined(__ARMEL__) #ifdef HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS .text @@ -257,10 +257,10 @@ ror1(RD); .align 3 -.global _gcry_twofish_armv6_encrypt_block -.type _gcry_twofish_armv6_encrypt_block,%function; +.global _gcry_twofish_arm_encrypt_block +.type _gcry_twofish_arm_encrypt_block,%function; -_gcry_twofish_armv6_encrypt_block: +_gcry_twofish_arm_encrypt_block: /* input: * %r0: ctx * %r1: dst @@ -303,16 +303,15 @@ _gcry_twofish_armv6_encrypt_block: str_output_le(%r1, RC, RD, RA, RB, RT0, RT1); - pop {%r4-%r11, %ip, %lr}; - bx %lr; + pop {%r4-%r11, %ip, %pc}; .ltorg -.size _gcry_twofish_armv6_encrypt_block,.-_gcry_twofish_armv6_encrypt_block; +.size _gcry_twofish_arm_encrypt_block,.-_gcry_twofish_arm_encrypt_block; .align 3 -.global _gcry_twofish_armv6_decrypt_block -.type _gcry_twofish_armv6_decrypt_block,%function; +.global _gcry_twofish_arm_decrypt_block +.type _gcry_twofish_arm_decrypt_block,%function; -_gcry_twofish_armv6_decrypt_block: +_gcry_twofish_arm_decrypt_block: /* input: * %r0: ctx * %r1: dst @@ -357,9 +356,8 @@ _gcry_twofish_armv6_decrypt_block: str_output_le(%r1, RA, RB, RC, RD, RT0, RT1); - pop {%r4-%r11, %ip, %lr}; - bx %lr; -.size
_gcry_twofish_armv6_decrypt_block,.-_gcry_twofish_armv6_decrypt_block; + pop {%r4-%r11, %ip, %pc}; +.size _gcry_twofish_arm_decrypt_block,.-_gcry_twofish_arm_decrypt_block; #endif /*HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS*/ -#endif /*__ARM_ARCH >= 6*/ +#endif /*__ARMEL__*/ diff --git a/cipher/twofish.c b/cipher/twofish.c index d2cabbe..086df76 100644 --- a/cipher/twofish.c +++ b/cipher/twofish.c @@ -57,11 +57,11 @@ # define USE_AMD64_ASM 1 #endif -/* USE_ARMV6_ASM indicates whether to use ARMv6 assembly code. */ -#undef USE_ARMV6_ASM -#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) +/* USE_ARM_ASM indicates whether to use ARM assembly code. */ +#undef USE_ARM_ASM +#if defined(__ARMEL__) # if defined(HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS) -# define USE_ARMV6_ASM 1 +# define USE_ARM_ASM 1 # endif #endif @@ -754,16 +754,16 @@ extern void _gcry_twofish_amd64_cbc_dec(const TWOFISH_context *c, byte *out, extern void _gcry_twofish_amd64_cfb_dec(const TWOFISH_context *c, byte *out, const byte *in, byte *iv); -#elif defined(USE_ARMV6_ASM) +#elif defined(USE_ARM_ASM) /* Assembly implementations of Twofish. */ -extern void _gcry_twofish_armv6_encrypt_block(const TWOFISH_context *c, +extern void _gcry_twofish_arm_encrypt_block(const TWOFISH_context *c, byte *out, const byte *in); -extern void _gcry_twofish_armv6_decrypt_block(const TWOFISH_context *c, +extern void _gcry_twofish_arm_decrypt_block(const TWOFISH_context *c, byte *out, const byte *in); -#else /*!USE_AMD64_ASM && !USE_ARMV6_ASM*/ +#else /*!USE_AMD64_ASM && !USE_ARM_ASM*/ /* Macros to compute the g() function in the encryption and decryption * rounds. 
G1 is the straight g() function; G2 includes the 8-bit @@ -837,17 +837,17 @@ twofish_encrypt (void *context, byte *out, const byte *in) return /*burn_stack*/ (4*sizeof (void*)); } -#elif defined(USE_ARMV6_ASM) +#elif defined(USE_ARM_ASM) static unsigned int twofish_encrypt (void *context, byte *out, const byte *in) { TWOFISH_context *ctx = context; - _gcry_twofish_armv6_encrypt_block(ctx, out, in); + _gcry_twofish_arm_encrypt_block(ctx, out, in); return /*burn_stack*/ (4*sizeof (void*)); } -#else /*!USE_AMD64_ASM && !USE_ARMV6_ASM*/ +#else /*!USE_AMD64_ASM && !USE_ARM_ASM*/ static void do_twofish_encrypt (const TWOFISH_context *ctx, byte *out, const byte *in) @@ -889,7 +889,7 @@ twofish_encrypt (void *context, byte *out, const byte *in) return /*burn_stack*/ (24+3*sizeof (void*)); } -#endif /*!USE_AMD64_ASM && !USE_ARMV6_ASM*/ +#endif /*!USE_AMD64_ASM && !USE_ARM_ASM*/ /* Decrypt one block. in and out may be the same. */ @@ -904,17 +904,17 @@ twofish_decrypt (void *context, byte *out, const byte *in) return /*burn_stack*/ (4*sizeof (void*)); } -#elif defined(USE_ARMV6_ASM) +#elif defined(USE_ARM_ASM) static unsigned int twofish_decrypt (void *context, byte *out, const byte *in) { TWOFISH_context *ctx = context; - _gcry_twofish_armv6_decrypt_block(ctx, out, in); + _gcry_twofish_arm_decrypt_block(ctx, out, in); return /*burn_stack*/ (4*sizeof (void*)); } -#else /*!USE_AMD64_ASM && !USE_ARMV6_ASM*/ +#else /*!USE_AMD64_ASM && !USE_ARM_ASM*/ static void do_twofish_decrypt (const TWOFISH_context *ctx, byte *out, const byte *in) @@ -957,7 +957,7 @@ twofish_decrypt (void *context, byte *out, const byte *in) return /*burn_stack*/ (24+3*sizeof (void*)); } -#endif /*!USE_AMD64_ASM && !USE_ARMV6_ASM*/ +#endif /*!USE_AMD64_ASM && !USE_ARM_ASM*/ diff --git a/configure.ac b/configure.ac index a1ffdc7..5b7ba0d 100644 --- a/configure.ac +++ b/configure.ac @@ -1424,7 +1424,7 @@ if test "$found" = "1" ; then ;; arm*-*-*) # Build with the assembly implementation - 
GCRYPT_CIPHERS="$GCRYPT_CIPHERS blowfish-armv6.lo" + GCRYPT_CIPHERS="$GCRYPT_CIPHERS blowfish-arm.lo" ;; esac fi @@ -1441,7 +1441,7 @@ if test "$found" = "1" ; then ;; arm*-*-*) # Build with the assembly implementation - GCRYPT_CIPHERS="$GCRYPT_CIPHERS cast5-armv6.lo" + GCRYPT_CIPHERS="$GCRYPT_CIPHERS cast5-arm.lo" ;; esac fi @@ -1464,7 +1464,7 @@ if test "$found" = "1" ; then ;; arm*-*-*) # Build with the assembly implementation - GCRYPT_CIPHERS="$GCRYPT_CIPHERS rijndael-armv6.lo" + GCRYPT_CIPHERS="$GCRYPT_CIPHERS rijndael-arm.lo" ;; esac fi @@ -1481,7 +1481,7 @@ if test "$found" = "1" ; then ;; arm*-*-*) # Build with the assembly implementation - GCRYPT_CIPHERS="$GCRYPT_CIPHERS twofish-armv6.lo" + GCRYPT_CIPHERS="$GCRYPT_CIPHERS twofish-arm.lo" ;; esac fi @@ -1524,7 +1524,7 @@ if test "$found" = "1" ; then case "${host}" in arm*-*-*) # Build with the assembly implementation - GCRYPT_CIPHERS="$GCRYPT_CIPHERS camellia-armv6.lo" + GCRYPT_CIPHERS="$GCRYPT_CIPHERS camellia-arm.lo" ;; esac commit 0b39fce7e3ce6761d6bd5195d093ec6857edb7c2 Author: Dmitry Eremin-Solenikov Date: Wed Oct 23 18:36:10 2013 +0300 mpi: enable assembler on all arm architectures * mpi/config.links: remove check for arm >= v6 * mpi/armv6 => mpi/arm: rename directory to reflect that it is generic enough -- MPI ARM assembly does not depend on CPU being armv6.
Verified on PXA255:

Before:
Algorithm      generate   100*sign  100*verify
------------------------------------------------
RSA 1024 bit     3990ms    57980ms      1680ms
RSA 2048 bit    59620ms   389430ms      5690ms
RSA 3072 bit   535850ms  1223200ms     12000ms
RSA 4096 bit   449350ms  2707370ms     20050ms

After:
Algorithm      generate   100*sign  100*verify
------------------------------------------------
RSA 1024 bit     2190ms    13730ms       320ms
RSA 2048 bit    12750ms    67640ms       810ms
RSA 3072 bit   110520ms   166100ms      1350ms
RSA 4096 bit   100870ms   357560ms      2170ms

Signed-off-by: Dmitry Eremin-Solenikov [ jk: ARMv6 => ARM in header comments ] diff --git a/mpi/armv6/mpi-asm-defs.h b/mpi/arm/mpi-asm-defs.h similarity index 100% rename from mpi/armv6/mpi-asm-defs.h rename to mpi/arm/mpi-asm-defs.h diff --git a/mpi/armv6/mpih-add1.S b/mpi/arm/mpih-add1.S similarity index 93% rename from mpi/armv6/mpih-add1.S rename to mpi/arm/mpih-add1.S index 60ea4c3..de6d5ed 100644 --- a/mpi/armv6/mpih-add1.S +++ b/mpi/arm/mpih-add1.S @@ -1,5 +1,5 @@ -/* ARMv6 add_n -- Add two limb vectors of the same length > 0 and store - * sum in a third limb vector. +/* ARM add_n -- Add two limb vectors of the same length > 0 and store + * sum in a third limb vector. * * Copyright © 2013 Jussi Kivilinna * diff --git a/mpi/armv6/mpih-mul1.S b/mpi/arm/mpih-mul1.S similarity index 94% rename from mpi/armv6/mpih-mul1.S rename to mpi/arm/mpih-mul1.S index 0aa41ef..9e6f361 100644 --- a/mpi/armv6/mpih-mul1.S +++ b/mpi/arm/mpih-mul1.S @@ -1,5 +1,5 @@ -/* ARMv6 mul_1 -- Multiply a limb vector with a limb and store the result in - * a second limb vector. +/* ARM mul_1 -- Multiply a limb vector with a limb and store the result in + * a second limb vector. * * Copyright ©
2013 Jussi Kivilinna * diff --git a/mpi/armv6/mpih-mul2.S b/mpi/arm/mpih-mul2.S similarity index 94% rename from mpi/armv6/mpih-mul2.S rename to mpi/arm/mpih-mul2.S index a7eb8a1..2063be5 100644 --- a/mpi/armv6/mpih-mul2.S +++ b/mpi/arm/mpih-mul2.S @@ -1,5 +1,5 @@ -/* ARMv6 mul_2 -- Multiply a limb vector with a limb and add the result to - * a second limb vector. +/* ARM mul_2 -- Multiply a limb vector with a limb and add the result to + * a second limb vector. * * Copyright © 2013 Jussi Kivilinna * diff --git a/mpi/armv6/mpih-mul3.S b/mpi/arm/mpih-mul3.S similarity index 94% rename from mpi/armv6/mpih-mul3.S rename to mpi/arm/mpih-mul3.S index 034929e..be2c5e6 100644 --- a/mpi/armv6/mpih-mul3.S +++ b/mpi/arm/mpih-mul3.S @@ -1,5 +1,5 @@ -/* ARMv6 mul_3 -- Multiply a limb vector with a limb and subtract the result - * from a second limb vector. +/* ARM mul_3 -- Multiply a limb vector with a limb and subtract the result + * from a second limb vector. * * Copyright © 2013 Jussi Kivilinna * diff --git a/mpi/armv6/mpih-sub1.S b/mpi/arm/mpih-sub1.S similarity index 93% rename from mpi/armv6/mpih-sub1.S rename to mpi/arm/mpih-sub1.S index 77d05eb..a573eff 100644 --- a/mpi/armv6/mpih-sub1.S +++ b/mpi/arm/mpih-sub1.S @@ -1,5 +1,5 @@ -/* ARMv6 sub_n -- Subtract two limb vectors of the same length > 0 and store - * sum in a third limb vector. +/* ARM sub_n -- Subtract two limb vectors of the same length > 0 and store + * sum in a third limb vector. * * Copyright ©
2013 Jussi Kivilinna * diff --git a/mpi/config.links b/mpi/config.links index 9fb4f10..90d1077 100644 --- a/mpi/config.links +++ b/mpi/config.links @@ -138,14 +138,9 @@ case "${host}" in ;; arm*-*-*) if test "$gcry_cv_gcc_arm_platform_as_ok" = "yes" ; then - if test "$gcry_cv_cc_arm_arch_is_v6" = "yes" ; then - echo '/* configured for armv6 */' >>./mpi/asm-syntax.h - path="armv6" - mpi_cpu_arch="arm" - else - echo '/* No assembler modules configured */' >>./mpi/asm-syntax.h - path="" - fi + echo '/* configured for arm */' >>./mpi/asm-syntax.h + path="arm" + mpi_cpu_arch="arm" else echo '/* No assembler modules configured */' >>./mpi/asm-syntax.h path="" commit 10bf6a7e16ed193f90d2749970a420f00d1d3320 Author: Dmitry Eremin-Solenikov Date: Wed Oct 23 18:36:10 2013 +0300 Correct ASM assembly test in configure.ac * configure.ac: correct HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS test to require neither ARMv6, nor thumb mode. Our assembly code works perfectly even on ARMv4 now. Signed-off-by: Dmitry Eremin-Solenikov diff --git a/configure.ac b/configure.ac index 69cfbd2..a1ffdc7 100644 --- a/configure.ac +++ b/configure.ac @@ -1109,11 +1109,10 @@ AC_CACHE_CHECK([whether GCC assembler is compatible for ARM assembly implementat [[__asm__( /* Test if assembler supports UAL syntax. */ ".syntax unified\n\t" - ".thumb\n\t" /* thumb-2 in UAL, thumb-1 otherwise. */ - ".code 16\n\t" + ".arm\n\t" /* our assembly code is in ARM mode */ /* Following causes error if assembler ignored '.syntax unified'. */ "asmfunc:\n\t" - "add.w %r0, %r4, %r8, ror #12;\n\t" + "add %r0, %r0, %r4, ror #12;\n\t" /* Test if '.type' and '.size' are supported. 
*/ ".size asmfunc,.-asmfunc;\n\t" ----------------------------------------------------------------------- Summary of changes: cipher/Makefile.am | 8 +- cipher/{blowfish-armv6.S => blowfish-arm.S} | 183 ++++++++++++++------------- cipher/blowfish.c | 44 +++---- cipher/{camellia-armv6.S => camellia-arm.S} | 70 +++++----- cipher/camellia-glue.c | 14 +- cipher/camellia.c | 8 +- cipher/camellia.h | 10 +- cipher/{cast5-armv6.S => cast5-arm.S} | 173 +++++++++++++------------ cipher/cast5.c | 46 +++---- cipher/{rijndael-armv6.S => rijndael-arm.S} | 22 ++-- cipher/rijndael.c | 38 +++--- cipher/{twofish-armv6.S => twofish-arm.S} | 28 ++-- cipher/twofish.c | 32 ++--- configure.ac | 15 +-- mpi/{armv6 => arm}/mpi-asm-defs.h | 0 mpi/{armv6 => arm}/mpih-add1.S | 4 +- mpi/{armv6 => arm}/mpih-mul1.S | 4 +- mpi/{armv6 => arm}/mpih-mul2.S | 4 +- mpi/{armv6 => arm}/mpih-mul3.S | 4 +- mpi/{armv6 => arm}/mpih-sub1.S | 4 +- mpi/config.links | 11 +- 21 files changed, 376 insertions(+), 346 deletions(-) rename cipher/{blowfish-armv6.S => blowfish-arm.S} (78%) rename cipher/{camellia-armv6.S => camellia-arm.S} (93%) rename cipher/{cast5-armv6.S => cast5-arm.S} (81%) rename cipher/{rijndael-armv6.S => rijndael-arm.S} (98%) rename cipher/{twofish-armv6.S => twofish-arm.S} (92%) rename mpi/{armv6 => arm}/mpi-asm-defs.h (100%) rename mpi/{armv6 => arm}/mpih-add1.S (93%) rename mpi/{armv6 => arm}/mpih-mul1.S (94%) rename mpi/{armv6 => arm}/mpih-mul2.S (94%) rename mpi/{armv6 => arm}/mpih-mul3.S (94%) rename mpi/{armv6 => arm}/mpih-sub1.S (93%) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From cvs at cvs.gnupg.org Wed Oct 23 17:56:15 2013 From: cvs at cvs.gnupg.org (by Jussi Kivilinna) Date: Wed, 23 Oct 2013 17:56:15 +0200 Subject: [git] GCRYPT - branch, master, updated. 
libgcrypt-1.5.0-332-g54df6fc Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via 54df6fcd806f8c150cffe6cc09925bb8b638bb5b (commit) via 293e93672fdabc829e35cc624c397276342bafe4 (commit) via 2901a10dbf1264707debc8402546c07eeac60932 (commit) from 2fd83faa876d0be91ab7884b1a9eaa7793559eb9 (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit 54df6fcd806f8c150cffe6cc09925bb8b638bb5b Author: Jussi Kivilinna Date: Wed Oct 23 18:36:18 2013 +0300 Replace architecture specific fast_wipememory2 with generic * src/g10lib.h (fast_wipememory2): Remove architecture specific implementations and add generic implementation. -- Reduce code size, adds support for other architectures and gcc appears to generated better code without assembly parts. Signed-off-by: Jussi Kivilinna diff --git a/src/g10lib.h b/src/g10lib.h index 3b09448..80c73ee 100644 --- a/src/g10lib.h +++ b/src/g10lib.h @@ -275,77 +275,42 @@ void __gcry_burn_stack (unsigned int bytes); } while(0) #define wipememory(_ptr,_len) wipememory2(_ptr,0,_len) +#ifdef HAVE_U64_TYPEDEF + #define FASTWIPE_T u64 + #define FASTWIPE_MULT (U64_C(0x0101010101010101)) +#else + #define FASTWIPE_T u32 + #define FASTWIPE_MULT (0x01010101U) +#endif -/* Optimized fast_wipememory2 for i386, x86-64 and arm architectures. May leave - tail bytes unhandled, in which case tail bytes are handled by wipememory2. 
- */ -#if defined(__x86_64__) && __GNUC__ >= 4 -#define fast_wipememory2(_vptr,_vset,_vlen) do { \ - unsigned long long int _vset8 = _vset; \ - if (_vlen < 8) \ - break; \ - _vset8 *= 0x0101010101010101ULL; \ - do { \ - asm volatile("movq %[set], %[ptr]\n\t" \ - : /**/ \ - : [set] "Cr" (_vset8), \ - [ptr] "m" (*_vptr) \ - : "memory"); \ - _vlen -= 8; \ - _vptr += 8; \ - } while (_vlen >= 8); \ - } while (0) -#elif defined (__i386__) && SIZEOF_UNSIGNED_LONG == 4 && __GNUC__ >= 4 -#define fast_wipememory2(_ptr,_set,_len) do { \ - unsigned long _vset4 = _vset; \ - if (_vlen < 4) \ - break; \ - _vset4 *= 0x01010101; \ - do { \ - asm volatile("movl %[set], %[ptr]\n\t" \ - : /**/ \ - : [set] "Cr" (_vset4), \ - [ptr] "m" (*_vptr) \ - : "memory"); \ - _vlen -= 4; \ - _vptr += 4; \ - } while (_vlen >= 4); \ - } while (0) -#elif defined (__arm__) && (defined (__thumb2__) || !defined (__thumb__)) && \ - __GNUC__ >= 4 - -#ifdef __ARM_FEATURE_UNALIGNED +/* Following architectures can handle unaligned accesses fast. */ +#if defined(__i386__) || defined(__x86_64__) || \ + defined(__powerpc__) || defined(__powerpc64__) || \ + (defined(__arm__) && defined(__ARM_FEATURE_UNALIGNED)) || \ + defined(__aarch64__) #define fast_wipememory2_unaligned_head(_ptr,_set,_len) /*do nothing*/ #else #define fast_wipememory2_unaligned_head(_vptr,_vset,_vlen) do { \ - while((size_t)(_vptr)&3 && _vlen) \ - { *_vptr=(_vset); _vptr++; _vlen--; } \ + while((size_t)(_vptr)&(sizeof(FASTWIPE_T)-1) && _vlen) \ + { *_vptr=(_vset); _vptr++; _vlen--; } \ } while(0) #endif +/* fast_wipememory2 may leave tail bytes unhandled, in which case tail bytes + are handled by wipememory2. 
*/ #define fast_wipememory2(_vptr,_vset,_vlen) do { \ - unsigned long _vset4 = _vset; \ + FASTWIPE_T _vset_long = _vset; \ fast_wipememory2_unaligned_head(_vptr,_vset,_vlen); \ - if (_vlen < 8) \ + if (_vlen < sizeof(FASTWIPE_T)) \ break; \ - _vset4 *= 0x01010101; \ - asm volatile( \ - "mov %%r4, %[set];\n\t" \ - "mov %%r5, %[set];\n\t" \ - "1:;\n\t" \ - "stm %[ptr]!, {%%r4, %%r5};\n\t" \ - "cmp %[end], %[ptr];\n\t" \ - "bne 1b;\n\t" \ - : [ptr] "=r" (_vptr) \ - : [set] "r" (_vset4), \ - [end] "r" (_vptr+(_vlen&(~0x7))), \ - "0" (_vptr) \ - : "memory", "r4", "r5", "cc"); \ - _vlen &= 0x7; \ + _vset_long *= FASTWIPE_MULT; \ + do { \ + volatile FASTWIPE_T *_vptr_long = (volatile void *)_vptr; \ + *_vptr_long = _vset_long; \ + _vlen -= sizeof(FASTWIPE_T); \ + _vptr += sizeof(FASTWIPE_T); \ + } while (_vlen >= sizeof(FASTWIPE_T)); \ } while (0) -#else -#define fast_wipememory2(_ptr,_set,_len) -#endif /* Digit predicates. */ commit 293e93672fdabc829e35cc624c397276342bafe4 Author: Jussi Kivilinna Date: Wed Oct 23 18:36:18 2013 +0300 Improve the speed of the cipher mode code * cipher/bufhelp.h (buf_cpy): New. (buf_xor, buf_xor_2dst): If buffers unaligned, always jump to per-byte processing. (buf_xor_n_copy_2): New. (buf_xor_n_copy): Use 'buf_xor_n_copy_2'. * cipher/blowfish.c (_gcry_blowfish_cbc_dec): Avoid extra memory copy and use new 'buf_xor_n_copy_2'. * cipher/camellia-glue.c (_gcry_camellia_cbc_dec): Ditto. * cipher/cast5.c (_gcry_cast_cbc_dec): Ditto. * cipher/serpent.c (_gcry_serpent_cbc_dec): Ditto. * cipher/twofish.c (_gcry_twofish_cbc_dec): Ditto. * cipher/rijndael.c (_gcry_aes_cbc_dec): Ditto. (do_encrypt, do_decrypt): Use 'buf_cpy' instead of 'memcpy'. (_gcry_aes_cbc_enc): Avoid copying IV, use 'last_iv' pointer instead. * cipher/cipher-cbc.c (_gcry_cipher_cbc_encrypt): Avoid copying IV, update pointer to IV instead. (_gcry_cipher_cbc_decrypt): Avoid extra memory copy and use new 'buf_xor_n_copy_2'. 
(_gcry_cipher_cbc_encrypt, _gcry_cipher_cbc_decrypt): Avoid extra
accesses to c->spec, use 'buf_cpy' instead of memcpy.
* cipher/cipher-ccm.c (do_cbc_mac): Ditto.
* cipher/cipher-cfb.c (_gcry_cipher_cfb_encrypt)
(_gcry_cipher_cfb_decrypt): Ditto.
* cipher/cipher-ctr.c (_gcry_cipher_ctr_encrypt): Ditto.
* cipher/cipher-ofb.c (_gcry_cipher_ofb_encrypt)
(_gcry_cipher_ofb_decrypt): Ditto.
* cipher/cipher.c (do_ecb_encrypt, do_ecb_decrypt): Ditto.
--
Patch improves the speed of the generic block cipher mode code.
Especially on targets without fast unaligned memory accesses, the
generic code was slower than the algorithm-specific bulk versions.
With this patch, this issue should be solved.

Tests on Cortex-A8; compiled for ARMv4, without unaligned accesses:

Before:
         ECB/Stream       CBC              CFB              OFB              CTR              CCM
        ---------------  ---------------  ---------------  ---------------  ---------------  ---------------
 SEED     490ms  500ms     560ms  580ms     530ms  540ms     560ms  560ms     550ms  540ms    1080ms 1080ms
 TWOFISH  230ms  230ms     290ms  300ms     260ms  240ms     290ms  290ms     240ms  240ms     520ms  510ms
 DES      720ms  720ms     800ms  860ms     770ms  770ms     810ms  820ms     770ms  780ms       -      -
 CAST5    340ms  340ms     440ms  250ms     390ms  250ms     440ms  430ms     260ms  250ms       -      -

After:
         ECB/Stream       CBC              CFB              OFB              CTR              CCM
        ---------------  ---------------  ---------------  ---------------  ---------------  ---------------
 SEED     500ms  490ms     520ms  520ms     530ms  520ms     530ms  540ms     500ms  520ms    1060ms 1070ms
 TWOFISH  230ms  220ms     250ms  230ms     260ms  230ms     260ms  260ms     230ms  230ms     500ms  490ms
 DES      720ms  720ms     750ms  760ms     740ms  750ms     770ms  770ms     760ms  760ms       -      -
 CAST5    340ms  340ms     370ms  250ms     370ms  250ms     380ms  390ms     250ms  250ms       -      -

Tests on Cortex-A8; compiled for ARMv7-A, with unaligned accesses:

Before:
         ECB/Stream       CBC              CFB              OFB              CTR              CCM
        ---------------  ---------------  ---------------  ---------------  ---------------  ---------------
 SEED     430ms  440ms     480ms  530ms     470ms  460ms     490ms  480ms     470ms  460ms     930ms  940ms
 TWOFISH  220ms  220ms     250ms  230ms     240ms  230ms     270ms  250ms     230ms  240ms     480ms  470ms
 DES      550ms  540ms     620ms  690ms     570ms  540ms     630ms  650ms     590ms  580ms       -      -
 CAST5    300ms  300ms     380ms  230ms     330ms  230ms     380ms  370ms     230ms  230ms       -      -

After:
         ECB/Stream       CBC              CFB              OFB              CTR              CCM
        ---------------  ---------------  ---------------  ---------------  ---------------  ---------------
 SEED     430ms  430ms     460ms  450ms     460ms  450ms     470ms  470ms     460ms  470ms     900ms  930ms
 TWOFISH  220ms  210ms     240ms  230ms     230ms  230ms     250ms  250ms     230ms  230ms     470ms  470ms
 DES      540ms  540ms     580ms  570ms     570ms  570ms     560ms  620ms     580ms  570ms       -      -
 CAST5    300ms  290ms     310ms  230ms     320ms  230ms     350ms  350ms     230ms  230ms       -      -

Tests on Intel Atom N160 (i386):

Before:
         ECB/Stream       CBC              CFB              OFB              CTR              CCM
        ---------------  ---------------  ---------------  ---------------  ---------------  ---------------
 SEED     380ms  380ms     410ms  420ms     400ms  400ms     410ms  410ms     390ms  400ms     820ms  800ms
 TWOFISH  340ms  340ms     370ms  350ms     360ms  340ms     370ms  370ms     330ms  340ms     710ms  700ms
 DES      660ms  650ms     710ms  740ms     680ms  700ms     700ms  710ms     680ms  680ms       -      -
 CAST5    340ms  340ms     380ms  330ms     360ms  330ms     390ms  390ms     320ms  330ms       -      -

After:
         ECB/Stream       CBC              CFB              OFB              CTR              CCM
        ---------------  ---------------  ---------------  ---------------  ---------------  ---------------
 SEED     380ms  380ms     390ms  410ms     400ms  390ms     410ms  400ms     400ms  390ms     810ms  800ms
 TWOFISH  330ms  340ms     350ms  360ms     350ms  340ms     380ms  370ms     340ms  360ms     700ms  710ms
 DES      630ms  640ms     660ms  690ms     680ms  680ms     700ms  690ms     680ms  680ms       -      -
 CAST5    340ms  330ms     350ms  330ms     370ms  340ms     380ms  390ms     330ms  330ms       -      -

Tests on Intel i5-4570 (x86-64):

Before:
         ECB/Stream       CBC              CFB              OFB              CTR              CCM
        ---------------  ---------------  ---------------  ---------------  ---------------  ---------------
 SEED     560ms  560ms     600ms  590ms     600ms  570ms     570ms  570ms     580ms  590ms    1200ms 1180ms
 TWOFISH  240ms  240ms     270ms  160ms     260ms  160ms     250ms  250ms     160ms  160ms     430ms  430ms
 DES      570ms  570ms     640ms  590ms     630ms  580ms     600ms  600ms     610ms  620ms       -      -
 CAST5    410ms  410ms     470ms  150ms     470ms  150ms     450ms  450ms     150ms  160ms       -      -

After:
         ECB/Stream       CBC              CFB              OFB              CTR              CCM
        ---------------  ---------------  ---------------  ---------------  ---------------  ---------------
 SEED     560ms  560ms     590ms  570ms     580ms  570ms     570ms  570ms     590ms  590ms    1200ms 1200ms
 TWOFISH  240ms  240ms     260ms  160ms     250ms  170ms     250ms  250ms     160ms  160ms     430ms  430ms
 DES      570ms  570ms     620ms  580ms     630ms  570ms     600ms  590ms     620ms  620ms       -      -
 CAST5    410ms  410ms     460ms  150ms     460ms  160ms     450ms  450ms     150ms  150ms       -      -

Signed-off-by: Jussi Kivilinna

diff --git a/cipher/blowfish.c b/cipher/blowfish.c
index ed4e901..3b6bf6b 100644
--- a/cipher/blowfish.c
+++ b/cipher/blowfish.c
@@ -701,14 +701,11 @@ _gcry_blowfish_cbc_dec(void *context, unsigned char *iv, void *outbuf_arg,
   for ( ;nblocks; nblocks-- )
     {
-      /* We need to save INBUF away because it may be identical to
-         OUTBUF. */
-      memcpy(savebuf, inbuf, BLOWFISH_BLOCKSIZE);
+      /* INBUF is needed later and it may be identical to OUTBUF, so store
+         the intermediate result to SAVEBUF. */
+      do_decrypt_block (ctx, savebuf, inbuf);
-      do_decrypt_block (ctx, outbuf, inbuf);
-
-      buf_xor(outbuf, outbuf, iv, BLOWFISH_BLOCKSIZE);
-      memcpy(iv, savebuf, BLOWFISH_BLOCKSIZE);
+      buf_xor_n_copy_2(outbuf, savebuf, iv, inbuf, BLOWFISH_BLOCKSIZE);
       inbuf += BLOWFISH_BLOCKSIZE;
       outbuf += BLOWFISH_BLOCKSIZE;
     }
diff --git a/cipher/bufhelp.h b/cipher/bufhelp.h
index 198d286..dc39b46 100644
--- a/cipher/bufhelp.h
+++ b/cipher/bufhelp.h
@@ -44,6 +44,45 @@
 #endif

+/* Optimized function for small buffer copying */
+static inline void
+buf_cpy(void *_dst, const void *_src, size_t len)
+{
+#if __GNUC__ >= 4 && (defined(__x86_64__) || defined(__i386__))
+  /* For AMD64 and i386, memcpy is faster. */
+  memcpy(_dst, _src, len);
+#else
+  byte *dst = _dst;
+  const byte *src = _src;
+  uintptr_t *ldst;
+  const uintptr_t *lsrc;
+#ifndef BUFHELP_FAST_UNALIGNED_ACCESS
+  const unsigned int longmask = sizeof(uintptr_t) - 1;
+
+  /* Skip fast processing if buffers are unaligned.
*/ + if (((uintptr_t)dst | (uintptr_t)src) & longmask) + goto do_bytes; +#endif + + ldst = (uintptr_t *)(void *)dst; + lsrc = (const uintptr_t *)(const void *)src; + + for (; len >= sizeof(uintptr_t); len -= sizeof(uintptr_t)) + *ldst++ = *lsrc++; + + dst = (byte *)ldst; + src = (const byte *)lsrc; + +#ifndef BUFHELP_FAST_UNALIGNED_ACCESS +do_bytes: +#endif + /* Handle tail. */ + for (; len; len--) + *dst++ = *src++; +#endif /*__GNUC__ >= 4 && (__x86_64__ || __i386__)*/ +} + + /* Optimized function for buffer xoring */ static inline void buf_xor(void *_dst, const void *_src1, const void *_src2, size_t len) @@ -56,14 +95,9 @@ buf_xor(void *_dst, const void *_src1, const void *_src2, size_t len) #ifndef BUFHELP_FAST_UNALIGNED_ACCESS const unsigned int longmask = sizeof(uintptr_t) - 1; - /* Skip fast processing if alignment of buffers do not match. */ - if ((((uintptr_t)dst ^ (uintptr_t)src1) | - ((uintptr_t)dst ^ (uintptr_t)src2)) & longmask) + /* Skip fast processing if buffers are unaligned. */ + if (((uintptr_t)dst | (uintptr_t)src1 | (uintptr_t)src2) & longmask) goto do_bytes; - - /* Handle unaligned head. */ - for (; len && ((uintptr_t)dst & longmask); len--) - *dst++ = *src1++ ^ *src2++; #endif ldst = (uintptr_t *)(void *)dst; @@ -99,14 +133,9 @@ buf_xor_2dst(void *_dst1, void *_dst2, const void *_src, size_t len) #ifndef BUFHELP_FAST_UNALIGNED_ACCESS const unsigned int longmask = sizeof(uintptr_t) - 1; - /* Skip fast processing if alignment of buffers do not match. */ - if ((((uintptr_t)src ^ (uintptr_t)dst1) | - ((uintptr_t)src ^ (uintptr_t)dst2)) & longmask) + /* Skip fast processing if buffers are unaligned. */ + if (((uintptr_t)src | (uintptr_t)dst1 | (uintptr_t)dst2) & longmask) goto do_bytes; - - /* Handle unaligned head. */ - for (; len && ((uintptr_t)src & longmask); len--) - *dst1++ = (*dst2++ ^= *src++); #endif ldst1 = (uintptr_t *)(void *)dst1; @@ -130,48 +159,44 @@ do_bytes: /* Optimized function for combined buffer xoring and copying. 
Used by mainly - CFB mode decryption. */ + CBC mode decryption. */ static inline void -buf_xor_n_copy(void *_dst_xor, void *_srcdst_cpy, const void *_src, size_t len) +buf_xor_n_copy_2(void *_dst_xor, const void *_src_xor, void *_srcdst_cpy, + const void *_src_cpy, size_t len) { byte *dst_xor = _dst_xor; byte *srcdst_cpy = _srcdst_cpy; + const byte *src_xor = _src_xor; + const byte *src_cpy = _src_cpy; byte temp; - const byte *src = _src; uintptr_t *ldst_xor, *lsrcdst_cpy; - const uintptr_t *lsrc; + const uintptr_t *lsrc_cpy, *lsrc_xor; uintptr_t ltemp; #ifndef BUFHELP_FAST_UNALIGNED_ACCESS const unsigned int longmask = sizeof(uintptr_t) - 1; - /* Skip fast processing if alignment of buffers do not match. */ - if ((((uintptr_t)src ^ (uintptr_t)dst_xor) | - ((uintptr_t)src ^ (uintptr_t)srcdst_cpy)) & longmask) + /* Skip fast processing if buffers are unaligned. */ + if (((uintptr_t)src_cpy | (uintptr_t)src_xor | (uintptr_t)dst_xor | + (uintptr_t)srcdst_cpy) & longmask) goto do_bytes; - - /* Handle unaligned head. */ - for (; len && ((uintptr_t)src & longmask); len--) - { - temp = *src++; - *dst_xor++ = *srcdst_cpy ^ temp; - *srcdst_cpy++ = temp; - } #endif ldst_xor = (uintptr_t *)(void *)dst_xor; + lsrc_xor = (const uintptr_t *)(void *)src_xor; lsrcdst_cpy = (uintptr_t *)(void *)srcdst_cpy; - lsrc = (const uintptr_t *)(const void *)src; + lsrc_cpy = (const uintptr_t *)(const void *)src_cpy; for (; len >= sizeof(uintptr_t); len -= sizeof(uintptr_t)) { - ltemp = *lsrc++; - *ldst_xor++ = *lsrcdst_cpy ^ ltemp; + ltemp = *lsrc_cpy++; + *ldst_xor++ = *lsrcdst_cpy ^ *lsrc_xor++; *lsrcdst_cpy++ = ltemp; } dst_xor = (byte *)ldst_xor; + src_xor = (const byte *)lsrc_xor; srcdst_cpy = (byte *)lsrcdst_cpy; - src = (const byte *)lsrc; + src_cpy = (const byte *)lsrc_cpy; #ifndef BUFHELP_FAST_UNALIGNED_ACCESS do_bytes: @@ -179,13 +204,22 @@ do_bytes: /* Handle tail. 
*/ for (; len; len--) { - temp = *src++; - *dst_xor++ = *srcdst_cpy ^ temp; + temp = *src_cpy++; + *dst_xor++ = *srcdst_cpy ^ *src_xor++; *srcdst_cpy++ = temp; } } +/* Optimized function for combined buffer xoring and copying. Used by mainly + CFB mode decryption. */ +static inline void +buf_xor_n_copy(void *_dst_xor, void *_srcdst_cpy, const void *_src, size_t len) +{ + buf_xor_n_copy_2(_dst_xor, _src, _srcdst_cpy, _src, len); +} + + #ifndef BUFHELP_FAST_UNALIGNED_ACCESS /* Functions for loading and storing unaligned u32 values of different diff --git a/cipher/camellia-glue.c b/cipher/camellia-glue.c index e6d4029..8c217a7 100644 --- a/cipher/camellia-glue.c +++ b/cipher/camellia-glue.c @@ -441,14 +441,11 @@ _gcry_camellia_cbc_dec(void *context, unsigned char *iv, for ( ;nblocks; nblocks-- ) { - /* We need to save INBUF away because it may be identical to - OUTBUF. */ - memcpy(savebuf, inbuf, CAMELLIA_BLOCK_SIZE); + /* INBUF is needed later and it may be identical to OUTBUF, so store + the intermediate result to SAVEBUF. */ + Camellia_DecryptBlock(ctx->keybitlength, inbuf, ctx->keytable, savebuf); - Camellia_DecryptBlock(ctx->keybitlength, inbuf, ctx->keytable, outbuf); - - buf_xor(outbuf, outbuf, iv, CAMELLIA_BLOCK_SIZE); - memcpy(iv, savebuf, CAMELLIA_BLOCK_SIZE); + buf_xor_n_copy_2(outbuf, savebuf, iv, inbuf, CAMELLIA_BLOCK_SIZE); inbuf += CAMELLIA_BLOCK_SIZE; outbuf += CAMELLIA_BLOCK_SIZE; } diff --git a/cipher/cast5.c b/cipher/cast5.c index 8c016d7..0df7886 100644 --- a/cipher/cast5.c +++ b/cipher/cast5.c @@ -678,14 +678,11 @@ _gcry_cast5_cbc_dec(void *context, unsigned char *iv, void *outbuf_arg, for ( ;nblocks; nblocks-- ) { - /* We need to save INBUF away because it may be identical to - OUTBUF. */ - memcpy(savebuf, inbuf, CAST5_BLOCKSIZE); + /* INBUF is needed later and it may be identical to OUTBUF, so store + the intermediate result to SAVEBUF. 
*/ + do_decrypt_block (ctx, savebuf, inbuf); - do_decrypt_block (ctx, outbuf, inbuf); - - buf_xor(outbuf, outbuf, iv, CAST5_BLOCKSIZE); - memcpy(iv, savebuf, CAST5_BLOCKSIZE); + buf_xor_n_copy_2(outbuf, savebuf, iv, inbuf, CAST5_BLOCKSIZE); inbuf += CAST5_BLOCKSIZE; outbuf += CAST5_BLOCKSIZE; } diff --git a/cipher/cipher-cbc.c b/cipher/cipher-cbc.c index 523f5a6..4ad2ebd 100644 --- a/cipher/cipher-cbc.c +++ b/cipher/cipher-cbc.c @@ -41,14 +41,15 @@ _gcry_cipher_cbc_encrypt (gcry_cipher_hd_t c, unsigned char *ivp; int i; size_t blocksize = c->spec->blocksize; + gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; unsigned nblocks = inbuflen / blocksize; unsigned int burn, nburn; if (outbuflen < ((c->flags & GCRY_CIPHER_CBC_MAC)? blocksize : inbuflen)) return GPG_ERR_BUFFER_TOO_SHORT; - if ((inbuflen % c->spec->blocksize) - && !(inbuflen > c->spec->blocksize + if ((inbuflen % blocksize) + && !(inbuflen > blocksize && (c->flags & GCRY_CIPHER_CBC_CTS))) return GPG_ERR_INV_LENGTH; @@ -70,16 +71,21 @@ _gcry_cipher_cbc_encrypt (gcry_cipher_hd_t c, } else { + ivp = c->u_iv.iv; + for (n=0; n < nblocks; n++ ) { - buf_xor(outbuf, inbuf, c->u_iv.iv, blocksize); - nburn = c->spec->encrypt ( &c->context.c, outbuf, outbuf ); + buf_xor (outbuf, inbuf, ivp, blocksize); + nburn = enc_fn ( &c->context.c, outbuf, outbuf ); burn = nburn > burn ? nburn : burn; - memcpy (c->u_iv.iv, outbuf, blocksize ); + ivp = outbuf; inbuf += blocksize; if (!(c->flags & GCRY_CIPHER_CBC_MAC)) outbuf += blocksize; } + + if (ivp != c->u_iv.iv) + buf_cpy (c->u_iv.iv, ivp, blocksize ); } if ((c->flags & GCRY_CIPHER_CBC_CTS) && inbuflen > blocksize) @@ -104,9 +110,9 @@ _gcry_cipher_cbc_encrypt (gcry_cipher_hd_t c, for (; i < blocksize; i++) outbuf[i] = 0 ^ *ivp++; - nburn = c->spec->encrypt (&c->context.c, outbuf, outbuf); + nburn = enc_fn (&c->context.c, outbuf, outbuf); burn = nburn > burn ? 
nburn : burn; - memcpy (c->u_iv.iv, outbuf, blocksize); + buf_cpy (c->u_iv.iv, outbuf, blocksize); } if (burn > 0) @@ -124,14 +130,15 @@ _gcry_cipher_cbc_decrypt (gcry_cipher_hd_t c, unsigned int n; int i; size_t blocksize = c->spec->blocksize; + gcry_cipher_decrypt_t dec_fn = c->spec->decrypt; unsigned int nblocks = inbuflen / blocksize; unsigned int burn, nburn; if (outbuflen < inbuflen) return GPG_ERR_BUFFER_TOO_SHORT; - if ((inbuflen % c->spec->blocksize) - && !(inbuflen > c->spec->blocksize + if ((inbuflen % blocksize) + && !(inbuflen > blocksize && (c->flags & GCRY_CIPHER_CBC_CTS))) return GPG_ERR_INV_LENGTH; @@ -142,7 +149,7 @@ _gcry_cipher_cbc_decrypt (gcry_cipher_hd_t c, nblocks--; if ((inbuflen % blocksize) == 0) nblocks--; - memcpy (c->lastiv, c->u_iv.iv, blocksize); + buf_cpy (c->lastiv, c->u_iv.iv, blocksize); } if (c->bulk.cbc_dec) @@ -155,16 +162,14 @@ _gcry_cipher_cbc_decrypt (gcry_cipher_hd_t c, { for (n=0; n < nblocks; n++ ) { - /* Because outbuf and inbuf might be the same, we have to - * save the original ciphertext block. We use LASTIV for - * this here because it is not used otherwise. */ - memcpy (c->lastiv, inbuf, blocksize); - nburn = c->spec->decrypt ( &c->context.c, outbuf, inbuf ); + /* Because outbuf and inbuf might be the same, we must not overwrite + the original ciphertext block. We use LASTIV as intermediate + storage here because it is not used otherwise. */ + nburn = dec_fn ( &c->context.c, c->lastiv, inbuf ); burn = nburn > burn ? nburn : burn; - buf_xor(outbuf, outbuf, c->u_iv.iv, blocksize); - memcpy(c->u_iv.iv, c->lastiv, blocksize ); - inbuf += c->spec->blocksize; - outbuf += c->spec->blocksize; + buf_xor_n_copy_2(outbuf, c->lastiv, c->u_iv.iv, inbuf, blocksize); + inbuf += blocksize; + outbuf += blocksize; } } @@ -177,17 +182,17 @@ _gcry_cipher_cbc_decrypt (gcry_cipher_hd_t c, else restbytes = inbuflen % blocksize; - memcpy (c->lastiv, c->u_iv.iv, blocksize ); /* Save Cn-2. 
*/ - memcpy (c->u_iv.iv, inbuf + blocksize, restbytes ); /* Save Cn. */ + buf_cpy (c->lastiv, c->u_iv.iv, blocksize ); /* Save Cn-2. */ + buf_cpy (c->u_iv.iv, inbuf + blocksize, restbytes ); /* Save Cn. */ - nburn = c->spec->decrypt ( &c->context.c, outbuf, inbuf ); + nburn = dec_fn ( &c->context.c, outbuf, inbuf ); burn = nburn > burn ? nburn : burn; buf_xor(outbuf, outbuf, c->u_iv.iv, restbytes); - memcpy(outbuf + blocksize, outbuf, restbytes); + buf_cpy (outbuf + blocksize, outbuf, restbytes); for(i=restbytes; i < blocksize; i++) c->u_iv.iv[i] = outbuf[i]; - nburn = c->spec->decrypt (&c->context.c, outbuf, c->u_iv.iv); + nburn = dec_fn (&c->context.c, outbuf, c->u_iv.iv); burn = nburn > burn ? nburn : burn; buf_xor(outbuf, outbuf, c->lastiv, blocksize); /* c->lastiv is now really lastlastiv, does this matter? */ diff --git a/cipher/cipher-ccm.c b/cipher/cipher-ccm.c index 38752d5..ebcbf1e 100644 --- a/cipher/cipher-ccm.c +++ b/cipher/cipher-ccm.c @@ -40,6 +40,7 @@ do_cbc_mac (gcry_cipher_hd_t c, const unsigned char *inbuf, size_t inlen, int do_padding) { const unsigned int blocksize = 16; + gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; unsigned char tmp[blocksize]; unsigned int burn = 0; unsigned int unused = c->u_mode.ccm.mac_unused; @@ -68,8 +69,7 @@ do_cbc_mac (gcry_cipher_hd_t c, const unsigned char *inbuf, size_t inlen, { /* Process one block from macbuf. 
*/ buf_xor(c->u_iv.iv, c->u_iv.iv, c->u_mode.ccm.macbuf, blocksize); - set_burn (burn, c->spec->encrypt ( &c->context.c, c->u_iv.iv, - c->u_iv.iv )); + set_burn (burn, enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv )); unused = 0; } @@ -89,8 +89,7 @@ do_cbc_mac (gcry_cipher_hd_t c, const unsigned char *inbuf, size_t inlen, { buf_xor(c->u_iv.iv, c->u_iv.iv, inbuf, blocksize); - set_burn (burn, c->spec->encrypt ( &c->context.c, c->u_iv.iv, - c->u_iv.iv )); + set_burn (burn, enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv )); inlen -= blocksize; inbuf += blocksize; diff --git a/cipher/cipher-cfb.c b/cipher/cipher-cfb.c index 244f5fd..610d006 100644 --- a/cipher/cipher-cfb.c +++ b/cipher/cipher-cfb.c @@ -37,6 +37,7 @@ _gcry_cipher_cfb_encrypt (gcry_cipher_hd_t c, const unsigned char *inbuf, unsigned int inbuflen) { unsigned char *ivp; + gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; size_t blocksize = c->spec->blocksize; size_t blocksize_x_2 = blocksize + blocksize; unsigned int burn, nburn; @@ -48,7 +49,7 @@ _gcry_cipher_cfb_encrypt (gcry_cipher_hd_t c, { /* Short enough to be encoded by the remaining XOR mask. */ /* XOR the input with the IV and store input into IV. */ - ivp = c->u_iv.iv + c->spec->blocksize - c->unused; + ivp = c->u_iv.iv + blocksize - c->unused; buf_xor_2dst(outbuf, ivp, inbuf, inbuflen); c->unused -= inbuflen; return 0; @@ -83,7 +84,7 @@ _gcry_cipher_cfb_encrypt (gcry_cipher_hd_t c, while ( inbuflen >= blocksize_x_2 ) { /* Encrypt the IV. */ - nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; /* XOR the input with the IV and store input into IV. */ buf_xor_2dst(outbuf, c->u_iv.iv, inbuf, blocksize); @@ -96,8 +97,8 @@ _gcry_cipher_cfb_encrypt (gcry_cipher_hd_t c, if ( inbuflen >= blocksize ) { /* Save the current IV and then encrypt the IV. 
*/ - memcpy( c->lastiv, c->u_iv.iv, blocksize ); - nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + buf_cpy( c->lastiv, c->u_iv.iv, blocksize ); + nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; /* XOR the input with the IV and store input into IV */ buf_xor_2dst(outbuf, c->u_iv.iv, inbuf, blocksize); @@ -108,8 +109,8 @@ _gcry_cipher_cfb_encrypt (gcry_cipher_hd_t c, if ( inbuflen ) { /* Save the current IV and then encrypt the IV. */ - memcpy( c->lastiv, c->u_iv.iv, blocksize ); - nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + buf_cpy( c->lastiv, c->u_iv.iv, blocksize ); + nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; c->unused = blocksize; /* Apply the XOR. */ @@ -133,6 +134,7 @@ _gcry_cipher_cfb_decrypt (gcry_cipher_hd_t c, const unsigned char *inbuf, unsigned int inbuflen) { unsigned char *ivp; + gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; size_t blocksize = c->spec->blocksize; size_t blocksize_x_2 = blocksize + blocksize; unsigned int burn, nburn; @@ -179,7 +181,7 @@ _gcry_cipher_cfb_decrypt (gcry_cipher_hd_t c, while (inbuflen >= blocksize_x_2 ) { /* Encrypt the IV. */ - nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; /* XOR the input with the IV and store input into IV. */ buf_xor_n_copy(outbuf, c->u_iv.iv, inbuf, blocksize); @@ -192,8 +194,8 @@ _gcry_cipher_cfb_decrypt (gcry_cipher_hd_t c, if (inbuflen >= blocksize ) { /* Save the current IV and then encrypt the IV. */ - memcpy ( c->lastiv, c->u_iv.iv, blocksize); - nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + buf_cpy ( c->lastiv, c->u_iv.iv, blocksize); + nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? 
nburn : burn; /* XOR the input with the IV and store input into IV */ buf_xor_n_copy(outbuf, c->u_iv.iv, inbuf, blocksize); @@ -205,8 +207,8 @@ _gcry_cipher_cfb_decrypt (gcry_cipher_hd_t c, if (inbuflen) { /* Save the current IV and then encrypt the IV. */ - memcpy ( c->lastiv, c->u_iv.iv, blocksize ); - nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + buf_cpy ( c->lastiv, c->u_iv.iv, blocksize ); + nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; c->unused = blocksize; /* Apply the XOR. */ diff --git a/cipher/cipher-ctr.c b/cipher/cipher-ctr.c index fbc898f..37a6a79 100644 --- a/cipher/cipher-ctr.c +++ b/cipher/cipher-ctr.c @@ -38,6 +38,7 @@ _gcry_cipher_ctr_encrypt (gcry_cipher_hd_t c, { unsigned int n; int i; + gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; unsigned int blocksize = c->spec->blocksize; unsigned int nblocks; unsigned int burn, nburn; @@ -77,7 +78,7 @@ _gcry_cipher_ctr_encrypt (gcry_cipher_hd_t c, unsigned char tmp[MAX_BLOCKSIZE]; do { - nburn = c->spec->encrypt (&c->context.c, tmp, c->u_ctr.ctr); + nburn = enc_fn (&c->context.c, tmp, c->u_ctr.ctr); burn = nburn > burn ? nburn : burn; for (i = blocksize; i > 0; i--) @@ -98,7 +99,7 @@ _gcry_cipher_ctr_encrypt (gcry_cipher_hd_t c, /* Save the unused bytes of the counter. */ c->unused = blocksize - n; if (c->unused) - memcpy (c->lastiv+n, tmp+n, c->unused); + buf_cpy (c->lastiv+n, tmp+n, c->unused); wipememory (tmp, sizeof tmp); } diff --git a/cipher/cipher-ofb.c b/cipher/cipher-ofb.c index 3d9d54c..333a748 100644 --- a/cipher/cipher-ofb.c +++ b/cipher/cipher-ofb.c @@ -37,6 +37,7 @@ _gcry_cipher_ofb_encrypt (gcry_cipher_hd_t c, const unsigned char *inbuf, unsigned int inbuflen) { unsigned char *ivp; + gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; size_t blocksize = c->spec->blocksize; unsigned int burn, nburn; @@ -47,7 +48,7 @@ _gcry_cipher_ofb_encrypt (gcry_cipher_hd_t c, { /* Short enough to be encoded by the remaining XOR mask. 
*/ /* XOR the input with the IV */ - ivp = c->u_iv.iv + c->spec->blocksize - c->unused; + ivp = c->u_iv.iv + blocksize - c->unused; buf_xor(outbuf, ivp, inbuf, inbuflen); c->unused -= inbuflen; return 0; @@ -69,8 +70,8 @@ _gcry_cipher_ofb_encrypt (gcry_cipher_hd_t c, while ( inbuflen >= blocksize ) { /* Encrypt the IV (and save the current one). */ - memcpy( c->lastiv, c->u_iv.iv, blocksize ); - nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + buf_cpy( c->lastiv, c->u_iv.iv, blocksize ); + nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; buf_xor(outbuf, c->u_iv.iv, inbuf, blocksize); outbuf += blocksize; @@ -79,8 +80,8 @@ _gcry_cipher_ofb_encrypt (gcry_cipher_hd_t c, } if ( inbuflen ) { /* process the remaining bytes */ - memcpy( c->lastiv, c->u_iv.iv, blocksize ); - nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + buf_cpy( c->lastiv, c->u_iv.iv, blocksize ); + nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; c->unused = blocksize; c->unused -= inbuflen; @@ -103,6 +104,7 @@ _gcry_cipher_ofb_decrypt (gcry_cipher_hd_t c, const unsigned char *inbuf, unsigned int inbuflen) { unsigned char *ivp; + gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; size_t blocksize = c->spec->blocksize; unsigned int burn, nburn; @@ -134,8 +136,8 @@ _gcry_cipher_ofb_decrypt (gcry_cipher_hd_t c, while ( inbuflen >= blocksize ) { /* Encrypt the IV (and save the current one). */ - memcpy( c->lastiv, c->u_iv.iv, blocksize ); - nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + buf_cpy( c->lastiv, c->u_iv.iv, blocksize ); + nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; buf_xor(outbuf, c->u_iv.iv, inbuf, blocksize); outbuf += blocksize; @@ -145,8 +147,8 @@ _gcry_cipher_ofb_decrypt (gcry_cipher_hd_t c, if ( inbuflen ) { /* Process the remaining bytes. */ /* Encrypt the IV (and save the current one). 
*/ - memcpy( c->lastiv, c->u_iv.iv, blocksize ); - nburn = c->spec->encrypt ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); + buf_cpy( c->lastiv, c->u_iv.iv, blocksize ); + nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; c->unused = blocksize; c->unused -= inbuflen; diff --git a/cipher/cipher.c b/cipher/cipher.c index 5214d26..c0d1d0b 100644 --- a/cipher/cipher.c +++ b/cipher/cipher.c @@ -631,6 +631,7 @@ do_ecb_encrypt (gcry_cipher_hd_t c, unsigned char *outbuf, unsigned int outbuflen, const unsigned char *inbuf, unsigned int inbuflen) { + gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; unsigned int blocksize = c->spec->blocksize; unsigned int n, nblocks; unsigned int burn, nburn; @@ -640,12 +641,12 @@ do_ecb_encrypt (gcry_cipher_hd_t c, if ((inbuflen % blocksize)) return GPG_ERR_INV_LENGTH; - nblocks = inbuflen / c->spec->blocksize; + nblocks = inbuflen / blocksize; burn = 0; for (n=0; n < nblocks; n++ ) { - nburn = c->spec->encrypt (&c->context.c, outbuf, (byte*)/*arggg*/inbuf); + nburn = enc_fn (&c->context.c, outbuf, (byte*)/*arggg*/inbuf); burn = nburn > burn ? nburn : burn; inbuf += blocksize; outbuf += blocksize; @@ -662,6 +663,7 @@ do_ecb_decrypt (gcry_cipher_hd_t c, unsigned char *outbuf, unsigned int outbuflen, const unsigned char *inbuf, unsigned int inbuflen) { + gcry_cipher_decrypt_t dec_fn = c->spec->decrypt; unsigned int blocksize = c->spec->blocksize; unsigned int n, nblocks; unsigned int burn, nburn; @@ -671,12 +673,12 @@ do_ecb_decrypt (gcry_cipher_hd_t c, if ((inbuflen % blocksize)) return GPG_ERR_INV_LENGTH; - nblocks = inbuflen / c->spec->blocksize; + nblocks = inbuflen / blocksize; burn = 0; for (n=0; n < nblocks; n++ ) { - nburn = c->spec->decrypt (&c->context.c, outbuf, (byte*)/*arggg*/inbuf); + nburn = dec_fn (&c->context.c, outbuf, (byte*)/*arggg*/inbuf); burn = nburn > burn ? 
nburn : burn; inbuf += blocksize; outbuf += blocksize; diff --git a/cipher/rijndael.c b/cipher/rijndael.c index e9bb4f6..e8733c9 100644 --- a/cipher/rijndael.c +++ b/cipher/rijndael.c @@ -675,9 +675,9 @@ do_encrypt (const RIJNDAEL_context *ctx, byte b[16] ATTR_ALIGNED_16; } b; - memcpy (a.a, ax, 16); + buf_cpy (a.a, ax, 16); do_encrypt_aligned (ctx, b.b, a.a); - memcpy (bx, b.b, 16); + buf_cpy (bx, b.b, 16); } else #endif /*!USE_AMD64_ASM && !USE_ARM_ASM*/ @@ -1556,12 +1556,15 @@ _gcry_aes_cbc_enc (void *context, unsigned char *iv, RIJNDAEL_context *ctx = context; unsigned char *outbuf = outbuf_arg; const unsigned char *inbuf = inbuf_arg; + unsigned char *last_iv; #ifdef USE_AESNI if (ctx->use_aesni) aesni_prepare (); #endif /*USE_AESNI*/ + last_iv = iv; + for ( ;nblocks; nblocks-- ) { if (0) @@ -1576,24 +1579,17 @@ _gcry_aes_cbc_enc (void *context, unsigned char *iv, "pxor %%xmm0, %%xmm1\n\t" "movdqu %%xmm1, %[outbuf]\n\t" : /* No output */ - : [iv] "m" (*iv), + : [iv] "m" (*last_iv), [inbuf] "m" (*inbuf), [outbuf] "m" (*outbuf) : "memory" ); do_aesni (ctx, 0, outbuf, outbuf); - - asm volatile ("movdqu %[outbuf], %%xmm0\n\t" - "movdqu %%xmm0, %[iv]\n\t" - : /* No output */ - : [outbuf] "m" (*outbuf), - [iv] "m" (*iv) - : "memory" ); } #endif /*USE_AESNI*/ else { - buf_xor(outbuf, inbuf, iv, BLOCKSIZE); + buf_xor(outbuf, inbuf, last_iv, BLOCKSIZE); if (0) ; @@ -1603,18 +1599,34 @@ _gcry_aes_cbc_enc (void *context, unsigned char *iv, #endif /*USE_PADLOCK*/ else do_encrypt (ctx, outbuf, outbuf ); - - memcpy (iv, outbuf, BLOCKSIZE); } + last_iv = outbuf; inbuf += BLOCKSIZE; if (!cbc_mac) outbuf += BLOCKSIZE; } + if (last_iv != iv) + { + if (0) + ; +#ifdef USE_AESNI + else if (ctx->use_aesni) + asm volatile ("movdqu %[last], %%xmm0\n\t" + "movdqu %%xmm0, %[iv]\n\t" + : /* No output */ + : [last] "m" (*last_iv), + [iv] "m" (*iv) + : "memory" ); +#endif /*USE_AESNI*/ + else + buf_cpy (iv, last_iv, BLOCKSIZE); + } + #ifdef USE_AESNI - if (ctx->use_aesni) - aesni_cleanup 
(); + if (ctx->use_aesni) + aesni_cleanup (); #endif /*USE_AESNI*/ _gcry_burn_stack (48 + 2*sizeof(int)); @@ -1810,9 +1822,9 @@ do_decrypt (RIJNDAEL_context *ctx, byte *bx, const byte *ax) byte b[16] ATTR_ALIGNED_16; } b; - memcpy (a.a, ax, 16); + buf_cpy (a.a, ax, 16); do_decrypt_aligned (ctx, b.b, a.a); - memcpy (bx, b.b, 16); + buf_cpy (bx, b.b, 16); } else #endif /*!USE_AMD64_ASM && !USE_ARM_ASM*/ @@ -2068,21 +2080,19 @@ _gcry_aes_cbc_dec (void *context, unsigned char *iv, else for ( ;nblocks; nblocks-- ) { - /* We need to save INBUF away because it may be identical to - OUTBUF. */ - memcpy (savebuf, inbuf, BLOCKSIZE); + /* INBUF is needed later and it may be identical to OUTBUF, so store + the intermediate result to SAVEBUF. */ if (0) ; #ifdef USE_PADLOCK else if (ctx->use_padlock) - do_padlock (ctx, 1, outbuf, inbuf); + do_padlock (ctx, 1, savebuf, inbuf); #endif /*USE_PADLOCK*/ else - do_decrypt (ctx, outbuf, inbuf); + do_decrypt (ctx, savebuf, inbuf); - buf_xor(outbuf, outbuf, iv, BLOCKSIZE); - memcpy (iv, savebuf, BLOCKSIZE); + buf_xor_n_copy_2(outbuf, savebuf, iv, inbuf, BLOCKSIZE); inbuf += BLOCKSIZE; outbuf += BLOCKSIZE; } diff --git a/cipher/serpent.c b/cipher/serpent.c index c0898dc..a8ee15f 100644 --- a/cipher/serpent.c +++ b/cipher/serpent.c @@ -950,14 +950,11 @@ _gcry_serpent_cbc_dec(void *context, unsigned char *iv, for ( ;nblocks; nblocks-- ) { - /* We need to save INBUF away because it may be identical to - OUTBUF. */ - memcpy(savebuf, inbuf, sizeof(serpent_block_t)); + /* INBUF is needed later and it may be identical to OUTBUF, so store + the intermediate result to SAVEBUF. 
*/ + serpent_decrypt_internal (ctx, inbuf, savebuf); - serpent_decrypt_internal (ctx, inbuf, outbuf); - - buf_xor(outbuf, outbuf, iv, sizeof(serpent_block_t)); - memcpy(iv, savebuf, sizeof(serpent_block_t)); + buf_xor_n_copy_2(outbuf, savebuf, iv, inbuf, sizeof(serpent_block_t)); inbuf += sizeof(serpent_block_t); outbuf += sizeof(serpent_block_t); } diff --git a/cipher/twofish.c b/cipher/twofish.c index 086df76..04081db 100644 --- a/cipher/twofish.c +++ b/cipher/twofish.c @@ -1055,16 +1055,13 @@ _gcry_twofish_cbc_dec(void *context, unsigned char *iv, void *outbuf_arg, for ( ;nblocks; nblocks-- ) { - /* We need to save INBUF away because it may be identical to - OUTBUF. */ - memcpy(savebuf, inbuf, TWOFISH_BLOCKSIZE); - - burn = twofish_decrypt (ctx, outbuf, inbuf); + /* INBUF is needed later and it may be identical to OUTBUF, so store + the intermediate result to SAVEBUF. */ + burn = twofish_decrypt (ctx, savebuf, inbuf); if (burn > burn_stack_depth) burn_stack_depth = burn; - buf_xor(outbuf, outbuf, iv, TWOFISH_BLOCKSIZE); - memcpy(iv, savebuf, TWOFISH_BLOCKSIZE); + buf_xor_n_copy_2(outbuf, savebuf, iv, inbuf, TWOFISH_BLOCKSIZE); inbuf += TWOFISH_BLOCKSIZE; outbuf += TWOFISH_BLOCKSIZE; } commit 2901a10dbf1264707debc8402546c07eeac60932 Author: Jussi Kivilinna Date: Wed Oct 23 18:36:18 2013 +0300 bufhelp: enable unaligned memory accesses for AArch64 (64-bit ARM) * cipher/bufhelp.h [__aarch64__] (BUFHELP_FAST_UNALIGNED_ACCESS): Set macro on AArch64. -- Signed-off-by: Jussi Kivilinna diff --git a/cipher/bufhelp.h b/cipher/bufhelp.h index 21fac07..198d286 100644 --- a/cipher/bufhelp.h +++ b/cipher/bufhelp.h @@ -35,7 +35,8 @@ #if defined(__i386__) || defined(__x86_64__) || \ defined(__powerpc__) || defined(__powerpc64__) || \ - (defined(__arm__) && defined(__ARM_FEATURE_UNALIGNED)) + (defined(__arm__) && defined(__ARM_FEATURE_UNALIGNED)) || \ + defined(__aarch64__) /* These architectures are able of unaligned memory accesses and can handle those fast. 
*/ ----------------------------------------------------------------------- Summary of changes: cipher/blowfish.c | 11 ++--- cipher/bufhelp.h | 107 ++++++++++++++++++++++++++++++++---------------- cipher/camellia-glue.c | 11 ++--- cipher/cast5.c | 11 ++--- cipher/cipher-cbc.c | 53 +++++++++++++----------- cipher/cipher-ccm.c | 7 ++-- cipher/cipher-cfb.c | 24 ++++++----- cipher/cipher-ctr.c | 5 ++- cipher/cipher-ofb.c | 20 +++++---- cipher/cipher.c | 10 +++-- cipher/rijndael.c | 58 +++++++++++++++----------- cipher/serpent.c | 11 ++--- cipher/twofish.c | 11 ++--- src/g10lib.h | 85 +++++++++++--------------------------- 14 files changed, 215 insertions(+), 209 deletions(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From dbaryshkov at gmail.com Wed Oct 23 22:17:53 2013 From: dbaryshkov at gmail.com (Dmitry Eremin-Solenikov) Date: Thu, 24 Oct 2013 00:17:53 +0400 Subject: [PATCH 3/3] Enable assembler optimizations on earlier ARM cores In-Reply-To: <5267AEF1.5010504@iki.fi> References: <1382470167-11975-1-git-send-email-dbaryshkov@gmail.com> <1382470167-11975-3-git-send-email-dbaryshkov@gmail.com> <5267AEF1.5010504@iki.fi> Message-ID: Hello, On Wed, Oct 23, 2013 at 3:11 PM, Jussi Kivilinna wrote: > Thanks. I've added few fixes and if none objects, I'll push these later today > or tomorrow. Thanks! -- With best wishes Dmitry From cvs at cvs.gnupg.org Thu Oct 24 11:23:41 2013 From: cvs at cvs.gnupg.org (by Jussi Kivilinna) Date: Thu, 24 Oct 2013 11:23:41 +0200 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-333-g9ce54e5 Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". 
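The CBC-decrypt rework summarized above (rijndael, serpent, twofish) replaces a `memcpy` of INBUF plus a separate `buf_xor` with one combined xor-and-copy call. A minimal sketch of that pattern in plain C follows; `xor_n_copy_2`, `toy_decrypt` and the 8-byte toy cipher are hypothetical stand-ins for illustration, not the libgcrypt API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLK 8

/* Hypothetical stand-in for libgcrypt's buf_xor_n_copy_2: in one pass,
   write dst = srcxor XOR iv, then load the old ciphertext (srccpy)
   into iv.  srccpy[i] is read before dst[i] is written, so the loop
   stays correct when dst and srccpy are the same buffer. */
static void
xor_n_copy_2 (uint8_t *dst, const uint8_t *srcxor,
              uint8_t *iv, const uint8_t *srccpy, size_t len)
{
  for (size_t i = 0; i < len; i++)
    {
      uint8_t c = srccpy[i];
      dst[i] = srcxor[i] ^ iv[i];
      iv[i] = c;
    }
}

/* Toy block "cipher" (XOR with a constant), an involution, so the same
   routine serves as encrypt and decrypt.  It exists only to exercise
   the buffer handling. */
static void
toy_decrypt (uint8_t *out, const uint8_t *in)
{
  for (int i = 0; i < BLK; i++)
    out[i] = in[i] ^ 0x5a;
}

/* CBC decryption in the style of the patched code: decrypt into
   SAVEBUF first, then one combined call emits the plaintext and
   advances the IV.  BUF is both input and output. */
static void
cbc_dec_inplace (uint8_t iv[BLK], uint8_t *buf, size_t nblocks)
{
  uint8_t savebuf[BLK];

  for (; nblocks; nblocks--, buf += BLK)
    {
      toy_decrypt (savebuf, buf);
      xor_n_copy_2 (buf, savebuf, iv, buf, BLK);
    }
}

/* Matching toy CBC encryption, used only to build a test vector. */
static void
cbc_enc_inplace (uint8_t iv[BLK], uint8_t *buf, size_t nblocks)
{
  for (; nblocks; nblocks--, buf += BLK)
    {
      for (int i = 0; i < BLK; i++)
        buf[i] ^= iv[i];
      toy_decrypt (buf, buf);   /* involution: same as encrypting */
      memcpy (iv, buf, BLK);
    }
}
```

The point of the design is that the old ciphertext byte is read before the plaintext byte is written, so the combined helper remains correct when the caller decrypts in place (outbuf == inbuf), which is exactly the case the original `memcpy (savebuf, inbuf, ...)` guarded against, now without the extra copy.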
The branch, master has been updated via 9ce54e5b512418ddf45ce18f2cbd48cdced779f5 (commit) from 54df6fcd806f8c150cffe6cc09925bb8b638bb5b (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit 9ce54e5b512418ddf45ce18f2cbd48cdced779f5 Author: Jussi Kivilinna Date: Thu Oct 24 12:20:55 2013 +0300 ecc-gost: Add missing include * ecc-gost.c: Include "pubkey-internal.h". -- Patch fixes compiler warning: ecc-gost.c: In function '_gcry_ecc_gost_sign': ecc-gost.c:95:11: warning: implicit declaration of function '_gcry_dsa_gen_k' [-Wimplicit-function-declaration] k = _gcry_dsa_gen_k (skey->E.n, GCRY_STRONG_RANDOM); ^ Signed-off-by: Jussi Kivilinna diff --git a/cipher/ecc-gost.c b/cipher/ecc-gost.c index 1ebfd39..a40459c 100644 --- a/cipher/ecc-gost.c +++ b/cipher/ecc-gost.c @@ -30,6 +30,7 @@ #include "context.h" #include "ec-context.h" #include "ecc-common.h" +#include "pubkey-internal.h" /* Compute an GOST R 34.10-01/-12 signature. ----------------------------------------------------------------------- Summary of changes: cipher/ecc-gost.c | 1 + 1 file changed, 1 insertion(+) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From jussi.kivilinna at iki.fi Thu Oct 24 11:25:47 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Thu, 24 Oct 2013 12:25:47 +0300 Subject: [PATCH 1/2] Drop _gcry_cipher_ofb_decrypt as it duplicates _gcry_cipher_ofb_encrypt Message-ID: <20131024092547.32328.25893.stgit@localhost6.localdomain6> From: Dmitry Eremin-Solenikov * cipher/cipher.c (cipher_decrypt): Use _gcry_cipher_ofb_encrypt for OFB decryption. * cipher/cipher-internal.h: Remove _gcry_cipher_ofb_decrypt declaration. 
* cipher/cipher-ofb.c (_gcry_cipher_ofb_decrypt): Remove. (_gcry_cipher_ofb_encrypt): remove copying of IV to lastiv, it's unused there. Signed-off-by: Dmitry Eremin-Solenikov --- cipher/cipher-internal.h | 4 --- cipher/cipher-ofb.c | 69 ---------------------------------------------- cipher/cipher.c | 2 + 3 files changed, 1 insertion(+), 74 deletions(-) diff --git a/cipher/cipher-internal.h b/cipher/cipher-internal.h index 981caa8..f528c84 100644 --- a/cipher/cipher-internal.h +++ b/cipher/cipher-internal.h @@ -173,10 +173,6 @@ gcry_err_code_t _gcry_cipher_ofb_encrypt /* */ (gcry_cipher_hd_t c, unsigned char *outbuf, unsigned int outbuflen, const unsigned char *inbuf, unsigned int inbuflen); -gcry_err_code_t _gcry_cipher_ofb_decrypt -/* */ (gcry_cipher_hd_t c, - unsigned char *outbuf, unsigned int outbuflen, - const unsigned char *inbuf, unsigned int inbuflen); /*-- cipher-ctr.c --*/ gcry_err_code_t _gcry_cipher_ctr_encrypt diff --git a/cipher/cipher-ofb.c b/cipher/cipher-ofb.c index 333a748..c6d84dd 100644 --- a/cipher/cipher-ofb.c +++ b/cipher/cipher-ofb.c @@ -70,7 +70,6 @@ _gcry_cipher_ofb_encrypt (gcry_cipher_hd_t c, while ( inbuflen >= blocksize ) { /* Encrypt the IV (and save the current one). */ - buf_cpy( c->lastiv, c->u_iv.iv, blocksize ); nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; buf_xor(outbuf, c->u_iv.iv, inbuf, blocksize); @@ -80,74 +79,6 @@ _gcry_cipher_ofb_encrypt (gcry_cipher_hd_t c, } if ( inbuflen ) { /* process the remaining bytes */ - buf_cpy( c->lastiv, c->u_iv.iv, blocksize ); - nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); - burn = nburn > burn ? 
nburn : burn; - c->unused = blocksize; - c->unused -= inbuflen; - buf_xor(outbuf, c->u_iv.iv, inbuf, inbuflen); - outbuf += inbuflen; - inbuf += inbuflen; - inbuflen = 0; - } - - if (burn > 0) - _gcry_burn_stack (burn + 4 * sizeof(void *)); - - return 0; -} - - -gcry_err_code_t -_gcry_cipher_ofb_decrypt (gcry_cipher_hd_t c, - unsigned char *outbuf, unsigned int outbuflen, - const unsigned char *inbuf, unsigned int inbuflen) -{ - unsigned char *ivp; - gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; - size_t blocksize = c->spec->blocksize; - unsigned int burn, nburn; - - if (outbuflen < inbuflen) - return GPG_ERR_BUFFER_TOO_SHORT; - - if( inbuflen <= c->unused ) - { - /* Short enough to be encoded by the remaining XOR mask. */ - ivp = c->u_iv.iv + blocksize - c->unused; - buf_xor(outbuf, ivp, inbuf, inbuflen); - c->unused -= inbuflen; - return 0; - } - - burn = 0; - - if ( c->unused ) - { - inbuflen -= c->unused; - ivp = c->u_iv.iv + blocksize - c->unused; - buf_xor(outbuf, ivp, inbuf, c->unused); - outbuf += c->unused; - inbuf += c->unused; - c->unused = 0; - } - - /* Now we can process complete blocks. */ - while ( inbuflen >= blocksize ) - { - /* Encrypt the IV (and save the current one). */ - buf_cpy( c->lastiv, c->u_iv.iv, blocksize ); - nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); - burn = nburn > burn ? nburn : burn; - buf_xor(outbuf, c->u_iv.iv, inbuf, blocksize); - outbuf += blocksize; - inbuf += blocksize; - inbuflen -= blocksize; - } - if ( inbuflen ) - { /* Process the remaining bytes. */ - /* Encrypt the IV (and save the current one). */ - buf_cpy( c->lastiv, c->u_iv.iv, blocksize ); nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? 
nburn : burn; c->unused = blocksize; diff --git a/cipher/cipher.c b/cipher/cipher.c index c0d1d0b..df6d202 100644 --- a/cipher/cipher.c +++ b/cipher/cipher.c @@ -814,7 +814,7 @@ cipher_decrypt (gcry_cipher_hd_t c, byte *outbuf, unsigned int outbuflen, break; case GCRY_CIPHER_MODE_OFB: - rc = _gcry_cipher_ofb_decrypt (c, outbuf, outbuflen, inbuf, inbuflen); + rc = _gcry_cipher_ofb_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); break; case GCRY_CIPHER_MODE_CTR: From jussi.kivilinna at iki.fi Thu Oct 24 11:25:52 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Thu, 24 Oct 2013 12:25:52 +0300 Subject: [PATCH 2/2] Deduplicate code for ECB encryption and decryption In-Reply-To: <20131024092547.32328.25893.stgit@localhost6.localdomain6> References: <20131024092547.32328.25893.stgit@localhost6.localdomain6> Message-ID: <20131024092552.32328.94070.stgit@localhost6.localdomain6> * cipher/cipher.c (do_ecb_crypt): New, based on old 'do_ecb_encrypt'. (do_ecb_encrypt): Use 'do_ecb_crypt', pass encryption function. (do_ecb_decrypt): Use 'do_ecb_crypt', pass decryption function. 
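The consolidation described in the changelog above works because, as the patch itself exploits by passing `c->spec->decrypt` where a `gcry_cipher_encrypt_t` is expected, the encrypt and decrypt callbacks share one prototype. A toy miniature of the shape of `do_ecb_crypt`; the names and the trivial add/subtract "cipher" are illustrative only:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLK 4

/* One shared prototype for both directions, mirroring how the two
   libgcrypt callback types line up. */
typedef void (*crypt_fn_t) (uint8_t *out, const uint8_t *in);

/* Trivial illustrative "cipher": add or subtract one per byte. */
static void
toy_encrypt (uint8_t *out, const uint8_t *in)
{
  for (int i = 0; i < BLK; i++)
    out[i] = (uint8_t)(in[i] + 1);
}

static void
toy_decrypt (uint8_t *out, const uint8_t *in)
{
  for (int i = 0; i < BLK; i++)
    out[i] = (uint8_t)(in[i] - 1);
}

/* The single ECB loop, shaped like do_ecb_crypt: walk the blocks and
   apply whatever per-block function was handed in. */
static void
ecb_crypt (crypt_fn_t fn, uint8_t *out, const uint8_t *in, size_t nblocks)
{
  for (; nblocks; nblocks--, in += BLK, out += BLK)
    fn (out, in);
}

/* The public entry points shrink to one-line wrappers. */
static void
ecb_encrypt (uint8_t *out, const uint8_t *in, size_t nblocks)
{
  ecb_crypt (toy_encrypt, out, in, nblocks);
}

static void
ecb_decrypt (uint8_t *out, const uint8_t *in, size_t nblocks)
{
  ecb_crypt (toy_decrypt, out, in, nblocks);
}
```

The same function-pointer trick is what made the companion OFB patch possible: since OFB en- and decryption are the identical keystream XOR, one routine can simply serve both dispatch slots.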
-- Signed-off-by: Jussi Kivilinna --- cipher/cipher.c | 44 ++++++++++++++------------------------------ 1 file changed, 14 insertions(+), 30 deletions(-) diff --git a/cipher/cipher.c b/cipher/cipher.c index df6d202..73a97b1 100644 --- a/cipher/cipher.c +++ b/cipher/cipher.c @@ -627,11 +627,11 @@ cipher_reset (gcry_cipher_hd_t c) static gcry_err_code_t -do_ecb_encrypt (gcry_cipher_hd_t c, - unsigned char *outbuf, unsigned int outbuflen, - const unsigned char *inbuf, unsigned int inbuflen) +do_ecb_crypt (gcry_cipher_hd_t c, + unsigned char *outbuf, unsigned int outbuflen, + const unsigned char *inbuf, unsigned int inbuflen, + gcry_cipher_encrypt_t crypt_fn) { - gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; unsigned int blocksize = c->spec->blocksize; unsigned int n, nblocks; unsigned int burn, nburn; @@ -646,7 +646,7 @@ do_ecb_encrypt (gcry_cipher_hd_t c, for (n=0; n < nblocks; n++ ) { - nburn = enc_fn (&c->context.c, outbuf, (byte*)/*arggg*/inbuf); + nburn = crypt_fn (&c->context.c, outbuf, inbuf); burn = nburn > burn ? nburn : burn; inbuf += blocksize; outbuf += blocksize; @@ -659,35 +659,19 @@ do_ecb_encrypt (gcry_cipher_hd_t c, } static gcry_err_code_t -do_ecb_decrypt (gcry_cipher_hd_t c, +do_ecb_encrypt (gcry_cipher_hd_t c, unsigned char *outbuf, unsigned int outbuflen, const unsigned char *inbuf, unsigned int inbuflen) { - gcry_cipher_decrypt_t dec_fn = c->spec->decrypt; - unsigned int blocksize = c->spec->blocksize; - unsigned int n, nblocks; - unsigned int burn, nburn; - - if (outbuflen < inbuflen) - return GPG_ERR_BUFFER_TOO_SHORT; - if ((inbuflen % blocksize)) - return GPG_ERR_INV_LENGTH; - - nblocks = inbuflen / blocksize; - burn = 0; - - for (n=0; n < nblocks; n++ ) - { - nburn = dec_fn (&c->context.c, outbuf, (byte*)/*arggg*/inbuf); - burn = nburn > burn ? 
nburn : burn; - inbuf += blocksize; - outbuf += blocksize; - } - - if (burn > 0) - _gcry_burn_stack (burn + 4 * sizeof(void *)); + return do_ecb_crypt (c, outbuf, outbuflen, inbuf, inbuflen, c->spec->encrypt); +} - return 0; +static gcry_err_code_t +do_ecb_decrypt (gcry_cipher_hd_t c, + unsigned char *outbuf, unsigned int outbuflen, + const unsigned char *inbuf, unsigned int inbuflen) +{ + return do_ecb_crypt (c, outbuf, outbuflen, inbuf, inbuflen, c->spec->decrypt); } From cvs at cvs.gnupg.org Thu Oct 24 16:00:20 2013 From: cvs at cvs.gnupg.org (by Werner Koch) Date: Thu, 24 Oct 2013 16:00:20 +0200 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-335-gc630fd7 Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via c630fd71b336eb9209e914d24dc1e26a34521882 (commit) via 1cf5699b6febab1ef9d300531acc2ee33a7df739 (commit) from 9ce54e5b512418ddf45ce18f2cbd48cdced779f5 (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit c630fd71b336eb9209e914d24dc1e26a34521882 Author: Werner Koch Date: Thu Oct 24 13:59:29 2013 +0200 ecc: Change algorithm for Ed25519 x recovery. * cipher/ecc-eddsa.c (scanval): Add as temporary hack. (_gcry_ecc_eddsa_recover_x): Use the algorithm from page 15 of the paper. Return an error code. (_gcry_ecc_eddsa_decodepoint): Take care of the error code. * mpi/mpi-mul.c (gcry_mpi_mulm): Use truncated division. 
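For context, the "algorithm from page 15 of the paper" mentioned above is the well-known Ed25519 point-decompression square root (the same method later standardized in RFC 8032); the correspondence is my gloss, not stated in the commit. A sketch of the arithmetic, using the names from the new code:

```latex
\[
  u = y^2 - 1, \qquad v = b\,y^2 + 1, \qquad
  p = 2^{255} - 19 \equiv 5 \pmod 8 .
\]
The candidate root of $v\,x^2 \equiv u \pmod p$ is computed with a
single exponentiation and no modular inversion:
\[
  x \equiv u\,v^3\,(u\,v^7)^{(p-5)/8} \pmod p .
\]
This equals $(u/v)^{(p+3)/8}$, because
$u\,v^3\,(u\,v^7)^{(p-5)/8} = u^{(p+3)/8}\,v^{(7p-11)/8}$ and
$(7p-11)/8 + (p+3)/8 = p - 1$, so $v^{(7p-11)/8} \equiv v^{-(p+3)/8}$
by Fermat.  Afterwards $v\,x^2$ is either $u$ (done); $-u$, in which
case $x$ is multiplied by a square root of $-1$ (e.g.\
$2^{(p-1)/4} \bmod p$; the constant \texttt{m1} in the code); or
neither, in which case no root exists and the encoded point is
rejected, the new \texttt{GPG\_ERR\_INV\_OBJ} path.  Of the two roots
$\pm x$, the one whose low bit matches the encoded sign bit is
returned.
\]
```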
Signed-off-by: Werner Koch diff --git a/cipher/ecc-common.h b/cipher/ecc-common.h index e451f8d..93fd449 100644 --- a/cipher/ecc-common.h +++ b/cipher/ecc-common.h @@ -97,8 +97,8 @@ gpg_err_code_t _gcry_ecc_ecdsa_verify (gcry_mpi_t input, ECC_public_key *pkey, gcry_mpi_t r, gcry_mpi_t s); /*-- ecc-eddsa.c --*/ -void _gcry_ecc_eddsa_recover_x (gcry_mpi_t x, gcry_mpi_t y, int sign, - mpi_ec_t ec); +gpg_err_code_t _gcry_ecc_eddsa_recover_x (gcry_mpi_t x, gcry_mpi_t y, int sign, + mpi_ec_t ec); gpg_err_code_t _gcry_ecc_eddsa_encodepoint (mpi_point_t point, mpi_ec_t ctx, gcry_mpi_t x, gcry_mpi_t y, unsigned char **r_buffer, diff --git a/cipher/ecc-eddsa.c b/cipher/ecc-eddsa.c index 4a9fe0a..22f2702 100644 --- a/cipher/ecc-eddsa.c +++ b/cipher/ecc-eddsa.c @@ -46,6 +46,20 @@ reverse_buffer (unsigned char *buffer, unsigned int length) } +/* Helper to scan a hex string. */ +static gcry_mpi_t +scanval (const char *string) +{ + gpg_error_t err; + gcry_mpi_t val; + + err = gcry_mpi_scan (&val, GCRYMPI_FMT_HEX, string, 0, NULL); + if (err) + log_fatal ("scanning ECC parameter failed: %s\n", gpg_strerror (err)); + return val; +} + + /* Encode MPI using the EdDSA scheme. MINLEN specifies the required length of the buffer in bytes. On success 0 is returned an a @@ -122,61 +136,82 @@ _gcry_ecc_eddsa_encodepoint (mpi_point_t point, mpi_ec_t ec, } -/* Recover X from Y and SIGN . */ -void +/* Recover X from Y and SIGN (which actually is a parity bit). */ +gpg_err_code_t _gcry_ecc_eddsa_recover_x (gcry_mpi_t x, gcry_mpi_t y, int sign, mpi_ec_t ec) { - /* FIXME: This algorithm can be improved - see the paper. - sqrt(-1) mod ed255519_p: - 2B8324804FC1DF0B2B4D00993DFBD7A72F431806AD2FE478C4EE1B274A0EA0B0 */ - gcry_mpi_t yy, t, p1, p2, p3; - - /* t = (y^2-1) ? 
((b*y^2+1)^{p-2} mod p) */ - yy = mpi_new (0); - mpi_mul (yy, y, y); - t = mpi_copy (yy); - mpi_mul (t, t, ec->b); - mpi_add_ui (t, t, 1); - p2 = mpi_copy (ec->p); - mpi_sub_ui (p2, p2, 2); - mpi_powm (t, t, p2, ec->p); - - mpi_sub_ui (yy, yy, 1); - mpi_mul (t, yy, t); - - /* x = t^{(p+3)/8} mod p */ - p3 = mpi_copy (ec->p); - mpi_add_ui (p3, p3, 3); - mpi_fdiv_q (p3, p3, mpi_const (MPI_C_EIGHT)); - mpi_powm (x, t, p3, ec->p); - - /* (x^2 - t) % p != 0 ? x = (x*(2^{(p-1)/4} mod p)) % p */ - mpi_mul (yy, x, x); - mpi_subm (yy, yy, t, ec->p); - if (mpi_cmp_ui (yy, 0)) + gpg_err_code_t rc = 0; + gcry_mpi_t u, v, v3, t; + static gcry_mpi_t p58, seven; + + if (ec->dialect != ECC_DIALECT_ED25519) + return GPG_ERR_NOT_IMPLEMENTED; + + if (!p58) + p58 = scanval ("0FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF" + "FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFD"); + if (!seven) + seven = mpi_set_ui (NULL, 7); + + u = mpi_new (0); + v = mpi_new (0); + v3 = mpi_new (0); + t = mpi_new (0); + + /* Compute u and v */ + /* u = y^2 */ + mpi_mulm (u, y, y, ec->p); + /* v = b*y^2 */ + mpi_mulm (v, ec->b, u, ec->p); + /* u = y^2-1 */ + mpi_sub_ui (u, u, 1); + /* v = b*y^2+1 */ + mpi_add_ui (v, v, 1); + + /* Compute sqrt(u/v) */ + /* v3 = v^3 */ + mpi_powm (v3, v, mpi_const (MPI_C_THREE), ec->p); + /* t = v3 * v3 * u * v = u * v^7 */ + mpi_powm (t, v, seven, ec->p); + mpi_mulm (t, t, u, ec->p); + /* t = t^((p-5)/8) = (u * v^7)^((p-5)/8) */ + mpi_powm (t, t, p58, ec->p); + /* x = t * u * v^3 = (u * v^3) * (u * v^7)^((p-5)/8) */ + mpi_mulm (t, t, u, ec->p); + mpi_mulm (x, t, v3, ec->p); + + /* Adjust if needed. */ + /* t = v * x^2 */ + mpi_mulm (t, x, x, ec->p); + mpi_mulm (t, t, v, ec->p); + /* -t == u ? x = x * sqrt(-1) */ + gcry_mpi_neg (t, t); + if (!mpi_cmp (t, u)) { - p1 = mpi_copy (ec->p); - mpi_sub_ui (p1, p1, 1); - mpi_fdiv_q (p1, p1, mpi_const (MPI_C_FOUR)); - mpi_powm (yy, mpi_const (MPI_C_TWO), p1, ec->p); - mpi_mulm (x, x, yy, ec->p); + static gcry_mpi_t m1; /* Fixme: this is not thread-safe. 
*/ + if (!m1) + m1 = scanval ("2B8324804FC1DF0B2B4D00993DFBD7A7" + "2F431806AD2FE478C4EE1B274A0EA0B0"); + mpi_mulm (x, x, m1, ec->p); + /* t = v * x^2 */ + mpi_mulm (t, x, x, ec->p); + mpi_mulm (t, t, v, ec->p); + /* -t == u ? x = x * sqrt(-1) */ + gcry_mpi_neg (t, t); + if (!mpi_cmp (t, u)) + rc = GPG_ERR_INV_OBJ; } - else - p1 = NULL; - /* is_odd(x) ? x = p-x */ - if (mpi_test_bit (x, 0)) - mpi_sub (x, ec->p, x); + /* Choose the desired square root according to parity */ + if (mpi_test_bit (x, 0) != !!sign) + gcry_mpi_neg (x, x); - /* lowbit(x) != sign ? x = p-x */ - if (mpi_test_bit (x, 0) != sign) - mpi_sub (x, ec->p, x); + mpi_free (t); + mpi_free (v3); + mpi_free (v); + mpi_free (u); - gcry_mpi_release (yy); - gcry_mpi_release (t); - gcry_mpi_release (p3); - gcry_mpi_release (p2); - gcry_mpi_release (p1); + return rc; } @@ -278,10 +313,10 @@ _gcry_ecc_eddsa_decodepoint (gcry_mpi_t pk, mpi_ec_t ctx, mpi_point_t result, else gcry_free (rawmpi); - _gcry_ecc_eddsa_recover_x (result->x, result->y, sign, ctx); + rc = _gcry_ecc_eddsa_recover_x (result->x, result->y, sign, ctx); mpi_set_ui (result->z, 1); - return 0; + return rc; } diff --git a/mpi/mpi-mul.c b/mpi/mpi-mul.c index ec6aea0..0a68711 100644 --- a/mpi/mpi-mul.c +++ b/mpi/mpi-mul.c @@ -208,5 +208,5 @@ void gcry_mpi_mulm (gcry_mpi_t w, gcry_mpi_t u, gcry_mpi_t v, gcry_mpi_t m) { gcry_mpi_mul (w, u, v); - _gcry_mpi_mod (w, w, m); + _gcry_mpi_tdiv_r (w, w, m); } diff --git a/tests/t-ed25519.c b/tests/t-ed25519.c index 0a6ae14..be200fa 100644 --- a/tests/t-ed25519.c +++ b/tests/t-ed25519.c @@ -53,6 +53,7 @@ static int debug; static int error_count; static int sign_with_pk; static int no_verify; +static int custom_data_file; static void die (const char *format, ...) 
@@ -405,9 +406,8 @@ one_test (int testno, const char *sk, const char *pk, static void -check_ed25519 (void) +check_ed25519 (const char *fname) { - char *fname; FILE *fp; int lineno, ntests; char *line; @@ -416,7 +416,6 @@ check_ed25519 (void) show ("Checking Ed25519.\n"); - fname = prepend_srcdir ("t-ed25519.inp"); fp = fopen (fname, "r"); if (!fp) die ("error opening '%s': %s\n", fname, strerror (errno)); @@ -459,13 +458,12 @@ check_ed25519 (void) xfree (msg); xfree (sig); - if (ntests != N_TESTS) + if (ntests != N_TESTS && !custom_data_file) fail ("did %d tests but expected %d", ntests, N_TESTS); else if ((ntests % 256)) show_note ("%d tests done\n", ntests); fclose (fp); - xfree (fname); } @@ -473,6 +471,7 @@ int main (int argc, char **argv) { int last_argc = -1; + char *fname = NULL; if (argc) { argc--; argv++; } @@ -492,7 +491,8 @@ main (int argc, char **argv) " --verbose print timings etc.\n" " --debug flyswatter\n" " --sign-with-pk also use the public key for signing\n" - " --no-verify skip the verify test\n", + " --no-verify skip the verify test\n" + " --data FNAME take test data from file FNAME\n", stdout); exit (0); } @@ -517,11 +517,25 @@ main (int argc, char **argv) no_verify = 1; argc--; argv++; } + else if (!strcmp (*argv, "--data")) + { + argc--; argv++; + if (argc) + { + xfree (fname); + fname = xstrdup (*argv); + argc--; argv++; + } + } else if (!strncmp (*argv, "--", 2)) die ("unknown option '%s'", *argv); } + if (!fname) + fname = prepend_srcdir ("t-ed25519.inp"); + else + custom_data_file = 1; gcry_control (GCRYCTL_DISABLE_SECMEM, 0); if (!gcry_check_version (GCRYPT_VERSION)) @@ -532,9 +546,11 @@ main (int argc, char **argv) gcry_control (GCRYCTL_INITIALIZATION_FINISHED, 0); start_timer (); - check_ed25519 (); + check_ed25519 (fname); stop_timer (); + xfree (fname); + show ("All tests completed in %s. 
Errors: %d\n", elapsed_time (), error_count); return !!error_count; commit 1cf5699b6febab1ef9d300531acc2ee33a7df739 Author: Werner Koch Date: Wed Oct 23 14:20:10 2013 +0200 ecc: Refactor _gcry_ecc_eddsa_decodepoint. * cipher/ecc-eddsa.c (_gcry_ecc_eddsa_decodepoint): Factor some code out to .. (_gcry_ecc_eddsa_recover_x): new. Signed-off-by: Werner Koch diff --git a/cipher/ecc-common.h b/cipher/ecc-common.h index 0a95b95..e451f8d 100644 --- a/cipher/ecc-common.h +++ b/cipher/ecc-common.h @@ -97,6 +97,8 @@ gpg_err_code_t _gcry_ecc_ecdsa_verify (gcry_mpi_t input, ECC_public_key *pkey, gcry_mpi_t r, gcry_mpi_t s); /*-- ecc-eddsa.c --*/ +void _gcry_ecc_eddsa_recover_x (gcry_mpi_t x, gcry_mpi_t y, int sign, + mpi_ec_t ec); gpg_err_code_t _gcry_ecc_eddsa_encodepoint (mpi_point_t point, mpi_ec_t ctx, gcry_mpi_t x, gcry_mpi_t y, unsigned char **r_buffer, diff --git a/cipher/ecc-eddsa.c b/cipher/ecc-eddsa.c index 72103e9..4a9fe0a 100644 --- a/cipher/ecc-eddsa.c +++ b/cipher/ecc-eddsa.c @@ -122,6 +122,64 @@ _gcry_ecc_eddsa_encodepoint (mpi_point_t point, mpi_ec_t ec, } +/* Recover X from Y and SIGN . */ +void +_gcry_ecc_eddsa_recover_x (gcry_mpi_t x, gcry_mpi_t y, int sign, mpi_ec_t ec) +{ + /* FIXME: This algorithm can be improved - see the paper. + sqrt(-1) mod ed255519_p: + 2B8324804FC1DF0B2B4D00993DFBD7A72F431806AD2FE478C4EE1B274A0EA0B0 */ + gcry_mpi_t yy, t, p1, p2, p3; + + /* t = (y^2-1) ? ((b*y^2+1)^{p-2} mod p) */ + yy = mpi_new (0); + mpi_mul (yy, y, y); + t = mpi_copy (yy); + mpi_mul (t, t, ec->b); + mpi_add_ui (t, t, 1); + p2 = mpi_copy (ec->p); + mpi_sub_ui (p2, p2, 2); + mpi_powm (t, t, p2, ec->p); + + mpi_sub_ui (yy, yy, 1); + mpi_mul (t, yy, t); + + /* x = t^{(p+3)/8} mod p */ + p3 = mpi_copy (ec->p); + mpi_add_ui (p3, p3, 3); + mpi_fdiv_q (p3, p3, mpi_const (MPI_C_EIGHT)); + mpi_powm (x, t, p3, ec->p); + + /* (x^2 - t) % p != 0 ? 
x = (x*(2^{(p-1)/4} mod p)) % p */ + mpi_mul (yy, x, x); + mpi_subm (yy, yy, t, ec->p); + if (mpi_cmp_ui (yy, 0)) + { + p1 = mpi_copy (ec->p); + mpi_sub_ui (p1, p1, 1); + mpi_fdiv_q (p1, p1, mpi_const (MPI_C_FOUR)); + mpi_powm (yy, mpi_const (MPI_C_TWO), p1, ec->p); + mpi_mulm (x, x, yy, ec->p); + } + else + p1 = NULL; + + /* is_odd(x) ? x = p-x */ + if (mpi_test_bit (x, 0)) + mpi_sub (x, ec->p, x); + + /* lowbit(x) != sign ? x = p-x */ + if (mpi_test_bit (x, 0) != sign) + mpi_sub (x, ec->p, x); + + gcry_mpi_release (yy); + gcry_mpi_release (t); + gcry_mpi_release (p3); + gcry_mpi_release (p2); + gcry_mpi_release (p1); +} + + /* Decode the EdDSA style encoded PK and set it into RESULT. CTX is the usual curve context. If R_ENCPK is not NULL, the encoded PK is stored at that address; this is a new copy to be released by the @@ -135,7 +193,6 @@ _gcry_ecc_eddsa_decodepoint (gcry_mpi_t pk, mpi_ec_t ctx, mpi_point_t result, gpg_err_code_t rc; unsigned char *rawmpi; unsigned int rawmpilen; - gcry_mpi_t yy, t, x, p1, p2, p3; int sign; if (mpi_is_opaque (pk)) @@ -153,7 +210,7 @@ _gcry_ecc_eddsa_decodepoint (gcry_mpi_t pk, mpi_ec_t ctx, mpi_point_t result, first byte be 0x04. */ if (rawmpilen > 1 && buf[0] == 0x04 && (rawmpilen%2)) { - gcry_mpi_t y; + gcry_mpi_t x, y; rc = gcry_mpi_scan (&x, GCRYMPI_FMT_STD, buf+1, (rawmpilen-1)/2, NULL); @@ -221,59 +278,9 @@ _gcry_ecc_eddsa_decodepoint (gcry_mpi_t pk, mpi_ec_t ctx, mpi_point_t result, else gcry_free (rawmpi); - /* Now recover X. */ - /* t = (y^2-1) ? 
((b*y^2+1)^{p-2} mod p) */ - x = mpi_new (0); - yy = mpi_new (0); - mpi_mul (yy, result->y, result->y); - t = mpi_copy (yy); - mpi_mul (t, t, ctx->b); - mpi_add_ui (t, t, 1); - p2 = mpi_copy (ctx->p); - mpi_sub_ui (p2, p2, 2); - mpi_powm (t, t, p2, ctx->p); - - mpi_sub_ui (yy, yy, 1); - mpi_mul (t, yy, t); - - /* x = t^{(p+3)/8} mod p */ - p3 = mpi_copy (ctx->p); - mpi_add_ui (p3, p3, 3); - mpi_fdiv_q (p3, p3, mpi_const (MPI_C_EIGHT)); - mpi_powm (x, t, p3, ctx->p); - - /* (x^2 - t) % p != 0 ? x = (x*(2^{(p-1)/4} mod p)) % p */ - mpi_mul (yy, x, x); - mpi_subm (yy, yy, t, ctx->p); - if (mpi_cmp_ui (yy, 0)) - { - p1 = mpi_copy (ctx->p); - mpi_sub_ui (p1, p1, 1); - mpi_fdiv_q (p1, p1, mpi_const (MPI_C_FOUR)); - mpi_powm (yy, mpi_const (MPI_C_TWO), p1, ctx->p); - mpi_mulm (x, x, yy, ctx->p); - } - else - p1 = NULL; - - /* is_odd(x) ? x = p-x */ - if (mpi_test_bit (x, 0)) - mpi_sub (x, ctx->p, x); - - /* lowbit(x) != highbit(input) ? x = p-x */ - if (mpi_test_bit (x, 0) != sign) - mpi_sub (x, ctx->p, x); - - mpi_set (result->x, x); + _gcry_ecc_eddsa_recover_x (result->x, result->y, sign, ctx); mpi_set_ui (result->z, 1); - gcry_mpi_release (x); - gcry_mpi_release (yy); - gcry_mpi_release (t); - gcry_mpi_release (p3); - gcry_mpi_release (p2); - gcry_mpi_release (p1); - return 0; } ----------------------------------------------------------------------- Summary of changes: cipher/ecc-common.h | 2 + cipher/ecc-eddsa.c | 150 ++++++++++++++++++++++++++++++++------------------- mpi/mpi-mul.c | 2 +- tests/t-ed25519.c | 30 ++++++++--- 4 files changed, 122 insertions(+), 62 deletions(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From cvs at cvs.gnupg.org Fri Oct 25 11:45:12 2013 From: cvs at cvs.gnupg.org (by Werner Koch) Date: Fri, 25 Oct 2013 11:45:12 +0200 Subject: [git] GCRYPT - branch, master, 
updated. libgcrypt-1.5.0-336-g6c6d481 Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via 6c6d4810927de7310ae7bac61b4ff5467d7cb485 (commit) from c630fd71b336eb9209e914d24dc1e26a34521882 (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit 6c6d4810927de7310ae7bac61b4ff5467d7cb485 Author: Werner Koch Date: Fri Oct 25 11:43:31 2013 +0200 tests: Add tests for mpi_cmp. * tests/mpitests.c (die): Modernize. (fail): New. (test_opaque, test_add, test_sub, test_mul): Use gcry_log_xx (main): Return error count. (test_cmp): New. Signed-off-by: Werner Koch diff --git a/tests/mpitests.c b/tests/mpitests.c index e1c51d1..c5c60d7 100644 --- a/tests/mpitests.c +++ b/tests/mpitests.c @@ -1,5 +1,6 @@ /* mpitests.c - basic mpi tests * Copyright (C) 2001, 2002, 2003, 2006 Free Software Foundation, Inc. + * Copyright (C) 2013 g10 Code GmbH * * This file is part of Libgcrypt. * @@ -14,9 +15,7 @@ * GNU Lesser General Public License for more details. * * You should have received a copy of the GNU Lesser General Public - * License along with this program; if not, write to the Free Software - * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, - * USA. + * License along with this program; if not, see . */ #ifdef HAVE_CONFIG_H @@ -33,23 +32,46 @@ # include #endif +#define PGM "mpitests" + static int verbose; static int debug; +static int error_count; static void die (const char *format, ...) 
{ + va_list arg_ptr ; + + fflush (stdout); + fprintf (stderr, "%s: ", PGM); + va_start (arg_ptr, format) ; + vfprintf (stderr, format, arg_ptr ); + va_end(arg_ptr); + if (*format && format[strlen(format)-1] != '\n') + putc ('\n', stderr); + exit (1); +} + +static void +fail (const char *format, ...) +{ va_list arg_ptr; + fflush (stdout); + fprintf (stderr, "%s: ", PGM); va_start (arg_ptr, format); vfprintf (stderr, format, arg_ptr); va_end (arg_ptr); - exit (1); + if (*format && format[strlen(format)-1] != '\n') + putc ('\n', stderr); + error_count++; + if (error_count >= 50) + die ("stopped after 50 errors."); } - /* Set up some test patterns */ /* 48 bytes with value 1: this results in 8 limbs for 64bit limbs, 16limb for 32 bit limbs */ @@ -188,17 +210,118 @@ test_opaque (void) if (strcmp (p, "This is a test buffer")) die ("gcry_mpi_get_opaque returned a changed buffer\n"); - if (verbose) - { - fprintf (stderr, "mpi: "); - gcry_mpi_dump (a); - putc ('\n', stderr); - } + if (debug) + gcry_log_debugmpi ("mpi", a); gcry_mpi_release (a); } +static void +test_cmp (void) +{ + gpg_error_t rc; + gcry_mpi_t zero, zero2; + gcry_mpi_t one; + gcry_mpi_t two; + gcry_mpi_t all_ones; + gcry_mpi_t opa1, opa2; + gcry_mpi_t opa1s, opa2s; + gcry_mpi_t opa0, opa02; + + zero = gcry_mpi_new (0); + zero2= gcry_mpi_set_ui (NULL, 0); + one = gcry_mpi_set_ui (NULL, 1); + two = gcry_mpi_set_ui (NULL, 2); + rc = gcry_mpi_scan (&all_ones, GCRYMPI_FMT_USG, ones, sizeof(ones), NULL); + if (rc) + die ("scanning number failed at line %d", __LINE__); + opa0 = gcry_mpi_set_opaque (NULL, gcry_xstrdup ("a"), 0); + opa02 = gcry_mpi_set_opaque (NULL, gcry_xstrdup ("b"), 0); + opa1 = gcry_mpi_set_opaque (NULL, gcry_xstrdup ("aaaaaaaaaaaaaaaa"), 16*8); + opa1s = gcry_mpi_set_opaque (NULL, gcry_xstrdup ("a"), 1*8); + opa2 = gcry_mpi_set_opaque (NULL, gcry_xstrdup ("bbbbbbbbbbbbbbbb"), 16*8); + opa2s = gcry_mpi_set_opaque (NULL, gcry_xstrdup ("b"), 1*8); + + + /* Single limb test with cmp_ui */ + if 
(gcry_mpi_cmp_ui (zero, 0)) + fail ("mpi_cmp_ui failed at line %d", __LINE__); + if (!(gcry_mpi_cmp_ui (zero, 1) < 0)) + fail ("mpi_cmp_ui failed at line %d", __LINE__); + if (!(gcry_mpi_cmp_ui (zero, (-1)) < 0)) + fail ("mpi_cmp_ui failed at line %d", __LINE__); + + if (gcry_mpi_cmp_ui (two, 2)) + fail ("mpi_cmp_ui failed at line %d", __LINE__); + if (!(gcry_mpi_cmp_ui (two, 3) < 0)) + fail ("mpi_cmp_ui failed at line %d", __LINE__); + if (!(gcry_mpi_cmp_ui (two, 1) > 0)) + fail ("mpi_cmp_ui failed at line %d", __LINE__); + + /* Multi limb tests with cmp_ui. */ + if (!(gcry_mpi_cmp_ui (all_ones, 0) > 0)) + fail ("mpi_cmp_ui failed at line %d", __LINE__); + if (!(gcry_mpi_cmp_ui (all_ones, (-1)) > 0)) + fail ("mpi_cmp_ui failed at line %d", __LINE__); + + /* Single limb test with cmp */ + if (gcry_mpi_cmp (zero, zero2)) + fail ("mpi_cmp failed at line %d", __LINE__); + if (!(gcry_mpi_cmp (zero, one) < 0)) + fail ("mpi_cmp failed at line %d", __LINE__); + if (!(gcry_mpi_cmp (one, zero) > 0)) + fail ("mpi_cmp failed at line %d", __LINE__); + + gcry_mpi_neg (one, one); + if (!(gcry_mpi_cmp (zero, one) > 0)) + fail ("mpi_cmp failed at line %d", __LINE__); + if (!(gcry_mpi_cmp (one, zero) < 0)) + fail ("mpi_cmp failed at line %d", __LINE__); + if (!(gcry_mpi_cmp (one, two) < 0)) + fail ("mpi_cmp failed at line %d", __LINE__); + gcry_mpi_neg (one, one); + + if (!(gcry_mpi_cmp (one, two) < 0)) + fail ("mpi_cmp failed at line %d", __LINE__); + if (!(gcry_mpi_cmp (two, one) > 0)) + fail ("mpi_cmp failed at line %d", __LINE__); + if (!(gcry_mpi_cmp (one, all_ones) < 0)) + fail ("mpi_cmp failed at line %d", __LINE__); + + /* Tests with opaque values. 
*/ + if (!(gcry_mpi_cmp (opa1, one) < 0)) + fail ("mpi_cmp failed at line %d", __LINE__); + if (!(gcry_mpi_cmp (one, opa1) > 0)) + fail ("mpi_cmp failed at line %d", __LINE__); + if (!(gcry_mpi_cmp (opa0, opa02) == 0)) + fail ("mpi_cmp failed at line %d", __LINE__); + if (!(gcry_mpi_cmp (opa1s, opa1) < 0)) + fail ("mpi_cmp failed at line %d", __LINE__); + if (!(gcry_mpi_cmp (opa2, opa1s) > 0)) + fail ("mpi_cmp failed at line %d", __LINE__); + if (!(gcry_mpi_cmp (opa1, opa2) < 0)) + fail ("mpi_cmp failed at line %d", __LINE__); + if (!(gcry_mpi_cmp (opa2, opa1) > 0)) + fail ("mpi_cmp failed at line %d", __LINE__); + if (!(gcry_mpi_cmp (opa1, opa1) == 0)) + fail ("mpi_cmp failed at line %d", __LINE__); + + + gcry_mpi_release(opa2s); + gcry_mpi_release(opa2); + gcry_mpi_release(opa1s); + gcry_mpi_release(opa1); + gcry_mpi_release(opa02); + gcry_mpi_release(opa0); + gcry_mpi_release(all_ones); + gcry_mpi_release(two); + gcry_mpi_release(one); + gcry_mpi_release(zero2); + gcry_mpi_release(zero); +} + + static int test_add (void) { @@ -215,14 +338,14 @@ test_add (void) gcry_mpi_add(result, one, two); gcry_mpi_aprint(GCRYMPI_FMT_HEX, &pc, NULL, result); - if (verbose) - printf("Result of one plus two:\n%s\n", pc); + if (debug) + gcry_log_debug ("Result of one plus two:\n%s\n", pc); gcry_free(pc); gcry_mpi_add(result, ff, one); gcry_mpi_aprint(GCRYMPI_FMT_HEX, &pc, NULL, result); - if (verbose) - printf("Result of ff plus one:\n%s\n", pc); + if (debug) + gcry_log_debug ("Result of ff plus one:\n%s\n", pc); gcry_free(pc); gcry_mpi_release(one); @@ -247,8 +370,8 @@ test_sub (void) gcry_mpi_sub(result, two, one); gcry_mpi_aprint(GCRYMPI_FMT_HEX, &pc, NULL, result); - if (verbose) - printf("Result of two minus one:\n%s\n", pc); + if (debug) + gcry_log_debug ("Result of two minus one:\n%s\n", pc); gcry_free(pc); gcry_mpi_release(one); @@ -272,8 +395,8 @@ test_mul (void) gcry_mpi_mul(result, two, three); gcry_mpi_aprint(GCRYMPI_FMT_HEX, &pc, NULL, result); - if (verbose) - 
printf("Result of two mul three:\n%s\n", pc); + if (debug) + gcry_log_debug ("Result of two mul three:\n%s\n", pc); gcry_free(pc); gcry_mpi_release(two); @@ -408,10 +531,11 @@ main (int argc, char* argv[]) test_const_and_immutable (); test_opaque (); + test_cmp (); test_add (); test_sub (); test_mul (); test_powm (); - return 0; + return !!error_count; } ----------------------------------------------------------------------- Summary of changes: tests/mpitests.c | 164 +++++++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 144 insertions(+), 20 deletions(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From cvs at cvs.gnupg.org Sat Oct 26 13:56:41 2013 From: cvs at cvs.gnupg.org (by Jussi Kivilinna) Date: Sat, 26 Oct 2013 13:56:41 +0200 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-338-g51f1bea Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via 51f1beab3d1e879942a95f58b08de7dbcce75dce (commit) via d9431725952e40f201c7eda000d3c8511ebd5b33 (commit) from 6c6d4810927de7310ae7bac61b4ff5467d7cb485 (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit 51f1beab3d1e879942a95f58b08de7dbcce75dce Author: Jussi Kivilinna Date: Sat Oct 26 14:51:44 2013 +0300 Deduplicate code for ECB encryption and decryption * cipher/cipher.c (do_ecb_crypt): New, based on old 'do_ecb_encrypt'. (do_ecb_encrypt): Use 'do_ecb_crypt', pass encryption function. (do_ecb_decrypt): Use 'do_ecb_crypt', pass decryption function. 
-- Signed-off-by: Jussi Kivilinna diff --git a/cipher/cipher.c b/cipher/cipher.c index df6d202..73a97b1 100644 --- a/cipher/cipher.c +++ b/cipher/cipher.c @@ -627,11 +627,11 @@ cipher_reset (gcry_cipher_hd_t c) static gcry_err_code_t -do_ecb_encrypt (gcry_cipher_hd_t c, - unsigned char *outbuf, unsigned int outbuflen, - const unsigned char *inbuf, unsigned int inbuflen) +do_ecb_crypt (gcry_cipher_hd_t c, + unsigned char *outbuf, unsigned int outbuflen, + const unsigned char *inbuf, unsigned int inbuflen, + gcry_cipher_encrypt_t crypt_fn) { - gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; unsigned int blocksize = c->spec->blocksize; unsigned int n, nblocks; unsigned int burn, nburn; @@ -646,7 +646,7 @@ do_ecb_encrypt (gcry_cipher_hd_t c, for (n=0; n < nblocks; n++ ) { - nburn = enc_fn (&c->context.c, outbuf, (byte*)/*arggg*/inbuf); + nburn = crypt_fn (&c->context.c, outbuf, inbuf); burn = nburn > burn ? nburn : burn; inbuf += blocksize; outbuf += blocksize; @@ -659,35 +659,19 @@ do_ecb_encrypt (gcry_cipher_hd_t c, } static gcry_err_code_t -do_ecb_decrypt (gcry_cipher_hd_t c, +do_ecb_encrypt (gcry_cipher_hd_t c, unsigned char *outbuf, unsigned int outbuflen, const unsigned char *inbuf, unsigned int inbuflen) { - gcry_cipher_decrypt_t dec_fn = c->spec->decrypt; - unsigned int blocksize = c->spec->blocksize; - unsigned int n, nblocks; - unsigned int burn, nburn; - - if (outbuflen < inbuflen) - return GPG_ERR_BUFFER_TOO_SHORT; - if ((inbuflen % blocksize)) - return GPG_ERR_INV_LENGTH; - - nblocks = inbuflen / blocksize; - burn = 0; - - for (n=0; n < nblocks; n++ ) - { - nburn = dec_fn (&c->context.c, outbuf, (byte*)/*arggg*/inbuf); - burn = nburn > burn ? 
nburn : burn; - inbuf += blocksize; - outbuf += blocksize; - } - - if (burn > 0) - _gcry_burn_stack (burn + 4 * sizeof(void *)); + return do_ecb_crypt (c, outbuf, outbuflen, inbuf, inbuflen, c->spec->encrypt); +} - return 0; +static gcry_err_code_t +do_ecb_decrypt (gcry_cipher_hd_t c, + unsigned char *outbuf, unsigned int outbuflen, + const unsigned char *inbuf, unsigned int inbuflen) +{ + return do_ecb_crypt (c, outbuf, outbuflen, inbuf, inbuflen, c->spec->decrypt); } commit d9431725952e40f201c7eda000d3c8511ebd5b33 Author: Dmitry Eremin-Solenikov Date: Sat Oct 26 14:51:44 2013 +0300 Drop _gcry_cipher_ofb_decrypt as it duplicates _gcry_cipher_ofb_encrypt * cipher/cipher.c (cipher_decrypt): Use _gcry_cipher_ofb_encrypt for OFB decryption. * cipher/cipher-internal.h: Remove _gcry_cipher_ofb_decrypt declaration. * cipher/cipher-ofb.c (_gcry_cipher_ofb_decrypt): Remove. (_gcry_cipher_ofb_encrypt): remove copying of IV to lastiv, it's unused there. Signed-off-by: Dmitry Eremin-Solenikov diff --git a/cipher/cipher-internal.h b/cipher/cipher-internal.h index 981caa8..f528c84 100644 --- a/cipher/cipher-internal.h +++ b/cipher/cipher-internal.h @@ -173,10 +173,6 @@ gcry_err_code_t _gcry_cipher_ofb_encrypt /* */ (gcry_cipher_hd_t c, unsigned char *outbuf, unsigned int outbuflen, const unsigned char *inbuf, unsigned int inbuflen); -gcry_err_code_t _gcry_cipher_ofb_decrypt -/* */ (gcry_cipher_hd_t c, - unsigned char *outbuf, unsigned int outbuflen, - const unsigned char *inbuf, unsigned int inbuflen); /*-- cipher-ctr.c --*/ gcry_err_code_t _gcry_cipher_ctr_encrypt diff --git a/cipher/cipher-ofb.c b/cipher/cipher-ofb.c index 333a748..c6d84dd 100644 --- a/cipher/cipher-ofb.c +++ b/cipher/cipher-ofb.c @@ -70,7 +70,6 @@ _gcry_cipher_ofb_encrypt (gcry_cipher_hd_t c, while ( inbuflen >= blocksize ) { /* Encrypt the IV (and save the current one). */ - buf_cpy( c->lastiv, c->u_iv.iv, blocksize ); nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? 
nburn : burn; buf_xor(outbuf, c->u_iv.iv, inbuf, blocksize); @@ -80,74 +79,6 @@ _gcry_cipher_ofb_encrypt (gcry_cipher_hd_t c, } if ( inbuflen ) { /* process the remaining bytes */ - buf_cpy( c->lastiv, c->u_iv.iv, blocksize ); - nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); - burn = nburn > burn ? nburn : burn; - c->unused = blocksize; - c->unused -= inbuflen; - buf_xor(outbuf, c->u_iv.iv, inbuf, inbuflen); - outbuf += inbuflen; - inbuf += inbuflen; - inbuflen = 0; - } - - if (burn > 0) - _gcry_burn_stack (burn + 4 * sizeof(void *)); - - return 0; -} - - -gcry_err_code_t -_gcry_cipher_ofb_decrypt (gcry_cipher_hd_t c, - unsigned char *outbuf, unsigned int outbuflen, - const unsigned char *inbuf, unsigned int inbuflen) -{ - unsigned char *ivp; - gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; - size_t blocksize = c->spec->blocksize; - unsigned int burn, nburn; - - if (outbuflen < inbuflen) - return GPG_ERR_BUFFER_TOO_SHORT; - - if( inbuflen <= c->unused ) - { - /* Short enough to be encoded by the remaining XOR mask. */ - ivp = c->u_iv.iv + blocksize - c->unused; - buf_xor(outbuf, ivp, inbuf, inbuflen); - c->unused -= inbuflen; - return 0; - } - - burn = 0; - - if ( c->unused ) - { - inbuflen -= c->unused; - ivp = c->u_iv.iv + blocksize - c->unused; - buf_xor(outbuf, ivp, inbuf, c->unused); - outbuf += c->unused; - inbuf += c->unused; - c->unused = 0; - } - - /* Now we can process complete blocks. */ - while ( inbuflen >= blocksize ) - { - /* Encrypt the IV (and save the current one). */ - buf_cpy( c->lastiv, c->u_iv.iv, blocksize ); - nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); - burn = nburn > burn ? nburn : burn; - buf_xor(outbuf, c->u_iv.iv, inbuf, blocksize); - outbuf += blocksize; - inbuf += blocksize; - inbuflen -= blocksize; - } - if ( inbuflen ) - { /* Process the remaining bytes. */ - /* Encrypt the IV (and save the current one). 
*/ - buf_cpy( c->lastiv, c->u_iv.iv, blocksize ); nburn = enc_fn ( &c->context.c, c->u_iv.iv, c->u_iv.iv ); burn = nburn > burn ? nburn : burn; c->unused = blocksize; diff --git a/cipher/cipher.c b/cipher/cipher.c index c0d1d0b..df6d202 100644 --- a/cipher/cipher.c +++ b/cipher/cipher.c @@ -814,7 +814,7 @@ cipher_decrypt (gcry_cipher_hd_t c, byte *outbuf, unsigned int outbuflen, break; case GCRY_CIPHER_MODE_OFB: - rc = _gcry_cipher_ofb_decrypt (c, outbuf, outbuflen, inbuf, inbuflen); + rc = _gcry_cipher_ofb_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); break; case GCRY_CIPHER_MODE_CTR: ----------------------------------------------------------------------- Summary of changes: cipher/cipher-internal.h | 4 --- cipher/cipher-ofb.c | 69 ---------------------------------------------- cipher/cipher.c | 46 ++++++++++--------------------- 3 files changed, 15 insertions(+), 104 deletions(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From jussi.kivilinna at iki.fi Sat Oct 26 15:03:53 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Sat, 26 Oct 2013 16:03:53 +0300 Subject: [PATCH 1/2] Add AMD64 assembly implementation of Salsa20 Message-ID: <20131026130353.31770.70585.stgit@localhost6.localdomain6> * cipher/Makefile.am: Add 'salsa20-amd64.S'. * cipher/salsa20-amd64.S: New. * cipher/salsa20.c (USE_AMD64): New macro. [USE_AMD64] (_gcry_salsa20_amd64_keysetup, _gcry_salsa20_amd64_ivsetup) (_gcry_salsa20_amd64_encrypt_blocks): New prototypes. [USE_AMD64] (salsa20_keysetup, salsa20_ivsetup, salsa20_core): New. [!USE_AMD64] (salsa20_core): Change 'src' to non-constant, update block counter in 'salsa20_core' and return burn stack depth. [!USE_AMD64] (salsa20_keysetup, salsa20_ivsetup): New. (salsa20_do_setkey): Move generic key setup to 'salsa20_keysetup'. 
(salsa20_setkey): Fix burn stack depth.
(salsa20_setiv): Move generic IV setup to 'salsa20_ivsetup'.
(salsa20_do_encrypt_stream) [USE_AMD64]: Process large buffers in AMD64
implementation.
(salsa20_do_encrypt_stream): Move stack burning to this function...
(salsa20_encrypt_stream, salsa20r12_encrypt_stream): ...from these functions.
* configure.ac [x86-64]: Add 'salsa20-amd64.lo'.
--
The patch adds a fast AMD64 assembly implementation of Salsa20. This
implementation is based on public domain code by D. J. Bernstein, available
at http://cr.yp.to/snuffle.html (amd64-xmm6). The implementation gains extra
speed by processing four blocks in parallel with the help of SSE2
instructions.

Benchmark results on Intel Core i5-4570 (3.2 GHz):

Before:

 SALSA20        |  nanosecs/byte   mebibytes/sec   cycles/byte
     STREAM enc |      3.88 ns/B     246.0 MiB/s     12.41 c/B
     STREAM dec |      3.88 ns/B     246.0 MiB/s     12.41 c/B
                =
 SALSA20R12     |  nanosecs/byte   mebibytes/sec   cycles/byte
     STREAM enc |      2.46 ns/B     387.9 MiB/s      7.87 c/B
     STREAM dec |      2.46 ns/B     387.7 MiB/s      7.87 c/B

After:

 SALSA20        |  nanosecs/byte   mebibytes/sec   cycles/byte
     STREAM enc |     0.985 ns/B     967.8 MiB/s      3.15 c/B
     STREAM dec |     0.987 ns/B     966.5 MiB/s      3.16 c/B
                =
 SALSA20R12     |  nanosecs/byte   mebibytes/sec   cycles/byte
     STREAM enc |     0.636 ns/B    1500.5 MiB/s      2.03 c/B
     STREAM dec |     0.636 ns/B    1499.2 MiB/s      2.04 c/B

Signed-off-by: Jussi Kivilinna
---
 cipher/Makefile.am     |    2
 cipher/salsa20-amd64.S |  924 ++++++++++++++++++++++++++++++++++++++++++++++++
 cipher/salsa20.c       |  197 ++++----
 configure.ac           |    7
 4 files changed, 1056 insertions(+), 74 deletions(-)
 create mode 100644 cipher/salsa20-amd64.S

diff --git a/cipher/Makefile.am b/cipher/Makefile.am
index d7db933..e786713 100644
--- a/cipher/Makefile.am
+++ b/cipher/Makefile.am
@@ -71,7 +71,7 @@ md5.c \
 rijndael.c rijndael-tables.h rijndael-amd64.S rijndael-arm.S \
 rmd160.c \
 rsa.c \
-salsa20.c \
+salsa20.c salsa20-amd64.S \
 scrypt.c \
 seed.c \
 serpent.c serpent-sse2-amd64.S serpent-avx2-amd64.S \
diff --git a/cipher/salsa20-amd64.S
b/cipher/salsa20-amd64.S new file mode 100644 index 0000000..691df58 --- /dev/null +++ b/cipher/salsa20-amd64.S @@ -0,0 +1,924 @@ +/* salsa20-amd64.S - AMD64 implementation of Salsa20 + * + * Copyright © 2013 Jussi Kivilinna + * + * This file is part of Libgcrypt. + * + * Libgcrypt is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as + * published by the Free Software Foundation; either version 2.1 of + * the License, or (at your option) any later version. + * + * Libgcrypt is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this program; if not, see . + */ + +/* + * Based on public domain implementation by D. J. Bernstein at + * http://cr.yp.to/snuffle.html + */ + +#ifdef __x86_64 +#include +#if defined(HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS) && defined(USE_SALSA20) + +.text + +.align 8 +.globl _gcry_salsa20_amd64_keysetup +.type _gcry_salsa20_amd64_keysetup,@function; +_gcry_salsa20_amd64_keysetup: + movl 0(%rsi),%r8d + movl 4(%rsi),%r9d + movl 8(%rsi),%eax + movl 12(%rsi),%r10d + movl %r8d,20(%rdi) + movl %r9d,40(%rdi) + movl %eax,60(%rdi) + movl %r10d,48(%rdi) + cmp $256,%rdx + jb ._kbits128 +._kbits256: + movl 16(%rsi),%edx + movl 20(%rsi),%ecx + movl 24(%rsi),%r8d + movl 28(%rsi),%esi + movl %edx,28(%rdi) + movl %ecx,16(%rdi) + movl %r8d,36(%rdi) + movl %esi,56(%rdi) + mov $1634760805,%rsi + mov $857760878,%rdx + mov $2036477234,%rcx + mov $1797285236,%r8 + movl %esi,0(%rdi) + movl %edx,4(%rdi) + movl %ecx,8(%rdi) + movl %r8d,12(%rdi) + jmp ._keysetupdone +._kbits128: + movl 0(%rsi),%edx + movl 4(%rsi),%ecx + movl 8(%rsi),%r8d + movl 12(%rsi),%esi + movl %edx,28(%rdi) + movl %ecx,16(%rdi) + movl %r8d,36(%rdi) +
movl %esi,56(%rdi) + mov $1634760805,%rsi + mov $824206446,%rdx + mov $2036477238,%rcx + mov $1797285236,%r8 + movl %esi,0(%rdi) + movl %edx,4(%rdi) + movl %ecx,8(%rdi) + movl %r8d,12(%rdi) +._keysetupdone: + ret + +.align 8 +.globl _gcry_salsa20_amd64_ivsetup +.type _gcry_salsa20_amd64_ivsetup,@function; +_gcry_salsa20_amd64_ivsetup: + movl 0(%rsi),%r8d + movl 4(%rsi),%esi + mov $0,%r9 + mov $0,%rax + movl %r8d,24(%rdi) + movl %esi,44(%rdi) + movl %r9d,32(%rdi) + movl %eax,52(%rdi) + ret + +.align 8 +.globl _gcry_salsa20_amd64_encrypt_blocks +.type _gcry_salsa20_amd64_encrypt_blocks,@function; +_gcry_salsa20_amd64_encrypt_blocks: + /* + * Modifications to original implementation: + * - Number of rounds passing in register %r8 (for Salsa20/12). + * - Length is input as number of blocks, so don't handle tail bytes + * (this is done in salsa20.c). + */ + push %rbx + shlq $6, %rcx /* blocks to bytes */ + mov %r8, %rbx + mov %rsp,%r11 + and $31,%r11 + add $384,%r11 + sub %r11,%rsp + mov %rdi,%r8 + mov %rsi,%rsi + mov %rdx,%rdi + mov %rcx,%rdx + cmp $0,%rdx + jbe ._done +._start: + cmp $256,%rdx + jb ._bytes_are_64_128_or_192 + movdqa 0(%r8),%xmm0 + pshufd $0x55,%xmm0,%xmm1 + pshufd $0xaa,%xmm0,%xmm2 + pshufd $0xff,%xmm0,%xmm3 + pshufd $0x00,%xmm0,%xmm0 + movdqa %xmm1,0(%rsp) + movdqa %xmm2,16(%rsp) + movdqa %xmm3,32(%rsp) + movdqa %xmm0,48(%rsp) + movdqa 16(%r8),%xmm0 + pshufd $0xaa,%xmm0,%xmm1 + pshufd $0xff,%xmm0,%xmm2 + pshufd $0x00,%xmm0,%xmm3 + pshufd $0x55,%xmm0,%xmm0 + movdqa %xmm1,64(%rsp) + movdqa %xmm2,80(%rsp) + movdqa %xmm3,96(%rsp) + movdqa %xmm0,112(%rsp) + movdqa 32(%r8),%xmm0 + pshufd $0xff,%xmm0,%xmm1 + pshufd $0x55,%xmm0,%xmm2 + pshufd $0xaa,%xmm0,%xmm0 + movdqa %xmm1,128(%rsp) + movdqa %xmm2,144(%rsp) + movdqa %xmm0,160(%rsp) + movdqa 48(%r8),%xmm0 + pshufd $0x00,%xmm0,%xmm1 + pshufd $0xaa,%xmm0,%xmm2 + pshufd $0xff,%xmm0,%xmm0 + movdqa %xmm1,176(%rsp) + movdqa %xmm2,192(%rsp) + movdqa %xmm0,208(%rsp) +._bytesatleast256: + movl 32(%r8),%ecx +
movl 52(%r8),%r9d + movl %ecx,224(%rsp) + movl %r9d,240(%rsp) + add $1,%ecx + adc $0,%r9d + movl %ecx,4+224(%rsp) + movl %r9d,4+240(%rsp) + add $1,%ecx + adc $0,%r9d + movl %ecx,8+224(%rsp) + movl %r9d,8+240(%rsp) + add $1,%ecx + adc $0,%r9d + movl %ecx,12+224(%rsp) + movl %r9d,12+240(%rsp) + add $1,%ecx + adc $0,%r9d + movl %ecx,32(%r8) + movl %r9d,52(%r8) + movq %rdx,288(%rsp) + mov %rbx,%rdx + movdqa 0(%rsp),%xmm0 + movdqa 16(%rsp),%xmm1 + movdqa 32(%rsp),%xmm2 + movdqa 192(%rsp),%xmm3 + movdqa 208(%rsp),%xmm4 + movdqa 64(%rsp),%xmm5 + movdqa 80(%rsp),%xmm6 + movdqa 112(%rsp),%xmm7 + movdqa 128(%rsp),%xmm8 + movdqa 144(%rsp),%xmm9 + movdqa 160(%rsp),%xmm10 + movdqa 240(%rsp),%xmm11 + movdqa 48(%rsp),%xmm12 + movdqa 96(%rsp),%xmm13 + movdqa 176(%rsp),%xmm14 + movdqa 224(%rsp),%xmm15 +._mainloop1: + movdqa %xmm1,256(%rsp) + movdqa %xmm2,272(%rsp) + movdqa %xmm13,%xmm1 + paddd %xmm12,%xmm1 + movdqa %xmm1,%xmm2 + pslld $7,%xmm1 + pxor %xmm1,%xmm14 + psrld $25,%xmm2 + pxor %xmm2,%xmm14 + movdqa %xmm7,%xmm1 + paddd %xmm0,%xmm1 + movdqa %xmm1,%xmm2 + pslld $7,%xmm1 + pxor %xmm1,%xmm11 + psrld $25,%xmm2 + pxor %xmm2,%xmm11 + movdqa %xmm12,%xmm1 + paddd %xmm14,%xmm1 + movdqa %xmm1,%xmm2 + pslld $9,%xmm1 + pxor %xmm1,%xmm15 + psrld $23,%xmm2 + pxor %xmm2,%xmm15 + movdqa %xmm0,%xmm1 + paddd %xmm11,%xmm1 + movdqa %xmm1,%xmm2 + pslld $9,%xmm1 + pxor %xmm1,%xmm9 + psrld $23,%xmm2 + pxor %xmm2,%xmm9 + movdqa %xmm14,%xmm1 + paddd %xmm15,%xmm1 + movdqa %xmm1,%xmm2 + pslld $13,%xmm1 + pxor %xmm1,%xmm13 + psrld $19,%xmm2 + pxor %xmm2,%xmm13 + movdqa %xmm11,%xmm1 + paddd %xmm9,%xmm1 + movdqa %xmm1,%xmm2 + pslld $13,%xmm1 + pxor %xmm1,%xmm7 + psrld $19,%xmm2 + pxor %xmm2,%xmm7 + movdqa %xmm15,%xmm1 + paddd %xmm13,%xmm1 + movdqa %xmm1,%xmm2 + pslld $18,%xmm1 + pxor %xmm1,%xmm12 + psrld $14,%xmm2 + pxor %xmm2,%xmm12 + movdqa 256(%rsp),%xmm1 + movdqa %xmm12,256(%rsp) + movdqa %xmm9,%xmm2 + paddd %xmm7,%xmm2 + movdqa %xmm2,%xmm12 + pslld $18,%xmm2 + pxor %xmm2,%xmm0 + psrld $14,%xmm12 + 
pxor %xmm12,%xmm0 + movdqa %xmm5,%xmm2 + paddd %xmm1,%xmm2 + movdqa %xmm2,%xmm12 + pslld $7,%xmm2 + pxor %xmm2,%xmm3 + psrld $25,%xmm12 + pxor %xmm12,%xmm3 + movdqa 272(%rsp),%xmm2 + movdqa %xmm0,272(%rsp) + movdqa %xmm6,%xmm0 + paddd %xmm2,%xmm0 + movdqa %xmm0,%xmm12 + pslld $7,%xmm0 + pxor %xmm0,%xmm4 + psrld $25,%xmm12 + pxor %xmm12,%xmm4 + movdqa %xmm1,%xmm0 + paddd %xmm3,%xmm0 + movdqa %xmm0,%xmm12 + pslld $9,%xmm0 + pxor %xmm0,%xmm10 + psrld $23,%xmm12 + pxor %xmm12,%xmm10 + movdqa %xmm2,%xmm0 + paddd %xmm4,%xmm0 + movdqa %xmm0,%xmm12 + pslld $9,%xmm0 + pxor %xmm0,%xmm8 + psrld $23,%xmm12 + pxor %xmm12,%xmm8 + movdqa %xmm3,%xmm0 + paddd %xmm10,%xmm0 + movdqa %xmm0,%xmm12 + pslld $13,%xmm0 + pxor %xmm0,%xmm5 + psrld $19,%xmm12 + pxor %xmm12,%xmm5 + movdqa %xmm4,%xmm0 + paddd %xmm8,%xmm0 + movdqa %xmm0,%xmm12 + pslld $13,%xmm0 + pxor %xmm0,%xmm6 + psrld $19,%xmm12 + pxor %xmm12,%xmm6 + movdqa %xmm10,%xmm0 + paddd %xmm5,%xmm0 + movdqa %xmm0,%xmm12 + pslld $18,%xmm0 + pxor %xmm0,%xmm1 + psrld $14,%xmm12 + pxor %xmm12,%xmm1 + movdqa 256(%rsp),%xmm0 + movdqa %xmm1,256(%rsp) + movdqa %xmm4,%xmm1 + paddd %xmm0,%xmm1 + movdqa %xmm1,%xmm12 + pslld $7,%xmm1 + pxor %xmm1,%xmm7 + psrld $25,%xmm12 + pxor %xmm12,%xmm7 + movdqa %xmm8,%xmm1 + paddd %xmm6,%xmm1 + movdqa %xmm1,%xmm12 + pslld $18,%xmm1 + pxor %xmm1,%xmm2 + psrld $14,%xmm12 + pxor %xmm12,%xmm2 + movdqa 272(%rsp),%xmm12 + movdqa %xmm2,272(%rsp) + movdqa %xmm14,%xmm1 + paddd %xmm12,%xmm1 + movdqa %xmm1,%xmm2 + pslld $7,%xmm1 + pxor %xmm1,%xmm5 + psrld $25,%xmm2 + pxor %xmm2,%xmm5 + movdqa %xmm0,%xmm1 + paddd %xmm7,%xmm1 + movdqa %xmm1,%xmm2 + pslld $9,%xmm1 + pxor %xmm1,%xmm10 + psrld $23,%xmm2 + pxor %xmm2,%xmm10 + movdqa %xmm12,%xmm1 + paddd %xmm5,%xmm1 + movdqa %xmm1,%xmm2 + pslld $9,%xmm1 + pxor %xmm1,%xmm8 + psrld $23,%xmm2 + pxor %xmm2,%xmm8 + movdqa %xmm7,%xmm1 + paddd %xmm10,%xmm1 + movdqa %xmm1,%xmm2 + pslld $13,%xmm1 + pxor %xmm1,%xmm4 + psrld $19,%xmm2 + pxor %xmm2,%xmm4 + movdqa %xmm5,%xmm1 + paddd 
%xmm8,%xmm1 + movdqa %xmm1,%xmm2 + pslld $13,%xmm1 + pxor %xmm1,%xmm14 + psrld $19,%xmm2 + pxor %xmm2,%xmm14 + movdqa %xmm10,%xmm1 + paddd %xmm4,%xmm1 + movdqa %xmm1,%xmm2 + pslld $18,%xmm1 + pxor %xmm1,%xmm0 + psrld $14,%xmm2 + pxor %xmm2,%xmm0 + movdqa 256(%rsp),%xmm1 + movdqa %xmm0,256(%rsp) + movdqa %xmm8,%xmm0 + paddd %xmm14,%xmm0 + movdqa %xmm0,%xmm2 + pslld $18,%xmm0 + pxor %xmm0,%xmm12 + psrld $14,%xmm2 + pxor %xmm2,%xmm12 + movdqa %xmm11,%xmm0 + paddd %xmm1,%xmm0 + movdqa %xmm0,%xmm2 + pslld $7,%xmm0 + pxor %xmm0,%xmm6 + psrld $25,%xmm2 + pxor %xmm2,%xmm6 + movdqa 272(%rsp),%xmm2 + movdqa %xmm12,272(%rsp) + movdqa %xmm3,%xmm0 + paddd %xmm2,%xmm0 + movdqa %xmm0,%xmm12 + pslld $7,%xmm0 + pxor %xmm0,%xmm13 + psrld $25,%xmm12 + pxor %xmm12,%xmm13 + movdqa %xmm1,%xmm0 + paddd %xmm6,%xmm0 + movdqa %xmm0,%xmm12 + pslld $9,%xmm0 + pxor %xmm0,%xmm15 + psrld $23,%xmm12 + pxor %xmm12,%xmm15 + movdqa %xmm2,%xmm0 + paddd %xmm13,%xmm0 + movdqa %xmm0,%xmm12 + pslld $9,%xmm0 + pxor %xmm0,%xmm9 + psrld $23,%xmm12 + pxor %xmm12,%xmm9 + movdqa %xmm6,%xmm0 + paddd %xmm15,%xmm0 + movdqa %xmm0,%xmm12 + pslld $13,%xmm0 + pxor %xmm0,%xmm11 + psrld $19,%xmm12 + pxor %xmm12,%xmm11 + movdqa %xmm13,%xmm0 + paddd %xmm9,%xmm0 + movdqa %xmm0,%xmm12 + pslld $13,%xmm0 + pxor %xmm0,%xmm3 + psrld $19,%xmm12 + pxor %xmm12,%xmm3 + movdqa %xmm15,%xmm0 + paddd %xmm11,%xmm0 + movdqa %xmm0,%xmm12 + pslld $18,%xmm0 + pxor %xmm0,%xmm1 + psrld $14,%xmm12 + pxor %xmm12,%xmm1 + movdqa %xmm9,%xmm0 + paddd %xmm3,%xmm0 + movdqa %xmm0,%xmm12 + pslld $18,%xmm0 + pxor %xmm0,%xmm2 + psrld $14,%xmm12 + pxor %xmm12,%xmm2 + movdqa 256(%rsp),%xmm12 + movdqa 272(%rsp),%xmm0 + sub $2,%rdx + ja ._mainloop1 + paddd 48(%rsp),%xmm12 + paddd 112(%rsp),%xmm7 + paddd 160(%rsp),%xmm10 + paddd 208(%rsp),%xmm4 + movd %xmm12,%rdx + movd %xmm7,%rcx + movd %xmm10,%r9 + movd %xmm4,%rax + pshufd $0x39,%xmm12,%xmm12 + pshufd $0x39,%xmm7,%xmm7 + pshufd $0x39,%xmm10,%xmm10 + pshufd $0x39,%xmm4,%xmm4 + xorl 0(%rsi),%edx + xorl 
4(%rsi),%ecx + xorl 8(%rsi),%r9d + xorl 12(%rsi),%eax + movl %edx,0(%rdi) + movl %ecx,4(%rdi) + movl %r9d,8(%rdi) + movl %eax,12(%rdi) + movd %xmm12,%rdx + movd %xmm7,%rcx + movd %xmm10,%r9 + movd %xmm4,%rax + pshufd $0x39,%xmm12,%xmm12 + pshufd $0x39,%xmm7,%xmm7 + pshufd $0x39,%xmm10,%xmm10 + pshufd $0x39,%xmm4,%xmm4 + xorl 64(%rsi),%edx + xorl 68(%rsi),%ecx + xorl 72(%rsi),%r9d + xorl 76(%rsi),%eax + movl %edx,64(%rdi) + movl %ecx,68(%rdi) + movl %r9d,72(%rdi) + movl %eax,76(%rdi) + movd %xmm12,%rdx + movd %xmm7,%rcx + movd %xmm10,%r9 + movd %xmm4,%rax + pshufd $0x39,%xmm12,%xmm12 + pshufd $0x39,%xmm7,%xmm7 + pshufd $0x39,%xmm10,%xmm10 + pshufd $0x39,%xmm4,%xmm4 + xorl 128(%rsi),%edx + xorl 132(%rsi),%ecx + xorl 136(%rsi),%r9d + xorl 140(%rsi),%eax + movl %edx,128(%rdi) + movl %ecx,132(%rdi) + movl %r9d,136(%rdi) + movl %eax,140(%rdi) + movd %xmm12,%rdx + movd %xmm7,%rcx + movd %xmm10,%r9 + movd %xmm4,%rax + xorl 192(%rsi),%edx + xorl 196(%rsi),%ecx + xorl 200(%rsi),%r9d + xorl 204(%rsi),%eax + movl %edx,192(%rdi) + movl %ecx,196(%rdi) + movl %r9d,200(%rdi) + movl %eax,204(%rdi) + paddd 176(%rsp),%xmm14 + paddd 0(%rsp),%xmm0 + paddd 64(%rsp),%xmm5 + paddd 128(%rsp),%xmm8 + movd %xmm14,%rdx + movd %xmm0,%rcx + movd %xmm5,%r9 + movd %xmm8,%rax + pshufd $0x39,%xmm14,%xmm14 + pshufd $0x39,%xmm0,%xmm0 + pshufd $0x39,%xmm5,%xmm5 + pshufd $0x39,%xmm8,%xmm8 + xorl 16(%rsi),%edx + xorl 20(%rsi),%ecx + xorl 24(%rsi),%r9d + xorl 28(%rsi),%eax + movl %edx,16(%rdi) + movl %ecx,20(%rdi) + movl %r9d,24(%rdi) + movl %eax,28(%rdi) + movd %xmm14,%rdx + movd %xmm0,%rcx + movd %xmm5,%r9 + movd %xmm8,%rax + pshufd $0x39,%xmm14,%xmm14 + pshufd $0x39,%xmm0,%xmm0 + pshufd $0x39,%xmm5,%xmm5 + pshufd $0x39,%xmm8,%xmm8 + xorl 80(%rsi),%edx + xorl 84(%rsi),%ecx + xorl 88(%rsi),%r9d + xorl 92(%rsi),%eax + movl %edx,80(%rdi) + movl %ecx,84(%rdi) + movl %r9d,88(%rdi) + movl %eax,92(%rdi) + movd %xmm14,%rdx + movd %xmm0,%rcx + movd %xmm5,%r9 + movd %xmm8,%rax + pshufd $0x39,%xmm14,%xmm14 + 
pshufd $0x39,%xmm0,%xmm0 + pshufd $0x39,%xmm5,%xmm5 + pshufd $0x39,%xmm8,%xmm8 + xorl 144(%rsi),%edx + xorl 148(%rsi),%ecx + xorl 152(%rsi),%r9d + xorl 156(%rsi),%eax + movl %edx,144(%rdi) + movl %ecx,148(%rdi) + movl %r9d,152(%rdi) + movl %eax,156(%rdi) + movd %xmm14,%rdx + movd %xmm0,%rcx + movd %xmm5,%r9 + movd %xmm8,%rax + xorl 208(%rsi),%edx + xorl 212(%rsi),%ecx + xorl 216(%rsi),%r9d + xorl 220(%rsi),%eax + movl %edx,208(%rdi) + movl %ecx,212(%rdi) + movl %r9d,216(%rdi) + movl %eax,220(%rdi) + paddd 224(%rsp),%xmm15 + paddd 240(%rsp),%xmm11 + paddd 16(%rsp),%xmm1 + paddd 80(%rsp),%xmm6 + movd %xmm15,%rdx + movd %xmm11,%rcx + movd %xmm1,%r9 + movd %xmm6,%rax + pshufd $0x39,%xmm15,%xmm15 + pshufd $0x39,%xmm11,%xmm11 + pshufd $0x39,%xmm1,%xmm1 + pshufd $0x39,%xmm6,%xmm6 + xorl 32(%rsi),%edx + xorl 36(%rsi),%ecx + xorl 40(%rsi),%r9d + xorl 44(%rsi),%eax + movl %edx,32(%rdi) + movl %ecx,36(%rdi) + movl %r9d,40(%rdi) + movl %eax,44(%rdi) + movd %xmm15,%rdx + movd %xmm11,%rcx + movd %xmm1,%r9 + movd %xmm6,%rax + pshufd $0x39,%xmm15,%xmm15 + pshufd $0x39,%xmm11,%xmm11 + pshufd $0x39,%xmm1,%xmm1 + pshufd $0x39,%xmm6,%xmm6 + xorl 96(%rsi),%edx + xorl 100(%rsi),%ecx + xorl 104(%rsi),%r9d + xorl 108(%rsi),%eax + movl %edx,96(%rdi) + movl %ecx,100(%rdi) + movl %r9d,104(%rdi) + movl %eax,108(%rdi) + movd %xmm15,%rdx + movd %xmm11,%rcx + movd %xmm1,%r9 + movd %xmm6,%rax + pshufd $0x39,%xmm15,%xmm15 + pshufd $0x39,%xmm11,%xmm11 + pshufd $0x39,%xmm1,%xmm1 + pshufd $0x39,%xmm6,%xmm6 + xorl 160(%rsi),%edx + xorl 164(%rsi),%ecx + xorl 168(%rsi),%r9d + xorl 172(%rsi),%eax + movl %edx,160(%rdi) + movl %ecx,164(%rdi) + movl %r9d,168(%rdi) + movl %eax,172(%rdi) + movd %xmm15,%rdx + movd %xmm11,%rcx + movd %xmm1,%r9 + movd %xmm6,%rax + xorl 224(%rsi),%edx + xorl 228(%rsi),%ecx + xorl 232(%rsi),%r9d + xorl 236(%rsi),%eax + movl %edx,224(%rdi) + movl %ecx,228(%rdi) + movl %r9d,232(%rdi) + movl %eax,236(%rdi) + paddd 96(%rsp),%xmm13 + paddd 144(%rsp),%xmm9 + paddd 192(%rsp),%xmm3 + 
paddd 32(%rsp),%xmm2 + movd %xmm13,%rdx + movd %xmm9,%rcx + movd %xmm3,%r9 + movd %xmm2,%rax + pshufd $0x39,%xmm13,%xmm13 + pshufd $0x39,%xmm9,%xmm9 + pshufd $0x39,%xmm3,%xmm3 + pshufd $0x39,%xmm2,%xmm2 + xorl 48(%rsi),%edx + xorl 52(%rsi),%ecx + xorl 56(%rsi),%r9d + xorl 60(%rsi),%eax + movl %edx,48(%rdi) + movl %ecx,52(%rdi) + movl %r9d,56(%rdi) + movl %eax,60(%rdi) + movd %xmm13,%rdx + movd %xmm9,%rcx + movd %xmm3,%r9 + movd %xmm2,%rax + pshufd $0x39,%xmm13,%xmm13 + pshufd $0x39,%xmm9,%xmm9 + pshufd $0x39,%xmm3,%xmm3 + pshufd $0x39,%xmm2,%xmm2 + xorl 112(%rsi),%edx + xorl 116(%rsi),%ecx + xorl 120(%rsi),%r9d + xorl 124(%rsi),%eax + movl %edx,112(%rdi) + movl %ecx,116(%rdi) + movl %r9d,120(%rdi) + movl %eax,124(%rdi) + movd %xmm13,%rdx + movd %xmm9,%rcx + movd %xmm3,%r9 + movd %xmm2,%rax + pshufd $0x39,%xmm13,%xmm13 + pshufd $0x39,%xmm9,%xmm9 + pshufd $0x39,%xmm3,%xmm3 + pshufd $0x39,%xmm2,%xmm2 + xorl 176(%rsi),%edx + xorl 180(%rsi),%ecx + xorl 184(%rsi),%r9d + xorl 188(%rsi),%eax + movl %edx,176(%rdi) + movl %ecx,180(%rdi) + movl %r9d,184(%rdi) + movl %eax,188(%rdi) + movd %xmm13,%rdx + movd %xmm9,%rcx + movd %xmm3,%r9 + movd %xmm2,%rax + xorl 240(%rsi),%edx + xorl 244(%rsi),%ecx + xorl 248(%rsi),%r9d + xorl 252(%rsi),%eax + movl %edx,240(%rdi) + movl %ecx,244(%rdi) + movl %r9d,248(%rdi) + movl %eax,252(%rdi) + movq 288(%rsp),%rdx + sub $256,%rdx + add $256,%rsi + add $256,%rdi + cmp $256,%rdx + jae ._bytesatleast256 + cmp $0,%rdx + jbe ._done +._bytes_are_64_128_or_192: + movq %rdx,288(%rsp) + movdqa 0(%r8),%xmm0 + movdqa 16(%r8),%xmm1 + movdqa 32(%r8),%xmm2 + movdqa 48(%r8),%xmm3 + movdqa %xmm1,%xmm4 + mov %rbx,%rdx +._mainloop2: + paddd %xmm0,%xmm4 + movdqa %xmm0,%xmm5 + movdqa %xmm4,%xmm6 + pslld $7,%xmm4 + psrld $25,%xmm6 + pxor %xmm4,%xmm3 + pxor %xmm6,%xmm3 + paddd %xmm3,%xmm5 + movdqa %xmm3,%xmm4 + movdqa %xmm5,%xmm6 + pslld $9,%xmm5 + psrld $23,%xmm6 + pxor %xmm5,%xmm2 + pshufd $0x93,%xmm3,%xmm3 + pxor %xmm6,%xmm2 + paddd %xmm2,%xmm4 + movdqa 
%xmm2,%xmm5 + movdqa %xmm4,%xmm6 + pslld $13,%xmm4 + psrld $19,%xmm6 + pxor %xmm4,%xmm1 + pshufd $0x4e,%xmm2,%xmm2 + pxor %xmm6,%xmm1 + paddd %xmm1,%xmm5 + movdqa %xmm3,%xmm4 + movdqa %xmm5,%xmm6 + pslld $18,%xmm5 + psrld $14,%xmm6 + pxor %xmm5,%xmm0 + pshufd $0x39,%xmm1,%xmm1 + pxor %xmm6,%xmm0 + paddd %xmm0,%xmm4 + movdqa %xmm0,%xmm5 + movdqa %xmm4,%xmm6 + pslld $7,%xmm4 + psrld $25,%xmm6 + pxor %xmm4,%xmm1 + pxor %xmm6,%xmm1 + paddd %xmm1,%xmm5 + movdqa %xmm1,%xmm4 + movdqa %xmm5,%xmm6 + pslld $9,%xmm5 + psrld $23,%xmm6 + pxor %xmm5,%xmm2 + pshufd $0x93,%xmm1,%xmm1 + pxor %xmm6,%xmm2 + paddd %xmm2,%xmm4 + movdqa %xmm2,%xmm5 + movdqa %xmm4,%xmm6 + pslld $13,%xmm4 + psrld $19,%xmm6 + pxor %xmm4,%xmm3 + pshufd $0x4e,%xmm2,%xmm2 + pxor %xmm6,%xmm3 + paddd %xmm3,%xmm5 + movdqa %xmm1,%xmm4 + movdqa %xmm5,%xmm6 + pslld $18,%xmm5 + psrld $14,%xmm6 + pxor %xmm5,%xmm0 + pshufd $0x39,%xmm3,%xmm3 + pxor %xmm6,%xmm0 + paddd %xmm0,%xmm4 + movdqa %xmm0,%xmm5 + movdqa %xmm4,%xmm6 + pslld $7,%xmm4 + psrld $25,%xmm6 + pxor %xmm4,%xmm3 + pxor %xmm6,%xmm3 + paddd %xmm3,%xmm5 + movdqa %xmm3,%xmm4 + movdqa %xmm5,%xmm6 + pslld $9,%xmm5 + psrld $23,%xmm6 + pxor %xmm5,%xmm2 + pshufd $0x93,%xmm3,%xmm3 + pxor %xmm6,%xmm2 + paddd %xmm2,%xmm4 + movdqa %xmm2,%xmm5 + movdqa %xmm4,%xmm6 + pslld $13,%xmm4 + psrld $19,%xmm6 + pxor %xmm4,%xmm1 + pshufd $0x4e,%xmm2,%xmm2 + pxor %xmm6,%xmm1 + paddd %xmm1,%xmm5 + movdqa %xmm3,%xmm4 + movdqa %xmm5,%xmm6 + pslld $18,%xmm5 + psrld $14,%xmm6 + pxor %xmm5,%xmm0 + pshufd $0x39,%xmm1,%xmm1 + pxor %xmm6,%xmm0 + paddd %xmm0,%xmm4 + movdqa %xmm0,%xmm5 + movdqa %xmm4,%xmm6 + pslld $7,%xmm4 + psrld $25,%xmm6 + pxor %xmm4,%xmm1 + pxor %xmm6,%xmm1 + paddd %xmm1,%xmm5 + movdqa %xmm1,%xmm4 + movdqa %xmm5,%xmm6 + pslld $9,%xmm5 + psrld $23,%xmm6 + pxor %xmm5,%xmm2 + pshufd $0x93,%xmm1,%xmm1 + pxor %xmm6,%xmm2 + paddd %xmm2,%xmm4 + movdqa %xmm2,%xmm5 + movdqa %xmm4,%xmm6 + pslld $13,%xmm4 + psrld $19,%xmm6 + pxor %xmm4,%xmm3 + pshufd $0x4e,%xmm2,%xmm2 + pxor 
%xmm6,%xmm3 + sub $4,%rdx + paddd %xmm3,%xmm5 + movdqa %xmm1,%xmm4 + movdqa %xmm5,%xmm6 + pslld $18,%xmm5 + pxor %xmm7,%xmm7 + psrld $14,%xmm6 + pxor %xmm5,%xmm0 + pshufd $0x39,%xmm3,%xmm3 + pxor %xmm6,%xmm0 + ja ._mainloop2 + paddd 0(%r8),%xmm0 + paddd 16(%r8),%xmm1 + paddd 32(%r8),%xmm2 + paddd 48(%r8),%xmm3 + movd %xmm0,%rdx + movd %xmm1,%rcx + movd %xmm2,%rax + movd %xmm3,%r10 + pshufd $0x39,%xmm0,%xmm0 + pshufd $0x39,%xmm1,%xmm1 + pshufd $0x39,%xmm2,%xmm2 + pshufd $0x39,%xmm3,%xmm3 + xorl 0(%rsi),%edx + xorl 48(%rsi),%ecx + xorl 32(%rsi),%eax + xorl 16(%rsi),%r10d + movl %edx,0(%rdi) + movl %ecx,48(%rdi) + movl %eax,32(%rdi) + movl %r10d,16(%rdi) + movd %xmm0,%rdx + movd %xmm1,%rcx + movd %xmm2,%rax + movd %xmm3,%r10 + pshufd $0x39,%xmm0,%xmm0 + pshufd $0x39,%xmm1,%xmm1 + pshufd $0x39,%xmm2,%xmm2 + pshufd $0x39,%xmm3,%xmm3 + xorl 20(%rsi),%edx + xorl 4(%rsi),%ecx + xorl 52(%rsi),%eax + xorl 36(%rsi),%r10d + movl %edx,20(%rdi) + movl %ecx,4(%rdi) + movl %eax,52(%rdi) + movl %r10d,36(%rdi) + movd %xmm0,%rdx + movd %xmm1,%rcx + movd %xmm2,%rax + movd %xmm3,%r10 + pshufd $0x39,%xmm0,%xmm0 + pshufd $0x39,%xmm1,%xmm1 + pshufd $0x39,%xmm2,%xmm2 + pshufd $0x39,%xmm3,%xmm3 + xorl 40(%rsi),%edx + xorl 24(%rsi),%ecx + xorl 8(%rsi),%eax + xorl 56(%rsi),%r10d + movl %edx,40(%rdi) + movl %ecx,24(%rdi) + movl %eax,8(%rdi) + movl %r10d,56(%rdi) + movd %xmm0,%rdx + movd %xmm1,%rcx + movd %xmm2,%rax + movd %xmm3,%r10 + xorl 60(%rsi),%edx + xorl 44(%rsi),%ecx + xorl 28(%rsi),%eax + xorl 12(%rsi),%r10d + movl %edx,60(%rdi) + movl %ecx,44(%rdi) + movl %eax,28(%rdi) + movl %r10d,12(%rdi) + movq 288(%rsp),%rdx + movl 32(%r8),%ecx + movl 52(%r8),%eax + add $1,%ecx + adc $0,%eax + movl %ecx,32(%r8) + movl %eax,52(%r8) + cmp $64,%rdx + ja ._bytes_are_128_or_192 +._done: + add %r11,%rsp + mov %r11,%rax + pop %rbx + ret +._bytes_are_128_or_192: + sub $64,%rdx + add $64,%rdi + add $64,%rsi + jmp ._bytes_are_64_128_or_192 +.size 
_gcry_salsa20_amd64_encrypt_blocks,.-_gcry_salsa20_amd64_encrypt_blocks; + +#endif /*defined(USE_SALSA20)*/ +#endif /*__x86_64*/ diff --git a/cipher/salsa20.c b/cipher/salsa20.c index 6189bca..892b9fc 100644 --- a/cipher/salsa20.c +++ b/cipher/salsa20.c @@ -40,6 +40,14 @@ #include "cipher.h" #include "bufhelp.h" + +/* USE_AMD64 indicates whether to compile with AMD64 code. */ +#undef USE_AMD64 +#if defined(__x86_64__) && defined(HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS) +# define USE_AMD64 1 +#endif + + #define SALSA20_MIN_KEY_SIZE 16 /* Bytes. */ #define SALSA20_MAX_KEY_SIZE 32 /* Bytes. */ #define SALSA20_BLOCK_SIZE 64 /* Bytes. */ @@ -83,6 +91,36 @@ typedef struct static void salsa20_setiv (void *context, const byte *iv, unsigned int ivlen); static const char *selftest (void); + +#ifdef USE_AMD64 +/* AMD64 assembly implementations of Salsa20. */ +void _gcry_salsa20_amd64_keysetup(u32 *ctxinput, const void *key, int keybits); +void _gcry_salsa20_amd64_ivsetup(u32 *ctxinput, const void *iv); +unsigned int +_gcry_salsa20_amd64_encrypt_blocks(u32 *ctxinput, const void *src, void *dst, + size_t len, int rounds); + +static void +salsa20_keysetup(SALSA20_context_t *ctx, const byte *key, int keylen) +{ + _gcry_salsa20_amd64_keysetup(ctx->input, key, keylen * 8); +} + +static void +salsa20_ivsetup(SALSA20_context_t *ctx, const byte *iv) +{ + _gcry_salsa20_amd64_ivsetup(ctx->input, iv); +} + +static unsigned int +salsa20_core (u32 *dst, u32 *src, unsigned int rounds) +{ + memset(dst, 0, SALSA20_BLOCK_SIZE); + return _gcry_salsa20_amd64_encrypt_blocks(src, dst, dst, 1, rounds); +} + +#else /* USE_AMD64 */ + #if 0 @@ -110,8 +148,8 @@ static const char *selftest (void); x0 ^= ROTL32 (18, x3 + x2); \ } while(0) -static void -salsa20_core (u32 *dst, const u32 *src, unsigned rounds) +static unsigned int +salsa20_core (u32 *dst, u32 *src, unsigned int rounds) { u32 pad[SALSA20_INPUT_LENGTH]; unsigned int i; @@ -138,31 +176,24 @@ salsa20_core (u32 *dst, const u32 *src, unsigned 
rounds) u32 t = pad[i] + src[i]; dst[i] = LE_SWAP32 (t); } + + /* Update counter. */ + if (!++src[8]) + src[9]++; + + /* burn_stack */ + return ( 3*sizeof (void*) \ + + 2*sizeof (void*) \ + + 64 \ + + sizeof (unsigned int) \ + + sizeof (u32) ); } #undef QROUND #undef SALSA20_CORE_DEBUG -static gcry_err_code_t -salsa20_do_setkey (SALSA20_context_t *ctx, - const byte *key, unsigned int keylen) +static void +salsa20_keysetup(SALSA20_context_t *ctx, const byte *key, int keylen) { - static int initialized; - static const char *selftest_failed; - - if (!initialized ) - { - initialized = 1; - selftest_failed = selftest (); - if (selftest_failed) - log_error ("SALSA20 selftest failed (%s)\n", selftest_failed ); - } - if (selftest_failed) - return GPG_ERR_SELFTEST_FAILED; - - if (keylen != SALSA20_MIN_KEY_SIZE - && keylen != SALSA20_MAX_KEY_SIZE) - return GPG_ERR_INV_KEYLEN; - /* These constants are the little endian encoding of the string "expand 32-byte k". For the 128 bit variant, the "32" in that string will be fixed up to "16". */ @@ -192,6 +223,41 @@ salsa20_do_setkey (SALSA20_context_t *ctx, ctx->input[5] -= 0x02000000; /* Change to "1 dn". */ ctx->input[10] += 0x00000004; /* Change to "yb-6". */ } +} + +static void salsa20_ivsetup(SALSA20_context_t *ctx, const byte *iv) +{ + ctx->input[6] = LE_READ_UINT32(iv + 0); + ctx->input[7] = LE_READ_UINT32(iv + 4); + /* Reset the block counter. 
*/ + ctx->input[8] = 0; + ctx->input[9] = 0; +} + +#endif /*!USE_AMD64*/ + +static gcry_err_code_t +salsa20_do_setkey (SALSA20_context_t *ctx, + const byte *key, unsigned int keylen) +{ + static int initialized; + static const char *selftest_failed; + + if (!initialized ) + { + initialized = 1; + selftest_failed = selftest (); + if (selftest_failed) + log_error ("SALSA20 selftest failed (%s)\n", selftest_failed ); + } + if (selftest_failed) + return GPG_ERR_SELFTEST_FAILED; + + if (keylen != SALSA20_MIN_KEY_SIZE + && keylen != SALSA20_MAX_KEY_SIZE) + return GPG_ERR_INV_KEYLEN; + + salsa20_keysetup (ctx, key, keylen); /* We default to a zero nonce. */ salsa20_setiv (ctx, NULL, 0); @@ -205,7 +271,7 @@ salsa20_setkey (void *context, const byte *key, unsigned int keylen) { SALSA20_context_t *ctx = (SALSA20_context_t *)context; gcry_err_code_t rc = salsa20_do_setkey (ctx, key, keylen); - _gcry_burn_stack (300/* FIXME*/); + _gcry_burn_stack (4 + sizeof (void *) + 4 * sizeof (void *)); return rc; } @@ -214,28 +280,22 @@ static void salsa20_setiv (void *context, const byte *iv, unsigned int ivlen) { SALSA20_context_t *ctx = (SALSA20_context_t *)context; + byte tmp[SALSA20_IV_SIZE]; - if (!iv) - { - ctx->input[6] = 0; - ctx->input[7] = 0; - } - else if (ivlen == SALSA20_IV_SIZE) - { - ctx->input[6] = LE_READ_UINT32(iv + 0); - ctx->input[7] = LE_READ_UINT32(iv + 4); - } + if (iv && ivlen != SALSA20_IV_SIZE) + log_info ("WARNING: salsa20_setiv: bad ivlen=%u\n", ivlen); + + if (!iv || ivlen != SALSA20_IV_SIZE) + memset (tmp, 0, sizeof(tmp)); else - { - log_info ("WARNING: salsa20_setiv: bad ivlen=%u\n", ivlen); - ctx->input[6] = 0; - ctx->input[7] = 0; - } - /* Reset the block counter. */ - ctx->input[8] = 0; - ctx->input[9] = 0; + memcpy (tmp, iv, SALSA20_IV_SIZE); + + salsa20_ivsetup (ctx, tmp); + /* Reset the unused pad bytes counter. 
*/ ctx->unused = 0; + + wipememory (tmp, sizeof(tmp)); } @@ -246,6 +306,8 @@ salsa20_do_encrypt_stream (SALSA20_context_t *ctx, byte *outbuf, const byte *inbuf, unsigned int length, unsigned rounds) { + unsigned int nburn, burn = 0; + if (ctx->unused) { unsigned char *p = (void*)ctx->pad; @@ -266,26 +328,39 @@ salsa20_do_encrypt_stream (SALSA20_context_t *ctx, gcry_assert (!ctx->unused); } - for (;;) +#ifdef USE_AMD64 + if (length >= SALSA20_BLOCK_SIZE) + { + unsigned int nblocks = length / SALSA20_BLOCK_SIZE; + burn = _gcry_salsa20_amd64_encrypt_blocks(ctx->input, inbuf, outbuf, + nblocks, rounds); + length -= SALSA20_BLOCK_SIZE * nblocks; + outbuf += SALSA20_BLOCK_SIZE * nblocks; + inbuf += SALSA20_BLOCK_SIZE * nblocks; + } +#endif + + while (length > 0) { /* Create the next pad and bump the block counter. Note that it is the user's duty to change to another nonce not later than after 2^70 processed bytes. */ - salsa20_core (ctx->pad, ctx->input, rounds); - if (!++ctx->input[8]) - ctx->input[9]++; + nburn = salsa20_core (ctx->pad, ctx->input, rounds); + burn = nburn > burn ? 
nburn : burn; if (length <= SALSA20_BLOCK_SIZE) { buf_xor (outbuf, inbuf, ctx->pad, length); ctx->unused = SALSA20_BLOCK_SIZE - length; - return; + break; } buf_xor (outbuf, inbuf, ctx->pad, SALSA20_BLOCK_SIZE); length -= SALSA20_BLOCK_SIZE; outbuf += SALSA20_BLOCK_SIZE; inbuf += SALSA20_BLOCK_SIZE; - } + } + + _gcry_burn_stack (burn); } @@ -296,19 +371,7 @@ salsa20_encrypt_stream (void *context, SALSA20_context_t *ctx = (SALSA20_context_t *)context; if (length) - { - salsa20_do_encrypt_stream (ctx, outbuf, inbuf, length, SALSA20_ROUNDS); - _gcry_burn_stack (/* salsa20_do_encrypt_stream: */ - 2*sizeof (void*) - + 3*sizeof (void*) + sizeof (unsigned int) - /* salsa20_core: */ - + 2*sizeof (void*) - + 2*sizeof (void*) - + 64 - + sizeof (unsigned int) - + sizeof (u32) - ); - } + salsa20_do_encrypt_stream (ctx, outbuf, inbuf, length, SALSA20_ROUNDS); } @@ -319,19 +382,7 @@ salsa20r12_encrypt_stream (void *context, SALSA20_context_t *ctx = (SALSA20_context_t *)context; if (length) - { - salsa20_do_encrypt_stream (ctx, outbuf, inbuf, length, SALSA20R12_ROUNDS); - _gcry_burn_stack (/* salsa20_do_encrypt_stream: */ - 2*sizeof (void*) - + 3*sizeof (void*) + sizeof (unsigned int) - /* salsa20_core: */ - + 2*sizeof (void*) - + 2*sizeof (void*) - + 64 - + sizeof (unsigned int) - + sizeof (u32) - ); - } + salsa20_do_encrypt_stream (ctx, outbuf, inbuf, length, SALSA20R12_ROUNDS); } diff --git a/configure.ac b/configure.ac index 5b7ba0d..114460c 100644 --- a/configure.ac +++ b/configure.ac @@ -1553,6 +1553,13 @@ LIST_MEMBER(salsa20, $enabled_ciphers) if test "$found" = "1" ; then GCRYPT_CIPHERS="$GCRYPT_CIPHERS salsa20.lo" AC_DEFINE(USE_SALSA20, 1, [Defined if this module should be included]) + + case "${host}" in + x86_64-*-*) + # Build with the assembly implementation + GCRYPT_CIPHERS="$GCRYPT_CIPHERS salsa20-amd64.lo" + ;; + esac fi LIST_MEMBER(gost28147, $enabled_ciphers) From jussi.kivilinna at iki.fi Sat Oct 26 15:03:58 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) 
Date: Sat, 26 Oct 2013 16:03:58 +0300
Subject: [PATCH 2/2] Add ARM NEON assembly implementation of Salsa20
In-Reply-To: <20131026130353.31770.70585.stgit@localhost6.localdomain6>
References: <20131026130353.31770.70585.stgit@localhost6.localdomain6>
Message-ID: <20131026130358.31770.80028.stgit@localhost6.localdomain6>

* cipher/Makefile.am: Add 'salsa20-armv7-neon.S'.
* cipher/salsa20-armv7-neon.S: New.
* cipher/salsa20.c [USE_ARM_NEON_ASM]: New macro.
(struct SALSA20_context_s, salsa20_core_t, salsa20_keysetup_t)
(salsa20_ivsetup_t): New.
(SALSA20_context_t) [USE_ARM_NEON_ASM]: Add 'use_neon'.
(SALSA20_context_t): Add 'keysetup', 'ivsetup' and 'core'.
(salsa20_core): Change 'src' argument to 'ctx'.
[USE_ARM_NEON_ASM] (_gcry_arm_neon_salsa20_encrypt): New prototype.
[USE_ARM_NEON_ASM] (salsa20_core_neon, salsa20_keysetup_neon)
(salsa20_ivsetup_neon): New.
(salsa20_do_setkey): Set up 'keysetup', 'ivsetup' and 'core' with the
default functions.
(salsa20_do_setkey) [USE_ARM_NEON_ASM]: When NEON support is detected,
set 'keysetup', 'ivsetup' and 'core' to the ARM NEON functions.
(salsa20_do_setkey): Call 'ctx->keysetup'.
(salsa20_setiv): Call 'ctx->ivsetup'.
(salsa20_do_encrypt_stream) [USE_ARM_NEON_ASM]: Process large buffers
with the ARM NEON implementation.
(salsa20_do_encrypt_stream): Call 'ctx->core' instead of directly
calling 'salsa20_core'.
(selftest): Add a test that checks large-buffer processing and block
counter updating.
* configure.ac [neonsupport]: Add 'salsa20-armv7-neon.lo'.
--

This patch adds a fast ARM NEON assembly implementation of Salsa20.
The implementation gains extra speed by processing three blocks in
parallel with the help of the ARM NEON vector processing unit.

The implementation is based on public domain code by Peter Schwabe and
D. J. Bernstein, available in the SUPERCOP benchmarking framework. For
more details on this work, see the paper "NEON crypto" by Daniel J.
Bernstein and Peter Schwabe:
 http://cryptojedi.org/papers/#neoncrypto

Benchmark results on Cortex-A8 (1008 MHz):

Before:

 SALSA20        |  nanosecs/byte   mebibytes/sec   cycles/byte
     STREAM enc |     18.88 ns/B     50.51 MiB/s     19.03 c/B
     STREAM dec |     18.89 ns/B     50.49 MiB/s     19.04 c/B
                =
 SALSA20R12     |  nanosecs/byte   mebibytes/sec   cycles/byte
     STREAM enc |     13.60 ns/B     70.14 MiB/s     13.71 c/B
     STREAM dec |     13.60 ns/B     70.13 MiB/s     13.71 c/B

After:

 SALSA20        |  nanosecs/byte   mebibytes/sec   cycles/byte
     STREAM enc |      5.48 ns/B     174.1 MiB/s      5.52 c/B
     STREAM dec |      5.47 ns/B     174.2 MiB/s      5.52 c/B
                =
 SALSA20R12     |  nanosecs/byte   mebibytes/sec   cycles/byte
     STREAM enc |      3.65 ns/B     260.9 MiB/s      3.68 c/B
     STREAM dec |      3.65 ns/B     261.6 MiB/s      3.67 c/B

Signed-off-by: Jussi Kivilinna
---
 cipher/Makefile.am          |    2
 cipher/salsa20-armv7-neon.S |  898 +++++++++++++++++++++++++++++++++++++++++++
 cipher/salsa20.c            |  132 ++++++
 configure.ac                |    5
 4 files changed, 1027 insertions(+), 10 deletions(-)
 create mode 100644 cipher/salsa20-armv7-neon.S

diff --git a/cipher/Makefile.am b/cipher/Makefile.am
index e786713..95d484e 100644
--- a/cipher/Makefile.am
+++ b/cipher/Makefile.am
@@ -71,7 +71,7 @@ md5.c \
 rijndael.c rijndael-tables.h rijndael-amd64.S rijndael-arm.S \
 rmd160.c \
 rsa.c \
-salsa20.c salsa20-amd64.S \
+salsa20.c salsa20-amd64.S salsa20-armv7-neon.S \
 scrypt.c \
 seed.c \
 serpent.c serpent-sse2-amd64.S serpent-avx2-amd64.S \
diff --git a/cipher/salsa20-armv7-neon.S b/cipher/salsa20-armv7-neon.S
new file mode 100644
index 0000000..0adb2c7
--- /dev/null
+++ b/cipher/salsa20-armv7-neon.S
@@ -0,0 +1,898 @@
+/* salsa-armv7-neon.S - ARM NEON implementation of Salsa20 cipher
+ *
+ * Copyright © 2013 Jussi Kivilinna
+ *
+ * This file is part of Libgcrypt.
+ *
+ * Libgcrypt is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as
+ * published by the Free Software Foundation; either version 2.1 of
+ * the License, or (at your option) any later version.
+ * + * Libgcrypt is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this program; if not, see . + */ + +#include + +#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) && \ + defined(HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS) && \ + defined(HAVE_GCC_INLINE_ASM_NEON) && defined(USE_SALSA20) + +/* + * Based on public domain implementation from SUPERCOP benchmarking framework + * by Peter Schwabe and D. J. Bernstein. Paper about the implementation at: + * http://cryptojedi.org/papers/#neoncrypto + */ + +.syntax unified +.arm +.fpu neon +.text + +.align 2 +.global _gcry_arm_neon_salsa20_encrypt +.type _gcry_arm_neon_salsa20_encrypt,%function; +_gcry_arm_neon_salsa20_encrypt: + /* Modifications: + * - arguments changed to (void *c, const void *m, unsigned int nblks, + * void *ctx, unsigned int rounds) from (void *c, const void *m, + * unsigned long long mlen, const void *n, const void *k) + * - nonce and key read from 'ctx' as well as sigma and counter. + * - read in counter from 'ctx' at the start. + * - update counter in 'ctx' at the end. + * - length is input as number of blocks, so don't handle tail bytes + * (this is done in salsa20.c). + */ + lsl r2,r2,#6 + vpush {q4,q5,q6,q7} + mov r12,sp + sub sp,sp,#352 + and sp,sp,#0xffffffe0 + strd r4,[sp,#0] + strd r6,[sp,#8] + strd r8,[sp,#16] + strd r10,[sp,#24] + str r14,[sp,#224] + str r12,[sp,#228] + str r0,[sp,#232] + str r1,[sp,#236] + str r2,[sp,#240] + ldr r4,[r12,#64] + str r4,[sp,#244] + mov r2,r3 + add r3,r2,#48 + vld1.8 {q3},[r2] + add r0,r2,#32 + add r14,r2,#40 + vmov.i64 q3,#0xff + str r14,[sp,#160] + ldrd r8,[r2,#4] + vld1.8 {d0},[r0] + ldrd r4,[r2,#20] + vld1.8 {d8-d9},[r2]! 
+ ldrd r6,[r0,#0] + vmov d4,d9 + ldr r0,[r14] + vrev64.i32 d0,d0 + ldr r1,[r14,#4] + vld1.8 {d10-d11},[r2] + strd r6,[sp,#32] + sub r2,r2,#16 + strd r0,[sp,#40] + vmov d5,d11 + strd r8,[sp,#48] + vext.32 d1,d0,d10,#1 + strd r4,[sp,#56] + ldr r1,[r2,#0] + vshr.u32 q3,q3,#7 + ldr r4,[r2,#12] + vext.32 d3,d11,d9,#1 + ldr r11,[r2,#16] + vext.32 d2,d8,d0,#1 + ldr r8,[r2,#28] + vext.32 d0,d10,d8,#1 + ldr r0,[r3,#0] + add r2,r2,#44 + vmov q4,q3 + vld1.8 {d6-d7},[r14] + vadd.i64 q3,q3,q4 + ldr r5,[r3,#4] + add r12,sp,#256 + vst1.8 {d4-d5},[r12,: 128] + ldr r10,[r3,#8] + add r14,sp,#272 + vst1.8 {d2-d3},[r14,: 128] + ldr r9,[r3,#12] + vld1.8 {d2-d3},[r3] + strd r0,[sp,#64] + ldr r0,[sp,#240] + strd r4,[sp,#72] + strd r10,[sp,#80] + strd r8,[sp,#88] + nop + cmp r0,#192 + blo ._mlenlowbelow192 +._mlenatleast192: + ldrd r2,[sp,#48] + vext.32 d7,d6,d6,#1 + vmov q8,q1 + ldrd r6,[sp,#32] + vld1.8 {d18-d19},[r12,: 128] + vmov q10,q0 + str r0,[sp,#240] + vext.32 d4,d7,d19,#1 + vmov q11,q8 + vext.32 d10,d18,d7,#1 + vadd.i64 q3,q3,q4 + ldrd r0,[sp,#64] + vld1.8 {d24-d25},[r14,: 128] + vmov d5,d24 + add r8,sp,#288 + ldrd r4,[sp,#72] + vmov d11,d25 + add r9,sp,#304 + ldrd r10,[sp,#80] + vst1.8 {d4-d5},[r8,: 128] + strd r2,[sp,#96] + vext.32 d7,d6,d6,#1 + vmov q13,q10 + strd r6,[sp,#104] + vmov d13,d24 + vst1.8 {d10-d11},[r9,: 128] + add r2,sp,#320 + vext.32 d12,d7,d19,#1 + vmov d15,d25 + add r6,sp,#336 + ldr r12,[sp,#244] + vext.32 d14,d18,d7,#1 + vadd.i64 q3,q3,q4 + ldrd r8,[sp,#88] + vst1.8 {d12-d13},[r2,: 128] + ldrd r2,[sp,#56] + vst1.8 {d14-d15},[r6,: 128] + ldrd r6,[sp,#40] +._mainloop2: + str r12,[sp,#248] + vadd.i32 q4,q10,q8 + vadd.i32 q9,q13,q11 + add r12,r0,r2 + add r14,r5,r1 + vshl.i32 q12,q4,#7 + vshl.i32 q14,q9,#7 + vshr.u32 q4,q4,#25 + vshr.u32 q9,q9,#25 + eor r4,r4,r12,ROR #25 + eor r7,r7,r14,ROR #25 + add r12,r4,r0 + add r14,r7,r5 + veor q5,q5,q12 + veor q7,q7,q14 + veor q4,q5,q4 + veor q5,q7,q9 + eor r6,r6,r12,ROR #23 + eor r3,r3,r14,ROR #23 + add r12,r6,r4 + str 
r7,[sp,#116] + add r7,r3,r7 + ldr r14,[sp,#108] + vadd.i32 q7,q8,q4 + vadd.i32 q9,q11,q5 + vshl.i32 q12,q7,#9 + vshl.i32 q14,q9,#9 + vshr.u32 q7,q7,#23 + vshr.u32 q9,q9,#23 + veor q2,q2,q12 + veor q6,q6,q14 + veor q2,q2,q7 + veor q6,q6,q9 + eor r2,r2,r12,ROR #19 + str r2,[sp,#120] + eor r1,r1,r7,ROR #19 + ldr r7,[sp,#96] + add r2,r2,r6 + str r6,[sp,#112] + add r6,r1,r3 + ldr r12,[sp,#104] + vadd.i32 q7,q4,q2 + vext.32 q4,q4,q4,#3 + vadd.i32 q9,q5,q6 + vshl.i32 q12,q7,#13 + vext.32 q5,q5,q5,#3 + vshl.i32 q14,q9,#13 + eor r0,r0,r2,ROR #14 + eor r2,r5,r6,ROR #14 + str r3,[sp,#124] + add r3,r10,r12 + ldr r5,[sp,#100] + add r6,r9,r11 + vshr.u32 q7,q7,#19 + vshr.u32 q9,q9,#19 + veor q10,q10,q12 + veor q12,q13,q14 + eor r8,r8,r3,ROR #25 + eor r3,r5,r6,ROR #25 + add r5,r8,r10 + add r6,r3,r9 + veor q7,q10,q7 + veor q9,q12,q9 + eor r5,r7,r5,ROR #23 + eor r6,r14,r6,ROR #23 + add r7,r5,r8 + add r14,r6,r3 + vadd.i32 q10,q2,q7 + vswp d4,d5 + vadd.i32 q12,q6,q9 + vshl.i32 q13,q10,#18 + vswp d12,d13 + vshl.i32 q14,q12,#18 + eor r7,r12,r7,ROR #19 + eor r11,r11,r14,ROR #19 + add r12,r7,r5 + add r14,r11,r6 + vshr.u32 q10,q10,#14 + vext.32 q7,q7,q7,#1 + vshr.u32 q12,q12,#14 + veor q8,q8,q13 + vext.32 q9,q9,q9,#1 + veor q11,q11,q14 + eor r10,r10,r12,ROR #14 + eor r9,r9,r14,ROR #14 + add r12,r0,r3 + add r14,r2,r4 + veor q8,q8,q10 + veor q10,q11,q12 + eor r1,r1,r12,ROR #25 + eor r7,r7,r14,ROR #25 + add r12,r1,r0 + add r14,r7,r2 + vadd.i32 q11,q4,q8 + vadd.i32 q12,q5,q10 + vshl.i32 q13,q11,#7 + vshl.i32 q14,q12,#7 + eor r5,r5,r12,ROR #23 + eor r6,r6,r14,ROR #23 + vshr.u32 q11,q11,#25 + vshr.u32 q12,q12,#25 + add r12,r5,r1 + add r14,r6,r7 + veor q7,q7,q13 + veor q9,q9,q14 + veor q7,q7,q11 + veor q9,q9,q12 + vadd.i32 q11,q8,q7 + vadd.i32 q12,q10,q9 + vshl.i32 q13,q11,#9 + vshl.i32 q14,q12,#9 + eor r3,r3,r12,ROR #19 + str r7,[sp,#104] + eor r4,r4,r14,ROR #19 + ldr r7,[sp,#112] + add r12,r3,r5 + str r6,[sp,#108] + add r6,r4,r6 + ldr r14,[sp,#116] + eor r0,r0,r12,ROR #14 + str r5,[sp,#96] + 
eor r5,r2,r6,ROR #14 + ldr r2,[sp,#120] + vshr.u32 q11,q11,#23 + vshr.u32 q12,q12,#23 + veor q2,q2,q13 + veor q6,q6,q14 + veor q2,q2,q11 + veor q6,q6,q12 + add r6,r10,r14 + add r12,r9,r8 + vadd.i32 q11,q7,q2 + vext.32 q7,q7,q7,#3 + vadd.i32 q12,q9,q6 + vshl.i32 q13,q11,#13 + vext.32 q9,q9,q9,#3 + vshl.i32 q14,q12,#13 + vshr.u32 q11,q11,#19 + vshr.u32 q12,q12,#19 + eor r11,r11,r6,ROR #25 + eor r2,r2,r12,ROR #25 + add r6,r11,r10 + str r3,[sp,#100] + add r3,r2,r9 + ldr r12,[sp,#124] + veor q4,q4,q13 + veor q5,q5,q14 + veor q4,q4,q11 + veor q5,q5,q12 + eor r6,r7,r6,ROR #23 + eor r3,r12,r3,ROR #23 + add r7,r6,r11 + add r12,r3,r2 + vadd.i32 q11,q2,q4 + vswp d4,d5 + vadd.i32 q12,q6,q5 + vshl.i32 q13,q11,#18 + vswp d12,d13 + vshl.i32 q14,q12,#18 + eor r7,r14,r7,ROR #19 + eor r8,r8,r12,ROR #19 + add r12,r7,r6 + add r14,r8,r3 + vshr.u32 q11,q11,#14 + vext.32 q4,q4,q4,#1 + vshr.u32 q12,q12,#14 + veor q8,q8,q13 + vext.32 q5,q5,q5,#1 + veor q10,q10,q14 + eor r10,r10,r12,ROR #14 + veor q8,q8,q11 + eor r9,r9,r14,ROR #14 + veor q10,q10,q12 + vadd.i32 q11,q7,q8 + vadd.i32 q12,q9,q10 + add r12,r0,r2 + add r14,r5,r1 + vshl.i32 q13,q11,#7 + vshl.i32 q14,q12,#7 + vshr.u32 q11,q11,#25 + vshr.u32 q12,q12,#25 + eor r4,r4,r12,ROR #25 + eor r7,r7,r14,ROR #25 + add r12,r4,r0 + add r14,r7,r5 + veor q4,q4,q13 + veor q5,q5,q14 + veor q4,q4,q11 + veor q5,q5,q12 + eor r6,r6,r12,ROR #23 + eor r3,r3,r14,ROR #23 + add r12,r6,r4 + str r7,[sp,#116] + add r7,r3,r7 + ldr r14,[sp,#108] + vadd.i32 q11,q8,q4 + vadd.i32 q12,q10,q5 + vshl.i32 q13,q11,#9 + vshl.i32 q14,q12,#9 + vshr.u32 q11,q11,#23 + vshr.u32 q12,q12,#23 + veor q2,q2,q13 + veor q6,q6,q14 + veor q2,q2,q11 + veor q6,q6,q12 + eor r2,r2,r12,ROR #19 + str r2,[sp,#120] + eor r1,r1,r7,ROR #19 + ldr r7,[sp,#96] + add r2,r2,r6 + str r6,[sp,#112] + add r6,r1,r3 + ldr r12,[sp,#104] + vadd.i32 q11,q4,q2 + vext.32 q4,q4,q4,#3 + vadd.i32 q12,q5,q6 + vshl.i32 q13,q11,#13 + vext.32 q5,q5,q5,#3 + vshl.i32 q14,q12,#13 + eor r0,r0,r2,ROR #14 + eor r2,r5,r6,ROR 
#14 + str r3,[sp,#124] + add r3,r10,r12 + ldr r5,[sp,#100] + add r6,r9,r11 + vshr.u32 q11,q11,#19 + vshr.u32 q12,q12,#19 + veor q7,q7,q13 + veor q9,q9,q14 + eor r8,r8,r3,ROR #25 + eor r3,r5,r6,ROR #25 + add r5,r8,r10 + add r6,r3,r9 + veor q7,q7,q11 + veor q9,q9,q12 + eor r5,r7,r5,ROR #23 + eor r6,r14,r6,ROR #23 + add r7,r5,r8 + add r14,r6,r3 + vadd.i32 q11,q2,q7 + vswp d4,d5 + vadd.i32 q12,q6,q9 + vshl.i32 q13,q11,#18 + vswp d12,d13 + vshl.i32 q14,q12,#18 + eor r7,r12,r7,ROR #19 + eor r11,r11,r14,ROR #19 + add r12,r7,r5 + add r14,r11,r6 + vshr.u32 q11,q11,#14 + vext.32 q7,q7,q7,#1 + vshr.u32 q12,q12,#14 + veor q8,q8,q13 + vext.32 q9,q9,q9,#1 + veor q10,q10,q14 + eor r10,r10,r12,ROR #14 + eor r9,r9,r14,ROR #14 + add r12,r0,r3 + add r14,r2,r4 + veor q8,q8,q11 + veor q11,q10,q12 + eor r1,r1,r12,ROR #25 + eor r7,r7,r14,ROR #25 + add r12,r1,r0 + add r14,r7,r2 + vadd.i32 q10,q4,q8 + vadd.i32 q12,q5,q11 + vshl.i32 q13,q10,#7 + vshl.i32 q14,q12,#7 + eor r5,r5,r12,ROR #23 + eor r6,r6,r14,ROR #23 + vshr.u32 q10,q10,#25 + vshr.u32 q12,q12,#25 + add r12,r5,r1 + add r14,r6,r7 + veor q7,q7,q13 + veor q9,q9,q14 + veor q7,q7,q10 + veor q9,q9,q12 + vadd.i32 q10,q8,q7 + vadd.i32 q12,q11,q9 + vshl.i32 q13,q10,#9 + vshl.i32 q14,q12,#9 + eor r3,r3,r12,ROR #19 + str r7,[sp,#104] + eor r4,r4,r14,ROR #19 + ldr r7,[sp,#112] + add r12,r3,r5 + str r6,[sp,#108] + add r6,r4,r6 + ldr r14,[sp,#116] + eor r0,r0,r12,ROR #14 + str r5,[sp,#96] + eor r5,r2,r6,ROR #14 + ldr r2,[sp,#120] + vshr.u32 q10,q10,#23 + vshr.u32 q12,q12,#23 + veor q2,q2,q13 + veor q6,q6,q14 + veor q2,q2,q10 + veor q6,q6,q12 + add r6,r10,r14 + add r12,r9,r8 + vadd.i32 q12,q7,q2 + vext.32 q10,q7,q7,#3 + vadd.i32 q7,q9,q6 + vshl.i32 q14,q12,#13 + vext.32 q13,q9,q9,#3 + vshl.i32 q9,q7,#13 + vshr.u32 q12,q12,#19 + vshr.u32 q7,q7,#19 + eor r11,r11,r6,ROR #25 + eor r2,r2,r12,ROR #25 + add r6,r11,r10 + str r3,[sp,#100] + add r3,r2,r9 + ldr r12,[sp,#124] + veor q4,q4,q14 + veor q5,q5,q9 + veor q4,q4,q12 + veor q7,q5,q7 + eor 
r6,r7,r6,ROR #23 + eor r3,r12,r3,ROR #23 + add r7,r6,r11 + add r12,r3,r2 + vadd.i32 q5,q2,q4 + vswp d4,d5 + vadd.i32 q9,q6,q7 + vshl.i32 q12,q5,#18 + vswp d12,d13 + vshl.i32 q14,q9,#18 + eor r7,r14,r7,ROR #19 + eor r8,r8,r12,ROR #19 + add r12,r7,r6 + add r14,r8,r3 + vshr.u32 q15,q5,#14 + vext.32 q5,q4,q4,#1 + vshr.u32 q4,q9,#14 + veor q8,q8,q12 + vext.32 q7,q7,q7,#1 + veor q9,q11,q14 + eor r10,r10,r12,ROR #14 + ldr r12,[sp,#248] + veor q8,q8,q15 + eor r9,r9,r14,ROR #14 + veor q11,q9,q4 + subs r12,r12,#4 + bhi ._mainloop2 + strd r8,[sp,#112] + ldrd r8,[sp,#64] + strd r2,[sp,#120] + ldrd r2,[sp,#96] + add r0,r0,r8 + strd r10,[sp,#96] + add r1,r1,r9 + ldrd r10,[sp,#48] + ldrd r8,[sp,#72] + add r2,r2,r10 + strd r6,[sp,#128] + add r3,r3,r11 + ldrd r6,[sp,#104] + ldrd r10,[sp,#32] + ldr r12,[sp,#236] + add r4,r4,r8 + add r5,r5,r9 + add r6,r6,r10 + add r7,r7,r11 + cmp r12,#0 + beq ._nomessage1 + ldr r8,[r12,#0] + ldr r9,[r12,#4] + ldr r10,[r12,#8] + ldr r11,[r12,#12] + eor r0,r0,r8 + ldr r8,[r12,#16] + eor r1,r1,r9 + ldr r9,[r12,#20] + eor r2,r2,r10 + ldr r10,[r12,#24] + eor r3,r3,r11 + ldr r11,[r12,#28] + eor r4,r4,r8 + eor r5,r5,r9 + eor r6,r6,r10 + eor r7,r7,r11 +._nomessage1: + ldr r14,[sp,#232] + vadd.i32 q4,q8,q1 + str r0,[r14,#0] + add r0,sp,#304 + str r1,[r14,#4] + vld1.8 {d16-d17},[r0,: 128] + str r2,[r14,#8] + vadd.i32 q5,q8,q5 + str r3,[r14,#12] + add r0,sp,#288 + str r4,[r14,#16] + vld1.8 {d16-d17},[r0,: 128] + str r5,[r14,#20] + vadd.i32 q9,q10,q0 + str r6,[r14,#24] + vadd.i32 q2,q8,q2 + str r7,[r14,#28] + vmov.i64 q8,#0xffffffff + ldrd r6,[sp,#128] + vext.32 d20,d8,d10,#1 + ldrd r0,[sp,#40] + vext.32 d25,d9,d11,#1 + ldrd r2,[sp,#120] + vbif q4,q9,q8 + ldrd r4,[sp,#56] + vext.32 d21,d5,d19,#1 + add r6,r6,r0 + vext.32 d24,d4,d18,#1 + add r7,r7,r1 + vbif q2,q5,q8 + add r2,r2,r4 + vrev64.i32 q5,q10 + add r3,r3,r5 + vrev64.i32 q9,q12 + adds r0,r0,#3 + vswp d5,d9 + adc r1,r1,#0 + strd r0,[sp,#40] + ldrd r8,[sp,#112] + ldrd r0,[sp,#88] + ldrd r10,[sp,#96] + ldrd 
r4,[sp,#80] + add r0,r8,r0 + add r1,r9,r1 + add r4,r10,r4 + add r5,r11,r5 + add r8,r14,#64 + cmp r12,#0 + beq ._nomessage2 + ldr r9,[r12,#32] + ldr r10,[r12,#36] + ldr r11,[r12,#40] + ldr r14,[r12,#44] + eor r6,r6,r9 + ldr r9,[r12,#48] + eor r7,r7,r10 + ldr r10,[r12,#52] + eor r4,r4,r11 + ldr r11,[r12,#56] + eor r5,r5,r14 + ldr r14,[r12,#60] + add r12,r12,#64 + eor r2,r2,r9 + vld1.8 {d20-d21},[r12]! + veor q4,q4,q10 + eor r3,r3,r10 + vld1.8 {d20-d21},[r12]! + veor q5,q5,q10 + eor r0,r0,r11 + vld1.8 {d20-d21},[r12]! + veor q2,q2,q10 + eor r1,r1,r14 + vld1.8 {d20-d21},[r12]! + veor q9,q9,q10 +._nomessage2: + vst1.8 {d8-d9},[r8]! + vst1.8 {d10-d11},[r8]! + vmov.i64 q4,#0xff + vst1.8 {d4-d5},[r8]! + vst1.8 {d18-d19},[r8]! + str r6,[r8,#-96] + add r6,sp,#336 + str r7,[r8,#-92] + add r7,sp,#320 + str r4,[r8,#-88] + vadd.i32 q2,q11,q1 + vld1.8 {d10-d11},[r6,: 128] + vadd.i32 q5,q5,q7 + vld1.8 {d14-d15},[r7,: 128] + vadd.i32 q9,q13,q0 + vadd.i32 q6,q7,q6 + str r5,[r8,#-84] + vext.32 d14,d4,d10,#1 + str r2,[r8,#-80] + vext.32 d21,d5,d11,#1 + str r3,[r8,#-76] + vbif q2,q9,q8 + str r0,[r8,#-72] + vext.32 d15,d13,d19,#1 + vshr.u32 q4,q4,#7 + str r1,[r8,#-68] + vext.32 d20,d12,d18,#1 + vbif q6,q5,q8 + ldr r0,[sp,#240] + vrev64.i32 q5,q7 + vrev64.i32 q7,q10 + vswp d13,d5 + vadd.i64 q3,q3,q4 + sub r0,r0,#192 + cmp r12,#0 + beq ._nomessage21 + vld1.8 {d16-d17},[r12]! + veor q2,q2,q8 + vld1.8 {d16-d17},[r12]! + veor q5,q5,q8 + vld1.8 {d16-d17},[r12]! + veor q6,q6,q8 + vld1.8 {d16-d17},[r12]! + veor q7,q7,q8 +._nomessage21: + vst1.8 {d4-d5},[r8]! + vst1.8 {d10-d11},[r8]! + vst1.8 {d12-d13},[r8]! + vst1.8 {d14-d15},[r8]! 
+ str r12,[sp,#236] + add r14,sp,#272 + add r12,sp,#256 + str r8,[sp,#232] + cmp r0,#192 + bhs ._mlenatleast192 +._mlenlowbelow192: + cmp r0,#0 + beq ._done + b ._mlenatleast1 +._nextblock: + sub r0,r0,#64 +._mlenatleast1: +._handleblock: + str r0,[sp,#248] + ldrd r2,[sp,#48] + ldrd r6,[sp,#32] + ldrd r0,[sp,#64] + ldrd r4,[sp,#72] + ldrd r10,[sp,#80] + ldrd r8,[sp,#88] + strd r2,[sp,#96] + strd r6,[sp,#104] + ldrd r2,[sp,#56] + ldrd r6,[sp,#40] + ldr r12,[sp,#244] +._mainloop1: + str r12,[sp,#252] + add r12,r0,r2 + add r14,r5,r1 + eor r4,r4,r12,ROR #25 + eor r7,r7,r14,ROR #25 + add r12,r4,r0 + add r14,r7,r5 + eor r6,r6,r12,ROR #23 + eor r3,r3,r14,ROR #23 + add r12,r6,r4 + str r7,[sp,#132] + add r7,r3,r7 + ldr r14,[sp,#104] + eor r2,r2,r12,ROR #19 + str r6,[sp,#128] + eor r1,r1,r7,ROR #19 + ldr r7,[sp,#100] + add r6,r2,r6 + str r2,[sp,#120] + add r2,r1,r3 + ldr r12,[sp,#96] + eor r0,r0,r6,ROR #14 + str r3,[sp,#124] + eor r2,r5,r2,ROR #14 + ldr r3,[sp,#108] + add r5,r10,r14 + add r6,r9,r11 + eor r8,r8,r5,ROR #25 + eor r5,r7,r6,ROR #25 + add r6,r8,r10 + add r7,r5,r9 + eor r6,r12,r6,ROR #23 + eor r3,r3,r7,ROR #23 + add r7,r6,r8 + add r12,r3,r5 + eor r7,r14,r7,ROR #19 + eor r11,r11,r12,ROR #19 + add r12,r7,r6 + add r14,r11,r3 + eor r10,r10,r12,ROR #14 + eor r9,r9,r14,ROR #14 + add r12,r0,r5 + add r14,r2,r4 + eor r1,r1,r12,ROR #25 + eor r7,r7,r14,ROR #25 + add r12,r1,r0 + add r14,r7,r2 + eor r6,r6,r12,ROR #23 + eor r3,r3,r14,ROR #23 + add r12,r6,r1 + str r7,[sp,#104] + add r7,r3,r7 + ldr r14,[sp,#128] + eor r5,r5,r12,ROR #19 + str r3,[sp,#108] + eor r4,r4,r7,ROR #19 + ldr r7,[sp,#132] + add r12,r5,r6 + str r6,[sp,#96] + add r3,r4,r3 + ldr r6,[sp,#120] + eor r0,r0,r12,ROR #14 + str r5,[sp,#100] + eor r5,r2,r3,ROR #14 + ldr r3,[sp,#124] + add r2,r10,r7 + add r12,r9,r8 + eor r11,r11,r2,ROR #25 + eor r2,r6,r12,ROR #25 + add r6,r11,r10 + add r12,r2,r9 + eor r6,r14,r6,ROR #23 + eor r3,r3,r12,ROR #23 + add r12,r6,r11 + add r14,r3,r2 + eor r7,r7,r12,ROR #19 + eor r8,r8,r14,ROR 
#19 + add r12,r7,r6 + add r14,r8,r3 + eor r10,r10,r12,ROR #14 + eor r9,r9,r14,ROR #14 + ldr r12,[sp,#252] + subs r12,r12,#2 + bhi ._mainloop1 + strd r6,[sp,#128] + strd r2,[sp,#120] + strd r10,[sp,#112] + strd r8,[sp,#136] + ldrd r2,[sp,#96] + ldrd r6,[sp,#104] + ldrd r8,[sp,#64] + ldrd r10,[sp,#48] + add r0,r0,r8 + add r1,r1,r9 + add r2,r2,r10 + add r3,r3,r11 + ldrd r8,[sp,#72] + ldrd r10,[sp,#32] + add r4,r4,r8 + add r5,r5,r9 + add r6,r6,r10 + add r7,r7,r11 + ldr r12,[sp,#236] + cmp r12,#0 + beq ._nomessage10 + ldr r8,[r12,#0] + ldr r9,[r12,#4] + ldr r10,[r12,#8] + ldr r11,[r12,#12] + eor r0,r0,r8 + ldr r8,[r12,#16] + eor r1,r1,r9 + ldr r9,[r12,#20] + eor r2,r2,r10 + ldr r10,[r12,#24] + eor r3,r3,r11 + ldr r11,[r12,#28] + eor r4,r4,r8 + eor r5,r5,r9 + eor r6,r6,r10 + eor r7,r7,r11 +._nomessage10: + ldr r14,[sp,#232] + str r0,[r14,#0] + str r1,[r14,#4] + str r2,[r14,#8] + str r3,[r14,#12] + str r4,[r14,#16] + str r5,[r14,#20] + str r6,[r14,#24] + str r7,[r14,#28] + ldrd r6,[sp,#128] + ldrd r10,[sp,#112] + ldrd r0,[sp,#40] + ldrd r4,[sp,#80] + add r6,r6,r0 + add r7,r7,r1 + add r10,r10,r4 + add r11,r11,r5 + adds r0,r0,#1 + adc r1,r1,#0 + strd r0,[sp,#40] + ldrd r2,[sp,#120] + ldrd r8,[sp,#136] + ldrd r4,[sp,#56] + ldrd r0,[sp,#88] + add r2,r2,r4 + add r3,r3,r5 + add r0,r8,r0 + add r1,r9,r1 + cmp r12,#0 + beq ._nomessage11 + ldr r4,[r12,#32] + ldr r5,[r12,#36] + ldr r8,[r12,#40] + ldr r9,[r12,#44] + eor r6,r6,r4 + ldr r4,[r12,#48] + eor r7,r7,r5 + ldr r5,[r12,#52] + eor r10,r10,r8 + ldr r8,[r12,#56] + eor r11,r11,r9 + ldr r9,[r12,#60] + eor r2,r2,r4 + eor r3,r3,r5 + eor r0,r0,r8 + eor r1,r1,r9 + add r4,r12,#64 + str r4,[sp,#236] +._nomessage11: + str r6,[r14,#32] + str r7,[r14,#36] + str r10,[r14,#40] + str r11,[r14,#44] + str r2,[r14,#48] + str r3,[r14,#52] + str r0,[r14,#56] + str r1,[r14,#60] + add r0,r14,#64 + str r0,[sp,#232] + ldr r0,[sp,#248] + cmp r0,#64 + bhi ._nextblock +._done: + ldr r2,[sp,#160] + ldrd r4,[sp,#0] + ldrd r6,[sp,#8] + ldrd r8,[sp,#16] + 
ldrd r10,[sp,#24] + ldr r12,[sp,#228] + ldr r14,[sp,#224] + ldrd r0,[sp,#40] + strd r0,[r2] + sub r0,r12,sp + mov sp,r12 + vpop {q4,q5,q6,q7} + add r0,r0,#64 + bx lr + +#endif diff --git a/cipher/salsa20.c b/cipher/salsa20.c index 892b9fc..983c7c2 100644 --- a/cipher/salsa20.c +++ b/cipher/salsa20.c @@ -41,12 +41,22 @@ #include "bufhelp.h" + /* USE_AMD64 indicates whether to compile with AMD64 code. */ #undef USE_AMD64 #if defined(__x86_64__) && defined(HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS) # define USE_AMD64 1 #endif +/* USE_ARM_NEON_ASM indicates whether to enable ARM NEON assembly code. */ +#undef USE_ARM_NEON_ASM +#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) +# if defined(HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS) && \ + defined(HAVE_GCC_INLINE_ASM_NEON) +# define USE_ARM_NEON_ASM 1 +# endif +#endif + #define SALSA20_MIN_KEY_SIZE 16 /* Bytes. */ #define SALSA20_MAX_KEY_SIZE 32 /* Bytes. */ @@ -60,7 +70,16 @@ #define SALSA20R12_ROUNDS 12 -typedef struct +struct SALSA20_context_s; + +typedef unsigned int (*salsa20_core_t) (u32 *dst, struct SALSA20_context_s *ctx, + unsigned int rounds); +typedef void (* salsa20_keysetup_t)(struct SALSA20_context_s *ctx, + const byte *key, int keylen); +typedef void (* salsa20_ivsetup_t)(struct SALSA20_context_s *ctx, + const byte *iv); + +typedef struct SALSA20_context_s { /* Indices 1-4 and 11-14 holds the key (two identical copies for the shorter key size), indices 0, 5, 10, 15 are constant, indices 6, 7 @@ -74,6 +93,12 @@ typedef struct u32 input[SALSA20_INPUT_LENGTH]; u32 pad[SALSA20_INPUT_LENGTH]; unsigned int unused; /* bytes in the pad. 
*/ +#ifdef USE_ARM_NEON_ASM + int use_neon; +#endif + salsa20_keysetup_t keysetup; + salsa20_ivsetup_t ivsetup; + salsa20_core_t core; } SALSA20_context_t; @@ -113,10 +138,10 @@ salsa20_ivsetup(SALSA20_context_t *ctx, const byte *iv) } static unsigned int -salsa20_core (u32 *dst, u32 *src, unsigned int rounds) +salsa20_core (u32 *dst, SALSA20_context_t *ctx, unsigned int rounds) { memset(dst, 0, SALSA20_BLOCK_SIZE); - return _gcry_salsa20_amd64_encrypt_blocks(src, dst, dst, 1, rounds); + return _gcry_salsa20_amd64_encrypt_blocks(ctx->input, dst, dst, 1, rounds); } #else /* USE_AMD64 */ @@ -149,9 +174,9 @@ salsa20_core (u32 *dst, u32 *src, unsigned int rounds) } while(0) static unsigned int -salsa20_core (u32 *dst, u32 *src, unsigned int rounds) +salsa20_core (u32 *dst, SALSA20_context_t *ctx, unsigned rounds) { - u32 pad[SALSA20_INPUT_LENGTH]; + u32 pad[SALSA20_INPUT_LENGTH], *src = ctx->input; unsigned int i; memcpy (pad, src, sizeof(pad)); @@ -236,6 +261,49 @@ static void salsa20_ivsetup(SALSA20_context_t *ctx, const byte *iv) #endif /*!USE_AMD64*/ +#ifdef USE_ARM_NEON_ASM + +/* ARM NEON implementation of Salsa20. */ +unsigned int +_gcry_arm_neon_salsa20_encrypt(void *c, const void *m, unsigned int nblks, + void *k, unsigned int rounds); + +static unsigned int +salsa20_core_neon (u32 *dst, SALSA20_context_t *ctx, unsigned int rounds) +{ + return _gcry_arm_neon_salsa20_encrypt(dst, NULL, 1, ctx->input, rounds); +} + +static void salsa20_ivsetup_neon(SALSA20_context_t *ctx, const byte *iv) +{ + memcpy(ctx->input + 8, iv, 8); + /* Reset the block counter. */ + memset(ctx->input + 10, 0, 8); +} + +static void +salsa20_keysetup_neon(SALSA20_context_t *ctx, const byte *key, int klen) +{ + static const unsigned char sigma32[16] = "expand 32-byte k"; + static const unsigned char sigma16[16] = "expand 16-byte k"; + + if (klen == 16) + { + memcpy (ctx->input, key, 16); + memcpy (ctx->input + 4, key, 16); /* Duplicate 128-bit key. 
*/ + memcpy (ctx->input + 12, sigma16, 16); + } + else + { + /* 32-byte key */ + memcpy (ctx->input, key, 32); + memcpy (ctx->input + 12, sigma32, 16); + } +} + +#endif /*USE_ARM_NEON_ASM*/ + + static gcry_err_code_t salsa20_do_setkey (SALSA20_context_t *ctx, const byte *key, unsigned int keylen) @@ -257,7 +325,23 @@ salsa20_do_setkey (SALSA20_context_t *ctx, && keylen != SALSA20_MAX_KEY_SIZE) return GPG_ERR_INV_KEYLEN; - salsa20_keysetup (ctx, key, keylen); + /* Default ops. */ + ctx->keysetup = salsa20_keysetup; + ctx->ivsetup = salsa20_ivsetup; + ctx->core = salsa20_core; + +#ifdef USE_ARM_NEON_ASM + ctx->use_neon = (_gcry_get_hw_features () & HWF_ARM_NEON) != 0; + if (ctx->use_neon) + { + /* Use ARM NEON ops instead. */ + ctx->keysetup = salsa20_keysetup_neon; + ctx->ivsetup = salsa20_ivsetup_neon; + ctx->core = salsa20_core_neon; + } +#endif + + ctx->keysetup (ctx, key, keylen); /* We default to a zero nonce. */ salsa20_setiv (ctx, NULL, 0); @@ -290,7 +374,7 @@ salsa20_setiv (void *context, const byte *iv, unsigned int ivlen) else memcpy (tmp, iv, SALSA20_IV_SIZE); - salsa20_ivsetup (ctx, tmp); + ctx->ivsetup (ctx, tmp); /* Reset the unused pad bytes counter. */ ctx->unused = 0; @@ -340,12 +424,24 @@ salsa20_do_encrypt_stream (SALSA20_context_t *ctx, } #endif +#ifdef USE_ARM_NEON_ASM + if (ctx->use_neon && length >= SALSA20_BLOCK_SIZE) + { + unsigned int nblocks = length / SALSA20_BLOCK_SIZE; + _gcry_arm_neon_salsa20_encrypt (outbuf, inbuf, nblocks, ctx->input, + rounds); + length -= SALSA20_BLOCK_SIZE * nblocks; + outbuf += SALSA20_BLOCK_SIZE * nblocks; + inbuf += SALSA20_BLOCK_SIZE * nblocks; + } +#endif + while (length > 0) { /* Create the next pad and bump the block counter. Note that it is the user's duty to change to another nonce not later than after 2^70 processed bytes. */ - nburn = salsa20_core (ctx->pad, ctx->input, rounds); + nburn = ctx->core (ctx->pad, ctx, rounds); burn = nburn > burn ? 
nburn : burn; if (length <= SALSA20_BLOCK_SIZE) @@ -386,12 +482,13 @@ salsa20r12_encrypt_stream (void *context, } - static const char* selftest (void) { SALSA20_context_t ctx; byte scratch[8+1]; + byte buf[256+64+4]; + int i; static byte key_1[] = { 0x80, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, @@ -418,6 +515,23 @@ selftest (void) salsa20_encrypt_stream (&ctx, scratch, scratch, sizeof plaintext_1); if (memcmp (scratch, plaintext_1, sizeof plaintext_1)) return "Salsa20 decryption test 1 failed."; + + for (i = 0; i < sizeof buf; i++) + buf[i] = i; + salsa20_setkey (&ctx, key_1, sizeof key_1); + salsa20_setiv (&ctx, nonce_1, sizeof nonce_1); + /*encrypt*/ + salsa20_encrypt_stream (&ctx, buf, buf, sizeof buf); + /*decrypt*/ + salsa20_setkey (&ctx, key_1, sizeof key_1); + salsa20_setiv (&ctx, nonce_1, sizeof nonce_1); + salsa20_encrypt_stream (&ctx, buf, buf, 1); + salsa20_encrypt_stream (&ctx, buf+1, buf+1, (sizeof buf)-1-1); + salsa20_encrypt_stream (&ctx, buf+(sizeof buf)-1, buf+(sizeof buf)-1, 1); + for (i = 0; i < sizeof buf; i++) + if (buf[i] != (byte)i) + return "Salsa20 encryption test 2 failed."; + return NULL; } diff --git a/configure.ac b/configure.ac index 114460c..19c97bd 100644 --- a/configure.ac +++ b/configure.ac @@ -1560,6 +1560,11 @@ if test "$found" = "1" ; then GCRYPT_CIPHERS="$GCRYPT_CIPHERS salsa20-amd64.lo" ;; esac + + if test x"$neonsupport" = xyes ; then + # Build with the NEON implementation + GCRYPT_CIPHERS="$GCRYPT_CIPHERS salsa20-armv7-neon.lo" + fi fi LIST_MEMBER(gost28147, $enabled_ciphers) From jussi.kivilinna at iki.fi Mon Oct 28 08:10:06 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Mon, 28 Oct 2013 09:10:06 +0200 Subject: [PATCH] Change .global to .globl in assembly files Message-ID: <20131028071006.20575.62431.stgit@localhost6.localdomain6> * cipher/blowfish-arm.S: Change '.global' to '.globl'. * cipher/camellia-aesni-avx-amd64.S: Ditto. * cipher/camellia-aesni-avx2-amd64.S: Ditto. * cipher/camellia-arm.S: Ditto. 
* cipher/cast5-amd64.S: Ditto. * cipher/rijndael-amd64.S: Ditto. * cipher/rijndael-arm.S: Ditto. * cipher/serpent-avx2-amd64.S: Ditto. * cipher/serpent-sse2-amd64.S: Ditto. * cipher/twofish-amd64.S: Ditto. * cipher/twofish-arm.S: Ditto. -- The .global keyword is used only in newer versions of GAS, so change these to older .globl for better portability. Signed-off-by: Jussi Kivilinna --- cipher/blowfish-arm.S | 4 ++-- cipher/camellia-aesni-avx-amd64.S | 6 +++--- cipher/camellia-aesni-avx2-amd64.S | 6 +++--- cipher/camellia-arm.S | 4 ++-- cipher/cast5-amd64.S | 10 +++++----- cipher/rijndael-amd64.S | 4 ++-- cipher/rijndael-arm.S | 4 ++-- cipher/serpent-avx2-amd64.S | 6 +++--- cipher/serpent-sse2-amd64.S | 6 +++--- cipher/twofish-amd64.S | 10 +++++----- cipher/twofish-arm.S | 4 ++-- 11 files changed, 32 insertions(+), 32 deletions(-) diff --git a/cipher/blowfish-arm.S b/cipher/blowfish-arm.S index 43090d7..901d0c3 100644 --- a/cipher/blowfish-arm.S +++ b/cipher/blowfish-arm.S @@ -296,7 +296,7 @@ _gcry_blowfish_arm_do_encrypt: .size _gcry_blowfish_arm_do_encrypt,.-_gcry_blowfish_arm_do_encrypt; .align 3 -.global _gcry_blowfish_arm_encrypt_block +.globl _gcry_blowfish_arm_encrypt_block .type _gcry_blowfish_arm_encrypt_block,%function; _gcry_blowfish_arm_encrypt_block: @@ -317,7 +317,7 @@ _gcry_blowfish_arm_encrypt_block: .size _gcry_blowfish_arm_encrypt_block,.-_gcry_blowfish_arm_encrypt_block; .align 3 -.global _gcry_blowfish_arm_decrypt_block +.globl _gcry_blowfish_arm_decrypt_block .type _gcry_blowfish_arm_decrypt_block,%function; _gcry_blowfish_arm_decrypt_block: diff --git a/cipher/camellia-aesni-avx-amd64.S b/cipher/camellia-aesni-avx-amd64.S index 9873d98..9be5d14 100644 --- a/cipher/camellia-aesni-avx-amd64.S +++ b/cipher/camellia-aesni-avx-amd64.S @@ -947,7 +947,7 @@ __camellia_dec_blk16: vpsubq tmp, x, x; .align 8 -.global _gcry_camellia_aesni_avx_ctr_enc +.globl _gcry_camellia_aesni_avx_ctr_enc .type _gcry_camellia_aesni_avx_ctr_enc, at function; 
_gcry_camellia_aesni_avx_ctr_enc: @@ -1062,7 +1062,7 @@ _gcry_camellia_aesni_avx_ctr_enc: .size _gcry_camellia_aesni_avx_ctr_enc,.-_gcry_camellia_aesni_avx_ctr_enc; .align 8 -.global _gcry_camellia_aesni_avx_cbc_dec +.globl _gcry_camellia_aesni_avx_cbc_dec .type _gcry_camellia_aesni_avx_cbc_dec, at function; _gcry_camellia_aesni_avx_cbc_dec: @@ -1126,7 +1126,7 @@ _gcry_camellia_aesni_avx_cbc_dec: .size _gcry_camellia_aesni_avx_cbc_dec,.-_gcry_camellia_aesni_avx_cbc_dec; .align 8 -.global _gcry_camellia_aesni_avx_cfb_dec +.globl _gcry_camellia_aesni_avx_cfb_dec .type _gcry_camellia_aesni_avx_cfb_dec, at function; _gcry_camellia_aesni_avx_cfb_dec: diff --git a/cipher/camellia-aesni-avx2-amd64.S b/cipher/camellia-aesni-avx2-amd64.S index 7e31323..f0a4fd8 100644 --- a/cipher/camellia-aesni-avx2-amd64.S +++ b/cipher/camellia-aesni-avx2-amd64.S @@ -926,7 +926,7 @@ __camellia_dec_blk32: vpsubq tmp, x, x; .align 8 -.global _gcry_camellia_aesni_avx2_ctr_enc +.globl _gcry_camellia_aesni_avx2_ctr_enc .type _gcry_camellia_aesni_avx2_ctr_enc, at function; _gcry_camellia_aesni_avx2_ctr_enc: @@ -1110,7 +1110,7 @@ _gcry_camellia_aesni_avx2_ctr_enc: .size _gcry_camellia_aesni_avx2_ctr_enc,.-_gcry_camellia_aesni_avx2_ctr_enc; .align 8 -.global _gcry_camellia_aesni_avx2_cbc_dec +.globl _gcry_camellia_aesni_avx2_cbc_dec .type _gcry_camellia_aesni_avx2_cbc_dec, at function; _gcry_camellia_aesni_avx2_cbc_dec: @@ -1181,7 +1181,7 @@ _gcry_camellia_aesni_avx2_cbc_dec: .size _gcry_camellia_aesni_avx2_cbc_dec,.-_gcry_camellia_aesni_avx2_cbc_dec; .align 8 -.global _gcry_camellia_aesni_avx2_cfb_dec +.globl _gcry_camellia_aesni_avx2_cfb_dec .type _gcry_camellia_aesni_avx2_cfb_dec, at function; _gcry_camellia_aesni_avx2_cfb_dec: diff --git a/cipher/camellia-arm.S b/cipher/camellia-arm.S index 820c46e..302f436 100644 --- a/cipher/camellia-arm.S +++ b/cipher/camellia-arm.S @@ -252,7 +252,7 @@ str_output_be(%r1, YL, YR, XL, XR, RT0, RT1); .align 3 -.global _gcry_camellia_arm_encrypt_block +.globl 
_gcry_camellia_arm_encrypt_block .type _gcry_camellia_arm_encrypt_block,%function; _gcry_camellia_arm_encrypt_block: @@ -300,7 +300,7 @@ _gcry_camellia_arm_encrypt_block: .size _gcry_camellia_arm_encrypt_block,.-_gcry_camellia_arm_encrypt_block; .align 3 -.global _gcry_camellia_arm_decrypt_block +.globl _gcry_camellia_arm_decrypt_block .type _gcry_camellia_arm_decrypt_block,%function; _gcry_camellia_arm_decrypt_block: diff --git a/cipher/cast5-amd64.S b/cipher/cast5-amd64.S index 1bca249..c3b819d 100644 --- a/cipher/cast5-amd64.S +++ b/cipher/cast5-amd64.S @@ -179,7 +179,7 @@ movq RLR0, (RIO); .align 8 -.global _gcry_cast5_amd64_encrypt_block +.globl _gcry_cast5_amd64_encrypt_block .type _gcry_cast5_amd64_encrypt_block, at function; _gcry_cast5_amd64_encrypt_block: @@ -219,7 +219,7 @@ _gcry_cast5_amd64_encrypt_block: .size _gcry_cast5_amd64_encrypt_block,.-_gcry_cast5_amd64_encrypt_block; .align 8 -.global _gcry_cast5_amd64_decrypt_block +.globl _gcry_cast5_amd64_decrypt_block .type _gcry_cast5_amd64_decrypt_block, at function; _gcry_cast5_amd64_decrypt_block: @@ -417,7 +417,7 @@ __cast5_dec_blk4: .size __cast5_dec_blk4,.-__cast5_dec_blk4; .align 8 -.global _gcry_cast5_amd64_ctr_enc +.globl _gcry_cast5_amd64_ctr_enc .type _gcry_cast5_amd64_ctr_enc, at function; _gcry_cast5_amd64_ctr_enc: /* input: @@ -475,7 +475,7 @@ _gcry_cast5_amd64_ctr_enc: .size _gcry_cast5_amd64_ctr_enc,.-_gcry_cast5_amd64_ctr_enc; .align 8 -.global _gcry_cast5_amd64_cbc_dec +.globl _gcry_cast5_amd64_cbc_dec .type _gcry_cast5_amd64_cbc_dec, at function; _gcry_cast5_amd64_cbc_dec: /* input: @@ -529,7 +529,7 @@ _gcry_cast5_amd64_cbc_dec: .size _gcry_cast5_amd64_cbc_dec,.-_gcry_cast5_amd64_cbc_dec; .align 8 -.global _gcry_cast5_amd64_cfb_dec +.globl _gcry_cast5_amd64_cfb_dec .type _gcry_cast5_amd64_cfb_dec, at function; _gcry_cast5_amd64_cfb_dec: /* input: diff --git a/cipher/rijndael-amd64.S b/cipher/rijndael-amd64.S index 2a7dd90..d360ea8 100644 --- a/cipher/rijndael-amd64.S +++ 
b/cipher/rijndael-amd64.S @@ -160,7 +160,7 @@ addroundkey((round) + 1, RNA, RNB, RNC, RND); .align 8 -.global _gcry_aes_amd64_encrypt_block +.globl _gcry_aes_amd64_encrypt_block .type _gcry_aes_amd64_encrypt_block, at function; _gcry_aes_amd64_encrypt_block: @@ -281,7 +281,7 @@ _gcry_aes_amd64_encrypt_block: addroundkey(round, RNA, RNB, RNC, RND); .align 8 -.global _gcry_aes_amd64_decrypt_block +.globl _gcry_aes_amd64_decrypt_block .type _gcry_aes_amd64_decrypt_block, at function; _gcry_aes_amd64_decrypt_block: diff --git a/cipher/rijndael-arm.S b/cipher/rijndael-arm.S index 2a747bf..22c350c 100644 --- a/cipher/rijndael-arm.S +++ b/cipher/rijndael-arm.S @@ -211,7 +211,7 @@ addroundkey(rna, rnb, rnc, rnd, ra, rb, rc, rd, dummy); .align 3 -.global _gcry_aes_arm_encrypt_block +.globl _gcry_aes_arm_encrypt_block .type _gcry_aes_arm_encrypt_block,%function; _gcry_aes_arm_encrypt_block: @@ -465,7 +465,7 @@ _gcry_aes_arm_encrypt_block: addroundkey(rna, rnb, rnc, rnd, ra, rb, rc, rd, dummy); .align 3 -.global _gcry_aes_arm_decrypt_block +.globl _gcry_aes_arm_decrypt_block .type _gcry_aes_arm_decrypt_block,%function; _gcry_aes_arm_decrypt_block: diff --git a/cipher/serpent-avx2-amd64.S b/cipher/serpent-avx2-amd64.S index 8a76ab1..532361d 100644 --- a/cipher/serpent-avx2-amd64.S +++ b/cipher/serpent-avx2-amd64.S @@ -588,7 +588,7 @@ __serpent_dec_blk16: vpsubq tmp, x, x; .align 8 -.global _gcry_serpent_avx2_ctr_enc +.globl _gcry_serpent_avx2_ctr_enc .type _gcry_serpent_avx2_ctr_enc, at function; _gcry_serpent_avx2_ctr_enc: /* input: @@ -695,7 +695,7 @@ _gcry_serpent_avx2_ctr_enc: .size _gcry_serpent_avx2_ctr_enc,.-_gcry_serpent_avx2_ctr_enc; .align 8 -.global _gcry_serpent_avx2_cbc_dec +.globl _gcry_serpent_avx2_cbc_dec .type _gcry_serpent_avx2_cbc_dec, at function; _gcry_serpent_avx2_cbc_dec: /* input: @@ -746,7 +746,7 @@ _gcry_serpent_avx2_cbc_dec: .size _gcry_serpent_avx2_cbc_dec,.-_gcry_serpent_avx2_cbc_dec; .align 8 -.global _gcry_serpent_avx2_cfb_dec +.globl 
_gcry_serpent_avx2_cfb_dec .type _gcry_serpent_avx2_cfb_dec, at function; _gcry_serpent_avx2_cfb_dec: /* input: diff --git a/cipher/serpent-sse2-amd64.S b/cipher/serpent-sse2-amd64.S index 516126b..f2be236 100644 --- a/cipher/serpent-sse2-amd64.S +++ b/cipher/serpent-sse2-amd64.S @@ -605,7 +605,7 @@ __serpent_dec_blk8: .size __serpent_dec_blk8,.-__serpent_dec_blk8; .align 8 -.global _gcry_serpent_sse2_ctr_enc +.globl _gcry_serpent_sse2_ctr_enc .type _gcry_serpent_sse2_ctr_enc, at function; _gcry_serpent_sse2_ctr_enc: /* input: @@ -735,7 +735,7 @@ _gcry_serpent_sse2_ctr_enc: .size _gcry_serpent_sse2_ctr_enc,.-_gcry_serpent_sse2_ctr_enc; .align 8 -.global _gcry_serpent_sse2_cbc_dec +.globl _gcry_serpent_sse2_cbc_dec .type _gcry_serpent_sse2_cbc_dec, at function; _gcry_serpent_sse2_cbc_dec: /* input: @@ -796,7 +796,7 @@ _gcry_serpent_sse2_cbc_dec: .size _gcry_serpent_sse2_cbc_dec,.-_gcry_serpent_sse2_cbc_dec; .align 8 -.global _gcry_serpent_sse2_cfb_dec +.globl _gcry_serpent_sse2_cfb_dec .type _gcry_serpent_sse2_cfb_dec, at function; _gcry_serpent_sse2_cfb_dec: /* input: diff --git a/cipher/twofish-amd64.S b/cipher/twofish-amd64.S index 45548d2..c923d22 100644 --- a/cipher/twofish-amd64.S +++ b/cipher/twofish-amd64.S @@ -165,7 +165,7 @@ movl x, (4 * (n))(out); .align 8 -.global _gcry_twofish_amd64_encrypt_block +.globl _gcry_twofish_amd64_encrypt_block .type _gcry_twofish_amd64_encrypt_block, at function; _gcry_twofish_amd64_encrypt_block: @@ -208,7 +208,7 @@ _gcry_twofish_amd64_encrypt_block: .size _gcry_twofish_amd64_encrypt_block,.-_gcry_twofish_amd64_encrypt_block; .align 8 -.global _gcry_twofish_amd64_decrypt_block +.globl _gcry_twofish_amd64_decrypt_block .type _gcry_twofish_amd64_decrypt_block, at function; _gcry_twofish_amd64_decrypt_block: @@ -514,7 +514,7 @@ __twofish_dec_blk3: .size __twofish_dec_blk3,.-__twofish_dec_blk3; .align 8 -.global _gcry_twofish_amd64_ctr_enc +.globl _gcry_twofish_amd64_ctr_enc .type _gcry_twofish_amd64_ctr_enc, at function; 
_gcry_twofish_amd64_ctr_enc: /* input: @@ -596,7 +596,7 @@ _gcry_twofish_amd64_ctr_enc: .size _gcry_twofish_amd64_ctr_enc,.-_gcry_twofish_amd64_ctr_enc; .align 8 -.global _gcry_twofish_amd64_cbc_dec +.globl _gcry_twofish_amd64_cbc_dec .type _gcry_twofish_amd64_cbc_dec, at function; _gcry_twofish_amd64_cbc_dec: /* input: @@ -662,7 +662,7 @@ _gcry_twofish_amd64_cbc_dec: .size _gcry_twofish_amd64_cbc_dec,.-_gcry_twofish_amd64_cbc_dec; .align 8 -.global _gcry_twofish_amd64_cfb_dec +.globl _gcry_twofish_amd64_cfb_dec .type _gcry_twofish_amd64_cfb_dec, at function; _gcry_twofish_amd64_cfb_dec: /* input: diff --git a/cipher/twofish-arm.S b/cipher/twofish-arm.S index ee22f56..ead2240 100644 --- a/cipher/twofish-arm.S +++ b/cipher/twofish-arm.S @@ -257,7 +257,7 @@ ror1(RD); .align 3 -.global _gcry_twofish_arm_encrypt_block +.globl _gcry_twofish_arm_encrypt_block .type _gcry_twofish_arm_encrypt_block,%function; _gcry_twofish_arm_encrypt_block: @@ -308,7 +308,7 @@ _gcry_twofish_arm_encrypt_block: .size _gcry_twofish_arm_encrypt_block,.-_gcry_twofish_arm_encrypt_block; .align 3 -.global _gcry_twofish_arm_decrypt_block +.globl _gcry_twofish_arm_decrypt_block .type _gcry_twofish_arm_decrypt_block,%function; _gcry_twofish_arm_decrypt_block: From cvs at cvs.gnupg.org Mon Oct 28 10:29:14 2013 From: cvs at cvs.gnupg.org (by Jussi Kivilinna) Date: Mon, 28 Oct 2013 10:29:14 +0100 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-339-gebc8abf Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via ebc8abfcb09d6106fcfce40f240a513e276f46e9 (commit) from 51f1beab3d1e879942a95f58b08de7dbcce75dce (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. 
- Log -----------------------------------------------------------------
commit ebc8abfcb09d6106fcfce40f240a513e276f46e9
Author: Jussi Kivilinna
Date:   Mon Oct 28 08:54:59 2013 +0200

    Change .global to .globl in assembly files

    * cipher/blowfish-arm.S: Change '.global' to '.globl'.
    * cipher/camellia-aesni-avx-amd64.S: Ditto.
    * cipher/camellia-aesni-avx2-amd64.S: Ditto.
    * cipher/camellia-arm.S: Ditto.
    * cipher/cast5-amd64.S: Ditto.
    * cipher/rijndael-amd64.S: Ditto.
    * cipher/rijndael-arm.S: Ditto.
    * cipher/serpent-avx2-amd64.S: Ditto.
    * cipher/serpent-sse2-amd64.S: Ditto.
    * cipher/twofish-amd64.S: Ditto.
    * cipher/twofish-arm.S: Ditto.
    --
    The .global keyword is used only in newer versions of GAS, so change
    these to older .globl for better portability.

    Signed-off-by: Jussi Kivilinna

-----------------------------------------------------------------------

Summary of changes:
 cipher/blowfish-arm.S              |    4 ++--
 cipher/camellia-aesni-avx-amd64.S  |    6 +++---
 cipher/camellia-aesni-avx2-amd64.S |    6 +++---
 cipher/camellia-arm.S              |    4 ++--
 cipher/cast5-amd64.S               |   10 +++++-----
 cipher/rijndael-amd64.S            |    4 ++--
 cipher/rijndael-arm.S              |    4 ++--
 cipher/serpent-avx2-amd64.S        |    6 +++---
 cipher/serpent-sse2-amd64.S        |    6 +++---
 cipher/twofish-amd64.S             |   10 +++++-----
 cipher/twofish-arm.S               |    4 ++--
 11 files changed, 32 insertions(+), 32 deletions(-)

hooks/post-receive
--
The GNU crypto library
http://git.gnupg.org

_______________________________________________
Gnupg-commits mailing list
Gnupg-commits at gnupg.org
http://lists.gnupg.org/mailman/listinfo/gnupg-commits

From jussi.kivilinna at
iki.fi Mon Oct 28 10:58:43 2013
From: jussi.kivilinna at iki.fi (Jussi Kivilinna)
Date: Mon, 28 Oct 2013 11:58:43 +0200
Subject: [PATCH] Add new benchmarking utility, bench-slope
Message-ID: <20131028095843.14876.6831.stgit@localhost6.localdomain6>

* tests/Makefile.am (TESTS): Add 'bench-slope'.
* tests/bench-slope.c: New.
--

Bench-slope is a new benchmarking tool for libgcrypt for obtaining the overhead-free cycles/byte speed of cipher and hash algorithms. The tool measures the time each operation (hash/encrypt/decrypt/authenticate) takes for buffer sizes from ~0 kB to ~4 kB and calculates the slope through these data points. The default output is given as nanosecs/byte and mebibytes/sec. If the user provides the CPU speed, the tool also outputs a cycles/byte result (CPU GHz * ns/B = c/B).

Output without CPU speed (on ARM Cortex-A8):

$ tests/bench-slope hash
Hash:
               | nanosecs/byte   mebibytes/sec   cycles/byte
 MD5           |     7.35 ns/B     129.7 MiB/s        - c/B
 SHA1          |    12.30 ns/B     77.53 MiB/s        - c/B
 RIPEMD160     |    15.96 ns/B     59.77 MiB/s        - c/B
 TIGER192      |    55.55 ns/B     17.17 MiB/s        - c/B
 SHA256        |    24.38 ns/B     39.12 MiB/s        - c/B
 SHA384        |    34.24 ns/B     27.86 MiB/s        - c/B
 SHA512        |    34.19 ns/B     27.90 MiB/s        - c/B
 SHA224        |    24.38 ns/B     39.12 MiB/s        - c/B
 MD4           |     5.68 ns/B     168.0 MiB/s        - c/B
 CRC32         |     9.26 ns/B     103.0 MiB/s        - c/B
 CRC32RFC1510  |     9.20 ns/B     103.6 MiB/s        - c/B
 CRC24RFC2440  |    87.31 ns/B     10.92 MiB/s        - c/B
 WHIRLPOOL     |    253.3 ns/B      3.77 MiB/s        - c/B
 TIGER         |    55.55 ns/B     17.17 MiB/s        - c/B
 TIGER2        |    55.55 ns/B     17.17 MiB/s        - c/B
 GOSTR3411_94  |    212.0 ns/B      4.50 MiB/s        - c/B
 STRIBOG256    |    630.1 ns/B      1.51 MiB/s        - c/B
 STRIBOG512    |    630.1 ns/B      1.51 MiB/s        - c/B
               =

With CPU speed (on Intel i5-4570, 3.2 GHz with turbo-boost disabled):

$ tests/bench-slope --cpu-mhz 3201 cipher arcfour blowfish aes
Cipher:
 ARCFOUR       | nanosecs/byte   mebibytes/sec   cycles/byte
    STREAM enc |     2.43 ns/B     392.1 MiB/s      7.79 c/B
    STREAM dec |     2.44 ns/B     390.2 MiB/s      7.82 c/B
               =
 BLOWFISH      | nanosecs/byte   mebibytes/sec   cycles/byte
       ECB enc |     7.62 ns/B     125.2 MiB/s     24.38 c/B
       ECB dec |     7.63 ns/B     125.0 MiB/s     24.43 c/B
       CBC enc |     9.18 ns/B     103.9 MiB/s     29.38 c/B
       CBC dec |     2.60 ns/B     366.2 MiB/s      8.34 c/B
       CFB enc |     9.17 ns/B     104.0 MiB/s     29.35 c/B
       CFB dec |     2.66 ns/B     358.1 MiB/s      8.53 c/B
       OFB enc |     8.97 ns/B     106.3 MiB/s     28.72 c/B
       OFB dec |     8.97 ns/B     106.3 MiB/s     28.71 c/B
       CTR enc |     2.60 ns/B     366.5 MiB/s      8.33 c/B
       CTR dec |     2.60 ns/B     367.1 MiB/s      8.32 c/B
               =
 AES           | nanosecs/byte   mebibytes/sec   cycles/byte
       ECB enc |    0.439 ns/B    2173.0 MiB/s      1.40 c/B
       ECB dec |    0.489 ns/B    1949.5 MiB/s      1.57 c/B
       CBC enc |     1.64 ns/B     580.8 MiB/s      5.26 c/B
       CBC dec |    0.219 ns/B    4357.6 MiB/s     0.701 c/B
       CFB enc |     1.53 ns/B     623.6 MiB/s      4.90 c/B
       CFB dec |    0.219 ns/B    4350.5 MiB/s     0.702 c/B
       OFB enc |     1.51 ns/B     629.9 MiB/s      4.85 c/B
       OFB dec |     1.51 ns/B     629.9 MiB/s      4.85 c/B
       CTR enc |    0.288 ns/B    3308.5 MiB/s     0.923 c/B
       CTR dec |    0.288 ns/B    3316.9 MiB/s     0.920 c/B
       CCM enc |     1.93 ns/B     493.8 MiB/s      6.18 c/B
       CCM dec |     1.93 ns/B     494.0 MiB/s      6.18 c/B
      CCM auth |     1.64 ns/B     580.1 MiB/s      5.26 c/B
               =

Note: It's highly recommended to disable turbo-boost and dynamic CPU frequency features when making this kind of measurement, to reduce variance.

Note: The results are the maximum performance for each operation; the actual speed in an application depends on various factors, such as buffer sizes, cache usage, etc.

Signed-off-by: Jussi Kivilinna
---
 tests/Makefile.am   |    4
 tests/bench-slope.c | 1154 +++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 1156 insertions(+), 2 deletions(-)
 create mode 100644 tests/bench-slope.c

diff --git a/tests/Makefile.am b/tests/Makefile.am
index ac84e75..c9ba5f4 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -24,8 +24,8 @@ TESTS = version mpitests tsexp t-convert \
 	fips186-dsa aeswrap pkcs1v2 random dsa-rfc6979 t-ed25519

-# The last test to run.
-TESTS += benchmark
+# The last tests to run.
+TESTS += benchmark bench-slope # Need to include ../src in addition to top_srcdir because gcrypt.h is diff --git a/tests/bench-slope.c b/tests/bench-slope.c new file mode 100644 index 0000000..b706d37 --- /dev/null +++ b/tests/bench-slope.c @@ -0,0 +1,1154 @@ +/* bench-slope.c - for libgcrypt + * Copyright ? 2013 Jussi Kivilinna + * + * This file is part of Libgcrypt. + * + * Libgcrypt is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser general Public License as + * published by the Free Software Foundation; either version 2.1 of + * the License, or (at your option) any later version. + * + * Libgcrypt is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this program; if not, see . + */ + +#ifdef HAVE_CONFIG_H +#include +#endif +#include +#include +#include +#include +#include + +#ifdef _GCRYPT_IN_LIBGCRYPT +#include "../src/gcrypt-int.h" +#include "../compat/libcompat.h" +#else +#include +#endif + +#define PGM "bench-slope" + +static int verbose; + + +/* CPU Ghz value provided by user, allows constructing cycles/byte and other + results. */ +static double cpu_ghz = -1; + + + +/*************************************** Default parameters for measurements. */ + +/* Start at small buffer size, to get reasonably timer calibration for fast + * implementations (AES-NI etc). Sixteen selected to support the largest block + * size of current cipher block. */ +#define BUF_START_SIZE 16 + +/* From ~0 to ~4kbytes give comparable results with results from academia + * (SUPERCOP). */ +#define BUF_END_SIZE (BUF_START_SIZE + 4096) + +/* With 128 byte steps, we get (4096)/128 = 32 data points. 
*/ +#define BUF_STEP_SIZE 128 + +/* Number of repeated measurements at each data point. The median of these + * measurements is selected as the data point for further analysis. */ +#define NUM_MEASUREMENT_REPETITIONS 32 + +/**************************************************** High-resolution timers. */ + +/* This benchmarking module needs a high-resolution timer. */ +#undef NO_GET_NSEC_TIME +#if defined(_WIN32) +struct nsec_time +{ + LARGE_INTEGER perf_count; +}; + +static void +get_nsec_time (struct nsec_time *t) +{ + BOOL ok; + + ok = QueryPerformanceCounter (&t->perf_count); + assert (ok); +} + +static double +get_time_nsec_diff (struct nsec_time *start, struct nsec_time *end) +{ + static double nsecs_per_count = 0.0; + double nsecs; + + if (nsecs_per_count == 0.0) + { + LARGE_INTEGER perf_freq; + BOOL ok; + + /* Get counts per second. */ + ok = QueryPerformanceFrequency (&perf_freq); + assert (ok); + + nsecs_per_count = 1.0 / perf_freq.QuadPart; + nsecs_per_count *= 1000000.0 * 1000.0; /* sec => nsec */ + + assert (nsecs_per_count > 0.0); + } + + nsecs = end->perf_count.QuadPart - start->perf_count.QuadPart; /* counts */ + nsecs *= nsecs_per_count; /* counts * (nsecs / count) => nsecs */ + + return nsecs; +} +#elif defined(HAVE_CLOCK_GETTIME) +struct nsec_time +{ + struct timespec ts; +}; + +static void +get_nsec_time (struct nsec_time *t) +{ + int err; + + err = clock_gettime (CLOCK_REALTIME, &t->ts); + assert (err == 0); +} + +static double +get_time_nsec_diff (struct nsec_time *start, struct nsec_time *end) +{ + double nsecs; + + nsecs = end->ts.tv_sec - start->ts.tv_sec; + nsecs *= 1000000.0 * 1000.0; /* sec => nsec */ + + /* This way we don't have to care if tv_nsec is unsigned or signed.
*/ + if (end->ts.tv_nsec >= start->ts.tv_nsec) + nsecs += end->ts.tv_nsec - start->ts.tv_nsec; + else + nsecs -= start->ts.tv_nsec - end->ts.tv_nsec; + + return nsecs; +} +#elif defined(HAVE_GETTIMEOFDAY) +struct nsec_time +{ + struct timeval tv; +}; + +static void +get_nsec_time (struct nsec_time *t) +{ + int err; + + err = gettimeofday (&t->tv, NULL); + assert (err == 0); +} + +static double +get_time_nsec_diff (struct nsec_time *start, struct nsec_time *end) +{ + double nsecs; + + nsecs = end->tv.tv_sec - start->tv.tv_sec; + nsecs *= 1000000; /* sec => ?sec */ + + /* This way we don't have to care if tv_usec unsigned or signed. */ + if (end->tv.tv_usec >= start->tv.tv_usec) + nsecs += end->tv.tv_usec - start->tv.tv_usec; + else + nsecs -= start->tv.tv_usec - end->tv.tv_usec; + + nsecs *= 1000; /* ?sec => nsec */ + + return nsecs; +} +#else +#define NO_GET_NSEC_TIME 1 +#endif + + +/* If no high resolution timer found, provide dummy bench-slope. */ +#ifdef NO_GET_NSEC_TIME + + +int +main (void) +{ + /* do nothing */ + return 0; +} + + +#else /* !NO_GET_NSEC_TIME */ + + +/********************************************** Slope benchmarking framework. 
*/ + +struct bench_obj +{ + const struct bench_ops *ops; + + unsigned int num_measure_repetitions; + unsigned int min_bufsize; + unsigned int max_bufsize; + unsigned int step_size; + + void *priv; +}; + +typedef int (*const bench_initialize_t) (struct bench_obj * obj); +typedef void (*const bench_finalize_t) (struct bench_obj * obj); +typedef void (*const bench_do_run_t) (struct bench_obj * obj, void *buffer, + size_t buflen); + +struct bench_ops +{ + bench_initialize_t initialize; + bench_finalize_t finalize; + bench_do_run_t do_run; +}; + + +double +get_slope (double (*const get_x) (unsigned int idx, void *priv), + void *get_x_priv, double y_points[], unsigned int npoints, + double *overhead) +{ + double sumx, sumy, sumx2, sumy2, sumxy; + unsigned int i; + double b, a; + + sumx = sumy = sumx2 = sumy2 = sumxy = 0; + + for (i = 0; i < npoints; i++) + { + double x, y; + + x = get_x (i, get_x_priv); /* bytes */ + y = y_points[i]; /* nsecs */ + + sumx += x; + sumy += y; + sumx2 += x * x; + //sumy2 += y * y; + sumxy += x * y; + } + + b = (npoints * sumxy - sumx * sumy) / (npoints * sumx2 - sumx * sumx); + a = (sumy - b * sumx) / npoints; + + if (overhead) + *overhead = a; /* nsecs */ + + return b; /* nsecs per byte */ +} + + +double +get_bench_obj_point_x (unsigned int idx, void *priv) +{ + struct bench_obj *obj = priv; + return (double) (obj->min_bufsize + (idx * obj->step_size)); +} + + +unsigned int +get_num_measurements (struct bench_obj *obj) +{ + unsigned int buf_range = obj->max_bufsize - obj->min_bufsize; + unsigned int num = buf_range / obj->step_size + 1; + + while (obj->min_bufsize + (num * obj->step_size) > obj->max_bufsize) + num--; + + return num + 1; +} + + +static int +double_cmp (const void *__a, const void *__b) +{ + const double *a, *b; + + a = __a; + b = __b; + + if (*a > *b) + return 1; + if (*a < *b) + return -1; + return 0; +} + + +double +do_bench_obj_measurement (struct bench_obj *obj, void *buffer, size_t buflen, + unsigned int 
loop_iterations) +{ + const unsigned int num_repetitions = obj->num_measure_repetitions; + const bench_do_run_t do_run = obj->ops->do_run; + double measurement_raw[num_repetitions]; + struct nsec_time start, end; + unsigned int rep, loop; + double res; + + if (num_repetitions < 1 || loop_iterations < 1) + return 0.0; + + for (rep = 0; rep < num_repetitions; rep++) + { + get_nsec_time (&start); + + for (loop = 0; loop < loop_iterations; loop++) + do_run (obj, buffer, buflen); + + get_nsec_time (&end); + + measurement_raw[rep] = get_time_nsec_diff (&start, &end); + } + + /* Return median of repeated measurements. */ + qsort (measurement_raw, num_repetitions, sizeof (measurement_raw[0]), + double_cmp); + + if (num_repetitions % 2 == 1) + return measurement_raw[num_repetitions / 2]; + + res = measurement_raw[num_repetitions / 2] + + measurement_raw[num_repetitions / 2 - 1]; + return res / 2; +} + + +unsigned int +adjust_loop_iterations_to_timer_accuracy (struct bench_obj *obj, void *buffer) +{ + const double increase_thres = 3.0; + double tmp, nsecs; + unsigned int loop_iterations; + unsigned int test_bufsize; + + test_bufsize = obj->min_bufsize; + if (test_bufsize == 0) + test_bufsize += obj->step_size; + + loop_iterations = 0; + do + { + /* Increase loop iterations until we get other results than zero. */ + nsecs = + do_bench_obj_measurement (obj, buffer, test_bufsize, + ++loop_iterations); + } + while (nsecs < 1.0 - 0.1); + do + { + /* Increase loop iterations until we get reasonable increase for elapsed time. */ + tmp = + do_bench_obj_measurement (obj, buffer, test_bufsize, + ++loop_iterations); + } + while (tmp < nsecs * (increase_thres - 0.1)); + + return loop_iterations; +} + + +/* Benchmark and return linear regression slope in nanoseconds per byte. 
*/ +double +do_slope_benchmark (struct bench_obj *obj) +{ + unsigned int num_measurements; + double *measurements = NULL; + double slope, overhead; + unsigned int loop_iterations, midx, i; + unsigned char *real_buffer = NULL; + unsigned char *buffer; + size_t cur_bufsize; + int err; + + err = obj->ops->initialize (obj); + if (err < 0) + return -1; + + num_measurements = get_num_measurements (obj); + measurements = calloc (num_measurements, sizeof (*measurements)); + if (!measurements) + goto err_free; + + if (num_measurements < 1 || obj->num_measure_repetitions < 1 || + obj->max_bufsize < 1 || obj->min_bufsize > obj->max_bufsize) + goto err_free; + + real_buffer = malloc (obj->max_bufsize + 128); + if (!real_buffer) + goto err_free; + /* Get aligned buffer */ + buffer = real_buffer; + buffer += 128 - ((real_buffer - (unsigned char *) 0) & (128 - 1)); + + for (i = 0; i < obj->max_bufsize; i++) + buffer[i] = 0x55 ^ (-i); + + /* Adjust number of loop iterations up to timer accuracy. */ + loop_iterations = adjust_loop_iterations_to_timer_accuracy (obj, buffer); + + /* Perform measurements */ + for (midx = 0, cur_bufsize = obj->min_bufsize; + cur_bufsize <= obj->max_bufsize; cur_bufsize += obj->step_size, midx++) + { + measurements[midx] = + do_bench_obj_measurement (obj, buffer, cur_bufsize, loop_iterations); + measurements[midx] /= loop_iterations; + } + + assert (midx == num_measurements); + + slope = + get_slope (&get_bench_obj_point_x, obj, measurements, num_measurements, + &overhead); + + free (real_buffer); + obj->ops->finalize (obj); + + return slope; + +err_free: + if (measurements) + free (measurements); + if (real_buffer) + free (real_buffer); + obj->ops->finalize (obj); + + return -1; +} + + +/********************************************************** Printing results. 
*/ + +static void +double_to_str (char *out, size_t outlen, double value) +{ + const char *fmt; + + if (value < 1.0) + fmt = "%.3f"; + else if (value < 100.0) + fmt = "%.2f"; + else + fmt = "%.1f"; + + snprintf (out, outlen, fmt, value); +} + +static void +bench_print_result (double nsecs_per_byte) +{ + double cycles_per_byte, mbytes_per_sec; + char nsecpbyte_buf[16]; + char mbpsec_buf[16]; + char cpbyte_buf[16]; + + strcpy (cpbyte_buf, "-"); + + double_to_str (nsecpbyte_buf, sizeof (nsecpbyte_buf), nsecs_per_byte); + + /* If user didn't provide CPU speed, we cannot show cycles/byte results. */ + if (cpu_ghz > 0.0) + { + cycles_per_byte = nsecs_per_byte * cpu_ghz; + double_to_str (cpbyte_buf, sizeof (cpbyte_buf), cycles_per_byte); + } + + mbytes_per_sec = + (1000.0 * 1000.0 * 1000.0) / (nsecs_per_byte * 1024 * 1024); + double_to_str (mbpsec_buf, sizeof (mbpsec_buf), mbytes_per_sec); + + strncat (nsecpbyte_buf, " ns/B", sizeof(nsecpbyte_buf) - 1); + strncat (mbpsec_buf, " MiB/s", sizeof(mbpsec_buf) - 1); + strncat (cpbyte_buf, " c/B", sizeof(cpbyte_buf) - 1); + + printf ("%14s %15s %13s\n", nsecpbyte_buf, mbpsec_buf, cpbyte_buf); +} + +static void +bench_print_header (const char *algo_name) +{ + printf (" %-14s | ", algo_name); + printf ("%14s %15s %13s\n", "nanosecs/byte", "mebibytes/sec", + "cycles/byte"); +} + +static void +bench_print_footer (void) +{ + printf (" %-14s =\n", ""); +} + + +/********************************************************* Cipher benchmarks. 
*/ + +struct bench_cipher_mode +{ + int mode; + const char *name; + struct bench_ops *ops; + + int algo; +}; + + +static int +bench_encrypt_init (struct bench_obj *obj) +{ + struct bench_cipher_mode *mode = obj->priv; + gcry_cipher_hd_t hd; + int err, keylen; + + obj->min_bufsize = BUF_START_SIZE; + obj->max_bufsize = BUF_END_SIZE; + obj->step_size = BUF_STEP_SIZE; + obj->num_measure_repetitions = NUM_MEASUREMENT_REPETITIONS; + + err = gcry_cipher_open (&hd, mode->algo, mode->mode, 0); + if (err) + { + fprintf (stderr, PGM ": error opening cipher `%s'\n", + gcry_cipher_algo_name (mode->algo)); + exit (1); + } + + keylen = gcry_cipher_get_algo_keylen (mode->algo); + if (keylen) + { + char key[keylen]; + int i; + + for (i = 0; i < keylen; i++) + key[i] = 0x33 ^ (11 - i); + + err = gcry_cipher_setkey (hd, key, keylen); + if (err) + { + fprintf (stderr, "gcry_cipher_setkey failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } + } + else + { + fprintf (stderr, PGM ": failed to get key length for algorithm `%s'\n", + gcry_cipher_algo_name (mode->algo)); + gcry_cipher_close (hd); + exit (1); + } + + obj->priv = hd; + + return 0; +} + +static void +bench_encrypt_free (struct bench_obj *obj) +{ + gcry_cipher_hd_t hd = obj->priv; + + gcry_cipher_close (hd); +} + +static void +bench_encrypt_do_bench (struct bench_obj *obj, void *buf, size_t buflen) +{ + gcry_cipher_hd_t hd = obj->priv; + int err; + + err = gcry_cipher_encrypt (hd, buf, buflen, buf, buflen); + if (err) + { + fprintf (stderr, "gcry_cipher_encrypt failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } +} + +static void +bench_decrypt_do_bench (struct bench_obj *obj, void *buf, size_t buflen) +{ + gcry_cipher_hd_t hd = obj->priv; + int err; + + err = gcry_cipher_decrypt (hd, buf, buflen, buf, buflen); + if (err) + { + fprintf (stderr, "gcry_cipher_encrypt failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } +} + +static struct bench_ops 
encrypt_ops = { + &bench_encrypt_init, + &bench_encrypt_free, + &bench_encrypt_do_bench +}; + +static struct bench_ops decrypt_ops = { + &bench_encrypt_init, + &bench_encrypt_free, + &bench_decrypt_do_bench +}; + + + +static void +bench_ccm_encrypt_do_bench (struct bench_obj *obj, void *buf, size_t buflen) +{ + gcry_cipher_hd_t hd = obj->priv; + int err; + char tag[8]; + char nonce[11] = { 0x80, 0x01, }; + size_t params[3]; + + gcry_cipher_setiv (hd, nonce, sizeof (nonce)); + + /* Set CCM lengths */ + params[0] = buflen; + params[1] = 0; /*aadlen */ + params[2] = sizeof (tag); + err = + gcry_cipher_ctl (hd, GCRYCTL_SET_CCM_LENGTHS, params, sizeof (params)); + if (err) + { + fprintf (stderr, "gcry_cipher_ctl failed: %s\n", gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } + + err = gcry_cipher_encrypt (hd, buf, buflen, buf, buflen); + if (err) + { + fprintf (stderr, "gcry_cipher_encrypt failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } + + err = gcry_cipher_gettag (hd, tag, sizeof (tag)); + if (err) + { + fprintf (stderr, "gcry_cipher_gettag failed: %s\n", gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } +} + +static void +bench_ccm_decrypt_do_bench (struct bench_obj *obj, void *buf, size_t buflen) +{ + gcry_cipher_hd_t hd = obj->priv; + int err; + char tag[8] = { 0, }; + char nonce[11] = { 0x80, 0x01, }; + size_t params[3]; + + gcry_cipher_setiv (hd, nonce, sizeof (nonce)); + + /* Set CCM lengths */ + params[0] = buflen; + params[1] = 0; /*aadlen */ + params[2] = sizeof (tag); + err = + gcry_cipher_ctl (hd, GCRYCTL_SET_CCM_LENGTHS, params, sizeof (params)); + if (err) + { + fprintf (stderr, "gcry_cipher_ctl failed: %s\n", gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } + + err = gcry_cipher_decrypt (hd, buf, buflen, buf, buflen); + if (err) + { + fprintf (stderr, "gcry_cipher_encrypt failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } + + err = gcry_cipher_checktag 
(hd, tag, sizeof (tag)); + if (gpg_err_code (err) == GPG_ERR_CHECKSUM) + err = gpg_error (GPG_ERR_NO_ERROR); + if (err) + { + fprintf (stderr, "gcry_cipher_gettag failed: %s\n", gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } +} + +static void +bench_ccm_authenticate_do_bench (struct bench_obj *obj, void *buf, + size_t buflen) +{ + gcry_cipher_hd_t hd = obj->priv; + int err; + char tag[8] = { 0, }; + char nonce[11] = { 0x80, 0x01, }; + size_t params[3]; + char data = 0xff; + + gcry_cipher_setiv (hd, nonce, sizeof (nonce)); + + /* Set CCM lengths */ + params[0] = sizeof (data); /*datalen */ + params[1] = buflen; /*aadlen */ + params[2] = sizeof (tag); + err = + gcry_cipher_ctl (hd, GCRYCTL_SET_CCM_LENGTHS, params, sizeof (params)); + if (err) + { + fprintf (stderr, "gcry_cipher_ctl failed: %s\n", gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } + + err = gcry_cipher_authenticate (hd, buf, buflen); + if (err) + { + fprintf (stderr, "gcry_cipher_authenticate failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } + + err = gcry_cipher_encrypt (hd, &data, sizeof (data), &data, sizeof (data)); + if (err) + { + fprintf (stderr, "gcry_cipher_encrypt failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } + + err = gcry_cipher_gettag (hd, tag, sizeof (tag)); + if (err) + { + fprintf (stderr, "gcry_cipher_gettag failed: %s\n", gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } +} + +static struct bench_ops ccm_encrypt_ops = { + &bench_encrypt_init, + &bench_encrypt_free, + &bench_ccm_encrypt_do_bench +}; + +static struct bench_ops ccm_decrypt_ops = { + &bench_encrypt_init, + &bench_encrypt_free, + &bench_ccm_decrypt_do_bench +}; + +static struct bench_ops ccm_authenticate_ops = { + &bench_encrypt_init, + &bench_encrypt_free, + &bench_ccm_authenticate_do_bench +}; + + +static struct bench_cipher_mode cipher_modes[] = { + {GCRY_CIPHER_MODE_ECB, "ECB enc", &encrypt_ops}, + 
{GCRY_CIPHER_MODE_ECB, "ECB dec", &decrypt_ops}, + {GCRY_CIPHER_MODE_CBC, "CBC enc", &encrypt_ops}, + {GCRY_CIPHER_MODE_CBC, "CBC dec", &decrypt_ops}, + {GCRY_CIPHER_MODE_CFB, "CFB enc", &encrypt_ops}, + {GCRY_CIPHER_MODE_CFB, "CFB dec", &decrypt_ops}, + {GCRY_CIPHER_MODE_OFB, "OFB enc", &encrypt_ops}, + {GCRY_CIPHER_MODE_OFB, "OFB dec", &decrypt_ops}, + {GCRY_CIPHER_MODE_CTR, "CTR enc", &encrypt_ops}, + {GCRY_CIPHER_MODE_CTR, "CTR dec", &decrypt_ops}, + {GCRY_CIPHER_MODE_CCM, "CCM enc", &ccm_encrypt_ops}, + {GCRY_CIPHER_MODE_CCM, "CCM dec", &ccm_decrypt_ops}, + {GCRY_CIPHER_MODE_CCM, "CCM auth", &ccm_authenticate_ops}, + {0}, +}; + + +static void +cipher_bench_one (int algo, struct bench_cipher_mode *pmode) +{ + struct bench_cipher_mode mode = *pmode; + struct bench_obj obj = { 0 }; + double result; + unsigned int blklen; + + mode.algo = algo; + + /* Check if this mode is ok */ + blklen = gcry_cipher_get_algo_blklen (algo); + if (!blklen) + return; + + /* Stream cipher? Only test with ECB. */ + if (blklen == 1 && mode.mode != GCRY_CIPHER_MODE_ECB) + return; + if (blklen == 1 && mode.mode == GCRY_CIPHER_MODE_ECB) + { + mode.mode = GCRY_CIPHER_MODE_STREAM; + mode.name = mode.ops == &encrypt_ops ? 
"STREAM enc" : "STREAM dec"; + } + + /* CCM has restrictions for block-size */ + if (mode.mode == GCRY_CIPHER_MODE_CCM && blklen != GCRY_CCM_BLOCK_LEN) + return; + + printf (" %14s | ", mode.name); + fflush (stdout); + + obj.ops = mode.ops; + obj.priv = &mode; + + result = do_slope_benchmark (&obj); + + bench_print_result (result); +} + + +static void +__cipher_bench (int algo) +{ + const char *algoname; + int i; + + algoname = gcry_cipher_algo_name (algo); + + bench_print_header (algoname); + + for (i = 0; cipher_modes[i].mode; i++) + cipher_bench_one (algo, &cipher_modes[i]); + + bench_print_footer (); +} + + +void +cipher_bench (char **argv, int argc) +{ + int i, algo; + + printf ("Cipher:\n"); + + if (argv && argc) + { + for (i = 0; i < argc; i++) + { + algo = gcry_cipher_map_name (argv[i]); + if (algo) + __cipher_bench (algo); + } + } + else + { + for (i = 1; i < 400; i++) + if (!gcry_cipher_test_algo (i)) + __cipher_bench (i); + } +} + + +/*********************************************************** Hash benchmarks. 
*/ + +struct bench_hash_mode +{ + const char *name; + struct bench_ops *ops; + + int algo; +}; + + +static int +bench_hash_init (struct bench_obj *obj) +{ + struct bench_hash_mode *mode = obj->priv; + gcry_md_hd_t hd; + int err; + + obj->min_bufsize = BUF_START_SIZE; + obj->max_bufsize = BUF_END_SIZE; + obj->step_size = BUF_STEP_SIZE; + obj->num_measure_repetitions = NUM_MEASUREMENT_REPETITIONS; + + err = gcry_md_open (&hd, mode->algo, 0); + if (err) + { + fprintf (stderr, PGM ": error opening hash `%s'\n", + gcry_md_algo_name (mode->algo)); + exit (1); + } + + obj->priv = hd; + + return 0; +} + +static void +bench_hash_free (struct bench_obj *obj) +{ + gcry_md_hd_t hd = obj->priv; + + gcry_md_close (hd); +} + +static void +bench_hash_do_bench (struct bench_obj *obj, void *buf, size_t buflen) +{ + gcry_md_hd_t hd = obj->priv; + + gcry_md_write (hd, buf, buflen); + gcry_md_final (hd); +} + +static struct bench_ops hash_ops = { + &bench_hash_init, + &bench_hash_free, + &bench_hash_do_bench +}; + + +static struct bench_hash_mode hash_modes[] = { + {"", &hash_ops}, + {0}, +}; + + +static void +hash_bench_one (int algo, struct bench_hash_mode *pmode) +{ + struct bench_hash_mode mode = *pmode; + struct bench_obj obj = { 0 }; + double result; + + mode.algo = algo; + + if (mode.name[0] == '\0') + printf (" %-14s | ", gcry_md_algo_name (algo)); + else + printf (" %14s | ", mode.name); + fflush (stdout); + + obj.ops = mode.ops; + obj.priv = &mode; + + result = do_slope_benchmark (&obj); + + bench_print_result (result); +} + +static void +__hash_bench (int algo) +{ + int i; + + for (i = 0; hash_modes[i].name; i++) + hash_bench_one (algo, &hash_modes[i]); +} + +void +hash_bench (char **argv, int argc) +{ + int i, algo; + + printf ("Hash:\n"); + + bench_print_header (""); + + if (argv && argc) + { + for (i = 0; i < argc; i++) + { + algo = gcry_md_map_name (argv[i]); + if (algo) + __hash_bench (algo); + } + } + else + { + for (i = 1; i < 400; i++) + if (!gcry_md_test_algo (i)) + 
__hash_bench (i); + } + + bench_print_footer (); +} + + +/************************************************************** Main program. */ + +void +print_help (void) +{ + static const char *help_lines[] = { + "usage: bench-slope [options] [hash|cipher [algonames]]", + "", + " options:", + " --cpu-mhz Set CPU speed for calculating cycles per bytes", + " results.", + " --disable-hwf Disable hardware acceleration feature(s) for", + " benchmarking.", + NULL + }; + const char **line; + + for (line = help_lines; *line; line++) + fprintf (stdout, "%s\n", *line); +} + + +/* Warm up CPU. */ +static void +warm_up_cpu (void) +{ + struct nsec_time start, end; + + get_nsec_time (&start); + do + { + get_nsec_time (&end); + } + while (get_time_nsec_diff (&start, &end) < 1000.0 * 1000.0 * 1000.0); +} + + +int +main (int argc, char **argv) +{ + int last_argc = -1; + int debug = 0; + + if (argc) + { + argc--; + argv++; + } + + while (argc && last_argc != argc) + { + last_argc = argc; + + if (!strcmp (*argv, "--")) + { + argc--; + argv++; + break; + } + else if (!strcmp (*argv, "--help")) + { + print_help (); + exit (0); + } + else if (!strcmp (*argv, "--verbose")) + { + verbose++; + argc--; + argv++; + } + else if (!strcmp (*argv, "--debug")) + { + verbose += 2; + debug++; + argc--; + argv++; + } + else if (!strcmp (*argv, "--disable-hwf")) + { + argc--; + argv++; + if (argc) + { + if (gcry_control (GCRYCTL_DISABLE_HWF, *argv, NULL)) + fprintf (stderr, + PGM + ": unknown hardware feature `%s' - option ignored\n", + *argv); + argc--; + argv++; + } + } + else if (!strcmp (*argv, "--cpu-mhz")) + { + argc--; + argv++; + if (argc) + { + cpu_ghz = atof (*argv); + cpu_ghz /= 1000; /* Mhz => Ghz */ + + argc--; + argv++; + } + } + } + + gcry_control (GCRYCTL_SET_VERBOSITY, (int) verbose); + + if (!gcry_check_version (GCRYPT_VERSION)) + { + fprintf (stderr, PGM ": version mismatch; pgm=%s, library=%s\n", + GCRYPT_VERSION, gcry_check_version (NULL)); + exit (1); + } + + if (debug) + 
gcry_control (GCRYCTL_SET_DEBUG_FLAGS, 1u, 0); + + gcry_control (GCRYCTL_DISABLE_SECMEM, 0); + gcry_control (GCRYCTL_INITIALIZATION_FINISHED, 0); + gcry_control (GCRYCTL_ENABLE_QUICK_RANDOM, 0); + + if (!argc) + { + warm_up_cpu (); + hash_bench (NULL, 0); + cipher_bench (NULL, 0); + } + else if (!strcmp (*argv, "hash")) + { + argc--; + argv++; + + warm_up_cpu (); + hash_bench ((argc == 0) ? NULL : argv, argc); + } + else if (!strcmp (*argv, "cipher")) + { + argc--; + argv++; + + warm_up_cpu (); + cipher_bench ((argc == 0) ? NULL : argv, argc); + } + else + { + fprintf (stderr, "unknown argument: %s\n", *argv); + print_help (); + } + + return 0; +} + +#endif /* !NO_GET_NSEC_TIME */ From jussi.kivilinna at iki.fi Mon Oct 28 07:49:21 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Mon, 28 Oct 2013 08:49:21 +0200 Subject: [PATCH] Add ARM NEON assembly implementation of Serpent Message-ID: <20131028064921.4185.96066.stgit@localhost6.localdomain6> * cipher/Makefile.am: Add 'serpent-armv7-neon.S'. * cipher/serpent-armv7-neon.S: New. * cipher/serpent.c (USE_NEON): New macro. (serpent_context_t) [USE_NEON]: Add 'use_neon'. [USE_NEON] (_gcry_serpent_neon_ctr_enc, _gcry_serpent_neon_cfb_dec) (_gcry_serpent_neon_cbc_dec): New prototypes. (serpent_setkey_internal) [USE_NEON]: Detect NEON support. (_gcry_serpent_neon_ctr_enc, _gcry_serpent_neon_cfb_dec) (_gcry_serpent_neon_cbc_dec) [USE_NEON]: Use NEON implementations to process eight blocks in parallel. * configure.ac [neonsupport]: Add 'serpent-armv7-neon.lo'. -- Patch adds ARM NEON optimized implementation of Serpent cipher to speed up parallelizable bulk operations. 
Benchmarks on ARM Cortex-A8 (armhf, 1008 MHz):

Old:
 SERPENT128    | nanosecs/byte   mebibytes/sec   cycles/byte
       CBC dec |    43.53 ns/B     21.91 MiB/s     43.88 c/B
       CFB dec |    44.77 ns/B     21.30 MiB/s     45.13 c/B
       CTR enc |    45.21 ns/B     21.10 MiB/s     45.57 c/B
       CTR dec |    45.21 ns/B     21.09 MiB/s     45.57 c/B

New:
 SERPENT128    | nanosecs/byte   mebibytes/sec   cycles/byte
       CBC dec |    26.26 ns/B     36.32 MiB/s     26.47 c/B
       CFB dec |    26.21 ns/B     36.38 MiB/s     26.42 c/B
       CTR enc |    26.20 ns/B     36.40 MiB/s     26.41 c/B
       CTR dec |    26.20 ns/B     36.40 MiB/s     26.41 c/B

Signed-off-by: Jussi Kivilinna
---
 cipher/serpent-armv7-neon.S |  869 +++++++++++++++++++++++++++++++++++++++++++
 cipher/serpent.c            |  122 ++++++
 configure.ac                |    5
 3 files changed, 996 insertions(+)
 create mode 100644 cipher/serpent-armv7-neon.S

diff --git a/cipher/serpent-armv7-neon.S b/cipher/serpent-armv7-neon.S
new file mode 100644
index 0000000..21473e9
--- /dev/null
+++ b/cipher/serpent-armv7-neon.S
@@ -0,0 +1,869 @@
+/* serpent-armv7-neon.S - ARM/NEON assembly implementation of Serpent cipher
+ *
+ * Copyright © 2013 Jussi Kivilinna
+ *
+ * This file is part of Libgcrypt.
+ *
+ * Libgcrypt is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as
+ * published by the Free Software Foundation; either version 2.1 of
+ * the License, or (at your option) any later version.
+ *
+ * Libgcrypt is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */ + +#include + +#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) && \ + defined(HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS) && \ + defined(HAVE_GCC_INLINE_ASM_NEON) + +.text + +.syntax unified +.fpu neon +.arm + +/* ARM registers */ +#define RROUND r0 + +/* NEON vector registers */ +#define RA0 q0 +#define RA1 q1 +#define RA2 q2 +#define RA3 q3 +#define RA4 q4 +#define RB0 q5 +#define RB1 q6 +#define RB2 q7 +#define RB3 q8 +#define RB4 q9 + +#define RT0 q10 +#define RT1 q11 +#define RT2 q12 +#define RT3 q13 + +#define RA0d0 d0 +#define RA0d1 d1 +#define RA1d0 d2 +#define RA1d1 d3 +#define RA2d0 d4 +#define RA2d1 d5 +#define RA3d0 d6 +#define RA3d1 d7 +#define RA4d0 d8 +#define RA4d1 d9 +#define RB0d0 d10 +#define RB0d1 d11 +#define RB1d0 d12 +#define RB1d1 d13 +#define RB2d0 d14 +#define RB2d1 d15 +#define RB3d0 d16 +#define RB3d1 d17 +#define RB4d0 d18 +#define RB4d1 d19 +#define RT0d0 d20 +#define RT0d1 d21 +#define RT1d0 d22 +#define RT1d1 d23 +#define RT2d0 d24 +#define RT2d1 d25 + +/********************************************************************** + helper macros + **********************************************************************/ + +#define transpose_4x4(_q0, _q1, _q2, _q3) \ + vtrn.32 _q0, _q1; \ + vtrn.32 _q2, _q3; \ + vswp _q0##d1, _q2##d0; \ + vswp _q1##d1, _q3##d0; + +/********************************************************************** + 8-way serpent + **********************************************************************/ + +/* + * These are the S-Boxes of Serpent from following research paper. + * + * D. A. Osvik, ?Speeding up Serpent,? in Third AES Candidate Conference, + * (New York, New York, USA), p. 317?329, National Institute of Standards and + * Technology, 2000. 
+ * + * Paper is also available at: http://www.ii.uib.no/~osvik/pub/aes3.pdf + * + */ +#define SBOX0(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + veor a3, a3, a0; veor b3, b3, b0; vmov a4, a1; vmov b4, b1; \ + vand a1, a1, a3; vand b1, b1, b3; veor a4, a4, a2; veor b4, b4, b2; \ + veor a1, a1, a0; veor b1, b1, b0; vorr a0, a0, a3; vorr b0, b0, b3; \ + veor a0, a0, a4; veor b0, b0, b4; veor a4, a4, a3; veor b4, b4, b3; \ + veor a3, a3, a2; veor b3, b3, b2; vorr a2, a2, a1; vorr b2, b2, b1; \ + veor a2, a2, a4; veor b2, b2, b4; vmvn a4, a4; vmvn b4, b4; \ + vorr a4, a4, a1; vorr b4, b4, b1; veor a1, a1, a3; veor b1, b1, b3; \ + veor a1, a1, a4; veor b1, b1, b4; vorr a3, a3, a0; vorr b3, b3, b0; \ + veor a1, a1, a3; veor b1, b1, b3; veor a4, a3; veor b4, b3; + +#define SBOX0_INVERSE(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vmvn a2, a2; vmvn b2, b2; vmov a4, a1; vmov b4, b1; \ + vorr a1, a1, a0; vorr b1, b1, b0; vmvn a4, a4; vmvn b4, b4; \ + veor a1, a1, a2; veor b1, b1, b2; vorr a2, a2, a4; vorr b2, b2, b4; \ + veor a1, a1, a3; veor b1, b1, b3; veor a0, a0, a4; veor b0, b0, b4; \ + veor a2, a2, a0; veor b2, b2, b0; vand a0, a0, a3; vand b0, b0, b3; \ + veor a4, a4, a0; veor b4, b4, b0; vorr a0, a0, a1; vorr b0, b0, b1; \ + veor a0, a0, a2; veor b0, b0, b2; veor a3, a3, a4; veor b3, b3, b4; \ + veor a2, a2, a1; veor b2, b2, b1; veor a3, a3, a0; veor b3, b3, b0; \ + veor a3, a3, a1; veor b3, b3, b1;\ + vand a2, a2, a3; vand b2, b2, b3;\ + veor a4, a2; veor b4, b2; + +#define SBOX1(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vmvn a0, a0; vmvn b0, b0; vmvn a2, a2; vmvn b2, b2; \ + vmov a4, a0; vmov b4, b0; vand a0, a0, a1; vand b0, b0, b1; \ + veor a2, a2, a0; veor b2, b2, b0; vorr a0, a0, a3; vorr b0, b0, b3; \ + veor a3, a3, a2; veor b3, b3, b2; veor a1, a1, a0; veor b1, b1, b0; \ + veor a0, a0, a4; veor b0, b0, b4; vorr a4, a4, a1; vorr b4, b4, b1; \ + veor a1, a1, a3; veor b1, b1, b3; vorr a2, a2, a0; vorr b2, b2, b0; \ + vand a2, a2, a4; vand b2, b2, b4; veor a0, 
a0, a1; veor b0, b0, b1; \ + vand a1, a1, a2; vand b1, b1, b2;\ + veor a1, a1, a0; veor b1, b1, b0; vand a0, a0, a2; vand b0, b0, b2; \ + veor a0, a4; veor b0, b4; + +#define SBOX1_INVERSE(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vmov a4, a1; vmov b4, b1; veor a1, a1, a3; veor b1, b1, b3; \ + vand a3, a3, a1; vand b3, b3, b1; veor a4, a4, a2; veor b4, b4, b2; \ + veor a3, a3, a0; veor b3, b3, b0; vorr a0, a0, a1; vorr b0, b0, b1; \ + veor a2, a2, a3; veor b2, b2, b3; veor a0, a0, a4; veor b0, b0, b4; \ + vorr a0, a0, a2; vorr b0, b0, b2; veor a1, a1, a3; veor b1, b1, b3; \ + veor a0, a0, a1; veor b0, b0, b1; vorr a1, a1, a3; vorr b1, b1, b3; \ + veor a1, a1, a0; veor b1, b1, b0; vmvn a4, a4; vmvn b4, b4; \ + veor a4, a4, a1; veor b4, b4, b1; vorr a1, a1, a0; vorr b1, b1, b0; \ + veor a1, a1, a0; veor b1, b1, b0;\ + vorr a1, a1, a4; vorr b1, b1, b4;\ + veor a3, a1; veor b3, b1; + +#define SBOX2(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vmov a4, a0; vmov b4, b0; vand a0, a0, a2; vand b0, b0, b2; \ + veor a0, a0, a3; veor b0, b0, b3; veor a2, a2, a1; veor b2, b2, b1; \ + veor a2, a2, a0; veor b2, b2, b0; vorr a3, a3, a4; vorr b3, b3, b4; \ + veor a3, a3, a1; veor b3, b3, b1; veor a4, a4, a2; veor b4, b4, b2; \ + vmov a1, a3; vmov b1, b3; vorr a3, a3, a4; vorr b3, b3, b4; \ + veor a3, a3, a0; veor b3, b3, b0; vand a0, a0, a1; vand b0, b0, b1; \ + veor a4, a4, a0; veor b4, b4, b0; veor a1, a1, a3; veor b1, b1, b3; \ + veor a1, a1, a4; veor b1, b1, b4; vmvn a4, a4; vmvn b4, b4; + +#define SBOX2_INVERSE(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + veor a2, a2, a3; veor b2, b2, b3; veor a3, a3, a0; veor b3, b3, b0; \ + vmov a4, a3; vmov b4, b3; vand a3, a3, a2; vand b3, b3, b2; \ + veor a3, a3, a1; veor b3, b3, b1; vorr a1, a1, a2; vorr b1, b1, b2; \ + veor a1, a1, a4; veor b1, b1, b4; vand a4, a4, a3; vand b4, b4, b3; \ + veor a2, a2, a3; veor b2, b2, b3; vand a4, a4, a0; vand b4, b4, b0; \ + veor a4, a4, a2; veor b4, b4, b2; vand a2, a2, a1; vand b2, b2, b1; \ + vorr 
a2, a2, a0; vorr b2, b2, b0; vmvn a3, a3; vmvn b3, b3; \ + veor a2, a2, a3; veor b2, b2, b3; veor a0, a0, a3; veor b0, b0, b3; \ + vand a0, a0, a1; vand b0, b0, b1; veor a3, a3, a4; veor b3, b3, b4; \ + veor a3, a0; veor b3, b0; + +#define SBOX3(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vmov a4, a0; vmov b4, b0; vorr a0, a0, a3; vorr b0, b0, b3; \ + veor a3, a3, a1; veor b3, b3, b1; vand a1, a1, a4; vand b1, b1, b4; \ + veor a4, a4, a2; veor b4, b4, b2; veor a2, a2, a3; veor b2, b2, b3; \ + vand a3, a3, a0; vand b3, b3, b0; vorr a4, a4, a1; vorr b4, b4, b1; \ + veor a3, a3, a4; veor b3, b3, b4; veor a0, a0, a1; veor b0, b0, b1; \ + vand a4, a4, a0; vand b4, b4, b0; veor a1, a1, a3; veor b1, b1, b3; \ + veor a4, a4, a2; veor b4, b4, b2; vorr a1, a1, a0; vorr b1, b1, b0; \ + veor a1, a1, a2; veor b1, b1, b2; veor a0, a0, a3; veor b0, b0, b3; \ + vmov a2, a1; vmov b2, b1; vorr a1, a1, a3; vorr b1, b1, b3; \ + veor a1, a0; veor b1, b0; + +#define SBOX3_INVERSE(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vmov a4, a2; vmov b4, b2; veor a2, a2, a1; veor b2, b2, b1; \ + veor a0, a0, a2; veor b0, b0, b2; vand a4, a4, a2; vand b4, b4, b2; \ + veor a4, a4, a0; veor b4, b4, b0; vand a0, a0, a1; vand b0, b0, b1; \ + veor a1, a1, a3; veor b1, b1, b3; vorr a3, a3, a4; vorr b3, b3, b4; \ + veor a2, a2, a3; veor b2, b2, b3; veor a0, a0, a3; veor b0, b0, b3; \ + veor a1, a1, a4; veor b1, b1, b4; vand a3, a3, a2; vand b3, b3, b2; \ + veor a3, a3, a1; veor b3, b3, b1; veor a1, a1, a0; veor b1, b1, b0; \ + vorr a1, a1, a2; vorr b1, b1, b2; veor a0, a0, a3; veor b0, b0, b3; \ + veor a1, a1, a4; veor b1, b1, b4;\ + veor a0, a1; veor b0, b1; + +#define SBOX4(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + veor a1, a1, a3; veor b1, b1, b3; vmvn a3, a3; vmvn b3, b3; \ + veor a2, a2, a3; veor b2, b2, b3; veor a3, a3, a0; veor b3, b3, b0; \ + vmov a4, a1; vmov b4, b1; vand a1, a1, a3; vand b1, b1, b3; \ + veor a1, a1, a2; veor b1, b1, b2; veor a4, a4, a3; veor b4, b4, b3; \ + veor a0, a0, a4; 
veor b0, b0, b4; vand a2, a2, a4; vand b2, b2, b4; \ + veor a2, a2, a0; veor b2, b2, b0; vand a0, a0, a1; vand b0, b0, b1; \ + veor a3, a3, a0; veor b3, b3, b0; vorr a4, a4, a1; vorr b4, b4, b1; \ + veor a4, a4, a0; veor b4, b4, b0; vorr a0, a0, a3; vorr b0, b0, b3; \ + veor a0, a0, a2; veor b0, b0, b2; vand a2, a2, a3; vand b2, b2, b3; \ + vmvn a0, a0; vmvn b0, b0; veor a4, a2; veor b4, b2; + +#define SBOX4_INVERSE(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vmov a4, a2; vmov b4, b2; vand a2, a2, a3; vand b2, b2, b3; \ + veor a2, a2, a1; veor b2, b2, b1; vorr a1, a1, a3; vorr b1, b1, b3; \ + vand a1, a1, a0; vand b1, b1, b0; veor a4, a4, a2; veor b4, b4, b2; \ + veor a4, a4, a1; veor b4, b4, b1; vand a1, a1, a2; vand b1, b1, b2; \ + vmvn a0, a0; vmvn b0, b0; veor a3, a3, a4; veor b3, b3, b4; \ + veor a1, a1, a3; veor b1, b1, b3; vand a3, a3, a0; vand b3, b3, b0; \ + veor a3, a3, a2; veor b3, b3, b2; veor a0, a0, a1; veor b0, b0, b1; \ + vand a2, a2, a0; vand b2, b2, b0; veor a3, a3, a0; veor b3, b3, b0; \ + veor a2, a2, a4; veor b2, b2, b4;\ + vorr a2, a2, a3; vorr b2, b2, b3; veor a3, a3, a0; veor b3, b3, b0; \ + veor a2, a1; veor b2, b1; + +#define SBOX5(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + veor a0, a0, a1; veor b0, b0, b1; veor a1, a1, a3; veor b1, b1, b3; \ + vmvn a3, a3; vmvn b3, b3; vmov a4, a1; vmov b4, b1; \ + vand a1, a1, a0; vand b1, b1, b0; veor a2, a2, a3; veor b2, b2, b3; \ + veor a1, a1, a2; veor b1, b1, b2; vorr a2, a2, a4; vorr b2, b2, b4; \ + veor a4, a4, a3; veor b4, b4, b3; vand a3, a3, a1; vand b3, b3, b1; \ + veor a3, a3, a0; veor b3, b3, b0; veor a4, a4, a1; veor b4, b4, b1; \ + veor a4, a4, a2; veor b4, b4, b2; veor a2, a2, a0; veor b2, b2, b0; \ + vand a0, a0, a3; vand b0, b0, b3; vmvn a2, a2; vmvn b2, b2; \ + veor a0, a0, a4; veor b0, b0, b4; vorr a4, a4, a3; vorr b4, b4, b3; \ + veor a2, a4; veor b2, b4; + +#define SBOX5_INVERSE(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vmvn a1, a1; vmvn b1, b1; vmov a4, a3; vmov b4, b3; \ + veor 
a2, a2, a1; veor b2, b2, b1; vorr a3, a3, a0; vorr b3, b3, b0; \ + veor a3, a3, a2; veor b3, b3, b2; vorr a2, a2, a1; vorr b2, b2, b1; \ + vand a2, a2, a0; vand b2, b2, b0; veor a4, a4, a3; veor b4, b4, b3; \ + veor a2, a2, a4; veor b2, b2, b4; vorr a4, a4, a0; vorr b4, b4, b0; \ + veor a4, a4, a1; veor b4, b4, b1; vand a1, a1, a2; vand b1, b1, b2; \ + veor a1, a1, a3; veor b1, b1, b3; veor a4, a4, a2; veor b4, b4, b2; \ + vand a3, a3, a4; vand b3, b3, b4; veor a4, a4, a1; veor b4, b4, b1; \ + veor a3, a3, a4; veor b3, b3, b4; vmvn a4, a4; vmvn b4, b4; \ + veor a3, a0; veor b3, b0; + +#define SBOX6(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vmvn a2, a2; vmvn b2, b2; vmov a4, a3; vmov b4, b3; \ + vand a3, a3, a0; vand b3, b3, b0; veor a0, a0, a4; veor b0, b0, b4; \ + veor a3, a3, a2; veor b3, b3, b2; vorr a2, a2, a4; vorr b2, b2, b4; \ + veor a1, a1, a3; veor b1, b1, b3; veor a2, a2, a0; veor b2, b2, b0; \ + vorr a0, a0, a1; vorr b0, b0, b1; veor a2, a2, a1; veor b2, b2, b1; \ + veor a4, a4, a0; veor b4, b4, b0; vorr a0, a0, a3; vorr b0, b0, b3; \ + veor a0, a0, a2; veor b0, b0, b2; veor a4, a4, a3; veor b4, b4, b3; \ + veor a4, a4, a0; veor b4, b4, b0; vmvn a3, a3; vmvn b3, b3; \ + vand a2, a2, a4; vand b2, b2, b4;\ + veor a2, a3; veor b2, b3; + +#define SBOX6_INVERSE(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + veor a0, a0, a2; veor b0, b0, b2; vmov a4, a2; vmov b4, b2; \ + vand a2, a2, a0; vand b2, b2, b0; veor a4, a4, a3; veor b4, b4, b3; \ + vmvn a2, a2; vmvn b2, b2; veor a3, a3, a1; veor b3, b3, b1; \ + veor a2, a2, a3; veor b2, b2, b3; vorr a4, a4, a0; vorr b4, b4, b0; \ + veor a0, a0, a2; veor b0, b0, b2; veor a3, a3, a4; veor b3, b3, b4; \ + veor a4, a4, a1; veor b4, b4, b1; vand a1, a1, a3; vand b1, b1, b3; \ + veor a1, a1, a0; veor b1, b1, b0; veor a0, a0, a3; veor b0, b0, b3; \ + vorr a0, a0, a2; vorr b0, b0, b2; veor a3, a3, a1; veor b3, b3, b1; \ + veor a4, a0; veor b4, b0; + +#define SBOX7(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vmov a4, a1; vmov 
b4, b1; vorr a1, a1, a2; vorr b1, b1, b2; \ + veor a1, a1, a3; veor b1, b1, b3; veor a4, a4, a2; veor b4, b4, b2; \ + veor a2, a2, a1; veor b2, b2, b1; vorr a3, a3, a4; vorr b3, b3, b4; \ + vand a3, a3, a0; vand b3, b3, b0; veor a4, a4, a2; veor b4, b4, b2; \ + veor a3, a3, a1; veor b3, b3, b1; vorr a1, a1, a4; vorr b1, b1, b4; \ + veor a1, a1, a0; veor b1, b1, b0; vorr a0, a0, a4; vorr b0, b0, b4; \ + veor a0, a0, a2; veor b0, b0, b2; veor a1, a1, a4; veor b1, b1, b4; \ + veor a2, a2, a1; veor b2, b2, b1; vand a1, a1, a0; vand b1, b1, b0; \ + veor a1, a1, a4; veor b1, b1, b4; vmvn a2, a2; vmvn b2, b2; \ + vorr a2, a2, a0; vorr b2, b2, b0;\ + veor a4, a2; veor b4, b2; + +#define SBOX7_INVERSE(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vmov a4, a2; vmov b4, b2; veor a2, a2, a0; veor b2, b2, b0; \ + vand a0, a0, a3; vand b0, b0, b3; vorr a4, a4, a3; vorr b4, b4, b3; \ + vmvn a2, a2; vmvn b2, b2; veor a3, a3, a1; veor b3, b3, b1; \ + vorr a1, a1, a0; vorr b1, b1, b0; veor a0, a0, a2; veor b0, b0, b2; \ + vand a2, a2, a4; vand b2, b2, b4; vand a3, a3, a4; vand b3, b3, b4; \ + veor a1, a1, a2; veor b1, b1, b2; veor a2, a2, a0; veor b2, b2, b0; \ + vorr a0, a0, a2; vorr b0, b0, b2; veor a4, a4, a1; veor b4, b4, b1; \ + veor a0, a0, a3; veor b0, b0, b3; veor a3, a3, a4; veor b3, b3, b4; \ + vorr a4, a4, a0; vorr b4, b4, b0; veor a3, a3, a2; veor b3, b3, b2; \ + veor a4, a2; veor b4, b2; + +/* Apply SBOX number WHICH to to the block. */ +#define SBOX(which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + SBOX##which (a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) + +/* Apply inverse SBOX number WHICH to to the block. */ +#define SBOX_INVERSE(which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + SBOX##which##_INVERSE (a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) + +/* XOR round key into block state in a0,a1,a2,a3. a4 used as temporary. 
*/ +#define BLOCK_XOR_KEY(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vdup.32 RT3, RT0d0[0]; \ + vdup.32 RT1, RT0d0[1]; \ + vdup.32 RT2, RT0d1[0]; \ + vdup.32 RT0, RT0d1[1]; \ + veor a0, a0, RT3; veor b0, b0, RT3; \ + veor a1, a1, RT1; veor b1, b1, RT1; \ + veor a2, a2, RT2; veor b2, b2, RT2; \ + veor a3, a3, RT0; veor b3, b3, RT0; + +#define BLOCK_LOAD_KEY_ENC() \ + vld1.8 {RT0d0, RT0d1}, [RROUND]!; + +#define BLOCK_LOAD_KEY_DEC() \ + vld1.8 {RT0d0, RT0d1}, [RROUND]; \ + sub RROUND, RROUND, #16 + +/* Apply the linear transformation to BLOCK. */ +#define LINEAR_TRANSFORMATION(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vshl.u32 a4, a0, #13; vshl.u32 b4, b0, #13; \ + vshr.u32 a0, a0, #(32-13); vshr.u32 b0, b0, #(32-13); \ + veor a0, a0, a4; veor b0, b0, b4; \ + vshl.u32 a4, a2, #3; vshl.u32 b4, b2, #3; \ + vshr.u32 a2, a2, #(32-3); vshr.u32 b2, b2, #(32-3); \ + veor a2, a2, a4; veor b2, b2, b4; \ + veor a1, a0, a1; veor b1, b0, b1; \ + veor a1, a2, a1; veor b1, b2, b1; \ + vshl.u32 a4, a0, #3; vshl.u32 b4, b0, #3; \ + veor a3, a2, a3; veor b3, b2, b3; \ + veor a3, a4, a3; veor b3, b4, b3; \ + vshl.u32 a4, a1, #1; vshl.u32 b4, b1, #1; \ + vshr.u32 a1, a1, #(32-1); vshr.u32 b1, b1, #(32-1); \ + veor a1, a1, a4; veor b1, b1, b4; \ + vshl.u32 a4, a3, #7; vshl.u32 b4, b3, #7; \ + vshr.u32 a3, a3, #(32-7); vshr.u32 b3, b3, #(32-7); \ + veor a3, a3, a4; veor b3, b3, b4; \ + veor a0, a1, a0; veor b0, b1, b0; \ + veor a0, a3, a0; veor b0, b3, b0; \ + vshl.u32 a4, a1, #7; vshl.u32 b4, b1, #7; \ + veor a2, a3, a2; veor b2, b3, b2; \ + veor a2, a4, a2; veor b2, b4, b2; \ + vshl.u32 a4, a0, #5; vshl.u32 b4, b0, #5; \ + vshr.u32 a0, a0, #(32-5); vshr.u32 b0, b0, #(32-5); \ + veor a0, a0, a4; veor b0, b0, b4; \ + vshl.u32 a4, a2, #22; vshl.u32 b4, b2, #22; \ + vshr.u32 a2, a2, #(32-22); vshr.u32 b2, b2, #(32-22); \ + veor a2, a2, a4; veor b2, b2, b4; + +/* Apply the inverse linear transformation to BLOCK. 
*/ +#define LINEAR_TRANSFORMATION_INVERSE(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vshr.u32 a4, a2, #22; vshr.u32 b4, b2, #22; \ + vshl.u32 a2, a2, #(32-22); vshl.u32 b2, b2, #(32-22); \ + veor a2, a2, a4; veor b2, b2, b4; \ + vshr.u32 a4, a0, #5; vshr.u32 b4, b0, #5; \ + vshl.u32 a0, a0, #(32-5); vshl.u32 b0, b0, #(32-5); \ + veor a0, a0, a4; veor b0, b0, b4; \ + vshl.u32 a4, a1, #7; vshl.u32 b4, b1, #7; \ + veor a2, a3, a2; veor b2, b3, b2; \ + veor a2, a4, a2; veor b2, b4, b2; \ + veor a0, a1, a0; veor b0, b1, b0; \ + veor a0, a3, a0; veor b0, b3, b0; \ + vshr.u32 a4, a3, #7; vshr.u32 b4, b3, #7; \ + vshl.u32 a3, a3, #(32-7); vshl.u32 b3, b3, #(32-7); \ + veor a3, a3, a4; veor b3, b3, b4; \ + vshr.u32 a4, a1, #1; vshr.u32 b4, b1, #1; \ + vshl.u32 a1, a1, #(32-1); vshl.u32 b1, b1, #(32-1); \ + veor a1, a1, a4; veor b1, b1, b4; \ + vshl.u32 a4, a0, #3; vshl.u32 b4, b0, #3; \ + veor a3, a2, a3; veor b3, b2, b3; \ + veor a3, a4, a3; veor b3, b4, b3; \ + veor a1, a0, a1; veor b1, b0, b1; \ + veor a1, a2, a1; veor b1, b2, b1; \ + vshr.u32 a4, a2, #3; vshr.u32 b4, b2, #3; \ + vshl.u32 a2, a2, #(32-3); vshl.u32 b2, b2, #(32-3); \ + veor a2, a2, a4; veor b2, b2, b4; \ + vshr.u32 a4, a0, #13; vshr.u32 b4, b0, #13; \ + vshl.u32 a0, a0, #(32-13); vshl.u32 b0, b0, #(32-13); \ + veor a0, a0, a4; veor b0, b0, b4; + +/* Apply a Serpent round to eight parallel blocks. This macro increments + `round'. */ +#define ROUND(round, which, a0, a1, a2, a3, a4, na0, na1, na2, na3, na4, \ + b0, b1, b2, b3, b4, nb0, nb1, nb2, nb3, nb4) \ + BLOCK_XOR_KEY (a0, a1, a2, a3, a4, b0, b1, b2, b3, b4); \ + BLOCK_LOAD_KEY_ENC (); \ + SBOX (which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4); \ + LINEAR_TRANSFORMATION (na0, na1, na2, na3, na4, nb0, nb1, nb2, nb3, nb4); + +/* Apply the last Serpent round to eight parallel blocks. This macro increments + `round'. 
*/ +#define ROUND_LAST(round, which, a0, a1, a2, a3, a4, na0, na1, na2, na3, na4, \ + b0, b1, b2, b3, b4, nb0, nb1, nb2, nb3, nb4) \ + BLOCK_XOR_KEY (a0, a1, a2, a3, a4, b0, b1, b2, b3, b4); \ + BLOCK_LOAD_KEY_ENC (); \ + SBOX (which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4); \ + BLOCK_XOR_KEY (na0, na1, na2, na3, na4, nb0, nb1, nb2, nb3, nb4); + +/* Apply an inverse Serpent round to eight parallel blocks. This macro + increments `round'. */ +#define ROUND_INVERSE(round, which, a0, a1, a2, a3, a4, \ + na0, na1, na2, na3, na4, \ + b0, b1, b2, b3, b4, \ + nb0, nb1, nb2, nb3, nb4) \ + LINEAR_TRANSFORMATION_INVERSE (a0, a1, a2, a3, a4, b0, b1, b2, b3, b4); \ + SBOX_INVERSE (which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4); \ + BLOCK_XOR_KEY (na0, na1, na2, na3, na4, nb0, nb1, nb2, nb3, nb4); \ + BLOCK_LOAD_KEY_DEC (); + +/* Apply the first inverse Serpent round to eight parallel blocks. This macro + increments `round'. */ +#define ROUND_FIRST_INVERSE(round, which, a0, a1, a2, a3, a4, \ + na0, na1, na2, na3, na4, \ + b0, b1, b2, b3, b4, \ + nb0, nb1, nb2, nb3, nb4) \ + BLOCK_XOR_KEY (a0, a1, a2, a3, a4, b0, b1, b2, b3, b4); \ + BLOCK_LOAD_KEY_DEC (); \ + SBOX_INVERSE (which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4); \ + BLOCK_XOR_KEY (na0, na1, na2, na3, na4, nb0, nb1, nb2, nb3, nb4); \ + BLOCK_LOAD_KEY_DEC (); + +.align 3 +.type __serpent_enc_blk8,%function; +__serpent_enc_blk8: + /* input: + * r0: round key pointer + * RA0, RA1, RA2, RA3, RB0, RB1, RB2, RB3: eight parallel plaintext + * blocks + * output: + * RA4, RA1, RA2, RA0, RB4, RB1, RB2, RB0: eight parallel + * ciphertext blocks + */ + + transpose_4x4(RA0, RA1, RA2, RA3); + BLOCK_LOAD_KEY_ENC (); + transpose_4x4(RB0, RB1, RB2, RB3); + + ROUND (0, 0, RA0, RA1, RA2, RA3, RA4, RA1, RA4, RA2, RA0, RA3, + RB0, RB1, RB2, RB3, RB4, RB1, RB4, RB2, RB0, RB3); + ROUND (1, 1, RA1, RA4, RA2, RA0, RA3, RA2, RA1, RA0, RA4, RA3, + RB1, RB4, RB2, RB0, RB3, RB2, RB1, RB0, RB4, RB3); + ROUND (2, 2, RA2, RA1, RA0, RA4, RA3, RA0, RA4, 
RA1, RA3, RA2, + RB2, RB1, RB0, RB4, RB3, RB0, RB4, RB1, RB3, RB2); + ROUND (3, 3, RA0, RA4, RA1, RA3, RA2, RA4, RA1, RA3, RA2, RA0, + RB0, RB4, RB1, RB3, RB2, RB4, RB1, RB3, RB2, RB0); + ROUND (4, 4, RA4, RA1, RA3, RA2, RA0, RA1, RA0, RA4, RA2, RA3, + RB4, RB1, RB3, RB2, RB0, RB1, RB0, RB4, RB2, RB3); + ROUND (5, 5, RA1, RA0, RA4, RA2, RA3, RA0, RA2, RA1, RA4, RA3, + RB1, RB0, RB4, RB2, RB3, RB0, RB2, RB1, RB4, RB3); + ROUND (6, 6, RA0, RA2, RA1, RA4, RA3, RA0, RA2, RA3, RA1, RA4, + RB0, RB2, RB1, RB4, RB3, RB0, RB2, RB3, RB1, RB4); + ROUND (7, 7, RA0, RA2, RA3, RA1, RA4, RA4, RA1, RA2, RA0, RA3, + RB0, RB2, RB3, RB1, RB4, RB4, RB1, RB2, RB0, RB3); + ROUND (8, 0, RA4, RA1, RA2, RA0, RA3, RA1, RA3, RA2, RA4, RA0, + RB4, RB1, RB2, RB0, RB3, RB1, RB3, RB2, RB4, RB0); + ROUND (9, 1, RA1, RA3, RA2, RA4, RA0, RA2, RA1, RA4, RA3, RA0, + RB1, RB3, RB2, RB4, RB0, RB2, RB1, RB4, RB3, RB0); + ROUND (10, 2, RA2, RA1, RA4, RA3, RA0, RA4, RA3, RA1, RA0, RA2, + RB2, RB1, RB4, RB3, RB0, RB4, RB3, RB1, RB0, RB2); + ROUND (11, 3, RA4, RA3, RA1, RA0, RA2, RA3, RA1, RA0, RA2, RA4, + RB4, RB3, RB1, RB0, RB2, RB3, RB1, RB0, RB2, RB4); + ROUND (12, 4, RA3, RA1, RA0, RA2, RA4, RA1, RA4, RA3, RA2, RA0, + RB3, RB1, RB0, RB2, RB4, RB1, RB4, RB3, RB2, RB0); + ROUND (13, 5, RA1, RA4, RA3, RA2, RA0, RA4, RA2, RA1, RA3, RA0, + RB1, RB4, RB3, RB2, RB0, RB4, RB2, RB1, RB3, RB0); + ROUND (14, 6, RA4, RA2, RA1, RA3, RA0, RA4, RA2, RA0, RA1, RA3, + RB4, RB2, RB1, RB3, RB0, RB4, RB2, RB0, RB1, RB3); + ROUND (15, 7, RA4, RA2, RA0, RA1, RA3, RA3, RA1, RA2, RA4, RA0, + RB4, RB2, RB0, RB1, RB3, RB3, RB1, RB2, RB4, RB0); + ROUND (16, 0, RA3, RA1, RA2, RA4, RA0, RA1, RA0, RA2, RA3, RA4, + RB3, RB1, RB2, RB4, RB0, RB1, RB0, RB2, RB3, RB4); + ROUND (17, 1, RA1, RA0, RA2, RA3, RA4, RA2, RA1, RA3, RA0, RA4, + RB1, RB0, RB2, RB3, RB4, RB2, RB1, RB3, RB0, RB4); + ROUND (18, 2, RA2, RA1, RA3, RA0, RA4, RA3, RA0, RA1, RA4, RA2, + RB2, RB1, RB3, RB0, RB4, RB3, RB0, RB1, RB4, RB2); + ROUND (19, 3, RA3, RA0, RA1, 
RA4, RA2, RA0, RA1, RA4, RA2, RA3, + RB3, RB0, RB1, RB4, RB2, RB0, RB1, RB4, RB2, RB3); + ROUND (20, 4, RA0, RA1, RA4, RA2, RA3, RA1, RA3, RA0, RA2, RA4, + RB0, RB1, RB4, RB2, RB3, RB1, RB3, RB0, RB2, RB4); + ROUND (21, 5, RA1, RA3, RA0, RA2, RA4, RA3, RA2, RA1, RA0, RA4, + RB1, RB3, RB0, RB2, RB4, RB3, RB2, RB1, RB0, RB4); + ROUND (22, 6, RA3, RA2, RA1, RA0, RA4, RA3, RA2, RA4, RA1, RA0, + RB3, RB2, RB1, RB0, RB4, RB3, RB2, RB4, RB1, RB0); + ROUND (23, 7, RA3, RA2, RA4, RA1, RA0, RA0, RA1, RA2, RA3, RA4, + RB3, RB2, RB4, RB1, RB0, RB0, RB1, RB2, RB3, RB4); + ROUND (24, 0, RA0, RA1, RA2, RA3, RA4, RA1, RA4, RA2, RA0, RA3, + RB0, RB1, RB2, RB3, RB4, RB1, RB4, RB2, RB0, RB3); + ROUND (25, 1, RA1, RA4, RA2, RA0, RA3, RA2, RA1, RA0, RA4, RA3, + RB1, RB4, RB2, RB0, RB3, RB2, RB1, RB0, RB4, RB3); + ROUND (26, 2, RA2, RA1, RA0, RA4, RA3, RA0, RA4, RA1, RA3, RA2, + RB2, RB1, RB0, RB4, RB3, RB0, RB4, RB1, RB3, RB2); + ROUND (27, 3, RA0, RA4, RA1, RA3, RA2, RA4, RA1, RA3, RA2, RA0, + RB0, RB4, RB1, RB3, RB2, RB4, RB1, RB3, RB2, RB0); + ROUND (28, 4, RA4, RA1, RA3, RA2, RA0, RA1, RA0, RA4, RA2, RA3, + RB4, RB1, RB3, RB2, RB0, RB1, RB0, RB4, RB2, RB3); + ROUND (29, 5, RA1, RA0, RA4, RA2, RA3, RA0, RA2, RA1, RA4, RA3, + RB1, RB0, RB4, RB2, RB3, RB0, RB2, RB1, RB4, RB3); + ROUND (30, 6, RA0, RA2, RA1, RA4, RA3, RA0, RA2, RA3, RA1, RA4, + RB0, RB2, RB1, RB4, RB3, RB0, RB2, RB3, RB1, RB4); + ROUND_LAST (31, 7, RA0, RA2, RA3, RA1, RA4, RA4, RA1, RA2, RA0, RA3, + RB0, RB2, RB3, RB1, RB4, RB4, RB1, RB2, RB0, RB3); + + transpose_4x4(RA4, RA1, RA2, RA0); + transpose_4x4(RB4, RB1, RB2, RB0); + + bx lr; +.size __serpent_enc_blk8,.-__serpent_enc_blk8; + +.align 3 +.type __serpent_dec_blk8,%function; +__serpent_dec_blk8: + /* input: + * r0: round key pointer + * RA0, RA1, RA2, RA3, RB0, RB1, RB2, RB3: eight parallel + * ciphertext blocks + * output: + * RA0, RA1, RA2, RA3, RB0, RB1, RB2, RB3: eight parallel plaintext + * blocks + */ + + add RROUND, RROUND, #(32*16); + + transpose_4x4(RA0, 
RA1, RA2, RA3); + BLOCK_LOAD_KEY_DEC (); + transpose_4x4(RB0, RB1, RB2, RB3); + + ROUND_FIRST_INVERSE (31, 7, RA0, RA1, RA2, RA3, RA4, + RA3, RA0, RA1, RA4, RA2, + RB0, RB1, RB2, RB3, RB4, + RB3, RB0, RB1, RB4, RB2); + ROUND_INVERSE (30, 6, RA3, RA0, RA1, RA4, RA2, RA0, RA1, RA2, RA4, RA3, + RB3, RB0, RB1, RB4, RB2, RB0, RB1, RB2, RB4, RB3); + ROUND_INVERSE (29, 5, RA0, RA1, RA2, RA4, RA3, RA1, RA3, RA4, RA2, RA0, + RB0, RB1, RB2, RB4, RB3, RB1, RB3, RB4, RB2, RB0); + ROUND_INVERSE (28, 4, RA1, RA3, RA4, RA2, RA0, RA1, RA2, RA4, RA0, RA3, + RB1, RB3, RB4, RB2, RB0, RB1, RB2, RB4, RB0, RB3); + ROUND_INVERSE (27, 3, RA1, RA2, RA4, RA0, RA3, RA4, RA2, RA0, RA1, RA3, + RB1, RB2, RB4, RB0, RB3, RB4, RB2, RB0, RB1, RB3); + ROUND_INVERSE (26, 2, RA4, RA2, RA0, RA1, RA3, RA2, RA3, RA0, RA1, RA4, + RB4, RB2, RB0, RB1, RB3, RB2, RB3, RB0, RB1, RB4); + ROUND_INVERSE (25, 1, RA2, RA3, RA0, RA1, RA4, RA4, RA2, RA1, RA0, RA3, + RB2, RB3, RB0, RB1, RB4, RB4, RB2, RB1, RB0, RB3); + ROUND_INVERSE (24, 0, RA4, RA2, RA1, RA0, RA3, RA4, RA3, RA2, RA0, RA1, + RB4, RB2, RB1, RB0, RB3, RB4, RB3, RB2, RB0, RB1); + ROUND_INVERSE (23, 7, RA4, RA3, RA2, RA0, RA1, RA0, RA4, RA3, RA1, RA2, + RB4, RB3, RB2, RB0, RB1, RB0, RB4, RB3, RB1, RB2); + ROUND_INVERSE (22, 6, RA0, RA4, RA3, RA1, RA2, RA4, RA3, RA2, RA1, RA0, + RB0, RB4, RB3, RB1, RB2, RB4, RB3, RB2, RB1, RB0); + ROUND_INVERSE (21, 5, RA4, RA3, RA2, RA1, RA0, RA3, RA0, RA1, RA2, RA4, + RB4, RB3, RB2, RB1, RB0, RB3, RB0, RB1, RB2, RB4); + ROUND_INVERSE (20, 4, RA3, RA0, RA1, RA2, RA4, RA3, RA2, RA1, RA4, RA0, + RB3, RB0, RB1, RB2, RB4, RB3, RB2, RB1, RB4, RB0); + ROUND_INVERSE (19, 3, RA3, RA2, RA1, RA4, RA0, RA1, RA2, RA4, RA3, RA0, + RB3, RB2, RB1, RB4, RB0, RB1, RB2, RB4, RB3, RB0); + ROUND_INVERSE (18, 2, RA1, RA2, RA4, RA3, RA0, RA2, RA0, RA4, RA3, RA1, + RB1, RB2, RB4, RB3, RB0, RB2, RB0, RB4, RB3, RB1); + ROUND_INVERSE (17, 1, RA2, RA0, RA4, RA3, RA1, RA1, RA2, RA3, RA4, RA0, + RB2, RB0, RB4, RB3, RB1, RB1, RB2, RB3, RB4, RB0); + 
ROUND_INVERSE (16, 0, RA1, RA2, RA3, RA4, RA0, RA1, RA0, RA2, RA4, RA3, + RB1, RB2, RB3, RB4, RB0, RB1, RB0, RB2, RB4, RB3); + ROUND_INVERSE (15, 7, RA1, RA0, RA2, RA4, RA3, RA4, RA1, RA0, RA3, RA2, + RB1, RB0, RB2, RB4, RB3, RB4, RB1, RB0, RB3, RB2); + ROUND_INVERSE (14, 6, RA4, RA1, RA0, RA3, RA2, RA1, RA0, RA2, RA3, RA4, + RB4, RB1, RB0, RB3, RB2, RB1, RB0, RB2, RB3, RB4); + ROUND_INVERSE (13, 5, RA1, RA0, RA2, RA3, RA4, RA0, RA4, RA3, RA2, RA1, + RB1, RB0, RB2, RB3, RB4, RB0, RB4, RB3, RB2, RB1); + ROUND_INVERSE (12, 4, RA0, RA4, RA3, RA2, RA1, RA0, RA2, RA3, RA1, RA4, + RB0, RB4, RB3, RB2, RB1, RB0, RB2, RB3, RB1, RB4); + ROUND_INVERSE (11, 3, RA0, RA2, RA3, RA1, RA4, RA3, RA2, RA1, RA0, RA4, + RB0, RB2, RB3, RB1, RB4, RB3, RB2, RB1, RB0, RB4); + ROUND_INVERSE (10, 2, RA3, RA2, RA1, RA0, RA4, RA2, RA4, RA1, RA0, RA3, + RB3, RB2, RB1, RB0, RB4, RB2, RB4, RB1, RB0, RB3); + ROUND_INVERSE (9, 1, RA2, RA4, RA1, RA0, RA3, RA3, RA2, RA0, RA1, RA4, + RB2, RB4, RB1, RB0, RB3, RB3, RB2, RB0, RB1, RB4); + ROUND_INVERSE (8, 0, RA3, RA2, RA0, RA1, RA4, RA3, RA4, RA2, RA1, RA0, + RB3, RB2, RB0, RB1, RB4, RB3, RB4, RB2, RB1, RB0); + ROUND_INVERSE (7, 7, RA3, RA4, RA2, RA1, RA0, RA1, RA3, RA4, RA0, RA2, + RB3, RB4, RB2, RB1, RB0, RB1, RB3, RB4, RB0, RB2); + ROUND_INVERSE (6, 6, RA1, RA3, RA4, RA0, RA2, RA3, RA4, RA2, RA0, RA1, + RB1, RB3, RB4, RB0, RB2, RB3, RB4, RB2, RB0, RB1); + ROUND_INVERSE (5, 5, RA3, RA4, RA2, RA0, RA1, RA4, RA1, RA0, RA2, RA3, + RB3, RB4, RB2, RB0, RB1, RB4, RB1, RB0, RB2, RB3); + ROUND_INVERSE (4, 4, RA4, RA1, RA0, RA2, RA3, RA4, RA2, RA0, RA3, RA1, + RB4, RB1, RB0, RB2, RB3, RB4, RB2, RB0, RB3, RB1); + ROUND_INVERSE (3, 3, RA4, RA2, RA0, RA3, RA1, RA0, RA2, RA3, RA4, RA1, + RB4, RB2, RB0, RB3, RB1, RB0, RB2, RB3, RB4, RB1); + ROUND_INVERSE (2, 2, RA0, RA2, RA3, RA4, RA1, RA2, RA1, RA3, RA4, RA0, + RB0, RB2, RB3, RB4, RB1, RB2, RB1, RB3, RB4, RB0); + ROUND_INVERSE (1, 1, RA2, RA1, RA3, RA4, RA0, RA0, RA2, RA4, RA3, RA1, + RB2, RB1, RB3, RB4, RB0, RB0, 
RB2, RB4, RB3, RB1); + ROUND_INVERSE (0, 0, RA0, RA2, RA4, RA3, RA1, RA0, RA1, RA2, RA3, RA4, + RB0, RB2, RB4, RB3, RB1, RB0, RB1, RB2, RB3, RB4); + + transpose_4x4(RA0, RA1, RA2, RA3); + transpose_4x4(RB0, RB1, RB2, RB3); + + bx lr; +.size __serpent_dec_blk8,.-__serpent_dec_blk8; + +.align 3 +.globl _gcry_serpent_neon_ctr_enc +.type _gcry_serpent_neon_ctr_enc,%function; +_gcry_serpent_neon_ctr_enc: + /* input: + * r0: ctx, CTX + * r1: dst (8 blocks) + * r2: src (8 blocks) + * r3: iv + */ + + vmov.u8 RT1d0, #0xff; /* u64: -1 */ + push {r4,lr}; + vadd.u64 RT2d0, RT1d0, RT1d0; /* u64: -2 */ + vpush {RA4-RB2}; + + /* load IV and byteswap */ + vld1.8 {RA0}, [r3]; + vrev64.u8 RT0, RA0; /* be => le */ + ldr r4, [r3, #8]; + + /* construct IVs */ + vsub.u64 RA2d1, RT0d1, RT2d0; /* +2 */ + vsub.u64 RA1d1, RT0d1, RT1d0; /* +1 */ + cmp r4, #-1; + + vsub.u64 RB0d1, RA2d1, RT2d0; /* +4 */ + vsub.u64 RA3d1, RA2d1, RT1d0; /* +3 */ + ldr r4, [r3, #12]; + + vsub.u64 RB2d1, RB0d1, RT2d0; /* +6 */ + vsub.u64 RB1d1, RB0d1, RT1d0; /* +5 */ + + vsub.u64 RT2d1, RB2d1, RT2d0; /* +8 */ + vsub.u64 RB3d1, RB2d1, RT1d0; /* +7 */ + + vmov RA1d0, RT0d0; + vmov RA2d0, RT0d0; + vmov RA3d0, RT0d0; + vmov RB0d0, RT0d0; + rev r4, r4; + vmov RB1d0, RT0d0; + vmov RB2d0, RT0d0; + vmov RB3d0, RT0d0; + vmov RT2d0, RT0d0; + + /* check need for handling 64-bit overflow and carry */ + beq .Ldo_ctr_carry; + +.Lctr_carry_done: + /* le => be */ + vrev64.u8 RA1, RA1; + vrev64.u8 RA2, RA2; + vrev64.u8 RA3, RA3; + vrev64.u8 RB0, RB0; + vrev64.u8 RT2, RT2; + vrev64.u8 RB1, RB1; + vrev64.u8 RB2, RB2; + vrev64.u8 RB3, RB3; + /* store new IV */ + vst1.8 {RT2}, [r3]; + + bl __serpent_enc_blk8; + + vld1.8 {RT0, RT1}, [r2]!; + vld1.8 {RT2, RT3}, [r2]!; + veor RA4, RA4, RT0; + veor RA1, RA1, RT1; + vld1.8 {RT0, RT1}, [r2]!; + veor RA2, RA2, RT2; + veor RA0, RA0, RT3; + vld1.8 {RT2, RT3}, [r2]!; + veor RB4, RB4, RT0; + veor RT0, RT0; + veor RB1, RB1, RT1; + veor RT1, RT1; + veor RB2, RB2, RT2; + veor RT2, RT2; + veor RB0, 
RB0, RT3; + veor RT3, RT3; + + vst1.8 {RA4}, [r1]!; + vst1.8 {RA1}, [r1]!; + veor RA1, RA1; + vst1.8 {RA2}, [r1]!; + veor RA2, RA2; + vst1.8 {RA0}, [r1]!; + veor RA0, RA0; + vst1.8 {RB4}, [r1]!; + veor RB4, RB4; + vst1.8 {RB1}, [r1]!; + vst1.8 {RB2}, [r1]!; + vst1.8 {RB0}, [r1]!; + + vpop {RA4-RB2}; + + /* clear the used registers */ + veor RA3, RA3; + veor RB3, RB3; + + pop {r4,pc}; + +.Ldo_ctr_carry: + cmp r4, #-8; + blo .Lctr_carry_done; + beq .Lcarry_RT2; + + cmp r4, #-6; + blo .Lcarry_RB3; + beq .Lcarry_RB2; + + cmp r4, #-4; + blo .Lcarry_RB1; + beq .Lcarry_RB0; + + cmp r4, #-2; + blo .Lcarry_RA3; + beq .Lcarry_RA2; + + vsub.u64 RA1d0, RT1d0; +.Lcarry_RA2: + vsub.u64 RA2d0, RT1d0; +.Lcarry_RA3: + vsub.u64 RA3d0, RT1d0; +.Lcarry_RB0: + vsub.u64 RB0d0, RT1d0; +.Lcarry_RB1: + vsub.u64 RB1d0, RT1d0; +.Lcarry_RB2: + vsub.u64 RB2d0, RT1d0; +.Lcarry_RB3: + vsub.u64 RB3d0, RT1d0; +.Lcarry_RT2: + vsub.u64 RT2d0, RT1d0; + + b .Lctr_carry_done; +.size _gcry_serpent_neon_ctr_enc,.-_gcry_serpent_neon_ctr_enc; + +.align 3 +.globl _gcry_serpent_neon_cfb_dec +.type _gcry_serpent_neon_cfb_dec,%function; +_gcry_serpent_neon_cfb_dec: + /* input: + * r0: ctx, CTX + * r1: dst (8 blocks) + * r2: src (8 blocks) + * r3: iv + */ + + push {lr}; + vpush {RA4-RB2}; + + /* Load input */ + vld1.8 {RA0}, [r3]; + vld1.8 {RA1, RA2}, [r2]!; + vld1.8 {RA3}, [r2]!; + vld1.8 {RB0}, [r2]!; + vld1.8 {RB1, RB2}, [r2]!; + vld1.8 {RB3}, [r2]!; + + /* Update IV */ + vld1.8 {RT0}, [r2]!; + vst1.8 {RT0}, [r3]; + mov r3, lr; + sub r2, r2, #(8*16); + + bl __serpent_enc_blk8; + + vld1.8 {RT0, RT1}, [r2]!; + vld1.8 {RT2, RT3}, [r2]!; + veor RA4, RA4, RT0; + veor RA1, RA1, RT1; + vld1.8 {RT0, RT1}, [r2]!; + veor RA2, RA2, RT2; + veor RA0, RA0, RT3; + vld1.8 {RT2, RT3}, [r2]!; + veor RB4, RB4, RT0; + veor RT0, RT0; + veor RB1, RB1, RT1; + veor RT1, RT1; + veor RB2, RB2, RT2; + veor RT2, RT2; + veor RB0, RB0, RT3; + veor RT3, RT3; + + vst1.8 {RA4}, [r1]!; + vst1.8 {RA1}, [r1]!; + veor RA1, RA1; + vst1.8 {RA2}, 
[r1]!; + veor RA2, RA2; + vst1.8 {RA0}, [r1]!; + veor RA0, RA0; + vst1.8 {RB4}, [r1]!; + veor RB4, RB4; + vst1.8 {RB1}, [r1]!; + vst1.8 {RB2}, [r1]!; + vst1.8 {RB0}, [r1]!; + + vpop {RA4-RB2}; + + /* clear the used registers */ + veor RA3, RA3; + veor RB3, RB3; + + pop {pc}; +.size _gcry_serpent_neon_cfb_dec,.-_gcry_serpent_neon_cfb_dec; + +.align 3 +.globl _gcry_serpent_neon_cbc_dec +.type _gcry_serpent_neon_cbc_dec,%function; +_gcry_serpent_neon_cbc_dec: + /* input: + * r0: ctx, CTX + * r1: dst (8 blocks) + * r2: src (8 blocks) + * r3: iv + */ + + push {lr}; + vpush {RA4-RB2}; + + vld1.8 {RA0, RA1}, [r2]!; + vld1.8 {RA2, RA3}, [r2]!; + vld1.8 {RB0, RB1}, [r2]!; + vld1.8 {RB2, RB3}, [r2]!; + sub r2, r2, #(8*16); + + bl __serpent_dec_blk8; + + vld1.8 {RB4}, [r3]; + vld1.8 {RT0, RT1}, [r2]!; + vld1.8 {RT2, RT3}, [r2]!; + veor RA0, RA0, RB4; + veor RA1, RA1, RT0; + veor RA2, RA2, RT1; + vld1.8 {RT0, RT1}, [r2]!; + veor RA3, RA3, RT2; + veor RB0, RB0, RT3; + vld1.8 {RT2, RT3}, [r2]!; + veor RB1, RB1, RT0; + veor RT0, RT0; + veor RB2, RB2, RT1; + veor RT1, RT1; + veor RB3, RB3, RT2; + veor RT2, RT2; + vst1.8 {RT3}, [r3]; /* store new IV */ + veor RT3, RT3; + + vst1.8 {RA0, RA1}, [r1]!; + veor RA0, RA0; + veor RA1, RA1; + vst1.8 {RA2, RA3}, [r1]!; + veor RA2, RA2; + vst1.8 {RB0, RB1}, [r1]!; + veor RA3, RA3; + vst1.8 {RB2, RB3}, [r1]!; + veor RB3, RB3; + + vpop {RA4-RB2}; + + /* clear the used registers */ + veor RB4, RB4; + + pop {pc}; +.size _gcry_serpent_neon_cbc_dec,.-_gcry_serpent_neon_cbc_dec; + +#endif diff --git a/cipher/serpent.c b/cipher/serpent.c index a8ee15f..cfda742 100644 --- a/cipher/serpent.c +++ b/cipher/serpent.c @@ -46,6 +46,15 @@ # endif #endif +/* USE_NEON indicates whether to enable ARM NEON assembly code. 
*/ +#undef USE_NEON +#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) +# if defined(HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS) && \ + defined(HAVE_GCC_INLINE_ASM_NEON) +# define USE_NEON 1 +# endif +#endif + /* Number of rounds per Serpent encrypt/decrypt operation. */ #define ROUNDS 32 @@ -71,6 +80,9 @@ typedef struct serpent_context #ifdef USE_AVX2 int use_avx2; #endif +#ifdef USE_NEON + int use_neon; +#endif } serpent_context_t; @@ -114,6 +126,26 @@ extern void _gcry_serpent_avx2_cfb_dec(serpent_context_t *ctx, unsigned char *iv); #endif +#ifdef USE_NEON +/* Assembler implementations of Serpent using ARM NEON. Process 8 block in + parallel. + */ +extern void _gcry_serpent_neon_ctr_enc(serpent_context_t *ctx, + unsigned char *out, + const unsigned char *in, + unsigned char *ctr); + +extern void _gcry_serpent_neon_cbc_dec(serpent_context_t *ctx, + unsigned char *out, + const unsigned char *in, + unsigned char *iv); + +extern void _gcry_serpent_neon_cfb_dec(serpent_context_t *ctx, + unsigned char *out, + const unsigned char *in, + unsigned char *iv); +#endif + /* A prototype. */ static const char *serpent_test (void); @@ -634,6 +666,14 @@ serpent_setkey_internal (serpent_context_t *context, } #endif +#ifdef USE_NEON + context->use_neon = 0; + if ((_gcry_get_hw_features () & HWF_ARM_NEON)) + { + context->use_neon = 1; + } +#endif + _gcry_burn_stack (272 * sizeof (u32)); } @@ -861,6 +901,34 @@ _gcry_serpent_ctr_enc(void *context, unsigned char *ctr, } #endif +#ifdef USE_NEON + if (ctx->use_neon) + { + int did_use_neon = 0; + + /* Process data in 8 block chunks. */ + while (nblocks >= 8) + { + _gcry_serpent_neon_ctr_enc(ctx, outbuf, inbuf, ctr); + + nblocks -= 8; + outbuf += 8 * sizeof(serpent_block_t); + inbuf += 8 * sizeof(serpent_block_t); + did_use_neon = 1; + } + + if (did_use_neon) + { + /* serpent-neon assembly code does not use stack */ + if (nblocks == 0) + burn_stack_depth = 0; + } + + /* Use generic code to handle smaller chunks... 
*/ + /* TODO: use caching instead? */ + } +#endif + for ( ;nblocks; nblocks-- ) { /* Encrypt the counter. */ @@ -948,6 +1016,33 @@ _gcry_serpent_cbc_dec(void *context, unsigned char *iv, } #endif +#ifdef USE_NEON + if (ctx->use_neon) + { + int did_use_neon = 0; + + /* Process data in 8 block chunks. */ + while (nblocks >= 8) + { + _gcry_serpent_neon_cbc_dec(ctx, outbuf, inbuf, iv); + + nblocks -= 8; + outbuf += 8 * sizeof(serpent_block_t); + inbuf += 8 * sizeof(serpent_block_t); + did_use_neon = 1; + } + + if (did_use_neon) + { + /* serpent-neon assembly code does not use stack */ + if (nblocks == 0) + burn_stack_depth = 0; + } + + /* Use generic code to handle smaller chunks... */ + } +#endif + for ( ;nblocks; nblocks-- ) { /* INBUF is needed later and it may be identical to OUTBUF, so store @@ -1028,6 +1123,33 @@ _gcry_serpent_cfb_dec(void *context, unsigned char *iv, } #endif +#ifdef USE_NEON + if (ctx->use_neon) + { + int did_use_neon = 0; + + /* Process data in 8 block chunks. */ + while (nblocks >= 8) + { + _gcry_serpent_neon_cfb_dec(ctx, outbuf, inbuf, iv); + + nblocks -= 8; + outbuf += 8 * sizeof(serpent_block_t); + inbuf += 8 * sizeof(serpent_block_t); + did_use_neon = 1; + } + + if (did_use_neon) + { + /* serpent-neon assembly code does not use stack */ + if (nblocks == 0) + burn_stack_depth = 0; + } + + /* Use generic code to handle smaller chunks... 
*/ + } +#endif + for ( ;nblocks; nblocks-- ) { serpent_encrypt_internal(ctx, iv, iv); diff --git a/configure.ac b/configure.ac index 19c97bd..e3471d0 100644 --- a/configure.ac +++ b/configure.ac @@ -1502,6 +1502,11 @@ if test "$found" = "1" ; then # Build with the AVX2 implementation GCRYPT_CIPHERS="$GCRYPT_CIPHERS serpent-avx2-amd64.lo" fi + + if test x"$neonsupport" = xyes ; then + # Build with the NEON implementation + GCRYPT_CIPHERS="$GCRYPT_CIPHERS serpent-armv7-neon.lo" + fi fi LIST_MEMBER(rfc2268, $enabled_ciphers) From wk at gnupg.org Mon Oct 28 12:36:28 2013 From: wk at gnupg.org (Werner Koch) Date: Mon, 28 Oct 2013 12:36:28 +0100 Subject: [PATCH] Add new benchmarking utility, bench-cycles In-Reply-To: <20131028095843.14876.6831.stgit@localhost6.localdomain6> (Jussi Kivilinna's message of "Mon, 28 Oct 2013 11:58:43 +0200") References: <20131028095843.14876.6831.stgit@localhost6.localdomain6> Message-ID: <878uxdvd8j.fsf@vigenere.g10code.de> On Mon, 28 Oct 2013 10:58, jussi.kivilinna at iki.fi said: > Bench-slope is new benchmarking tool for libgcrypt for obtaining overheadless > cycles/byte speed of cipher and hash algorithms. Tool measures the time each > operation (hash/encrypt/decrypt/authentication) takes for different buffer > sizes of from ~0kB to ~4kB and calculates the slope for these data points. Cool. You may want to exit (77) if a nsec timer does not exist. The regression test driver then prints "SKIP". Shalom-Salam, Werner -- Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz. 
From dbaryshkov at gmail.com Mon Oct 28 13:04:12 2013 From: dbaryshkov at gmail.com (Dmitry Eremin-Solenikov) Date: Mon, 28 Oct 2013 16:04:12 +0400 Subject: [PATCH] Add new benchmarking utility, bench-cycles In-Reply-To: <20131028095843.14876.6831.stgit@localhost6.localdomain6> References: <20131028095843.14876.6831.stgit@localhost6.localdomain6> Message-ID: Hello, On Mon, Oct 28, 2013 at 1:58 PM, Jussi Kivilinna wrote: > * tests/Makefile.am (TESTS): Add 'bench-slope'. > * tests/bench-slope.c: New. The subject talks about bench-cycles; however, the commit message uses bench-slope. -- With best wishes Dmitry From jussi.kivilinna at iki.fi Mon Oct 28 14:06:03 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Mon, 28 Oct 2013 15:06:03 +0200 Subject: [PATCH] Add new benchmarking utility, bench-cycles In-Reply-To: References: <20131028095843.14876.6831.stgit@localhost6.localdomain6> Message-ID: <526E613B.9020601@iki.fi> On 28.10.2013 14:04, Dmitry Eremin-Solenikov wrote: > Hello, > > On Mon, Oct 28, 2013 at 1:58 PM, Jussi Kivilinna wrote: >> * tests/Makefile.am (TESTS): Add 'bench-slope'. >> * tests/bench-slope.c: New. > > The subject talks about bench-cycles; however, the commit message uses bench-slope. > Ah yes, naming things can sometimes be hard, and names are subject to change over time. I have to check whether there are other places where I kept the old working name.
-Jussi From jussi.kivilinna at iki.fi Mon Oct 28 14:56:02 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Mon, 28 Oct 2013 15:56:02 +0200 Subject: [PATCH] Add new benchmarking utility, bench-cycles In-Reply-To: <878uxdvd8j.fsf@vigenere.g10code.de> References: <20131028095843.14876.6831.stgit@localhost6.localdomain6> <878uxdvd8j.fsf@vigenere.g10code.de> Message-ID: <526E6CF2.1030805@iki.fi> On 28.10.2013 13:36, Werner Koch wrote: > On Mon, 28 Oct 2013 10:58, jussi.kivilinna at iki.fi said: > >> Bench-slope is new benchmarking tool for libgcrypt for obtaining overheadless >> cycles/byte speed of cipher and hash algorithms. Tool measures the time each >> operation (hash/encrypt/decrypt/authentication) takes for different buffer >> sizes of from ~0kB to ~4kB and calculates the slope for these data points. > > Cool. > > You may want to exit (77) if a nsec timer does not exist. The > regression test driver then prints "SKIP". Ok, I'll change this. -Jussi > > > Shalom-Salam, > > Werner > > From cvs at cvs.gnupg.org Mon Oct 28 15:11:49 2013 From: cvs at cvs.gnupg.org (by Jussi Kivilinna) Date: Mon, 28 Oct 2013 15:11:49 +0100 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-340-ge214e83 Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via e214e8392671dd30e9c33260717b5e756debf3bf (commit) from ebc8abfcb09d6106fcfce40f240a513e276f46e9 (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit e214e8392671dd30e9c33260717b5e756debf3bf Author: Jussi Kivilinna Date: Sat Oct 26 15:00:48 2013 +0300 Add new benchmarking utility, bench-slope * tests/Makefile.am (TESTS): Add 'bench-slope'. 
* tests/bench-slope.c: New. -- Bench-slope is new benchmarking tool for libgcrypt for obtaining overheadless cycles/byte speed of cipher and hash algorithms. Tool measures the time each operation (hash/encrypt/decrypt/authentication) takes for different buffer sizes of from ~0kB to ~4kB and calculates the slope for these data points. The default output is then given as nanosecs/byte and mebibytes/sec. If user provides the speed of used CPU, tool also outputs cycles/byte result (CPU-Ghz * ns/B = c/B). Output without CPU speed (with ARM Cortex-A8): $ tests/bench-slope hash Hash: | nanosecs/byte mebibytes/sec cycles/byte MD5 | 7.35 ns/B 129.7 MiB/s - c/B SHA1 | 12.30 ns/B 77.53 MiB/s - c/B RIPEMD160 | 15.96 ns/B 59.77 MiB/s - c/B TIGER192 | 55.55 ns/B 17.17 MiB/s - c/B SHA256 | 24.38 ns/B 39.12 MiB/s - c/B SHA384 | 34.24 ns/B 27.86 MiB/s - c/B SHA512 | 34.19 ns/B 27.90 MiB/s - c/B SHA224 | 24.38 ns/B 39.12 MiB/s - c/B MD4 | 5.68 ns/B 168.0 MiB/s - c/B CRC32 | 9.26 ns/B 103.0 MiB/s - c/B CRC32RFC1510 | 9.20 ns/B 103.6 MiB/s - c/B CRC24RFC2440 | 87.31 ns/B 10.92 MiB/s - c/B WHIRLPOOL | 253.3 ns/B 3.77 MiB/s - c/B TIGER | 55.55 ns/B 17.17 MiB/s - c/B TIGER2 | 55.55 ns/B 17.17 MiB/s - c/B GOSTR3411_94 | 212.0 ns/B 4.50 MiB/s - c/B STRIBOG256 | 630.1 ns/B 1.51 MiB/s - c/B STRIBOG512 | 630.1 ns/B 1.51 MiB/s - c/B = With CPU speed (with Intel i5-4570, 3.2Ghz when turbo-boost disabled): $ tests/bench-slope --cpu-mhz 3201 cipher arcfour blowfish aes Cipher: ARCFOUR | nanosecs/byte mebibytes/sec cycles/byte STREAM enc | 2.43 ns/B 392.1 MiB/s 7.79 c/B STREAM dec | 2.44 ns/B 390.2 MiB/s 7.82 c/B = BLOWFISH | nanosecs/byte mebibytes/sec cycles/byte ECB enc | 7.62 ns/B 125.2 MiB/s 24.38 c/B ECB dec | 7.63 ns/B 125.0 MiB/s 24.43 c/B CBC enc | 9.18 ns/B 103.9 MiB/s 29.38 c/B CBC dec | 2.60 ns/B 366.2 MiB/s 8.34 c/B CFB enc | 9.17 ns/B 104.0 MiB/s 29.35 c/B CFB dec | 2.66 ns/B 358.1 MiB/s 8.53 c/B OFB enc | 8.97 ns/B 106.3 MiB/s 28.72 c/B OFB dec | 8.97 ns/B 106.3 MiB/s 28.71 c/B CTR 
enc | 2.60 ns/B 366.5 MiB/s 8.33 c/B CTR dec | 2.60 ns/B 367.1 MiB/s 8.32 c/B = AES | nanosecs/byte mebibytes/sec cycles/byte ECB enc | 0.439 ns/B 2173.0 MiB/s 1.40 c/B ECB dec | 0.489 ns/B 1949.5 MiB/s 1.57 c/B CBC enc | 1.64 ns/B 580.8 MiB/s 5.26 c/B CBC dec | 0.219 ns/B 4357.6 MiB/s 0.701 c/B CFB enc | 1.53 ns/B 623.6 MiB/s 4.90 c/B CFB dec | 0.219 ns/B 4350.5 MiB/s 0.702 c/B OFB enc | 1.51 ns/B 629.9 MiB/s 4.85 c/B OFB dec | 1.51 ns/B 629.9 MiB/s 4.85 c/B CTR enc | 0.288 ns/B 3308.5 MiB/s 0.923 c/B CTR dec | 0.288 ns/B 3316.9 MiB/s 0.920 c/B CCM enc | 1.93 ns/B 493.8 MiB/s 6.18 c/B CCM dec | 1.93 ns/B 494.0 MiB/s 6.18 c/B CCM auth | 1.64 ns/B 580.1 MiB/s 5.26 c/B = Note: It's highly recommented to disable turbo-boost and dynamic CPU frequency features when making these kind of measurements to reduce variance. Note: The results are maximum performance for each operation; the actual speed in application depends on various matters, such as: used buffer sizes, cache usage, etc. Signed-off-by: Jussi Kivilinna diff --git a/tests/Makefile.am b/tests/Makefile.am index ac84e75..c9ba5f4 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -24,8 +24,8 @@ TESTS = version mpitests tsexp t-convert \ fips186-dsa aeswrap pkcs1v2 random dsa-rfc6979 t-ed25519 -# The last test to run. -TESTS += benchmark +# The last tests to run. +TESTS += benchmark bench-slope # Need to include ../src in addition to top_srcdir because gcrypt.h is diff --git a/tests/bench-slope.c b/tests/bench-slope.c new file mode 100644 index 0000000..62543bc --- /dev/null +++ b/tests/bench-slope.c @@ -0,0 +1,1172 @@ +/* bench-slope.c - for libgcrypt + * Copyright ? 2013 Jussi Kivilinna + * + * This file is part of Libgcrypt. + * + * Libgcrypt is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser general Public License as + * published by the Free Software Foundation; either version 2.1 of + * the License, or (at your option) any later version. 
+ * + * Libgcrypt is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this program; if not, see . + */ + +#ifdef HAVE_CONFIG_H +#include +#endif +#include +#include +#include +#include +#include + +#ifdef _GCRYPT_IN_LIBGCRYPT +#include "../src/gcrypt-int.h" +#include "../compat/libcompat.h" +#else +#include +#endif + +#define PGM "bench-slope" + +static int verbose; + + +/* CPU Ghz value provided by user, allows constructing cycles/byte and other + results. */ +static double cpu_ghz = -1; + + + +/*************************************** Default parameters for measurements. */ + +/* Start at small buffer size, to get reasonable timer calibration for fast + * implementations (AES-NI etc). Sixteen selected to support the largest block + * size of current set cipher blocks. */ +#define BUF_START_SIZE 16 + +/* From ~0 to ~4kbytes gives results comparable with those from academia + * (SUPERCOP). */ +#define BUF_END_SIZE (BUF_START_SIZE + 4096) + +/* With 128 byte steps, we get (4096)/128 = 32 data points. */ +#define BUF_STEP_SIZE 128 + +/* Number of repeated measurements at each data point. The median of these + * measurements is selected as the data point for further analysis. */ +#define NUM_MEASUREMENT_REPETITIONS 32 + +/**************************************************** High-resolution timers. */ + +/* This benchmarking module needs a high-resolution timer.
*/ +#undef NO_GET_NSEC_TIME +#if defined(_WIN32) +struct nsec_time +{ + LARGE_INTEGER perf_count; +}; + +static void +get_nsec_time (struct nsec_time *t) +{ + BOOL ok; + + ok = QueryPerformanceCounter (&t->perf_count); + assert (ok); +} + +static double +get_time_nsec_diff (struct nsec_time *start, struct nsec_time *end) +{ + static double nsecs_per_count = 0.0; + double nsecs; + + if (nsecs_per_count == 0.0) + { + LARGE_INTEGER perf_freq; + BOOL ok; + + /* Get counts per second. */ + ok = QueryPerformanceFrequency (&perf_freq); + assert (ok); + + nsecs_per_count = 1.0 / perf_freq.QuadPart; + nsecs_per_count *= 1000000.0 * 1000.0; /* sec => nsec */ + + assert (nsecs_per_count > 0.0); + } + + nsecs = end->perf_count.QuadPart - start->perf_count.QuadPart; /* counts */ + nsecs *= nsecs_per_count; /* counts * (nsecs / count) => nsecs */ + + return nsecs; +} +#elif defined(HAVE_CLOCK_GETTIME) +struct nsec_time +{ + struct timespec ts; +}; + +static void +get_nsec_time (struct nsec_time *t) +{ + int err; + + err = clock_gettime (CLOCK_REALTIME, &t->ts); + assert (err == 0); +} + +static double +get_time_nsec_diff (struct nsec_time *start, struct nsec_time *end) +{ + double nsecs; + + nsecs = end->ts.tv_sec - start->ts.tv_sec; + nsecs *= 1000000.0 * 1000.0; /* sec => nsec */ + + /* This way we don't have to care if tv_nsec unsigned or signed. */ + if (end->ts.tv_nsec >= start->ts.tv_nsec) + nsecs += end->ts.tv_nsec - start->ts.tv_nsec; + else + nsecs -= start->ts.tv_nsec - end->ts.tv_nsec; + + return nsecs; +} +#elif defined(HAVE_GETTIMEOFDAY) +struct nsec_time +{ + struct timeval tv; +}; + +static void +get_nsec_time (struct nsec_time *t) +{ + int err; + + err = gettimeofday (&t->tv, NULL); + assert (err == 0); +} + +static double +get_time_nsec_diff (struct nsec_time *start, struct nsec_time *end) +{ + double nsecs; + + nsecs = end->tv.tv_sec - start->tv.tv_sec; + nsecs *= 1000000; /* sec => µsec */ + + /* This way we don't have to care if tv_usec unsigned or signed.
*/ + if (end->tv.tv_usec >= start->tv.tv_usec) + nsecs += end->tv.tv_usec - start->tv.tv_usec; + else + nsecs -= start->tv.tv_usec - end->tv.tv_usec; + + nsecs *= 1000; /* µsec => nsec */ + + return nsecs; +} +#else +#define NO_GET_NSEC_TIME 1 +#endif + + +/* If no high resolution timer found, provide dummy bench-slope. */ +#ifdef NO_GET_NSEC_TIME + + +int +main (void) +{ + /* No nsec timer => SKIP test. */ + return 77; +} + + +#else /* !NO_GET_NSEC_TIME */ + + +/********************************************** Slope benchmarking framework. */ + +struct bench_obj +{ + const struct bench_ops *ops; + + unsigned int num_measure_repetitions; + unsigned int min_bufsize; + unsigned int max_bufsize; + unsigned int step_size; + + void *priv; +}; + +typedef int (*const bench_initialize_t) (struct bench_obj * obj); +typedef void (*const bench_finalize_t) (struct bench_obj * obj); +typedef void (*const bench_do_run_t) (struct bench_obj * obj, void *buffer, + size_t buflen); + +struct bench_ops +{ + bench_initialize_t initialize; + bench_finalize_t finalize; + bench_do_run_t do_run; +}; + + +double +get_slope (double (*const get_x) (unsigned int idx, void *priv), + void *get_x_priv, double y_points[], unsigned int npoints, + double *overhead) +{ + double sumx, sumy, sumx2, sumy2, sumxy; + unsigned int i; + double b, a; + + sumx = sumy = sumx2 = sumy2 = sumxy = 0; + + for (i = 0; i < npoints; i++) + { + double x, y; + + x = get_x (i, get_x_priv); /* bytes */ + y = y_points[i]; /* nsecs */ + + sumx += x; + sumy += y; + sumx2 += x * x; + //sumy2 += y * y; + sumxy += x * y; + } + + b = (npoints * sumxy - sumx * sumy) / (npoints * sumx2 - sumx * sumx); + a = (sumy - b * sumx) / npoints; + + if (overhead) + *overhead = a; /* nsecs */ + + return b; /* nsecs per byte */ +} + + +double +get_bench_obj_point_x (unsigned int idx, void *priv) +{ + struct bench_obj *obj = priv; + return (double) (obj->min_bufsize + (idx * obj->step_size)); +} + + +unsigned int +get_num_measurements (struct
bench_obj *obj) +{ + unsigned int buf_range = obj->max_bufsize - obj->min_bufsize; + unsigned int num = buf_range / obj->step_size + 1; + + while (obj->min_bufsize + (num * obj->step_size) > obj->max_bufsize) + num--; + + return num + 1; +} + + +static int +double_cmp (const void *__a, const void *__b) +{ + const double *a, *b; + + a = __a; + b = __b; + + if (*a > *b) + return 1; + if (*a < *b) + return -1; + return 0; +} + + +double +do_bench_obj_measurement (struct bench_obj *obj, void *buffer, size_t buflen, + double *measurement_raw, + unsigned int loop_iterations) +{ + const unsigned int num_repetitions = obj->num_measure_repetitions; + const bench_do_run_t do_run = obj->ops->do_run; + struct nsec_time start, end; + unsigned int rep, loop; + double res; + + if (num_repetitions < 1 || loop_iterations < 1) + return 0.0; + + for (rep = 0; rep < num_repetitions; rep++) + { + get_nsec_time (&start); + + for (loop = 0; loop < loop_iterations; loop++) + do_run (obj, buffer, buflen); + + get_nsec_time (&end); + + measurement_raw[rep] = get_time_nsec_diff (&start, &end); + } + + /* Return median of repeated measurements. */ + qsort (measurement_raw, num_repetitions, sizeof (measurement_raw[0]), + double_cmp); + + if (num_repetitions % 2 == 1) + return measurement_raw[num_repetitions / 2]; + + res = measurement_raw[num_repetitions / 2] + + measurement_raw[num_repetitions / 2 - 1]; + return res / 2; +} + + +unsigned int +adjust_loop_iterations_to_timer_accuracy (struct bench_obj *obj, void *buffer, + double *measurement_raw) +{ + const double increase_thres = 3.0; + double tmp, nsecs; + unsigned int loop_iterations; + unsigned int test_bufsize; + + test_bufsize = obj->min_bufsize; + if (test_bufsize == 0) + test_bufsize += obj->step_size; + + loop_iterations = 0; + do + { + /* Increase loop iterations until we get other results than zero. 
*/ + nsecs = + do_bench_obj_measurement (obj, buffer, test_bufsize, + measurement_raw, ++loop_iterations); + } + while (nsecs < 1.0 - 0.1); + do + { + /* Increase loop iterations until we get reasonable increase for elapsed time. */ + tmp = + do_bench_obj_measurement (obj, buffer, test_bufsize, + measurement_raw, ++loop_iterations); + } + while (tmp < nsecs * (increase_thres - 0.1)); + + return loop_iterations; +} + + +/* Benchmark and return linear regression slope in nanoseconds per byte. */ +double +do_slope_benchmark (struct bench_obj *obj) +{ + unsigned int num_measurements; + double *measurements = NULL; + double *measurement_raw = NULL; + double slope, overhead; + unsigned int loop_iterations, midx, i; + unsigned char *real_buffer = NULL; + unsigned char *buffer; + size_t cur_bufsize; + int err; + + err = obj->ops->initialize (obj); + if (err < 0) + return -1; + + num_measurements = get_num_measurements (obj); + measurements = calloc (num_measurements, sizeof (*measurements)); + if (!measurements) + goto err_free; + + measurement_raw = + calloc (obj->num_measure_repetitions, sizeof (*measurement_raw)); + if (!measurement_raw) + goto err_free; + + if (num_measurements < 1 || obj->num_measure_repetitions < 1 || + obj->max_bufsize < 1 || obj->min_bufsize > obj->max_bufsize) + goto err_free; + + real_buffer = malloc (obj->max_bufsize + 128); + if (!real_buffer) + goto err_free; + /* Get aligned buffer */ + buffer = real_buffer; + buffer += 128 - ((real_buffer - (unsigned char *) 0) & (128 - 1)); + + for (i = 0; i < obj->max_bufsize; i++) + buffer[i] = 0x55 ^ (-i); + + /* Adjust number of loop iterations up to timer accuracy. 
*/ + loop_iterations = adjust_loop_iterations_to_timer_accuracy (obj, buffer, + measurement_raw); + + /* Perform measurements */ + for (midx = 0, cur_bufsize = obj->min_bufsize; + cur_bufsize <= obj->max_bufsize; cur_bufsize += obj->step_size, midx++) + { + measurements[midx] = + do_bench_obj_measurement (obj, buffer, cur_bufsize, measurement_raw, + loop_iterations); + measurements[midx] /= loop_iterations; + } + + assert (midx == num_measurements); + + slope = + get_slope (&get_bench_obj_point_x, obj, measurements, num_measurements, + &overhead); + + free (measurement_raw); + free (real_buffer); + obj->ops->finalize (obj); + + return slope; + +err_free: + if (measurement_raw) + free (measurement_raw); + if (measurements) + free (measurements); + if (real_buffer) + free (real_buffer); + obj->ops->finalize (obj); + + return -1; +} + + +/********************************************************** Printing results. */ + +static void +double_to_str (char *out, size_t outlen, double value) +{ + const char *fmt; + + if (value < 1.0) + fmt = "%.3f"; + else if (value < 100.0) + fmt = "%.2f"; + else + fmt = "%.1f"; + + snprintf (out, outlen, fmt, value); +} + +static void +bench_print_result (double nsecs_per_byte) +{ + double cycles_per_byte, mbytes_per_sec; + char nsecpbyte_buf[16]; + char mbpsec_buf[16]; + char cpbyte_buf[16]; + + strcpy (cpbyte_buf, "-"); + + double_to_str (nsecpbyte_buf, sizeof (nsecpbyte_buf), nsecs_per_byte); + + /* If user didn't provide CPU speed, we cannot show cycles/byte results. 
*/ + if (cpu_ghz > 0.0) + { + cycles_per_byte = nsecs_per_byte * cpu_ghz; + double_to_str (cpbyte_buf, sizeof (cpbyte_buf), cycles_per_byte); + } + + mbytes_per_sec = + (1000.0 * 1000.0 * 1000.0) / (nsecs_per_byte * 1024 * 1024); + double_to_str (mbpsec_buf, sizeof (mbpsec_buf), mbytes_per_sec); + + strncat (nsecpbyte_buf, " ns/B", sizeof (nsecpbyte_buf) - 1); + strncat (mbpsec_buf, " MiB/s", sizeof (mbpsec_buf) - 1); + strncat (cpbyte_buf, " c/B", sizeof (cpbyte_buf) - 1); + + printf ("%14s %15s %13s\n", nsecpbyte_buf, mbpsec_buf, cpbyte_buf); +} + +static void +bench_print_header (const char *algo_name) +{ + printf (" %-14s | ", algo_name); + printf ("%14s %15s %13s\n", "nanosecs/byte", "mebibytes/sec", + "cycles/byte"); +} + +static void +bench_print_footer (void) +{ + printf (" %-14s =\n", ""); +} + + +/********************************************************* Cipher benchmarks. */ + +struct bench_cipher_mode +{ + int mode; + const char *name; + struct bench_ops *ops; + + int algo; +}; + + +static int +bench_encrypt_init (struct bench_obj *obj) +{ + struct bench_cipher_mode *mode = obj->priv; + gcry_cipher_hd_t hd; + int err, keylen; + + obj->min_bufsize = BUF_START_SIZE; + obj->max_bufsize = BUF_END_SIZE; + obj->step_size = BUF_STEP_SIZE; + obj->num_measure_repetitions = NUM_MEASUREMENT_REPETITIONS; + + err = gcry_cipher_open (&hd, mode->algo, mode->mode, 0); + if (err) + { + fprintf (stderr, PGM ": error opening cipher `%s'\n", + gcry_cipher_algo_name (mode->algo)); + exit (1); + } + + keylen = gcry_cipher_get_algo_keylen (mode->algo); + if (keylen) + { + char key[keylen]; + int i; + + for (i = 0; i < keylen; i++) + key[i] = 0x33 ^ (11 - i); + + err = gcry_cipher_setkey (hd, key, keylen); + if (err) + { + fprintf (stderr, PGM ": gcry_cipher_setkey failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } + } + else + { + fprintf (stderr, PGM ": failed to get key length for algorithm `%s'\n", + gcry_cipher_algo_name (mode->algo)); + 
gcry_cipher_close (hd); + exit (1); + } + + obj->priv = hd; + + return 0; +} + +static void +bench_encrypt_free (struct bench_obj *obj) +{ + gcry_cipher_hd_t hd = obj->priv; + + gcry_cipher_close (hd); +} + +static void +bench_encrypt_do_bench (struct bench_obj *obj, void *buf, size_t buflen) +{ + gcry_cipher_hd_t hd = obj->priv; + int err; + + err = gcry_cipher_encrypt (hd, buf, buflen, buf, buflen); + if (err) + { + fprintf (stderr, PGM ": gcry_cipher_encrypt failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } +} + +static void +bench_decrypt_do_bench (struct bench_obj *obj, void *buf, size_t buflen) +{ + gcry_cipher_hd_t hd = obj->priv; + int err; + + err = gcry_cipher_decrypt (hd, buf, buflen, buf, buflen); + if (err) + { + fprintf (stderr, PGM ": gcry_cipher_encrypt failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } +} + +static struct bench_ops encrypt_ops = { + &bench_encrypt_init, + &bench_encrypt_free, + &bench_encrypt_do_bench +}; + +static struct bench_ops decrypt_ops = { + &bench_encrypt_init, + &bench_encrypt_free, + &bench_decrypt_do_bench +}; + + + +static void +bench_ccm_encrypt_do_bench (struct bench_obj *obj, void *buf, size_t buflen) +{ + gcry_cipher_hd_t hd = obj->priv; + int err; + char tag[8]; + char nonce[11] = { 0x80, 0x01, }; + size_t params[3]; + + gcry_cipher_setiv (hd, nonce, sizeof (nonce)); + + /* Set CCM lengths */ + params[0] = buflen; + params[1] = 0; /*aadlen */ + params[2] = sizeof (tag); + err = + gcry_cipher_ctl (hd, GCRYCTL_SET_CCM_LENGTHS, params, sizeof (params)); + if (err) + { + fprintf (stderr, PGM ": gcry_cipher_ctl failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } + + err = gcry_cipher_encrypt (hd, buf, buflen, buf, buflen); + if (err) + { + fprintf (stderr, PGM ": gcry_cipher_encrypt failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } + + err = gcry_cipher_gettag (hd, tag, sizeof (tag)); + if (err) + { + 
fprintf (stderr, PGM ": gcry_cipher_gettag failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } +} + +static void +bench_ccm_decrypt_do_bench (struct bench_obj *obj, void *buf, size_t buflen) +{ + gcry_cipher_hd_t hd = obj->priv; + int err; + char tag[8] = { 0, }; + char nonce[11] = { 0x80, 0x01, }; + size_t params[3]; + + gcry_cipher_setiv (hd, nonce, sizeof (nonce)); + + /* Set CCM lengths */ + params[0] = buflen; + params[1] = 0; /*aadlen */ + params[2] = sizeof (tag); + err = + gcry_cipher_ctl (hd, GCRYCTL_SET_CCM_LENGTHS, params, sizeof (params)); + if (err) + { + fprintf (stderr, PGM ": gcry_cipher_ctl failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } + + err = gcry_cipher_decrypt (hd, buf, buflen, buf, buflen); + if (err) + { + fprintf (stderr, PGM ": gcry_cipher_encrypt failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } + + err = gcry_cipher_checktag (hd, tag, sizeof (tag)); + if (gpg_err_code (err) == GPG_ERR_CHECKSUM) + err = gpg_error (GPG_ERR_NO_ERROR); + if (err) + { + fprintf (stderr, PGM ": gcry_cipher_gettag failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } +} + +static void +bench_ccm_authenticate_do_bench (struct bench_obj *obj, void *buf, + size_t buflen) +{ + gcry_cipher_hd_t hd = obj->priv; + int err; + char tag[8] = { 0, }; + char nonce[11] = { 0x80, 0x01, }; + size_t params[3]; + char data = 0xff; + + gcry_cipher_setiv (hd, nonce, sizeof (nonce)); + + /* Set CCM lengths */ + params[0] = sizeof (data); /*datalen */ + params[1] = buflen; /*aadlen */ + params[2] = sizeof (tag); + err = + gcry_cipher_ctl (hd, GCRYCTL_SET_CCM_LENGTHS, params, sizeof (params)); + if (err) + { + fprintf (stderr, PGM ": gcry_cipher_ctl failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } + + err = gcry_cipher_authenticate (hd, buf, buflen); + if (err) + { + fprintf (stderr, PGM ": gcry_cipher_authenticate failed: %s\n", + 
gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } + + err = gcry_cipher_encrypt (hd, &data, sizeof (data), &data, sizeof (data)); + if (err) + { + fprintf (stderr, PGM ": gcry_cipher_encrypt failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } + + err = gcry_cipher_gettag (hd, tag, sizeof (tag)); + if (err) + { + fprintf (stderr, PGM ": gcry_cipher_gettag failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } +} + +static struct bench_ops ccm_encrypt_ops = { + &bench_encrypt_init, + &bench_encrypt_free, + &bench_ccm_encrypt_do_bench +}; + +static struct bench_ops ccm_decrypt_ops = { + &bench_encrypt_init, + &bench_encrypt_free, + &bench_ccm_decrypt_do_bench +}; + +static struct bench_ops ccm_authenticate_ops = { + &bench_encrypt_init, + &bench_encrypt_free, + &bench_ccm_authenticate_do_bench +}; + + +static struct bench_cipher_mode cipher_modes[] = { + {GCRY_CIPHER_MODE_ECB, "ECB enc", &encrypt_ops}, + {GCRY_CIPHER_MODE_ECB, "ECB dec", &decrypt_ops}, + {GCRY_CIPHER_MODE_CBC, "CBC enc", &encrypt_ops}, + {GCRY_CIPHER_MODE_CBC, "CBC dec", &decrypt_ops}, + {GCRY_CIPHER_MODE_CFB, "CFB enc", &encrypt_ops}, + {GCRY_CIPHER_MODE_CFB, "CFB dec", &decrypt_ops}, + {GCRY_CIPHER_MODE_OFB, "OFB enc", &encrypt_ops}, + {GCRY_CIPHER_MODE_OFB, "OFB dec", &decrypt_ops}, + {GCRY_CIPHER_MODE_CTR, "CTR enc", &encrypt_ops}, + {GCRY_CIPHER_MODE_CTR, "CTR dec", &decrypt_ops}, + {GCRY_CIPHER_MODE_CCM, "CCM enc", &ccm_encrypt_ops}, + {GCRY_CIPHER_MODE_CCM, "CCM dec", &ccm_decrypt_ops}, + {GCRY_CIPHER_MODE_CCM, "CCM auth", &ccm_authenticate_ops}, + {0}, +}; + + +static void +cipher_bench_one (int algo, struct bench_cipher_mode *pmode) +{ + struct bench_cipher_mode mode = *pmode; + struct bench_obj obj = { 0 }; + double result; + unsigned int blklen; + + mode.algo = algo; + + /* Check if this mode is ok */ + blklen = gcry_cipher_get_algo_blklen (algo); + if (!blklen) + return; + + /* Stream cipher? Only test with ECB. 
*/ + if (blklen == 1 && mode.mode != GCRY_CIPHER_MODE_ECB) + return; + if (blklen == 1 && mode.mode == GCRY_CIPHER_MODE_ECB) + { + mode.mode = GCRY_CIPHER_MODE_STREAM; + mode.name = mode.ops == &encrypt_ops ? "STREAM enc" : "STREAM dec"; + } + + /* CCM has restrictions for block-size */ + if (mode.mode == GCRY_CIPHER_MODE_CCM && blklen != GCRY_CCM_BLOCK_LEN) + return; + + printf (" %14s | ", mode.name); + fflush (stdout); + + obj.ops = mode.ops; + obj.priv = &mode; + + result = do_slope_benchmark (&obj); + + bench_print_result (result); +} + + +static void +__cipher_bench (int algo) +{ + const char *algoname; + int i; + + algoname = gcry_cipher_algo_name (algo); + + bench_print_header (algoname); + + for (i = 0; cipher_modes[i].mode; i++) + cipher_bench_one (algo, &cipher_modes[i]); + + bench_print_footer (); +} + + +void +cipher_bench (char **argv, int argc) +{ + int i, algo; + + printf ("Cipher:\n"); + + if (argv && argc) + { + for (i = 0; i < argc; i++) + { + algo = gcry_cipher_map_name (argv[i]); + if (algo) + __cipher_bench (algo); + } + } + else + { + for (i = 1; i < 400; i++) + if (!gcry_cipher_test_algo (i)) + __cipher_bench (i); + } +} + + +/*********************************************************** Hash benchmarks. 
*/ + +struct bench_hash_mode +{ + const char *name; + struct bench_ops *ops; + + int algo; +}; + + +static int +bench_hash_init (struct bench_obj *obj) +{ + struct bench_hash_mode *mode = obj->priv; + gcry_md_hd_t hd; + int err; + + obj->min_bufsize = BUF_START_SIZE; + obj->max_bufsize = BUF_END_SIZE; + obj->step_size = BUF_STEP_SIZE; + obj->num_measure_repetitions = NUM_MEASUREMENT_REPETITIONS; + + err = gcry_md_open (&hd, mode->algo, 0); + if (err) + { + fprintf (stderr, PGM ": error opening hash `%s'\n", + gcry_md_algo_name (mode->algo)); + exit (1); + } + + obj->priv = hd; + + return 0; +} + +static void +bench_hash_free (struct bench_obj *obj) +{ + gcry_md_hd_t hd = obj->priv; + + gcry_md_close (hd); +} + +static void +bench_hash_do_bench (struct bench_obj *obj, void *buf, size_t buflen) +{ + gcry_md_hd_t hd = obj->priv; + + gcry_md_write (hd, buf, buflen); + gcry_md_final (hd); +} + +static struct bench_ops hash_ops = { + &bench_hash_init, + &bench_hash_free, + &bench_hash_do_bench +}; + + +static struct bench_hash_mode hash_modes[] = { + {"", &hash_ops}, + {0}, +}; + + +static void +hash_bench_one (int algo, struct bench_hash_mode *pmode) +{ + struct bench_hash_mode mode = *pmode; + struct bench_obj obj = { 0 }; + double result; + + mode.algo = algo; + + if (mode.name[0] == '\0') + printf (" %-14s | ", gcry_md_algo_name (algo)); + else + printf (" %14s | ", mode.name); + fflush (stdout); + + obj.ops = mode.ops; + obj.priv = &mode; + + result = do_slope_benchmark (&obj); + + bench_print_result (result); +} + +static void +__hash_bench (int algo) +{ + int i; + + for (i = 0; hash_modes[i].name; i++) + hash_bench_one (algo, &hash_modes[i]); +} + +void +hash_bench (char **argv, int argc) +{ + int i, algo; + + printf ("Hash:\n"); + + bench_print_header (""); + + if (argv && argc) + { + for (i = 0; i < argc; i++) + { + algo = gcry_md_map_name (argv[i]); + if (algo) + __hash_bench (algo); + } + } + else + { + for (i = 1; i < 400; i++) + if (!gcry_md_test_algo (i)) + 
__hash_bench (i); + } + + bench_print_footer (); +} + + +/************************************************************** Main program. */ + +void +print_help (void) +{ + static const char *help_lines[] = { + "usage: bench-slope [options] [hash|cipher [algonames]]", + "", + " options:", + " --cpu-mhz Set CPU speed for calculating cycles per bytes", + " results.", + " --disable-hwf Disable hardware acceleration feature(s) for", + " benchmarking.", + NULL + }; + const char **line; + + for (line = help_lines; *line; line++) + fprintf (stdout, "%s\n", *line); +} + + +/* Warm up CPU. */ +static void +warm_up_cpu (void) +{ + struct nsec_time start, end; + + get_nsec_time (&start); + do + { + get_nsec_time (&end); + } + while (get_time_nsec_diff (&start, &end) < 1000.0 * 1000.0 * 1000.0); +} + + +int +main (int argc, char **argv) +{ + int last_argc = -1; + int debug = 0; + + if (argc) + { + argc--; + argv++; + } + + while (argc && last_argc != argc) + { + last_argc = argc; + + if (!strcmp (*argv, "--")) + { + argc--; + argv++; + break; + } + else if (!strcmp (*argv, "--help")) + { + print_help (); + exit (0); + } + else if (!strcmp (*argv, "--verbose")) + { + verbose++; + argc--; + argv++; + } + else if (!strcmp (*argv, "--debug")) + { + verbose += 2; + debug++; + argc--; + argv++; + } + else if (!strcmp (*argv, "--disable-hwf")) + { + argc--; + argv++; + if (argc) + { + if (gcry_control (GCRYCTL_DISABLE_HWF, *argv, NULL)) + fprintf (stderr, + PGM + ": unknown hardware feature `%s' - option ignored\n", + *argv); + argc--; + argv++; + } + } + else if (!strcmp (*argv, "--cpu-mhz")) + { + argc--; + argv++; + if (argc) + { + cpu_ghz = atof (*argv); + cpu_ghz /= 1000; /* Mhz => Ghz */ + + argc--; + argv++; + } + } + } + + gcry_control (GCRYCTL_SET_VERBOSITY, (int) verbose); + + if (!gcry_check_version (GCRYPT_VERSION)) + { + fprintf (stderr, PGM ": version mismatch; pgm=%s, library=%s\n", + GCRYPT_VERSION, gcry_check_version (NULL)); + exit (1); + } + + if (debug) + 
gcry_control (GCRYCTL_SET_DEBUG_FLAGS, 1u, 0); + + gcry_control (GCRYCTL_DISABLE_SECMEM, 0); + gcry_control (GCRYCTL_INITIALIZATION_FINISHED, 0); + gcry_control (GCRYCTL_ENABLE_QUICK_RANDOM, 0); + + if (!argc) + { + warm_up_cpu (); + hash_bench (NULL, 0); + cipher_bench (NULL, 0); + } + else if (!strcmp (*argv, "hash")) + { + argc--; + argv++; + + warm_up_cpu (); + hash_bench ((argc == 0) ? NULL : argv, argc); + } + else if (!strcmp (*argv, "cipher")) + { + argc--; + argv++; + + warm_up_cpu (); + cipher_bench ((argc == 0) ? NULL : argv, argc); + } + else + { + fprintf (stderr, PGM ": unknown argument: %s\n", *argv); + print_help (); + } + + return 0; +} + +#endif /* !NO_GET_NSEC_TIME */ ----------------------------------------------------------------------- Summary of changes: tests/Makefile.am | 4 +- tests/bench-slope.c | 1172 +++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 1174 insertions(+), 2 deletions(-) create mode 100644 tests/bench-slope.c hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From cvs at cvs.gnupg.org Mon Oct 28 15:18:07 2013 From: cvs at cvs.gnupg.org (by Jussi Kivilinna) Date: Mon, 28 Oct 2013 15:18:07 +0100 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-342-g3ff9d25 Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via 3ff9d2571c18cd7a34359f9c60a10d3b0f932b23 (commit) via 5a3d43485efdc09912be0967ee0a3ce345b3b15a (commit) from e214e8392671dd30e9c33260717b5e756debf3bf (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. 
- Log -----------------------------------------------------------------

commit 3ff9d2571c18cd7a34359f9c60a10d3b0f932b23
Author: Jussi Kivilinna
Date: Sat Oct 26 15:00:48 2013 +0300

    Add ARM NEON assembly implementation of Salsa20

    * cipher/Makefile.am: Add 'salsa20-armv7-neon.S'.
    * cipher/salsa20-armv7-neon.S: New.
    * cipher/salsa20.c [USE_ARM_NEON_ASM]: New macro.
    (struct SALSA20_context_s, salsa20_core_t, salsa20_keysetup_t)
    (salsa20_ivsetup_t): New.
    (SALSA20_context_t) [USE_ARM_NEON_ASM]: Add 'use_neon'.
    (SALSA20_context_t): Add 'keysetup', 'ivsetup' and 'core'.
    (salsa20_core): Change 'src' argument to 'ctx'.
    [USE_ARM_NEON_ASM] (_gcry_arm_neon_salsa20_encrypt): New prototype.
    [USE_ARM_NEON_ASM] (salsa20_core_neon, salsa20_keysetup_neon)
    (salsa20_ivsetup_neon): New.
    (salsa20_do_setkey): Set up keysetup, ivsetup and core with the
    default functions.
    (salsa20_do_setkey) [USE_ARM_NEON_ASM]: When NEON support is
    detected, set keysetup, ivsetup and core to the ARM NEON functions.
    (salsa20_do_setkey): Call 'ctx->keysetup'.
    (salsa20_setiv): Call 'ctx->ivsetup'.
    (salsa20_do_encrypt_stream) [USE_ARM_NEON_ASM]: Process large
    buffers in the ARM NEON implementation.
    (salsa20_do_encrypt_stream): Call 'ctx->core' instead of directly
    calling 'salsa20_core'.
    (selftest): Add test to check large-buffer processing and block
    counter updating.
    * configure.ac [neonsupport]: Add 'salsa20-armv7-neon.lo'.
    --

    This patch adds a fast ARM NEON assembly implementation of Salsa20.
    The implementation gains extra speed by processing three blocks in
    parallel with the help of the ARM NEON vector processing unit.

    This implementation is based on public domain code by Peter Schwabe
    and D. J. Bernstein, available in the SUPERCOP benchmarking
    framework. For more details on this work, see the paper
    "NEON crypto" by Daniel J.
    Bernstein and Peter Schwabe:
      http://cryptojedi.org/papers/#neoncrypto

    Benchmark results on Cortex-A8 (1008 MHz):

    Before:
     SALSA20        |  nanosecs/byte   mebibytes/sec   cycles/byte
        STREAM enc  |     18.88 ns/B     50.51 MiB/s     19.03 c/B
        STREAM dec  |     18.89 ns/B     50.49 MiB/s     19.04 c/B
     =
     SALSA20R12     |  nanosecs/byte   mebibytes/sec   cycles/byte
        STREAM enc  |     13.60 ns/B     70.14 MiB/s     13.71 c/B
        STREAM dec  |     13.60 ns/B     70.13 MiB/s     13.71 c/B

    After:
     SALSA20        |  nanosecs/byte   mebibytes/sec   cycles/byte
        STREAM enc  |      5.48 ns/B     174.1 MiB/s      5.52 c/B
        STREAM dec  |      5.47 ns/B     174.2 MiB/s      5.52 c/B
     =
     SALSA20R12     |  nanosecs/byte   mebibytes/sec   cycles/byte
        STREAM enc  |      3.65 ns/B     260.9 MiB/s      3.68 c/B
        STREAM dec  |      3.65 ns/B     261.6 MiB/s      3.67 c/B

    Signed-off-by: Jussi Kivilinna

diff --git a/cipher/Makefile.am b/cipher/Makefile.am
index e786713..95d484e 100644
--- a/cipher/Makefile.am
+++ b/cipher/Makefile.am
@@ -71,7 +71,7 @@ md5.c \
 rijndael.c rijndael-tables.h rijndael-amd64.S rijndael-arm.S \
 rmd160.c \
 rsa.c \
-salsa20.c salsa20-amd64.S \
+salsa20.c salsa20-amd64.S salsa20-armv7-neon.S \
 scrypt.c \
 seed.c \
 serpent.c serpent-sse2-amd64.S serpent-avx2-amd64.S \
diff --git a/cipher/salsa20-armv7-neon.S b/cipher/salsa20-armv7-neon.S
new file mode 100644
index 0000000..5b51301
--- /dev/null
+++ b/cipher/salsa20-armv7-neon.S
@@ -0,0 +1,899 @@
+/* salsa-armv7-neon.S - ARM NEON implementation of Salsa20 cipher
+ *
+ * Copyright © 2013 Jussi Kivilinna
+ *
+ * This file is part of Libgcrypt.
+ *
+ * Libgcrypt is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as
+ * published by the Free Software Foundation; either version 2.1 of
+ * the License, or (at your option) any later version.
+ *
+ * Libgcrypt is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU Lesser General Public License for more details.
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with this program; if not, see . + */ + +#include + +#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) && \ + defined(HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS) && \ + defined(HAVE_GCC_INLINE_ASM_NEON) && defined(USE_SALSA20) + +/* + * Based on public domain implementation from SUPERCOP benchmarking framework + * by Peter Schwabe and D. J. Bernstein. Paper about the implementation at: + * http://cryptojedi.org/papers/#neoncrypto + */ + +.syntax unified +.arm +.fpu neon +.text + +.align 2 +.global _gcry_arm_neon_salsa20_encrypt +.type _gcry_arm_neon_salsa20_encrypt,%function; +_gcry_arm_neon_salsa20_encrypt: + /* Modifications: + * - arguments changed to (void *c, const void *m, unsigned int nblks, + * void *ctx, unsigned int rounds) from (void *c, const void *m, + * unsigned long long mlen, const void *n, const void *k) + * - nonce and key read from 'ctx' as well as sigma and counter. + * - read in counter from 'ctx' at the start. + * - update counter in 'ctx' at the end. + * - length is input as number of blocks, so don't handle tail bytes + * (this is done in salsa20.c). + */ + lsl r2,r2,#6 + vpush {q4,q5,q6,q7} + mov r12,sp + sub sp,sp,#352 + and sp,sp,#0xffffffe0 + strd r4,[sp,#0] + strd r6,[sp,#8] + strd r8,[sp,#16] + strd r10,[sp,#24] + str r14,[sp,#224] + str r12,[sp,#228] + str r0,[sp,#232] + str r1,[sp,#236] + str r2,[sp,#240] + ldr r4,[r12,#64] + str r4,[sp,#244] + mov r2,r3 + add r3,r2,#48 + vld1.8 {q3},[r2] + add r0,r2,#32 + add r14,r2,#40 + vmov.i64 q3,#0xff + str r14,[sp,#160] + ldrd r8,[r2,#4] + vld1.8 {d0},[r0] + ldrd r4,[r2,#20] + vld1.8 {d8-d9},[r2]! 
+ ldrd r6,[r0,#0] + vmov d4,d9 + ldr r0,[r14] + vrev64.i32 d0,d0 + ldr r1,[r14,#4] + vld1.8 {d10-d11},[r2] + strd r6,[sp,#32] + sub r2,r2,#16 + strd r0,[sp,#40] + vmov d5,d11 + strd r8,[sp,#48] + vext.32 d1,d0,d10,#1 + strd r4,[sp,#56] + ldr r1,[r2,#0] + vshr.u32 q3,q3,#7 + ldr r4,[r2,#12] + vext.32 d3,d11,d9,#1 + ldr r11,[r2,#16] + vext.32 d2,d8,d0,#1 + ldr r8,[r2,#28] + vext.32 d0,d10,d8,#1 + ldr r0,[r3,#0] + add r2,r2,#44 + vmov q4,q3 + vld1.8 {d6-d7},[r14] + vadd.i64 q3,q3,q4 + ldr r5,[r3,#4] + add r12,sp,#256 + vst1.8 {d4-d5},[r12,: 128] + ldr r10,[r3,#8] + add r14,sp,#272 + vst1.8 {d2-d3},[r14,: 128] + ldr r9,[r3,#12] + vld1.8 {d2-d3},[r3] + strd r0,[sp,#64] + ldr r0,[sp,#240] + strd r4,[sp,#72] + strd r10,[sp,#80] + strd r8,[sp,#88] + nop + cmp r0,#192 + blo ._mlenlowbelow192 +._mlenatleast192: + ldrd r2,[sp,#48] + vext.32 d7,d6,d6,#1 + vmov q8,q1 + ldrd r6,[sp,#32] + vld1.8 {d18-d19},[r12,: 128] + vmov q10,q0 + str r0,[sp,#240] + vext.32 d4,d7,d19,#1 + vmov q11,q8 + vext.32 d10,d18,d7,#1 + vadd.i64 q3,q3,q4 + ldrd r0,[sp,#64] + vld1.8 {d24-d25},[r14,: 128] + vmov d5,d24 + add r8,sp,#288 + ldrd r4,[sp,#72] + vmov d11,d25 + add r9,sp,#304 + ldrd r10,[sp,#80] + vst1.8 {d4-d5},[r8,: 128] + strd r2,[sp,#96] + vext.32 d7,d6,d6,#1 + vmov q13,q10 + strd r6,[sp,#104] + vmov d13,d24 + vst1.8 {d10-d11},[r9,: 128] + add r2,sp,#320 + vext.32 d12,d7,d19,#1 + vmov d15,d25 + add r6,sp,#336 + ldr r12,[sp,#244] + vext.32 d14,d18,d7,#1 + vadd.i64 q3,q3,q4 + ldrd r8,[sp,#88] + vst1.8 {d12-d13},[r2,: 128] + ldrd r2,[sp,#56] + vst1.8 {d14-d15},[r6,: 128] + ldrd r6,[sp,#40] +._mainloop2: + str r12,[sp,#248] + vadd.i32 q4,q10,q8 + vadd.i32 q9,q13,q11 + add r12,r0,r2 + add r14,r5,r1 + vshl.i32 q12,q4,#7 + vshl.i32 q14,q9,#7 + vshr.u32 q4,q4,#25 + vshr.u32 q9,q9,#25 + eor r4,r4,r12,ROR #25 + eor r7,r7,r14,ROR #25 + add r12,r4,r0 + add r14,r7,r5 + veor q5,q5,q12 + veor q7,q7,q14 + veor q4,q5,q4 + veor q5,q7,q9 + eor r6,r6,r12,ROR #23 + eor r3,r3,r14,ROR #23 + add r12,r6,r4 + str 
r7,[sp,#116] + add r7,r3,r7 + ldr r14,[sp,#108] + vadd.i32 q7,q8,q4 + vadd.i32 q9,q11,q5 + vshl.i32 q12,q7,#9 + vshl.i32 q14,q9,#9 + vshr.u32 q7,q7,#23 + vshr.u32 q9,q9,#23 + veor q2,q2,q12 + veor q6,q6,q14 + veor q2,q2,q7 + veor q6,q6,q9 + eor r2,r2,r12,ROR #19 + str r2,[sp,#120] + eor r1,r1,r7,ROR #19 + ldr r7,[sp,#96] + add r2,r2,r6 + str r6,[sp,#112] + add r6,r1,r3 + ldr r12,[sp,#104] + vadd.i32 q7,q4,q2 + vext.32 q4,q4,q4,#3 + vadd.i32 q9,q5,q6 + vshl.i32 q12,q7,#13 + vext.32 q5,q5,q5,#3 + vshl.i32 q14,q9,#13 + eor r0,r0,r2,ROR #14 + eor r2,r5,r6,ROR #14 + str r3,[sp,#124] + add r3,r10,r12 + ldr r5,[sp,#100] + add r6,r9,r11 + vshr.u32 q7,q7,#19 + vshr.u32 q9,q9,#19 + veor q10,q10,q12 + veor q12,q13,q14 + eor r8,r8,r3,ROR #25 + eor r3,r5,r6,ROR #25 + add r5,r8,r10 + add r6,r3,r9 + veor q7,q10,q7 + veor q9,q12,q9 + eor r5,r7,r5,ROR #23 + eor r6,r14,r6,ROR #23 + add r7,r5,r8 + add r14,r6,r3 + vadd.i32 q10,q2,q7 + vswp d4,d5 + vadd.i32 q12,q6,q9 + vshl.i32 q13,q10,#18 + vswp d12,d13 + vshl.i32 q14,q12,#18 + eor r7,r12,r7,ROR #19 + eor r11,r11,r14,ROR #19 + add r12,r7,r5 + add r14,r11,r6 + vshr.u32 q10,q10,#14 + vext.32 q7,q7,q7,#1 + vshr.u32 q12,q12,#14 + veor q8,q8,q13 + vext.32 q9,q9,q9,#1 + veor q11,q11,q14 + eor r10,r10,r12,ROR #14 + eor r9,r9,r14,ROR #14 + add r12,r0,r3 + add r14,r2,r4 + veor q8,q8,q10 + veor q10,q11,q12 + eor r1,r1,r12,ROR #25 + eor r7,r7,r14,ROR #25 + add r12,r1,r0 + add r14,r7,r2 + vadd.i32 q11,q4,q8 + vadd.i32 q12,q5,q10 + vshl.i32 q13,q11,#7 + vshl.i32 q14,q12,#7 + eor r5,r5,r12,ROR #23 + eor r6,r6,r14,ROR #23 + vshr.u32 q11,q11,#25 + vshr.u32 q12,q12,#25 + add r12,r5,r1 + add r14,r6,r7 + veor q7,q7,q13 + veor q9,q9,q14 + veor q7,q7,q11 + veor q9,q9,q12 + vadd.i32 q11,q8,q7 + vadd.i32 q12,q10,q9 + vshl.i32 q13,q11,#9 + vshl.i32 q14,q12,#9 + eor r3,r3,r12,ROR #19 + str r7,[sp,#104] + eor r4,r4,r14,ROR #19 + ldr r7,[sp,#112] + add r12,r3,r5 + str r6,[sp,#108] + add r6,r4,r6 + ldr r14,[sp,#116] + eor r0,r0,r12,ROR #14 + str r5,[sp,#96] + 
eor r5,r2,r6,ROR #14 + ldr r2,[sp,#120] + vshr.u32 q11,q11,#23 + vshr.u32 q12,q12,#23 + veor q2,q2,q13 + veor q6,q6,q14 + veor q2,q2,q11 + veor q6,q6,q12 + add r6,r10,r14 + add r12,r9,r8 + vadd.i32 q11,q7,q2 + vext.32 q7,q7,q7,#3 + vadd.i32 q12,q9,q6 + vshl.i32 q13,q11,#13 + vext.32 q9,q9,q9,#3 + vshl.i32 q14,q12,#13 + vshr.u32 q11,q11,#19 + vshr.u32 q12,q12,#19 + eor r11,r11,r6,ROR #25 + eor r2,r2,r12,ROR #25 + add r6,r11,r10 + str r3,[sp,#100] + add r3,r2,r9 + ldr r12,[sp,#124] + veor q4,q4,q13 + veor q5,q5,q14 + veor q4,q4,q11 + veor q5,q5,q12 + eor r6,r7,r6,ROR #23 + eor r3,r12,r3,ROR #23 + add r7,r6,r11 + add r12,r3,r2 + vadd.i32 q11,q2,q4 + vswp d4,d5 + vadd.i32 q12,q6,q5 + vshl.i32 q13,q11,#18 + vswp d12,d13 + vshl.i32 q14,q12,#18 + eor r7,r14,r7,ROR #19 + eor r8,r8,r12,ROR #19 + add r12,r7,r6 + add r14,r8,r3 + vshr.u32 q11,q11,#14 + vext.32 q4,q4,q4,#1 + vshr.u32 q12,q12,#14 + veor q8,q8,q13 + vext.32 q5,q5,q5,#1 + veor q10,q10,q14 + eor r10,r10,r12,ROR #14 + veor q8,q8,q11 + eor r9,r9,r14,ROR #14 + veor q10,q10,q12 + vadd.i32 q11,q7,q8 + vadd.i32 q12,q9,q10 + add r12,r0,r2 + add r14,r5,r1 + vshl.i32 q13,q11,#7 + vshl.i32 q14,q12,#7 + vshr.u32 q11,q11,#25 + vshr.u32 q12,q12,#25 + eor r4,r4,r12,ROR #25 + eor r7,r7,r14,ROR #25 + add r12,r4,r0 + add r14,r7,r5 + veor q4,q4,q13 + veor q5,q5,q14 + veor q4,q4,q11 + veor q5,q5,q12 + eor r6,r6,r12,ROR #23 + eor r3,r3,r14,ROR #23 + add r12,r6,r4 + str r7,[sp,#116] + add r7,r3,r7 + ldr r14,[sp,#108] + vadd.i32 q11,q8,q4 + vadd.i32 q12,q10,q5 + vshl.i32 q13,q11,#9 + vshl.i32 q14,q12,#9 + vshr.u32 q11,q11,#23 + vshr.u32 q12,q12,#23 + veor q2,q2,q13 + veor q6,q6,q14 + veor q2,q2,q11 + veor q6,q6,q12 + eor r2,r2,r12,ROR #19 + str r2,[sp,#120] + eor r1,r1,r7,ROR #19 + ldr r7,[sp,#96] + add r2,r2,r6 + str r6,[sp,#112] + add r6,r1,r3 + ldr r12,[sp,#104] + vadd.i32 q11,q4,q2 + vext.32 q4,q4,q4,#3 + vadd.i32 q12,q5,q6 + vshl.i32 q13,q11,#13 + vext.32 q5,q5,q5,#3 + vshl.i32 q14,q12,#13 + eor r0,r0,r2,ROR #14 + eor r2,r5,r6,ROR 
#14 + str r3,[sp,#124] + add r3,r10,r12 + ldr r5,[sp,#100] + add r6,r9,r11 + vshr.u32 q11,q11,#19 + vshr.u32 q12,q12,#19 + veor q7,q7,q13 + veor q9,q9,q14 + eor r8,r8,r3,ROR #25 + eor r3,r5,r6,ROR #25 + add r5,r8,r10 + add r6,r3,r9 + veor q7,q7,q11 + veor q9,q9,q12 + eor r5,r7,r5,ROR #23 + eor r6,r14,r6,ROR #23 + add r7,r5,r8 + add r14,r6,r3 + vadd.i32 q11,q2,q7 + vswp d4,d5 + vadd.i32 q12,q6,q9 + vshl.i32 q13,q11,#18 + vswp d12,d13 + vshl.i32 q14,q12,#18 + eor r7,r12,r7,ROR #19 + eor r11,r11,r14,ROR #19 + add r12,r7,r5 + add r14,r11,r6 + vshr.u32 q11,q11,#14 + vext.32 q7,q7,q7,#1 + vshr.u32 q12,q12,#14 + veor q8,q8,q13 + vext.32 q9,q9,q9,#1 + veor q10,q10,q14 + eor r10,r10,r12,ROR #14 + eor r9,r9,r14,ROR #14 + add r12,r0,r3 + add r14,r2,r4 + veor q8,q8,q11 + veor q11,q10,q12 + eor r1,r1,r12,ROR #25 + eor r7,r7,r14,ROR #25 + add r12,r1,r0 + add r14,r7,r2 + vadd.i32 q10,q4,q8 + vadd.i32 q12,q5,q11 + vshl.i32 q13,q10,#7 + vshl.i32 q14,q12,#7 + eor r5,r5,r12,ROR #23 + eor r6,r6,r14,ROR #23 + vshr.u32 q10,q10,#25 + vshr.u32 q12,q12,#25 + add r12,r5,r1 + add r14,r6,r7 + veor q7,q7,q13 + veor q9,q9,q14 + veor q7,q7,q10 + veor q9,q9,q12 + vadd.i32 q10,q8,q7 + vadd.i32 q12,q11,q9 + vshl.i32 q13,q10,#9 + vshl.i32 q14,q12,#9 + eor r3,r3,r12,ROR #19 + str r7,[sp,#104] + eor r4,r4,r14,ROR #19 + ldr r7,[sp,#112] + add r12,r3,r5 + str r6,[sp,#108] + add r6,r4,r6 + ldr r14,[sp,#116] + eor r0,r0,r12,ROR #14 + str r5,[sp,#96] + eor r5,r2,r6,ROR #14 + ldr r2,[sp,#120] + vshr.u32 q10,q10,#23 + vshr.u32 q12,q12,#23 + veor q2,q2,q13 + veor q6,q6,q14 + veor q2,q2,q10 + veor q6,q6,q12 + add r6,r10,r14 + add r12,r9,r8 + vadd.i32 q12,q7,q2 + vext.32 q10,q7,q7,#3 + vadd.i32 q7,q9,q6 + vshl.i32 q14,q12,#13 + vext.32 q13,q9,q9,#3 + vshl.i32 q9,q7,#13 + vshr.u32 q12,q12,#19 + vshr.u32 q7,q7,#19 + eor r11,r11,r6,ROR #25 + eor r2,r2,r12,ROR #25 + add r6,r11,r10 + str r3,[sp,#100] + add r3,r2,r9 + ldr r12,[sp,#124] + veor q4,q4,q14 + veor q5,q5,q9 + veor q4,q4,q12 + veor q7,q5,q7 + eor 
r6,r7,r6,ROR #23 + eor r3,r12,r3,ROR #23 + add r7,r6,r11 + add r12,r3,r2 + vadd.i32 q5,q2,q4 + vswp d4,d5 + vadd.i32 q9,q6,q7 + vshl.i32 q12,q5,#18 + vswp d12,d13 + vshl.i32 q14,q9,#18 + eor r7,r14,r7,ROR #19 + eor r8,r8,r12,ROR #19 + add r12,r7,r6 + add r14,r8,r3 + vshr.u32 q15,q5,#14 + vext.32 q5,q4,q4,#1 + vshr.u32 q4,q9,#14 + veor q8,q8,q12 + vext.32 q7,q7,q7,#1 + veor q9,q11,q14 + eor r10,r10,r12,ROR #14 + ldr r12,[sp,#248] + veor q8,q8,q15 + eor r9,r9,r14,ROR #14 + veor q11,q9,q4 + subs r12,r12,#4 + bhi ._mainloop2 + strd r8,[sp,#112] + ldrd r8,[sp,#64] + strd r2,[sp,#120] + ldrd r2,[sp,#96] + add r0,r0,r8 + strd r10,[sp,#96] + add r1,r1,r9 + ldrd r10,[sp,#48] + ldrd r8,[sp,#72] + add r2,r2,r10 + strd r6,[sp,#128] + add r3,r3,r11 + ldrd r6,[sp,#104] + ldrd r10,[sp,#32] + ldr r12,[sp,#236] + add r4,r4,r8 + add r5,r5,r9 + add r6,r6,r10 + add r7,r7,r11 + cmp r12,#0 + beq ._nomessage1 + ldr r8,[r12,#0] + ldr r9,[r12,#4] + ldr r10,[r12,#8] + ldr r11,[r12,#12] + eor r0,r0,r8 + ldr r8,[r12,#16] + eor r1,r1,r9 + ldr r9,[r12,#20] + eor r2,r2,r10 + ldr r10,[r12,#24] + eor r3,r3,r11 + ldr r11,[r12,#28] + eor r4,r4,r8 + eor r5,r5,r9 + eor r6,r6,r10 + eor r7,r7,r11 +._nomessage1: + ldr r14,[sp,#232] + vadd.i32 q4,q8,q1 + str r0,[r14,#0] + add r0,sp,#304 + str r1,[r14,#4] + vld1.8 {d16-d17},[r0,: 128] + str r2,[r14,#8] + vadd.i32 q5,q8,q5 + str r3,[r14,#12] + add r0,sp,#288 + str r4,[r14,#16] + vld1.8 {d16-d17},[r0,: 128] + str r5,[r14,#20] + vadd.i32 q9,q10,q0 + str r6,[r14,#24] + vadd.i32 q2,q8,q2 + str r7,[r14,#28] + vmov.i64 q8,#0xffffffff + ldrd r6,[sp,#128] + vext.32 d20,d8,d10,#1 + ldrd r0,[sp,#40] + vext.32 d25,d9,d11,#1 + ldrd r2,[sp,#120] + vbif q4,q9,q8 + ldrd r4,[sp,#56] + vext.32 d21,d5,d19,#1 + add r6,r6,r0 + vext.32 d24,d4,d18,#1 + add r7,r7,r1 + vbif q2,q5,q8 + add r2,r2,r4 + vrev64.i32 q5,q10 + add r3,r3,r5 + vrev64.i32 q9,q12 + adds r0,r0,#3 + vswp d5,d9 + adc r1,r1,#0 + strd r0,[sp,#40] + ldrd r8,[sp,#112] + ldrd r0,[sp,#88] + ldrd r10,[sp,#96] + ldrd 
r4,[sp,#80] + add r0,r8,r0 + add r1,r9,r1 + add r4,r10,r4 + add r5,r11,r5 + add r8,r14,#64 + cmp r12,#0 + beq ._nomessage2 + ldr r9,[r12,#32] + ldr r10,[r12,#36] + ldr r11,[r12,#40] + ldr r14,[r12,#44] + eor r6,r6,r9 + ldr r9,[r12,#48] + eor r7,r7,r10 + ldr r10,[r12,#52] + eor r4,r4,r11 + ldr r11,[r12,#56] + eor r5,r5,r14 + ldr r14,[r12,#60] + add r12,r12,#64 + eor r2,r2,r9 + vld1.8 {d20-d21},[r12]! + veor q4,q4,q10 + eor r3,r3,r10 + vld1.8 {d20-d21},[r12]! + veor q5,q5,q10 + eor r0,r0,r11 + vld1.8 {d20-d21},[r12]! + veor q2,q2,q10 + eor r1,r1,r14 + vld1.8 {d20-d21},[r12]! + veor q9,q9,q10 +._nomessage2: + vst1.8 {d8-d9},[r8]! + vst1.8 {d10-d11},[r8]! + vmov.i64 q4,#0xff + vst1.8 {d4-d5},[r8]! + vst1.8 {d18-d19},[r8]! + str r6,[r8,#-96] + add r6,sp,#336 + str r7,[r8,#-92] + add r7,sp,#320 + str r4,[r8,#-88] + vadd.i32 q2,q11,q1 + vld1.8 {d10-d11},[r6,: 128] + vadd.i32 q5,q5,q7 + vld1.8 {d14-d15},[r7,: 128] + vadd.i32 q9,q13,q0 + vadd.i32 q6,q7,q6 + str r5,[r8,#-84] + vext.32 d14,d4,d10,#1 + str r2,[r8,#-80] + vext.32 d21,d5,d11,#1 + str r3,[r8,#-76] + vbif q2,q9,q8 + str r0,[r8,#-72] + vext.32 d15,d13,d19,#1 + vshr.u32 q4,q4,#7 + str r1,[r8,#-68] + vext.32 d20,d12,d18,#1 + vbif q6,q5,q8 + ldr r0,[sp,#240] + vrev64.i32 q5,q7 + vrev64.i32 q7,q10 + vswp d13,d5 + vadd.i64 q3,q3,q4 + sub r0,r0,#192 + cmp r12,#0 + beq ._nomessage21 + vld1.8 {d16-d17},[r12]! + veor q2,q2,q8 + vld1.8 {d16-d17},[r12]! + veor q5,q5,q8 + vld1.8 {d16-d17},[r12]! + veor q6,q6,q8 + vld1.8 {d16-d17},[r12]! + veor q7,q7,q8 +._nomessage21: + vst1.8 {d4-d5},[r8]! + vst1.8 {d10-d11},[r8]! + vst1.8 {d12-d13},[r8]! + vst1.8 {d14-d15},[r8]! 
+ str r12,[sp,#236] + add r14,sp,#272 + add r12,sp,#256 + str r8,[sp,#232] + cmp r0,#192 + bhs ._mlenatleast192 +._mlenlowbelow192: + cmp r0,#0 + beq ._done + b ._mlenatleast1 +._nextblock: + sub r0,r0,#64 +._mlenatleast1: +._handleblock: + str r0,[sp,#248] + ldrd r2,[sp,#48] + ldrd r6,[sp,#32] + ldrd r0,[sp,#64] + ldrd r4,[sp,#72] + ldrd r10,[sp,#80] + ldrd r8,[sp,#88] + strd r2,[sp,#96] + strd r6,[sp,#104] + ldrd r2,[sp,#56] + ldrd r6,[sp,#40] + ldr r12,[sp,#244] +._mainloop1: + str r12,[sp,#252] + add r12,r0,r2 + add r14,r5,r1 + eor r4,r4,r12,ROR #25 + eor r7,r7,r14,ROR #25 + add r12,r4,r0 + add r14,r7,r5 + eor r6,r6,r12,ROR #23 + eor r3,r3,r14,ROR #23 + add r12,r6,r4 + str r7,[sp,#132] + add r7,r3,r7 + ldr r14,[sp,#104] + eor r2,r2,r12,ROR #19 + str r6,[sp,#128] + eor r1,r1,r7,ROR #19 + ldr r7,[sp,#100] + add r6,r2,r6 + str r2,[sp,#120] + add r2,r1,r3 + ldr r12,[sp,#96] + eor r0,r0,r6,ROR #14 + str r3,[sp,#124] + eor r2,r5,r2,ROR #14 + ldr r3,[sp,#108] + add r5,r10,r14 + add r6,r9,r11 + eor r8,r8,r5,ROR #25 + eor r5,r7,r6,ROR #25 + add r6,r8,r10 + add r7,r5,r9 + eor r6,r12,r6,ROR #23 + eor r3,r3,r7,ROR #23 + add r7,r6,r8 + add r12,r3,r5 + eor r7,r14,r7,ROR #19 + eor r11,r11,r12,ROR #19 + add r12,r7,r6 + add r14,r11,r3 + eor r10,r10,r12,ROR #14 + eor r9,r9,r14,ROR #14 + add r12,r0,r5 + add r14,r2,r4 + eor r1,r1,r12,ROR #25 + eor r7,r7,r14,ROR #25 + add r12,r1,r0 + add r14,r7,r2 + eor r6,r6,r12,ROR #23 + eor r3,r3,r14,ROR #23 + add r12,r6,r1 + str r7,[sp,#104] + add r7,r3,r7 + ldr r14,[sp,#128] + eor r5,r5,r12,ROR #19 + str r3,[sp,#108] + eor r4,r4,r7,ROR #19 + ldr r7,[sp,#132] + add r12,r5,r6 + str r6,[sp,#96] + add r3,r4,r3 + ldr r6,[sp,#120] + eor r0,r0,r12,ROR #14 + str r5,[sp,#100] + eor r5,r2,r3,ROR #14 + ldr r3,[sp,#124] + add r2,r10,r7 + add r12,r9,r8 + eor r11,r11,r2,ROR #25 + eor r2,r6,r12,ROR #25 + add r6,r11,r10 + add r12,r2,r9 + eor r6,r14,r6,ROR #23 + eor r3,r3,r12,ROR #23 + add r12,r6,r11 + add r14,r3,r2 + eor r7,r7,r12,ROR #19 + eor r8,r8,r14,ROR 
#19 + add r12,r7,r6 + add r14,r8,r3 + eor r10,r10,r12,ROR #14 + eor r9,r9,r14,ROR #14 + ldr r12,[sp,#252] + subs r12,r12,#2 + bhi ._mainloop1 + strd r6,[sp,#128] + strd r2,[sp,#120] + strd r10,[sp,#112] + strd r8,[sp,#136] + ldrd r2,[sp,#96] + ldrd r6,[sp,#104] + ldrd r8,[sp,#64] + ldrd r10,[sp,#48] + add r0,r0,r8 + add r1,r1,r9 + add r2,r2,r10 + add r3,r3,r11 + ldrd r8,[sp,#72] + ldrd r10,[sp,#32] + add r4,r4,r8 + add r5,r5,r9 + add r6,r6,r10 + add r7,r7,r11 + ldr r12,[sp,#236] + cmp r12,#0 + beq ._nomessage10 + ldr r8,[r12,#0] + ldr r9,[r12,#4] + ldr r10,[r12,#8] + ldr r11,[r12,#12] + eor r0,r0,r8 + ldr r8,[r12,#16] + eor r1,r1,r9 + ldr r9,[r12,#20] + eor r2,r2,r10 + ldr r10,[r12,#24] + eor r3,r3,r11 + ldr r11,[r12,#28] + eor r4,r4,r8 + eor r5,r5,r9 + eor r6,r6,r10 + eor r7,r7,r11 +._nomessage10: + ldr r14,[sp,#232] + str r0,[r14,#0] + str r1,[r14,#4] + str r2,[r14,#8] + str r3,[r14,#12] + str r4,[r14,#16] + str r5,[r14,#20] + str r6,[r14,#24] + str r7,[r14,#28] + ldrd r6,[sp,#128] + ldrd r10,[sp,#112] + ldrd r0,[sp,#40] + ldrd r4,[sp,#80] + add r6,r6,r0 + add r7,r7,r1 + add r10,r10,r4 + add r11,r11,r5 + adds r0,r0,#1 + adc r1,r1,#0 + strd r0,[sp,#40] + ldrd r2,[sp,#120] + ldrd r8,[sp,#136] + ldrd r4,[sp,#56] + ldrd r0,[sp,#88] + add r2,r2,r4 + add r3,r3,r5 + add r0,r8,r0 + add r1,r9,r1 + cmp r12,#0 + beq ._nomessage11 + ldr r4,[r12,#32] + ldr r5,[r12,#36] + ldr r8,[r12,#40] + ldr r9,[r12,#44] + eor r6,r6,r4 + ldr r4,[r12,#48] + eor r7,r7,r5 + ldr r5,[r12,#52] + eor r10,r10,r8 + ldr r8,[r12,#56] + eor r11,r11,r9 + ldr r9,[r12,#60] + eor r2,r2,r4 + eor r3,r3,r5 + eor r0,r0,r8 + eor r1,r1,r9 + add r4,r12,#64 + str r4,[sp,#236] +._nomessage11: + str r6,[r14,#32] + str r7,[r14,#36] + str r10,[r14,#40] + str r11,[r14,#44] + str r2,[r14,#48] + str r3,[r14,#52] + str r0,[r14,#56] + str r1,[r14,#60] + add r0,r14,#64 + str r0,[sp,#232] + ldr r0,[sp,#248] + cmp r0,#64 + bhi ._nextblock +._done: + ldr r2,[sp,#160] + ldrd r4,[sp,#0] + ldrd r6,[sp,#8] + ldrd r8,[sp,#16] + 
ldrd r10,[sp,#24] + ldr r12,[sp,#228] + ldr r14,[sp,#224] + ldrd r0,[sp,#40] + strd r0,[r2] + sub r0,r12,sp + mov sp,r12 + vpop {q4,q5,q6,q7} + add r0,r0,#64 + bx lr +.size _gcry_arm_neon_salsa20_encrypt,.-_gcry_arm_neon_salsa20_encrypt; + +#endif diff --git a/cipher/salsa20.c b/cipher/salsa20.c index 892b9fc..f708b18 100644 --- a/cipher/salsa20.c +++ b/cipher/salsa20.c @@ -47,6 +47,15 @@ # define USE_AMD64 1 #endif +/* USE_ARM_NEON_ASM indicates whether to enable ARM NEON assembly code. */ +#undef USE_ARM_NEON_ASM +#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) +# if defined(HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS) && \ + defined(HAVE_GCC_INLINE_ASM_NEON) +# define USE_ARM_NEON_ASM 1 +# endif +#endif + #define SALSA20_MIN_KEY_SIZE 16 /* Bytes. */ #define SALSA20_MAX_KEY_SIZE 32 /* Bytes. */ @@ -60,7 +69,16 @@ #define SALSA20R12_ROUNDS 12 -typedef struct +struct SALSA20_context_s; + +typedef unsigned int (*salsa20_core_t) (u32 *dst, struct SALSA20_context_s *ctx, + unsigned int rounds); +typedef void (* salsa20_keysetup_t)(struct SALSA20_context_s *ctx, + const byte *key, int keylen); +typedef void (* salsa20_ivsetup_t)(struct SALSA20_context_s *ctx, + const byte *iv); + +typedef struct SALSA20_context_s { /* Indices 1-4 and 11-14 holds the key (two identical copies for the shorter key size), indices 0, 5, 10, 15 are constant, indices 6, 7 @@ -74,6 +92,12 @@ typedef struct u32 input[SALSA20_INPUT_LENGTH]; u32 pad[SALSA20_INPUT_LENGTH]; unsigned int unused; /* bytes in the pad. 
*/ +#ifdef USE_ARM_NEON_ASM + int use_neon; +#endif + salsa20_keysetup_t keysetup; + salsa20_ivsetup_t ivsetup; + salsa20_core_t core; } SALSA20_context_t; @@ -113,10 +137,10 @@ salsa20_ivsetup(SALSA20_context_t *ctx, const byte *iv) } static unsigned int -salsa20_core (u32 *dst, u32 *src, unsigned int rounds) +salsa20_core (u32 *dst, SALSA20_context_t *ctx, unsigned int rounds) { memset(dst, 0, SALSA20_BLOCK_SIZE); - return _gcry_salsa20_amd64_encrypt_blocks(src, dst, dst, 1, rounds); + return _gcry_salsa20_amd64_encrypt_blocks(ctx->input, dst, dst, 1, rounds); } #else /* USE_AMD64 */ @@ -149,9 +173,9 @@ salsa20_core (u32 *dst, u32 *src, unsigned int rounds) } while(0) static unsigned int -salsa20_core (u32 *dst, u32 *src, unsigned int rounds) +salsa20_core (u32 *dst, SALSA20_context_t *ctx, unsigned rounds) { - u32 pad[SALSA20_INPUT_LENGTH]; + u32 pad[SALSA20_INPUT_LENGTH], *src = ctx->input; unsigned int i; memcpy (pad, src, sizeof(pad)); @@ -236,6 +260,49 @@ static void salsa20_ivsetup(SALSA20_context_t *ctx, const byte *iv) #endif /*!USE_AMD64*/ +#ifdef USE_ARM_NEON_ASM + +/* ARM NEON implementation of Salsa20. */ +unsigned int +_gcry_arm_neon_salsa20_encrypt(void *c, const void *m, unsigned int nblks, + void *k, unsigned int rounds); + +static unsigned int +salsa20_core_neon (u32 *dst, SALSA20_context_t *ctx, unsigned int rounds) +{ + return _gcry_arm_neon_salsa20_encrypt(dst, NULL, 1, ctx->input, rounds); +} + +static void salsa20_ivsetup_neon(SALSA20_context_t *ctx, const byte *iv) +{ + memcpy(ctx->input + 8, iv, 8); + /* Reset the block counter. */ + memset(ctx->input + 10, 0, 8); +} + +static void +salsa20_keysetup_neon(SALSA20_context_t *ctx, const byte *key, int klen) +{ + static const unsigned char sigma32[16] = "expand 32-byte k"; + static const unsigned char sigma16[16] = "expand 16-byte k"; + + if (klen == 16) + { + memcpy (ctx->input, key, 16); + memcpy (ctx->input + 4, key, 16); /* Duplicate 128-bit key. 
*/ + memcpy (ctx->input + 12, sigma16, 16); + } + else + { + /* 32-byte key */ + memcpy (ctx->input, key, 32); + memcpy (ctx->input + 12, sigma32, 16); + } +} + +#endif /*USE_ARM_NEON_ASM*/ + + static gcry_err_code_t salsa20_do_setkey (SALSA20_context_t *ctx, const byte *key, unsigned int keylen) @@ -257,7 +324,23 @@ salsa20_do_setkey (SALSA20_context_t *ctx, && keylen != SALSA20_MAX_KEY_SIZE) return GPG_ERR_INV_KEYLEN; - salsa20_keysetup (ctx, key, keylen); + /* Default ops. */ + ctx->keysetup = salsa20_keysetup; + ctx->ivsetup = salsa20_ivsetup; + ctx->core = salsa20_core; + +#ifdef USE_ARM_NEON_ASM + ctx->use_neon = (_gcry_get_hw_features () & HWF_ARM_NEON) != 0; + if (ctx->use_neon) + { + /* Use ARM NEON ops instead. */ + ctx->keysetup = salsa20_keysetup_neon; + ctx->ivsetup = salsa20_ivsetup_neon; + ctx->core = salsa20_core_neon; + } +#endif + + ctx->keysetup (ctx, key, keylen); /* We default to a zero nonce. */ salsa20_setiv (ctx, NULL, 0); @@ -290,7 +373,7 @@ salsa20_setiv (void *context, const byte *iv, unsigned int ivlen) else memcpy (tmp, iv, SALSA20_IV_SIZE); - salsa20_ivsetup (ctx, tmp); + ctx->ivsetup (ctx, tmp); /* Reset the unused pad bytes counter. */ ctx->unused = 0; @@ -340,12 +423,24 @@ salsa20_do_encrypt_stream (SALSA20_context_t *ctx, } #endif +#ifdef USE_ARM_NEON_ASM + if (ctx->use_neon && length >= SALSA20_BLOCK_SIZE) + { + unsigned int nblocks = length / SALSA20_BLOCK_SIZE; + _gcry_arm_neon_salsa20_encrypt (outbuf, inbuf, nblocks, ctx->input, + rounds); + length -= SALSA20_BLOCK_SIZE * nblocks; + outbuf += SALSA20_BLOCK_SIZE * nblocks; + inbuf += SALSA20_BLOCK_SIZE * nblocks; + } +#endif + while (length > 0) { /* Create the next pad and bump the block counter. Note that it is the user's duty to change to another nonce not later than after 2^70 processed bytes. */ - nburn = salsa20_core (ctx->pad, ctx->input, rounds); + nburn = ctx->core (ctx->pad, ctx, rounds); burn = nburn > burn ? 
nburn : burn; if (length <= SALSA20_BLOCK_SIZE) @@ -386,12 +481,13 @@ salsa20r12_encrypt_stream (void *context, } - static const char* selftest (void) { SALSA20_context_t ctx; byte scratch[8+1]; + byte buf[256+64+4]; + int i; static byte key_1[] = { 0x80, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, @@ -418,6 +514,23 @@ selftest (void) salsa20_encrypt_stream (&ctx, scratch, scratch, sizeof plaintext_1); if (memcmp (scratch, plaintext_1, sizeof plaintext_1)) return "Salsa20 decryption test 1 failed."; + + for (i = 0; i < sizeof buf; i++) + buf[i] = i; + salsa20_setkey (&ctx, key_1, sizeof key_1); + salsa20_setiv (&ctx, nonce_1, sizeof nonce_1); + /*encrypt*/ + salsa20_encrypt_stream (&ctx, buf, buf, sizeof buf); + /*decrypt*/ + salsa20_setkey (&ctx, key_1, sizeof key_1); + salsa20_setiv (&ctx, nonce_1, sizeof nonce_1); + salsa20_encrypt_stream (&ctx, buf, buf, 1); + salsa20_encrypt_stream (&ctx, buf+1, buf+1, (sizeof buf)-1-1); + salsa20_encrypt_stream (&ctx, buf+(sizeof buf)-1, buf+(sizeof buf)-1, 1); + for (i = 0; i < sizeof buf; i++) + if (buf[i] != (byte)i) + return "Salsa20 encryption test 2 failed."; + return NULL; } diff --git a/configure.ac b/configure.ac index 114460c..19c97bd 100644 --- a/configure.ac +++ b/configure.ac @@ -1560,6 +1560,11 @@ if test "$found" = "1" ; then GCRYPT_CIPHERS="$GCRYPT_CIPHERS salsa20-amd64.lo" ;; esac + + if test x"$neonsupport" = xyes ; then + # Build with the NEON implementation + GCRYPT_CIPHERS="$GCRYPT_CIPHERS salsa20-armv7-neon.lo" + fi fi LIST_MEMBER(gost28147, $enabled_ciphers) commit 5a3d43485efdc09912be0967ee0a3ce345b3b15a Author: Jussi Kivilinna Date: Sat Oct 26 15:00:48 2013 +0300 Add AMD64 assembly implementation of Salsa20 * cipher/Makefile.am: Add 'salsa20-amd64.S'. * cipher/salsa20-amd64.S: New. * cipher/salsa20.c (USE_AMD64): New macro. [USE_AMD64] (_gcry_salsa20_amd64_keysetup, _gcry_salsa20_amd64_ivsetup) (_gcry_salsa20_amd64_encrypt_blocks): New prototypes. 
[USE_AMD64] (salsa20_keysetup, salsa20_ivsetup, salsa20_core): New. [!USE_AMD64] (salsa20_core): Change 'src' to non-constant, update block counter in 'salsa20_core' and return burn stack depth. [!USE_AMD64] (salsa20_keysetup, salsa20_ivsetup): New. (salsa20_do_setkey): Move generic key setup to 'salsa20_keysetup'. (salsa20_setkey): Fix burn stack depth. (salsa20_setiv): Move generic IV setup to 'salsa20_ivsetup'. (salsa20_do_encrypt_stream) [USE_AMD64]: Process large buffers in AMD64 implementation. (salsa20_do_encrypt_stream): Move stack burning to this function... (salsa20_encrypt_stream, salsa20r12_encrypt_stream): ...from these functions. * configure.ac [x86-64]: Add 'salsa20-amd64.lo'. -- The patch adds a fast AMD64 assembly implementation of Salsa20. The implementation is based on public domain code by D. J. Bernstein, available at http://cr.yp.to/snuffle.html (amd64-xmm6). It gains extra speed by processing four blocks in parallel with the help of SSE2 instructions. Benchmark results on Intel Core i5-4570 (3.2 GHz): Before: SALSA20 | nanosecs/byte mebibytes/sec cycles/byte STREAM enc | 3.88 ns/B 246.0 MiB/s 12.41 c/B STREAM dec | 3.88 ns/B 246.0 MiB/s 12.41 c/B = SALSA20R12 | nanosecs/byte mebibytes/sec cycles/byte STREAM enc | 2.46 ns/B 387.9 MiB/s 7.87 c/B STREAM dec | 2.46 ns/B 387.7 MiB/s 7.87 c/B After: SALSA20 | nanosecs/byte mebibytes/sec cycles/byte STREAM enc | 0.985 ns/B 967.8 MiB/s 3.15 c/B STREAM dec | 0.987 ns/B 966.5 MiB/s 3.16 c/B = SALSA20R12 | nanosecs/byte mebibytes/sec cycles/byte STREAM enc | 0.636 ns/B 1500.5 MiB/s 2.03 c/B STREAM dec | 0.636 ns/B 1499.2 MiB/s 2.04 c/B Signed-off-by: Jussi Kivilinna diff --git a/cipher/Makefile.am b/cipher/Makefile.am index d7db933..e786713 100644 --- a/cipher/Makefile.am +++ b/cipher/Makefile.am @@ -71,7 +71,7 @@ md5.c \ rijndael.c rijndael-tables.h rijndael-amd64.S rijndael-arm.S \ rmd160.c \ rsa.c \ -salsa20.c \ +salsa20.c salsa20-amd64.S \ scrypt.c \ seed.c \ serpent.c serpent-sse2-amd64.S
serpent-avx2-amd64.S \ diff --git a/cipher/salsa20-amd64.S b/cipher/salsa20-amd64.S new file mode 100644 index 0000000..691df58 --- /dev/null +++ b/cipher/salsa20-amd64.S @@ -0,0 +1,924 @@ +/* salsa20-amd64.S - AMD64 implementation of Salsa20 + * + * Copyright © 2013 Jussi Kivilinna + * + * This file is part of Libgcrypt. + * + * Libgcrypt is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as + * published by the Free Software Foundation; either version 2.1 of + * the License, or (at your option) any later version. + * + * Libgcrypt is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this program; if not, see <http://www.gnu.org/licenses/>. + */ + +/* + * Based on public domain implementation by D. J.
Bernstein at + * http://cr.yp.to/snuffle.html + */ + +#ifdef __x86_64 +#include <config.h> +#if defined(HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS) && defined(USE_SALSA20) + +.text + +.align 8 +.globl _gcry_salsa20_amd64_keysetup +.type _gcry_salsa20_amd64_keysetup,@function; +_gcry_salsa20_amd64_keysetup: + movl 0(%rsi),%r8d + movl 4(%rsi),%r9d + movl 8(%rsi),%eax + movl 12(%rsi),%r10d + movl %r8d,20(%rdi) + movl %r9d,40(%rdi) + movl %eax,60(%rdi) + movl %r10d,48(%rdi) + cmp $256,%rdx + jb ._kbits128 +._kbits256: + movl 16(%rsi),%edx + movl 20(%rsi),%ecx + movl 24(%rsi),%r8d + movl 28(%rsi),%esi + movl %edx,28(%rdi) + movl %ecx,16(%rdi) + movl %r8d,36(%rdi) + movl %esi,56(%rdi) + mov $1634760805,%rsi + mov $857760878,%rdx + mov $2036477234,%rcx + mov $1797285236,%r8 + movl %esi,0(%rdi) + movl %edx,4(%rdi) + movl %ecx,8(%rdi) + movl %r8d,12(%rdi) + jmp ._keysetupdone +._kbits128: + movl 0(%rsi),%edx + movl 4(%rsi),%ecx + movl 8(%rsi),%r8d + movl 12(%rsi),%esi + movl %edx,28(%rdi) + movl %ecx,16(%rdi) + movl %r8d,36(%rdi) + movl %esi,56(%rdi) + mov $1634760805,%rsi + mov $824206446,%rdx + mov $2036477238,%rcx + mov $1797285236,%r8 + movl %esi,0(%rdi) + movl %edx,4(%rdi) + movl %ecx,8(%rdi) + movl %r8d,12(%rdi) +._keysetupdone: + ret + +.align 8 +.globl _gcry_salsa20_amd64_ivsetup +.type _gcry_salsa20_amd64_ivsetup,@function; +_gcry_salsa20_amd64_ivsetup: + movl 0(%rsi),%r8d + movl 4(%rsi),%esi + mov $0,%r9 + mov $0,%rax + movl %r8d,24(%rdi) + movl %esi,44(%rdi) + movl %r9d,32(%rdi) + movl %eax,52(%rdi) + ret + +.align 8 +.globl _gcry_salsa20_amd64_encrypt_blocks +.type _gcry_salsa20_amd64_encrypt_blocks,@function; +_gcry_salsa20_amd64_encrypt_blocks: + /* + * Modifications to original implementation: + * - Number of rounds passed in register %r8 (for Salsa20/12). + * - Length is input as number of blocks, so don't handle tail bytes + * (this is done in salsa20.c).
+ */ + push %rbx + shlq $6, %rcx /* blocks to bytes */ + mov %r8, %rbx + mov %rsp,%r11 + and $31,%r11 + add $384,%r11 + sub %r11,%rsp + mov %rdi,%r8 + mov %rsi,%rsi + mov %rdx,%rdi + mov %rcx,%rdx + cmp $0,%rdx + jbe ._done +._start: + cmp $256,%rdx + jb ._bytes_are_64_128_or_192 + movdqa 0(%r8),%xmm0 + pshufd $0x55,%xmm0,%xmm1 + pshufd $0xaa,%xmm0,%xmm2 + pshufd $0xff,%xmm0,%xmm3 + pshufd $0x00,%xmm0,%xmm0 + movdqa %xmm1,0(%rsp) + movdqa %xmm2,16(%rsp) + movdqa %xmm3,32(%rsp) + movdqa %xmm0,48(%rsp) + movdqa 16(%r8),%xmm0 + pshufd $0xaa,%xmm0,%xmm1 + pshufd $0xff,%xmm0,%xmm2 + pshufd $0x00,%xmm0,%xmm3 + pshufd $0x55,%xmm0,%xmm0 + movdqa %xmm1,64(%rsp) + movdqa %xmm2,80(%rsp) + movdqa %xmm3,96(%rsp) + movdqa %xmm0,112(%rsp) + movdqa 32(%r8),%xmm0 + pshufd $0xff,%xmm0,%xmm1 + pshufd $0x55,%xmm0,%xmm2 + pshufd $0xaa,%xmm0,%xmm0 + movdqa %xmm1,128(%rsp) + movdqa %xmm2,144(%rsp) + movdqa %xmm0,160(%rsp) + movdqa 48(%r8),%xmm0 + pshufd $0x00,%xmm0,%xmm1 + pshufd $0xaa,%xmm0,%xmm2 + pshufd $0xff,%xmm0,%xmm0 + movdqa %xmm1,176(%rsp) + movdqa %xmm2,192(%rsp) + movdqa %xmm0,208(%rsp) +._bytesatleast256: + movl 32(%r8),%ecx + movl 52(%r8),%r9d + movl %ecx,224(%rsp) + movl %r9d,240(%rsp) + add $1,%ecx + adc $0,%r9d + movl %ecx,4+224(%rsp) + movl %r9d,4+240(%rsp) + add $1,%ecx + adc $0,%r9d + movl %ecx,8+224(%rsp) + movl %r9d,8+240(%rsp) + add $1,%ecx + adc $0,%r9d + movl %ecx,12+224(%rsp) + movl %r9d,12+240(%rsp) + add $1,%ecx + adc $0,%r9d + movl %ecx,32(%r8) + movl %r9d,52(%r8) + movq %rdx,288(%rsp) + mov %rbx,%rdx + movdqa 0(%rsp),%xmm0 + movdqa 16(%rsp),%xmm1 + movdqa 32(%rsp),%xmm2 + movdqa 192(%rsp),%xmm3 + movdqa 208(%rsp),%xmm4 + movdqa 64(%rsp),%xmm5 + movdqa 80(%rsp),%xmm6 + movdqa 112(%rsp),%xmm7 + movdqa 128(%rsp),%xmm8 + movdqa 144(%rsp),%xmm9 + movdqa 160(%rsp),%xmm10 + movdqa 240(%rsp),%xmm11 + movdqa 48(%rsp),%xmm12 + movdqa 96(%rsp),%xmm13 + movdqa 176(%rsp),%xmm14 + movdqa 224(%rsp),%xmm15 +._mainloop1: + movdqa %xmm1,256(%rsp) + movdqa %xmm2,272(%rsp) + 
movdqa %xmm13,%xmm1 + paddd %xmm12,%xmm1 + movdqa %xmm1,%xmm2 + pslld $7,%xmm1 + pxor %xmm1,%xmm14 + psrld $25,%xmm2 + pxor %xmm2,%xmm14 + movdqa %xmm7,%xmm1 + paddd %xmm0,%xmm1 + movdqa %xmm1,%xmm2 + pslld $7,%xmm1 + pxor %xmm1,%xmm11 + psrld $25,%xmm2 + pxor %xmm2,%xmm11 + movdqa %xmm12,%xmm1 + paddd %xmm14,%xmm1 + movdqa %xmm1,%xmm2 + pslld $9,%xmm1 + pxor %xmm1,%xmm15 + psrld $23,%xmm2 + pxor %xmm2,%xmm15 + movdqa %xmm0,%xmm1 + paddd %xmm11,%xmm1 + movdqa %xmm1,%xmm2 + pslld $9,%xmm1 + pxor %xmm1,%xmm9 + psrld $23,%xmm2 + pxor %xmm2,%xmm9 + movdqa %xmm14,%xmm1 + paddd %xmm15,%xmm1 + movdqa %xmm1,%xmm2 + pslld $13,%xmm1 + pxor %xmm1,%xmm13 + psrld $19,%xmm2 + pxor %xmm2,%xmm13 + movdqa %xmm11,%xmm1 + paddd %xmm9,%xmm1 + movdqa %xmm1,%xmm2 + pslld $13,%xmm1 + pxor %xmm1,%xmm7 + psrld $19,%xmm2 + pxor %xmm2,%xmm7 + movdqa %xmm15,%xmm1 + paddd %xmm13,%xmm1 + movdqa %xmm1,%xmm2 + pslld $18,%xmm1 + pxor %xmm1,%xmm12 + psrld $14,%xmm2 + pxor %xmm2,%xmm12 + movdqa 256(%rsp),%xmm1 + movdqa %xmm12,256(%rsp) + movdqa %xmm9,%xmm2 + paddd %xmm7,%xmm2 + movdqa %xmm2,%xmm12 + pslld $18,%xmm2 + pxor %xmm2,%xmm0 + psrld $14,%xmm12 + pxor %xmm12,%xmm0 + movdqa %xmm5,%xmm2 + paddd %xmm1,%xmm2 + movdqa %xmm2,%xmm12 + pslld $7,%xmm2 + pxor %xmm2,%xmm3 + psrld $25,%xmm12 + pxor %xmm12,%xmm3 + movdqa 272(%rsp),%xmm2 + movdqa %xmm0,272(%rsp) + movdqa %xmm6,%xmm0 + paddd %xmm2,%xmm0 + movdqa %xmm0,%xmm12 + pslld $7,%xmm0 + pxor %xmm0,%xmm4 + psrld $25,%xmm12 + pxor %xmm12,%xmm4 + movdqa %xmm1,%xmm0 + paddd %xmm3,%xmm0 + movdqa %xmm0,%xmm12 + pslld $9,%xmm0 + pxor %xmm0,%xmm10 + psrld $23,%xmm12 + pxor %xmm12,%xmm10 + movdqa %xmm2,%xmm0 + paddd %xmm4,%xmm0 + movdqa %xmm0,%xmm12 + pslld $9,%xmm0 + pxor %xmm0,%xmm8 + psrld $23,%xmm12 + pxor %xmm12,%xmm8 + movdqa %xmm3,%xmm0 + paddd %xmm10,%xmm0 + movdqa %xmm0,%xmm12 + pslld $13,%xmm0 + pxor %xmm0,%xmm5 + psrld $19,%xmm12 + pxor %xmm12,%xmm5 + movdqa %xmm4,%xmm0 + paddd %xmm8,%xmm0 + movdqa %xmm0,%xmm12 + pslld $13,%xmm0 + pxor %xmm0,%xmm6 
+ psrld $19,%xmm12 + pxor %xmm12,%xmm6 + movdqa %xmm10,%xmm0 + paddd %xmm5,%xmm0 + movdqa %xmm0,%xmm12 + pslld $18,%xmm0 + pxor %xmm0,%xmm1 + psrld $14,%xmm12 + pxor %xmm12,%xmm1 + movdqa 256(%rsp),%xmm0 + movdqa %xmm1,256(%rsp) + movdqa %xmm4,%xmm1 + paddd %xmm0,%xmm1 + movdqa %xmm1,%xmm12 + pslld $7,%xmm1 + pxor %xmm1,%xmm7 + psrld $25,%xmm12 + pxor %xmm12,%xmm7 + movdqa %xmm8,%xmm1 + paddd %xmm6,%xmm1 + movdqa %xmm1,%xmm12 + pslld $18,%xmm1 + pxor %xmm1,%xmm2 + psrld $14,%xmm12 + pxor %xmm12,%xmm2 + movdqa 272(%rsp),%xmm12 + movdqa %xmm2,272(%rsp) + movdqa %xmm14,%xmm1 + paddd %xmm12,%xmm1 + movdqa %xmm1,%xmm2 + pslld $7,%xmm1 + pxor %xmm1,%xmm5 + psrld $25,%xmm2 + pxor %xmm2,%xmm5 + movdqa %xmm0,%xmm1 + paddd %xmm7,%xmm1 + movdqa %xmm1,%xmm2 + pslld $9,%xmm1 + pxor %xmm1,%xmm10 + psrld $23,%xmm2 + pxor %xmm2,%xmm10 + movdqa %xmm12,%xmm1 + paddd %xmm5,%xmm1 + movdqa %xmm1,%xmm2 + pslld $9,%xmm1 + pxor %xmm1,%xmm8 + psrld $23,%xmm2 + pxor %xmm2,%xmm8 + movdqa %xmm7,%xmm1 + paddd %xmm10,%xmm1 + movdqa %xmm1,%xmm2 + pslld $13,%xmm1 + pxor %xmm1,%xmm4 + psrld $19,%xmm2 + pxor %xmm2,%xmm4 + movdqa %xmm5,%xmm1 + paddd %xmm8,%xmm1 + movdqa %xmm1,%xmm2 + pslld $13,%xmm1 + pxor %xmm1,%xmm14 + psrld $19,%xmm2 + pxor %xmm2,%xmm14 + movdqa %xmm10,%xmm1 + paddd %xmm4,%xmm1 + movdqa %xmm1,%xmm2 + pslld $18,%xmm1 + pxor %xmm1,%xmm0 + psrld $14,%xmm2 + pxor %xmm2,%xmm0 + movdqa 256(%rsp),%xmm1 + movdqa %xmm0,256(%rsp) + movdqa %xmm8,%xmm0 + paddd %xmm14,%xmm0 + movdqa %xmm0,%xmm2 + pslld $18,%xmm0 + pxor %xmm0,%xmm12 + psrld $14,%xmm2 + pxor %xmm2,%xmm12 + movdqa %xmm11,%xmm0 + paddd %xmm1,%xmm0 + movdqa %xmm0,%xmm2 + pslld $7,%xmm0 + pxor %xmm0,%xmm6 + psrld $25,%xmm2 + pxor %xmm2,%xmm6 + movdqa 272(%rsp),%xmm2 + movdqa %xmm12,272(%rsp) + movdqa %xmm3,%xmm0 + paddd %xmm2,%xmm0 + movdqa %xmm0,%xmm12 + pslld $7,%xmm0 + pxor %xmm0,%xmm13 + psrld $25,%xmm12 + pxor %xmm12,%xmm13 + movdqa %xmm1,%xmm0 + paddd %xmm6,%xmm0 + movdqa %xmm0,%xmm12 + pslld $9,%xmm0 + pxor %xmm0,%xmm15 + 
psrld $23,%xmm12 + pxor %xmm12,%xmm15 + movdqa %xmm2,%xmm0 + paddd %xmm13,%xmm0 + movdqa %xmm0,%xmm12 + pslld $9,%xmm0 + pxor %xmm0,%xmm9 + psrld $23,%xmm12 + pxor %xmm12,%xmm9 + movdqa %xmm6,%xmm0 + paddd %xmm15,%xmm0 + movdqa %xmm0,%xmm12 + pslld $13,%xmm0 + pxor %xmm0,%xmm11 + psrld $19,%xmm12 + pxor %xmm12,%xmm11 + movdqa %xmm13,%xmm0 + paddd %xmm9,%xmm0 + movdqa %xmm0,%xmm12 + pslld $13,%xmm0 + pxor %xmm0,%xmm3 + psrld $19,%xmm12 + pxor %xmm12,%xmm3 + movdqa %xmm15,%xmm0 + paddd %xmm11,%xmm0 + movdqa %xmm0,%xmm12 + pslld $18,%xmm0 + pxor %xmm0,%xmm1 + psrld $14,%xmm12 + pxor %xmm12,%xmm1 + movdqa %xmm9,%xmm0 + paddd %xmm3,%xmm0 + movdqa %xmm0,%xmm12 + pslld $18,%xmm0 + pxor %xmm0,%xmm2 + psrld $14,%xmm12 + pxor %xmm12,%xmm2 + movdqa 256(%rsp),%xmm12 + movdqa 272(%rsp),%xmm0 + sub $2,%rdx + ja ._mainloop1 + paddd 48(%rsp),%xmm12 + paddd 112(%rsp),%xmm7 + paddd 160(%rsp),%xmm10 + paddd 208(%rsp),%xmm4 + movd %xmm12,%rdx + movd %xmm7,%rcx + movd %xmm10,%r9 + movd %xmm4,%rax + pshufd $0x39,%xmm12,%xmm12 + pshufd $0x39,%xmm7,%xmm7 + pshufd $0x39,%xmm10,%xmm10 + pshufd $0x39,%xmm4,%xmm4 + xorl 0(%rsi),%edx + xorl 4(%rsi),%ecx + xorl 8(%rsi),%r9d + xorl 12(%rsi),%eax + movl %edx,0(%rdi) + movl %ecx,4(%rdi) + movl %r9d,8(%rdi) + movl %eax,12(%rdi) + movd %xmm12,%rdx + movd %xmm7,%rcx + movd %xmm10,%r9 + movd %xmm4,%rax + pshufd $0x39,%xmm12,%xmm12 + pshufd $0x39,%xmm7,%xmm7 + pshufd $0x39,%xmm10,%xmm10 + pshufd $0x39,%xmm4,%xmm4 + xorl 64(%rsi),%edx + xorl 68(%rsi),%ecx + xorl 72(%rsi),%r9d + xorl 76(%rsi),%eax + movl %edx,64(%rdi) + movl %ecx,68(%rdi) + movl %r9d,72(%rdi) + movl %eax,76(%rdi) + movd %xmm12,%rdx + movd %xmm7,%rcx + movd %xmm10,%r9 + movd %xmm4,%rax + pshufd $0x39,%xmm12,%xmm12 + pshufd $0x39,%xmm7,%xmm7 + pshufd $0x39,%xmm10,%xmm10 + pshufd $0x39,%xmm4,%xmm4 + xorl 128(%rsi),%edx + xorl 132(%rsi),%ecx + xorl 136(%rsi),%r9d + xorl 140(%rsi),%eax + movl %edx,128(%rdi) + movl %ecx,132(%rdi) + movl %r9d,136(%rdi) + movl %eax,140(%rdi) + movd %xmm12,%rdx + 
movd %xmm7,%rcx + movd %xmm10,%r9 + movd %xmm4,%rax + xorl 192(%rsi),%edx + xorl 196(%rsi),%ecx + xorl 200(%rsi),%r9d + xorl 204(%rsi),%eax + movl %edx,192(%rdi) + movl %ecx,196(%rdi) + movl %r9d,200(%rdi) + movl %eax,204(%rdi) + paddd 176(%rsp),%xmm14 + paddd 0(%rsp),%xmm0 + paddd 64(%rsp),%xmm5 + paddd 128(%rsp),%xmm8 + movd %xmm14,%rdx + movd %xmm0,%rcx + movd %xmm5,%r9 + movd %xmm8,%rax + pshufd $0x39,%xmm14,%xmm14 + pshufd $0x39,%xmm0,%xmm0 + pshufd $0x39,%xmm5,%xmm5 + pshufd $0x39,%xmm8,%xmm8 + xorl 16(%rsi),%edx + xorl 20(%rsi),%ecx + xorl 24(%rsi),%r9d + xorl 28(%rsi),%eax + movl %edx,16(%rdi) + movl %ecx,20(%rdi) + movl %r9d,24(%rdi) + movl %eax,28(%rdi) + movd %xmm14,%rdx + movd %xmm0,%rcx + movd %xmm5,%r9 + movd %xmm8,%rax + pshufd $0x39,%xmm14,%xmm14 + pshufd $0x39,%xmm0,%xmm0 + pshufd $0x39,%xmm5,%xmm5 + pshufd $0x39,%xmm8,%xmm8 + xorl 80(%rsi),%edx + xorl 84(%rsi),%ecx + xorl 88(%rsi),%r9d + xorl 92(%rsi),%eax + movl %edx,80(%rdi) + movl %ecx,84(%rdi) + movl %r9d,88(%rdi) + movl %eax,92(%rdi) + movd %xmm14,%rdx + movd %xmm0,%rcx + movd %xmm5,%r9 + movd %xmm8,%rax + pshufd $0x39,%xmm14,%xmm14 + pshufd $0x39,%xmm0,%xmm0 + pshufd $0x39,%xmm5,%xmm5 + pshufd $0x39,%xmm8,%xmm8 + xorl 144(%rsi),%edx + xorl 148(%rsi),%ecx + xorl 152(%rsi),%r9d + xorl 156(%rsi),%eax + movl %edx,144(%rdi) + movl %ecx,148(%rdi) + movl %r9d,152(%rdi) + movl %eax,156(%rdi) + movd %xmm14,%rdx + movd %xmm0,%rcx + movd %xmm5,%r9 + movd %xmm8,%rax + xorl 208(%rsi),%edx + xorl 212(%rsi),%ecx + xorl 216(%rsi),%r9d + xorl 220(%rsi),%eax + movl %edx,208(%rdi) + movl %ecx,212(%rdi) + movl %r9d,216(%rdi) + movl %eax,220(%rdi) + paddd 224(%rsp),%xmm15 + paddd 240(%rsp),%xmm11 + paddd 16(%rsp),%xmm1 + paddd 80(%rsp),%xmm6 + movd %xmm15,%rdx + movd %xmm11,%rcx + movd %xmm1,%r9 + movd %xmm6,%rax + pshufd $0x39,%xmm15,%xmm15 + pshufd $0x39,%xmm11,%xmm11 + pshufd $0x39,%xmm1,%xmm1 + pshufd $0x39,%xmm6,%xmm6 + xorl 32(%rsi),%edx + xorl 36(%rsi),%ecx + xorl 40(%rsi),%r9d + xorl 44(%rsi),%eax + movl 
%edx,32(%rdi) + movl %ecx,36(%rdi) + movl %r9d,40(%rdi) + movl %eax,44(%rdi) + movd %xmm15,%rdx + movd %xmm11,%rcx + movd %xmm1,%r9 + movd %xmm6,%rax + pshufd $0x39,%xmm15,%xmm15 + pshufd $0x39,%xmm11,%xmm11 + pshufd $0x39,%xmm1,%xmm1 + pshufd $0x39,%xmm6,%xmm6 + xorl 96(%rsi),%edx + xorl 100(%rsi),%ecx + xorl 104(%rsi),%r9d + xorl 108(%rsi),%eax + movl %edx,96(%rdi) + movl %ecx,100(%rdi) + movl %r9d,104(%rdi) + movl %eax,108(%rdi) + movd %xmm15,%rdx + movd %xmm11,%rcx + movd %xmm1,%r9 + movd %xmm6,%rax + pshufd $0x39,%xmm15,%xmm15 + pshufd $0x39,%xmm11,%xmm11 + pshufd $0x39,%xmm1,%xmm1 + pshufd $0x39,%xmm6,%xmm6 + xorl 160(%rsi),%edx + xorl 164(%rsi),%ecx + xorl 168(%rsi),%r9d + xorl 172(%rsi),%eax + movl %edx,160(%rdi) + movl %ecx,164(%rdi) + movl %r9d,168(%rdi) + movl %eax,172(%rdi) + movd %xmm15,%rdx + movd %xmm11,%rcx + movd %xmm1,%r9 + movd %xmm6,%rax + xorl 224(%rsi),%edx + xorl 228(%rsi),%ecx + xorl 232(%rsi),%r9d + xorl 236(%rsi),%eax + movl %edx,224(%rdi) + movl %ecx,228(%rdi) + movl %r9d,232(%rdi) + movl %eax,236(%rdi) + paddd 96(%rsp),%xmm13 + paddd 144(%rsp),%xmm9 + paddd 192(%rsp),%xmm3 + paddd 32(%rsp),%xmm2 + movd %xmm13,%rdx + movd %xmm9,%rcx + movd %xmm3,%r9 + movd %xmm2,%rax + pshufd $0x39,%xmm13,%xmm13 + pshufd $0x39,%xmm9,%xmm9 + pshufd $0x39,%xmm3,%xmm3 + pshufd $0x39,%xmm2,%xmm2 + xorl 48(%rsi),%edx + xorl 52(%rsi),%ecx + xorl 56(%rsi),%r9d + xorl 60(%rsi),%eax + movl %edx,48(%rdi) + movl %ecx,52(%rdi) + movl %r9d,56(%rdi) + movl %eax,60(%rdi) + movd %xmm13,%rdx + movd %xmm9,%rcx + movd %xmm3,%r9 + movd %xmm2,%rax + pshufd $0x39,%xmm13,%xmm13 + pshufd $0x39,%xmm9,%xmm9 + pshufd $0x39,%xmm3,%xmm3 + pshufd $0x39,%xmm2,%xmm2 + xorl 112(%rsi),%edx + xorl 116(%rsi),%ecx + xorl 120(%rsi),%r9d + xorl 124(%rsi),%eax + movl %edx,112(%rdi) + movl %ecx,116(%rdi) + movl %r9d,120(%rdi) + movl %eax,124(%rdi) + movd %xmm13,%rdx + movd %xmm9,%rcx + movd %xmm3,%r9 + movd %xmm2,%rax + pshufd $0x39,%xmm13,%xmm13 + pshufd $0x39,%xmm9,%xmm9 + pshufd 
$0x39,%xmm3,%xmm3 + pshufd $0x39,%xmm2,%xmm2 + xorl 176(%rsi),%edx + xorl 180(%rsi),%ecx + xorl 184(%rsi),%r9d + xorl 188(%rsi),%eax + movl %edx,176(%rdi) + movl %ecx,180(%rdi) + movl %r9d,184(%rdi) + movl %eax,188(%rdi) + movd %xmm13,%rdx + movd %xmm9,%rcx + movd %xmm3,%r9 + movd %xmm2,%rax + xorl 240(%rsi),%edx + xorl 244(%rsi),%ecx + xorl 248(%rsi),%r9d + xorl 252(%rsi),%eax + movl %edx,240(%rdi) + movl %ecx,244(%rdi) + movl %r9d,248(%rdi) + movl %eax,252(%rdi) + movq 288(%rsp),%rdx + sub $256,%rdx + add $256,%rsi + add $256,%rdi + cmp $256,%rdx + jae ._bytesatleast256 + cmp $0,%rdx + jbe ._done +._bytes_are_64_128_or_192: + movq %rdx,288(%rsp) + movdqa 0(%r8),%xmm0 + movdqa 16(%r8),%xmm1 + movdqa 32(%r8),%xmm2 + movdqa 48(%r8),%xmm3 + movdqa %xmm1,%xmm4 + mov %rbx,%rdx +._mainloop2: + paddd %xmm0,%xmm4 + movdqa %xmm0,%xmm5 + movdqa %xmm4,%xmm6 + pslld $7,%xmm4 + psrld $25,%xmm6 + pxor %xmm4,%xmm3 + pxor %xmm6,%xmm3 + paddd %xmm3,%xmm5 + movdqa %xmm3,%xmm4 + movdqa %xmm5,%xmm6 + pslld $9,%xmm5 + psrld $23,%xmm6 + pxor %xmm5,%xmm2 + pshufd $0x93,%xmm3,%xmm3 + pxor %xmm6,%xmm2 + paddd %xmm2,%xmm4 + movdqa %xmm2,%xmm5 + movdqa %xmm4,%xmm6 + pslld $13,%xmm4 + psrld $19,%xmm6 + pxor %xmm4,%xmm1 + pshufd $0x4e,%xmm2,%xmm2 + pxor %xmm6,%xmm1 + paddd %xmm1,%xmm5 + movdqa %xmm3,%xmm4 + movdqa %xmm5,%xmm6 + pslld $18,%xmm5 + psrld $14,%xmm6 + pxor %xmm5,%xmm0 + pshufd $0x39,%xmm1,%xmm1 + pxor %xmm6,%xmm0 + paddd %xmm0,%xmm4 + movdqa %xmm0,%xmm5 + movdqa %xmm4,%xmm6 + pslld $7,%xmm4 + psrld $25,%xmm6 + pxor %xmm4,%xmm1 + pxor %xmm6,%xmm1 + paddd %xmm1,%xmm5 + movdqa %xmm1,%xmm4 + movdqa %xmm5,%xmm6 + pslld $9,%xmm5 + psrld $23,%xmm6 + pxor %xmm5,%xmm2 + pshufd $0x93,%xmm1,%xmm1 + pxor %xmm6,%xmm2 + paddd %xmm2,%xmm4 + movdqa %xmm2,%xmm5 + movdqa %xmm4,%xmm6 + pslld $13,%xmm4 + psrld $19,%xmm6 + pxor %xmm4,%xmm3 + pshufd $0x4e,%xmm2,%xmm2 + pxor %xmm6,%xmm3 + paddd %xmm3,%xmm5 + movdqa %xmm1,%xmm4 + movdqa %xmm5,%xmm6 + pslld $18,%xmm5 + psrld $14,%xmm6 + pxor %xmm5,%xmm0 + 
pshufd $0x39,%xmm3,%xmm3 + pxor %xmm6,%xmm0 + paddd %xmm0,%xmm4 + movdqa %xmm0,%xmm5 + movdqa %xmm4,%xmm6 + pslld $7,%xmm4 + psrld $25,%xmm6 + pxor %xmm4,%xmm3 + pxor %xmm6,%xmm3 + paddd %xmm3,%xmm5 + movdqa %xmm3,%xmm4 + movdqa %xmm5,%xmm6 + pslld $9,%xmm5 + psrld $23,%xmm6 + pxor %xmm5,%xmm2 + pshufd $0x93,%xmm3,%xmm3 + pxor %xmm6,%xmm2 + paddd %xmm2,%xmm4 + movdqa %xmm2,%xmm5 + movdqa %xmm4,%xmm6 + pslld $13,%xmm4 + psrld $19,%xmm6 + pxor %xmm4,%xmm1 + pshufd $0x4e,%xmm2,%xmm2 + pxor %xmm6,%xmm1 + paddd %xmm1,%xmm5 + movdqa %xmm3,%xmm4 + movdqa %xmm5,%xmm6 + pslld $18,%xmm5 + psrld $14,%xmm6 + pxor %xmm5,%xmm0 + pshufd $0x39,%xmm1,%xmm1 + pxor %xmm6,%xmm0 + paddd %xmm0,%xmm4 + movdqa %xmm0,%xmm5 + movdqa %xmm4,%xmm6 + pslld $7,%xmm4 + psrld $25,%xmm6 + pxor %xmm4,%xmm1 + pxor %xmm6,%xmm1 + paddd %xmm1,%xmm5 + movdqa %xmm1,%xmm4 + movdqa %xmm5,%xmm6 + pslld $9,%xmm5 + psrld $23,%xmm6 + pxor %xmm5,%xmm2 + pshufd $0x93,%xmm1,%xmm1 + pxor %xmm6,%xmm2 + paddd %xmm2,%xmm4 + movdqa %xmm2,%xmm5 + movdqa %xmm4,%xmm6 + pslld $13,%xmm4 + psrld $19,%xmm6 + pxor %xmm4,%xmm3 + pshufd $0x4e,%xmm2,%xmm2 + pxor %xmm6,%xmm3 + sub $4,%rdx + paddd %xmm3,%xmm5 + movdqa %xmm1,%xmm4 + movdqa %xmm5,%xmm6 + pslld $18,%xmm5 + pxor %xmm7,%xmm7 + psrld $14,%xmm6 + pxor %xmm5,%xmm0 + pshufd $0x39,%xmm3,%xmm3 + pxor %xmm6,%xmm0 + ja ._mainloop2 + paddd 0(%r8),%xmm0 + paddd 16(%r8),%xmm1 + paddd 32(%r8),%xmm2 + paddd 48(%r8),%xmm3 + movd %xmm0,%rdx + movd %xmm1,%rcx + movd %xmm2,%rax + movd %xmm3,%r10 + pshufd $0x39,%xmm0,%xmm0 + pshufd $0x39,%xmm1,%xmm1 + pshufd $0x39,%xmm2,%xmm2 + pshufd $0x39,%xmm3,%xmm3 + xorl 0(%rsi),%edx + xorl 48(%rsi),%ecx + xorl 32(%rsi),%eax + xorl 16(%rsi),%r10d + movl %edx,0(%rdi) + movl %ecx,48(%rdi) + movl %eax,32(%rdi) + movl %r10d,16(%rdi) + movd %xmm0,%rdx + movd %xmm1,%rcx + movd %xmm2,%rax + movd %xmm3,%r10 + pshufd $0x39,%xmm0,%xmm0 + pshufd $0x39,%xmm1,%xmm1 + pshufd $0x39,%xmm2,%xmm2 + pshufd $0x39,%xmm3,%xmm3 + xorl 20(%rsi),%edx + xorl 4(%rsi),%ecx + 
xorl 52(%rsi),%eax + xorl 36(%rsi),%r10d + movl %edx,20(%rdi) + movl %ecx,4(%rdi) + movl %eax,52(%rdi) + movl %r10d,36(%rdi) + movd %xmm0,%rdx + movd %xmm1,%rcx + movd %xmm2,%rax + movd %xmm3,%r10 + pshufd $0x39,%xmm0,%xmm0 + pshufd $0x39,%xmm1,%xmm1 + pshufd $0x39,%xmm2,%xmm2 + pshufd $0x39,%xmm3,%xmm3 + xorl 40(%rsi),%edx + xorl 24(%rsi),%ecx + xorl 8(%rsi),%eax + xorl 56(%rsi),%r10d + movl %edx,40(%rdi) + movl %ecx,24(%rdi) + movl %eax,8(%rdi) + movl %r10d,56(%rdi) + movd %xmm0,%rdx + movd %xmm1,%rcx + movd %xmm2,%rax + movd %xmm3,%r10 + xorl 60(%rsi),%edx + xorl 44(%rsi),%ecx + xorl 28(%rsi),%eax + xorl 12(%rsi),%r10d + movl %edx,60(%rdi) + movl %ecx,44(%rdi) + movl %eax,28(%rdi) + movl %r10d,12(%rdi) + movq 288(%rsp),%rdx + movl 32(%r8),%ecx + movl 52(%r8),%eax + add $1,%ecx + adc $0,%eax + movl %ecx,32(%r8) + movl %eax,52(%r8) + cmp $64,%rdx + ja ._bytes_are_128_or_192 +._done: + add %r11,%rsp + mov %r11,%rax + pop %rbx + ret +._bytes_are_128_or_192: + sub $64,%rdx + add $64,%rdi + add $64,%rsi + jmp ._bytes_are_64_128_or_192 +.size _gcry_salsa20_amd64_encrypt_blocks,.-_gcry_salsa20_amd64_encrypt_blocks; + +#endif /*defined(USE_SALSA20)*/ +#endif /*__x86_64*/ diff --git a/cipher/salsa20.c b/cipher/salsa20.c index 6189bca..892b9fc 100644 --- a/cipher/salsa20.c +++ b/cipher/salsa20.c @@ -40,6 +40,14 @@ #include "cipher.h" #include "bufhelp.h" + +/* USE_AMD64 indicates whether to compile with AMD64 code. */ +#undef USE_AMD64 +#if defined(__x86_64__) && defined(HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS) +# define USE_AMD64 1 +#endif + + #define SALSA20_MIN_KEY_SIZE 16 /* Bytes. */ #define SALSA20_MAX_KEY_SIZE 32 /* Bytes. */ #define SALSA20_BLOCK_SIZE 64 /* Bytes. */ @@ -83,6 +91,36 @@ typedef struct static void salsa20_setiv (void *context, const byte *iv, unsigned int ivlen); static const char *selftest (void); + +#ifdef USE_AMD64 +/* AMD64 assembly implementations of Salsa20. 
*/ +void _gcry_salsa20_amd64_keysetup(u32 *ctxinput, const void *key, int keybits); +void _gcry_salsa20_amd64_ivsetup(u32 *ctxinput, const void *iv); +unsigned int +_gcry_salsa20_amd64_encrypt_blocks(u32 *ctxinput, const void *src, void *dst, + size_t len, int rounds); + +static void +salsa20_keysetup(SALSA20_context_t *ctx, const byte *key, int keylen) +{ + _gcry_salsa20_amd64_keysetup(ctx->input, key, keylen * 8); +} + +static void +salsa20_ivsetup(SALSA20_context_t *ctx, const byte *iv) +{ + _gcry_salsa20_amd64_ivsetup(ctx->input, iv); +} + +static unsigned int +salsa20_core (u32 *dst, u32 *src, unsigned int rounds) +{ + memset(dst, 0, SALSA20_BLOCK_SIZE); + return _gcry_salsa20_amd64_encrypt_blocks(src, dst, dst, 1, rounds); +} + +#else /* USE_AMD64 */ + #if 0 @@ -110,8 +148,8 @@ static const char *selftest (void); x0 ^= ROTL32 (18, x3 + x2); \ } while(0) -static void -salsa20_core (u32 *dst, const u32 *src, unsigned rounds) +static unsigned int +salsa20_core (u32 *dst, u32 *src, unsigned int rounds) { u32 pad[SALSA20_INPUT_LENGTH]; unsigned int i; @@ -138,31 +176,24 @@ salsa20_core (u32 *dst, const u32 *src, unsigned rounds) u32 t = pad[i] + src[i]; dst[i] = LE_SWAP32 (t); } + + /* Update counter. 
*/ + if (!++src[8]) + src[9]++; + + /* burn_stack */ + return ( 3*sizeof (void*) \ + + 2*sizeof (void*) \ + + 64 \ + + sizeof (unsigned int) \ + + sizeof (u32) ); } #undef QROUND #undef SALSA20_CORE_DEBUG -static gcry_err_code_t -salsa20_do_setkey (SALSA20_context_t *ctx, - const byte *key, unsigned int keylen) +static void +salsa20_keysetup(SALSA20_context_t *ctx, const byte *key, int keylen) { - static int initialized; - static const char *selftest_failed; - - if (!initialized ) - { - initialized = 1; - selftest_failed = selftest (); - if (selftest_failed) - log_error ("SALSA20 selftest failed (%s)\n", selftest_failed ); - } - if (selftest_failed) - return GPG_ERR_SELFTEST_FAILED; - - if (keylen != SALSA20_MIN_KEY_SIZE - && keylen != SALSA20_MAX_KEY_SIZE) - return GPG_ERR_INV_KEYLEN; - /* These constants are the little endian encoding of the string "expand 32-byte k". For the 128 bit variant, the "32" in that string will be fixed up to "16". */ @@ -192,6 +223,41 @@ salsa20_do_setkey (SALSA20_context_t *ctx, ctx->input[5] -= 0x02000000; /* Change to "1 dn". */ ctx->input[10] += 0x00000004; /* Change to "yb-6". */ } +} + +static void salsa20_ivsetup(SALSA20_context_t *ctx, const byte *iv) +{ + ctx->input[6] = LE_READ_UINT32(iv + 0); + ctx->input[7] = LE_READ_UINT32(iv + 4); + /* Reset the block counter. */ + ctx->input[8] = 0; + ctx->input[9] = 0; +} + +#endif /*!USE_AMD64*/ + +static gcry_err_code_t +salsa20_do_setkey (SALSA20_context_t *ctx, + const byte *key, unsigned int keylen) +{ + static int initialized; + static const char *selftest_failed; + + if (!initialized ) + { + initialized = 1; + selftest_failed = selftest (); + if (selftest_failed) + log_error ("SALSA20 selftest failed (%s)\n", selftest_failed ); + } + if (selftest_failed) + return GPG_ERR_SELFTEST_FAILED; + + if (keylen != SALSA20_MIN_KEY_SIZE + && keylen != SALSA20_MAX_KEY_SIZE) + return GPG_ERR_INV_KEYLEN; + + salsa20_keysetup (ctx, key, keylen); /* We default to a zero nonce. 
*/ salsa20_setiv (ctx, NULL, 0); @@ -205,7 +271,7 @@ salsa20_setkey (void *context, const byte *key, unsigned int keylen) { SALSA20_context_t *ctx = (SALSA20_context_t *)context; gcry_err_code_t rc = salsa20_do_setkey (ctx, key, keylen); - _gcry_burn_stack (300/* FIXME*/); + _gcry_burn_stack (4 + sizeof (void *) + 4 * sizeof (void *)); return rc; } @@ -214,28 +280,22 @@ static void salsa20_setiv (void *context, const byte *iv, unsigned int ivlen) { SALSA20_context_t *ctx = (SALSA20_context_t *)context; + byte tmp[SALSA20_IV_SIZE]; - if (!iv) - { - ctx->input[6] = 0; - ctx->input[7] = 0; - } - else if (ivlen == SALSA20_IV_SIZE) - { - ctx->input[6] = LE_READ_UINT32(iv + 0); - ctx->input[7] = LE_READ_UINT32(iv + 4); - } + if (iv && ivlen != SALSA20_IV_SIZE) + log_info ("WARNING: salsa20_setiv: bad ivlen=%u\n", ivlen); + + if (!iv || ivlen != SALSA20_IV_SIZE) + memset (tmp, 0, sizeof(tmp)); else - { - log_info ("WARNING: salsa20_setiv: bad ivlen=%u\n", ivlen); - ctx->input[6] = 0; - ctx->input[7] = 0; - } - /* Reset the block counter. */ - ctx->input[8] = 0; - ctx->input[9] = 0; + memcpy (tmp, iv, SALSA20_IV_SIZE); + + salsa20_ivsetup (ctx, tmp); + /* Reset the unused pad bytes counter. 
*/ ctx->unused = 0; + + wipememory (tmp, sizeof(tmp)); } @@ -246,6 +306,8 @@ salsa20_do_encrypt_stream (SALSA20_context_t *ctx, byte *outbuf, const byte *inbuf, unsigned int length, unsigned rounds) { + unsigned int nburn, burn = 0; + if (ctx->unused) { unsigned char *p = (void*)ctx->pad; @@ -266,26 +328,39 @@ salsa20_do_encrypt_stream (SALSA20_context_t *ctx, gcry_assert (!ctx->unused); } - for (;;) +#ifdef USE_AMD64 + if (length >= SALSA20_BLOCK_SIZE) + { + unsigned int nblocks = length / SALSA20_BLOCK_SIZE; + burn = _gcry_salsa20_amd64_encrypt_blocks(ctx->input, inbuf, outbuf, + nblocks, rounds); + length -= SALSA20_BLOCK_SIZE * nblocks; + outbuf += SALSA20_BLOCK_SIZE * nblocks; + inbuf += SALSA20_BLOCK_SIZE * nblocks; + } +#endif + + while (length > 0) { /* Create the next pad and bump the block counter. Note that it is the user's duty to change to another nonce not later than after 2^70 processed bytes. */ - salsa20_core (ctx->pad, ctx->input, rounds); - if (!++ctx->input[8]) - ctx->input[9]++; + nburn = salsa20_core (ctx->pad, ctx->input, rounds); + burn = nburn > burn ? 
nburn : burn; if (length <= SALSA20_BLOCK_SIZE) { buf_xor (outbuf, inbuf, ctx->pad, length); ctx->unused = SALSA20_BLOCK_SIZE - length; - return; + break; } buf_xor (outbuf, inbuf, ctx->pad, SALSA20_BLOCK_SIZE); length -= SALSA20_BLOCK_SIZE; outbuf += SALSA20_BLOCK_SIZE; inbuf += SALSA20_BLOCK_SIZE; - } + } + + _gcry_burn_stack (burn); } @@ -296,19 +371,7 @@ salsa20_encrypt_stream (void *context, SALSA20_context_t *ctx = (SALSA20_context_t *)context; if (length) - { - salsa20_do_encrypt_stream (ctx, outbuf, inbuf, length, SALSA20_ROUNDS); - _gcry_burn_stack (/* salsa20_do_encrypt_stream: */ - 2*sizeof (void*) - + 3*sizeof (void*) + sizeof (unsigned int) - /* salsa20_core: */ - + 2*sizeof (void*) - + 2*sizeof (void*) - + 64 - + sizeof (unsigned int) - + sizeof (u32) - ); - } + salsa20_do_encrypt_stream (ctx, outbuf, inbuf, length, SALSA20_ROUNDS); } @@ -319,19 +382,7 @@ salsa20r12_encrypt_stream (void *context, SALSA20_context_t *ctx = (SALSA20_context_t *)context; if (length) - { - salsa20_do_encrypt_stream (ctx, outbuf, inbuf, length, SALSA20R12_ROUNDS); - _gcry_burn_stack (/* salsa20_do_encrypt_stream: */ - 2*sizeof (void*) - + 3*sizeof (void*) + sizeof (unsigned int) - /* salsa20_core: */ - + 2*sizeof (void*) - + 2*sizeof (void*) - + 64 - + sizeof (unsigned int) - + sizeof (u32) - ); - } + salsa20_do_encrypt_stream (ctx, outbuf, inbuf, length, SALSA20R12_ROUNDS); } diff --git a/configure.ac b/configure.ac index 5b7ba0d..114460c 100644 --- a/configure.ac +++ b/configure.ac @@ -1553,6 +1553,13 @@ LIST_MEMBER(salsa20, $enabled_ciphers) if test "$found" = "1" ; then GCRYPT_CIPHERS="$GCRYPT_CIPHERS salsa20.lo" AC_DEFINE(USE_SALSA20, 1, [Defined if this module should be included]) + + case "${host}" in + x86_64-*-*) + # Build with the assembly implementation + GCRYPT_CIPHERS="$GCRYPT_CIPHERS salsa20-amd64.lo" + ;; + esac fi LIST_MEMBER(gost28147, $enabled_ciphers) ----------------------------------------------------------------------- Summary of changes: 
cipher/Makefile.am | 2 +- cipher/salsa20-amd64.S | 924 +++++++++++++++++++++++++++++++++++++++++++ cipher/salsa20-armv7-neon.S | 899 +++++++++++++++++++++++++++++++++++++++++ cipher/salsa20.c | 316 +++++++++++---- configure.ac | 12 + 5 files changed, 2076 insertions(+), 77 deletions(-) create mode 100644 cipher/salsa20-amd64.S create mode 100644 cipher/salsa20-armv7-neon.S hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From cvs at cvs.gnupg.org Mon Oct 28 16:22:06 2013 From: cvs at cvs.gnupg.org (by Jussi Kivilinna) Date: Mon, 28 Oct 2013 16:22:06 +0100 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-344-g1faa618 Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via 1faa61845f180bd47e037e400dde2d864ee83c89 (commit) via 2cb6e1f323d24359b1c5b113be5c2f79a2a4cded (commit) from 3ff9d2571c18cd7a34359f9c60a10d3b0f932b23 (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit 1faa61845f180bd47e037e400dde2d864ee83c89 Author: Jussi Kivilinna Date: Mon Oct 28 17:11:21 2013 +0200 Fix typos in documentation * doc/gcrypt.texi: Fix some typos. -- Signed-off-by: Jussi Kivilinna diff --git a/doc/gcrypt.texi b/doc/gcrypt.texi index 91fe399..6dcb4b1 100644 --- a/doc/gcrypt.texi +++ b/doc/gcrypt.texi @@ -390,7 +390,7 @@ and freed memory, you need to initialize Libgcrypt this way: @example /* Version check should be the very first call because it - makes sure that important subsystems are intialized. 
*/ + makes sure that important subsystems are initialized. */ if (!gcry_check_version (GCRYPT_VERSION)) @{ fputs ("libgcrypt version mismatch\n", stderr); @@ -405,7 +405,7 @@ and freed memory, you need to initialize Libgcrypt this way: /* ... If required, other initialization goes here. Note that the process might still be running with increased privileges and that - the secure memory has not been intialized. */ + the secure memory has not been initialized. */ /* Allocate a pool of 16k secure memory. This make the secure memory available and also drops privileges where needed. */ @@ -642,9 +642,9 @@ callbacks. @item GCRYCTL_ENABLE_QUICK_RANDOM; Arguments: none This command inhibits the use the very secure random quality level (@code{GCRY_VERY_STRONG_RANDOM}) and degrades all request down to - at code{GCRY_STRONG_RANDOM}. In general this is not recommened. However, + at code{GCRY_STRONG_RANDOM}. In general this is not recommended. However, for some applications the extra quality random Libgcrypt tries to create -is not justified and this option may help to get better performace. +is not justified and this option may help to get better performance. Please check with a crypto expert whether this option can be used for your application. @@ -652,19 +652,19 @@ This option can only be used at initialization time. @item GCRYCTL_DUMP_RANDOM_STATS; Arguments: none -This command dumps randum number generator related statistics to the +This command dumps random number generator related statistics to the library's logging stream. @item GCRYCTL_DUMP_MEMORY_STATS; Arguments: none -This command dumps memory managment related statistics to the library's +This command dumps memory management related statistics to the library's logging stream. @item GCRYCTL_DUMP_SECMEM_STATS; Arguments: none -This command dumps secure memory manamgent related statistics to the +This command dumps secure memory management related statistics to the library's logging stream. 
@item GCRYCTL_DROP_PRIVS; Arguments: none -This command disables the use of secure memory and drops the priviliges +This command disables the use of secure memory and drops the privileges of the current process. This command has not much use; the suggested way to disable secure memory is to use @code{GCRYCTL_DISABLE_SECMEM} right after initialization. @@ -758,7 +758,7 @@ these different instances is correlated to some extent. In a perfect attack scenario, the attacker can control (or at least guess) the PID and clock of the application, and drain the system's entropy pool to reduce the "up to 16 bytes" above to 0. Then the dependencies of the -inital states of the pools are completely known. Note that this is not +initial states of the pools are completely known. Note that this is not an issue if random of @code{GCRY_VERY_STRONG_RANDOM} quality is requested as in this case enough extra entropy gets mixed. It is also not an issue when using Linux (rndlinux driver), because this one @@ -795,11 +795,11 @@ This command does nothing. It exists only for backward compatibility. This command returns true if the library has been basically initialized. Such a basic initialization happens implicitly with many commands to get certain internal subsystems running. The common and suggested way to -do this basic intialization is by calling gcry_check_version. +do this basic initialization is by calling gcry_check_version. @item GCRYCTL_INITIALIZATION_FINISHED; Arguments: none This command tells the library that the application has finished the -intialization. +initialization. @item GCRYCTL_INITIALIZATION_FINISHED_P; Arguments: none This command returns true if the command@* @@ -825,7 +825,7 @@ proper random device. @item GCRYCTL_PRINT_CONFIG; Arguments: FILE *stream This command dumps information pertaining to the configuration of the library to the given stream. If NULL is given for @var{stream}, the log -system is used. 
This command may be used before the intialization has +system is used. This command may be used before the initialization has been finished but not before a @code{gcry_check_version}. @item GCRYCTL_OPERATIONAL_P; Arguments: none @@ -833,12 +833,12 @@ This command returns true if the library is in an operational state. This information makes only sense in FIPS mode. In contrast to other functions, this is a pure test function and won't put the library into FIPS mode or change the internal state. This command may be used before -the intialization has been finished but not before a @code{gcry_check_version}. +the initialization has been finished but not before a @code{gcry_check_version}. @item GCRYCTL_FIPS_MODE_P; Arguments: none This command returns true if the library is in FIPS mode. Note, that this is no indication about the current state of the library. This -command may be used before the intialization has been finished but not +command may be used before the initialization has been finished but not before a @code{gcry_check_version}. An application may use this command or the convenience macro below to check whether FIPS mode is actually active. @@ -857,7 +857,7 @@ already in FIPS mode, a self-test is triggered and thus the library will be put into operational state. This command may be used before a call to @code{gcry_check_version} and that is actually the recommended way to let an application switch the library into FIPS mode. Note that Libgcrypt will -reject an attempt to switch to fips mode during or after the intialization. +reject an attempt to switch to fips mode during or after the initialization. @item GCRYCTL_SET_ENFORCED_FIPS_FLAG; Arguments: none Running this command sets the internal flag that puts the library into @@ -866,7 +866,7 @@ does not affect the library if the library is not put into the FIPS mode and it must be used before any other libgcrypt library calls that initialize the library such as @code{gcry_check_version}. 
Note that Libgcrypt will reject an attempt to switch to the enforced fips mode during or after -the intialization. +the initialization. @item GCRYCTL_SET_PREFERRED_RNG_TYPE; Arguments: int These are advisory commands to select a certain random number @@ -875,7 +875,7 @@ an application actually wants or vice versa. Thus Libgcrypt employs a priority check to select the actually used RNG. If an applications selects a lower priority RNG but a library requests a higher priority RNG Libgcrypt will switch to the higher priority RNG. Applications -and libaries should use these control codes before +and libraries should use these control codes before @code{gcry_check_version}. The available generators are: @table @code @item GCRY_RNG_TYPE_STANDARD @@ -907,8 +907,8 @@ success or an error code on failure. @item GCRYCTL_DISABLE_HWF; Arguments: const char *name Libgcrypt detects certain features of the CPU at startup time. For -performace tests it is sometimes required not to use such a feature. -This option may be used to disabale a certain feature; i.e. Libgcrypt +performance tests it is sometimes required not to use such a feature. +This option may be used to disable a certain feature; i.e. Libgcrypt behaves as if this feature has not been detected. Note that the detection code might be run if the feature has been disabled. This command must be used at initialization time; i.e. before calling @@ -1929,7 +1929,7 @@ checking. @deftypefun size_t gcry_cipher_get_algo_blklen (int @var{algo}) -This functions returns the blocklength of the algorithm @var{algo} +This functions returns the block-length of the algorithm @var{algo} counted in octets. On error @code{0} is returned. This is a convenience functions which should be preferred over @@ -2292,7 +2292,7 @@ will be changed to implement 186-3. @item use-fips186-2 @cindex FIPS 186-2 Force the use of the FIPS 186-2 key generation algorithm instead of -the default algorithm. 
This algorithm is slighlty different from +the default algorithm. This algorithm is slightly different from FIPS 186-3 and allows only 1024 bit keys. This flag is only meaningful for DSA and only required for FIPS testing backward compatibility. @@ -4547,7 +4547,7 @@ Convenience function to release the @var{factors} array. @deftypefun gcry_error_t gcry_prime_check (gcry_mpi_t @var{p}, unsigned int @var{flags}) -Check wether the number @var{p} is prime. Returns zero in case @var{p} +Check whether the number @var{p} is prime. Returns zero in case @var{p} is indeed a prime, returns @code{GPG_ERR_NO_PRIME} in case @var{p} is not a prime and a different error code in case something went horribly wrong. @@ -4988,7 +4988,7 @@ checking function is exported as well. The generation of random prime numbers is based on the Lim and Lee algorithm to create practically save primes. at footnote{Chae Hoon Lim -and Pil Joong Lee. A key recovery attack on discrete log-based shemes +and Pil Joong Lee. A key recovery attack on discrete log-based schemes using a prime order subgroup. In Burton S. Kaliski Jr., editor, Advances in Cryptology: Crypto '97, pages 249?-263, Berlin / Heidelberg / New York, 1997. Springer-Verlag. Described on page 260.} @@ -5147,7 +5147,7 @@ output blocks. On Unix like systems the @code{GCRY_VERY_STRONG_RANDOM} and @code{GCRY_STRONG_RANDOM} generators are keyed and seeded using the -rndlinux module with the @file{/dev/radnom} device. Thus these +rndlinux module with the @file{/dev/random} device. Thus these generators may block until the OS kernel has collected enough entropy. When used with Microsoft Windows the rndw32 module is used instead. @@ -5162,7 +5162,7 @@ entropy for use by the ``real'' random generators. A self-test facility uses a separate context to check the functionality of the core X9.31 functions using a known answers test. During runtime each output block is compared to the previous one to -detect a stucked generator. 
+detect a stuck generator. The DT value for the generator is made up of the current time down to microseconds (if available) and a free running 64 bit counter. When @@ -5188,7 +5188,7 @@ incremented on each use. @c them. To use an S-expression with Libgcrypt it needs first be @c converted into the internal representation used by Libgcrypt (the type @c @code{gcry_sexp_t}). The conversion functions support a large subset - at c of the S-expression specification and further fature a printf like + at c of the S-expression specification and further feature a printf like @c function to convert a list of big integers or other binary data into @c an S-expression. @c @@ -5357,8 +5357,8 @@ The result is verified using the public key against the original data and against modified data. (@code{cipher/@/rsa.c:@/selftest_sign_1024}) @item A 1000 bit random value is encrypted and checked that it does not -match the orginal random value. The encrtypted result is then -decrypted and checked that it macthes the original random value. +match the original random value. The encrypted result is then +decrypted and checked that it matches the original random value. (@code{cipher/@/rsa.c:@/selftest_encr_1024}) @end enumerate @@ -5401,7 +5401,7 @@ keys. The table itself is protected using a SHA-1 hash. @c -------------------------------- @section Conditional Tests -The conditional tests are performed if a certain contidion is met. +The conditional tests are performed if a certain condition is met. This may occur at any time; the library does not necessary enter the ``Self-Test'' state to run these tests but will transit to the ``Error'' state if a test failed. @@ -5696,7 +5696,7 @@ documentation only. @item Power-On Libgcrypt is loaded into memory and API calls may be made. Compiler -introducted constructor functions may be run. Note that Libgcrypt does +introduced constructor functions may be run. 
Note that Libgcrypt does not implement any arbitrary constructor functions to be called by the operating system @@ -5721,7 +5721,7 @@ will automatically transit into the Shutdown state. @item Shutdown Libgcrypt is about to be terminated and removed from the memory. The -application may at this point still runing cleanup handlers. +application may at this point still running cleanup handlers. @end table @end float @@ -5738,18 +5738,18 @@ a shared library and having it linked to an application. @item 2 Power-On to Init is triggered by the application calling the -Libgcrypt intialization function @code{gcry_check_version}. +Libgcrypt initialization function @code{gcry_check_version}. @item 3 -Init to Self-Test is either triggred by a dedicated API call or implicit -by invoking a libgrypt service conrolled by the FSM. +Init to Self-Test is either triggered by a dedicated API call or implicit +by invoking a libgrypt service controlled by the FSM. @item 4 Self-Test to Operational is triggered after all self-tests passed successfully. @item 5 -Operational to Shutdown is an artifical state without any direct action +Operational to Shutdown is an artificial state without any direct action in Libgcrypt. When reaching the Shutdown state the library is deinitialized and can't return to any other state again. @@ -5770,7 +5770,7 @@ Error to Shutdown is similar to the Operational to Shutdown transition (5). @item 9 -Error to Fatal-Error is triggred if Libgrypt detects an fatal error +Error to Fatal-Error is triggered if Libgrypt detects an fatal error while already being in Error state. @item 10 @@ -5778,26 +5778,26 @@ Fatal-Error to Shutdown is automatically entered by Libgcrypt after having reported the error. @item 11 -Power-On to Shutdown is an artifical state to document that Libgcrypt -has not ye been initializaed but the process is about to terminate. 
+Power-On to Shutdown is an artificial state to document that Libgcrypt +has not ye been initialized but the process is about to terminate. @item 12 -Power-On to Fatal-Error will be triggerd if certain Libgcrypt functions +Power-On to Fatal-Error will be triggered if certain Libgcrypt functions are used without having reached the Init state. @item 13 -Self-Test to Fatal-Error is triggred by severe errors in Libgcrypt while +Self-Test to Fatal-Error is triggered by severe errors in Libgcrypt while running self-tests. @item 14 -Self-Test to Error is triggred by a failed self-test. +Self-Test to Error is triggered by a failed self-test. @item 15 Operational to Fatal-Error is triggered if Libcrypt encountered a non-recoverable error. @item 16 -Operational to Self-Test is triggred if the application requested to run +Operational to Self-Test is triggered if the application requested to run the self-tests again. @item 17 @@ -5868,7 +5868,7 @@ memory and thus also the encryption contexts with these keys. GCRYCTL_SET_RANDOM_DAEMON_SOCKET GCRYCTL_USE_RANDOM_DAEMON -The random damon is still a bit experimental, thus we do not document +The random daemon is still a bit experimental, thus we do not document them. Note that they should be used during initialization and that these functions are not really thread safe. commit 2cb6e1f323d24359b1c5b113be5c2f79a2a4cded Author: Jussi Kivilinna Date: Sun Oct 27 14:07:59 2013 +0200 Add ARM NEON assembly implementation of Serpent * cipher/Makefile.am: Add 'serpent-armv7-neon.S'. * cipher/serpent-armv7-neon.S: New. * cipher/serpent.c (USE_NEON): New macro. (serpent_context_t) [USE_NEON]: Add 'use_neon'. [USE_NEON] (_gcry_serpent_neon_ctr_enc, _gcry_serpent_neon_cfb_dec) (_gcry_serpent_neon_cbc_dec): New prototypes. (serpent_setkey_internal) [USE_NEON]: Detect NEON support. (_gcry_serpent_neon_ctr_enc, _gcry_serpent_neon_cfb_dec) (_gcry_serpent_neon_cbc_dec) [USE_NEON]: Use NEON implementations to process eight blocks in parallel. 
* configure.ac [neonsupport]: Add 'serpent-armv7-neon.lo'. -- Patch adds ARM NEON optimized implementation of Serpent cipher to speed up parallelizable bulk operations. Benchmarks on ARM Cortex-A8 (armhf, 1008 Mhz): Old: SERPENT128 | nanosecs/byte mebibytes/sec cycles/byte CBC dec | 43.53 ns/B 21.91 MiB/s 43.88 c/B CFB dec | 44.77 ns/B 21.30 MiB/s 45.13 c/B CTR enc | 45.21 ns/B 21.10 MiB/s 45.57 c/B CTR dec | 45.21 ns/B 21.09 MiB/s 45.57 c/B New: SERPENT128 | nanosecs/byte mebibytes/sec cycles/byte CBC dec | 26.26 ns/B 36.32 MiB/s 26.47 c/B CFB dec | 26.21 ns/B 36.38 MiB/s 26.42 c/B CTR enc | 26.20 ns/B 36.40 MiB/s 26.41 c/B CTR dec | 26.20 ns/B 36.40 MiB/s 26.41 c/B Signed-off-by: Jussi Kivilinna diff --git a/cipher/salsa20-armv7-neon.S b/cipher/salsa20-armv7-neon.S index 5b51301..7d31e9f 100644 --- a/cipher/salsa20-armv7-neon.S +++ b/cipher/salsa20-armv7-neon.S @@ -36,7 +36,7 @@ .text .align 2 -.global _gcry_arm_neon_salsa20_encrypt +.globl _gcry_arm_neon_salsa20_encrypt .type _gcry_arm_neon_salsa20_encrypt,%function; _gcry_arm_neon_salsa20_encrypt: /* Modifications: diff --git a/cipher/serpent-armv7-neon.S b/cipher/serpent-armv7-neon.S new file mode 100644 index 0000000..92e95a0 --- /dev/null +++ b/cipher/serpent-armv7-neon.S @@ -0,0 +1,869 @@ +/* serpent-armv7-neon.S - ARM/NEON assembly implementation of Serpent cipher + * + * Copyright ? 2013 Jussi Kivilinna + * + * This file is part of Libgcrypt. + * + * Libgcrypt is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as + * published by the Free Software Foundation; either version 2.1 of + * the License, or (at your option) any later version. + * + * Libgcrypt is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with this program; if not, see . + */ + +#include + +#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) && \ + defined(HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS) && \ + defined(HAVE_GCC_INLINE_ASM_NEON) + +.text + +.syntax unified +.fpu neon +.arm + +/* ARM registers */ +#define RROUND r0 + +/* NEON vector registers */ +#define RA0 q0 +#define RA1 q1 +#define RA2 q2 +#define RA3 q3 +#define RA4 q4 +#define RB0 q5 +#define RB1 q6 +#define RB2 q7 +#define RB3 q8 +#define RB4 q9 + +#define RT0 q10 +#define RT1 q11 +#define RT2 q12 +#define RT3 q13 + +#define RA0d0 d0 +#define RA0d1 d1 +#define RA1d0 d2 +#define RA1d1 d3 +#define RA2d0 d4 +#define RA2d1 d5 +#define RA3d0 d6 +#define RA3d1 d7 +#define RA4d0 d8 +#define RA4d1 d9 +#define RB0d0 d10 +#define RB0d1 d11 +#define RB1d0 d12 +#define RB1d1 d13 +#define RB2d0 d14 +#define RB2d1 d15 +#define RB3d0 d16 +#define RB3d1 d17 +#define RB4d0 d18 +#define RB4d1 d19 +#define RT0d0 d20 +#define RT0d1 d21 +#define RT1d0 d22 +#define RT1d1 d23 +#define RT2d0 d24 +#define RT2d1 d25 + +/********************************************************************** + helper macros + **********************************************************************/ + +#define transpose_4x4(_q0, _q1, _q2, _q3) \ + vtrn.32 _q0, _q1; \ + vtrn.32 _q2, _q3; \ + vswp _q0##d1, _q2##d0; \ + vswp _q1##d1, _q3##d0; + +/********************************************************************** + 8-way serpent + **********************************************************************/ + +/* + * These are the S-Boxes of Serpent from following research paper. + * + * D. A. Osvik, ?Speeding up Serpent,? in Third AES Candidate Conference, + * (New York, New York, USA), p. 317?329, National Institute of Standards and + * Technology, 2000. 
+ * + * Paper is also available at: http://www.ii.uib.no/~osvik/pub/aes3.pdf + * + */ +#define SBOX0(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + veor a3, a3, a0; veor b3, b3, b0; vmov a4, a1; vmov b4, b1; \ + vand a1, a1, a3; vand b1, b1, b3; veor a4, a4, a2; veor b4, b4, b2; \ + veor a1, a1, a0; veor b1, b1, b0; vorr a0, a0, a3; vorr b0, b0, b3; \ + veor a0, a0, a4; veor b0, b0, b4; veor a4, a4, a3; veor b4, b4, b3; \ + veor a3, a3, a2; veor b3, b3, b2; vorr a2, a2, a1; vorr b2, b2, b1; \ + veor a2, a2, a4; veor b2, b2, b4; vmvn a4, a4; vmvn b4, b4; \ + vorr a4, a4, a1; vorr b4, b4, b1; veor a1, a1, a3; veor b1, b1, b3; \ + veor a1, a1, a4; veor b1, b1, b4; vorr a3, a3, a0; vorr b3, b3, b0; \ + veor a1, a1, a3; veor b1, b1, b3; veor a4, a3; veor b4, b3; + +#define SBOX0_INVERSE(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vmvn a2, a2; vmvn b2, b2; vmov a4, a1; vmov b4, b1; \ + vorr a1, a1, a0; vorr b1, b1, b0; vmvn a4, a4; vmvn b4, b4; \ + veor a1, a1, a2; veor b1, b1, b2; vorr a2, a2, a4; vorr b2, b2, b4; \ + veor a1, a1, a3; veor b1, b1, b3; veor a0, a0, a4; veor b0, b0, b4; \ + veor a2, a2, a0; veor b2, b2, b0; vand a0, a0, a3; vand b0, b0, b3; \ + veor a4, a4, a0; veor b4, b4, b0; vorr a0, a0, a1; vorr b0, b0, b1; \ + veor a0, a0, a2; veor b0, b0, b2; veor a3, a3, a4; veor b3, b3, b4; \ + veor a2, a2, a1; veor b2, b2, b1; veor a3, a3, a0; veor b3, b3, b0; \ + veor a3, a3, a1; veor b3, b3, b1;\ + vand a2, a2, a3; vand b2, b2, b3;\ + veor a4, a2; veor b4, b2; + +#define SBOX1(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vmvn a0, a0; vmvn b0, b0; vmvn a2, a2; vmvn b2, b2; \ + vmov a4, a0; vmov b4, b0; vand a0, a0, a1; vand b0, b0, b1; \ + veor a2, a2, a0; veor b2, b2, b0; vorr a0, a0, a3; vorr b0, b0, b3; \ + veor a3, a3, a2; veor b3, b3, b2; veor a1, a1, a0; veor b1, b1, b0; \ + veor a0, a0, a4; veor b0, b0, b4; vorr a4, a4, a1; vorr b4, b4, b1; \ + veor a1, a1, a3; veor b1, b1, b3; vorr a2, a2, a0; vorr b2, b2, b0; \ + vand a2, a2, a4; vand b2, b2, b4; veor a0, 
a0, a1; veor b0, b0, b1; \ + vand a1, a1, a2; vand b1, b1, b2;\ + veor a1, a1, a0; veor b1, b1, b0; vand a0, a0, a2; vand b0, b0, b2; \ + veor a0, a4; veor b0, b4; + +#define SBOX1_INVERSE(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vmov a4, a1; vmov b4, b1; veor a1, a1, a3; veor b1, b1, b3; \ + vand a3, a3, a1; vand b3, b3, b1; veor a4, a4, a2; veor b4, b4, b2; \ + veor a3, a3, a0; veor b3, b3, b0; vorr a0, a0, a1; vorr b0, b0, b1; \ + veor a2, a2, a3; veor b2, b2, b3; veor a0, a0, a4; veor b0, b0, b4; \ + vorr a0, a0, a2; vorr b0, b0, b2; veor a1, a1, a3; veor b1, b1, b3; \ + veor a0, a0, a1; veor b0, b0, b1; vorr a1, a1, a3; vorr b1, b1, b3; \ + veor a1, a1, a0; veor b1, b1, b0; vmvn a4, a4; vmvn b4, b4; \ + veor a4, a4, a1; veor b4, b4, b1; vorr a1, a1, a0; vorr b1, b1, b0; \ + veor a1, a1, a0; veor b1, b1, b0;\ + vorr a1, a1, a4; vorr b1, b1, b4;\ + veor a3, a1; veor b3, b1; + +#define SBOX2(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vmov a4, a0; vmov b4, b0; vand a0, a0, a2; vand b0, b0, b2; \ + veor a0, a0, a3; veor b0, b0, b3; veor a2, a2, a1; veor b2, b2, b1; \ + veor a2, a2, a0; veor b2, b2, b0; vorr a3, a3, a4; vorr b3, b3, b4; \ + veor a3, a3, a1; veor b3, b3, b1; veor a4, a4, a2; veor b4, b4, b2; \ + vmov a1, a3; vmov b1, b3; vorr a3, a3, a4; vorr b3, b3, b4; \ + veor a3, a3, a0; veor b3, b3, b0; vand a0, a0, a1; vand b0, b0, b1; \ + veor a4, a4, a0; veor b4, b4, b0; veor a1, a1, a3; veor b1, b1, b3; \ + veor a1, a1, a4; veor b1, b1, b4; vmvn a4, a4; vmvn b4, b4; + +#define SBOX2_INVERSE(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + veor a2, a2, a3; veor b2, b2, b3; veor a3, a3, a0; veor b3, b3, b0; \ + vmov a4, a3; vmov b4, b3; vand a3, a3, a2; vand b3, b3, b2; \ + veor a3, a3, a1; veor b3, b3, b1; vorr a1, a1, a2; vorr b1, b1, b2; \ + veor a1, a1, a4; veor b1, b1, b4; vand a4, a4, a3; vand b4, b4, b3; \ + veor a2, a2, a3; veor b2, b2, b3; vand a4, a4, a0; vand b4, b4, b0; \ + veor a4, a4, a2; veor b4, b4, b2; vand a2, a2, a1; vand b2, b2, b1; \ + vorr 
a2, a2, a0; vorr b2, b2, b0; vmvn a3, a3; vmvn b3, b3; \ + veor a2, a2, a3; veor b2, b2, b3; veor a0, a0, a3; veor b0, b0, b3; \ + vand a0, a0, a1; vand b0, b0, b1; veor a3, a3, a4; veor b3, b3, b4; \ + veor a3, a0; veor b3, b0; + +#define SBOX3(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vmov a4, a0; vmov b4, b0; vorr a0, a0, a3; vorr b0, b0, b3; \ + veor a3, a3, a1; veor b3, b3, b1; vand a1, a1, a4; vand b1, b1, b4; \ + veor a4, a4, a2; veor b4, b4, b2; veor a2, a2, a3; veor b2, b2, b3; \ + vand a3, a3, a0; vand b3, b3, b0; vorr a4, a4, a1; vorr b4, b4, b1; \ + veor a3, a3, a4; veor b3, b3, b4; veor a0, a0, a1; veor b0, b0, b1; \ + vand a4, a4, a0; vand b4, b4, b0; veor a1, a1, a3; veor b1, b1, b3; \ + veor a4, a4, a2; veor b4, b4, b2; vorr a1, a1, a0; vorr b1, b1, b0; \ + veor a1, a1, a2; veor b1, b1, b2; veor a0, a0, a3; veor b0, b0, b3; \ + vmov a2, a1; vmov b2, b1; vorr a1, a1, a3; vorr b1, b1, b3; \ + veor a1, a0; veor b1, b0; + +#define SBOX3_INVERSE(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vmov a4, a2; vmov b4, b2; veor a2, a2, a1; veor b2, b2, b1; \ + veor a0, a0, a2; veor b0, b0, b2; vand a4, a4, a2; vand b4, b4, b2; \ + veor a4, a4, a0; veor b4, b4, b0; vand a0, a0, a1; vand b0, b0, b1; \ + veor a1, a1, a3; veor b1, b1, b3; vorr a3, a3, a4; vorr b3, b3, b4; \ + veor a2, a2, a3; veor b2, b2, b3; veor a0, a0, a3; veor b0, b0, b3; \ + veor a1, a1, a4; veor b1, b1, b4; vand a3, a3, a2; vand b3, b3, b2; \ + veor a3, a3, a1; veor b3, b3, b1; veor a1, a1, a0; veor b1, b1, b0; \ + vorr a1, a1, a2; vorr b1, b1, b2; veor a0, a0, a3; veor b0, b0, b3; \ + veor a1, a1, a4; veor b1, b1, b4;\ + veor a0, a1; veor b0, b1; + +#define SBOX4(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + veor a1, a1, a3; veor b1, b1, b3; vmvn a3, a3; vmvn b3, b3; \ + veor a2, a2, a3; veor b2, b2, b3; veor a3, a3, a0; veor b3, b3, b0; \ + vmov a4, a1; vmov b4, b1; vand a1, a1, a3; vand b1, b1, b3; \ + veor a1, a1, a2; veor b1, b1, b2; veor a4, a4, a3; veor b4, b4, b3; \ + veor a0, a0, a4; 
veor b0, b0, b4; vand a2, a2, a4; vand b2, b2, b4; \ + veor a2, a2, a0; veor b2, b2, b0; vand a0, a0, a1; vand b0, b0, b1; \ + veor a3, a3, a0; veor b3, b3, b0; vorr a4, a4, a1; vorr b4, b4, b1; \ + veor a4, a4, a0; veor b4, b4, b0; vorr a0, a0, a3; vorr b0, b0, b3; \ + veor a0, a0, a2; veor b0, b0, b2; vand a2, a2, a3; vand b2, b2, b3; \ + vmvn a0, a0; vmvn b0, b0; veor a4, a2; veor b4, b2; + +#define SBOX4_INVERSE(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vmov a4, a2; vmov b4, b2; vand a2, a2, a3; vand b2, b2, b3; \ + veor a2, a2, a1; veor b2, b2, b1; vorr a1, a1, a3; vorr b1, b1, b3; \ + vand a1, a1, a0; vand b1, b1, b0; veor a4, a4, a2; veor b4, b4, b2; \ + veor a4, a4, a1; veor b4, b4, b1; vand a1, a1, a2; vand b1, b1, b2; \ + vmvn a0, a0; vmvn b0, b0; veor a3, a3, a4; veor b3, b3, b4; \ + veor a1, a1, a3; veor b1, b1, b3; vand a3, a3, a0; vand b3, b3, b0; \ + veor a3, a3, a2; veor b3, b3, b2; veor a0, a0, a1; veor b0, b0, b1; \ + vand a2, a2, a0; vand b2, b2, b0; veor a3, a3, a0; veor b3, b3, b0; \ + veor a2, a2, a4; veor b2, b2, b4;\ + vorr a2, a2, a3; vorr b2, b2, b3; veor a3, a3, a0; veor b3, b3, b0; \ + veor a2, a1; veor b2, b1; + +#define SBOX5(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + veor a0, a0, a1; veor b0, b0, b1; veor a1, a1, a3; veor b1, b1, b3; \ + vmvn a3, a3; vmvn b3, b3; vmov a4, a1; vmov b4, b1; \ + vand a1, a1, a0; vand b1, b1, b0; veor a2, a2, a3; veor b2, b2, b3; \ + veor a1, a1, a2; veor b1, b1, b2; vorr a2, a2, a4; vorr b2, b2, b4; \ + veor a4, a4, a3; veor b4, b4, b3; vand a3, a3, a1; vand b3, b3, b1; \ + veor a3, a3, a0; veor b3, b3, b0; veor a4, a4, a1; veor b4, b4, b1; \ + veor a4, a4, a2; veor b4, b4, b2; veor a2, a2, a0; veor b2, b2, b0; \ + vand a0, a0, a3; vand b0, b0, b3; vmvn a2, a2; vmvn b2, b2; \ + veor a0, a0, a4; veor b0, b0, b4; vorr a4, a4, a3; vorr b4, b4, b3; \ + veor a2, a4; veor b2, b4; + +#define SBOX5_INVERSE(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vmvn a1, a1; vmvn b1, b1; vmov a4, a3; vmov b4, b3; \ + veor 
a2, a2, a1; veor b2, b2, b1; vorr a3, a3, a0; vorr b3, b3, b0; \ + veor a3, a3, a2; veor b3, b3, b2; vorr a2, a2, a1; vorr b2, b2, b1; \ + vand a2, a2, a0; vand b2, b2, b0; veor a4, a4, a3; veor b4, b4, b3; \ + veor a2, a2, a4; veor b2, b2, b4; vorr a4, a4, a0; vorr b4, b4, b0; \ + veor a4, a4, a1; veor b4, b4, b1; vand a1, a1, a2; vand b1, b1, b2; \ + veor a1, a1, a3; veor b1, b1, b3; veor a4, a4, a2; veor b4, b4, b2; \ + vand a3, a3, a4; vand b3, b3, b4; veor a4, a4, a1; veor b4, b4, b1; \ + veor a3, a3, a4; veor b3, b3, b4; vmvn a4, a4; vmvn b4, b4; \ + veor a3, a0; veor b3, b0; + +#define SBOX6(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vmvn a2, a2; vmvn b2, b2; vmov a4, a3; vmov b4, b3; \ + vand a3, a3, a0; vand b3, b3, b0; veor a0, a0, a4; veor b0, b0, b4; \ + veor a3, a3, a2; veor b3, b3, b2; vorr a2, a2, a4; vorr b2, b2, b4; \ + veor a1, a1, a3; veor b1, b1, b3; veor a2, a2, a0; veor b2, b2, b0; \ + vorr a0, a0, a1; vorr b0, b0, b1; veor a2, a2, a1; veor b2, b2, b1; \ + veor a4, a4, a0; veor b4, b4, b0; vorr a0, a0, a3; vorr b0, b0, b3; \ + veor a0, a0, a2; veor b0, b0, b2; veor a4, a4, a3; veor b4, b4, b3; \ + veor a4, a4, a0; veor b4, b4, b0; vmvn a3, a3; vmvn b3, b3; \ + vand a2, a2, a4; vand b2, b2, b4;\ + veor a2, a3; veor b2, b3; + +#define SBOX6_INVERSE(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + veor a0, a0, a2; veor b0, b0, b2; vmov a4, a2; vmov b4, b2; \ + vand a2, a2, a0; vand b2, b2, b0; veor a4, a4, a3; veor b4, b4, b3; \ + vmvn a2, a2; vmvn b2, b2; veor a3, a3, a1; veor b3, b3, b1; \ + veor a2, a2, a3; veor b2, b2, b3; vorr a4, a4, a0; vorr b4, b4, b0; \ + veor a0, a0, a2; veor b0, b0, b2; veor a3, a3, a4; veor b3, b3, b4; \ + veor a4, a4, a1; veor b4, b4, b1; vand a1, a1, a3; vand b1, b1, b3; \ + veor a1, a1, a0; veor b1, b1, b0; veor a0, a0, a3; veor b0, b0, b3; \ + vorr a0, a0, a2; vorr b0, b0, b2; veor a3, a3, a1; veor b3, b3, b1; \ + veor a4, a0; veor b4, b0; + +#define SBOX7(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vmov a4, a1; vmov 
b4, b1; vorr a1, a1, a2; vorr b1, b1, b2; \ + veor a1, a1, a3; veor b1, b1, b3; veor a4, a4, a2; veor b4, b4, b2; \ + veor a2, a2, a1; veor b2, b2, b1; vorr a3, a3, a4; vorr b3, b3, b4; \ + vand a3, a3, a0; vand b3, b3, b0; veor a4, a4, a2; veor b4, b4, b2; \ + veor a3, a3, a1; veor b3, b3, b1; vorr a1, a1, a4; vorr b1, b1, b4; \ + veor a1, a1, a0; veor b1, b1, b0; vorr a0, a0, a4; vorr b0, b0, b4; \ + veor a0, a0, a2; veor b0, b0, b2; veor a1, a1, a4; veor b1, b1, b4; \ + veor a2, a2, a1; veor b2, b2, b1; vand a1, a1, a0; vand b1, b1, b0; \ + veor a1, a1, a4; veor b1, b1, b4; vmvn a2, a2; vmvn b2, b2; \ + vorr a2, a2, a0; vorr b2, b2, b0;\ + veor a4, a2; veor b4, b2; + +#define SBOX7_INVERSE(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vmov a4, a2; vmov b4, b2; veor a2, a2, a0; veor b2, b2, b0; \ + vand a0, a0, a3; vand b0, b0, b3; vorr a4, a4, a3; vorr b4, b4, b3; \ + vmvn a2, a2; vmvn b2, b2; veor a3, a3, a1; veor b3, b3, b1; \ + vorr a1, a1, a0; vorr b1, b1, b0; veor a0, a0, a2; veor b0, b0, b2; \ + vand a2, a2, a4; vand b2, b2, b4; vand a3, a3, a4; vand b3, b3, b4; \ + veor a1, a1, a2; veor b1, b1, b2; veor a2, a2, a0; veor b2, b2, b0; \ + vorr a0, a0, a2; vorr b0, b0, b2; veor a4, a4, a1; veor b4, b4, b1; \ + veor a0, a0, a3; veor b0, b0, b3; veor a3, a3, a4; veor b3, b3, b4; \ + vorr a4, a4, a0; vorr b4, b4, b0; veor a3, a3, a2; veor b3, b3, b2; \ + veor a4, a2; veor b4, b2; + +/* Apply SBOX number WHICH to to the block. */ +#define SBOX(which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + SBOX##which (a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) + +/* Apply inverse SBOX number WHICH to to the block. */ +#define SBOX_INVERSE(which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + SBOX##which##_INVERSE (a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) + +/* XOR round key into block state in a0,a1,a2,a3. a4 used as temporary. 
*/ +#define BLOCK_XOR_KEY(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vdup.32 RT3, RT0d0[0]; \ + vdup.32 RT1, RT0d0[1]; \ + vdup.32 RT2, RT0d1[0]; \ + vdup.32 RT0, RT0d1[1]; \ + veor a0, a0, RT3; veor b0, b0, RT3; \ + veor a1, a1, RT1; veor b1, b1, RT1; \ + veor a2, a2, RT2; veor b2, b2, RT2; \ + veor a3, a3, RT0; veor b3, b3, RT0; + +#define BLOCK_LOAD_KEY_ENC() \ + vld1.8 {RT0d0, RT0d1}, [RROUND]!; + +#define BLOCK_LOAD_KEY_DEC() \ + vld1.8 {RT0d0, RT0d1}, [RROUND]; \ + sub RROUND, RROUND, #16 + +/* Apply the linear transformation to BLOCK. */ +#define LINEAR_TRANSFORMATION(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vshl.u32 a4, a0, #13; vshl.u32 b4, b0, #13; \ + vshr.u32 a0, a0, #(32-13); vshr.u32 b0, b0, #(32-13); \ + veor a0, a0, a4; veor b0, b0, b4; \ + vshl.u32 a4, a2, #3; vshl.u32 b4, b2, #3; \ + vshr.u32 a2, a2, #(32-3); vshr.u32 b2, b2, #(32-3); \ + veor a2, a2, a4; veor b2, b2, b4; \ + veor a1, a0, a1; veor b1, b0, b1; \ + veor a1, a2, a1; veor b1, b2, b1; \ + vshl.u32 a4, a0, #3; vshl.u32 b4, b0, #3; \ + veor a3, a2, a3; veor b3, b2, b3; \ + veor a3, a4, a3; veor b3, b4, b3; \ + vshl.u32 a4, a1, #1; vshl.u32 b4, b1, #1; \ + vshr.u32 a1, a1, #(32-1); vshr.u32 b1, b1, #(32-1); \ + veor a1, a1, a4; veor b1, b1, b4; \ + vshl.u32 a4, a3, #7; vshl.u32 b4, b3, #7; \ + vshr.u32 a3, a3, #(32-7); vshr.u32 b3, b3, #(32-7); \ + veor a3, a3, a4; veor b3, b3, b4; \ + veor a0, a1, a0; veor b0, b1, b0; \ + veor a0, a3, a0; veor b0, b3, b0; \ + vshl.u32 a4, a1, #7; vshl.u32 b4, b1, #7; \ + veor a2, a3, a2; veor b2, b3, b2; \ + veor a2, a4, a2; veor b2, b4, b2; \ + vshl.u32 a4, a0, #5; vshl.u32 b4, b0, #5; \ + vshr.u32 a0, a0, #(32-5); vshr.u32 b0, b0, #(32-5); \ + veor a0, a0, a4; veor b0, b0, b4; \ + vshl.u32 a4, a2, #22; vshl.u32 b4, b2, #22; \ + vshr.u32 a2, a2, #(32-22); vshr.u32 b2, b2, #(32-22); \ + veor a2, a2, a4; veor b2, b2, b4; + +/* Apply the inverse linear transformation to BLOCK. 
*/ +#define LINEAR_TRANSFORMATION_INVERSE(a0, a1, a2, a3, a4, b0, b1, b2, b3, b4) \ + vshr.u32 a4, a2, #22; vshr.u32 b4, b2, #22; \ + vshl.u32 a2, a2, #(32-22); vshl.u32 b2, b2, #(32-22); \ + veor a2, a2, a4; veor b2, b2, b4; \ + vshr.u32 a4, a0, #5; vshr.u32 b4, b0, #5; \ + vshl.u32 a0, a0, #(32-5); vshl.u32 b0, b0, #(32-5); \ + veor a0, a0, a4; veor b0, b0, b4; \ + vshl.u32 a4, a1, #7; vshl.u32 b4, b1, #7; \ + veor a2, a3, a2; veor b2, b3, b2; \ + veor a2, a4, a2; veor b2, b4, b2; \ + veor a0, a1, a0; veor b0, b1, b0; \ + veor a0, a3, a0; veor b0, b3, b0; \ + vshr.u32 a4, a3, #7; vshr.u32 b4, b3, #7; \ + vshl.u32 a3, a3, #(32-7); vshl.u32 b3, b3, #(32-7); \ + veor a3, a3, a4; veor b3, b3, b4; \ + vshr.u32 a4, a1, #1; vshr.u32 b4, b1, #1; \ + vshl.u32 a1, a1, #(32-1); vshl.u32 b1, b1, #(32-1); \ + veor a1, a1, a4; veor b1, b1, b4; \ + vshl.u32 a4, a0, #3; vshl.u32 b4, b0, #3; \ + veor a3, a2, a3; veor b3, b2, b3; \ + veor a3, a4, a3; veor b3, b4, b3; \ + veor a1, a0, a1; veor b1, b0, b1; \ + veor a1, a2, a1; veor b1, b2, b1; \ + vshr.u32 a4, a2, #3; vshr.u32 b4, b2, #3; \ + vshl.u32 a2, a2, #(32-3); vshl.u32 b2, b2, #(32-3); \ + veor a2, a2, a4; veor b2, b2, b4; \ + vshr.u32 a4, a0, #13; vshr.u32 b4, b0, #13; \ + vshl.u32 a0, a0, #(32-13); vshl.u32 b0, b0, #(32-13); \ + veor a0, a0, a4; veor b0, b0, b4; + +/* Apply a Serpent round to eight parallel blocks. This macro increments + `round'. */ +#define ROUND(round, which, a0, a1, a2, a3, a4, na0, na1, na2, na3, na4, \ + b0, b1, b2, b3, b4, nb0, nb1, nb2, nb3, nb4) \ + BLOCK_XOR_KEY (a0, a1, a2, a3, a4, b0, b1, b2, b3, b4); \ + BLOCK_LOAD_KEY_ENC (); \ + SBOX (which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4); \ + LINEAR_TRANSFORMATION (na0, na1, na2, na3, na4, nb0, nb1, nb2, nb3, nb4); + +/* Apply the last Serpent round to eight parallel blocks. This macro increments + `round'. 
*/ +#define ROUND_LAST(round, which, a0, a1, a2, a3, a4, na0, na1, na2, na3, na4, \ + b0, b1, b2, b3, b4, nb0, nb1, nb2, nb3, nb4) \ + BLOCK_XOR_KEY (a0, a1, a2, a3, a4, b0, b1, b2, b3, b4); \ + BLOCK_LOAD_KEY_ENC (); \ + SBOX (which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4); \ + BLOCK_XOR_KEY (na0, na1, na2, na3, na4, nb0, nb1, nb2, nb3, nb4); + +/* Apply an inverse Serpent round to eight parallel blocks. This macro + increments `round'. */ +#define ROUND_INVERSE(round, which, a0, a1, a2, a3, a4, \ + na0, na1, na2, na3, na4, \ + b0, b1, b2, b3, b4, \ + nb0, nb1, nb2, nb3, nb4) \ + LINEAR_TRANSFORMATION_INVERSE (a0, a1, a2, a3, a4, b0, b1, b2, b3, b4); \ + SBOX_INVERSE (which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4); \ + BLOCK_XOR_KEY (na0, na1, na2, na3, na4, nb0, nb1, nb2, nb3, nb4); \ + BLOCK_LOAD_KEY_DEC (); + +/* Apply the first inverse Serpent round to eight parallel blocks. This macro + increments `round'. */ +#define ROUND_FIRST_INVERSE(round, which, a0, a1, a2, a3, a4, \ + na0, na1, na2, na3, na4, \ + b0, b1, b2, b3, b4, \ + nb0, nb1, nb2, nb3, nb4) \ + BLOCK_XOR_KEY (a0, a1, a2, a3, a4, b0, b1, b2, b3, b4); \ + BLOCK_LOAD_KEY_DEC (); \ + SBOX_INVERSE (which, a0, a1, a2, a3, a4, b0, b1, b2, b3, b4); \ + BLOCK_XOR_KEY (na0, na1, na2, na3, na4, nb0, nb1, nb2, nb3, nb4); \ + BLOCK_LOAD_KEY_DEC (); + +.align 3 +.type __serpent_enc_blk8,%function; +__serpent_enc_blk8: + /* input: + * r0: round key pointer + * RA0, RA1, RA2, RA3, RB0, RB1, RB2, RB3: eight parallel plaintext + * blocks + * output: + * RA4, RA1, RA2, RA0, RB4, RB1, RB2, RB0: eight parallel + * ciphertext blocks + */ + + transpose_4x4(RA0, RA1, RA2, RA3); + BLOCK_LOAD_KEY_ENC (); + transpose_4x4(RB0, RB1, RB2, RB3); + + ROUND (0, 0, RA0, RA1, RA2, RA3, RA4, RA1, RA4, RA2, RA0, RA3, + RB0, RB1, RB2, RB3, RB4, RB1, RB4, RB2, RB0, RB3); + ROUND (1, 1, RA1, RA4, RA2, RA0, RA3, RA2, RA1, RA0, RA4, RA3, + RB1, RB4, RB2, RB0, RB3, RB2, RB1, RB0, RB4, RB3); + ROUND (2, 2, RA2, RA1, RA0, RA4, RA3, RA0, RA4, 
RA1, RA3, RA2, + RB2, RB1, RB0, RB4, RB3, RB0, RB4, RB1, RB3, RB2); + ROUND (3, 3, RA0, RA4, RA1, RA3, RA2, RA4, RA1, RA3, RA2, RA0, + RB0, RB4, RB1, RB3, RB2, RB4, RB1, RB3, RB2, RB0); + ROUND (4, 4, RA4, RA1, RA3, RA2, RA0, RA1, RA0, RA4, RA2, RA3, + RB4, RB1, RB3, RB2, RB0, RB1, RB0, RB4, RB2, RB3); + ROUND (5, 5, RA1, RA0, RA4, RA2, RA3, RA0, RA2, RA1, RA4, RA3, + RB1, RB0, RB4, RB2, RB3, RB0, RB2, RB1, RB4, RB3); + ROUND (6, 6, RA0, RA2, RA1, RA4, RA3, RA0, RA2, RA3, RA1, RA4, + RB0, RB2, RB1, RB4, RB3, RB0, RB2, RB3, RB1, RB4); + ROUND (7, 7, RA0, RA2, RA3, RA1, RA4, RA4, RA1, RA2, RA0, RA3, + RB0, RB2, RB3, RB1, RB4, RB4, RB1, RB2, RB0, RB3); + ROUND (8, 0, RA4, RA1, RA2, RA0, RA3, RA1, RA3, RA2, RA4, RA0, + RB4, RB1, RB2, RB0, RB3, RB1, RB3, RB2, RB4, RB0); + ROUND (9, 1, RA1, RA3, RA2, RA4, RA0, RA2, RA1, RA4, RA3, RA0, + RB1, RB3, RB2, RB4, RB0, RB2, RB1, RB4, RB3, RB0); + ROUND (10, 2, RA2, RA1, RA4, RA3, RA0, RA4, RA3, RA1, RA0, RA2, + RB2, RB1, RB4, RB3, RB0, RB4, RB3, RB1, RB0, RB2); + ROUND (11, 3, RA4, RA3, RA1, RA0, RA2, RA3, RA1, RA0, RA2, RA4, + RB4, RB3, RB1, RB0, RB2, RB3, RB1, RB0, RB2, RB4); + ROUND (12, 4, RA3, RA1, RA0, RA2, RA4, RA1, RA4, RA3, RA2, RA0, + RB3, RB1, RB0, RB2, RB4, RB1, RB4, RB3, RB2, RB0); + ROUND (13, 5, RA1, RA4, RA3, RA2, RA0, RA4, RA2, RA1, RA3, RA0, + RB1, RB4, RB3, RB2, RB0, RB4, RB2, RB1, RB3, RB0); + ROUND (14, 6, RA4, RA2, RA1, RA3, RA0, RA4, RA2, RA0, RA1, RA3, + RB4, RB2, RB1, RB3, RB0, RB4, RB2, RB0, RB1, RB3); + ROUND (15, 7, RA4, RA2, RA0, RA1, RA3, RA3, RA1, RA2, RA4, RA0, + RB4, RB2, RB0, RB1, RB3, RB3, RB1, RB2, RB4, RB0); + ROUND (16, 0, RA3, RA1, RA2, RA4, RA0, RA1, RA0, RA2, RA3, RA4, + RB3, RB1, RB2, RB4, RB0, RB1, RB0, RB2, RB3, RB4); + ROUND (17, 1, RA1, RA0, RA2, RA3, RA4, RA2, RA1, RA3, RA0, RA4, + RB1, RB0, RB2, RB3, RB4, RB2, RB1, RB3, RB0, RB4); + ROUND (18, 2, RA2, RA1, RA3, RA0, RA4, RA3, RA0, RA1, RA4, RA2, + RB2, RB1, RB3, RB0, RB4, RB3, RB0, RB1, RB4, RB2); + ROUND (19, 3, RA3, RA0, RA1, 
RA4, RA2, RA0, RA1, RA4, RA2, RA3, + RB3, RB0, RB1, RB4, RB2, RB0, RB1, RB4, RB2, RB3); + ROUND (20, 4, RA0, RA1, RA4, RA2, RA3, RA1, RA3, RA0, RA2, RA4, + RB0, RB1, RB4, RB2, RB3, RB1, RB3, RB0, RB2, RB4); + ROUND (21, 5, RA1, RA3, RA0, RA2, RA4, RA3, RA2, RA1, RA0, RA4, + RB1, RB3, RB0, RB2, RB4, RB3, RB2, RB1, RB0, RB4); + ROUND (22, 6, RA3, RA2, RA1, RA0, RA4, RA3, RA2, RA4, RA1, RA0, + RB3, RB2, RB1, RB0, RB4, RB3, RB2, RB4, RB1, RB0); + ROUND (23, 7, RA3, RA2, RA4, RA1, RA0, RA0, RA1, RA2, RA3, RA4, + RB3, RB2, RB4, RB1, RB0, RB0, RB1, RB2, RB3, RB4); + ROUND (24, 0, RA0, RA1, RA2, RA3, RA4, RA1, RA4, RA2, RA0, RA3, + RB0, RB1, RB2, RB3, RB4, RB1, RB4, RB2, RB0, RB3); + ROUND (25, 1, RA1, RA4, RA2, RA0, RA3, RA2, RA1, RA0, RA4, RA3, + RB1, RB4, RB2, RB0, RB3, RB2, RB1, RB0, RB4, RB3); + ROUND (26, 2, RA2, RA1, RA0, RA4, RA3, RA0, RA4, RA1, RA3, RA2, + RB2, RB1, RB0, RB4, RB3, RB0, RB4, RB1, RB3, RB2); + ROUND (27, 3, RA0, RA4, RA1, RA3, RA2, RA4, RA1, RA3, RA2, RA0, + RB0, RB4, RB1, RB3, RB2, RB4, RB1, RB3, RB2, RB0); + ROUND (28, 4, RA4, RA1, RA3, RA2, RA0, RA1, RA0, RA4, RA2, RA3, + RB4, RB1, RB3, RB2, RB0, RB1, RB0, RB4, RB2, RB3); + ROUND (29, 5, RA1, RA0, RA4, RA2, RA3, RA0, RA2, RA1, RA4, RA3, + RB1, RB0, RB4, RB2, RB3, RB0, RB2, RB1, RB4, RB3); + ROUND (30, 6, RA0, RA2, RA1, RA4, RA3, RA0, RA2, RA3, RA1, RA4, + RB0, RB2, RB1, RB4, RB3, RB0, RB2, RB3, RB1, RB4); + ROUND_LAST (31, 7, RA0, RA2, RA3, RA1, RA4, RA4, RA1, RA2, RA0, RA3, + RB0, RB2, RB3, RB1, RB4, RB4, RB1, RB2, RB0, RB3); + + transpose_4x4(RA4, RA1, RA2, RA0); + transpose_4x4(RB4, RB1, RB2, RB0); + + bx lr; +.size __serpent_enc_blk8,.-__serpent_enc_blk8; + +.align 3 +.type __serpent_dec_blk8,%function; +__serpent_dec_blk8: + /* input: + * r0: round key pointer + * RA0, RA1, RA2, RA3, RB0, RB1, RB2, RB3: eight parallel + * ciphertext blocks + * output: + * RA0, RA1, RA2, RA3, RB0, RB1, RB2, RB3: eight parallel plaintext + * blocks + */ + + add RROUND, RROUND, #(32*16); + + transpose_4x4(RA0, 
RA1, RA2, RA3); + BLOCK_LOAD_KEY_DEC (); + transpose_4x4(RB0, RB1, RB2, RB3); + + ROUND_FIRST_INVERSE (31, 7, RA0, RA1, RA2, RA3, RA4, + RA3, RA0, RA1, RA4, RA2, + RB0, RB1, RB2, RB3, RB4, + RB3, RB0, RB1, RB4, RB2); + ROUND_INVERSE (30, 6, RA3, RA0, RA1, RA4, RA2, RA0, RA1, RA2, RA4, RA3, + RB3, RB0, RB1, RB4, RB2, RB0, RB1, RB2, RB4, RB3); + ROUND_INVERSE (29, 5, RA0, RA1, RA2, RA4, RA3, RA1, RA3, RA4, RA2, RA0, + RB0, RB1, RB2, RB4, RB3, RB1, RB3, RB4, RB2, RB0); + ROUND_INVERSE (28, 4, RA1, RA3, RA4, RA2, RA0, RA1, RA2, RA4, RA0, RA3, + RB1, RB3, RB4, RB2, RB0, RB1, RB2, RB4, RB0, RB3); + ROUND_INVERSE (27, 3, RA1, RA2, RA4, RA0, RA3, RA4, RA2, RA0, RA1, RA3, + RB1, RB2, RB4, RB0, RB3, RB4, RB2, RB0, RB1, RB3); + ROUND_INVERSE (26, 2, RA4, RA2, RA0, RA1, RA3, RA2, RA3, RA0, RA1, RA4, + RB4, RB2, RB0, RB1, RB3, RB2, RB3, RB0, RB1, RB4); + ROUND_INVERSE (25, 1, RA2, RA3, RA0, RA1, RA4, RA4, RA2, RA1, RA0, RA3, + RB2, RB3, RB0, RB1, RB4, RB4, RB2, RB1, RB0, RB3); + ROUND_INVERSE (24, 0, RA4, RA2, RA1, RA0, RA3, RA4, RA3, RA2, RA0, RA1, + RB4, RB2, RB1, RB0, RB3, RB4, RB3, RB2, RB0, RB1); + ROUND_INVERSE (23, 7, RA4, RA3, RA2, RA0, RA1, RA0, RA4, RA3, RA1, RA2, + RB4, RB3, RB2, RB0, RB1, RB0, RB4, RB3, RB1, RB2); + ROUND_INVERSE (22, 6, RA0, RA4, RA3, RA1, RA2, RA4, RA3, RA2, RA1, RA0, + RB0, RB4, RB3, RB1, RB2, RB4, RB3, RB2, RB1, RB0); + ROUND_INVERSE (21, 5, RA4, RA3, RA2, RA1, RA0, RA3, RA0, RA1, RA2, RA4, + RB4, RB3, RB2, RB1, RB0, RB3, RB0, RB1, RB2, RB4); + ROUND_INVERSE (20, 4, RA3, RA0, RA1, RA2, RA4, RA3, RA2, RA1, RA4, RA0, + RB3, RB0, RB1, RB2, RB4, RB3, RB2, RB1, RB4, RB0); + ROUND_INVERSE (19, 3, RA3, RA2, RA1, RA4, RA0, RA1, RA2, RA4, RA3, RA0, + RB3, RB2, RB1, RB4, RB0, RB1, RB2, RB4, RB3, RB0); + ROUND_INVERSE (18, 2, RA1, RA2, RA4, RA3, RA0, RA2, RA0, RA4, RA3, RA1, + RB1, RB2, RB4, RB3, RB0, RB2, RB0, RB4, RB3, RB1); + ROUND_INVERSE (17, 1, RA2, RA0, RA4, RA3, RA1, RA1, RA2, RA3, RA4, RA0, + RB2, RB0, RB4, RB3, RB1, RB1, RB2, RB3, RB4, RB0); + 
ROUND_INVERSE (16, 0, RA1, RA2, RA3, RA4, RA0, RA1, RA0, RA2, RA4, RA3, + RB1, RB2, RB3, RB4, RB0, RB1, RB0, RB2, RB4, RB3); + ROUND_INVERSE (15, 7, RA1, RA0, RA2, RA4, RA3, RA4, RA1, RA0, RA3, RA2, + RB1, RB0, RB2, RB4, RB3, RB4, RB1, RB0, RB3, RB2); + ROUND_INVERSE (14, 6, RA4, RA1, RA0, RA3, RA2, RA1, RA0, RA2, RA3, RA4, + RB4, RB1, RB0, RB3, RB2, RB1, RB0, RB2, RB3, RB4); + ROUND_INVERSE (13, 5, RA1, RA0, RA2, RA3, RA4, RA0, RA4, RA3, RA2, RA1, + RB1, RB0, RB2, RB3, RB4, RB0, RB4, RB3, RB2, RB1); + ROUND_INVERSE (12, 4, RA0, RA4, RA3, RA2, RA1, RA0, RA2, RA3, RA1, RA4, + RB0, RB4, RB3, RB2, RB1, RB0, RB2, RB3, RB1, RB4); + ROUND_INVERSE (11, 3, RA0, RA2, RA3, RA1, RA4, RA3, RA2, RA1, RA0, RA4, + RB0, RB2, RB3, RB1, RB4, RB3, RB2, RB1, RB0, RB4); + ROUND_INVERSE (10, 2, RA3, RA2, RA1, RA0, RA4, RA2, RA4, RA1, RA0, RA3, + RB3, RB2, RB1, RB0, RB4, RB2, RB4, RB1, RB0, RB3); + ROUND_INVERSE (9, 1, RA2, RA4, RA1, RA0, RA3, RA3, RA2, RA0, RA1, RA4, + RB2, RB4, RB1, RB0, RB3, RB3, RB2, RB0, RB1, RB4); + ROUND_INVERSE (8, 0, RA3, RA2, RA0, RA1, RA4, RA3, RA4, RA2, RA1, RA0, + RB3, RB2, RB0, RB1, RB4, RB3, RB4, RB2, RB1, RB0); + ROUND_INVERSE (7, 7, RA3, RA4, RA2, RA1, RA0, RA1, RA3, RA4, RA0, RA2, + RB3, RB4, RB2, RB1, RB0, RB1, RB3, RB4, RB0, RB2); + ROUND_INVERSE (6, 6, RA1, RA3, RA4, RA0, RA2, RA3, RA4, RA2, RA0, RA1, + RB1, RB3, RB4, RB0, RB2, RB3, RB4, RB2, RB0, RB1); + ROUND_INVERSE (5, 5, RA3, RA4, RA2, RA0, RA1, RA4, RA1, RA0, RA2, RA3, + RB3, RB4, RB2, RB0, RB1, RB4, RB1, RB0, RB2, RB3); + ROUND_INVERSE (4, 4, RA4, RA1, RA0, RA2, RA3, RA4, RA2, RA0, RA3, RA1, + RB4, RB1, RB0, RB2, RB3, RB4, RB2, RB0, RB3, RB1); + ROUND_INVERSE (3, 3, RA4, RA2, RA0, RA3, RA1, RA0, RA2, RA3, RA4, RA1, + RB4, RB2, RB0, RB3, RB1, RB0, RB2, RB3, RB4, RB1); + ROUND_INVERSE (2, 2, RA0, RA2, RA3, RA4, RA1, RA2, RA1, RA3, RA4, RA0, + RB0, RB2, RB3, RB4, RB1, RB2, RB1, RB3, RB4, RB0); + ROUND_INVERSE (1, 1, RA2, RA1, RA3, RA4, RA0, RA0, RA2, RA4, RA3, RA1, + RB2, RB1, RB3, RB4, RB0, RB0, 
RB2, RB4, RB3, RB1); + ROUND_INVERSE (0, 0, RA0, RA2, RA4, RA3, RA1, RA0, RA1, RA2, RA3, RA4, + RB0, RB2, RB4, RB3, RB1, RB0, RB1, RB2, RB3, RB4); + + transpose_4x4(RA0, RA1, RA2, RA3); + transpose_4x4(RB0, RB1, RB2, RB3); + + bx lr; +.size __serpent_dec_blk8,.-__serpent_dec_blk8; + +.align 3 +.globl _gcry_serpent_neon_ctr_enc +.type _gcry_serpent_neon_ctr_enc,%function; +_gcry_serpent_neon_ctr_enc: + /* input: + * r0: ctx, CTX + * r1: dst (8 blocks) + * r2: src (8 blocks) + * r3: iv + */ + + vmov.u8 RT1d0, #0xff; /* u64: -1 */ + push {r4,lr}; + vadd.u64 RT2d0, RT1d0, RT1d0; /* u64: -2 */ + vpush {RA4-RB2}; + + /* load IV and byteswap */ + vld1.8 {RA0}, [r3]; + vrev64.u8 RT0, RA0; /* be => le */ + ldr r4, [r3, #8]; + + /* construct IVs */ + vsub.u64 RA2d1, RT0d1, RT2d0; /* +2 */ + vsub.u64 RA1d1, RT0d1, RT1d0; /* +1 */ + cmp r4, #-1; + + vsub.u64 RB0d1, RA2d1, RT2d0; /* +4 */ + vsub.u64 RA3d1, RA2d1, RT1d0; /* +3 */ + ldr r4, [r3, #12]; + + vsub.u64 RB2d1, RB0d1, RT2d0; /* +6 */ + vsub.u64 RB1d1, RB0d1, RT1d0; /* +5 */ + + vsub.u64 RT2d1, RB2d1, RT2d0; /* +8 */ + vsub.u64 RB3d1, RB2d1, RT1d0; /* +7 */ + + vmov RA1d0, RT0d0; + vmov RA2d0, RT0d0; + vmov RA3d0, RT0d0; + vmov RB0d0, RT0d0; + rev r4, r4; + vmov RB1d0, RT0d0; + vmov RB2d0, RT0d0; + vmov RB3d0, RT0d0; + vmov RT2d0, RT0d0; + + /* check need for handling 64-bit overflow and carry */ + beq .Ldo_ctr_carry; + +.Lctr_carry_done: + /* le => be */ + vrev64.u8 RA1, RA1; + vrev64.u8 RA2, RA2; + vrev64.u8 RA3, RA3; + vrev64.u8 RB0, RB0; + vrev64.u8 RT2, RT2; + vrev64.u8 RB1, RB1; + vrev64.u8 RB2, RB2; + vrev64.u8 RB3, RB3; + /* store new IV */ + vst1.8 {RT2}, [r3]; + + bl __serpent_enc_blk8; + + vld1.8 {RT0, RT1}, [r2]!; + vld1.8 {RT2, RT3}, [r2]!; + veor RA4, RA4, RT0; + veor RA1, RA1, RT1; + vld1.8 {RT0, RT1}, [r2]!; + veor RA2, RA2, RT2; + veor RA0, RA0, RT3; + vld1.8 {RT2, RT3}, [r2]!; + veor RB4, RB4, RT0; + veor RT0, RT0; + veor RB1, RB1, RT1; + veor RT1, RT1; + veor RB2, RB2, RT2; + veor RT2, RT2; + veor RB0, 
RB0, RT3; + veor RT3, RT3; + + vst1.8 {RA4}, [r1]!; + vst1.8 {RA1}, [r1]!; + veor RA1, RA1; + vst1.8 {RA2}, [r1]!; + veor RA2, RA2; + vst1.8 {RA0}, [r1]!; + veor RA0, RA0; + vst1.8 {RB4}, [r1]!; + veor RB4, RB4; + vst1.8 {RB1}, [r1]!; + vst1.8 {RB2}, [r1]!; + vst1.8 {RB0}, [r1]!; + + vpop {RA4-RB2}; + + /* clear the used registers */ + veor RA3, RA3; + veor RB3, RB3; + + pop {r4,pc}; + +.Ldo_ctr_carry: + cmp r4, #-8; + blo .Lctr_carry_done; + beq .Lcarry_RT2; + + cmp r4, #-6; + blo .Lcarry_RB3; + beq .Lcarry_RB2; + + cmp r4, #-4; + blo .Lcarry_RB1; + beq .Lcarry_RB0; + + cmp r4, #-2; + blo .Lcarry_RA3; + beq .Lcarry_RA2; + + vsub.u64 RA1d0, RT1d0; +.Lcarry_RA2: + vsub.u64 RA2d0, RT1d0; +.Lcarry_RA3: + vsub.u64 RA3d0, RT1d0; +.Lcarry_RB0: + vsub.u64 RB0d0, RT1d0; +.Lcarry_RB1: + vsub.u64 RB1d0, RT1d0; +.Lcarry_RB2: + vsub.u64 RB2d0, RT1d0; +.Lcarry_RB3: + vsub.u64 RB3d0, RT1d0; +.Lcarry_RT2: + vsub.u64 RT2d0, RT1d0; + + b .Lctr_carry_done; +.size _gcry_serpent_neon_ctr_enc,.-_gcry_serpent_neon_ctr_enc; + +.align 3 +.globl _gcry_serpent_neon_cfb_dec +.type _gcry_serpent_neon_cfb_dec,%function; +_gcry_serpent_neon_cfb_dec: + /* input: + * r0: ctx, CTX + * r1: dst (8 blocks) + * r2: src (8 blocks) + * r3: iv + */ + + push {lr}; + vpush {RA4-RB2}; + + /* Load input */ + vld1.8 {RA0}, [r3]; + vld1.8 {RA1, RA2}, [r2]!; + vld1.8 {RA3}, [r2]!; + vld1.8 {RB0}, [r2]!; + vld1.8 {RB1, RB2}, [r2]!; + vld1.8 {RB3}, [r2]!; + + /* Update IV */ + vld1.8 {RT0}, [r2]!; + vst1.8 {RT0}, [r3]; + mov r3, lr; + sub r2, r2, #(8*16); + + bl __serpent_enc_blk8; + + vld1.8 {RT0, RT1}, [r2]!; + vld1.8 {RT2, RT3}, [r2]!; + veor RA4, RA4, RT0; + veor RA1, RA1, RT1; + vld1.8 {RT0, RT1}, [r2]!; + veor RA2, RA2, RT2; + veor RA0, RA0, RT3; + vld1.8 {RT2, RT3}, [r2]!; + veor RB4, RB4, RT0; + veor RT0, RT0; + veor RB1, RB1, RT1; + veor RT1, RT1; + veor RB2, RB2, RT2; + veor RT2, RT2; + veor RB0, RB0, RT3; + veor RT3, RT3; + + vst1.8 {RA4}, [r1]!; + vst1.8 {RA1}, [r1]!; + veor RA1, RA1; + vst1.8 {RA2}, 
[r1]!; + veor RA2, RA2; + vst1.8 {RA0}, [r1]!; + veor RA0, RA0; + vst1.8 {RB4}, [r1]!; + veor RB4, RB4; + vst1.8 {RB1}, [r1]!; + vst1.8 {RB2}, [r1]!; + vst1.8 {RB0}, [r1]!; + + vpop {RA4-RB2}; + + /* clear the used registers */ + veor RA3, RA3; + veor RB3, RB3; + + pop {pc}; +.size _gcry_serpent_neon_cfb_dec,.-_gcry_serpent_neon_cfb_dec; + +.align 3 +.globl _gcry_serpent_neon_cbc_dec +.type _gcry_serpent_neon_cbc_dec,%function; +_gcry_serpent_neon_cbc_dec: + /* input: + * r0: ctx, CTX + * r1: dst (8 blocks) + * r2: src (8 blocks) + * r3: iv + */ + + push {lr}; + vpush {RA4-RB2}; + + vld1.8 {RA0, RA1}, [r2]!; + vld1.8 {RA2, RA3}, [r2]!; + vld1.8 {RB0, RB1}, [r2]!; + vld1.8 {RB2, RB3}, [r2]!; + sub r2, r2, #(8*16); + + bl __serpent_dec_blk8; + + vld1.8 {RB4}, [r3]; + vld1.8 {RT0, RT1}, [r2]!; + vld1.8 {RT2, RT3}, [r2]!; + veor RA0, RA0, RB4; + veor RA1, RA1, RT0; + veor RA2, RA2, RT1; + vld1.8 {RT0, RT1}, [r2]!; + veor RA3, RA3, RT2; + veor RB0, RB0, RT3; + vld1.8 {RT2, RT3}, [r2]!; + veor RB1, RB1, RT0; + veor RT0, RT0; + veor RB2, RB2, RT1; + veor RT1, RT1; + veor RB3, RB3, RT2; + veor RT2, RT2; + vst1.8 {RT3}, [r3]; /* store new IV */ + veor RT3, RT3; + + vst1.8 {RA0, RA1}, [r1]!; + veor RA0, RA0; + veor RA1, RA1; + vst1.8 {RA2, RA3}, [r1]!; + veor RA2, RA2; + vst1.8 {RB0, RB1}, [r1]!; + veor RA3, RA3; + vst1.8 {RB2, RB3}, [r1]!; + veor RB3, RB3; + + vpop {RA4-RB2}; + + /* clear the used registers */ + veor RB4, RB4; + + pop {pc}; +.size _gcry_serpent_neon_cbc_dec,.-_gcry_serpent_neon_cbc_dec; + +#endif diff --git a/cipher/serpent.c b/cipher/serpent.c index a8ee15f..cfda742 100644 --- a/cipher/serpent.c +++ b/cipher/serpent.c @@ -46,6 +46,15 @@ # endif #endif +/* USE_NEON indicates whether to enable ARM NEON assembly code. 
*/ +#undef USE_NEON +#if defined(HAVE_ARM_ARCH_V6) && defined(__ARMEL__) +# if defined(HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS) && \ + defined(HAVE_GCC_INLINE_ASM_NEON) +# define USE_NEON 1 +# endif +#endif + /* Number of rounds per Serpent encrypt/decrypt operation. */ #define ROUNDS 32 @@ -71,6 +80,9 @@ typedef struct serpent_context #ifdef USE_AVX2 int use_avx2; #endif +#ifdef USE_NEON + int use_neon; +#endif } serpent_context_t; @@ -114,6 +126,26 @@ extern void _gcry_serpent_avx2_cfb_dec(serpent_context_t *ctx, unsigned char *iv); #endif +#ifdef USE_NEON +/* Assembler implementations of Serpent using ARM NEON. Process 8 block in + parallel. + */ +extern void _gcry_serpent_neon_ctr_enc(serpent_context_t *ctx, + unsigned char *out, + const unsigned char *in, + unsigned char *ctr); + +extern void _gcry_serpent_neon_cbc_dec(serpent_context_t *ctx, + unsigned char *out, + const unsigned char *in, + unsigned char *iv); + +extern void _gcry_serpent_neon_cfb_dec(serpent_context_t *ctx, + unsigned char *out, + const unsigned char *in, + unsigned char *iv); +#endif + /* A prototype. */ static const char *serpent_test (void); @@ -634,6 +666,14 @@ serpent_setkey_internal (serpent_context_t *context, } #endif +#ifdef USE_NEON + context->use_neon = 0; + if ((_gcry_get_hw_features () & HWF_ARM_NEON)) + { + context->use_neon = 1; + } +#endif + _gcry_burn_stack (272 * sizeof (u32)); } @@ -861,6 +901,34 @@ _gcry_serpent_ctr_enc(void *context, unsigned char *ctr, } #endif +#ifdef USE_NEON + if (ctx->use_neon) + { + int did_use_neon = 0; + + /* Process data in 8 block chunks. */ + while (nblocks >= 8) + { + _gcry_serpent_neon_ctr_enc(ctx, outbuf, inbuf, ctr); + + nblocks -= 8; + outbuf += 8 * sizeof(serpent_block_t); + inbuf += 8 * sizeof(serpent_block_t); + did_use_neon = 1; + } + + if (did_use_neon) + { + /* serpent-neon assembly code does not use stack */ + if (nblocks == 0) + burn_stack_depth = 0; + } + + /* Use generic code to handle smaller chunks... 
*/ + /* TODO: use caching instead? */ + } +#endif + for ( ;nblocks; nblocks-- ) { /* Encrypt the counter. */ @@ -948,6 +1016,33 @@ _gcry_serpent_cbc_dec(void *context, unsigned char *iv, } #endif +#ifdef USE_NEON + if (ctx->use_neon) + { + int did_use_neon = 0; + + /* Process data in 8 block chunks. */ + while (nblocks >= 8) + { + _gcry_serpent_neon_cbc_dec(ctx, outbuf, inbuf, iv); + + nblocks -= 8; + outbuf += 8 * sizeof(serpent_block_t); + inbuf += 8 * sizeof(serpent_block_t); + did_use_neon = 1; + } + + if (did_use_neon) + { + /* serpent-neon assembly code does not use stack */ + if (nblocks == 0) + burn_stack_depth = 0; + } + + /* Use generic code to handle smaller chunks... */ + } +#endif + for ( ;nblocks; nblocks-- ) { /* INBUF is needed later and it may be identical to OUTBUF, so store @@ -1028,6 +1123,33 @@ _gcry_serpent_cfb_dec(void *context, unsigned char *iv, } #endif +#ifdef USE_NEON + if (ctx->use_neon) + { + int did_use_neon = 0; + + /* Process data in 8 block chunks. */ + while (nblocks >= 8) + { + _gcry_serpent_neon_cfb_dec(ctx, outbuf, inbuf, iv); + + nblocks -= 8; + outbuf += 8 * sizeof(serpent_block_t); + inbuf += 8 * sizeof(serpent_block_t); + did_use_neon = 1; + } + + if (did_use_neon) + { + /* serpent-neon assembly code does not use stack */ + if (nblocks == 0) + burn_stack_depth = 0; + } + + /* Use generic code to handle smaller chunks... 
*/ + } +#endif + for ( ;nblocks; nblocks-- ) { serpent_encrypt_internal(ctx, iv, iv); diff --git a/configure.ac b/configure.ac index 19c97bd..e3471d0 100644 --- a/configure.ac +++ b/configure.ac @@ -1502,6 +1502,11 @@ if test "$found" = "1" ; then # Build with the AVX2 implementation GCRYPT_CIPHERS="$GCRYPT_CIPHERS serpent-avx2-amd64.lo" fi + + if test x"$neonsupport" = xyes ; then + # Build with the NEON implementation + GCRYPT_CIPHERS="$GCRYPT_CIPHERS serpent-armv7-neon.lo" + fi fi LIST_MEMBER(rfc2268, $enabled_ciphers) ----------------------------------------------------------------------- Summary of changes: cipher/salsa20-armv7-neon.S | 2 +- cipher/serpent-armv7-neon.S | 869 +++++++++++++++++++++++++++++++++++++++++++ cipher/serpent.c | 122 ++++++ configure.ac | 5 + doc/gcrypt.texi | 86 ++--- 5 files changed, 1040 insertions(+), 44 deletions(-) create mode 100644 cipher/serpent-armv7-neon.S hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From cvs at cvs.gnupg.org Tue Oct 29 15:02:44 2013 From: cvs at cvs.gnupg.org (by Werner Koch) Date: Tue, 29 Oct 2013 15:02:44 +0100 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-348-gba6bffa Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via ba6bffafd17bea11985afc500022d66da261d59a (commit) via c284f15db99e9cb135612de710199abb23baafd3 (commit) via 39ccf743a11ca349ecf00d086fef53135cc4fbe0 (commit) via ba892a0a874c8b2a83dbf0940608cd7e2911ce01 (commit) from 1faa61845f180bd47e037e400dde2d864ee83c89 (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. 
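The serpent.c hunks above repeat one dispatch pattern three times (CTR, CBC and CFB): hand 8-block chunks to the NEON code, then let the generic per-block loop finish the tail. A minimal sketch of that pattern follows; the stub workers and `run_demo` are hypothetical stand-ins (the real workers are `_gcry_serpent_neon_ctr_enc` and `serpent_encrypt_internal`), and here they only copy data and count calls.

```c
#include <stddef.h>
#include <string.h>

#define BLOCKSIZE 16  /* sizeof (serpent_block_t) */

/* Call counters for the demo stubs below.  */
static size_t neon_calls, generic_calls;

/* Hypothetical stand-in for the 8-way NEON worker.  */
static void
stub_neon8 (unsigned char *out, const unsigned char *in)
{
  memcpy (out, in, 8 * BLOCKSIZE);
  neon_calls++;
}

/* Hypothetical stand-in for the generic single-block path.  */
static void
stub_generic1 (unsigned char *out, const unsigned char *in)
{
  memcpy (out, in, BLOCKSIZE);
  generic_calls++;
}

/* The dispatch pattern used in the serpent.c hunks: 8-block chunks go
   to the NEON worker, the remainder to the per-block loop.  */
static void
process_blocks (unsigned char *out, const unsigned char *in, size_t nblocks)
{
  while (nblocks >= 8)
    {
      stub_neon8 (out, in);
      out += 8 * BLOCKSIZE;
      in  += 8 * BLOCKSIZE;
      nblocks -= 8;
    }
  for ( ; nblocks; nblocks--)
    {
      stub_generic1 (out, in);
      out += BLOCKSIZE;
      in  += BLOCKSIZE;
    }
}

/* Run the dispatcher on NBLOCKS blocks and report the call pattern as
   neon_calls * 100 + generic_calls.  */
static size_t
run_demo (size_t nblocks)
{
  static unsigned char inbuf[32 * BLOCKSIZE], outbuf[32 * BLOCKSIZE];
  neon_calls = generic_calls = 0;
  process_blocks (outbuf, inbuf, nblocks);
  return neon_calls * 100 + generic_calls;
}
```

For example, 19 blocks result in two NEON calls (16 blocks) and three generic calls, which is why the code only zeroes burn_stack_depth when nblocks reaches 0 inside the NEON branch.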
- Log ----------------------------------------------------------------- commit ba6bffafd17bea11985afc500022d66da261d59a Author: Werner Koch Date: Tue Oct 29 15:01:01 2013 +0100 tests: Add feature to skip benchmarks. * tests/benchmark.c (main): Add feature to skip the test. * tests/bench-slope.c (main): Ditto. (get_slope): Replace C++ style comment. (double_cmp, cipher_bench, _hash_bench): Replace system reserved symbols. -- During development a quick run of the regression tests is often useful; however, the benchmarks take a lot of time, and thus this feature allows these tests to be skipped. Signed-off-by: Werner Koch diff --git a/tests/bench-slope.c b/tests/bench-slope.c index 62543bc..5687bf1 100644 --- a/tests/bench-slope.c +++ b/tests/bench-slope.c @@ -239,7 +239,7 @@ get_slope (double (*const get_x) (unsigned int idx, void *priv), sumx += x; sumy += y; sumx2 += x * x; - //sumy2 += y * y; + /*sumy2 += y * y;*/ sumxy += x * y; } @@ -275,12 +275,12 @@ get_num_measurements (struct bench_obj *obj) static int -double_cmp (const void *__a, const void *__b) +double_cmp (const void *_a, const void *_b) { const double *a, *b; - a = __a; - b = __b; + a = _a; + b = _b; if (*a > *b) return 1; @@ -847,7 +847,7 @@ cipher_bench_one (int algo, struct bench_cipher_mode *pmode) static void -__cipher_bench (int algo) +_cipher_bench (int algo) { const char *algoname; int i; @@ -876,14 +876,14 @@ cipher_bench (char **argv, int argc) { algo = gcry_cipher_map_name (argv[i]); if (algo) - __cipher_bench (algo); + _cipher_bench (algo); } } else { for (i = 1; i < 400; i++) if (!gcry_cipher_test_algo (i)) - __cipher_bench (i); + _cipher_bench (i); } } @@ -978,7 +978,7 @@ hash_bench_one (int algo, struct bench_hash_mode *pmode) } static void -__hash_bench (int algo) +_hash_bench (int algo) { int i; @@ -1001,14 +1001,14 @@ hash_bench (char **argv, int argc) { algo = gcry_md_map_name (argv[i]); if (algo) - __hash_bench (algo); + _hash_bench (algo); } } else { for (i = 1; i < 400; i++) if
(!gcry_md_test_algo (i)) - __hash_bench (i); + _hash_bench (i); } bench_print_footer (); @@ -1064,6 +1064,11 @@ main (int argc, char **argv) argv++; } + /* We skip this test if we are running under the test suite (no args + and srcdir defined) and GCRYPT_NO_BENCHMARKS is set. */ + if (!argc && getenv ("srcdir") && getenv ("GCRYPT_NO_BENCHMARKS")) + exit (77); + while (argc && last_argc != argc) { last_argc = argc; diff --git a/tests/benchmark.c b/tests/benchmark.c index d3ef1a2..8326ab2 100644 --- a/tests/benchmark.c +++ b/tests/benchmark.c @@ -1169,6 +1169,11 @@ main( int argc, char **argv ) if (argc) { argc--; argv++; } + /* We skip this test if we are running under the test suite (no args + and srcdir defined) and GCRYPT_NO_BENCHMARKS is set. */ + if (!argc && getenv ("srcdir") && getenv ("GCRYPT_NO_BENCHMARKS")) + exit (77); + while (argc && last_argc != argc ) { last_argc = argc; commit c284f15db99e9cb135612de710199abb23baafd3 Author: Werner Koch Date: Tue Oct 29 14:36:58 2013 +0100 ecc: Finish Ed25519/ECDSA hack. * cipher/ecc.c (ecc_generate): Fix Ed25519/ECDSA case. (ecc_verify): Implement ED25519/ECDSA uncompression. -- With this change Ed25519 may be used with ECDSA while using the Ed25519 standard compression technique. 
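For context, the "standard compression technique" referred to above stores an Ed25519 point as the 255-bit little-endian y coordinate with the parity ("sign") of x in the most significant bit of the final byte; recovering x itself needs the field arithmetic that `_gcry_ecc_eddsa_decodepoint` performs and which is omitted here. A byte-level sketch of just the encoding layout, with hypothetical function names:

```c
#include <stdint.h>
#include <string.h>

/* Pack a point: copy the little-endian Y, clamp it to 255 bits, and
   store the parity of X in the top bit of the last byte.  */
static void
eddsa_encode (uint8_t enc[32], const uint8_t y_le[32], int x_is_odd)
{
  memcpy (enc, y_le, 32);
  enc[31] &= 0x7f;              /* y uses at most 255 bits */
  if (x_is_odd)
    enc[31] |= 0x80;            /* sign of x goes into the top bit */
}

/* Unpack: strip the sign bit to obtain Y and return the parity of X.
   Recovering the full X coordinate from Y is left to the curve
   arithmetic and is not shown here.  */
static int
eddsa_decode (uint8_t y_le[32], const uint8_t enc[32])
{
  int x_is_odd = (enc[31] & 0x80) != 0;
  memcpy (y_le, enc, 32);
  y_le[31] &= 0x7f;
  return x_is_odd;
}
```

This is why the compressed `q` in the new test vector is a single 32-byte value, while the uncompressed form starts with the 0x04 octet-string prefix.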
Signed-off-by: Werner Koch diff --git a/cipher/ecc.c b/cipher/ecc.c index 5a52829..752dfc1 100644 --- a/cipher/ecc.c +++ b/cipher/ecc.c @@ -498,7 +498,7 @@ ecc_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey) if (_gcry_mpi_ec_get_affine (x, y, &sk.E.G, ctx)) log_fatal ("ecgen: Failed to get affine coordinates for %s\n", "G"); base = _gcry_ecc_ec2os (x, y, sk.E.p); - if (sk.E.dialect == ECC_DIALECT_ED25519 && !ed25519_with_ecdsa) + if (sk.E.dialect == ECC_DIALECT_ED25519) { unsigned char *encpk; unsigned int encpklen; @@ -978,7 +978,22 @@ ecc_verify (gcry_sexp_t s_sig, gcry_sexp_t s_data, gcry_sexp_t s_keyparms) else { point_init (&pk.Q); - rc = _gcry_ecc_os2ec (&pk.Q, mpi_q); + if (pk.E.dialect == ECC_DIALECT_ED25519) + { + mpi_ec_t ec; + + /* Fixme: Factor the curve context setup out of eddsa_verify + and ecdsa_verify. So that we don't do it twice. */ + ec = _gcry_mpi_ec_p_internal_new (pk.E.model, pk.E.dialect, + pk.E.p, pk.E.a, pk.E.b); + + rc = _gcry_ecc_eddsa_decodepoint (mpi_q, ec, &pk.Q, NULL, NULL); + _gcry_mpi_ec_free (ec); + } + else + { + rc = _gcry_ecc_os2ec (&pk.Q, mpi_q); + } if (rc) goto leave; diff --git a/tests/pubkey.c b/tests/pubkey.c index 4dadf88..e41050c 100644 --- a/tests/pubkey.c +++ b/tests/pubkey.c @@ -1050,6 +1050,12 @@ check_ed25519ecdsa_sample_key (void) " (q #044C056555BE4084BB3D8D8895FDF7C2893DFE0256251923053010977D12658321" " 156D1ADDC07987713A418783658B476358D48D582DB53233D9DED3C1C2577B04#)" "))"; + static const char ecc_public_key_comp[] = + "(public-key\n" + " (ecc\n" + " (curve \"Ed25519\")\n" + " (q #047b57c2c1d3ded93332b52d588dd45863478b658387413a718779c0dd1a6d95#)" + "))"; static const char hash_string[] = "(data (flags ecdsa rfc6979)\n" " (hash sha256 #00112233445566778899AABBCCDDEEFF" @@ -1061,38 +1067,49 @@ check_ed25519ecdsa_sample_key (void) if (verbose) fprintf (stderr, "Checking sample Ed25519/ECDSA key.\n"); + /* Sign. 
*/ if ((err = gcry_sexp_new (&hash, hash_string, 0, 1))) die ("line %d: %s", __LINE__, gpg_strerror (err)); - if ((err = gcry_sexp_new (&key, ecc_private_key, 0, 1))) die ("line %d: %s", __LINE__, gpg_strerror (err)); - if ((err = gcry_pk_sign (&sig, hash, key))) die ("gcry_pk_sign failed: %s", gpg_strerror (err)); + /* Verify. */ gcry_sexp_release (key); if ((err = gcry_sexp_new (&key, ecc_public_key, 0, 1))) die ("line %d: %s", __LINE__, gpg_strerror (err)); - if ((err = gcry_pk_verify (sig, hash, key))) die ("gcry_pk_verify failed: %s", gpg_strerror (err)); - /* Now try signing without the Q parameter. */ + /* Verify again using a compressed public key. */ + gcry_sexp_release (key); + if ((err = gcry_sexp_new (&key, ecc_public_key_comp, 0, 1))) + die ("line %d: %s", __LINE__, gpg_strerror (err)); + if ((err = gcry_pk_verify (sig, hash, key))) + die ("gcry_pk_verify failed (comp): %s", gpg_strerror (err)); + /* Sign without a Q parameter. */ gcry_sexp_release (key); if ((err = gcry_sexp_new (&key, ecc_private_key_wo_q, 0, 1))) die ("line %d: %s", __LINE__, gpg_strerror (err)); - gcry_sexp_release (sig); if ((err = gcry_pk_sign (&sig, hash, key))) - die ("gcry_pk_sign without Q failed: %s", gpg_strerror (err)); + die ("gcry_pk_sign w/o Q failed: %s", gpg_strerror (err)); + /* Verify. */ gcry_sexp_release (key); if ((err = gcry_sexp_new (&key, ecc_public_key, 0, 1))) die ("line %d: %s", __LINE__, gpg_strerror (err)); + if ((err = gcry_pk_verify (sig, hash, key))) + die ("gcry_pk_verify signed w/o Q failed: %s", gpg_strerror (err)); + /* Verify again using a compressed public key. 
 */ + gcry_sexp_release (key); + if ((err = gcry_sexp_new (&key, ecc_public_key_comp, 0, 1))) + die ("line %d: %s", __LINE__, gpg_strerror (err)); if ((err = gcry_pk_verify (sig, hash, key))) - die ("gcry_pk_verify signed without Q failed: %s", gpg_strerror (err)); + die ("gcry_pk_verify signed w/o Q failed (comp): %s", gpg_strerror (err)); extract_cmp_data (sig, "r", ("a63123a783ef29b8276e08987daca4" "655d0179e22199bf63691fd88eb64e15")); commit 39ccf743a11ca349ecf00d086fef53135cc4fbe0 Author: Werner Koch Date: Tue Oct 29 11:27:58 2013 +0100 Typo fix. -- diff --git a/mpi/mpicoder.c b/mpi/mpicoder.c index b598521..7c4f5ca 100644 --- a/mpi/mpicoder.c +++ b/mpi/mpicoder.c @@ -179,7 +179,7 @@ mpi_fromstr (gcry_mpi_t val, const char *str) /* Return an allocated buffer with the MPI (msb first). NBYTES receives the length of this buffer. If FILL_LE is not 0, the returned value is stored as little endian and right padded with - zeroes so that the returned buffer has at least LILL_LE bytes. + zeroes so that the returned buffer has at least FILL_LE bytes. Caller must free the return string. This function returns an allocated buffer with NBYTES set to zero if the value of A is zero. commit ba892a0a874c8b2a83dbf0940608cd7e2911ce01 Author: Werner Koch Date: Fri Oct 25 15:44:03 2013 +0200 ecc: Add flags "noparam" and "comp". * src/cipher.h (PUBKEY_FLAG_NOPARAM, PUBKEY_FLAG_COMP): New. * cipher/pubkey-util.c (_gcry_pk_util_parse_flaglist): Parse new flags and change code for possibly faster parsing. * cipher/ecc.c (ecc_generate): Implement the "noparam" flag. (ecc_sign): Ditto. (ecc_verify): Ditto. * tests/keygen.c (check_ecc_keys): Use the "noparam" flag. * cipher/ecc.c (ecc_generate): Fix parsing of the deprecated transient-flag parameter. (ecc_verify): Do not make Q optional in the extract-param call. -- Note that the "comp" flag does not yet have any effect.
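The reworked flag parser in pubkey-util.c now checks the token length before calling memcmp, so most non-matching tokens are rejected on an integer comparison alone. A reduced sketch of that dispatch style, with hypothetical enum names standing in for the real PUBKEY_FLAG_* constants:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the PUBKEY_FLAG_* constants.  */
enum
  {
    FLAG_NONE    = 0,
    FLAG_NOPARAM = 1,
    FLAG_COMP    = 2,
    FLAG_ECDSA   = 4
  };

/* Map one flag token of length N to its constant.  Comparing N first
   means memcmp only runs for tokens of a plausible length, which is
   the faster-parsing change the commit message describes.  */
static int
parse_one_flag (const char *s, size_t n)
{
  if (n == 7 && !memcmp (s, "noparam", 7))
    return FLAG_NOPARAM;
  else if (n == 5 && !memcmp (s, "ecdsa", 5))
    return FLAG_ECDSA;
  else if (n == 4 && !memcmp (s, "comp", 4))
    return FLAG_COMP;
  return FLAG_NONE;
}
```

The real parser covers more tokens ("rfc6979", "eddsa", "gost", "raw", ...), but each follows the same length-then-memcmp shape.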
Signed-off-by: Werner Koch diff --git a/cipher/ecc.c b/cipher/ecc.c index dca0423..5a52829 100644 --- a/cipher/ecc.c +++ b/cipher/ecc.c @@ -423,14 +423,6 @@ ecc_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey) return GPG_ERR_INV_OBJ; /* No curve name or value too large. */ } - /* Parse the optional transient-key flag. */ - l1 = gcry_sexp_find_token (genparms, "transient-key", 0); - if (l1) - { - flags |= PUBKEY_FLAG_TRANSIENT_KEY; - gcry_sexp_release (l1); - } - /* Parse the optional flags list. */ l1 = gcry_sexp_find_token (genparms, "flags", 0); if (l1) @@ -441,6 +433,14 @@ ecc_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey) goto leave; } + /* Parse the deprecated optional transient-key flag. */ + l1 = gcry_sexp_find_token (genparms, "transient-key", 0); + if (l1) + { + flags |= PUBKEY_FLAG_TRANSIENT_KEY; + gcry_sexp_release (l1); + } + /* NBITS is required if no curve name has been given. */ if (!nbits && !curve_name) return GPG_ERR_NO_OBJ; /* No NBITS parameter. */ @@ -524,24 +524,43 @@ ecc_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey) goto leave; } - if (ed25519_with_ecdsa) + if ((flags & PUBKEY_FLAG_NOPARAM) || ed25519_with_ecdsa) { - rc = gcry_sexp_build (&curve_flags, NULL, "(flags ecdsa)"); + rc = gcry_sexp_build + (&curve_flags, NULL, + ((flags & PUBKEY_FLAG_NOPARAM) && ed25519_with_ecdsa)? + "(flags noparam ecdsa)" : + ((flags & PUBKEY_FLAG_NOPARAM))? 
+ "(flags noparam)" : + "(flags ecdsa)"); if (rc) goto leave; } - rc = gcry_sexp_build (r_skey, NULL, - "(key-data" - " (public-key" - " (ecc%S%S(p%m)(a%m)(b%m)(g%m)(n%m)(q%m)))" - " (private-key" - " (ecc%S%S(p%m)(a%m)(b%m)(g%m)(n%m)(q%m)(d%m)))" - " )", - curve_info, curve_flags, - sk.E.p, sk.E.a, sk.E.b, base, sk.E.n, public, - curve_info, curve_flags, - sk.E.p, sk.E.a, sk.E.b, base, sk.E.n, public, secret); + if ((flags & PUBKEY_FLAG_NOPARAM) && E.name) + rc = gcry_sexp_build (r_skey, NULL, + "(key-data" + " (public-key" + " (ecc%S%S(q%m)))" + " (private-key" + " (ecc%S%S(q%m)(d%m)))" + " )", + curve_info, curve_flags, + public, + curve_info, curve_flags, + public, secret); + else + rc = gcry_sexp_build (r_skey, NULL, + "(key-data" + " (public-key" + " (ecc%S%S(p%m)(a%m)(b%m)(g%m)(n%m)(q%m)))" + " (private-key" + " (ecc%S%S(p%m)(a%m)(b%m)(g%m)(n%m)(q%m)(d%m)))" + " )", + curve_info, curve_flags, + sk.E.p, sk.E.a, sk.E.b, base, sk.E.n, public, + curve_info, curve_flags, + sk.E.p, sk.E.a, sk.E.b, base, sk.E.n, public, secret); if (rc) goto leave; @@ -709,9 +728,13 @@ ecc_sign (gcry_sexp_t *r_sig, gcry_sexp_t s_data, gcry_sexp_t keyparms) /* * Extract the key. */ - rc = _gcry_sexp_extract_param (keyparms, NULL, "-p?a?b?g?n?/q?+d", - &sk.E.p, &sk.E.a, &sk.E.b, &mpi_g, &sk.E.n, - &mpi_q, &sk.d, NULL); + if ((ctx.flags & PUBKEY_FLAG_NOPARAM)) + rc = _gcry_sexp_extract_param (keyparms, NULL, "/q?+d", + &mpi_q, &sk.d, NULL); + else + rc = _gcry_sexp_extract_param (keyparms, NULL, "-p?a?b?g?n?/q?+d", + &sk.E.p, &sk.E.a, &sk.E.b, &mpi_g, &sk.E.n, + &mpi_q, &sk.d, NULL); if (rc) goto leave; if (mpi_g) @@ -871,9 +894,13 @@ ecc_verify (gcry_sexp_t s_sig, gcry_sexp_t s_data, gcry_sexp_t s_keyparms) /* * Extract the key. 
*/ - rc = _gcry_sexp_extract_param (s_keyparms, NULL, "-p?a?b?g?n?/q?", - &pk.E.p, &pk.E.a, &pk.E.b, &mpi_g, &pk.E.n, - &mpi_q, NULL); + if ((ctx.flags & PUBKEY_FLAG_NOPARAM)) + rc = _gcry_sexp_extract_param (s_keyparms, NULL, "/q", + &mpi_q, NULL); + else + rc = _gcry_sexp_extract_param (s_keyparms, NULL, "-p?a?b?g?n?/q", + &pk.E.p, &pk.E.a, &pk.E.b, &mpi_g, &pk.E.n, + &mpi_q, NULL); if (rc) goto leave; if (mpi_g) diff --git a/cipher/pubkey-util.c b/cipher/pubkey-util.c index 0db5840..88d6bb6 100644 --- a/cipher/pubkey-util.c +++ b/cipher/pubkey-util.c @@ -47,7 +47,7 @@ pss_verify_cmp (void *opaque, gcry_mpi_t tmp) /* Parser for a flag list. On return the encoding is stored at - R_ENCODING and the flags are stored at R_FLAGS. if any of them is + R_ENCODING and the flags are stored at R_FLAGS. If any of them is not needed, NULL may be passed. The function returns 0 on success or an error code. */ gpg_err_code_t @@ -65,61 +65,99 @@ _gcry_pk_util_parse_flaglist (gcry_sexp_t list, { s = gcry_sexp_nth_data (list, i, &n); if (!s) - ; /* not a data element*/ - else if (n == 7 && !memcmp (s, "rfc6979", 7)) - { - flags |= PUBKEY_FLAG_RFC6979; - } - else if (n == 5 && !memcmp (s, "eddsa", 5)) - { - encoding = PUBKEY_ENC_RAW; - flags |= PUBKEY_FLAG_EDDSA; - } - else if (n == 5 && !memcmp (s, "ecdsa", 5)) - { - flags |= PUBKEY_FLAG_ECDSA; - } - else if (n == 4 && !memcmp (s, "gost", 4)) - { - encoding = PUBKEY_ENC_RAW; - flags |= PUBKEY_FLAG_GOST; - } - else if (n == 3 && !memcmp (s, "raw", 3) - && encoding == PUBKEY_ENC_UNKNOWN) - { - encoding = PUBKEY_ENC_RAW; - flags |= PUBKEY_FLAG_RAW_FLAG; /* Explicitly given. 
*/ - } - else if (n == 5 && !memcmp (s, "pkcs1", 5) - && encoding == PUBKEY_ENC_UNKNOWN) - { - encoding = PUBKEY_ENC_PKCS1; - flags |= PUBKEY_FLAG_FIXEDLEN; - } - else if (n == 4 && !memcmp (s, "oaep", 4) - && encoding == PUBKEY_ENC_UNKNOWN) - { - encoding = PUBKEY_ENC_OAEP; - flags |= PUBKEY_FLAG_FIXEDLEN; - } - else if (n == 3 && !memcmp (s, "pss", 3) - && encoding == PUBKEY_ENC_UNKNOWN) + continue; /* Not a data element. */ + + switch (n) { - encoding = PUBKEY_ENC_PSS; - flags |= PUBKEY_FLAG_FIXEDLEN; + case 3: + if (!memcmp (s, "pss", 3) && encoding == PUBKEY_ENC_UNKNOWN) + { + encoding = PUBKEY_ENC_PSS; + flags |= PUBKEY_FLAG_FIXEDLEN; + } + else if (!memcmp (s, "raw", 3) && encoding == PUBKEY_ENC_UNKNOWN) + { + encoding = PUBKEY_ENC_RAW; + flags |= PUBKEY_FLAG_RAW_FLAG; /* Explicitly given. */ + } + else + rc = GPG_ERR_INV_FLAG; + break; + + case 4: + if (!memcmp (s, "comp", 4)) + flags |= PUBKEY_FLAG_COMP; + else if (!memcmp (s, "oaep", 4) && encoding == PUBKEY_ENC_UNKNOWN) + { + encoding = PUBKEY_ENC_OAEP; + flags |= PUBKEY_FLAG_FIXEDLEN; + } + else if (!memcmp (s, "gost", 4)) + { + encoding = PUBKEY_ENC_RAW; + flags |= PUBKEY_FLAG_GOST; + } + else + rc = GPG_ERR_INV_FLAG; + break; + + case 5: + if (!memcmp (s, "eddsa", 5)) + { + encoding = PUBKEY_ENC_RAW; + flags |= PUBKEY_FLAG_EDDSA; + } + else if (!memcmp (s, "ecdsa", 5)) + { + flags |= PUBKEY_FLAG_ECDSA; + } + else if (!memcmp (s, "pkcs1", 5) && encoding == PUBKEY_ENC_UNKNOWN) + { + encoding = PUBKEY_ENC_PKCS1; + flags |= PUBKEY_FLAG_FIXEDLEN; + } + else + rc = GPG_ERR_INV_FLAG; + break; + + case 7: + if (!memcmp (s, "rfc6979", 7)) + flags |= PUBKEY_FLAG_RFC6979; + else if (!memcmp (s, "noparam", 7)) + flags |= PUBKEY_FLAG_NOPARAM; + else + rc = GPG_ERR_INV_FLAG; + break; + + case 8: + if (!memcmp (s, "use-x931", 8)) + flags |= PUBKEY_FLAG_USE_X931; + else + rc = GPG_ERR_INV_FLAG; + break; + + case 11: + if (!memcmp (s, "no-blinding", 11)) + flags |= PUBKEY_FLAG_NO_BLINDING; + else if (!memcmp (s, 
"use-fips186", 11)) + flags |= PUBKEY_FLAG_USE_FIPS186; + else + rc = GPG_ERR_INV_FLAG; + break; + + case 13: + if (!memcmp (s, "use-fips186-2", 13)) + flags |= PUBKEY_FLAG_USE_FIPS186_2; + else if (!memcmp (s, "transient-key", 13)) + flags |= PUBKEY_FLAG_TRANSIENT_KEY; + else + rc = GPG_ERR_INV_FLAG; + break; + + default: + rc = GPG_ERR_INV_FLAG; + break; } - else if (n == 11 && ! memcmp (s, "no-blinding", 11)) - flags |= PUBKEY_FLAG_NO_BLINDING; - else if (n == 13 && ! memcmp (s, "transient-key", 13)) - flags |= PUBKEY_FLAG_TRANSIENT_KEY; - else if (n == 8 && ! memcmp (s, "use-x931", 8)) - flags |= PUBKEY_FLAG_USE_X931; - else if (n == 11 && ! memcmp (s, "use-fips186", 11)) - flags |= PUBKEY_FLAG_USE_FIPS186; - else if (n == 13 && ! memcmp (s, "use-fips186-2", 13)) - flags |= PUBKEY_FLAG_USE_FIPS186_2; - else - rc = GPG_ERR_INV_FLAG; } if (r_flags) diff --git a/doc/gcrypt.texi b/doc/gcrypt.texi index 6dcb4b1..4a202dd 100644 --- a/doc/gcrypt.texi +++ b/doc/gcrypt.texi @@ -2230,6 +2230,14 @@ named `flags'. Flag names are case-sensitive. The following flags are known: @table @code + + at item comp + at cindex comp +If supported and not yet the default return ECC points in compact +(compressed) representation. The compact representation requires a +small overhead before a point can be used but halves the size of a to +be conveyed public key. + @item pkcs1 @cindex PKCS1 Use PKCS#1 block type 2 padding for encryption, block type 1 padding @@ -2264,6 +2272,16 @@ order to prevent leaking of secret information. Blinding is only implemented by RSA, but it might be implemented by other algorithms in the future as well, when necessary. + at item noparam + at cindex noparam +For ECC key generation do not return the domain parameters but only +the name of the curve. For ECC signing and verification ignore any +provided domain parameters of the public or private key and use only +the curve name. 
It is more secure to rely on the curve name and thus +use the curve parameters as known by Libgcrypt. This option should +have been the default but for backward compatibility reasons this is +not possible. It is best to always use this flag with ECC keys. + @item transient-key @cindex transient-key This flag is only meaningful for RSA, DSA, and ECC key generation. If @@ -2836,7 +2854,7 @@ is in general not recommended. @example (genkey (ecc - (flags transient-key ecdsa))) + (flags noparam transient-key ecdsa))) @end example @item transient-key @@ -2856,7 +2874,8 @@ private and public keys are returned in one container and may be accompanied by some miscellaneous information. @noindent -As an example, here is what the Elgamal key generation returns: +Here are two examples; the first for Elgamal and the second for +elliptic curve key generation: @example (key-data @@ -2875,6 +2894,21 @@ As you can see, some of the information is duplicated, but this (pm1-factors @var{n1 n2 ... nn})) @end example + at example +(key-data + (public-key + (ecc + (curve Ed25519) + (flags noparam) + (q @var{q-value}))) + (private-key + (ecc + (curve Ed25519) + (flags noparam) + (q @var{q-value}) + (d @var{d-value})))) + at end example + @noindent As you can see, some of the information is duplicated, but this provides an easy way to extract either the public or the private key. 
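The flag-parser rewrite in this commit (cipher/pubkey-util.c above) first switches on the token length and only then compares bytes, so each token costs at most two or three memcmp calls instead of a walk through the whole if/else chain of known names. Here is a minimal, self-contained sketch of that dispatch technique; the FLAG_* bit values and the parse_flag helper are illustrative stand-ins, not Libgcrypt's actual PUBKEY_FLAG_* ABI:

```c
#include <assert.h>
#include <string.h>

/* Illustrative flag bits; Libgcrypt's real PUBKEY_FLAG_* values differ. */
#define FLAG_COMP      (1 << 0)
#define FLAG_NOPARAM   (1 << 1)
#define FLAG_RFC6979   (1 << 2)
#define FLAG_TRANSIENT (1 << 3)

/* Set the bit for the flag token (S, N) in *FLAGS.  Return 0 on
   success or -1 for an unknown flag.  Dispatching on the token
   length N first keeps the number of byte comparisons per token
   small even as the list of known flags grows.  */
static int
parse_flag (const char *s, size_t n, unsigned int *flags)
{
  switch (n)
    {
    case 4:
      if (!memcmp (s, "comp", 4))
        { *flags |= FLAG_COMP; return 0; }
      break;

    case 7:
      if (!memcmp (s, "rfc6979", 7))
        { *flags |= FLAG_RFC6979; return 0; }
      else if (!memcmp (s, "noparam", 7))
        { *flags |= FLAG_NOPARAM; return 0; }
      break;

    case 13:
      if (!memcmp (s, "transient-key", 13))
        { *flags |= FLAG_TRANSIENT; return 0; }
      break;
    }
  return -1;  /* Unknown length or unknown name of a known length.  */
}
```

Unknown names of a known length fall through to the same error path as unknown lengths, mirroring the GPG_ERR_INV_FLAG handling in the patch.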
diff --git a/src/cipher.h b/src/cipher.h index 20818ba..551dc66 100644 --- a/src/cipher.h +++ b/src/cipher.h @@ -38,6 +38,8 @@ #define PUBKEY_FLAG_ECDSA (1 << 9) #define PUBKEY_FLAG_EDDSA (1 << 10) #define PUBKEY_FLAG_GOST (1 << 11) +#define PUBKEY_FLAG_NOPARAM (1 << 12) +#define PUBKEY_FLAG_COMP (1 << 12) enum pk_operation diff --git a/tests/keygen.c b/tests/keygen.c index 5ab8e9d..18fe211 100644 --- a/tests/keygen.c +++ b/tests/keygen.c @@ -375,7 +375,8 @@ check_ecc_keys (void) if (verbose) show ("creating ECC key using curve %s\n", curves[testno]); rc = gcry_sexp_build (&keyparm, NULL, - "(genkey(ecc(curve %s)))", curves[testno]); + "(genkey(ecc(curve %s)(flags noparam)))", + curves[testno]); if (rc) die ("error creating S-expression: %s\n", gpg_strerror (rc)); rc = gcry_pk_genkey (&key, keyparm); @@ -410,6 +411,21 @@ check_ecc_keys (void) if (verbose > 1) show_sexp ("ECC key:\n", key); + if (verbose) + show ("creating ECC key using curve Ed25519 for ECDSA\n"); + rc = gcry_sexp_build (&keyparm, NULL, + "(genkey(ecc(curve Ed25519)(flags noparam ecdsa)))"); + if (rc) + die ("error creating S-expression: %s\n", gpg_strerror (rc)); + rc = gcry_pk_genkey (&key, keyparm); + gcry_sexp_release (keyparm); + if (rc) + die ("error generating ECC key using curve Ed25519 for ECDSA: %s\n", + gpg_strerror (rc)); + + if (verbose > 1) + show_sexp ("ECC key:\n", key); + gcry_sexp_release (key); } ----------------------------------------------------------------------- Summary of changes: cipher/ecc.c | 100 ++++++++++++++++++++++++---------- cipher/pubkey-util.c | 146 +++++++++++++++++++++++++++++++------------------- doc/gcrypt.texi | 38 ++++++++++++- mpi/mpicoder.c | 2 +- src/cipher.h | 2 + tests/bench-slope.c | 25 +++++---- tests/benchmark.c | 5 ++ tests/keygen.c | 18 ++++++- tests/pubkey.c | 31 ++++++++--- 9 files changed, 263 insertions(+), 104 deletions(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ 
Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From cvs at cvs.gnupg.org Wed Oct 30 08:34:13 2013 From: cvs at cvs.gnupg.org (by Jussi Kivilinna) Date: Wed, 30 Oct 2013 08:34:13 +0100 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.5.0-349-gd1cadd1 Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via d1cadd145199040299538891ab2ccd1208f7776e (commit) from ba6bffafd17bea11985afc500022d66da261d59a (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit d1cadd145199040299538891ab2ccd1208f7776e Author: Jussi Kivilinna Date: Wed Oct 30 08:57:15 2013 +0200 bithelp: fix undefined behaviour with rol and ror * cipher/bithelp.h (rol, ror): Mask shift with 31. -- Signed-off-by: Jussi Kivilinna diff --git a/cipher/bithelp.h b/cipher/bithelp.h index 601ecac..418bdf5 100644 --- a/cipher/bithelp.h +++ b/cipher/bithelp.h @@ -26,14 +26,14 @@ /**************** * Rotate the 32 bit unsigned integer X by N bits left/right */ -static inline u32 rol( u32 x, int n) +static inline u32 rol(u32 x, int n) { - return ( (x << n) | (x >> (32-n)) ); + return ( (x << (n&(32-1))) | (x >> ((32-n)&(32-1))) ); } static inline u32 ror(u32 x, int n) { - return ( (x >> n) | (x << (32-n)) ); + return ( (x >> (n&(32-1))) | (x << ((32-n)&(32-1))) ); } /* Byte swap for 32-bit and 64-bit integers. 
If available, use compiler ----------------------------------------------------------------------- Summary of changes: cipher/bithelp.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From jussi.kivilinna at iki.fi Thu Oct 31 13:41:41 2013 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Thu, 31 Oct 2013 14:41:41 +0200 Subject: [PATCH] Add CMAC mode (Cipher-based MAC) Message-ID: <20131031124141.22682.33976.stgit@localhost6.localdomain6> * cipher/Makefile.am: Add 'cipher-cmac.c'. * cipher/cipher-cmac.c: New. * cipher/cipher-internal.h (gcry_cipher_handle): Add 'u_keys'. (gcry_cipher_handle.u_mode): Add 'cmac'. (_gcry_cipher_cmac_authenticate, _gcry_cipher_cmac_get_tag) (_gcry_cipher_cmac_check_tag, _gcry_cipher_cmac_set_subkeys): New prototypes. * cipher/cipher.c (gcry_cipher_open, cipher_setkey, cipher_encrypt) (cipher_decrypt, _gcry_cipher_authenticate, _gcry_cipher_gettag) (_gcry_cipher_checktag): Add handling for CMAC mode. (cipher_reset): Do not reset 'marks.key'. * src/gcrypt.h.in (gcry_cipher_modes): Add 'GCRY_CIPHER_MODE_CMAC'. * doc/gcrypt.texi: Add documentation for GCRY_CIPHER_MODE_CMAC. * tests/basic.c (check_mac_cipher): New. (check_cipher_modes): Call 'check_mac_cipher'. * tests/bench-slope.c (bench_authenticate_do_bench): New. (authenticate_ops): New. (cipher_modes): Add CMAC test. -- Patch adds CMAC (Cipher-based MAC) mode as defined in RFC 4493 and NIST Special Publication 800-38B. Example of usage: /* Message 1 is split to two buffers, buf1_a and buf1_b. */ gcry_cipher_setkey(h, key, len(key)); gcry_cipher_authenticate(h, buf1_a, len(buf1_a)); gcry_cipher_authenticate(h, buf1_b, len(buf1_b)); gcry_cipher_gettag(h, buf1_tag, len(buf1_tag)); /* Message 2, MAC with same key.. can use reset instead of setkey. 
*/ gcry_cipher_reset(h); gcry_cipher_authenticate(h, buf2, len(buf2)); gcry_cipher_gettag(h, buf2_tag, len(buf2_tag)); Checking tag: /* Message 3, compare with existing tag. */ gcry_cipher_setkey(h, key, len(key)); gcry_cipher_authenticate(h, buf3, len(buf3)); gcry_cipher_checktag(h, buf3_tag, len(buf3_tag)); if (gpg_err_code (err) == GPG_ERR_CHECKSUM) { /* Authentication failed. */ } else if (err == 0) { /* Authentication ok. */ } Signed-off-by: Jussi Kivilinna --- cipher/Makefile.am | 2 cipher/cipher-cmac.c | 242 +++++++++++++++++++++++++++++++++++++++ cipher/cipher-internal.h | 24 ++++ cipher/cipher.c | 37 ++++++ doc/gcrypt.texi | 25 +++- src/gcrypt.h.in | 3 tests/basic.c | 287 ++++++++++++++++++++++++++++++++++++++++++++++ tests/bench-slope.c | 42 +++++++ 8 files changed, 651 insertions(+), 11 deletions(-) create mode 100644 cipher/cipher-cmac.c diff --git a/cipher/Makefile.am b/cipher/Makefile.am index 95d484e..87f693e 100644 --- a/cipher/Makefile.am +++ b/cipher/Makefile.am @@ -40,7 +40,7 @@ libcipher_la_LIBADD = $(GCRYPT_MODULES) libcipher_la_SOURCES = \ cipher.c cipher-internal.h \ cipher-cbc.c cipher-cfb.c cipher-ofb.c cipher-ctr.c cipher-aeswrap.c \ -cipher-ccm.c \ +cipher-ccm.c cipher-cmac.c \ cipher-selftest.c cipher-selftest.h \ pubkey.c pubkey-internal.h pubkey-util.c \ md.c \ diff --git a/cipher/cipher-cmac.c b/cipher/cipher-cmac.c new file mode 100644 index 0000000..236a0dc --- /dev/null +++ b/cipher/cipher-cmac.c @@ -0,0 +1,242 @@ +/* cmac.c - CMAC, Cipher-based MAC. + * Copyright © 2013 Jussi Kivilinna + * + * This file is part of Libgcrypt. + * + * Libgcrypt is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as + * published by the Free Software Foundation; either version 2.1 of + * the License, or (at your option) any later version. 
+ * + * Libgcrypt is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this program; if not, see <http://www.gnu.org/licenses/>. + */ + +#include <config.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> + +#include "g10lib.h" +#include "cipher.h" +#include "cipher-internal.h" +#include "bufhelp.h" + + +#define set_burn(burn, nburn) do { \ + unsigned int __nburn = (nburn); \ + (burn) = (burn) > __nburn ? (burn) : __nburn; } while (0) + + +static void +cmac_write (gcry_cipher_hd_t c, const byte * inbuf, size_t inlen) +{ + gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; + const unsigned int blocksize = c->spec->blocksize; + byte outbuf[MAX_BLOCKSIZE]; + unsigned int burn = 0; + unsigned int nblocks; + + if (!inlen || !inbuf) + return; + + /* Last block is needed for cmac_final. */ + if (c->unused + inlen <= blocksize) + { + for (; inlen && c->unused < blocksize; inlen--) + c->lastiv[c->unused++] = *inbuf++; + return; + } + + if (c->unused) + { + for (; inlen && c->unused < blocksize; inlen--) + c->lastiv[c->unused++] = *inbuf++; + + buf_xor (c->u_iv.iv, c->u_iv.iv, c->lastiv, blocksize); + set_burn (burn, enc_fn (&c->context.c, c->u_iv.iv, c->u_iv.iv)); + + c->unused = 0; + } + + if (c->bulk.cbc_enc && inlen > blocksize) + { + nblocks = inlen / blocksize; + nblocks -= (nblocks * blocksize == inlen); + + c->bulk.cbc_enc (&c->context.c, c->u_iv.iv, outbuf, inbuf, nblocks, 1); + inbuf += nblocks * blocksize; + inlen -= nblocks * blocksize; + + wipememory (outbuf, sizeof (outbuf)); + } + else + while (inlen > blocksize) + { + buf_xor (c->u_iv.iv, c->u_iv.iv, inbuf, blocksize); + set_burn (burn, enc_fn (&c->context.c, c->u_iv.iv, c->u_iv.iv)); + inlen -= blocksize; + inbuf += blocksize; + } + + /* Make sure that last block is passed to cmac_final. 
*/ + if (inlen == 0) + BUG (); + + for (; inlen && c->unused < blocksize; inlen--) + c->lastiv[c->unused++] = *inbuf++; + + if (burn) + _gcry_burn_stack (burn + 4 * sizeof (void *)); +} + + +static void cmac_generate_subkeys (gcry_cipher_hd_t c) +{ + const unsigned int blocksize = c->spec->blocksize; + byte rb, carry, t, bi; + unsigned int burn; + int i, j; + union + { + size_t _aligned; + byte buf[MAX_BLOCKSIZE]; + } u; + + if (MAX_BLOCKSIZE < blocksize) + BUG(); + + /* encrypt zero block */ + memset (u.buf, 0, blocksize); + burn = c->spec->encrypt (&c->context.c, u.buf, u.buf); + + /* Currently supported blocksizes are 16 and 8. */ + rb = blocksize == 16 ? 0x87 : 0x1B /*blocksize == 8 */ ; + + for (j = 0; j < 2; j++) + { + /* Generate subkeys K1 and K2 */ + carry = 0; + for (i = blocksize - 1; i >= 0; i--) + { + bi = u.buf[i]; + t = carry | (bi << 1); + carry = bi >> 7; + u.buf[i] = t & 0xff; + c->u_keys.cmac.subkeys[j][i] = u.buf[i]; + } + u.buf[blocksize - 1] ^= carry ? rb : 0; + c->u_keys.cmac.subkeys[j][blocksize - 1] = u.buf[blocksize - 1]; + } + + wipememory (&u, sizeof (u)); + if (burn) + _gcry_burn_stack (burn + 4 * sizeof (void *)); +} + + +static void +cmac_final (gcry_cipher_hd_t c) +{ + const unsigned int blocksize = c->spec->blocksize; + unsigned int count = c->unused; + unsigned int burn; + byte *subkey; + + if (count == blocksize) + subkey = c->u_keys.cmac.subkeys[0]; /* K1 */ + else + { + subkey = c->u_keys.cmac.subkeys[1]; /* K2 */ + c->lastiv[count++] = 0x80; + while (count < blocksize) + c->lastiv[count++] = 0; + } + + buf_xor (c->lastiv, c->lastiv, subkey, blocksize); + + buf_xor (c->u_iv.iv, c->u_iv.iv, c->lastiv, blocksize); + burn = c->spec->encrypt (&c->context.c, c->u_iv.iv, c->u_iv.iv); + if (burn) + _gcry_burn_stack (burn + 4 * sizeof (void *)); + + c->unused = 0; +} + + +static gcry_err_code_t +cmac_tag (gcry_cipher_hd_t c, unsigned char *tag, size_t taglen, int check) +{ + if (!tag || taglen == 0 || taglen > c->spec->blocksize) + 
return GPG_ERR_INV_ARG; + + if (!c->u_mode.cmac.tag) + { + cmac_final (c); + c->u_mode.cmac.tag = 1; + } + + if (!check) + { + memcpy (tag, c->u_iv.iv, taglen); + return GPG_ERR_NO_ERROR; + } + else + { + int diff, i; + + /* Constant-time compare. */ + for (i = 0, diff = 0; i < taglen; i++) + diff -= !!(tag[i] - c->u_iv.iv[i]); + + return !diff ? GPG_ERR_NO_ERROR : GPG_ERR_CHECKSUM; + } +} + + +gcry_err_code_t +_gcry_cipher_cmac_authenticate (gcry_cipher_hd_t c, + const unsigned char *abuf, size_t abuflen) +{ + if (abuflen > 0 && !abuf) + return GPG_ERR_INV_ARG; + if (c->u_mode.cmac.tag) + return GPG_ERR_INV_STATE; + /* To support new blocksize, update cmac_generate_subkeys() then add new + blocksize here. */ + if (c->spec->blocksize != 16 && c->spec->blocksize != 8) + return GPG_ERR_INV_CIPHER_MODE; + + cmac_write (c, abuf, abuflen); + + return GPG_ERR_NO_ERROR; +} + + +gcry_err_code_t +_gcry_cipher_cmac_get_tag (gcry_cipher_hd_t c, + unsigned char *outtag, size_t taglen) +{ + return cmac_tag (c, outtag, taglen, 0); +} + + +gcry_err_code_t +_gcry_cipher_cmac_check_tag (gcry_cipher_hd_t c, + const unsigned char *intag, size_t taglen) +{ + return cmac_tag (c, (unsigned char *) intag, taglen, 1); +} + +gcry_err_code_t +_gcry_cipher_cmac_set_subkeys (gcry_cipher_hd_t c) +{ + cmac_generate_subkeys (c); + + return GPG_ERR_NO_ERROR; +} diff --git a/cipher/cipher-internal.h b/cipher/cipher-internal.h index f528c84..74e57d7 100644 --- a/cipher/cipher-internal.h +++ b/cipher/cipher-internal.h @@ -136,8 +136,20 @@ struct gcry_cipher_handle processed. */ unsigned int tag:1; /* Set to 1 if tag has been finalized. */ } ccm; + /* Mode specific storage for CMAC mode. */ + struct { + unsigned int tag:1; /* Set to 1 if tag has been finalized. */ + } cmac; } u_mode; + /* Mode specific storage for subkeys. _Not_ cleared by gcry_cipher_reset. */ + union { + /* Mode specific storage for CMAC mode. 
*/ + struct { + unsigned char subkeys[2][MAX_BLOCKSIZE]; + } cmac; + } u_keys; + /* What follows are two contexts of the cipher in use. The first one needs to be aligned well enough for the cipher operation whereas the second one is a copy created by cipher_setkey and @@ -217,5 +229,17 @@ gcry_err_code_t _gcry_cipher_ccm_check_tag const unsigned char *intag, size_t taglen); +/*-- cipher-cmac.c --*/ +gcry_err_code_t _gcry_cipher_cmac_authenticate +/* */ (gcry_cipher_hd_t c, const unsigned char *abuf, size_t abuflen); +gcry_err_code_t _gcry_cipher_cmac_get_tag +/* */ (gcry_cipher_hd_t c, + unsigned char *outtag, size_t taglen); +gcry_err_code_t _gcry_cipher_cmac_check_tag +/* */ (gcry_cipher_hd_t c, + const unsigned char *intag, size_t taglen); +gcry_err_code_t _gcry_cipher_cmac_set_subkeys +/* */ (gcry_cipher_hd_t c); + #endif /*G10_CIPHER_INTERNAL_H*/ diff --git a/cipher/cipher.c b/cipher/cipher.c index 73a97b1..d9117dd 100644 --- a/cipher/cipher.c +++ b/cipher/cipher.c @@ -388,6 +388,7 @@ gcry_cipher_open (gcry_cipher_hd_t *handle, case GCRY_CIPHER_MODE_OFB: case GCRY_CIPHER_MODE_CTR: case GCRY_CIPHER_MODE_AESWRAP: + case GCRY_CIPHER_MODE_CMAC: if (!spec->encrypt || !spec->decrypt) err = GPG_ERR_INV_CIPHER_MODE; break; @@ -567,10 +568,20 @@ cipher_setkey (gcry_cipher_hd_t c, byte *key, unsigned int keylen) (void *) &c->context.c, c->spec->contextsize); c->marks.key = 1; + + switch (c->mode) + { + case GCRY_CIPHER_MODE_CMAC: + _gcry_cipher_cmac_set_subkeys (c); + break; + default: + break; + }; } else c->marks.key = 0; + return gcry_error (ret); } @@ -613,6 +624,10 @@ cipher_setiv (gcry_cipher_hd_t c, const byte *iv, unsigned ivlen) static void cipher_reset (gcry_cipher_hd_t c) { + unsigned int marks_key; + + marks_key = c->marks.key; + memcpy (&c->context.c, (char *) &c->context.c + c->spec->contextsize, c->spec->contextsize); @@ -622,6 +637,8 @@ cipher_reset (gcry_cipher_hd_t c) memset (c->u_ctr.ctr, 0, c->spec->blocksize); memset (&c->u_mode, 0, sizeof 
c->u_mode); c->unused = 0; + + c->marks.key = marks_key; } @@ -717,6 +734,10 @@ cipher_encrypt (gcry_cipher_hd_t c, byte *outbuf, unsigned int outbuflen, rc = _gcry_cipher_ccm_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); break; + case GCRY_CIPHER_MODE_CMAC: + rc = GPG_ERR_INV_CIPHER_MODE; + break; + case GCRY_CIPHER_MODE_STREAM: c->spec->stencrypt (&c->context.c, outbuf, (byte*)/*arggg*/inbuf, inbuflen); @@ -814,6 +835,10 @@ cipher_decrypt (gcry_cipher_hd_t c, byte *outbuf, unsigned int outbuflen, rc = _gcry_cipher_ccm_decrypt (c, outbuf, outbuflen, inbuf, inbuflen); break; + case GCRY_CIPHER_MODE_CMAC: + rc = GPG_ERR_INV_CIPHER_MODE; + break; + case GCRY_CIPHER_MODE_STREAM: c->spec->stdecrypt (&c->context.c, outbuf, (byte*)/*arggg*/inbuf, inbuflen); @@ -936,6 +961,10 @@ _gcry_cipher_authenticate (gcry_cipher_hd_t hd, const void *abuf, rc = _gcry_cipher_ccm_authenticate (hd, abuf, abuflen); break; + case GCRY_CIPHER_MODE_CMAC: + rc = _gcry_cipher_cmac_authenticate (hd, abuf, abuflen); + break; + default: log_error ("gcry_cipher_authenticate: invalid mode %d\n", hd->mode); rc = GPG_ERR_INV_CIPHER_MODE; @@ -956,6 +985,10 @@ _gcry_cipher_gettag (gcry_cipher_hd_t hd, void *outtag, size_t taglen) rc = _gcry_cipher_ccm_get_tag (hd, outtag, taglen); break; + case GCRY_CIPHER_MODE_CMAC: + rc = _gcry_cipher_cmac_get_tag (hd, outtag, taglen); + break; + default: log_error ("gcry_cipher_gettag: invalid mode %d\n", hd->mode); rc = GPG_ERR_INV_CIPHER_MODE; @@ -976,6 +1009,10 @@ _gcry_cipher_checktag (gcry_cipher_hd_t hd, const void *intag, size_t taglen) rc = _gcry_cipher_ccm_check_tag (hd, intag, taglen); break; + case GCRY_CIPHER_MODE_CMAC: + rc = _gcry_cipher_cmac_check_tag (hd, intag, taglen); + break; + default: log_error ("gcry_cipher_checktag: invalid mode %d\n", hd->mode); rc = GPG_ERR_INV_CIPHER_MODE; diff --git a/doc/gcrypt.texi b/doc/gcrypt.texi index 4a202dd..f30384b 100644 --- a/doc/gcrypt.texi +++ b/doc/gcrypt.texi @@ -1641,6 +1641,12 @@ Counter with CBC-MAC 
mode is an Authenticated Encryption with Associated Data (AEAD) block cipher mode, which is specified in 'NIST Special Publication 800-38C' and RFC 3610. + at item GCRY_CIPHER_MODE_CMAC + at cindex CMAC, Cipher-based MAC +In this mode, the block cipher algorithm becomes a CMAC message +authentication algorithm. This mode is specified in 'NIST Special Publication +800-38B' and RFC 4493. + @end table @node Working with cipher handles @@ -1670,10 +1676,10 @@ with some algorithms - in particular, stream mode (@code{GCRY_CIPHER_MODE_STREAM}) only works with stream ciphers. The block cipher modes (@code{GCRY_CIPHER_MODE_ECB}, @code{GCRY_CIPHER_MODE_CBC}, @code{GCRY_CIPHER_MODE_CFB}, - at code{GCRY_CIPHER_MODE_OFB} and @code{GCRY_CIPHER_MODE_CTR}) will work -with any block cipher algorithm. The @code{GCRY_CIPHER_MODE_CCM} will -only work with block cipher algorithms which have the block size of -16 bytes. + at code{GCRY_CIPHER_MODE_OFB}, @code{GCRY_CIPHER_MODE_CTR}) and + at code{GCRY_CIPHER_MODE_CMAC} will work with any block cipher algorithm. +The @code{GCRY_CIPHER_MODE_CCM} will only work with block cipher algorithms +which have the block size of 16 bytes. The third argument @var{flags} can either be passed as @code{0} or as the bit-wise OR of the following constants. @@ -1762,15 +1768,16 @@ call to gcry_cipher_setkey and clear the initialization vector. Note that gcry_cipher_reset is implemented as a macro. 
@end deftypefun -Authenticated Encryption with Associated Data (AEAD) block cipher -modes require the handling of the authentication tag and the additional -authenticated data, which can be done by using the following +Message Authentication Code (MAC) and Authenticated Encryption with Associated +Data (AEAD) block cipher modes require the handling of the authentication tag +and the authenticated data, which can be done by using the following functions: @deftypefun gcry_error_t gcry_cipher_authenticate (gcry_cipher_hd_t @var{h}, const void *@var{abuf}, size_t @var{abuflen}) -Process the buffer @var{abuf} of length @var{abuflen} as the additional -authenticated data (AAD) for AEAD cipher modes. +Process the buffer @var{abuf} of length @var{abuflen} as the authenticated +data for MAC cipher modes or the additional authenticated data (AAD) for +AEAD cipher modes. @end deftypefun diff --git a/src/gcrypt.h.in b/src/gcrypt.h.in index 2742556..2d27fdb 100644 --- a/src/gcrypt.h.in +++ b/src/gcrypt.h.in @@ -886,7 +886,8 @@ enum gcry_cipher_modes GCRY_CIPHER_MODE_OFB = 5, /* Outer feedback. */ GCRY_CIPHER_MODE_CTR = 6, /* Counter. */ GCRY_CIPHER_MODE_AESWRAP= 7, /* AES-WRAP algorithm. */ - GCRY_CIPHER_MODE_CCM = 8 /* Counter with CBC-MAC. */ + GCRY_CIPHER_MODE_CCM = 8, /* Counter with CBC-MAC. */ + GCRY_CIPHER_MODE_CMAC = 9 /* Cipher-based MAC. */ }; /* Flags used with the open function. 
*/ diff --git a/tests/basic.c b/tests/basic.c index 21af21d..0cf31bf 100644 --- a/tests/basic.c +++ b/tests/basic.c @@ -1909,6 +1909,292 @@ check_ccm_cipher (void) static void +check_mac_cipher (void) +{ + static const struct tv + { + int algo; + int mode; + int klen; + const char *key; + struct { + int dlen; + const char *data; + int tlen; + const char *tag; + } data[MAX_DATA_LEN]; + } tv[] = { + /* CMAC AES and DES test vectors from + http://web.archive.org/web/20130930212819/http://csrc.nist.gov/publications/nistpubs/800-38B/Updated_CMAC_Examples.pdf */ + { GCRY_CIPHER_AES, GCRY_CIPHER_MODE_CMAC, + 16, + "\x2b\x7e\x15\x16\x28\xae\xd2\xa6\xab\xf7\x15\x88\x09\xcf\x4f\x3c", + { { + 0, + "", + 16, + "\xbb\x1d\x69\x29\xe9\x59\x37\x28\x7f\xa3\x7d\x12\x9b\x75\x67\x46" + }, { + 16, + "\x6b\xc1\xbe\xe2\x2e\x40\x9f\x96\xe9\x3d\x7e\x11\x73\x93\x17\x2a", + 16, + "\x07\x0a\x16\xb4\x6b\x4d\x41\x44\xf7\x9b\xdd\x9d\xd0\x4a\x28\x7c" + }, { + 40, + "\x6b\xc1\xbe\xe2\x2e\x40\x9f\x96\xe9\x3d\x7e\x11\x73\x93\x17\x2a" + "\xae\x2d\x8a\x57\x1e\x03\xac\x9c\x9e\xb7\x6f\xac\x45\xaf\x8e\x51" + "\x30\xc8\x1c\x46\xa3\x5c\xe4\x11", + 16, + "\xdf\xa6\x67\x47\xde\x9a\xe6\x30\x30\xca\x32\x61\x14\x97\xc8\x27" + }, { + 64, + "\x6b\xc1\xbe\xe2\x2e\x40\x9f\x96\xe9\x3d\x7e\x11\x73\x93\x17\x2a" + "\xae\x2d\x8a\x57\x1e\x03\xac\x9c\x9e\xb7\x6f\xac\x45\xaf\x8e\x51" + "\x30\xc8\x1c\x46\xa3\x5c\xe4\x11\xe5\xfb\xc1\x19\x1a\x0a\x52\xef" + "\xf6\x9f\x24\x45\xdf\x4f\x9b\x17\xad\x2b\x41\x7b\xe6\x6c\x37\x10", + 16, + "\x51\xf0\xbe\xbf\x7e\x3b\x9d\x92\xfc\x49\x74\x17\x79\x36\x3c\xfe" + } } }, + { GCRY_CIPHER_AES, GCRY_CIPHER_MODE_CMAC, + 24, + "\x8e\x73\xb0\xf7\xda\x0e\x64\x52\xc8\x10\xf3\x2b\x80\x90\x79\xe5" + "\x62\xf8\xea\xd2\x52\x2c\x6b\x7b", + { { + 0, + "", + 16, + "\xd1\x7d\xdf\x46\xad\xaa\xcd\xe5\x31\xca\xc4\x83\xde\x7a\x93\x67" + }, { + 16, + "\x6b\xc1\xbe\xe2\x2e\x40\x9f\x96\xe9\x3d\x7e\x11\x73\x93\x17\x2a", + 16, + "\x9e\x99\xa7\xbf\x31\xe7\x10\x90\x06\x62\xf6\x5e\x61\x7c\x51\x84" + }, { + 40, + 
"\x6b\xc1\xbe\xe2\x2e\x40\x9f\x96\xe9\x3d\x7e\x11\x73\x93\x17\x2a" + "\xae\x2d\x8a\x57\x1e\x03\xac\x9c\x9e\xb7\x6f\xac\x45\xaf\x8e\x51" + "\x30\xc8\x1c\x46\xa3\x5c\xe4\x11", + 16, + "\x8a\x1d\xe5\xbe\x2e\xb3\x1a\xad\x08\x9a\x82\xe6\xee\x90\x8b\x0e" + }, { + 64, + "\x6b\xc1\xbe\xe2\x2e\x40\x9f\x96\xe9\x3d\x7e\x11\x73\x93\x17\x2a" + "\xae\x2d\x8a\x57\x1e\x03\xac\x9c\x9e\xb7\x6f\xac\x45\xaf\x8e\x51" + "\x30\xc8\x1c\x46\xa3\x5c\xe4\x11\xe5\xfb\xc1\x19\x1a\x0a\x52\xef" + "\xf6\x9f\x24\x45\xdf\x4f\x9b\x17\xad\x2b\x41\x7b\xe6\x6c\x37\x10", + 16, + "\xa1\xd5\xdf\x0e\xed\x79\x0f\x79\x4d\x77\x58\x96\x59\xf3\x9a\x11" + } } }, + { GCRY_CIPHER_AES, GCRY_CIPHER_MODE_CMAC, + 32, + "\x60\x3d\xeb\x10\x15\xca\x71\xbe\x2b\x73\xae\xf0\x85\x7d\x77\x81" + "\x1f\x35\x2c\x07\x3b\x61\x08\xd7\x2d\x98\x10\xa3\x09\x14\xdf\xf4", + { { + 0, + "", + 16, + "\x02\x89\x62\xf6\x1b\x7b\xf8\x9e\xfc\x6b\x55\x1f\x46\x67\xd9\x83" + }, { + 16, + "\x6b\xc1\xbe\xe2\x2e\x40\x9f\x96\xe9\x3d\x7e\x11\x73\x93\x17\x2a", + 16, + "\x28\xa7\x02\x3f\x45\x2e\x8f\x82\xbd\x4b\xf2\x8d\x8c\x37\xc3\x5c" + }, { + 40, + "\x6b\xc1\xbe\xe2\x2e\x40\x9f\x96\xe9\x3d\x7e\x11\x73\x93\x17\x2a" + "\xae\x2d\x8a\x57\x1e\x03\xac\x9c\x9e\xb7\x6f\xac\x45\xaf\x8e\x51" + "\x30\xc8\x1c\x46\xa3\x5c\xe4\x11", + 16, + "\xaa\xf3\xd8\xf1\xde\x56\x40\xc2\x32\xf5\xb1\x69\xb9\xc9\x11\xe6" + }, { + 64, + "\x6b\xc1\xbe\xe2\x2e\x40\x9f\x96\xe9\x3d\x7e\x11\x73\x93\x17\x2a" + "\xae\x2d\x8a\x57\x1e\x03\xac\x9c\x9e\xb7\x6f\xac\x45\xaf\x8e\x51" + "\x30\xc8\x1c\x46\xa3\x5c\xe4\x11\xe5\xfb\xc1\x19\x1a\x0a\x52\xef" + "\xf6\x9f\x24\x45\xdf\x4f\x9b\x17\xad\x2b\x41\x7b\xe6\x6c\x37\x10", + 16, + "\xe1\x99\x21\x90\x54\x9f\x6e\xd5\x69\x6a\x2c\x05\x6c\x31\x54\x10" + } } }, + { GCRY_CIPHER_3DES, GCRY_CIPHER_MODE_CMAC, + 24, + "\x8a\xa8\x3b\xf8\xcb\xda\x10\x62\x0b\xc1\xbf\x19\xfb\xb6\xcd\x58" + "\xbc\x31\x3d\x4a\x37\x1c\xa8\xb5", + { { + 0, + "", + 8, + "\xb7\xa6\x88\xe1\x22\xff\xaf\x95" + }, { + 8, + "\x6b\xc1\xbe\xe2\x2e\x40\x9f\x96", + 8, + 
"\x8e\x8f\x29\x31\x36\x28\x37\x97" + }, { + 20, + "\x6b\xc1\xbe\xe2\x2e\x40\x9f\x96\xe9\x3d\x7e\x11\x73\x93\x17\x2a" + "\xae\x2d\x8a\x57", + 8, + "\x74\x3d\xdb\xe0\xce\x2d\xc2\xed" + }, { + 32, + "\x6b\xc1\xbe\xe2\x2e\x40\x9f\x96\xe9\x3d\x7e\x11\x73\x93\x17\x2a" + "\xae\x2d\x8a\x57\x1e\x03\xac\x9c\x9e\xb7\x6f\xac\x45\xaf\x8e\x51", + 8, + "\x33\xe6\xb1\x09\x24\x00\xea\xe5" + } } }, + { GCRY_CIPHER_3DES, GCRY_CIPHER_MODE_CMAC, + 24, + "\x4c\xf1\x51\x34\xa2\x85\x0d\xd5\x8a\x3d\x10\xba\x80\x57\x0d\x38" + "\x4c\xf1\x51\x34\xa2\x85\x0d\xd5", + { { + 0, + "", + 8, + "\xbd\x2e\xbf\x9a\x3b\xa0\x03\x61" + }, { + 8, + "\x6b\xc1\xbe\xe2\x2e\x40\x9f\x96", + 8, + "\x4f\xf2\xab\x81\x3c\x53\xce\x83" + }, { + 20, + "\x6b\xc1\xbe\xe2\x2e\x40\x9f\x96\xe9\x3d\x7e\x11\x73\x93\x17\x2a" + "\xae\x2d\x8a\x57", + 8, + "\x62\xdd\x1b\x47\x19\x02\xbd\x4e" + }, { + 32, + "\x6b\xc1\xbe\xe2\x2e\x40\x9f\x96\xe9\x3d\x7e\x11\x73\x93\x17\x2a" + "\xae\x2d\x8a\x57\x1e\x03\xac\x9c\x9e\xb7\x6f\xac\x45\xaf\x8e\x51", + 8, + "\x31\xb1\xe4\x31\xda\xbc\x4e\xb8" + } } }, + /* CMAC Camellia test vectors from + http://tools.ietf.org/html/draft-kato-ipsec-camellia-cmac96and128-05 */ + { GCRY_CIPHER_CAMELLIA128, GCRY_CIPHER_MODE_CMAC, + 16, + "\x2b\x7e\x15\x16\x28\xae\xd2\xa6\xab\xf7\x15\x88\x09\xcf\x4f\x3c", + { { + 0, + "", + 16, + "\xba\x92\x57\x82\xaa\xa1\xf5\xd9\xa0\x0f\x89\x64\x80\x94\xfc\x71" + }, { + 16, + "\x6b\xc1\xbe\xe2\x2e\x40\x9f\x96\xe9\x3d\x7e\x11\x73\x93\x17\x2a", + 16, + "\x6d\x96\x28\x54\xa3\xb9\xfd\xa5\x6d\x7d\x45\xa9\x5e\xe1\x79\x93" + }, { + 40, + "\x6b\xc1\xbe\xe2\x2e\x40\x9f\x96\xe9\x3d\x7e\x11\x73\x93\x17\x2a" + "\xae\x2d\x8a\x57\x1e\x03\xac\x9c\x9e\xb7\x6f\xac\x45\xaf\x8e\x51" + "\x30\xc8\x1c\x46\xa3\x5c\xe4\x11", + 16, + "\x5c\x18\xd1\x19\xcc\xd6\x76\x61\x44\xac\x18\x66\x13\x1d\x9f\x22" + }, { + 64, + "\x6b\xc1\xbe\xe2\x2e\x40\x9f\x96\xe9\x3d\x7e\x11\x73\x93\x17\x2a" + "\xae\x2d\x8a\x57\x1e\x03\xac\x9c\x9e\xb7\x6f\xac\x45\xaf\x8e\x51" + 
"\x30\xc8\x1c\x46\xa3\x5c\xe4\x11\xe5\xfb\xc1\x19\x1a\x0a\x52\xef" + "\xf6\x9f\x24\x45\xdf\x4f\x9b\x17\xad\x2b\x41\x7b\xe6\x6c\x37\x10", + 16, + "\xc2\x69\x9a\x6e\xba\x55\xce\x9d\x93\x9a\x8a\x4e\x19\x46\x6e\xe9" + } } } + }; + + gcry_cipher_hd_t hda; + unsigned char out[MAX_DATA_LEN]; + int i, j; + gcry_error_t err = 0; + + if (verbose) + fprintf (stderr, " Starting cipher MAC checks.\n"); + + for (i = 0; i < sizeof (tv) / sizeof (tv[0]); i++) + { + if (verbose) + fprintf (stderr, " checking MAC mode %d for %s [%i]\n", + tv[i].mode, gcry_cipher_algo_name (tv[i].algo), tv[i].algo); + + err = gcry_cipher_open (&hda, tv[i].algo, tv[i].mode, 0); + if (err) + { + fail ("MAC, gcry_cipher_open for MAC mode failed: %s\n", + gpg_strerror (err)); + continue; + } + + err = gcry_cipher_setkey (hda, tv[i].key, tv[i].klen); + if (err) + { + fail ("MAC, gcry_cipher_setkey failed: %s\n", gpg_strerror (err)); + goto next; + } + + for (j = 0; tv[i].data[j].dlen > 0; j++) + { + err = gcry_cipher_reset (hda); + if (err) + { + fail ("MAC, gcry_cipher_reset failed: %s\n", + gpg_strerror (err)); + goto next; + } + + err = + gcry_cipher_authenticate (hda, tv[i].data[j].data, + tv[i].data[j].dlen); + if (err) + { + fail ("MAC, gcry_cipher_authenticate failed: %s\n", + gpg_strerror (err)); + goto next; + } + + err = + gcry_cipher_checktag (hda, tv[i].data[j].tag, tv[i].data[j].tlen); + if (gpg_err_code (err) == GPG_ERR_CHECKSUM) + fail ("MAC, checktag mismatch entry %d:%d (checktag)\n", i, j); + + err = gcry_cipher_reset (hda); + if (err) + { + fail ("MAC, gcry_cipher_reset failed: %s\n", + gpg_strerror (err)); + goto next; + } + + err = + gcry_cipher_authenticate (hda, tv[i].data[j].data, + tv[i].data[j].dlen); + if (err) + { + fail ("MAC, gcry_cipher_authenticate failed: %s\n", + gpg_strerror (err)); + goto next; + } + + err = gcry_cipher_gettag (hda, out, tv[i].data[j].tlen); + if (err) + { + fail ("MAC, gcry_cipher_gettag failed: %s\n", + gpg_strerror (err)); + goto next; + } + + if 
(memcmp (tv[i].data[j].tag, out, tv[i].data[j].tlen)) + fail ("MAC, gettag mismatch entry %d:%d (gettag)\n", i, j); + } + next: + gcry_cipher_close (hda); + } + + if (verbose) + fprintf (stderr, " Completed cipher MAC checks.\n"); +} + + +static void check_stream_cipher (void) { struct tv @@ -3226,6 +3512,7 @@ check_cipher_modes(void) check_cfb_cipher (); check_ofb_cipher (); check_ccm_cipher (); + check_mac_cipher (); check_stream_cipher (); check_stream_cipher_large_block (); diff --git a/tests/bench-slope.c b/tests/bench-slope.c index 5687bf1..e821618 100644 --- a/tests/bench-slope.c +++ b/tests/bench-slope.c @@ -610,6 +610,41 @@ bench_decrypt_do_bench (struct bench_obj *obj, void *buf, size_t buflen) } } +static void +bench_authenticate_do_bench (struct bench_obj *obj, void *buf, size_t buflen) +{ + gcry_cipher_hd_t hd = obj->priv; + char tag[8]; + int err; + + err = gcry_cipher_reset (hd); + if (err) + { + fprintf (stderr, PGM ": gcry_cipher_reset failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } + + err = gcry_cipher_authenticate (hd, buf, buflen); + if (err) + { + fprintf (stderr, PGM ": gcry_cipher_authenticate failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } + + err = gcry_cipher_gettag (hd, tag, sizeof (tag)); + if (err) + { + fprintf (stderr, PGM ": gcry_cipher_gettag failed: %s\n", + gpg_strerror (err)); + gcry_cipher_close (hd); + exit (1); + } +} + static struct bench_ops encrypt_ops = { &bench_encrypt_init, &bench_encrypt_free, @@ -622,6 +657,12 @@ static struct bench_ops decrypt_ops = { &bench_decrypt_do_bench }; +static struct bench_ops authenticate_ops = { + &bench_encrypt_init, + &bench_encrypt_free, + &bench_authenticate_do_bench +}; + static void @@ -802,6 +843,7 @@ static struct bench_cipher_mode cipher_modes[] = { {GCRY_CIPHER_MODE_CCM, "CCM enc", &ccm_encrypt_ops}, {GCRY_CIPHER_MODE_CCM, "CCM dec", &ccm_decrypt_ops}, {GCRY_CIPHER_MODE_CCM, "CCM auth", &ccm_authenticate_ops}, + 
{GCRY_CIPHER_MODE_CMAC, "CMAC auth", &authenticate_ops},
   {0},
 };

From wk at gnupg.org  Thu Oct 31 16:01:32 2013
From: wk at gnupg.org (Werner Koch)
Date: Thu, 31 Oct 2013 16:01:32 +0100
Subject: [PATCH] Add CMAC mode (Cipher-based MAC)
In-Reply-To: <20131031124141.22682.33976.stgit@localhost6.localdomain6>
	(Jussi Kivilinna's message of "Thu, 31 Oct 2013 14:41:41 +0200")
References: <20131031124141.22682.33976.stgit@localhost6.localdomain6>
Message-ID: <871u31mqlv.fsf@vigenere.g10code.de>

On Thu, 31 Oct 2013 13:41, jussi.kivilinna at iki.fi said:

> Patch adds CMAC (Cipher-based MAC) mode as defined in RFC 4493 and NIST
> Special Publication 800-38B.
>
> Example of usage:
> /* Message 1 is split to two buffers, buf1_a and buf1_b. */
> gcry_cipher_setkey(h, key, len(key));

I wonder whether it is a good idea to put MAC algorithms into the cipher
module.  It is a bit of a surprise because a MAC has IMHO more of a hash
algorithm than of a cipher algorithm.  Also, we already have HMAC in
gcry_md_*.  Right, there is quite some overlap now with the
authenticated methods.  The latter are however justified in gcry_cipher
because their main purpose is encryption.

Opinions?

Salam-Shalom,

   Werner

-- 
Die Gedanken sind frei.  Ausnahmen regelt ein Bundesgesetz.

From dbaryshkov at gmail.com  Thu Oct 31 23:16:12 2013
From: dbaryshkov at gmail.com (Dmitry Eremin-Solenikov)
Date: Fri, 1 Nov 2013 02:16:12 +0400
Subject: [PATCH] Add CMAC mode (Cipher-based MAC)
In-Reply-To: <871u31mqlv.fsf@vigenere.g10code.de>
References: <20131031124141.22682.33976.stgit@localhost6.localdomain6>
	<871u31mqlv.fsf@vigenere.g10code.de>
Message-ID: 

Hello,

On Thu, Oct 31, 2013 at 7:01 PM, Werner Koch wrote:
> On Thu, 31 Oct 2013 13:41, jussi.kivilinna at iki.fi said:
>
>> Patch adds CMAC (Cipher-based MAC) mode as defined in RFC 4493 and NIST
>> Special Publication 800-38B.
>>
>> Example of usage:
>> /* Message 1 is split to two buffers, buf1_a and buf1_b.
*/
>> gcry_cipher_setkey(h, key, len(key));
>
> I wonder whether it is a good idea to put MAC algorithms into the cipher
> module.  It is a bit of a surprise because a MAC has IMHO more of a hash
> algorithm than of a cipher algorithm.  Also, we already have HMAC in
> gcry_md_*.  Right, there is quite some overlap now with the
> authenticated methods.  The latter are however justified in gcry_cipher
> because their main purpose is encryption.

What about GMAC, which is GCM with no encrypted text?  Should it be in
the gcry_cipher_* space (if I enable multiple subsequent _authenticate()
calls in my GCM code) or in the gcry_md_* space (as HMAC is)?  Maybe we
should just converge the gcry_md_* and gcry_cipher_* spaces?

-- 
With best wishes
Dmitry